Anthropic published a study that puts numbers to something I’ve been writing about in the design context for a while now. They ran a randomized controlled trial with 52 junior software engineers learning a new Python library. Half used AI assistance. Half coded by hand.
Judy Hanwen Shen and Alex Tamkin, writing for Anthropic Research:
Participants in the AI group scored 17% lower than those who coded by hand, or the equivalent of nearly two letter grades. Using AI sped up the task slightly, but this didn’t reach the threshold of statistical significance.
So the AI group didn’t finish meaningfully faster, but they understood meaningfully less. And the biggest gap was in debugging: the ability to recognize when code is wrong and figure out why. That’s the exact skill you need most when your job is to oversee AI-generated output. From the study:
The largest gap in scores between the two groups was on debugging questions, suggesting that the ability to understand when code is incorrect and why it fails may be a particular area of concern if AI impedes coding development.
This is the same dynamic I fear in design. When I wrote about the design talent crisis, educator Eric Heiman told me that “we internalize so much by doing things slower… learning through tinkering with our process, and making mistakes.” Bradford Prairie put it more bluntly: “If there’s one thing that AI can’t replace, it’s your sense of discernment for what is good and what is not good.” But discernment comes from reps, and AI is eating the reps.
The honest framing from Anthropic’s own researchers:
It is possible that AI both accelerates productivity on well-developed skills and hinders the acquisition of new ones.
Credit to Anthropic for publishing research that complicates the case for their own product. And the study’s footnote is worth noting: they used a chat-based AI assistant, not an agentic tool like Claude Code. Their expectation is that “the impacts of such programs on skill development are likely to be more pronounced.”
I can certainly attest that when I use Claude Code, I have no idea what’s going on!
The one bright spot: not all AI use was equal. Participants who asked conceptual questions and used AI to check their understanding scored well. The ones who delegated code generation wholesale scored worst. The difference was whether you were thinking alongside the tool or letting it think for you. As the researchers put it:
Cognitive effort—and even getting painfully stuck—is likely important for fostering mastery.
Getting painfully stuck. That’s the apprenticeship. That’s the grunt work. And it’s exactly what we’re optimizing away.

How AI assistance impacts the formation of coding skills