Artificial intelligence tools can slow down experienced open-source developers rather than accelerate their work, according to findings released by the nonprofit AI research group METR.
Developers and experts had widely expected advanced AI tools to cut the time needed to complete tasks in familiar codebases by about 24%. A randomized study by METR found the opposite.
“Surprisingly, we find that when developers use AI tools, they take 19% longer” than without the cutting-edge tools, according to the study’s authors.
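To make the two percentages concrete, the short Python sketch below applies them to a single task. The 60-minute baseline is a hypothetical figure chosen for illustration and does not come from the study, and the 24% figure is read here as the article phrases it: a 24% reduction in task time.

```python
def minutes_with_change(baseline_minutes: float, percent_change: float) -> float:
    """Return the task time after applying a percentage change to the baseline."""
    return baseline_minutes * (1 + percent_change / 100)


# Hypothetical baseline: a task that takes 60 minutes without AI tools.
baseline = 60.0

# Expectation reported in the article: about 24% less time with AI.
expected = minutes_with_change(baseline, -24)

# Result reported in the article: 19% more time with AI.
observed = minutes_with_change(baseline, 19)

print(f"expected with AI: {expected:.1f} min")
print(f"observed with AI: {observed:.1f} min")
```

Under those assumptions, a task expected to take about 45.6 minutes with AI would instead take about 71.4 minutes.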
Benchmarks ‘sacrifice realism’
The study’s authors also wrote: “While coding/agentic benchmarks have proven useful for understanding AI capabilities, they typically sacrifice realism for scale and efficiency.”
The report’s authors go on to explain that benchmark tasks are often “self-contained” and do not require prior context to understand. Benchmarks also “use algorithmic evaluation that doesn’t capture many important capabilities … These properties may lead benchmarks to overestimate AI capabilities.”
The researchers also noted that because benchmarks run without live human interaction, AI models can stall on minor obstacles that a developer would quickly resolve in practice. In that respect, benchmarks may also underestimate AI capabilities.
Future models may be better
While the slowdown appeared broadly consistent across tasks, the authors emphasized that the results could be specific to the environment tested. “These results do not imply that future models will not speed up developers in this exact setting,” they wrote, describing such a speedup as “a salient possibility” given the rapid progress in AI capabilities.
Nonetheless, the findings challenge a broad perception that AI always makes highly paid software engineers much more productive, a perception that has prompted significant investment in vendors of AI coding tools, known as “vibe coders,” that promise to enhance the software development process.
“Our results reveal a large disconnect between perceived and actual AI impact on developer productivity,” the study’s authors said. “Despite widespread adoption of AI tools and confident predictions of positive speedup from both experts and developers, we observe that AI actually slows down experienced developers in this setting.”
Researchers also found that AI coding tools can occasionally introduce errors and security vulnerabilities.
Methodology
To measure AI’s real-world impact on developer productivity, METR organized a randomized controlled trial in which 16 developers completed 246 tasks across open-source repositories they regularly contributed to. Each task was randomly assigned to allow or prohibit AI assistance, and researchers recorded the time it took participants to finish under each condition.
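The design described above can be sketched in a few lines of Python. This is a simplified illustration, not METR’s analysis code: the function names, the condition labels and the timing data are all hypothetical, and the ratio of mean completion times stands in for whatever statistical estimator the study actually used, which this article does not describe.

```python
import random
from statistics import mean


def assign_conditions(task_ids, seed=0):
    """Randomly assign each task to allow or prohibit AI assistance."""
    rng = random.Random(seed)
    return {task: rng.choice(["ai_allowed", "ai_disallowed"]) for task in task_ids}


def time_ratio(records):
    """Mean completion time with AI divided by mean completion time without it.

    A simplification for illustration; a value above 1.0 means tasks took
    longer when AI was allowed.
    """
    with_ai = [minutes for condition, minutes in records if condition == "ai_allowed"]
    without_ai = [minutes for condition, minutes in records if condition == "ai_disallowed"]
    return mean(with_ai) / mean(without_ai)


# Randomize 246 tasks, the task count reported in the article.
assignment = assign_conditions(range(246), seed=1)

# Made-up completion times in minutes, not data from the study.
records = [
    ("ai_allowed", 119.0),
    ("ai_disallowed", 100.0),
    ("ai_allowed", 59.5),
    ("ai_disallowed", 50.0),
]

ratio = time_ratio(records)
print(f"time ratio (AI allowed / AI prohibited): {ratio:.2f}")
```

Randomizing at the task level means each developer works under both conditions, so differences in skill between developers are less likely to be mistaken for an effect of the tools.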