In 2026, researchers discovered that AI coding tools cannot be pried out of developers’ hands.
But other researchers warn that while AI is definitely helping programmers write code faster, it may not be helping them write better code. And that could cause problems for them down the road.
Specifically, in February 2026, the acclaimed AI research institute METR announced a surprising finding: “Most developers can no longer work without AI, even for a limited number of tasks.”
METR wanted to provide an update on the groundbreaking AI coding productivity research it had published in 2025, months earlier. In that study, researchers measured how long it took open source developers to complete tasks manually versus with AI.
Developers in that study reported that AI increased their productivity, but were shocked to learn that it was actually slowing them down. Sure, the code was generated faster, but they spent extra time finding and fixing errors, interacting with the AI, and waiting for tasks to complete.
When METR tried to repeat the experiment to measure progress in both AI and coders’ proficiency with it, it could not.
The researchers admitted that developers were unwilling to participate, even if only for research purposes, because they “didn’t want to work without the AI.”
Instead, METR released a study in May in which tech employees self-reported their AI productivity gains. Not surprisingly, they reported that AI had doubled their value to the organization.
However, recent headlines about the huge costs of so-called tokenmaxxing, coupled with several recent studies, have called such self-perception into question.
Tokenmaxxing, or treating the number of tokens a person consumes as a proxy for AI productivity, has been a trend of 2026 so far. And it may already be over.
Amazon has shut down its internal token-tracking leaderboard, called KiloRank, after employees overused AI agents to game the rankings and drove up costs, the Financial Times reported this week. The episode showed that using more AI does not automatically lead to increased productivity.
According to The Information, Uber used up its 2026 AI budget in the first four months of this year. COO Andrew MacDonald recently said on a podcast that this spending hasn’t led to any measurable increases in projects or productivity.
And AI-generated code may increase, rather than reduce, the need for ongoing code maintenance, programmer and author James Shore elegantly argued in a blog post that went viral on Hacker News.
“Did you write code twice as fast? You better hope your maintenance costs were cut in half,” he writes. “If you don’t, you’re screwed. You’re signing a permanent contract in exchange for a temporary speed boost.”
There is other evidence that AI can increase code maintenance issues.
A viral tweet from Aiswarya Sankar, founder and CEO of reliability engineering agent startup Entelligence AI, claims that companies spend 44% of their tokens on fixing bugs in AI-generated code. Meanwhile, code review tool company CodeRabbit said its analysis of open source pull requests found that AI-generated code caused 1.7 times more problems than human-written code.
Admittedly, these are self-serving statistics from people trying to sell AI code review tools.
However, independent researchers have also found such problems. Researchers from the acclaimed Singapore Management University published a report in April warning that “AI-generated code can introduce long-term maintenance costs to real-world software projects.”
Given that programmers love AI assistants, what’s the solution?
People touting AI coding agents say that developers can simply use them to do the hard work of fixing code as fast as the AI can spit it out. This is what Scott Wu, founder and CEO of Cognition, developer of the AI coding agent Devin, suggests.
But even he admits that while Devin can work independently, he rates its skills as somewhere between those of a junior and an intermediate-level programmer, depending on the task. This is not a set-it-and-forget-it solution.
SMU researchers are proposing a more human approach. Programmers need to know as much about which tasks AI can and cannot handle as they do about their favorite coding language. They need strong quality assurance systems designed for AI, and they must insist on reviewing AI work as carefully as if it came from a junior developer.
Meanwhile, the researchers argue, and Wu agrees, that big-picture tasks like software architecture and security design should still be done by humans.
