Their study “found no significant improvements for developers” using GitHub Copilot, Microsoft’s AI-powered coding assistant, according to the article (shared by Slashdot reader snydeq):
Use of GitHub Copilot also introduced 41% more bugs, according to the study…
In addition to measuring productivity, the Uplevel study looked at factors in developer burnout, and it found that GitHub Copilot hasn’t helped there, either. The amount of working time spent outside of standard hours decreased for both the control group and the test group using the coding tool, but it decreased more when the developers weren’t using Copilot.
An Uplevel product manager/data analyst acknowledged to the magazine that there may be other ways to measure developer productivity — but they still consider their metrics solid. “We heard that people are ending up being more reviewers for this code than in the past… You just have to keep a close eye on what is being generated; does it do the thing that you’re expecting it to do?”
The article also quotes the CEO of software development firm Gehtsoft, who says they didn’t see major productivity gains from LLM-based coding assistants — but did see them introducing errors into code. With different prompts generating different code sections, “It becomes increasingly more challenging to understand and debug the AI-generated code, and troubleshooting becomes so resource-intensive that it is easier to rewrite the code from scratch than fix it.”
On the other hand, cloud services provider Innovative Solutions saw significant productivity gains from coding assistants like Claude Dev and GitHub Copilot. And Slashdot reader destined2fail1990 says that while large/complex code bases may not see big gains, “I have seen a notable increase in productivity from using Cursor, the AI powered IDE.”
Yes, you have to review all the code that it generates, why wouldn’t you? But often times it just works. It removes the tedious tasks like querying databases, writing model code, writing forms and processing forms, and a lot more. Some forms can have hundreds of fields and processing those fields along with doing checks for valid input is time consuming, but can be automated effectively using AI.
This prompted an interesting discussion on the original story submission. Slashdot reader bleedingobvious responded:
Cursor/Claude are great BUT the code produced is almost never great quality. Even given these tools, the junior/intern teams still cannot outpace the senior devs. Great for learning, maybe, but the productivity angle not quite there… yet.
It’s damned close, though. Give it 3-6 months.
And Slashdot reader abEeyore posted:
I suspect that the results are quite a bit more nuanced than that. I expect that it is, even outside of the mentioned code review, a shift in where and how the time is spent, and not necessarily in how much time is spent.
Agree? Disagree? Share your own experiences in the comments.
And are developers really saving time with AI coding assistants?