The AI skeptics instead stick to hard data, which so far shows a 19% reduction i...

simonw · 2026-02-05T22:11:19 1770329479

https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...

> 1) We do NOT provide evidence that AI systems do not currently speed up many or most software developers. Clarification: We do not claim that our developers or repositories represent a majority or plurality of software development work.

> 2) We do NOT provide evidence that AI systems do not speed up individuals or groups in domains other than software development. Clarification: We only study software development.

> 3) We do NOT provide evidence that AI systems in the near future will not speed up developers in our exact setting. Clarification: Progress is difficult to predict, and there has been substantial AI progress over the past five years [3].

> 4) We do NOT provide evidence that there are not ways of using existing AI systems more effectively to achieve positive speedup in our exact setting. Clarification: Cursor does not sample many tokens from LLMs, it may not use optimal prompting/scaffolding, and domain/repository-specific training/finetuning/few-shot learning could yield positive speedup.

rhubarbtree · 2026-02-06T08:01:51 1770364911

Points 2 and 3 are irrelevant.

Point 1 is saying results may not generalise, which is not a counter claim. It’s just saying “we cannot speak for everyone”.

Point 4 is saying there may be other techniques that work better, which again is not a counter claim. It’s just saying “you may find better methods.”

Those are standard scientific statements giving scope to the research. They are in no way contradicting their findings. To contradict their findings, you would need similarly rigorous work that perhaps fell into those scenarios.

Not pushing an opinion here, but if we’re talking about research then we should be rigorous and rational by posting counter evidence. Anyone who has done serious research in software engineering knows the difficulties involved and that this study represents one set of data. But it is at least a rigorous set and not anecdata or marketing.

I for one would love a rigorous study that showed a reliable methodology for gaining generalised productivity gains with the same or better code quality.

raincole · 2026-02-05T23:04:49 1770332689

There is no such hard data. It's just research done on 16 developers using Cursor and Sonnet 3.5.