
I gave Devstral 2 a shot in Mistral's CLI and let it run over one of my smaller private projects, about 500 KB of code. I asked it to review the codebase, understand the application's functionality, identify issues, and fix them.

It spent about half an hour, correctly identified what the program did, found two small bugs, fixed them, made some minor improvements, and added two new, small but nice features.

It introduced one new bug, but then fixed it on the first try when I pointed it out.

The changes it made to the code were minimal and localized; unlike some more "creative" models, it didn't randomly rewrite stuff it didn't have to.

It's too early to form a conclusion, but so far, it's looking quite competent.





I also tried it on a small project. It did OK at finding issues but completely failed at rather basic edits: it lost closing brackets or used the wrong syntax and couldn't recover. The CLI was easy to set up and use, though.

Did you try it via OpenRouter? If so, which provider? I've noticed that some providers aren't exactly upfront about what quantization they're using; providers that supposedly run the exact same model and weights can give vastly different responses.

Back when Devstral 1 was released, this was very noticeable to me because the providers using smaller quantizations were unable to format the code properly, just as you noticed. That's why this sounded so similar to what I've seen before.


In my experience, messed-up closing brackets are a surprisingly common issue for LLMs. Both Sonnet 4.5 and Gemini 3 also do this regularly. Seems like something that should be relatively easy to fix, though.
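
To make "relatively easy to fix" a bit more concrete, here is a minimal sketch of the kind of check a tool could run after each edit to catch a dropped or stray bracket before moving on. This is purely illustrative and not how any of the CLIs mentioned here work; the function name `first_bracket_error` is made up, and the check deliberately ignores brackets inside strings and comments, which a real implementation would need a proper parser for.

```python
# Hypothetical post-edit sanity check: find the first unbalanced bracket.
# Assumption: brackets inside string literals or comments are NOT skipped,
# so this is only a rough heuristic, not a substitute for a real parser.
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def first_bracket_error(text):
    """Return (index, message) for the first bracket problem, or None if balanced."""
    stack = []  # holds (opening_char, index) for every bracket still open
    for i, ch in enumerate(text):
        if ch in OPENERS:
            stack.append((ch, i))
        elif ch in PAIRS:
            # A closer must match the most recently opened bracket.
            if not stack or stack[-1][0] != PAIRS[ch]:
                return i, f"unexpected {ch!r}"
            stack.pop()
    if stack:
        # Report the innermost bracket that was never closed.
        ch, i = stack[-1]
        return i, f"unclosed {ch!r}"
    return None
```

A tool could call this on the file before and after applying an edit and reject (or retry) the edit if the result goes from `None` to an error.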

On what hardware did you run it?

FWIW, it’s free through Mistral right now


OpenRouter's rate limit is pretty bad, almost unusable. And they take a 5.5% margin on the base models.


