Looking at the responses below, it's interesting how binary they are. It's the classic hallucination style, where it flops between two alternatives but whichever one it picks, it's absolutely confident about it.
...is it though? Fundamentally, these are statistical models with harnesses that try to conform them to deterministic expectations via narrow goal massaging.
They're not improving on the underlying technology. Just iterating on the massaging and perhaps improved data accuracy, if at all. It's still a mishmash of code and cribbed sci-fi stories. So, of course it's going to hit loops because it's not fundamentally conscious.
> So, of course it's going to hit loops because it's not fundamentally conscious.
Wait, I was told that these are superintelligent agents with sophisticated reasoning skills, and that AGI is either here or right around the corner. Are you saying that's wrong?
Surely they can answer a simple question correctly. Just look at their ARC-AGI scores, and all the other benchmarks!
We made these unbeatable tests for AI, then told some of the smartest engineering teams on the planet that they can present a solution in a black box without explaining whether they cheated, but if they win they get amazing headlines and get to keep their jobs and funding.
Somehow they beat the score in the same year, it's crazy! No one could have seen this coming, and please do not test it at home to see if you get the same results, it gets embarrassed outside of our office space.
I think what's bewildering is the usual hypemongers promising (threatening) to replace entire categories of workers with this type of dogshit. As another commenter mentioned, most large employers are overstaffed by 2 to 3x, so AI is mostly an excuse for investors not to get too worried about staffing cuts. The idea that Marc is blown away by this type of nonsense is indicative only of the types of people he surrounds himself with.
What's also bewildering is the opposite end of the spectrum: calling something "dogshit" when it is quite obviously a very powerful tool. It won't replace workers. But it will make those workers more productive. You don't need to vibe-code to be able to do more work in the same amount of time with the help of an LLM coding agent.