
> I see people thinking modern AI models can’t generate working code.

Really? Can you show any examples of someone claiming AI models cannot generate working code? I haven't seen anyone make that claim in years, even from the most skeptical critics.



I've seen it said plenty of times that the code might work eventually (after several cycles of prompting and testing), but even then the code you get might not be something you'd want to maintain. It might also contain bugs and security issues that don't (at least initially) seem to impact its ability to do whatever it was written to do, but which could cause problems later.


Yeah but that's a completely different thing.


And really the problem isn’t that it can’t make working code, the problem is that it’ll never get the kind of context that is in your brain.

I started working today on a project I hadn't touched in a while, but now needed to because it was involved in an incident and I had to address some shortcomings. I knew the fix I needed to make, but I went about my usual AI-assisted workflow because, of course, I'm lazy; the last thing I want to do is interrupt my normal work to fix this stupid problem.

The AI doesn’t know anything about the full scope of all the things in my head about my company’s environment and the information I need to convey to it. I can give it a lot of instructions but it’s impossible to write out everything in my head across multiple systems.

The AI did write working code, but despite writing the code way faster than me, it made small but critical mistakes that I wouldn’t have made on my first draft.

For example, it added a command flag that I knew it didn't need (and it probably should have known that, too). Basically, it changed a line of code it didn't need to touch.

It also didn't realize that the curled URL was going to redirect, so we needed an -L flag. Maybe it should have, but my brain already knew it.
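The redirect point above can be reproduced without curl: Python's low-level http.client stops at the redirect response, much like curl without -L, while urllib.request follows it the way curl -L does. The paths and payload here are invented for illustration; a minimal sketch using a throwaway local server:

```python
import http.client
import http.server
import threading
import urllib.request

class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/old":
            # Redirect to the real location, as a package mirror might
            self.send_response(302)
            self.send_header("Location", "/new")
            self.end_headers()
        else:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"payload")

    def log_message(self, *args):
        pass  # keep the demo quiet

server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Low-level client: like curl without -L, you get the 302 itself
conn = http.client.HTTPConnection("127.0.0.1", port)
conn.request("GET", "/old")
resp = conn.getresponse()
print(resp.status)  # 302, not the file you wanted

# urllib.request follows the redirect, like curl -L
body = urllib.request.urlopen(f"http://127.0.0.1:{port}/old").read()
print(body)  # b'payload'

server.shutdown()
```

Whether a tool chases redirects by default is exactly the kind of detail that lives in a person's head rather than in the prompt.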

It also misinterpreted some changes in direction that a human never would have. It confused my local repository with the remote one: I originally thought I was going to set up a mirror, but I changed plans and used a manual package upload to curl from. So it put the remote URL in some places where the local one should have been.

Finally, it seems to have just created some strange text gore while editing the readme where it deleted existing content for seemingly no reason other than some kind of readline snafu.

So yes, it produced great code very fast, code that would have taken me way longer to write, but then I had to go back and spend a similar amount of time fixing so many things that I might as well have just done it manually.

But hey I’m glad my company is paying $XX/month for my lazy workday machine.


> The AI doesn’t know anything about the full scope of all the things in my head about my company’s environment and the information I need to convey to it.

This is your problem: how is it supposed to know if you don't provide that context?

Use Claude: in the Pro version you can attach files to each project to set the context. These can be text files, source code, SQL scripts, screenshots, whatever, and the output will then be based on the context you provide through those files.


Is this process of brain dumping faster than me just writing the code?

If I was truly going to automate this one-time task I would have to give the AI access to my browser or an API token for the repository provider, so I’m either giving it dangerous modification capability via browser automation or I’m spending even more time setting up API access and trusting that it actually knows how to interact with the service via API calls.

My company doesn’t provide Claude; they give me GitHub Copilot Pro or whatever it’s called. And when I provided the website it needed in order to get the RPM files I was working with, it didn’t actually do anything with it. It just wrote a readme file that told me what to do. Like I mentioned, it also eventually mistook the remote repository for my local internal repository.

And one of the specific commands it screwed up was in my existing script and was already correct, it just decided to change it for no discernible reason. I didn’t ask it to do anything related to that particular line.

With such a high error rate, I would be hesitant to actually integrate AI to other systems to try to achieve a more fully automated workflow.


And your problem is that you didn't understand the point of their post. The full context is so complex and would be so time-consuming to relay that they might as well code it themselves.


Depends what they mean. Generate working code every time, or after a few iterations of trying and prompting? It can very easily happen that an LLM generates something that is a straight error, because it hallucinates a keyword argument or something like that which doesn't actually exist. That happened to me only yesterday. So going by that, no, they are still not able to generate working code all the time, especially when the basis is a shoddily made library that is itself simply missing something required.
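The hallucinated-keyword-argument failure mode at least surfaces immediately, since calling a function with a parameter it doesn't have raises a TypeError on the first run. A tiny illustration (the fetch function and its parameters are invented for the example):

```python
def fetch(url, timeout=10):
    """Hypothetical helper with a small, fixed signature."""
    return f"GET {url} (timeout={timeout})"

# A correct call works as expected.
ok = fetch("https://example.com", timeout=5)

# An LLM might plausibly invent a 'retries' parameter that was
# never defined; Python rejects it before any logic runs.
try:
    fetch("https://example.com", retries=3)
    caught = None
except TypeError as err:
    caught = err

print(type(caught).__name__)  # TypeError
```

Errors like this are cheap to catch; the sneakier cases are arguments that do exist but mean something different from what the model assumed.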


10 days ago someone was making this claim about copilot on legacy code: https://news.ycombinator.com/item?id=46932609

> Github Copilot has been great in getting that code coverage up marginally but ass otherwise.


That's a completely different claim. Or do you think an AI can always, without fail, produce working code in every situation? That's trivially false.


It's also trivially true that an AI has at least once been able to write a working hello world.

When someone claims that AI can't generate working code, I assume they mean consistently generating working code. We're talking about a tool: it has to work more often than not, and on the codebases we tend to work with, i.e. legacy code.

Personally I don't make that claim, because I'm using it every day to generate working code.


I'll claim it. They can't generate working code for the things I am working on. They seem to be too complex or in languages that are too niche.

They can do a tolerable job with super popular/simple things like web dev and Python. It really depends on what you're doing.


Scroll up a few comments to where someone said Claude is generating errors over and over again, and that Claude can't work according to code guidelines, etc. :-)


That's not the same thing.



