In my experience, LLMs mimic human thought: they don't "copy", but they do write from "experience" -- and they know more than any single developer can.
So I'm getting tired of the argument that LLMs are "plagiarism machines" -- yes, they can be coaxed into repeating training material verbatim, but no, they don't do that unless you try.
Opus 4.6's C compiler? I've not looked at it, but I would bet it does not resemble GCC -- maybe in some corners, but overall it must be new, and if the prompting was specific enough about architecture and design, it might not resemble GCC or any other C compiler much at all.
Not only do LLMs mimic human thinking, they also mimic human faults. One obvious way is that there are mistakes in their training materials, so they will evince some imperfections, and even contradictions (since the training materials contradict each other). Another is that their context windows are limited, just like ours. I liken their hallucinations to crappy code written by a tired human at 3AM after a 20-hour day.
If they are so human-like, we really cannot ascribe their output to plagiarism except when they are prompted to plagiarize.
LLMs just predict the next token. They mimic humans because they were trained on terabytes of human-created data (with no credit given to the authors of that data). They don't mimic human thinking: if they did, you could train them on their own output, but doing that leads to model collapse.
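For intuition, here is a minimal toy sketch of that collapse effect -- not a real LLM, just a Gaussian repeatedly refit on its own samples; the distribution, sample size, and generation count are made up for illustration:

    # Toy sketch of model collapse: each "generation" is trained only on
    # samples generated by the previous one. Estimation error compounds,
    # so the fitted distribution drifts away from the original data.
    import random
    import statistics

    random.seed(0)
    mu, sigma = 0.0, 1.0      # the original "human" data distribution
    n = 25                    # small sample size per generation

    for generation in range(30):
        samples = [random.gauss(mu, sigma) for _ in range(n)]
        mu = statistics.fmean(samples)    # refit on the previous model's output
        sigma = statistics.stdev(samples)
        print(f"gen {generation:2d}: mu={mu:+.3f} sigma={sigma:.3f}")

Over enough generations the fitted sigma typically wanders and shrinks, which is the same qualitative failure reported for models trained on their own generations: diversity in the tails gets lost first.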