That’s why you should always make your agent write tests first and run them to make sure they fail and aren’t testing that 1+1=2. Then and only then do you let it write the code that makes them pass.
That creates an enormous library of documentation of micro decisions that can be read by the agent later.
Your codebase ends up with executable documentation of how things are meant to work. Agents can do archaeology on it and demystify pretty much anything in the codebase. It makes agents’ amnesia a non-issue.
I find this far more impactful than keeping specs around. It detects regressions and doesn’t drift.
That’s why you should always make your agent write tests first and run them to make sure they fail and aren’t testing that 1+1=2. Then and only then do you let it write the code that makes them pass.
That creates an enormous library of documentation of micro decisions that can be read by the agent later.
Your codebase ends up with executable documentation of how things are meant to work. Agents can do archaeology on it and demystify pretty much anything in the codebase. It makes agents’ amnesia a non-issue.
I find this far more impactful than keeping specs around. It detects regressions and doesn’t drift.