I don't think "results don't match promises" is the same as "not knowing how to use it". I've been using Claude and OpenAI's latest models for the past two weeks now (probably moving at about 1000 lines of code a day, which is what I can comfortably review), and it makes subtle, hard-to-find mistakes all over the place. Or it just misunderstands well-known design patterns, or does something boneheaded. I'm fine with this! But that's because I'm asking it to write code that I could write myself, and I'm actually reading it. This whole "it can build a whole company for me and I don't even look at it!" is overhype.
Prompting LLMs for code simply takes more than a couple of weeks to learn.
It takes time to get an intuition for the kinds of problems a model has seen in pre-training, what environments it faced in RL, and what bizarre biases and blind spots it has. Learning to Google was hard, learning to use other people's libraries was hard, and this is on par with those skills at least.
If there is a well-known design pattern you know, that's a great thing to shout out. Knowing what to add to the context takes time and taste. If you are asking for pieces so large that you can't trust them, ask for smaller pieces and their composition. It's a force multiplier, and your taste for abstractions as a programmer is one of the factors.
In early Usenet/forum days, the XY problem described users asking for implementation details of their X solution to the Y problem, rather than asking how to solve Y. In LLM prompting, people fall into the opposite trap. They have an X implementation they want to see, and rather than ask for it, they describe the Y problem and expect the LLM to arrive at the same X solution. Just ask for the implementation you want.
Asking bots to ask bots seems to be another skill as well.
Let me clarify, I've been using the latest models for the last two weeks, but I've been using AI for about a year now. I know how to prompt. I don't know why people think it's an amazing skill, it's not much different from writing a good ticket.
Writing a good ticket is not a common skill. IMO it seems deceptively easy but usually requires years of experience to understand what to include and express it in the most concise yet unambiguous terms possible for the intended audience.
Do you use an agent harness to have it review code for you before you do?
If not, you don't know how to use it efficiently.
A large part of using AI efficiently is to significantly lower that review burden by having it do far more of the verification and cleanup itself before you even look at it.
This is correct, but part of the issue is that it significantly increases token usage costs. Some companies are doing:
- PRD and spec fulfillment review
- code review + correction loops
- security review + corrections
- addl. test coverage and tidying
- addl. type checks and tidying
- addl. lint checks and tidying
- maybe more I haven't listed
And these are run after each commit, so you can only imagine the costs per engineer doing this 10, 20, 50+ times per day depending on how much work they're knocking out.
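For concreteness, that kind of post-commit loop can be sketched as a script that runs each check and collects the failures. The check names and commands below are harmless stand-ins run via the current interpreter, not any particular company's pipeline; a real setup would invoke a linter, a type checker, the test suite, a security scanner, and so on.

```python
# Sketch of a post-commit verification loop. The commands are placeholder
# stand-ins; a real harness would shell out to actual tooling.
import subprocess
import sys

def run_checks(checks):
    """Run each (name, argv) pair; return the names of the checks that failed."""
    failures = []
    for name, argv in checks:
        result = subprocess.run(argv, capture_output=True)
        if result.returncode != 0:
            failures.append(name)
    return failures

checks = [
    ("lint",  [sys.executable, "-c", "pass"]),                     # stand-in: passes
    ("types", [sys.executable, "-c", "pass"]),                     # stand-in: passes
    ("tests", [sys.executable, "-c", "import sys; sys.exit(1)"]),  # stand-in: fails
]

print(run_checks(checks))  # ['tests']
```

Each of those checks costs tokens only if an agent is doing the reviewing; the point of the cost concern above is that several of these stages are themselves LLM passes, run on every commit.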
I think if you were to scale that kind of usage across a reasonable team size, costs would start to add up fast, possibly beyond the cost of paying another engineer every year, especially if a lot of your teammates are new to AI or aren't using it efficiently. Of course, it all depends on the appetite of the company.
The other constraint: for those being laid off (perhaps as part of cost reductions to fund an AI budget for a smaller team), engineers who want to expand their skill set and practice this level of usage and efficiency effectively can't do so on their own funding, which makes it harder to find employment as expectations rise.
Prior to AI entering the fray, software development was largely free for everyone, allowing anyone with enough time and motivation to build the skills towards gainful employment. As AI becomes more prevalent and expectations around how it's used become higher, fewer and fewer applicants will be able to claim they have the experience necessary because it was out of reach due to costs.
> Do you use an agent harness to have it review code for you before you do?
Right now you need to be Uncle Moneybags to do this in your personal life.
If you're lucky, your employer is footing the bill but otherwise... Ugh. It's like converting your app running perfectly fine on a cheap VPS to AWS Lambda. In theory, it's fine but in reality the next bill you get could make you faint.
It's down to how much you value your time. If you value your time low enough, it doesn't pay to have AI take over. If you value it high enough, it does.
I have it run tests and every few days I ask it to do a code quality analysis check on the codebase.
I'm unconvinced AI reviewing AI is the answer here, because all LLMs share the same flaws. To me, the harness/guardrails for AI should be different technologies that work differently and in a more formal sense: static code analysis, linters, tests, etc.
(Linting has actually been, by far, the BEST code quality enforcers for the agents I've run so far, and it's a lot cheaper and more configurable than running more agents.)
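As a minimal illustration of that kind of deterministic gate, here is a sketch that rejects generated code before anyone (human or agent) looks at it further. Python's own compile step stands in for a real linter here purely to keep the example self-contained; an actual harness would shell out to a linter or type checker and feed the failures back for another pass.

```python
# Minimal sketch of gating generated code on a deterministic check.
# compile() is a cheap stand-in for a lint pass: does the code even parse?

def passes_static_check(source: str) -> bool:
    """Return True if the generated source at least compiles."""
    try:
        compile(source, "<generated>", "exec")
        return True
    except SyntaxError:
        return False

good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b)\n    return a + b\n"   # missing colon

print(passes_static_check(good))  # True
print(passes_static_check(bad))   # False
```

Unlike an LLM reviewer, a check like this is deterministic, nearly free to run, and can't share the generator's blind spots.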
If you know good architecture and you are testing as you go, I would say it is probably pretty damn close to being able to build a company without looking at the code. Not without "risk", but definitely doable and plausible.
My current project that I started this weekend is a Rust client-server game with the client compiled to WebAssembly.
I do these projects without reading the code at all as a way to gauge what I can possibly do with AI without reading code, purely operating as a PM with technical intuition and architectural opinions.
So far Opus 4.6 has been capable of building it all out. I have to catch issues and I have asked it for refactoring analysis to see if it could optimize the file structure/components, but I haven't read the code at all.
At work I certainly read all the code. But I would recommend people try to build something non-trivial without looking at the code. It does take skill, though, so maybe start small and build up intuition for where these models have issues. I think you'll be surprised how much your technical intuition can scale even when you are not looking at the code.
That is why I said "risk". Though the models are pretty good "if" you ask for security audits. Notice I didn't say you could do it without technical knowledge right now; you need to know to ask for a security review.
I have friends in security at major platforms who are impressed by the security reviews from the SOTA models. Certainly better than the average bootstrapped founder.
Simultaneous turn-based top-down car combat where you design the cars first. Inspired by Car Wars, but taking advantage of computers, so spline-based path planning and a much more complicated way of calculating armor penetration and damage.
> and it makes subtle hard-to-find mistakes all over the place.
I agree. I'm constantly correcting the code it generates. But then, I do the same for humans when I review their PRs, and the LLM generated the code in a hundredth of the time (or whatever figure you prefer).
And yet, this is exactly what my last job's engineering & product leadership did with their CEO at the helm, before they laid me off.
They vibe-coded a complete rewrite of their products in a few months without any human review. Hundreds of thousands of LOC. I feel sorry for the remaining engineers, who now have to learn everything that was just generated and is already in front of customers.