Are you not giving it enough information to work with? All of these issues you and the parent comment mentioned can be worked around by telling it HOW to do things.
The whole shtick of LLMs is that they can do stuff without being told explicitly. Not sure why people are blamed for using them based on that expectation...
Yes, it can. So can I. But neither of us will write the code exactly the way nitpicky PR reviewer #2 demands it be written unless he makes his preferences clear somewhere. Even at a nitpick-hellhole like Google, that's mostly codified into a massive number of readability rules, which can in theory be found and followed. Elsewhere, most reviewer preferences are individual quirks that you have to pick up on over time, and that's the kind of stuff that neither new employees nor Claude will ever get right in one shot.
You can tell it how to do things, but sometimes it still goes off on its own. I have some variant of "do not deviate from the plan" in my instructions, and yet sometimes, if you watch while it's coding, you'll see "ah, this is too hard as per the plan, let me take this shortcut" or "this pre-existing test fails, but it's not an issue with the code I just wrote, so let's just 'fix' the test."
For simple scripts and simple self-contained problems, fully agentic YOLO mode mostly works, but as soon as it's an existing codebase or the plans get more complex, I find I have to handhold Claude a lot more, and if I leave it to its own devices I find things later. I've also found that having it update the plan with what it did, and then review the plan afterwards, will still turn up deviations left in the codebase.
Like the other day, I had a plan to refactor something due to data model changes, specifying very clearly that this was an intentional breaking change (greenfield project under development). It left behind all the existing code to preserve backwards compatibility, and went through many code contortions to make that happen, so much so that I had to redo the whole thing.
Sometimes it does feel like Anthropic turns the intelligence up or down (I always run Opus with high reasoning), but sometimes it seems it's just the nature of things: it is not deterministic, and sometimes it will just go off and do what it thinks is best whether or not you prompt it not to. (If you ask it later why it did that, it will apologize with some variation of "well, it made sense at the time.")
There are countless ways it can write code that still isn't how I want it. Sometimes it's easier to write the correction against code that has already been generated, since you at least have a concrete reference, than to describe code that doesn't yet exist. I don't think it's solvable in general until they have the Neuralink skill that senses my approval as each token materializes and autocorrects to the golden path based on whether I'm making a happy or frowny face.
Stop thinking like a programmer and start thinking like a business person. Invest time and energy in thinking about WHAT you want; let the LLM worry about the HOW.
The thing is that the HOW of today becomes the context of someone else's session tomorrow. That person may not be as knowledgeable about that particular part of the codebase (or the domain), their LLM will base its own solution on today's unchecked output, and it will, inevitably, stray a little further from the optimum. So far I haven't seen any mechanism or workflow that consistently pushes in the opposite direction.
Technically that's true, but unless you literally write every single line of code yourself, the LLM will find a way to smuggle in some weirdness. Usually it isn't that bad, but it definitely requires quite a lot of attention.