Whenever someone posts AI-written content, at least half the comments on this site are calling it out and saying they stopped reading. I think it's obvious at this point that AI has a certain writing profile, which includes blandness and punchy statements that are thin or vapid on inspection.
Sure, everyone deserves some share of the blame, but it's like 10% for the Ds and 90% Rs. We can't keep talking like it's 50/50, that's how people become completely disenchanted with politics and don't even bother to vote.
> That wouldn’t even be a big violation of the vibe coding concept. You’re reading the innards a little but you’re only giving high-level, conceptual, abstract ideas about how problems should be solved. The machine is doing the vast majority, if not literally all, of the actual writing.
Claude Code is being produced at AI Level 7 (Human specced, bots coded), whereas the author is arguing that AI Level 6 (Bots coded, human understands somewhat) yields substantially better results. I happen to agree, but I'd like to call out that people have wildly different opinions on this; some people say that the max AI Level should be 5 (Bots coded, human understands completely), and of course some people think that you lose touch with the ground if you go above AI Level 2 (Human coded with minor assists).
It's also a context-specific scale. I work in computer vision. Building the surrounding app, UI, checkout flow, etc., is easily Level 6/7 (sorry...) on this scale.
Building the rendering pipeline, algorithms, maths, I've turned off even level 2. It is just more of a distraction than it's worth for that deep state of focus.
So I imagine at least some of the disconnect comes from the area people work in and its novelty or complexity.
This is exactly true in my experience! The usefulness of AI varies wildly depending on the complexity, correctness-requirements, & especially novelty of the domain.
This attribute plus a bit of human tribalism, social echo-chambering, & some motivated reasoning by people with a horse in the race, easily explains the discord I see in rhetoric around AI.
Far from solved! Though, like seemingly everything, it has benefited from the transformer architecture. And computer vision is kind of the "input"; it usually sits intersecting with some other field, e.g. CV for medical analysis is different from self-driving, which is different from reconstruction for games/movies.
I like this framing, but it does seem to imply that a whole dev shop, or a whole product, can or should be built at the same level.
The fact is, I think the art of building well with AI (and I'm not saying it's easy) is to have a heterogeneously vibe-coded app.
For example, in the app I'm working on now, certain algorithmically novel parts are level 0 (I started at level 1, but this was a tremendously difficult problem and the AI actually introduced more confusion than it provided ideas.)
And other parts of the app (mostly the UI in this case) are level 7. And most of the middleware (state management, data model) is somewhere in between.
Identifying the appropriate level for a given part of the codebase is IMO the whole game.
100% agree. Velocity at level 8 or even 7 is a whole order of magnitude faster than even level 5. Like you said, identifying the core and letting everything else move fast is most of the game. The other part is finding ways to up the level at which you’re building the core, which is a harder problem.
Disagree, I don't particularly want to up the level at which I'm building the core. Core is where I want to prioritize quality over speed, and (at least with today's models) what I build by hand is much, much higher quality.
I'm at a 5, and only because I've implemented a lot of guardrails, am using a typed functional language with no nulls, TDD red/green, and a good amount of time spent spec'ing. No way I'd be comfortable enough this high with a dynamic language.
I could probably get to a 7 with some additional tooling and a second max 20 account, but I care too much about the product I'm building right now. Maybe for something I cared less about.
IMO if you're going 7+, you might as well just pick a statically typed and very safe (small surface area) language anyways, since you won't be coding yourself.
You aren't leveling up here... these levels are simple measures of how you use the tools to do something. You can regularly do things from any level or multiple levels at the same time.
I don't know why you're being downvoted, I agree that "more != better" with these levels. It's just a descriptor of how much human vs AI attention was given to a task/PR.
That's an interesting list. I think that the humans that will make the most progress in the next few years are the ones that push themselves up to the highest level of that list. Right now is a period of intense disruption and there are many coders that don't like the idea that their way of life is dead. There are still blacksmiths around today, but for the most part metalwork is done by factories and cheap 3rd world labor. I think the same is currently happening with coding, except it will allow single builders and designers to do the same thing as an entire team 5 years ago.
> I think the same is currently happening with coding, except it will allow single builders and designers to do the same thing as an entire team 5 years ago.
This part of your post I think signals that you are either very new or haven't been paying attention; single developers were outperforming entire teams on the regular long before LLMs were a thing in software development, and they still are. This isn't because they're geniuses, but rather because you don't get any meaningful speedup out of adding team members.
I've always personally thought there is a sweet spot at about 3 programmers where you still might see development velocity increase, but that's probably wrong and I just prefer it to not feel too lonely.
In any case teams are not there to speed anything up, and anyone who thinks they are is a moron. Many, many people in management are morons.
Thanks for that list of levels, it's helpful to understand how these things are playing out and where I'm at in relation to other engineers utilizing LLM agents.
I can say that I feel comfortable at approximately AI level 5, with occasional forays to AI level 6 when I completely understand the interface and can test it but don't fully understand the implementation. It's not really that different from working on a team, with the agent as a team member.
> some people say that the max AI Level should be 5
> of course some people think that you lose touch with the ground if you go above AI Level 2
I really think that this framing sometimes causes a loss of granularity. As with most things in life, there is nuance in these approaches.
I find that nowadays, for my main project where I am really leaning into the 'autonomous engineering' concept, AI Level 7 is perfect, as long as it is qualified through rigorous QA processes on the output (i.e. it matters less how the code does it if the output is verifiably correct). But even in this project where I am really leaning into the AI 'hands-off' methodology, there are a few areas that dip into Level 5 or 4, depending on how well AI does them (frontend design especially) or on the criticality of the feature (in my case E2EE).
The most important thing is recognizing when you need to move 'up' or 'down' the scale and having an understanding of the system you are building.
At work I am at level 4, but my side projects have embarrassingly crept into Level 6. It is very tempting to accept the features as is, without taking the time to understand how they work.
To clarify, does this mean that Anthropic employees don't understand Claude Code's code since it's level 7? I've got to believe they have staff capable of understanding the output and they would spend at least some time reviewing code for a product like this?
This is why we need some kind of professional accountability for software engineers. This behavior is willful malpractice, and it only flies because they know they'll never face consequences when it goes wrong. Let's change that.
I'm not sure I believe them though, at face value anyway. Or at least, I would suspect the entire spectrum of levels 0-9 are constantly at play at Anthropic (or any sizeable company). Fully disavowing the code as a matter of policy seems needlessly reckless.
(Thanks for visidata btw, awesome tool that helped me with a side project not long ago.)
I’m not sure I believe that Level 7 exists for most projects. It is utterly *impossible* for most non-trivial programs to have a spec that doesn’t embed deep, carnal knowledge of the implementation. It cannot be done.
For most interesting problems the spec HAS to include implementation details, architecture, and critical data structures. At some point you’re still writing code, just in a different language, and it might actually have been better to just write the damn struct declarations by hand and then let AI run with it.
I agree, I'm venturing into Level 6 myself and it often feels like being one step too high on a ladder. Level 7 feels like just standing on the very top of the ladder, which is terrifying (to me anyway as an experienced software engineer).
To me it’s not terrifying because it’s just so plainly bad and not good enough. If you try L7 it just doesn’t work. Unless you’re making a dashboard in which case sure yeah it’s fine. But not for complex problems.
> as if debugging python code is the point of it all.
You have a good point, but I would argue that debugging itself is a foundational skill. Like imagine Sherlock Holmes being able to use any modern crime-fighting technology, and using it extensively. If Sherlock is not using his deductive reasoning, then he's not a 'detective'. He's just some schmuck who has a cool device to find the right/wrong person to arrest.
Debugging is "problem-solving" in a specific domain. Sure, if the problem is solved, then I guess that's the point of it all and you don't have to solve the problem. But we're all looking towards a world in which people have to solve problems, but their only problem-solving skill is trying to get an AI to find someone to arrest. We need more Sherlocks to use their minds to get to the bottom of things, not more idiot cops who arrest the wrong person because the AI told them to.
This is like asking Claude to explain some aspect of physics to you. It'll 'feel' like you understand, but in order to really understand you have to work those annoying problems.
Same with anything. You can read about how to meditate, cook, sew, whatever. But if you only read about something, your mental model is hollow and purely conceptual, having never had to interact with actual reality. Your brain has to work through the problems.
> ...in order to really understand you have to work those annoying problems.
GP says that they have to come back tomorrow and edit the code to fix something. That's a verification step: if you can do that (even with some effort) you understand why the AI did what it did. This is not some completely new domain where what you wrote would apply very clearly, it's just a codebase that GP is supposed to be familiar with already!
There are 720 hours in a month. You'd have to be running 3 sessions in parallel continuously to be doing thousands of session-hours in a month. Are individual people really doing this?!
I work with 3-5 parallel sessions most of the time. Some of the projects are related, some are not, some sessions are just managing and tuning my system configuration, whatever it means at a given time.
In my OP I mention this is aggregated across both work + personal, so the comparison of just 8 hour workdays 5 days a week isn't accurate.
Running some `/stats` on my work computer shows for the last 30 days:
* Sessions: 341
* Active days: 21/30
* Longest session: 3d 20h 33m (Some large scale refactoring of types)
So I'm running a little over 10 sessions a day, each session varies from something like 1-2 hours to sometimes multiple days if it's a larger project. Running `/clear` actually doesn't start a new session fwiw, it will maintain the session but clear context, which explains why I can have a 3 day long session but I'm not actually using a single context window.
On the personal side I have activity on 30/30 of the last 30 days (:yay); I've been learning game dev recently and use Claude a lot to help digest documentation and learn about certain concepts as I try to build them in Unity. One of my more interesting use-cases is I have three skills I use during play tests:
* QA-Feedback: Takes random thoughts / feedback from me and writes to feedback markdown files
* Spec-Feedback: Loops every minute to grab a feedback item and spec out the intention / open questions
* Impl-Feedback: Loops every minute to grab a spec, clarify open questions with the user (me) first, then create an implementation plan
So I might have a friend play my game and I'll generate 20-30 items of feedback as I watch them play the game, things like minor bugs or mechanics in general. Over the course of the day my Claude will spec and plan out the feedback for me. I have remote sessions always on so I can use my phone to check in on the implementor job and answer open ended questions as they come up.
By the following day I'll usually have a bunch of plans ready for Claude to work on. I'll send agents off to do the simple ones throughout the day (bugs) and work with Claude on the bigger items.
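To make the pipeline above concrete, here is a minimal TypeScript sketch of the spec-feedback step; the item/spec shapes and the function name are invented for illustration, and a real skill would run this on a timer (e.g. once a minute) and persist everything to markdown files rather than an in-memory queue.

```typescript
// Hypothetical shapes for the playtest-feedback pipeline described above.
interface FeedbackItem { id: number; note: string }
interface Spec { feedbackId: number; intent: string; openQuestions: string[] }

// Promote the oldest raw feedback item into a spec stub with open questions
// for the user to answer before implementation planning begins.
function specNextFeedback(queue: FeedbackItem[]): Spec | undefined {
  const item = queue.shift();
  if (item === undefined) return undefined; // nothing pending this tick
  return {
    feedbackId: item.id,
    intent: `Address playtest note: ${item.note}`,
    openQuestions: ["Is this a bug or intended behavior?"],
  };
}

const queue: FeedbackItem[] = [
  { id: 1, note: "jump feels floaty" },
  { id: 2, note: "checkout button overlaps HUD" },
];
const spec = specNextFeedback(queue);
```

The nice property of splitting QA/spec/impl this way is that each stage is cheap to re-run and the human only has to answer the open questions, not triage raw notes.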
Sorry for the long-winded explanation, but I'm trying to convey the level of usage I have w/ Claude Code. I do admit "thousands" is hyperbolic, as I'm probably only nearing 2k session hours in the most extreme months, but I would say on average I use Claude every day to some capacity, often both during work and after work (for my hobbies).
Great, thank you for the detailed response! The biggest difference in our use is your "loops every minute", which I've not been willing to try yet (even with me at the helm, Claude might try to make a fairly straightforward bugfix in a cracked-out way and I have to steer it in the right direction).
I also love using `/loop` at work in combination with a PR maintenance skill; it helps me push up changes initially and have a session automatically monitor + fix up a branch to get it passing before I review it myself and then later send it off for a human review.
From your first link, it says 10% of 28k employees in India were cut. I personally know several people who were laid off from Oracle this week (OCI). One person who's still there described it as a "bloodbath across our division" and says he counted 15k. I don't know what exactly he was counting but as we're in North America I am assuming they're all here. Whereas India layoffs were fewer than 3k. So that directly disputes your statement that "they've barely fired any American workers".
Can confirm. Friend was laid off from a team of 15; that team is now down to 7. They built datacenters, too. US based. That's sorta concerning, since I thought their entire future bag was making datacenters...
Yes, and, the world would be better off if the price of oil were higher. We would produce less plastic crap and take fewer frivolous airplane trips and take more public transit. Our petroleum consumption is based on underpriced oil.