I could go on and on, but Claude recently decompiled the firmware of my camper van, documented all the CAN interfaces, then programmed an ESP32 module to talk to the van’s integrated systems (power, HVAC, lighting, tanks). That sort of embedded systems integration is completely out of my wheelhouse.
I honestly don’t understand AI naysayers. I use Claude every day both professionally as a Solution Architect and personally in a variety of projects I simply could not have ever approached alone.
> projects I simply could not have ever approached alone.
I think that's part of the divide between enthusiasts and naysayers. If you use GenAI on things that you couldn't approach alone, it's an incredible tool. If you use it on stuff that you're pretty good at, it's not a gamechanger (and if you're an expert, it's a minor boost at best). Many people's jobs are about doing what they're expert at.
I'm about to complete a new non-trivial feature in a project for a customer of mine. I spent an hour writing the spec. Then I asked Claude (Sonnet 4.6) to check if I missed something. I did: the sort of minor issues one notices after starting to write code, edge cases etc. That made me think about more issues, and after a few iterations we settled on a spec. I asked Claude to make an implementation plan and we ended up with 9 steps. It wrote the code for a step with new automated tests and I performed some manual QA, which found further issues we hadn't thought about. We are at step 8 of 9 in about 12 hours of work. I would have needed a week to get there alone, with time spent researching and fixing bugs I created along the way, an inevitable part of our job but not exactly the most pleasant one.
This speedup is great. It improves the overall quality of the product (as perceived by the users) because I can ask Claude to add features that my customers and I would have dismissed because they take too long to implement. We would have settled for a more basic UX.
So is it a game changer? It is in the same way that HTML / CSS frameworks like Bootstrap were game changers: suddenly every developer could create a decent and consistent UI in a fraction of the time, with a few bells and whistles that we wouldn't have bothered coding. As a side effect, a lot of web apps felt like lookalike mass products and web designers had to reinvent themselves, but the economics led inevitably in that direction. Would I again spend one or two weeks doing alone what I could write in a day or two with an LLM? Not anymore, not at this cost ($20 per month).
I think part of it is that we often notice bad AI usage: the LLM-generated "art" by someone with bad taste, or the terrible patches to open source projects by people who can't program at all.
If the use is half decent, people just don't notice it.
If you work on architecture and Claude docs, then you can essentially just have it fill in the gaps. Work then mostly becomes a matter of defining what the next piece of functionality is (which you can also use Claude to help with).
The stuff that used to take days now takes hours. It's not perfect, but if you get your codebase into a good shape then the payoff is huge.
> If you use GenAI on things that you couldn't approach alone, it's an incredible tool.
I think this isn't true in all cases.
> If you use it on stuff that you're pretty good at, it's not a gamechanger (and if you're an expert, it's a minor boost at best).
I think even then there's a divide.
I mostly work greenfield projects (and love it!). For these, AI has been a literal game changer. Our projects are built faster, with one or two orders of magnitude more automated tests, and all quality metrics are up.
Meanwhile, nearly all of my friends complain that AI doesn't help them. But they mostly work in very large existing codebases.
Still, even in large projects I think AI (the expensive variant) has been a complete gamechanger for me. Sure, I spend a lot on tokens, but I just feel happier and enjoy what I do more. The refrain people repeat about "thinking at a higher abstraction level" is what I feel. I really am thinking about architecture and larger patterns, instead of the boring nitty-gritty (which wasn't boring at all when I was a kid learning to code!...)
I think a key factor in all of this, to me, has been dictation. Most of the time, I don't write -- I use voice-to-text. I don't even read what comes out of it -- the LLMs get it (it is mostly unintelligible to anyone else).
This means when I'm planning a big feature, I give a gigantic brain dump to the LLM in a perfect stream-of-consciousness way, going through ideas, pros and cons, edge cases, what exists, what doesn't exist, where I'm sure of something, where I'm not sure and want the LLM to browse the state of the art. Sometimes I spend 20 minutes just talking into the microphone before I send the first prompt. When I pair that with Opus, I find that I am able to build much faster and to go through alternative designs much more frequently as well.
I keep trying to tell all my friends: use voice-to-text and braindump to the computer. But they refuse... I couldn't imagine having to type everything nowadays. Even though I'm a fast typist, it's still much slower than the speed of my thought, which, granted, is still faster than the speed of my voice.
In effect, I filter much less, but I've come to think that's positive for the good LLMs: I throw all the edge cases and what ifs I'm thinking about -- all those years of experience dealing with similar systems.
If I wanted to go back to work in-office, that would be my major problem: I need to be able to talk with my computer all the time, loudly, and pacing through my room.
Same. I'm a DevOps engineer, so a jack of all trades master of none type of guy, and Claude Code backfills my knowledge gaps and turns me into kind of a superhero. I think it's key to already have a pretty good idea of what you're looking at, though.
Just as bad as the technical debt is the cognitive debt in your codebase. When something breaks, your only recourse is to ask the AI how to fix it, since it wrote it and you did not have time to review all of its code. Except now the code base is so large it won’t fit into the context window, and the AI can’t help you, and…you’re screwed.
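To make the context window point concrete, here is a rough back-of-the-envelope sketch. It assumes about 4 characters per token, which is only a rule of thumb and not a real tokenizer, and the function names and the 200,000-token window in the usage note are hypothetical values chosen for illustration.

```python
from pathlib import Path

# Assumption: ~4 characters per token is a rough heuristic, not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count alone."""
    return len(text) // CHARS_PER_TOKEN


def codebase_tokens(root: str, suffixes: tuple[str, ...] = (".py",)) -> int:
    """Sum the rough estimate over every file under `root` with a matching suffix."""
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            total += estimate_tokens(path.read_text(encoding="utf-8", errors="ignore"))
    return total


def fits_in_context(tokens: int, context_window: int, reserve: int = 0) -> bool:
    """True if the estimate fits while leaving `reserve` tokens for the model's reply."""
    return tokens + reserve <= context_window
```

Something like `fits_in_context(codebase_tokens("."), context_window=200_000, reserve=20_000)` gives a quick yes/no. If the answer is no, the model can only ever see a slice of the code at a time, which is the situation described above.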
If you're vibing such complex things you should probably be in the habit of also generating detailed documentation and commits so the AI can follow breadcrumbs. Add some playbooks for how to debug and it's actually pretty good. It's too complex for a local model's context though, so you're probably still correct, albeit there are ways to mitigate or delay this.
I’ll explain it: these tools are non-deterministic and people have different experiences with them. For a few people every interaction is totally fumbled and they think the cheerleaders of gen AI must be lying; for others the chatbot hits one home run after another and lets them add microcontrollers to their CAN bus. When these people’s good luck runs out and they start getting mixed results like the average user, they assert the service must have been downgraded.
I'll add to that: you are more likely to have a good experience if it has a lot of relevant data that it was trained on. You are also more likely to have a good experience if errors don't cause major issues.
So one-shotting a game of Snake should be great (tons of training data, errors are easily caught because it's a small program). Similar with building a lot of web UI front end, or one-shotting a personal project. On the other hand, I haven't been convinced that it's good enough to maintain large codebases or assist with niche topics that are not very well documented.
> if it has a lot of relevant data that it was trained on
This became evident to me the moment I tried to have these models work on some PowerShell tasks for me. Even Opus today struggles with PowerShell.
Since anything in PS is probably some internal sysadmin tool, there's not much public code out there outside of Microsoft's documentation. Plus the Verb-Noun naming scheme makes it really easy to just hallucinate cmdlets (which it does, often). It's easier to have the LLM just do things in Python using the M365 Graph API than with any of the provided PowerShell cmdlets.
OTOH, I've been using Claude for a lot of Swift & SwiftUI work lately and it has no problems there, and I'd imagine there's even less publicly available training data for that, so to be honest I'm not entirely sure why it fails so badly at PowerShell.
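For readers who haven't seen the Python route, a minimal sketch of what that can look like follows. It only builds a request against the Microsoft Graph v1.0 `/users` endpoint and does not send it. The function name is made up for illustration, and it assumes you already have a valid access token with permission to read users; obtaining that token is not shown.

```python
import urllib.parse
import urllib.request

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def build_list_users_request(access_token: str, top: int = 5) -> urllib.request.Request:
    """Build (but do not send) a GET request that lists users via Microsoft Graph.

    `access_token` is assumed to be a valid bearer token obtained elsewhere.
    """
    # Keep "$" and "," unescaped so the OData query options stay readable.
    query = urllib.parse.urlencode(
        {"$select": "displayName,userPrincipalName", "$top": str(top)},
        safe="$,",
    )
    return urllib.request.Request(
        f"{GRAPH_BASE}/users?{query}",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
        method="GET",
    )
```

Passing the result to `urllib.request.urlopen` would perform the call. The point is that the whole surface is a documented URL plus query options, so there is no cmdlet name for the model to invent.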
> On the other hand, I haven't been convinced that it's good enough to maintain large codebases or assist with niche topics that are not very well documented.
Same is true of humans. So far my experience is that addressing the issue with the help of AI is faster than not (i.e. comprehending the system and creating the documentation).
I don't understand comments of the "same is true of humans" kind.
This feels a bit like whataboutism.
It also feels like people don't listen to each other.
For example, reading the previous comment, it feels like the thing that reduced the enthusiasm was that at first GenAI looked like it was "reading, understanding and using its own knowledge to answer the problem", but as soon as it is a more niche or a more complex situation, GenAI looks like it "does not understand the code, just does the equivalent of a StackOverflow search and tries to apply the solutions that it found there, and this is why it felt like it understood the code before".
That does not at all mean that GenAI is not terribly useful, or even better than humans in some situations.
But answering "same with humans" misses this point. It's the opposite: humans usually try to understand the code and are bad at covering a very large range of very well documented subjects. That's the "uncanny valley" they talk about: they assumed GenAI's performance on a subject X was due to a "human-like" approach, and it feels very strange when this impression falls apart.
I still don’t get it. I can dictate a prompt, and sometimes I do it so quickly the text looks like a drunken parrot dictated it, and it still always gets exactly what I’m asking for. I’m just going to attribute malice to the naysayers.
Some people are really bad at specifying what they want to ask for. Or they already start prompting with the attitude that it can't possibly work so they don't even really try, or stop at the first failure to point and say how bad it is.
People are really, really bad at specifying what they actually want. I've worked in IT for my whole career, starting in help desk (now an IT manager). My days on the service desk were enough proof that people have no idea what they actually want, or at least, they really struggle to articulate it in words.
It's the famous "email broken, fix pls" but in the form of an LLM prompt.
Well, today's multimodal LLM agents with tools would at least have a good chance of doing something with even such an underspecified query. Because fixing things is simpler to specify, the agent could look at config, network settings, send a test email, take a screenshot etc. and get a good idea of what's broken. But when you want some new feature or new app, you can't do without actually asking for specifics, or at least you shouldn't complain if it didn't read your mind correctly. Or at least accept that you have to iterate. I think many average people can get this if they are motivated, and they can incrementally say what they don't like even in vague terms and it can get better. But some just stop without trying to ask for changes.
It can be frustrating to observe people interacting with these things. But it was just as frustrating 20 years ago, so maybe it's just a constant.
Similarly, doing service desk, the thing that makes me flip the table is how people start by explaining what does not work, instead of explaining what they are trying to do.
It's hard even at the highest levels, such as in writing scientific papers or giving scientific conference talks. People just generally have a hard time stepping outside of their own context and thinking with the head of someone who has a different set of facts and assumptions in theirs. It's hard to know how much context you both share, and how to tailor the explanation so you don't start from Adam and Eve but explain just enough context and strip irrelevant tangents.
I don't think this is just about intention and willingness; it's simply hard.
Or maybe people see how complex the code is and all the failure points, and don’t feel it’s ethical to use the output. In most of the comments, the most relevant point is that the poster is not an expert in the domain they got help with. While they can observe the result, they don’t have a causal model of the situation.