Sure, but you then can't access the computronium, because communication is so bad.
Also, I wonder if good old-fashioned computing is interesting at all to a civilization that's had access to advanced AI and quantum computing for a while.
Like we haven't really figured out how to get an ML model to run on a quantum computer, or how to build a quantum-native computer (i.e. surface of a black hole, or some other way that doesn't rely on our current sense of quantum error correction), but I don't know of any physical laws that preclude it.
I'd bet if aliens invaded our galaxy, they'd go for the super black holes in the center, or some other resource beyond our use and understanding, not this random water planet on the edge.
It appeals to me because if you've ever taken a flight you can see how the details get progressively erased as you lift. Details that matter for a lot of reasons even if you can't see them.
It's also kind of interesting that they don't think they can do what an economy would normally do in this situation, which is raise prices until supply matches. Shortages generally imply mispricing.
There's a lot of angles you take from that as a starting point and I'm not confident that I fully understand it, so I'll leave it to the reader.
Ads do not pay enough to cover AI usage. People see the big numbers Google and Facebook make in ads and forget to divide the number by the number of people they serve ads to, let alone the number of ads they served to get to that per-user number. You can't pay for 3 cents of inference with .07 cents of revenue.
You also can't put ads in code completion AIs because the instant you do the utility to me of them at work drops to negative. Guess how much money companies are going to pay for negative-value AIs? Let's just say it won't exactly pay for the AI bubble. A code agent AI puts an ad for, well, anything and the AI accidentally puts it into code that gets served out to a customer and someone's going to sue. The merits of the case won't matter, nor the fact the customer "should have caught it in review", the lawsuit and public reputation hit (how many people here are reading this and salivating at the thought of being able to post an angrygram about AIs being nothing but ad machines?) still cost way too much for the AI companies creating the agents to risk.
I speculatively fired Claude Opus 4.6 at some code I knew very well yesterday as I was pondering the question. This code has been professionally reviewed about a year ago and came up fairly clean, with just a minor issue in it.
Opus "found" 8 issues. Two of them looked like they were probably realistic but not really that big a deal in the context it operates in. It labelled one of them as minor, but the other as major, and I'm pretty sure it's wrong about it being "major" even if is correct. Four of them I'm quite confident were just wrong. 2 of them would require substantial further investigation to verify whether or not they were right or wrong. I think they're wrong, but I admit I couldn't prove it on the spot.
It tried to provide exploit code for some of them, none of the exploits would have worked without some substantial additional work, even if what they were exploits for was correct.
In practice, this isn't a huge change from the status quo. There's all kinds of ways to get lots of "things that may be vulnerabilities". The assessment is a bigger bottleneck than the suspicions. AI providing "things that may be an issue" is not useless by any means but it doesn't necessarily create a phase change in the situation.
An AI that could automatically do all that, write the exploits, and then successfully test the exploits, refine them, and turn the whole process into basically "push button, get exploit" is a total phase change in the industry. If it in fact can do that. However based on the current state-of-the-art in the AI world I don't find it very hard to believe.
It is a frequent talking point that "security by obscurity" isn't really security, but in reality, yeah, it really is. An unknown but presumably staggering number of security bugs of every shape and size are out there in the world, protected solely by the fact that no human attacker has time to look at the code. And this has worked up until this point, because the attackers have been bottlenecked on their own attention time. It's kind of just been "something everyone knows" that any nation-state level actor could get into pretty much anything they wanted if they just tried hard enough, but "nation-state level" actor attention, despite how much is spent on it, has been quite limited relative to the torrent of software coming out in the world.
Unblocking the attackers by letting them simply purchase "nation-state level actor"-levels of attention in bulk is huge. For what such money gets them, it's cheap already today and if tokens were to, say, get an order of magnitude cheaper, it would be effectively negligible for a lot of organizations.
In the long run this will probably lead to much more secure software. The transition period from this world to that is going to be total chaos.
... again, assuming their assessment of its capabilities is accurate. I haven't used it. I can't attest to that. But if it's even half as good as what they say, yes, it's a huge huge huge deal and anyone who is even remotely worried about security needs to pay attention.
I don't need to conduct 1000 transactions per day. I don't forsee a world in which it will be some sort of fatal inconvenience to need to approve all purchases. I certainly don't plan on ever just handing over my credit card to an LLM, due to its fundamental architectural issues with injection, and I still don't anticipate handing it over to any future AI architecture anytime soon because I struggle to imagine what benefits could possibly be worth the risk of taking down such a basic, cheap barrier.
Agreed. My only real complaint with this article is it frames needing to argue with a machine as though this is a new, freshly annoying thing. I already do this constantly.
Every time I call the Costco pharmacy, I just hit 0 immediately because: Phone. Trees. Suck. They have always sucked, it's just an awful, grindingly slow way to accomplish ANYTHING, and it's so, so much easier to, when I need help, get a person on the line who can figure out what's gone wrong and sort it.
The only people benefiting from cutting that down are the scum class (combo of shareholders and executives) and who's shocked, really. Everything is being ruined nearly at all times to benefit the scum class.
At least phone trees are deterministic and there's still (usually) an option to get to a person for matters that aren't covered by the multiple choice options. Talking to AI is a much worse experience and the hope of the industry is that there won't need to be a human as a fallback anymore because (they believe) the AI is intelligent enough to handle anything.
The very very very lovely executives at Intuit (thank you for your contribution to society boize) have a good plan for calling their TurboTax help line: you don't spell your name to the robot, you don't talk to a human.
(unless saying "no" / "agent" etc. the fifteen time would've been the trick! Sure, my name can be "O K"...
(I would def love this system if I worked there though, just surprising it didn't have an offramp along the way... maybe they did but everyone used it)
I'm surprised to find so many people who consider human-based customer support a good experience. I wasted an hour on the phone last month with a series of polite support agents who I'm sure were wonderful people in their personal lives. They kept saying they'd like to try one more thing, making me wait 5 minutes (just short enough that I can't get anything done in the interim!), and then asking for one more pointless permutation of the workflow that did not work because their website was not showing me a button the support scripts said should be there. Talking to an LLM would have let me realize a lot faster that we weren't getting anywhere.
This happened to me when I tried to buy Oakley’s, it was because I’d changed my router to an ad blocking DNS which made their support session lookups fail, so they couldn’t help me. Transactions failing, all because of their site being too tightly integrated into tracking and ad platforms. I ended up going with Zenni and got similar glasses for 1/5 the price.
> because their website was not showing me a button the support scripts said should be there.
At that point, it's effectively a phone tree executed by a human. Colloquially, human-based support means getting a hold of someone who knows how to solve problems, and worst case, knowing who to contact to solve the problem. That means employees who know their worth which unfortunately, businesses do not want to pay.
At the risk of going against the gestalt, Facebook openly and publicly rejecting the ads is actually one of the better outcomes. They could have just put their thumbs on the scale, deprioritizing them, serving them to people they think are least likely to bite, etc. Lying about the number of times it was served because, after all, who can check? Many of us suspect the ad platforms already do this pretty routinely through one mechanism or another anyhow, after all.
It isn't reasonable to ask a platform to host content that is literally about suing them, not because of "freedom" concerns or whether or not Facebook is being hypocritical, but more because in the end there isn't a "fair" way for them to host that. The constraints people want to put on how Facebook would handle that ends up solving down to the null set by the time we account for them all. Open, public rejection is actually a fairly reasonable response and means the lawyers at least know what is up and can respond to a clear stimulus.
> It isn't reasonable to ask a platform to host content that is literally about suing them
Explicit rejection is better than opacity but better still is public accountability. Meta’s properties have a combined userbase that amounts to just over 1 in 4 people on earth; these platforms should have been regulated as utilities a long time ago. Suppose I wanted to run ad campaigns advocating for antitrust legislation targeting social media companies and ended up getting booted off of all of the major platforms; what feasible method is there for me to advance these ideas that could possibly compete with the platforms’ own abilities to influence public opinion?
You can really see this in the recent video generation where they try to incorporate text-to-speech into the video. All the tokens flying around, all the video data, all the context of all human knowledge ever put into bytes ingested into it, and the systems still completely routinely (from what I can tell) fails to put the speech in the right mouth even with explicit instruction and all the "common sense" making it obvious who is saying what.
There was some chatter yesterday on HN about the very strange capability frontier these models have and this is one of the biggest ones I can think of... a model that de novo, from scratch is generating megabyte upon megabyte of really quite good video information that at the same time is often unclear on the idea that a knock-knock joke does not start with the exact same person saying "Knock knock? Who's there?" in one utterance.
By the nature of the LLM architecture I think if you "colored" the input via tokens the model would about 85% "unlearn" the coloring anyhow. Which is to say, it's going to figure out that "test" in the two different colors is the same thing. It kind of has to, after all, you don't want to be talking about a "test" in your prompt and it be completely unable to connect that to the concept of "test" in its own replies. The coloring would end up as just another language in an already multi-language model. It might slightly help but I doubt it would be a solution to the problem. And possibly at an unacceptable loss of capability as it would burn some of its capacity on that "unlearning".
One of the reasons I'm comfortable using them as coding agents is that I can and do review every line of code they generate, and those lines of code form a gate. No LLM-bullshit can get through that gate, except in the form of lines of code, that I can examine, and even if I do let some bullshit through accidentally, the bullshit is stateless and can be extracted later if necessary just like any other line of code. Or, to put it another way, the context window doesn't come with the code, forming this huge blob of context to be carried along... the code is just the code.
That exposes me to when the models are objectively wrong and helps keep me grounded with their utility in spaces I can check them less well. One of the most important things you can put in your prompt is a request for sources, followed by you actually checking them out.
And one of the things the coding agents teach me is that you need to keep the AIs on a tight leash. What is their equivalent in other domains of them "fixing" the test to pass instead of fixing the code to pass the test? In the programming space I can run "git diff *_test.go" to ensure they didn't hack the tests when I didn't expect it. It keeps me wondering what the equivalent of that is in my non-programming questions. I have unit testing suites to verify my LLM output against. What's the equivalent in other domains? Probably some other isolated domains here and there do have some equivalents. But in general there isn't one. Things like "completely forged graphs" are completely expected but it's hard to catch this when you lack the tools or the understanding to chase down "where did this graph actually come from?".
The success with programming can't be translated naively into domains that lack the tooling programmers built up over the years, and based on how many times the AIs bang into the guardrails the tools provide I would definitely suggest large amounts of skepticism in those domains that lack those guardrails.
reply