Hacker News | simianparrot's comments

A single byte change in the input changes the output. The sentence "Please do this for me" and "Please, do this for me" can lead to completely distinct output.

Given this, you can't treat it as deterministic even with temp 0 and fixed seed and no memory.


Interestingly, this is the mathematical definition of "chaotic behaviour"; minuscule changes in the input result in arbitrarily large differences in the output.

It can arise from perfectly deterministic rules... the Logistic Map with r=4, x(n+1) = 4*x(n)*(1 - x(n)), is a classic.
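To make the sensitivity concrete, here is a minimal sketch that iterates the standard logistic map x(n+1) = r*x(n)*(1 - x(n)) from two starting points that differ by 1e-10. The starting value 0.2, the perturbation size and the step count are arbitrary choices for illustration.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x(n+1) = r * x(n) * (1 - x(n)) and return the whole orbit."""
    orbit = [x0]
    for _ in range(steps):
        x = orbit[-1]
        orbit.append(r * x * (1.0 - x))
    return orbit


# Two starting points that differ by one part in ten billion (arbitrary choice).
a = logistic_orbit(0.2, 60)
b = logistic_orbit(0.2 + 1e-10, 60)

# The rule is fully deterministic: the same input always gives the same orbit.
assert a == logistic_orbit(0.2, 60)

# Yet the tiny initial gap roughly doubles each step until the orbits are unrelated.
max_gap = max(abs(x - y) for x, y in zip(a, b))
print(f"gap after 1 step: {abs(a[1] - b[1]):.3e}, largest gap in 60 steps: {max_gap:.3f}")
```

Rerunning the script reproduces both orbits exactly, which is the point: repeatable, but not predictable from a nearby input.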


Correct, it's akin to chaos theory or the butterfly effect, though even that can be predictable for many ranges of input: https://youtu.be/dtjb2OhEQcU

Which is also the desired behavior of the mixing functions from which the cryptographic primitives are built (e.g. block cipher functions and one-way hash functions), i.e. the so-called avalanche property.
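A quick way to see the avalanche property is to hash the two example sentences from upthread and count how many output bits differ. This is only a sketch using SHA-256 from Python's standard library; the choice of hash function is mine, not something claimed in the thread.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# The two inputs differ only by a single comma.
h1 = hashlib.sha256(b"Please do this for me").digest()
h2 = hashlib.sha256(b"Please, do this for me").digest()

flipped = bit_difference(h1, h2)
print(f"{flipped} of {len(h1) * 8} output bits differ")
```

For a good mixing function you would expect roughly half of the 256 output bits to flip, even though hashing the same input twice always yields the same digest.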

Well yeah, of course changes in the input result in changes to the output. My only claim was that LLMs can be deterministic (i.e. produce exactly the same output each time for a given input) if set up correctly.

You still can’t deterministically guarantee anything about the output based on the input, other than repeatability for the exact same input.

What does deterministic mean to you?

In this context, it means being able to deterministically predict properties of the output based on properties of the input. That is, you don’t treat each distinct input as a unicorn, but instead consider properties of the input, and you want to know useful properties of the output. With LLMs, you can only do that statistically at best, but not deterministically, in the sense of being able to know that whenever the input has property A then the output will always have property B.

I mean can’t you have a grammar on both ends and just set out-of-language tokens to zero. I thought one of the APIs had a way to staple a JSON schema to the output, for ex.

We’re making pretty strong statements here. It’s not like it’s impossible to make sure DROP TABLE doesn’t get output.
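A rough sketch of what "set out-of-language tokens to zero" could look like: mask the logits of every disallowed token to negative infinity before the softmax, so those tokens get probability exactly 0. The toy vocabulary, the scores and the `allowed` set are made up for illustration, and a real grammar would recompute the allowed set at every decoding step.

```python
import math


def mask_logits(logits, allowed):
    """Set every token outside `allowed` to -inf so it can never be sampled."""
    return {tok: (score if tok in allowed else -math.inf)
            for tok, score in logits.items()}


def softmax(logits):
    """Turn logits into probabilities; assumes at least one finite logit."""
    top = max(logits.values())
    exps = {tok: math.exp(score - top) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


# Hypothetical next-token scores; "DROP" is the model's favourite here.
logits = {"SELECT": 2.0, "DROP": 3.5, "{": 0.5, "hello": 1.0}
allowed = {"SELECT", "{"}  # what the (hypothetical) grammar permits at this step

probs = softmax(mask_logits(logits, allowed))
print(probs)
```

This guarantees that the output stays inside the grammar, but says nothing about whether the in-grammar output is the right one.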


You still can’t predict whether the in-language responses will be correct or not.

As an analogy: If, for a compiler, you verify that its output is valid machine code, that doesn’t tell you whether the output machine code is faithful to the input source code. For example, you might want to have the assurance that if the input specifies a terminating program, then the output machine code represents a terminating program as well. For a compiler, you can guarantee that such properties are true by construction.

More generally, you can write your programs such that you can prove from their code that they satisfy properties you are interested in for all inputs.

With LLMs, however, you have no practical way to reason about relations between the properties of inputs and outputs.


You could also run the LLM output through a keyword-blacklist detection program afterwards; that's probably the easiest filter.
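Such a post-filter might look like the sketch below. The keyword list and the function name `is_blocked` are hypothetical, and the matching is deliberately naive: it normalises whitespace and case, then looks for substrings.

```python
import re

# Hypothetical blacklist; a real deployment would choose its own entries.
BLOCKED = ("DROP TABLE", "TRUNCATE", "DELETE FROM")


def is_blocked(output: str) -> bool:
    """Return True if the text contains a blacklisted keyword.

    Whitespace runs are collapsed and case is ignored before matching.
    """
    normalized = re.sub(r"\s+", " ", output).upper()
    return any(keyword in normalized for keyword in BLOCKED)


print(is_blocked("drop   table users;"))   # True
print(is_blocked("SELECT * FROM users"))   # False
print(is_blocked("DROP/**/TABLE users"))   # False: an SQL comment slips past
```

The last call shows why a substring filter is easy but weak: it rejects the obvious cases and misses anything obfuscated, so parsing the output is the sturdier option.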

I think they mean having some useful predicates P, Q such that for any input i and for any output o that the LLM can generate from that input, P(i) => Q(o).

If you could do that, why would you need an LLM? You'd already know the answer...

Having that property is still a looooong way away from being able to get a meaningful answer. Consider P being something like "asks for SQL output" and Q being "is syntactically valid SQL output". This would represent a useful guarantee, but it would not in any way mean that you could do away with the LLM.

You don't think this is pedantry bordering on uselessness?

No, determinism and predictability are different concepts. You can have a deterministic random number generator for example.
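As a small illustration of a deterministic random number generator, Python's standard `random.Random` produces the same sequence every time it is given the same seed. The helper name `draws` and the seeds are arbitrary choices for this sketch.

```python
import random


def draws(seed, n=5):
    """Return n pseudo-random integers in [0, 100) from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.randrange(100) for _ in range(n)]


# Deterministic: the same seed always reproduces the same sequence.
assert draws(42) == draws(42)

# Not predictable in the everyday sense: a neighbouring seed gives an unrelated sequence.
print(draws(42))
print(draws(43))
```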

It's correcting a misconception that many people have regarding LLMs that they are inherently and fundamentally non-deterministic, as if they were a true random number generator, but they are closer to a pseudo random number generator in that they are deterministic with the right settings.

The comment that is being responded to describes a behavior that has nothing to do with determinism and follows it up with "Given this, you can't treat it as deterministic" lol.

Someone tried to redefine a well-established term in the middle of an internet forum thread about that term. The word that has been pushed to uselessness here is "pedantry".


Let's eat grandma.

Societies drive people to suicide in general. Families also do. I don't think the solution is to make the world a padded room.


>I don't think the solution is to make the world a padded room.

..neither do i?


We're struggling with the pollution levels from road dust now though. It's worse in most cities than it ever was with combustion engines. Yes there's lower CO2, but the dust and tire particles are actually more dangerous.


So EVs that reduce both are a double win!

EU is introducing regulations for this kind of emissions which will likely create a market for a few new techs that reduce it (reformulated tyres, modern drum brakes that capture dust, etc)


What about the increased pollution from road dust? In Norway this has led to higher pollution levels that are directly dangerous to people and animals than back when we were all combustion vehicles.

The heavier EVs are causing genuinely harmful particles simply by driving on the roads themselves.


Woah hold on there. Where is the evidence for both increased dust and increased pollution levels?

EVs generate next to no brake dust due to regenerative braking, most EVs have mechanisms to forcefully use the friction brakes at some points to stop surface rust for this reason.

It's true they're generally heavier than the equivalent ICE vehicle, but this is usually around 200-300 kg heavier - it causes a small increase in tyre wear and associated particulates but these are heavy large particles - the majority larger than PM10. That's a problem for water courses and microplastics but nothing that'll get in your lungs or bloodstream. Anecdotally, my EV tyres (a particularly heavy model too) have lasted fine - my last set did 53k miles.

ICE cars produce plenty of PM10, PM2.5 and smaller particles as well as nitrogen oxides, carbon dioxide and plenty of other harmful pollutants that EVs inherently don't. Even the power generated for them is usually produced away from the majority of the population.


This claim keeps circulating, and it is not "EVs are producing more pollution", it's "if EVs are going at motorway speed, and if we only look at the pollution generated by the tires, then indeed they produce a little more".

But that's completely ignoring tailpipe emission, and the fact that in an urban setting it's still vastly more advantageous to drive an EV.

See https://doi.org/10.1016/j.trd.2025.104622


Where did you get the idea that EVs have caused it? As far as I know the amount of road dust from EVs is within the same ballpark so the claim that it has led to overall higher pollution levels sounds inconceivable. I can't even find sources that indicate high pollution levels in Oslo besides a Bloomberg article that says the situation has actually improved in recent years. [1] On the contrary Oslo seems to be doing comparatively well according to the air quality data from iqair. [2]

[1] https://www.bloomberg.com/news/articles/2025-04-11/why-oslo-...

[2] https://www.iqair.com/us/norway/oslo/oslo


On the other hand the EVs produce no exhausts and less harmful particles from using their brakes.


This sounds like a nice problem to have. Most of the world lives in cities blighted by ICE pollution.


That's why you need light EVs. Norway has electric tanks. China has light EVs.


Maybe it's ok to create something that isn't for most people. That's how the internet started out. It's only gotten worse the more accessible it became to most people. Maybe it's a good thing to create a split based on capabilities and technical know-how.


But we already have a bunch of social networks that are not for everybody. The problem is that social networks are pretty much a winner-takes-all market due to network effects.


We do and many of us prefer it that way. I’m not on any major social media because I personally consider it asocial — you can’t have that many actual friends or acquaintances. My «social media» is a handful of smaller discord servers and an irc channel, and an extensive webring of personal websites.


This and a lot of similar HN comments, often by fresh accounts, just read like viral marketing. Not least because of the capitalisation.

Claude Code sure is great. Claud Code and my Codex reignited my passion for programming. Codex and Claude.

Ugh.


140 year old here just to chime in -- Wowee Claude Code™® sure is magic and giving me back all the passion I've lost in my life now that I can Code Anything I want! It's not just a tool, it's a revolution!! Hell yeah brother let's go Code Some Stuff With Claude!!!!

It's really fucking absurd. This thread is such low quality garbage and it's somehow a top article with hundreds of bot comments all reading from the same template, what a joke.


Let's not forget No Man's Sky here. Or Elite Dangerous' planet-scale procedural generation using solar system properties to fuel the deterministic but procedural generation of tectonic plates that again seed how a planet's surface is deterministically generated, even down to impact craters over millennia, for a universe of billions of consistent deterministically generated full-scale planets you can land on. Something you couldn't do without proc-gen because there's not enough disk space to store it.
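The reason this works without storing anything is that each planet can be regenerated on demand from a small key. The sketch below is a toy illustration of that idea, not how No Man's Sky or Elite Dangerous actually do it; the function name, the key layout and every property range are invented.

```python
import hashlib
import random


def planet(universe_seed: int, system_id: int, planet_index: int) -> dict:
    """Derive a planet's properties purely from its coordinates in the universe."""
    key = f"{universe_seed}:{system_id}:{planet_index}".encode()
    # Hash the key so that neighbouring coordinates get unrelated seeds.
    seed = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    rng = random.Random(seed)
    return {
        "radius_km": round(rng.uniform(2000, 70000)),  # invented range
        "tectonic_plates": rng.randint(0, 12),         # invented range
        "crater_count": rng.randint(0, 5000),          # invented range
    }


# Every visit to the same planet regenerates the same result; nothing is stored.
assert planet(1, 42, 3) == planet(1, 42, 3)
print(planet(1, 42, 3))
print(planet(1, 42, 4))
```

In a layered generator, values such as the plate count would in turn seed the next stage (terrain, craters and so on), which keeps the whole chain reproducible from the top-level seed.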


How do people keep track of all these versions and releases of all these models and their pros/cons? Seems like a fulltime hobby to me. I'd rather just improve my own skills with all that time and energy


Unless you're interested in this type of stuff, I'm not sure you really need to. Claude, Google, and ChatGPT have been fairly aggressive at pushing you towards whatever their latest shiny is and retiring the old one.

The only time it matters is if you're using some type of agnostic "router" service.


> I'd rather just improve my own skills with all that time and energy

That's what I would recommend, it's time better spent. I use AI occasionally to bounce some questions around or have some math jargon explained in simpler terms (all of which I can verify with external sources) using the free version of chatgpt or gemini or whatever I'm feeling that day, without caring about whatever version the model is. I don't need an AI to write code for me because writing the code is not really the hard part of solving a problem, in my opinion.


For me it's simple. I did my research, settled on Anthropic and Claude and got the Pro plan at ~$20/month. That way I only have to keep track of what Anthropic are offering, and that isn't even necessary as the tools I use for AI-supported development (Claude Code for VS Code extension, Xcode Intelligence and Claude Desktop) let me use the newest models as soon as they are released.


On a subscription you can't access all that many different options, so you just stay with whatever the newest is unless it doesn't work.


Same as when the EU puts a ton of restrictions on farmers within the EU countries -- CO2, fertiliser requirements, etc. -- making food so expensive to produce that many go out of business and the remainder becomes practically luxury food, and then countries just end up having to import food from countries outside the EU _without_ those restrictions, simply offloading the environmental burden on "some other countries somewhere".

It's a farce.


Food is actually pretty cheap in the EU (in absolute prices compared to the US and relative to income compared to most other places), so I don't know what you mean.


You're not contradicting me. Read it again.


EU is a net food exporter and the only agricultural products the EU isn't self-sufficient in are animal feed, sugar, and tropical fruits & vegetables.

So, no, EU farmers are struggling at the moment because they aren't as competitive on the global markets as they used to be. Not because Europeans aren't buying their food anymore.


Now why do you think they’re not competitive? Think about this more than one layer deep.


That's a feature


I would love to know why it's considered a feature for you.

I remember messing with bouncers and reading the backlog from a 3rd party page. Bots that would ping other members when they come online. It was cumbersome.


Because I prefer online conversations to work like IRL ones: Ephemeral. Sure, each individual might keep their own log if they want, but the server itself doesn’t. And setting aside all the issues with modern datasets being used for training all sorts of algorithms, just the concept of stepping into a digital room without all the baggage of the last twenty hours of conversation is _mentally refreshing_. It also changes people’s behaviour for the better IME.


Saving logs is gross, chats should be ephemeral. In any case there's HistServ and IRCv3 /chathistory nowadays, so if you really want it you can have it.

That all the minute garbage everyone posts is preserved forever in an unfiltered state I think is a root cause of the mental degradation that results from using Discord: kids don't have anywhere to 'post into the void' anymore. Preserving past events and relationships through oral history as opposed to a big monolithic search engine entails a far more human element to IRC.


But on IRC you had your own log, and sometimes the server made the full logs public. It was just cumbersome to access. What I said and you said in my presence was still logged.

It's a muddy middle ground where neither you nor I are satisfied. Far from perfect.


I wanted to disagree but I really miss IRC internet. Saving everything we ever said online was a mistake. We need to focus on ephemeral chat making a comeback.


IRC still exists at a semi-large scale, if you're looking to return.


Saving logs has been essential for work, in the past, because we weren't always able to write real documentation when necessary. Mind you, this was local to our machine.


To a modern audience, it's definitely not.

