Their problem space may be just fine with open weight models regardless, but yes, the releases of gemma 4, GLM 5.1, and qwen 3.5 (and now 3.6!) have all happened in the last 6 months.
Nothing you've said about reasoning here is exclusive to LLMs. Human reasoning is also never guaranteed to be deterministic, which rules out guaranteed-correct solutions in most cases. As OP says, they may not be reasoning under the hood, but if the effect is the same as a tool that does, does it matter?
I'm not sure I'm up to date on the latest diffusion work, but I'm genuinely curious how you see them potentially making LLMs more deterministic. These models usually work by sampling too, and the transformer architecture seems better suited to long-context problems than diffusion.
The way I imagine greedy sampling for autoregressive language models is guaranteeing a deterministic result at each position individually. The way I'd imagine it for diffusion language models is guaranteeing a deterministic result for the entire response as a whole. I see diffusion models as potentially more promising because the unit of determinism would be larger, preserving expressivity within that unit. Additionally, diffusion language models iterate multiple times over their full response, whereas autoregressive language models get one shot at each token, before there's even any picture of the full response. We'll have to see what impact this has in practice; I'm only cautiously optimistic.
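To make the autoregressive half of that concrete, here's a minimal sketch of greedy decoding: at each position you just take the argmax over the logits, so identical logits always yield the identical token. The function name and toy values are illustrative, not from any real model:

```rust
// Greedy decoding picks the highest-logit token at each position,
// so the choice is deterministic given the same logits.
fn greedy_pick(logits: &[f32]) -> usize {
    logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
        .map(|(i, _)| i)
        .unwrap()
}

fn main() {
    // Toy "vocabulary" of 4 tokens; the same logits always
    // select the same token (index 1 here).
    let logits = [0.1, 2.3, -0.5, 1.7];
    assert_eq!(greedy_pick(&logits), 1);
}
```

The catch the comment points at: this determinism is per-position only; a tiny upstream numerical difference at one position can cascade into a completely different continuation, whereas a diffusion model would (in principle) denoise the whole response as one unit.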
I guess it depends on the definition of deterministic, but I think you're right and there's strong reason to expect this will happen as they develop. I think the next 5 - 10 years will be interesting!
Switzerland's draw is the money. It's true that a significant proportion of the population is foreign born, but the whole country is smaller than some tier 2 cities in China, and many foreigners don't stay long-term. If China paid Swiss-level salaries, more people would go for sure, but the country is so big that in relative terms I'm not sure the proportion would change significantly.
Yes, because these descriptions are meant to foster dehumanization and detachment, which is very useful in military and scientific study contexts. That's why they also sound unnatural in casual conversation.
A lot of mechanisation, especially in the modern world, is not deterministic and is not always 100% right; it's a fundamental "physics at scale" issue, not something new to LLMs. I think what happened when they first appeared was that people immediately clung to a superintelligence-type idea of what LLMs were supposed to be, then realised that's not what they are, then kept going and swung all the way over to "these things aren't good at anything really" or "if they only fix this ONE issue I have with them, they'll actually be useful".
After using Rust for many years now, I feel that a mutable global variable is the perfect example of a "you were so busy figuring out whether you could, you never stopped to consider whether you should".
Moving back to a language that does this kind of thing all the time, it now seems like insanity to me with respect to execution safety.
Global mutable state is like a rite of passage for devs.
Novices start slapping global variables everywhere because it makes things easy and it works, until it doesn't and some behaviour breaks because... I don't even know what broke it.
On a smaller scale, mutable date handling libraries also provide some memorable WTF debugging moments until one learns (hopefully) that adding 10 days to a date should probably return a new date instance in most cases.
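The immutable-date point can be shown with a tiny value type; this is a hypothetical sketch (the `Day` type and `plus` method are made up for illustration, not any real date library's API):

```rust
// A value-semantics "date": adding days returns a NEW value
// instead of mutating the original in place.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Day(u32); // days since some epoch, for illustration only

impl Day {
    fn plus(self, days: u32) -> Day {
        Day(self.0 + days) // fresh value; `self` is untouched
    }
}

fn main() {
    let due = Day(100);
    let extended = due.plus(10);
    assert_eq!(due, Day(100));      // original unchanged
    assert_eq!(extended, Day(110)); // the change lives in a new instance
}
```

With a mutating API, `due` would silently become `Day(110)` for every other piece of code holding a reference to it, which is exactly the class of WTF debugging moment described above.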
* old Toblerone Matterhorn logo, unfortunately :( They've had to remove the mountain from the branding since the chocolate is no longer produced in Switzerland. Still, I love finding the bear in the older boxes still floating around.