This is 3x the price of GPT-5.1, released just 6 months ago. Is no one else alarmed by the trend? What happens when the cheaper models are deprecated/removed over time?
This is entirely expected. The low prices for using LLMs early on were totally and completely unsustainable. The companies providing such services were (and still are) burning money by the truckload.
The hope is to get a big userbase who eventually become dependent on it for their workflow, then crank up the price until it finally becomes profitable.
The price for all models by all companies will continue to go up, and quickly.
I recently looked into this a bit and came away with the impression that, at least on API pricing, the models should be quite profitable when you consider primarily the electricity cost.
Subscriptions and free plans are the thing that can easily burn money.
> The price for all models by all companies will continue to go up, and quickly.
This might entirely be true but I'm hoping that's because the frontier models are just actually more expensive to run as well.
Said another way, I would hope the price of GPT-5.5 falls significantly in a year, once GPT-5.8 is out.
Someone else on this post commented:
> For API usage, GPT-5.5 is 2x the price of GPT-5.4, ~4x the price of GPT-5.1, and ~10x the price of Kimi-2.6.
Having used Kimi-2.6, it can go on for hours spewing nonsense. I personally am happy to pay 10x the price of something that doesn't help me, for something else that does, in even half the time.
Yes, sort of. Generally you can measure the pass rate on a benchmark given a fixed compute budget. A sufficiently smart model can hit a high pass rate with fewer tokens/compute. Check out the cost efficiency on https://artificialanalysis.ai/ (saw this posted here the other day, pretty neat charts!)
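To make that concrete, here's a back-of-envelope sketch of one way to compare cost efficiency across models (all numbers are made-up illustrations, not real benchmark or pricing data):

```python
def cost_per_solve(pass_rate: float, tokens_per_attempt: float,
                   price_per_mtok: float) -> float:
    """Expected dollars spent per successfully solved task,
    assuming independent retries (geometric number of attempts)."""
    expected_attempts = 1.0 / pass_rate
    return expected_attempts * tokens_per_attempt * price_per_mtok / 1e6

# Hypothetical numbers: a pricier model can still win per solved task
# if it passes more often with fewer tokens.
frontier = cost_per_solve(pass_rate=0.90, tokens_per_attempt=20_000,
                          price_per_mtok=10.0)   # $10/Mtok
budget   = cost_per_solve(pass_rate=0.20, tokens_per_attempt=60_000,
                          price_per_mtok=1.0)    # $1/Mtok
print(frontier, budget)  # the 10x-priced model comes out cheaper here
```

This is the same intuition as the parent comment: the sticker price per token matters less than the expected cost to actually get a correct answer.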
It's much easier to measure a language model's intelligence than a human's because you can take as many samples as you want without affecting its knowledge. And we do measure human intelligence.
As others have mentioned you're ignoring the long tail of open-weights models which can be self hosted. As long as that quasi-open-source competition keeps up the pace, it will put a cap on how expensive the frontier models can get before people have to switch to self-hosting.
That's a big if, though. I wish Meta were still releasing top-of-the-line, expensively produced open-weights models, or that Anthropic, Google, or X would release an open mini version.
Not at all, it's more than enough for a large range of tasks. As for slow, that's just a function of how much compute you throw at it, which you actually control unlike with closed weights models.
How do we know that? There is a large gap between API pricing for SOTA models and similarly sized OSS models hosted by 3rd party providers.
Sure, they’re distilled and should be cheaper to run but at the same time, these hosting providers do turn a margin on these given it’s their core business, unless they do it out of the kindness of their heart.
So it’s hard for me to imagine these providers are losing money on API pricing.
It's far more meaningful to look at the actual cost to successfully complete a task. The token efficiency of GPT-5.5 is real, and it's simply far better for work besides.
SOTA models get distilled to open source weights in ~6 months. So paying premium for bleeding edge performance sounds like a fair compensation for enormous capex.
Not really a big problem. Switch to Kimi, Qwen, or GLM and you'll get 95% of the quality of GPT or Anthropic for a tenth of the price. I feel like the real dependency is more mental, more of a habit; if you actually dip your toes outside OpenAI, Anthropic, and Gemini from time to time, you realise the actual difference in code is not huge if prompted well. Maybe you'll have to tell it to do something twice and it won't be a one-shot, but it's really not an issue at all.
Bit of a hype madhouse whenever a new model is released, but it's pretty easy to filter out simple hype from people showing reproducible experiments, specific configs for llama.cpp, github links etc.
> Output is too large: Disable unused breakpoints, variants, or colors in your build script.
So instead of using Tailwind, which automatically strips unused CSS classes, here you're supposed to manually remove anything you think you might not need by editing Lisp code?
Edit: I just took a look at one of the example projects listed, and sure enough it ships a 1 megabyte file called olive.min.css with every possible class:
Slightly tangential, how good/bad is 4o compared to the modern (5.3 I think?) one?
TBH I personally find non-thinking replies quite poor for the type of questions I ask, so I haven't touched ChatGPT for months (ever since Gemini 2.5 Pro, I think). And even Gemini 3.1 Pro still tends to be too literal at times instead of understanding the implied meaning lol. There's still room to improve.
HuggingFace has a nice UI where you can save your specs to your account and it will display a checkmark/red X next to every unsloth quantization to estimate if it will fit.
> Code-based animation: 3.1 Pro can generate website-ready, animated SVGs directly from a text prompt. Because these are built in pure code rather than pixels, they remain crisp at any scale and maintain incredibly small file sizes compared to traditional video.