Hacker News | mudkipdev's comments

I believe the R stood for reasoning, just like OpenAI had their own dedicated o1/o3 family, but now every model just has it built-in.

This is refreshing right after GPT-5.5's $30

What the hell is this? This is not a real page.


It's the model's marketing page for Hong Kong; they've linked to a web UI where you can test the model.


Thanks for pointing that out!!!


This is not an official page; deepseek4.hk is not a domain owned by DeepSeek.


This is 3x the price of GPT-5.1, released just 6 months ago. Is no one else alarmed by the trend? What happens when the cheaper models are deprecated/removed over time?

This is entirely expected. The low prices for using LLMs early on were completely unsustainable. The companies providing such services were (and still are) burning money by the truckload.

The hope is to get a big userbase who eventually become dependent on it for their workflow, then crank up the price until it finally becomes profitable.

The price for all models by all companies will continue to go up, and quickly.


I recently looked into this a bit and came away with the impression that, at least on API pricing, the models should be very profitable, considering primarily the electricity cost.

Subscriptions and free plans are the things that can easily burn money.
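
Back-of-envelope sketch of what I mean; every number here is an assumption I'm making up for illustration, not a measured figure:

    # All constants are assumed, illustrative values -- not measurements.
    GPU_POWER_KW = 0.7              # ~700 W per accelerator under load
    ELECTRICITY_USD_PER_KWH = 0.10  # assumed industrial electricity rate
    TOKENS_PER_SEC = 1_000          # assumed per-GPU throughput with batching

    seconds_per_million = 1_000_000 / TOKENS_PER_SEC              # 1,000 s
    kwh_per_million = seconds_per_million / 3600 * GPU_POWER_KW   # ~0.19 kWh
    usd_per_million = kwh_per_million * ELECTRICITY_USD_PER_KWH
    print(f"~${usd_per_million:.3f} of electricity per 1M tokens")  # ~$0.019

Against API prices on the order of $1-$30 per million output tokens, electricity alone looks tiny; the real question is everything else.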


The physical buildouts and massive R&D spending are the big part.

> The price for all models by all companies will continue to go up, and quickly.

This might well be true, but I'm hoping that's because the frontier models actually are more expensive to run as well.

Said another way, I would hope the price of GPT-5.5 falls significantly in a year, when GPT-5.8 is out.

Someone else on this post commented:

> For API usage, GPT-5.5 is 2x the price of GPT-5.4, ~4x the price of GPT-5.1, and ~10x the price of Kimi-2.6.

Having used Kimi-2.6, I've watched it go on for hours spewing nonsense. I'm personally happy to pay 10x the price of something that doesn't help me for something that does, in even half the time.


Look at cost per intelligence or cost per task instead of cost per token.

How do I reliably measure 1 unit of intelligence?

In pelicans, obviously

Isn't the outcome / solution for a given task non-deterministic? So can we reliably measure that?

Yes, sort of. Generally you can measure the pass rate on a benchmark given a fixed compute budget. A sufficiently smart model can hit a high pass rate with fewer tokens/less compute. Check out the cost-efficiency charts on https://artificialanalysis.ai/ (saw this posted here the other day; pretty neat charts!)
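
As a toy example (made-up pass rates and per-attempt costs), the "cost per solved task" framing falls out of repeated trials like this:

    import random

    def attempt(pass_rate: float) -> bool:
        # Stand-in for one model attempt at a benchmark task.
        return random.random() < pass_rate

    def cost_per_solve(pass_rate: float, cost_per_attempt: float,
                       trials: int = 10_000) -> float:
        solves = sum(attempt(pass_rate) for _ in range(trials))
        return float("inf") if solves == 0 else trials * cost_per_attempt / solves

    # Illustrative numbers: the 10x-pricier model is still cheaper per solve.
    print(cost_per_solve(pass_rate=0.90, cost_per_attempt=0.10))  # ~0.11
    print(cost_per_solve(pass_rate=0.05, cost_per_attempt=0.01))  # ~0.20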

Statistically. Do many trials and measure how often it succeeds/fails.

This is the only correct take. The only metric that matters is cost per desired outcome.

Repetition and statistics, if you have $1000++ you didn't need anyway.

It's much easier to measure a language model's intelligence than a human's because you can take as many samples as you want without affecting its knowledge. And we do measure human intelligence.

As others have mentioned, you're ignoring the long tail of open-weights models, which can be self-hosted. As long as that quasi-open-source competition keeps up the pace, it will put a cap on how expensive the frontier models can get before people have to switch to self-hosting.

That's a big if, though. I wish Meta were still releasing top-of-the-line, expensively produced open-weights models, or that Anthropic, Google, or X would release an open mini version.


Well, Google does release mini open versions of their models. https://deepmind.google/models/gemma/gemma-4/

And they're incredibly good for their size.

Which, unfortunately, is still slow, unusable garbage compared to frontier models.

Not at all; it's more than enough for a large range of tasks. As for slow, that's just a function of how much compute you throw at it, which you actually control, unlike with closed-weights models.

We know they cost much more than this for OpenAI. Assume prices will continue to climb until they are making money.

How do we know that? There is a large gap between API pricing for SOTA models and similarly sized OSS models hosted by 3rd party providers.

Sure, they’re distilled and should be cheaper to run, but at the same time these hosting providers do turn a margin on them, given it’s their core business, unless they do it out of the kindness of their hearts.

So it’s hard for me to imagine these providers are losing money on API pricing.


Source? There have also been a bunch of people here saying the opposite.

Apparently the cost-to-price ratio is 20x at the major providers. Not clear how that is a business.

It's far more meaningful to look at the actual cost to successfully complete something. The token efficiency of GPT-5.5 is real, and it's also just far better for work.

SOTA models get distilled into open-source weights in ~6 months, so paying a premium for bleeding-edge performance sounds like fair compensation for the enormous capex.

GPT-4 cost 6x on input tokens and 2x on output tokens when it was released, as compared to GPT-5.5.

Not really a big problem. Switch to Kimi, Qwen, or GLM. You’ll get 95% of the quality of GPT or Anthropic for a tenth of the price. I feel like the real dependency is more mental, more of a habit, but if you actually dip your toes outside OpenAI, Anthropic, and Gemini from time to time, you realise that the actual difference in code is not huge if prompted in a good way. Maybe you’ll have to tell it to do something twice and it won’t be a one-shot, but it’s really not an issue at all.

I use GLM and I like it, but they also increased the price to $18/month.

I think Kimi and Qwen are similar?


God I hope this is true.

Where can I find up-to-date resources on open-source models for coding?


https://old.reddit.com/r/LocalLLaMA/

Bit of a hype madhouse whenever a new model is released, but it's pretty easy to separate simple hype from people showing reproducible experiments, specific configs for llama.cpp, GitHub links, etc.


Such an increase tracks the company's valuation trend, which they constantly, somehow, have to justify (let alone break even on costs).

Who would want Anthropic's business after they broke user trust?

> Output is too large: Disable unused breakpoints, variants, or colors in your build script.

So instead of using Tailwind, which automatically strips unused CSS classes, here you're supposed to manually remove anything you think you might not need by editing Lisp code?

Edit: I just took a look at one of the example projects listed, and sure enough it ships a 1 megabyte file called olive.min.css with every possible class:

https://wikimusic.jointhefreeworld.org/css/wikimusic.olive.m...

It's also heavily duplicated: searching for "blur-md" yields 12 entries, all with the same definition.
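
If you want to check the duplication yourself, a crude sketch (assumes a local copy of the file; the selector "parsing" is a naive regex, fine for flat minified CSS but not nested rules):

    import re
    from collections import Counter

    css = open("wikimusic.olive.min.css").read()  # hypothetical local copy
    # Treat everything between a closing '}' (or start of file) and the
    # next '{' as a selector.
    selectors = re.findall(r"(?:^|\})\s*([^{}]+?)\s*\{", css)
    dupes = {sel: n for sel, n in Counter(selectors).items() if n > 1}
    for sel, n in sorted(dupes.items(), key=lambda kv: -kv[1])[:10]:
        print(n, sel)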


If you are talking with Claude about AI, it will sometimes passively bring up "frontier models like GPT-4o"

Slightly tangential: how good/bad is 4o compared to the modern one (5.3, I think)?

TBH I personally find non-thinking replies quite poor for the type of questions I ask, so I haven't touched ChatGPT for months (ever since Gemini 2.5 Pro, I think). (And even Gemini 3.1 Pro still tends to be too literal at times, instead of understanding the implied meaning, lol. We've got more room to improve.)


HuggingFace has a nice UI where you can save your specs to your account, and it will display a checkmark or red X next to every Unsloth quantization to estimate whether it will fit.

Why is the assumption that they trained for a pelican on a bicycle, rather than running RL for all kinds of 'generate an SVG' tasks?

Gemini did exactly that, and boasted about it at launch: https://x.com/JeffDean/status/2024525132266688757

That post doesn't say anything about training for SVG generation

https://blog.google/innovation-and-ai/models-and-research/ge...

> Code-based animation: 3.1 Pro can generate website-ready, animated SVGs directly from a text prompt. Because these are built in pure code rather than pixels, they remain crisp at any scale and maintain incredibly small file sizes compared to traditional video.


The GLM coding plan price increased dramatically

It is left unsaid, but throughput is also terrible.
