One interesting thing I found comparing OpenAI and Gemini image editing: Gemini rejects anything involving a well-known person. Anything. OpenAI was happy to edit and change every image I tried
I have a side project where I want to display standup comedy shows. I thought I could edit standup comedy posters with some AI to fit my design. Gemini straight up refuses to change any standup comedy poster involving a well-known human. OpenAI does not care and happily edits away
OpenAI wouldn't make me a Looney Tunes Roadrunner Martin Scorsese "Absolute Cinema" parody, but Gemini didn't blink about the trademark violation. Also, the output was really nice:
I don't know tbh. I've tried it on 10-20 standups of varying levels of fame and Gemini refuses every time
Just for testing, I tried this https://i.ytimg.com/vi/_KJdP4FLGTo/sddefault.jpg ("Redesign this image in a brutalist graphic design style"). Gemini refuses (API as well as UI); OpenAI does it
It seems like they're trying to follow local law. What a nightmare to have to manage all jurisdictions around such a product. Surprised it didn't kill image generation entirely.
Yea, especially when they know all that work will be completely pointless in a few years when open source / local models will be just as good and won't have any legal limitations, so people will be generating fake images of famous people like crazy with nothing stopping them
I think these pledges offload some of the risk onto Amazon/Oracle/etc
If Anthropic/OpenAI miss projections, infra providers can likely still turn around and sell the capacity to the next guy or use it themselves. If they have more demand than expected (as Anthropic currently does), VCs will throw money at them and they can outbid the competition
If they built it themselves and missed projections it's a much more expensive mistake
It's just risk sharing. Infra providers take some of the risk and some of the upside
> If they built it themselves and missed projections it's a much more expensive mistake
Not if their pricing comes with multiyear commitments for reserved capacity. No doubt they get a huge volume discount, but the advertised AWS reserved pricing is already enough to pay for a whole 8x HX00 pod, plus the NVIDIA enterprise license, plus the staff to manage it, after only a one-year commitment. On-demand pricing is significantly more expensive, so they're going to be boxed in by errors in capacity planning anyway (as has been happening the last few months).
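To make the rent-vs-buy tradeoff concrete, here's a back-of-envelope sketch in Python. Every number in it is a made-up placeholder, not real AWS or NVIDIA pricing:

```python
# Back-of-envelope break-even sketch for renting vs. buying GPU capacity.
# All numbers below are hypothetical placeholders, not real pricing.

def breakeven_months(hourly_rate, purchase_cost, yearly_opex):
    """Months of 24/7 rental until cumulative rental spend exceeds owning."""
    monthly_rental = hourly_rate * 24 * 30        # rent, running flat out
    monthly_ownership = yearly_opex / 12          # power, licenses, staff
    return purchase_cost / (monthly_rental - monthly_ownership)

# Hypothetical: an 8-GPU pod rented at $80/hr vs. ~$400k to buy outright,
# with $120k/yr in operating costs.
months = breakeven_months(hourly_rate=80, purchase_cost=400_000, yearly_opex=120_000)
print(f"break-even after ~{months:.0f} months")
```

With those placeholder numbers, buying pays for itself well inside a one-year reserved commitment, which is the shape of the argument being made here.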
The economics here are absurd unless you’re involved in a giant circular investment scheme to pump up valuations.
The pricing models that are published on AWS' website almost certainly have almost nothing to do with the pricing models that are discussed behind closed doors for a $100 billion commitment.
Of course not, but unless they're getting the sweetheart deal of a lifetime from Amazon of all places, it's still hogwash. We're talking about enough capital to build their own fab and a dozen datacenters*. This deal isn't going to be buying existing capacity, because that's already stretched; it will be paying for new buildouts.
Afterwards Amazon will be milking the machines these commitments buy for nearly a decade. That tradeoff makes sense at a small scale (even up to $X00 million or even billions), but at $Y0 or $Z00 billion?
Color me skeptical. There are plenty of other side benefits like upgrading to the newest GPUs every few years, but again we’re talking about paying for new buildouts with upfront commitments anyway.
* obviously the timelines, scientific risk, and opportunity cost make this completely infeasible but that’s the scale we’re talking about. It’s a major industrial project on the scale of the thirty year space shuttle program (~$200 billion).
That’s just wrong. File reads, searches, and compiler output are the top input-token consumers in my workflow. None of them can be removed, and they are the majority of my input tokens. That’s also why labs are trying to make 1M-token input work, and why compaction is so important to get right.
Regarding output - yes, but that wasn’t the topic in this thread. It’s just easier to argue with input tokens that price has gone up. I have a hunch the price for output will go up similarly, but can’t prove it. The jury’s out IMO: https://news.ycombinator.com/item?id=47816960
This has no bearing on my comment. The point is that a better model avoids dozens of prompts and tool calls by making fewer CORRECT tool calls, with the user needing no more prompts.
I’m surprised this is even a question; obviously a better prompter has the same properties, and that’s not in dispute, is it?
We have a somewhat complicated OpenSearch reindexing logic and we had some issue where it happened more regularly than it should. I vibecoded a dashboard visualizing in a graph exactly which index gets reindexed when and into what. Code works, a little rough around the edges. But it serves the purpose and saved me a ton of time
Another example, in an internal project we made a recent change where we need to send specific headers depending on the environment. Mostly GET endpoint where my workflow is checking the API through browser. The list of headers is long, but predetermined. I vibecoded an extension that lets you pick the header and allows me to work with my regular workflow, rather than Postman or cURL or whatever. A little buggy UI, but good enough. The whole team uses it
I'm not a frontend developer and either of these would take me a lot of time to do by hand
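The header-picking idea above is simple enough to sketch outside a browser extension. Here's a minimal Python version of the same concept; the environment names and headers are invented for illustration:

```python
# Sketch of "predetermined headers per environment": a fixed lookup table
# merged into each request's headers. Names and values are hypothetical.

ENV_HEADERS = {
    "staging": {"X-Env": "staging", "X-Feature-Flags": "all"},
    "prod": {"X-Env": "prod"},
}

def with_env_headers(env, base_headers=None):
    """Merge the fixed per-environment headers into a request's headers."""
    headers = dict(base_headers or {})
    headers.update(ENV_HEADERS[env])
    return headers

print(with_env_headers("staging", {"Accept": "application/json"}))
```

The extension described in the comment presumably does the same lookup, just wired into the browser's request pipeline instead of a function call.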
My best guess is that Nvidia is unhappy with how OpenAI is fishing for compute with its competitors (Jensen had some opinions on the AMD-OpenAI deal when it was announced). If this actually becomes a feasible reality, it gives OpenAI (and co) negotiating power - which is bad for Nvidia
Nvidia might have wanted more exclusivity/attachment. And OpenAI still seems to have no problem raising money. So maybe there was just a commitment mismatch
I would agree. I've been using VSCode Copilot for the past (nearly) year. And it has gotten significantly better. I also use CC and Antigravity privately - and got access to Cursor (on top of VSCode) at work a month ago
CC is, imo, the best. The rest are largely on par with each other. The benefit of VSCode and Antigravity is that they have the most generous limits. I ran through Cursor's $20 limit in 3 days, whereas the same-tier VSCode subscription can last me 2+ weeks
> In a science fiction story, if you invented a superintelligent robot and asked it how to make money, it might come up with cool never-before-seen ideas, or at least massive fun market manipulation. But in real life, if you train a large language model on the internet and ask it how to make money, it will say “advertising, affiliate shopping links and porn.” That’s the lesson the internet teaches!
But I think it makes a lot of sense for very popular consumer products. In my honest opinion, I much prefer having services like Google, Youtube, Gmail, Maps, ChatGPT etc exist for free, but with ads, rather than not exist at all. Preferably with an option to pay and remove ads
Nowadays I'm happy to pay for Youtube premium or LLM, but back during my student days I could not really afford it - and I'm glad there was a free tier (with ads)
>In my honest opinion, I much prefer having services like Google, Youtube, Gmail, Maps, ChatGPT
I don't use any of these except YouTube (if only I could find the content elsewhere…) and I still pay for them when I purchase anything advertised on these properties, because of course the companies advertising on Google make all their customers pay for the free (lol) services. All advertising expenses are included in the price of the products, even if you never saw any ads.
We could easily charge for each of these services and still have them. Advertising is not necessary at all. It's just a way to make others pay for your services. It's a free riding problem to externalize costs on those who don't partake in the scheme.
Pay your share and don't call free what others will subsidize. Unless it's a public service and we collectively agree on the split (votes and taxes, which we can debate publicly)
Right. But a good portion of the world can't afford the premium, and having access to these services is still valuable. For every broke student or someone from a poor background, who probably doesn't make any money for the company (due to not buying advertised stuff), there's someone from a well-off background who will more than subsidize it by virtue of clicking on a lawyer ad (or whatever)
Nowadays I'm happy to pay, but that wasn't always the case. And I personally think that having an ad tier and a paid tier is fine. Serves everyone
I much prefer to subsidize my neighborhood / friends / colleagues / family / … than have the world sink in ads. That enshittifies everything. It turns all social media into hate machines. And the cost is only externalized, not reduced, by polluting minds with ads (same as climate change, where we're only making the situation worse by procrastinating). The free part and the fake generosity are an illusion.
I’ve thought that if they banned car commercials and truck ads, prices would go down. How much is an open question. Would they actually want to drop the cost?
My guess is that this is bigger lock-in than it might seem on paper.
Google and Apple together will post-train Gemini to Apple's specification. Google has the know-how as well as the infra, and will happily do this (for free-ish) to continue the mutually beneficial relationship, as well as lock out competitors that asked for more money (Anthropic)
Once this goes live, provided Siri improves meaningfully, it is quite an expensive experiment to then switch to a different provider.
For any single user, the switching costs to a different LLM are next to nothing. But at Apple's scale they need to be extremely careful and confident that the switch is an actual improvement
I’m not so sure. Just think about coding assistants with MCP based tools. I can use multiple different models in GitHub Copilot and get good results with similarly capable models.
Siri’s functionality and OS integration could be exposed in a similar, industry-standard way via tools provided to the model.
Then any other model can be swapped in quite easily. Of course, they may still want to do fine tuning, quantization, performance optimization for Apple’s hardware, etc.
But I don’t see why the actual software integration part needs to be difficult.
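To illustrate the swappability argument, here's roughly what exposing one Siri capability as a standard tool might look like, modeled on the JSON-schema style that MCP and most LLM tool-calling APIs share. The tool name and fields are invented for this sketch:

```python
# Hypothetical tool description for one Siri capability, in the JSON-schema
# convention common to MCP-style tool calling. Everything here is invented.

import json

play_album_tool = {
    "name": "play_album",
    "description": "Play an album from the user's music library by exact name.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "album": {"type": "string", "description": "Exact album title"},
            "artist": {"type": "string", "description": "Optional artist name"},
        },
        "required": ["album"],
    },
}

# Any model that speaks the same tool-calling convention can be handed this
# schema, which is what would make the backing model swappable.
print(json.dumps(play_album_tool, indent=2))
```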
> But I don’t see why the actual software integration part needs to be difficult.
That’s not the issue. The issue is that once Gemini is in place as the intelligence behind Siri, the bar is now much higher than today and so you have to be more careful if you consider replacing Gemini, because you’re as likely as not to make Siri worse. Maybe more likely to make it worse.
Oh well that’s a good problem to have, isn’t it? Siri being so good that they don’t want to mess it up.
That gives them plenty of runway to test and optimize new models internally before release and not feel like they need to rush them out because Siri sucks.
Doubt it. Of all the issues I run into with Siri none could be solved by throwing AI slop at it. Case in point: if I ask Siri to play an album and it can't match the album name it just plays some random shit instead of erroring out.
Um, if I ask an LLM about a fake band, it literally says "I couldn't find any songs by that band, did you type it correctly?", and it's about a million times more likely to guess correctly. Why do you say it doesn't solve loads of things? I'm more concerned about the problems it creates (prompt injection, hallucinations in important work, bad logic in code); the actual functionality will be fantastic compared to Siri right now!
Because I'm sitting here twiddling my thumbs waiting for random pages to go through their anti-LLM bot crap. LLMs create more problems than they solve.
> Um, if I ask an LLM about a fake band, it literally says "I couldn't find any songs by that band, did you type it correctly?", and it's about a million times more likely to guess correctly
Um, if Apple wrote proper error handling in the first place, the issue would be solved without LLM baggage. Apple made a conscious decision to handle "unknown" artists this way; LLMs don't change that.
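The error handling being asked for is trivial to express. Here's a toy sketch (the function, library, and albums are all hypothetical): fail loudly on an unknown album instead of silently playing something random.

```python
# Toy sketch: refuse to guess when the album isn't in the library,
# instead of falling back to playing random content.

LIBRARY = {"ok computer", "in rainbows"}  # hypothetical user library

def play_album(name):
    if name.lower() not in LIBRARY:
        # Surface the failure to the user rather than guessing.
        raise LookupError(f"No album named {name!r} in your library")
    return f"Playing {name}"

print(play_album("OK Computer"))
# play_album("Fake Band Album")  # raises LookupError instead of guessing
```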
Ollama! Why didn't they just run Ollama with a public model? They've spent the last 10 years with a Siri that doesn't recognize a contact named Chronometer, only to now require a best-in-class LLM?
The other day I was trying to navigate to a Costco in my car. So I opened google maps on Android Auto on the screen in my car and pressed the search box. My car won't allow me to type even while parked... so I have to speak to the Google Voice Assistant.
I was in the map search, so I just said "Costco" and it said "I can't help with that right now, please try again later" or something of the sort. I tried a couple more times until I changed up to saying "Navigate me to Costco" where it finally did the search in the textbox and found it for me.
Obviously this isn't the same thing as Gemini but the experience with Android Auto becomes more and more garbage as time passes and I'm concerned that now we're going to have 2 google product voice assistants.
Also, tbh, Gemini was great a month ago but since then it's become total garbage. Maybe it passes benchmarks or whatever but interacting with it is awful. It takes more time to interact with than to just do stuff yourself at this point.
I tried Google Maps AI last night and, wow. The experience was about as garbage as you can imagine.
I'm genuinely curious about this too. If you really only need the language and common sense parts of an LLM -- not deep factual knowledge of every technical and cultural domain -- then aren't the public models great? Just exactly what you need? Nobody's using Siri for coding.
Are there licensing issues regarding commercial use at scale or something?
Pure speculation, but I’d guess that an arrangement with Google comes with all sorts of ancillary support that will help things go smoothly: managed fine tuning/post-training, access to updated models as they become available, safety/content-related guarantees, reliability/availability terms so the whole thing doesn’t fall flat on launch day etc.
Probably repeatability and privacy guarantees around infrastructure and training too. Google already has very defined splits between its Gemma and in-house models, with engineers and researchers rarely communicating directly.
That said, Apple is likely to end up training their own model, sooner or later. They are already in the process of building out a bunch of data centers, and I think they have even designed in-house servers.
Remember when iPhone maps were Google Maps? Apple Maps has been steadily improving, to the point that it is as good as, if not better than, Google Maps in many areas, like around here. I recently had a friend send me a GM link to a destination, and the phone used GM for directions. It was much worse than Apple Maps; after a few wrong turns, I pulled over, fed the destination into Apple Maps, and completed the journey.
OpenAI is (was?) extremely good at making things that go viral. The successful ones for sure boost subscriber count meaningfully
Studio Ghibli, the Sora app. Go viral, juice the numbers, then turn the knobs down on copyrighted material. Atlas, I believe, was less successful than they would've hoped.
And because of too-frequent version bumps that are sometimes released as an answer to a Google launch rather than as a meaningful improvement, I believe they're also having a harder time going viral that way
Overall, OpenAI throws stuff at the wall and sees what sticks. Most of it doesn't and gets (semi-)abandoned. But some of it does, and that makes for a better consumer product than Gemini
It seems to have worked well so far, though I'm sceptical it will be enough for long
Going viral is great when you're a small team or even a million dollar company. That can make or break your business.
Going viral as a billion-dollar company spending upward of $1T is still not sustainable. You can't pay off a trillion dollars with "engagement". The entire advertising industry is "only" worth about $1T as is: https://www.investors.com/news/advertising-industry-to-hit-1...
I guess we'd have to see the graph with the evolution of paying customers: I don't see the number of potential-but-not-yet clients being that high, certainly not one order of magnitude higher. And everyone already knows OpenAI, they don't have the benefit of additional exposure when they go viral: the only benefit seems to be to hype up investors.
And there's something else about the diminishing returns of going viral... AI kind of breaks the usual assumptions in software: that building it is the hard part and that scaling is basically free. In that sense, AI looks more like regular commodities or physical products, in that you can't just Ctrl-C/Ctrl-V: resources are O(N) on the number of users, not O(log N) like regular software.
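The O(N)-vs-O(log N) point can be made concrete with a toy cost model. The numbers are arbitrary units chosen only to show the shapes, not real costs:

```python
# Toy illustration of the scaling claim: conventional software's total infra
# cost grows slowly with users (shared servers, caching, CDNs), while
# inference-heavy AI burns compute per active user, i.e. O(N).
# All constants are arbitrary illustrative units, not real figures.

import math

def classic_software_cost(users, base=1000):
    # roughly logarithmic growth in total cost
    return base * math.log2(users + 1)

def inference_cost(users, per_user=5):
    # every user costs GPU time: linear in N
    return per_user * users

for n in (1_000, 1_000_000):
    print(n, round(classic_software_cost(n)), inference_cost(n))
```

At small N the two are comparable; at a million users the linear term dominates by orders of magnitude, which is the "AI looks like a physical commodity" point.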