One interesting thing I found comparing OpenAI and Gemini image editing: Gemini rejects anything involving a well-known person. Anything. OpenAI was happy to edit and change every image I tried
I have a side project where I want to display standup comedy shows. I thought I could edit standup comedy posters with some AI to fit my design. Gemini straight up refuses to change any standup comedy poster involving a well-known human. OpenAI does not care and happily edits away
OpenAI wouldn't make me a Looney Tunes Roadrunner Martin Scorsese "Absolute Cinema" parody, but Gemini didn't blink about the trademark violation. Also, the output was really nice:
I don't know tbh. I've tried it on 10-20 standups of varying levels of fame and Gemini refuses every time
Just for testing, I tried this https://i.ytimg.com/vi/_KJdP4FLGTo/sddefault.jpg ("Redesign this image in a brutalist graphic design style"). Gemini refuses (API as well as UI); OpenAI does it
It seems like they're trying to follow local law. What a nightmare to have to manage all jurisdictions around such a product. Surprised it didn't kill image generation entirely.
Yea, especially when they know all that work will be completely pointless in a few years when open source / local models will be just as good and won't have any legal limitations, so people will be generating fake images of famous people like crazy with nothing stopping them
I think these pledges offload some of the risk onto Amazon/Oracle/etc
If Anthropic/OpenAI miss projections, infra providers can likely still turn around and sell the capacity to the next guy or use it themselves. If they have more demand than expected (as Anthropic currently does), VCs will throw money at them and they can outbid the competition
If they built it themselves and missed projections it's a much more expensive mistake
It's just risk sharing. Infra providers take some of the risk and some of the upside
> If they built it themselves and missed projections it's a much more expensive mistake
Not if their pricing comes with multiyear commitments for reserved capacity. No doubt they get a huge volume discount, but the advertised AWS reserved pricing is already enough to pay for a whole 8x HX00 pod, plus the NVIDIA enterprise license, plus the staff to manage it, after only a one-year commitment. On-demand pricing is significantly more expensive, so they're going to be boxed in by errors in capacity planning anyway (as has been happening the last few months).
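To make the rent-vs-buy tradeoff concrete, here's a back-of-envelope sketch in Python. Every number in it is a made-up placeholder, not real AWS or NVIDIA pricing:

```python
# Back-of-envelope break-even sketch for renting vs. buying GPU capacity.
# All numbers below are hypothetical placeholders, not real pricing.

def breakeven_months(hourly_rate, purchase_cost, yearly_opex):
    """Months of 24/7 rental until cumulative rental spend exceeds owning."""
    monthly_rental = hourly_rate * 24 * 30        # rent, running flat out
    monthly_ownership = yearly_opex / 12          # power, licenses, staff
    return purchase_cost / (monthly_rental - monthly_ownership)

# Hypothetical: an 8-GPU pod rented at $80/hr vs. ~$400k to buy outright,
# with $120k/yr in operating costs.
months = breakeven_months(hourly_rate=80, purchase_cost=400_000, yearly_opex=120_000)
print(f"break-even after ~{months:.0f} months")
```

With those placeholder numbers, buying pays for itself well inside a one-year reserved commitment, which is the shape of the argument being made here.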
The economics here are absurd unless you’re involved in a giant circular investment scheme to pump up valuations.
The pricing models that are published on AWS' website almost certainly have almost nothing to do with the pricing models that are discussed behind closed doors for a $100 billion commitment.
Of course not, but unless they're getting the sweetheart deal of a lifetime from Amazon of all places, it's still hogwash. We're talking about enough capital to build their own fab and a dozen datacenters*. This deal isn't going to be buying existing capacity, because that's already stretched; it will be paying for new buildouts.
Afterwards Amazon will be milking the machines these commitments buy for nearly a decade. That tradeoff makes sense at a small scale (even up to $X00 million or even billions), but at $Y0 or $Z00 billion?
Color me skeptical. There are plenty of other side benefits like upgrading to the newest GPUs every few years, but again we’re talking about paying for new buildouts with upfront commitments anyway.
* obviously the timelines, scientific risk, and opportunity cost make this completely infeasible but that’s the scale we’re talking about. It’s a major industrial project on the scale of the thirty year space shuttle program (~$200 billion).
That’s just wrong. File reads, searches, and compiler output are the top input-token consumers in my workflow. None of them can be removed, and they are the majority of my input tokens. That’s also why labs are trying to make 1M-token input work, and why compaction is so important to get right.
Regarding output - yes, but that wasn’t the topic in this thread. It’s just easier to argue with input tokens that price has gone up. I have a hunch the price for output will go up similarly, but can’t prove it. The jury’s out IMO: https://news.ycombinator.com/item?id=47816960
This has no bearing on my comment. The point is that a better model avoids dozens of prompts and tool calls by making fewer CORRECT tool calls, with the user needing no more prompts.
I’m surprised this is even a question; obviously a better prompter has the same properties, and that’s not in dispute, is it?
We have a somewhat complicated OpenSearch reindexing logic and we had some issue where it happened more regularly than it should. I vibecoded a dashboard visualizing in a graph exactly which index gets reindexed when and into what. Code works, a little rough around the edges. But it serves the purpose and saved me a ton of time
Another example, in an internal project we made a recent change where we need to send specific headers depending on the environment. Mostly GET endpoint where my workflow is checking the API through browser. The list of headers is long, but predetermined. I vibecoded an extension that lets you pick the header and allows me to work with my regular workflow, rather than Postman or cURL or whatever. A little buggy UI, but good enough. The whole team uses it
I'm not a frontend developer and either of these would take me a lot of time to do by hand
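The header-picking idea above is simple enough to sketch outside a browser extension. Here's a minimal Python version of the same concept; the environment names and headers are invented for illustration:

```python
# Sketch of "predetermined headers per environment": a fixed lookup table
# merged into each request's headers. Names and values are hypothetical.

ENV_HEADERS = {
    "staging": {"X-Env": "staging", "X-Feature-Flags": "all"},
    "prod": {"X-Env": "prod"},
}

def with_env_headers(env, base_headers=None):
    """Merge the fixed per-environment headers into a request's headers."""
    headers = dict(base_headers or {})
    headers.update(ENV_HEADERS[env])
    return headers

print(with_env_headers("staging", {"Accept": "application/json"}))
```

The extension described in the comment presumably does the same lookup, just wired into the browser's request pipeline instead of a function call.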
My best guess is that Nvidia is unhappy with how OpenAI is fishing for compute with its competitors (Jensen had some opinions on the AMD-OpenAI deal when it was announced). If this actually becomes a feasible reality, it gives OpenAI (and co) negotiating power - which is bad for Nvidia
Nvidia might have wanted more exclusivity/attachment. And OpenAI still seems to have no problem raising money. So maybe there was just a commitment mismatch
I would agree. I've been using VSCode Copilot for the past (nearly) year. And it has gotten significantly better. I also use CC and Antigravity privately - and got access to Cursor (on top of VSCode) at work a month ago
CC is, imo, the best. The rest are largely on par with each other. The benefit of VSCode and Antigravity is that they have the most generous limits. I ran through Cursor's $20 limit in 3 days, whereas the same-tier VSCode subscription can last me 2+ weeks
> In a science fiction story, if you invented a superintelligent robot and asked it how to make money, it might come up with cool never-before-seen ideas, or at least massive fun market manipulation. But in real life, if you train a large language model on the internet and ask it how to make money, it will say “advertising, affiliate shopping links and porn.” That’s the lesson the internet teaches!
But I think it makes a lot of sense for very popular consumer products. In my honest opinion, I much prefer having services like Google, Youtube, Gmail, Maps, ChatGPT etc exist for free, but with ads, rather than not exist at all. Preferably with an option to pay and remove ads
Nowadays I'm happy to pay for Youtube premium or LLM, but back during my student days I could not really afford it - and I'm glad there was a free tier (with ads)
>In my honest opinion, I much prefer having services like Google, Youtube, Gmail, Maps, ChatGPT
I don't use any of these except YouTube (if only I could find the content elsewhere…) and I still pay for them when I purchase anything advertised on these properties, because of course the companies advertising on Google make all their customers pay for the free (lol) services. All advertising expenses are included in the price of the products, even if you never saw any ads.
We could easily charge for each of these services and still have them. Advertising is not necessary at all. It's just a way to make others pay for your services. It's a free riding problem to externalize costs on those who don't partake in the scheme.
Pay your share and don't call free what others will subsidize. Unless it's a public service and we collectively agree on the split (votes and taxes, which we can debate publicly)
Right. But a good portion of the world can't afford the premium, and having access to these services is still valuable. For every broke student or someone from a poor background, who probably doesn't make any money for the company (due to not buying advertised stuff), there's someone from a well-off background who will more than subsidize it by virtue of clicking on a lawyer ad (or whatever)
Nowadays I'm happy to pay, but that wasn't always the case. And I personally think that having an ad tier and a paid tier is fine. Serves everyone
I much prefer to subsidize my neighborhood / friends / colleagues / family / … than have the world sink in ads. That enshittifies everything. It turns all social media into hate machines. And the cost is only externalized, not reduced, by polluting minds with ads (same as climate change, where we're only making the situation worse by procrastinating). The free part and the fake generosity are an illusion.
I’ve thought that if they banned car commercials and truck ads, prices would go down. How much is an open question. Would they actually want to drop the cost?
My guess is that this is bigger lock-in than it might seem on paper.
Google and Apple together will post-train Gemini to Apple's specification. Google has the know-how as well as the infra, and will happily do this (for free-ish) to continue the mutually beneficial relationship, as well as lock out competitors that asked for more money (Anthropic)
Once this goes live, provided Siri improves meaningfully, it is quite an expensive experiment to then switch to a different provider.
For any single user, the switching costs to a different LLM are next to nothing. But at Apple's scale they need to be extremely careful and confident that the switch is an actual improvement
I’m not so sure. Just think about coding assistants with MCP based tools. I can use multiple different models in GitHub Copilot and get good results with similarly capable models.
Siri’s functionality and OS integration could be exposed in a similar, industry-standard way via tools provided to the model.
Then any other model can be swapped in quite easily. Of course, they may still want to do fine tuning, quantization, performance optimization for Apple’s hardware, etc.
But I don’t see why the actual software integration part needs to be difficult.
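To illustrate the swappability argument, here's roughly what exposing one Siri capability as a standard tool might look like, modeled on the JSON-schema style that MCP and most LLM tool-calling APIs share. The tool name and fields are invented for this sketch:

```python
# Hypothetical tool description for one Siri capability, in the JSON-schema
# convention common to MCP-style tool calling. Everything here is invented.

import json

play_album_tool = {
    "name": "play_album",
    "description": "Play an album from the user's music library by exact name.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "album": {"type": "string", "description": "Exact album title"},
            "artist": {"type": "string", "description": "Optional artist name"},
        },
        "required": ["album"],
    },
}

# Any model that speaks the same tool-calling convention can be handed this
# schema, which is what would make the backing model swappable.
print(json.dumps(play_album_tool, indent=2))
```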
> But I don’t see why the actual software integration part needs to be difficult.
That’s not the issue. The issue is that once Gemini is in place as the intelligence behind Siri, the bar is now much higher than today and so you have to be more careful if you consider replacing Gemini, because you’re as likely as not to make Siri worse. Maybe more likely to make it worse.
Oh well that’s a good problem to have, isn’t it? Siri being so good that they don’t want to mess it up.
That gives them plenty of runway to test and optimize new models internally before release and not feel like they need to rush them out because Siri sucks.
Doubt it. Of all the issues I run into with Siri none could be solved by throwing AI slop at it. Case in point: if I ask Siri to play an album and it can't match the album name it just plays some random shit instead of erroring out.
Um, if I ask an LLM about a fake band, it literally says "I couldn't find any songs by that band, did you type it correctly?", and it's about a million times more likely to guess correctly. Why do you say it doesn't solve loads of things? I'm more concerned about the problems it creates (prompt injection, hallucinations in important work, bad logic in code); the actual functionality will be fantastic compared to Siri right now!
Because I'm sitting here twiddling my thumbs waiting for random pages to go through their anti-LLM bot crap. LLMs create more problems than they solve.
> Um, if I ask an LLM about a fake band, it literally says "I couldn't find any songs by that band, did you type it correctly?", and it's about a million times more likely to guess correctly
Um, if Apple wrote proper error handling in the first place, the issue would be solved without LLM baggage. Apple made a conscious decision to handle "unknown" artists this way; LLMs don't change that.
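The error handling being asked for is trivial to express. Here's a toy sketch (the function, library, and albums are all hypothetical): fail loudly on an unknown album instead of silently playing something random.

```python
# Toy sketch: refuse to guess when the album isn't in the library,
# instead of falling back to playing random content.

LIBRARY = {"ok computer", "in rainbows"}  # hypothetical user library

def play_album(name):
    if name.lower() not in LIBRARY:
        # Surface the failure to the user rather than guessing.
        raise LookupError(f"No album named {name!r} in your library")
    return f"Playing {name}"

print(play_album("OK Computer"))
# play_album("Fake Band Album")  # raises LookupError instead of guessing
```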
Ollama! Why didn't they just run Ollama with a public model? They've spent the last 10 years with a Siri that doesn't recognize a contact named Chronometer, only to now require a best-in-class LLM?
The other day I was trying to navigate to a Costco in my car. So I opened google maps on Android Auto on the screen in my car and pressed the search box. My car won't allow me to type even while parked... so I have to speak to the Google Voice Assistant.
I was in the map search, so I just said "Costco" and it said "I can't help with that right now, please try again later" or something of the sort. I tried a couple more times until I changed up to saying "Navigate me to Costco" where it finally did the search in the textbox and found it for me.
Obviously this isn't the same thing as Gemini but the experience with Android Auto becomes more and more garbage as time passes and I'm concerned that now we're going to have 2 google product voice assistants.
Also, tbh, Gemini was great a month ago but since then it's become total garbage. Maybe it passes benchmarks or whatever but interacting with it is awful. It takes more time to interact with than to just do stuff yourself at this point.
I tried Google Maps AI last night and, wow. The experience was about as garbage as you can imagine.
I'm genuinely curious about this too. If you really only need the language and common sense parts of an LLM -- not deep factual knowledge of every technical and cultural domain -- then aren't the public models great? Just exactly what you need? Nobody's using Siri for coding.
Are there licensing issues regarding commercial use at scale or something?
Pure speculation, but I’d guess that an arrangement with Google comes with all sorts of ancillary support that will help things go smoothly: managed fine tuning/post-training, access to updated models as they become available, safety/content-related guarantees, reliability/availability terms so the whole thing doesn’t fall flat on launch day etc.
Probably repeatability and privacy guarantees around infrastructure and training too. Google already has very defined splits between its Gemma and in-house models, with engineers and researchers rarely communicating directly.
That said, Apple is likely to end up training their own model, sooner or later. They are already in the process of building out a bunch of data centers, and I think they have even designed in-house servers.
Remember when iPhone maps were Google Maps? Apple Maps has been steadily improving, to the point that it is as good as, if not better than, Google Maps in many areas, like around here. I recently had a friend send me a GM link to a destination, and the phone used GM for directions. It was much worse than Apple Maps; after a few wrong turns, I pulled over, fed the destination into Apple Maps, and completed the journey.
OpenAI is (was?) extremely good at making things that go viral. The successful ones for sure boost subscriber count meaningfully
Studio Ghibli, the Sora app. Go viral, juice the numbers, then turn the knobs down on copyrighted material. Atlas, I believe, was less successful than they would've hoped.
And because of too-frequent version bumps that are sometimes released as an answer to a Google launch rather than as a meaningful improvement, I believe they're also having a harder time going viral that way
Overall, OpenAI throws stuff at the wall and sees what sticks. Most of it doesn't and gets (semi-)abandoned. But some of it does, and that makes for a better consumer product than Gemini
It seems to have worked well so far, though I'm sceptical it will be enough for long
Going viral is great when you're a small team or even a million dollar company. That can make or break your business.
Going viral as a billion-dollar company spending upward of $1T is still not sustainable. You can't pay off a trillion dollars with "engagement". The entire advertising industry is "only" worth about $1T as is: https://www.investors.com/news/advertising-industry-to-hit-1...
I guess we'd have to see the graph with the evolution of paying customers: I don't see the number of potential-but-not-yet clients being that high, certainly not one order of magnitude higher. And everyone already knows OpenAI, they don't have the benefit of additional exposure when they go viral: the only benefit seems to be to hype up investors.
And there's something else about the diminishing returns of going viral... AI kind of breaks the usual assumptions in software: that building it is the hard part and that scaling is basically free. In that sense, AI looks more like regular commodities or physical products, in that you can't just Ctrl-C/Ctrl-V: resources are O(N) on the number of users, not O(log N) like regular software.
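The O(N)-vs-O(log N) point can be made concrete with a toy cost model. The numbers are arbitrary units chosen only to show the shapes, not real costs:

```python
# Toy illustration of the scaling claim: conventional software's total infra
# cost grows slowly with users (shared servers, caching, CDNs), while
# inference-heavy AI burns compute per active user, i.e. O(N).
# All constants are arbitrary illustrative units, not real figures.

import math

def classic_software_cost(users, base=1000):
    # roughly logarithmic growth in total cost
    return base * math.log2(users + 1)

def inference_cost(users, per_user=5):
    # every user costs GPU time: linear in N
    return per_user * users

for n in (1_000, 1_000_000):
    print(n, round(classic_software_cost(n)), inference_cost(n))
```

At small N the two are comparable; at a million users the linear term dominates by orders of magnitude, which is the "AI looks like a physical commodity" point.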