tekacs's comments | Hacker News

... this (Telegram, etc. integration) is what CC's Channels feature is for:

https://code.claude.com/docs/en/channels

It's... fine. A bit half-baked like a lot of CC features right now.


I would be curious what you think of the idea of Sail and Muddy being... small. Technically complex, but small in the mind of the user. Not lacking in features (you talked about that), but 'feeling small/bounded, and therefore with small divergence' to the user. Does that... fit at all with your mental model of them?

I ask because I feel like Linear, Vercel, Figma, Notion, hell even Airtable... landed 'big' (felt like a big step change) with most users when they arrived (I was a super super early user of Notion because my friend angel invested).

I used Sail and Muddy back when and... the small vs. big distinction matches my perception of what separates the things that get washed out by this effect from those that don't.

(also DM-ed you!)


Yeah I think that framing fits. The technical complexity in Sail and Muddy was real, but hidden in a way that didn’t translate into perceived user value.

We had some theories for how it could land big, but none strongly resonated. It wasn’t just “put websites in another app.” We were hoping multiplayer would do something similar to what Notion and Airtable did. In my mind, those products “land big” because they feel like docs and sheets on steroids. Blocks, databases, formulas, all inside surfaces people spend so much time in, so the step change feels obvious.

With Sail/Muddy, the bet was that multiplayer browser surfaces would land big and help with collaboration, alignment, handoff, etc. Someone sends you the exact things to click on inside a message, you pin them to come back to later, no more switching tabs, you can see what other people are doing. Some users did see Sail as a tool for big research projects, accumulating tabs and sources spatially, though mostly single player.

In both products, we were also rendering browser tabs and web content inside their own processes. Sail on an infinite canvas, Muddy inside a shared chat workspace. Architecturally, there’s a big difference between “this is an iframe in a web app” and “this is a real browser tab with full capabilities.” But that distinction doesn’t land unless people feel a step change in what they can do. To most users, it just read as embeds. They weren’t thinking about iframe limitations, process isolation, site compatibility, browser architecture, or the experience that enabled. And they shouldn’t have had to.

So yeah, not small in ambition or product theory, but small in perceived divergence. The system was ambitious, but the delta users felt was often more like “a nicer way to look at web stuff inside another interface,” not “this changes how I work with people or how I use my computer”.


Not sure if this fits, but I think the browser, as a portal to the web of all information, invites you into a space of so many possibilities, so many user interfaces and experiences, that any change to the portal itself exists alongside a sea of other (to that person, at least) similar variants, and must necessarily feel small. The browser has perhaps succeeded as a category precisely because it was the thing that got out of the way of all that content, so you weren't just competing with browsers for user mindshare and differentiation, you were competing with the entire Internet. People's experience of the browser is not really of the browser as a product; it is of what it gives access to. And there may be reasons beyond a lack of imagination that all major browsers have converged on essentially the same pattern. Just an idea.

It can come up as "I did not expect _arbitrary_ code execution/overwrite, especially not as root."

e.g. in an installer:

  1. Download package
  2. Maybe 'prepare' as the user – this could be _entirely_ caller-driven (i.e. you didn't run any code, you just provided materials for the installer to unpack/place/prepare), or it could include some light/very restricted code execution
  3. Perform 'just one operation' such as 'copying things into place' (potentially with escalation/root)
  4. In step 3, the preparation from step 2 resulted in something being placed where a binary lives (which then runs), and/or in important files being overwritten (if something placed in step 2 was used as a target)
I'm collapsing and simplifying - lots more possibilities and detail than the above.
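
A minimal sketch of that shape, for illustration only (hypothetical names and paths, not any particular installer):

  // Step 2 stages caller-provided material as the user; step 3 then copies it
  // into place with elevated privileges, trusting whatever was staged.
  import java.io.IOException;
  import java.nio.file.*;
  import java.util.stream.Stream;

  class InstallerSketch {
      // Step 3: "just one operation", performed as root.
      static void installAsRoot(Path staging, Path destRoot) throws IOException {
          try (Stream<Path> paths = Files.walk(staging)) {
              for (Path src : (Iterable<Path>) paths::iterator) {
                  Path dest = destRoot.resolve(staging.relativize(src).toString());
                  if (Files.isDirectory(src)) {
                      Files.createDirectories(dest);
                  } else {
                      // Step 4: if the staged tree contains e.g. bin/some-daemon, or an
                      // entry that resolves through a symlink to a sensitive path, this
                      // replaces it as root: an arbitrary overwrite, with the attacker
                      // never "running" code until the placed binary later executes.
                      Files.copy(src, dest, StandardCopyOption.REPLACE_EXISTING);
                  }
              }
          }
      }
  }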

I mean Java's Loom feels like the 'ultimate' example of the latter for the _ordinary_ programmer, in that it effectively leaves you just doing what looks like completely normal threads however you so please, and it all 'just works'.
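
For what it's worth, a minimal sketch of what that looks like (JDK 21+; the task count and sleep are arbitrary):

  import java.time.Duration;
  import java.util.concurrent.Executors;
  import java.util.stream.IntStream;

  public class VirtualThreadsDemo {
      public static void main(String[] args) {
          // Each task is written as plain blocking code; the JVM parks the
          // virtual thread during the sleep instead of tying up an OS thread,
          // so tens of thousands of these "just work".
          try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
              IntStream.range(0, 10_000).forEach(i ->
                  executor.submit(() -> {
                      Thread.sleep(Duration.ofSeconds(1)); // ordinary blocking call
                      return i;
                  }));
          } // close() waits for all submitted tasks to complete
      }
  }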

Java has gone full circle.

Java had green threads in 1997, removed them in 2000 and brought them back properly now as virtual threads.

I'm kinda glad they've sat out the async mania. With virtual threads/goroutines, the async stuff just feels like lipstick on a pig. Debugging, stacktraces etc. are just jumbled.


Java didn't really "sit it out". It launched CompletableFutures, CompletionStages, Sources and Sinks, arguably even streams. All of those are standard library forms of async programming. People tried to make it catch on, but the experience of using it (the runtime wrapping all your errors in CompletionExceptions, destroying your callstacks) just made it completely useless.

Every Java codebase using something like Flux serves as a datapoint in favor of this argument - they're an abomination to read, reason about or (heaven help you) debug.
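
A tiny sketch of the callstack complaint: an exception thrown inside a stage surfaces wrapped in CompletionException, and the printed trace is dominated by CompletableFuture/ForkJoinPool internals rather than the code that composed the pipeline.

  import java.util.concurrent.CompletableFuture;

  public class CallstackDemo {
      static String lookup(String id) {
          throw new IllegalStateException("no such user: " + id);
      }

      public static void main(String[] args) {
          CompletableFuture
              .supplyAsync(() -> "user-42")
              .thenApplyAsync(CallstackDemo::lookup) // the actual failure happens here
              .join(); // throws CompletionException wrapping the IllegalStateException
          // The resulting trace shows CompletableFuture and ForkJoinPool frames from
          // the worker thread; the code that built the pipeline is largely absent.
      }
  }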

I don't think comparing 97's green threads to virtual threads ever made sense.

Their purpose, their implementation, everything is just so different; they don't share anything at all.


In Rust debugging and stacktraces are perfectly fine because async/futures compile to a perfect state machine.

They are not perfectly fine. If a task panics then you will get the right stack trace, but there is no way to get a stack trace for a task that’s currently waiting. (At least not without intrusive hacks.)


> This functionality is experimental, and comes with a number of requirements and limitations.

I assume that answers your question.


So once it's out of the experimental stage it won't be an intrusive hack anymore?

They stopped at the Promises level with CompletableFuture, which led to "colored frameworks" like WebMVC vs. WebFlux in Spring.

Who is "they"? Java has moved past those promise-based APIs and avoided the async/await mistake.

I'm curious how escape analysis works with virtual threads. With the asynchronous model, an object local to a function will be migrated to the old generation heap while the external call gets executed. With virtual threads I imagine the object remains in the virtual thread "stack", therefore reducing pressure in garbage collection.

The initial Loom didn't really provide the semantics and ergonomics of async/await which is why they immediately started working on structured concurrency.
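
For reference, the structured concurrency shape looks roughly like this. This is the preview StructuredTaskScope API from recent JDKs (needs --enable-preview, and the API has since been revised); findUser/fetchOrder are stand-ins:

  import java.util.concurrent.StructuredTaskScope;

  public class ScopeDemo {
      static String findUser()   { return "user-42"; }
      static int    fetchOrder() { return 7; }

      static String handle() throws Exception {
          try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
              var user  = scope.fork(ScopeDemo::findUser);   // runs in its own virtual thread
              var order = scope.fork(ScopeDemo::fetchOrder); // runs concurrently

              scope.join()            // wait for both; a failure in one cancels the other
                   .throwIfFailed();  // propagate the first failure, if any

              return user.get() + ": " + order.get();
          } // leaving the scope guarantees both subtasks have finished
      }

      public static void main(String[] args) throws Exception {
          System.out.println(handle());
      }
  }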

And for my money, I prefer async/await to the structured concurrency stuff.


Something in favor of this is the fact that it runs in their cloud and literally tells you that it costs (I think) $10 to $25 per run.

It wasn't even the local-ness so much. Even if they stored it remotely it would be okay, like ChatGPT or Claude, but unlike the others, for a long time the only way to let it store history on their servers was to also allow them to train on it. I haven't checked if that's changed.

I think this is so relevant, and thank you for posting this.

Of course it's trivially NOT true that you can defend against all exploits by making your system sufficiently compact and clean, but you can certainly have a big impact on the exploitable surface area.

I think it's a bit bizarre that it's implicitly assumed that all codebases are broken enough that, if you were to attack them sufficiently, you'd eventually find endlessly more issues.

Another analogy here is to fuzzing. A fuzzer can walk through all sorts of states of a program, but when it hits a password, it can't really push past that because it needs to search a space that is impossibly huge.

It's all well and good to try to exploit a program, but (as an example) if that program _robustly and very simply_ (the hard part!) says... that it only accepts messages from the network that are signed before it does ANYTHING else, you're going to have a hard time getting it to accept unsigned messages.
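
A minimal sketch of that shape, using java.security (key provisioning and message framing are left out and assumed to exist elsewhere):

  import java.security.PublicKey;
  import java.security.Signature;

  public class SignedInbox {
      private final PublicKey trustedKey; // provisioned out of band (assumption)

      SignedInbox(PublicKey trustedKey) { this.trustedKey = trustedKey; }

      // The only entry point for network input: verify before doing ANYTHING else.
      void accept(byte[] payload, byte[] signature) throws Exception {
          Signature verifier = Signature.getInstance("SHA256withRSA");
          verifier.initVerify(trustedKey);
          verifier.update(payload);
          if (!verifier.verify(signature)) {
              return; // unsigned/invalid input never reaches parsing or handlers
          }
          handle(payload); // hypothetical: parsing and business logic live past the gate
      }

      void handle(byte[] payload) { /* ... */ }
  }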

Admittedly, a lot of today's surfaces and software were built in a world where you could get away with a lot more laziness than this. But I could imagine, for example, a state of the world in which we're much more intentional about what we accept and even bring _into_ our threat environment. Similar to the shift from network to endpoint security. There are, for sure, uh, a million systems right now with a threat model wildly larger than it needs to be.


Problem is, the way economic activity is organised in general, there is no transition path from complex bloated systems to well-designed, completely human-auditable systems. For example, given the inherent (and proven) security risks of the Wordpress ecosystem, nobody should run WP anymore.

I'd hazard a guess that 90% of WP instances could be replaced by a static site generator + some tiny app to handle forms, and 9/10ths of the remaining ones by static gen + forms + some external commenting system, whether in the cloud or something like commento.

Correct. And yet, people are not doing it.

Right, but until now, and even today, in most people's early and primitive use of AI, it's been relatively difficult to make that change. To the extent that later this year and next year, people are able to point an agent at a WordPress instance, and iterate with it until it has a parity version of their surface in a custom form, things might start to change.

To be clear, I'm not one of the people who believes that software is going away or that UX is going away. I think those are both still very important. But I do think that a lot of legacy software can be replaced, and then we'll end up with a new level of software in the longer term.


Or maybe the majority of those people have crappy WordPress websites because they got some social proof that having a website equals making more money, so they scammed some freelancer out of some hours and hey presto.

Then the social proof moved to proprietary darknets, e.g. Facebook pages, which is easier - you don't have to learn anything.

I've seen no local small business care about its webpage, but I've seen a lot of them painfully struggle with crappy LOB smartphone apps.

I expect software and UX to only decline in quality.


Gall's Law applies here: "A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system."

Usually the way this happens in practice is that you take what has been learned about the market and the requirements from older, bloated, not-working-anymore products, and then start a new company and a new product that is simpler and hits 80% of the use cases with 20% of the complexity. There's even a name for this in business: "Disruptive Innovation". The simple product will eventually become bloated and complex and fail once it gets popular and lots of people start working on it, but then you start the cycle anew.

The economy is actually very well structured to accommodate this. One of the great parts of capitalism and market economies is that it tolerates partial failures extremely well: you just buy from a different supplier. This is in contrast to other systems like fascism, communism, socialism, bureaucracy, and state capitalism where the failure of the system usually means the failure of the state as well, because there is no way to replace parts of the system without a revolution.

There is arguably a problem with the current U.S. economy where the government has become overly involved in certain "too big to fail" industries, thus creating a system much closer to state capitalism that can no longer tolerate partial failures and so is condemned to one huge failure. This is unfortunate, but the eventual resolution is the same: throw it out and start again.


> whether or not Gemini really does forget what it has seen as easily as claimed

Whoever is writing this seems to have absolutely no clue how AI works.

Given that Google is clear about the fact that they don't train on your emails, the worst that could be happening here is that... within the scope of your account they maintain an extra index or two, or... additional synthesized data, in addition to the many indexes that they already maintain over your email.


For example, while composing a reply to recipient B, it could leak some details that it "learned" when reading a mail from sender A - details which you did not want to share with B. I have no idea how they organize sessions, indexes and whatever else they use. But if no "side-channels" existed, I would be extremely surprised.

Of course, reading the generated text before clicking "Send" remains the sole responsibility of the user. We all know that reading drafts can happen more or less carefully, especially when in a hurry.


> Whoever is writing this seems to have absolutely no clue how AI works.

The question isn’t really about how AI works. It’s about how Google (the company) works. Do their actions match their stated intentions? Which is really a question of trust. Are they incentivised to lie? Yes. Are they likely to survive a disclosure scandal? Facebook’s experience inclines me to believe yes.


Google has been "clear" on many things in the past that were outright lies.


https://meta.ai/share/pe4HxOfv2Bp

Finding it a little bit tricky to evaluate because the harness is unfortunately very, very bad (e.g. search is awful). Can't wait to try this in some real external services where we can see how it performs for real.

Definitely getting ordinary high-quality results, overall. But hard to test agentic behavior and hard to test prose quality, even, when just working off of the default chat interface.

One thing that stands out is that _for_ the quality it feels very, very fast. Perhaps it's just very lightly loaded right now, but regardless, it's lovely to feel.

I'm quite impressed with the tone overall. It definitely feels much more like Opus than it does, like, GPT or Grok in the sense that the style is conversational, natural and enjoyable.


This seems pretty good.


"We want to see risks in the models, so no matter how good the performance and alignment, we’ll see risks, results and reality be damned."


i mean, to be fair, these are professional researchers.

i'm very inclined to trust them on the various ways that models can subtly go wrong, in long-term scenarios

for example, consider using models to write email -- is it a misalignment problem if the model is just too good at writing marketing emails?? or too good at getting people to pay a spammy company?

another hot use case: biohacking. if a model is used to do really hardcore synthetic chemistry, one might not realize that it's potentially harmful until too late (ie, the human is splitting up a problem so that no guardrails are triggered)


"for example, consider using models to write email -- is it a misalignment problem if the model is just too good at writing marketing emails?? or too good at getting people to pay a spammy company?"

But who gets to be the judge of that kind of "misalignment"? giant tech companies?


Might makes right; brains hold reins.

