tobyhinloopen's comments

“Use an agent to…” is much more effective in my experience, because subagents have no means of communicating with you. They are more likely to just do it.

Yeah, this happens to me all the time! I have a separate session for discussing and only apply edits in worktrees / subagents to clearly separate discussion from work, and it still does it.

Sonnet is very clearly worse than Opus for a lot of tasks. Sonnet is still awesome, but less so.

How would you know the invocation is correct when written by a human? Don’t humans make mistakes?


Sure, humans make mistakes... but rarely, vanishingly rarely about commands they use often. Are you going to make a non-typo kind of mistake when typing `ls -l`? AI hallucinations don't happen all the time, but they happen so much more often than "vanishingly rarely".

That's why you can't just vibe-code something and expect it to work 100% correctly with no design flaws; you need to check the AI's output and correct its mistakes. Just yesterday I corrected a Claude-generated PR that my colleague had started, but hadn't had time to finish checking before he went on vacation. He'd caught most of its mistakes, but there was one unit test that showed that Claude had completely misunderstood how a couple of our services are intended to work together. It was the kind of mistake a human would never have made: a novice wouldn't have understood those services enough to use them in the first place, and an expert would have understood them and how they are supposed to work together.

You always, always, have to double-check the output of LLMs. Their error rate is quite low, thankfully, but on work of any significant size their error rate is pretty much never zero. So if you don't double-check them then you're likely to end up introducing more bugs than you're fixing in any given week, leading to a codebase whose quality is slowly getting worse.


Russia is the aggressor, Iran is a defender. That’s a huge difference.


How many users are using Lockdown Mode?


I’ve been using it for more than a year.

Parts of it are pretty inconvenient, like with iMessage and FaceTime not working normally, but aside from that it’s not noticeable for my use case.

Despite the inconveniences, unless animated emojis are important to you, I don’t know why you wouldn’t enable it given how strong its protections are.


Everyday users? Probably not many. It forcibly disables lots of nice-to-have features.

But users who need a highly secure phone? It’s entirely possible to use the phone without media embeds in iMessage, or shared photo albums, or websites loading in 900 fonts. It’s a trade-off likely worth making in some situations.


You can make a shared photo album with family members. It’s everyone else that is problematic with the feature enabled. In my case I only want to share with my wife and son so it wasn’t a detractor for me.


I’ve used it on my personal iPhone since the feature was released. The impact on my life has been minor. I can’t share some things with my wife in the Health app and my son can’t SharePlay with me in the car while I use CarPlay.


I turned it on, out of curiosity, and the impact is minimal, for me.


I was using it until the iOS 26 upgrade on my iPhone 13 Mini. It became so sluggish and unusable that I had to disable it. It clearly isn't tested well.


I turn it on when I travel overseas, and have considered turning it on when I’m near border regions in America.

It’s mostly that I don’t want to be that guy that leaks my company’s secrets.


I use Preact without reactivity. That way we can have familiar components that look like React (including strong typing, TypeScript / TSX) and server-side rendering, and still have explicit render calls using an MVC pattern.


How and when do your components update in such an architecture?


View triggers an event -> Controller receives event, updating the model as it sees fit -> Controller calls render to update views

Model knows nothing about controller or views, so they're independently testable. Models and views are composed of a tree of entities (model) and components (views). Controller is the glue. Also, API calls are done by the controller.

So it is more of an Entity-Boundary-Control pattern.
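
A minimal sketch of the flow described above, in plain TypeScript. Everything named here (`CounterModel`, `CounterView`, `CounterController`, `ButtonNode`) is made up for illustration and is not the actual code; the view returns a tiny object instead of TSX so the sketch runs without dependencies. In a real setup the view would be a Preact component and the controller's `render` would call Preact's `render(vnode, container)`.

```typescript
// Model: plain state and operations. It knows nothing about views or controllers.
class CounterModel {
  private count = 0;

  increment(): void {
    this.count += 1;
  }

  get value(): number {
    return this.count;
  }
}

// Stand-in for a rendered component (hypothetical; a real view would return a VNode).
type ButtonNode = { label: string; onClick: () => void };

// View: a pure function of its props. It only reports events through callbacks.
function CounterView(props: { value: number; onIncrement: () => void }): ButtonNode {
  return { label: `Count: ${props.value}`, onClick: props.onIncrement };
}

// Controller: the glue. It receives view events, updates the model,
// and then explicitly re-renders. Nothing updates reactively.
class CounterController {
  private readonly model: CounterModel;
  current: ButtonNode;
  renderCount = 0;

  constructor(model: CounterModel) {
    this.model = model;
    this.current = this.render();
  }

  private handleIncrement = (): void => {
    this.model.increment(); // update the model as the controller sees fit
    this.render();          // explicit render call
  };

  render(): ButtonNode {
    this.renderCount += 1;
    this.current = CounterView({
      value: this.model.value,
      onIncrement: this.handleIncrement,
    });
    return this.current;
  }
}

// Usage: simulate a click on the rendered view.
const controller = new CounterController(new CounterModel());
console.log(controller.current.label); // "Count: 0"
controller.current.onClick();
console.log(controller.current.label); // "Count: 1"
```

Because the model never references the controller or the view, it can be tested on its own, and the view can be tested by passing props and a stub callback.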


From what I can tell, they do full page reloads when visiting a different page, and use Preact for building UIs using components. Those components and pages then get rendered on the server as typical template engines.


Could you show an example?


Neat! I was looking for something like this


thanks! let me know how it goes


Way too expensive, I'll wait for a free/open source browser optimized to be used by agents.


Our approach is actually very cost-effective compared to alternatives. Our browser uses a token-efficient LLM-friendly representation of the webpage that keeps context size low, while also allowing small and efficient models to handle the low-level navigation. This means agents like Claude can work at a higher abstraction level rather than burning tokens on every click and scroll, which would be far more expensive


If a potential user says it is too expensive, better to ask why than to tell them they are wrong. You likely have assumptions you have not validated


Definitely! Making Smooth as cost-effective as possible has been a core goal for us, so we'd really love to hear your thoughts on this.

We'll continue to make Smooth more affordable and accessible as this is a core principle of our work (https://www.smooth.sh/images/comparison.gif)


Are your evals / comparisons publicly or third-party reproducible?

If it's "trust me, I did a fair comparison", that's not going to fly today. There's too much lying in society. Trusting people who are trying to sell you something to tell the truth is not the default anymore; skepticism is.


That's a great point, we'll publish everything on our docs as soon as possible


I'm paying a fixed amount for Claude and other agents, so "more tokens" is "free" for me. There are a lot of niche tools out there, but I think we all have "subscription fatigue".

But maybe that's just me. Maybe I'm just not your target audience :)


Same! If I put the skill's instructions in the general AGENTS.md, it works just fine.

