More

afro88 · 2026-04-24T03:46:59 1777002419

Experienced engineers that know the codebase and system well, and with enough time to consider the problem properly would likely consider this case.

But if we're vibing... This is the kind of bug that should make it back into a review agent/skill's instructions in a more generic format. Essentially if something is done to the message history, check there tests that subsequent turns work as expected.

But yeah, you'd have to piss off a bunch of users in prod first to discover the blind spot.

afro88 · 2026-04-24T03:26:29 1777001189

Or learnt to use an existing one.

I vibed a low stakes budgeting app before realising what I actually needed was Actual Budget and to change a little bit how I budget my money.

afro88 · 2026-04-24T03:25:11 1777001111

I can't remember what the technique is called, but back in the GPT 4 days there was a paper published about having a number of attempts at responding to a prompt and then having a final pass where it picks the best one. I believe this is part of how the "Pro" GPT variant works, and Cursor also supports this in a way (though I'm not sure if the auto pick best one at the end is part of it - never tried)

afro88 · 2026-04-22T07:12:42 1776841962

That's not the same picture

afro88 · 2026-04-21T22:38:33 1776811113

This is an excellent article. And can I just remind everyone that this is what human authorship looks like? Clearly not LLM generated. It has the author's unique tone, take on the subject, research, clear compelling story... A real breath of fresh air to be honest.

grvdrm · 2026-04-22T11:12:06 1776856326

Recently think that Ben's writing is more complex and verbose than ever, but I agree with your point entirely. He is writing it, not AI. I don't listen to his voiceovers but think of the articles as narrated by a captivating in-person presenter/lecturer.

afro88 · 2026-04-18T04:09:09 1776485349

I got a lot of <empty> as well. But was able to get a slide deck out of it before that happened, and it was reasonably good. Not 1-shot good, but better than what I have gotten out of Opus 4.6 with a skill previously

afro88 · 2026-04-17T19:35:16 1776454516

If you work with an exceptional one, sure

afro88 · 2026-04-17T12:29:07 1776428947

And India. It's a common experience that engineering teams from India will say yes to everything and then do what they think is best. Rather than saying no and explaining what they want to do instead

afro88 · 2026-04-16T09:26:39 1776331599

I believe it generates playwright scripts (non deterministically) which are saved and executed again (deterministically)

afro88 · 2026-04-15T03:35:15 1776224115

I know this was just an example, but:

> I know I wrote a good one for my git commit/push flow somewhere, but finding it when I need it usually takes longer than just rewriting it

This is actually a really good use case for a skill. Then when you go "commit and push" it'll do the right thing