It was announced in like November of last year, so it's certainly taken some time. The announcement was by some senior management at GitHub, so it has some degree of buy-in.
You can write your own linters for every dumb AI mistake, add them as pre-commit checks, and never see that mistake in committed code ever again.. it’s really empowering.
You don’t even have to code the linters yourself. The agent can write a python script that walks the AST of the code, or uses regex, or tries to run it or compile it. Non zero exit code and a line number and the agent will fix the problem then and rerun the linter and loop until it passes.
Lint your architecture - block any commit which directly imports the database from a route handler. Whatever the coding agent thinks - ask it for recommendations for an approach!
Get out of the business of low level code review. That stuff is automatable and codifiable and it’s not where you are best poised to add value, dear human.
There should be volunteer groups at local libraries running these services for their local communities.
It’d be a great way for kids to learn to operate services and a great alternative for anyone who wants to use the fantastic open source stuff that’s out there but lacks expertise or time.
> There should be volunteer groups at local libraries running these services for their local communities.
The problem with bespoke anything in computers is always the support.
No one wants to be on the hook for customer support. I absolutely agree with them.
There are a ton of "services" that exist solely to enable people to cut a check and say "Customer support is over there. Go talk to them and leave me alone."
> Claude produced a complete first draft in three days. It looked professional. The equations seemed right. The plots matched expectations. Then Schwartz read it, and it was wrong. Claude had been adjusting parameters to make plots match instead of finding actual errors. It faked results. It invented coefficients. It produced verification documents that verified nothing. It asserted results without derivation. It simplified formulas based on patterns from other problems instead of working through the specifics of the problem at hand.
This is solvable with harness engineering.
The model’s first try is never ready for human consumption. There needs to be automation (bespoke, a mix of code and prompt based hooks - which agents can build) to force the agent’s output back through itself to tell it to be more rigorous, search online for proof of its claims, etc etc. and not stop until every claim is verifiable.
No human should see the model’s output until it’s met these (again bespoke but not hand written) guardrails.
What I’m talking about doesn’t exist and really has no analogy yet, so you can think of it as a super advanced form of linting. It’s grounding, but also verification that the grounding links to the material, and refusal to accept the model’s work until it meets the bar.
We are asking models to dream (invent purely from their weights), and are surprised when their dreams, just like ours, have little relationship to reality. The current state of the art is going to look very naive in a few years’ time.
25 years ago my friends and I all had hard drives crash, and lost my data” experience. Corporate cancelation is harder to mitigate - but far rarer. The way things work un 2026 is very friendly to normies.
But you dear HNer ain’t a normie!
You have the skills to migrate your stuff away. It is time to pull the trigger.
If you’re in a position of considering alternatives, I find Fastmail to be fully featured, support saving the key stuff offline, and most importantly FAST!
No ”try our AI for free!” nudges or “smart features” that you need to go through and decide whether to disable.. which is a feature these days.
appreciate the suggestion, but I like my gmail account! I just want a fast, stripped down interface. and BAREmail is free, doesn't need a backend, and open source. looks like Fastmail is paid only?
I mean, they do one thing.
Looking forward to seeing if they respond.
reply