zuzululu · 2026-06-06T05:32:20 1780723940

When did HN become Reddit? This is a real demographic shift I am seeing after a long hiatus. The people who hate AI are largely those who lean far left and see themselves as liberal progressives.

haunter · 2026-06-06T06:09:46 1780726186

I also see that, and I'd say it started around Covid / post-Covid. More people became terminally online in those 6-12 months, like another Eternal September.

Funnily you will always see some people waving the HN guidelines [1] flag: nooooo, don’t compare this site to Reddit. Yet there is another „rule” in the guidelines about politics being off-topic… which is the biggest symptom of HN turning into Reddit: General, especially US domestic, politics became excessively acceptable to be posted here. That wasn’t the case 10 years ago or more. Of course if you point that out then the „everything is politics” crowd will show up and the „should we close our eyes and ears to all the tragedies happening in the world”. Rinse and repeat.

That’s the problem with ambiguous rules and to some extent why I still prefer Reddit. If you don’t like it you make a new sub, find another one etc. At least the bias is clearly known.

[1] https://news.ycombinator.com/newsguidelines.html

zuzululu · 2026-06-06T04:15:46 1780719346

I know it isn't really a number station but I wanted it to be true...

Someone broadcasting one time pad messages using GPS over years...

a spy operative using jogging app changing routes slightly

or maybe a cartel member embedded inside highly hostile countries like Singapore

zuzululu · 2026-06-06T01:43:15 1780710195

perhaps companies that are lowballing compensation will not budge on leetcode, and only those that see real uplift from AI use will move away from it, toward more holistic, experience-based tests.

i could be misreading it

zuzululu · 2026-06-05T21:59:12 1780696752

interesting but i rarely use any HN reader

there's just something about this UI and its consistency

I also don't mind all the AI related news

If anything I just wish they had a mute/block button. It's not fun when somebody is stalking you, replying to every comment you make.

zuzululu · 2026-06-05T21:05:40 1780693540

People forget a skill is just a markdown file, and I don't think TDD makes sense. It's more for specific niches, like working on your custom codebase or some less beaten paths you take, where you save the lessons going forward.

But everybody is free to choose how they work and it may be required in ways that we can't know about.

zuzululu · 2026-06-05T21:02:45 1780693365

TDD sounds great on paper for agentic development but you quickly realize it balloons the token cost. Often I write some feature and then it's repurposed or removed, and code is refactored and moved around as time goes on. With TDD I would be taxed heavily and velocity would slow to a crawl.

After trying out TDD, I find the waterfall approach better, especially when you have a multi-agent setup. Also, I found that in some cases the tests were just superficial hallucinations that never actually tested the components written, or there was some context corruption that ultimately triggered a false positive and kicked off a completely unintentional refactoring.

pramodbiligiri · 2026-06-06T07:47:39 1780732059

My approach (with LLMs especially) aligns more with what's outlined in "Growing OO Software Guided by Tests" (https://growing-object-oriented-software.com/toc.html). Chapter 4 there says "First, Test a Walking Skeleton", and Chapter 5 has "Start Each Feature with an Acceptance Test". I think it comes down to: get something working end-to-end first in a verifiable way, and then keep refining both the feature and its tests (preferably with TDD).

I've noticed that LLMs tend to generate multiple testcases in one shot (which is not how humans usually go about TDD), and also they don't start with Integration Tests, unless instructed to do so.

__mharrison__ · 2026-06-05T21:13:32 1780694012

My experience is the opposite. TDD keeps the guardrails on and lets me refactor with confidence.

Crazy times here in the development world. I'm always curious to watch others' best practices.

dools · 2026-06-05T21:22:59 1780694579

Yeah I specifically tell it not to pre-emptively fix tests that it knows will break as a result of changes it's making and instead limit itself only to creating new tests for new changes. I want to see the tests break, then we go through and review each set of breakages versus the mission and assess if they’re regressions or stale assertions. This is a) how I know it’s actually writing meaningful tests, b) a very functional and useful form of “code review” versus just trying to catch problems by reading diffs, and c) something that has helped me find real problems and regressions.

Almost all the breakages after a big refactor are stale assertions but every time I catch a couple of critical problems that make the entire exercise very worth it.

The whole dev process is so fast compared to writing software manually that I find it absurd that I wouldn’t invest heavily in automated tests.

__mharrison__ · 2026-06-05T21:23:43 1780694623

See my AGENTS.md in nearby comment

rsalus · 2026-06-05T23:24:47 1780701887

I was a big proponent of encoding TDD red-green-refactor methodology into my agent workflows until recently when I made the same realization after reading this study: https://arxiv.org/pdf/2602.07900

TLDR; it found test-writing volume only weakly correlates with success and that encoding test-writing principles did not move resolution rates but _did_ materially change cost. Encouraging tests cost +19.8% output tokens for 0% gain; discouraging them saved 33–49% input tokens for ≤2.6pp accuracy loss. Separately, imposing the TDD procedure specifically seems like it can backfire: it actually _increased_ regressions from 6.08% to 9.94%.

IMO, where tests clearly help is primarily as an "oracle" applied after generation. It gives the models a signal that enables them to verify and self-correct if necessary.
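A minimal sketch of the "oracle applied after generation" idea, for illustration only. Everything here is an assumption and not taken from the study: `candidate_slugify` stands in for agent-generated code, `ORACLE_CASES` is a hypothetical set of expected input/output pairs written independently of that code, and `run_oracle` is a hypothetical checker whose failure messages are the kind of signal a model could use to self-correct.

```python
from typing import Callable, List, Tuple

# Hypothetical oracle: expected behavior, written independently of the candidate code.
ORACLE_CASES: List[Tuple[str, str]] = [
    ("Hello World", "hello-world"),
    ("  padded  ", "padded"),
    ("a  b", "a-b"),
]


def candidate_slugify(text: str) -> str:
    """Stand-in for generated code; deliberately buggy (repeated spaces are not collapsed)."""
    return text.strip().lower().replace(" ", "-")


def fixed_slugify(text: str) -> str:
    """A corrected version, as might be produced after seeing the oracle feedback."""
    return "-".join(text.lower().split())


def run_oracle(fn: Callable[[str], str], cases: List[Tuple[str, str]]) -> List[str]:
    """Run fn against every oracle case and return human-readable failure messages."""
    failures: List[str] = []
    for given, expected in cases:
        try:
            actual = fn(given)
        except Exception as exc:  # a crash is also feedback, not a reason to stop
            failures.append(f"{given!r}: raised {exc!r}")
            continue
        if actual != expected:
            failures.append(f"{given!r}: expected {expected!r}, got {actual!r}")
    return failures


feedback = run_oracle(candidate_slugify, ORACLE_CASES)
feedback_after_fix = run_oracle(fixed_slugify, ORACLE_CASES)
```

Here `feedback` holds one message describing the repeated-space case, and `feedback_after_fix` is empty. The point of the design is that the checks exist apart from the generation step, so they cost nothing until there is code to verify.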

zuzululu · 2026-06-06T01:05:45 1780707945

Very interesting paper and it lines up exactly with my observations. The ROI just isn't there writing tests up front, and the conclusion in that paper lays it out clearly:

    Overall, these findings suggest that agent-written
    tests often behave more like a habitual software-development
    routine than a dependable source of validation in this setting. More
    agent-written tests do not mean more solves; what they more
    reliably change is the process footprint—API calls, token usage, and
    interaction patterns. Improving the value of testing for code agents
    may therefore require better oracles and more actionable validation
    signals, rather than simply inducing agents to write more tests.

> IMO, where tests clearly help is primarily as an "oracle" applied after generation

Bingo. I'm not against writing tests; it's that the returns are better when they're used as verification feedback and as an "oracle", exactly as you put it.

girvo · 2026-06-06T04:01:19 1780718479

Just chiming in to say that I've seen the exact same thing you have. Tests are better used to help validate that what was generated worked after the fact.

That, and even the absolute SOTA models still suck at writing tests.

Which shouldn't be surprising: humans suck at it too most of the time...

zuzululu · 2026-06-06T04:18:40 1780719520

Absolutely, there's no reason to believe that agents will be more capable of writing tests than any other piece of code. The big pay off is actually verifying the code that was generated.

necovek · 2026-06-06T04:54:16 1780721656

The paper focuses on two things: default behavior and behavior with a prompt to write at least one new test.

In general — just like with humans — I find "just add more tests" to be counter-productive.

Tests make sense in a testable architecture: TDD can encourage one to be implicitly used, but it is a design, architectural choice that should be made explicit (lean to functional code; use direct, explicit dependency injection; ensure test stubs are just variants of the real implementation and fully tested using the same test as the real one...). LLMs should be prompted with this guidance instead for proper value estimation.
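A rough sketch of what "explicit dependency injection" plus "stubs tested using the same test as the real one" could look like. All names here (`KeyValueStore`, `FileStore`, `InMemoryStore`, `check_store_contract`, `remember_greeting`) are hypothetical and chosen only for illustration; this is one possible reading of the guidance, not a prescribed pattern.

```python
import json
import os
import tempfile
from typing import Dict, Optional, Protocol


class KeyValueStore(Protocol):
    """The dependency's interface; both implementations below satisfy it."""
    def put(self, key: str, value: str) -> None: ...
    def get(self, key: str) -> Optional[str]: ...


class FileStore:
    """'Real' implementation: persists everything to a JSON file."""
    def __init__(self, path: str) -> None:
        self._path = path

    def _load(self) -> Dict[str, str]:
        if not os.path.exists(self._path):
            return {}
        with open(self._path, encoding="utf-8") as fh:
            return json.load(fh)

    def put(self, key: str, value: str) -> None:
        data = self._load()
        data[key] = value
        with open(self._path, "w", encoding="utf-8") as fh:
            json.dump(data, fh)

    def get(self, key: str) -> Optional[str]:
        return self._load().get(key)


class InMemoryStore:
    """Test stub: a simpler variant of the real thing, with no I/O."""
    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


def check_store_contract(store: KeyValueStore) -> None:
    """One contract test, applied unchanged to the stub and the real store."""
    assert store.get("missing") is None
    store.put("k", "v1")
    assert store.get("k") == "v1"
    store.put("k", "v2")  # overwriting must replace the old value
    assert store.get("k") == "v2"


def remember_greeting(store: KeyValueStore, name: str) -> str:
    """Application logic receives its dependency explicitly as an argument."""
    greeting = f"hello, {name}"
    store.put(name, greeting)
    return greeting


def run_contract_on_all() -> int:
    """Run the shared contract against both implementations; return how many were checked."""
    with tempfile.TemporaryDirectory() as tmp:
        stores = [InMemoryStore(), FileStore(os.path.join(tmp, "store.json"))]
        for store in stores:
            check_store_contract(store)
    return len(stores)
```

Because the stub passes the same contract as the file-backed store, a test of `remember_greeting` that uses `InMemoryStore` is exercising behavior the real implementation is also held to.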

dnautics · 2026-06-06T03:43:49 1780717429

> it balloons the token cost

how!!??

you write a test, which is one extra function. and maybe a paragraph or so per feature ("i made a RED test"... "i made it GREEN"), everything else is the same between normal development and TDD. this is chump change compared to the rest of development, including thinking tokens

manmal · 2026-06-05T21:56:27 1780696587

> With TDD I would be taxed heavily and velocity slow to a crawl.

And the code will be good.

rsalus · 2026-06-05T23:46:58 1780703218

not necessarily, TDD has little bearing on output quality

bfeynman · 2026-06-06T02:06:50 1780711610

In what world or frame of reference would doing TDD have "little" bearing on output quality? If you build a system around satisfying some set of requirements it seems logical that output quality would have pretty heavy correlation.

emigre · 2026-06-06T08:13:40 1780733620

It's possible to satisfy a set of requirements with code that's low quality. There's the maintainability of the code, for example, or the performance of the system.

mpweiher · 2026-06-06T09:03:58 1780736638

The set of requirements TDD encourages code to meet happen to be ones that increase code quality.

Code that is easy to test tends to be well-structured.

Code that is badly structured tends to be hard to test.

TDD is not a QA methodology, it is a design methodology. It also tends to help quality out a lot, but that's a secondary effect.
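To make the "easy to test tends to be well-structured" point concrete, here is a small sketch under stated assumptions: both functions and the `fixed_clock` helper are hypothetical, and the 9-to-17 UTC window is an arbitrary example rule. The first version hides its dependency on the wall clock; the second takes the clock as a parameter, which is the shape a test written first would tend to push you toward.

```python
from datetime import datetime, timezone
from typing import Callable


def is_business_hours_hidden() -> bool:
    """Hard to test: the result depends on when the test happens to run."""
    return 9 <= datetime.now(timezone.utc).hour < 17


def is_business_hours(now: Callable[[], datetime]) -> bool:
    """Easier to test: the clock is an explicit input, so any hour can be simulated."""
    return 9 <= now().hour < 17


def fixed_clock(hour: int) -> Callable[[], datetime]:
    """Hypothetical test helper returning a clock frozen at the given UTC hour."""
    return lambda: datetime(2026, 6, 5, hour, 0, tzinfo=timezone.utc)


# Boundary checks that are only possible because the clock is injected.
opens_at_nine = is_business_hours(fixed_clock(9))
closed_at_seventeen = is_business_hours(fixed_clock(17))
```

In production the caller would pass something like `lambda: datetime.now(timezone.utc)`; the logic itself no longer cares where the time comes from.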

manmal · 2026-06-06T04:51:01 1780721461

That’s an interesting proposition, are you saying people do TDD just for the heck of it?

jzig · 2026-06-05T21:48:29 1780696109

Pattern-based testing can theoretically reduce the token cost?

reg_dunlop · 2026-06-05T21:08:40 1780693720

But that repurposing/removal is exactly what's avoided if you follow through with the SEF framework he outlines.

I have to push back on the idea that token costs balloon when using TDD within the context of a strong framework such as Jason has laid out here.

If the feature is repurposed/removed/refactored....I'd argue the specification wasn't well thought out prior to burning into tokens.

We're so eager to do a lot of the wrong things quickly, when it may serve us better to do a more precise thing slowly.

zuzululu · 2026-06-05T21:30:22 1780695022

You can't spec out what you don't know; scope and requirements change from real-world feedback.

zuzululu · 2026-06-05T20:44:35 1780692275

they probably don't have the scale to support anything smaller, unlike stripe

i'm amazed that stripe is able to handle small guys like me

trumpdong · 2026-06-06T00:49:56 1780706996

They can't afford it because they have no customers, because they turn down small customers and then Stripe already has them when they become big enough for Adyen.

zuzululu · 2026-06-05T20:43:32 1780692212

doesn't seem like stripe has anything to worry about here; the total contract value is of irrelevant scale

i guess i expected it to be more significant seeing that it's the UK gov

johannes1234321 · 2026-06-05T21:13:53 1780694033

Well, it is signalling. It's signalling that there are competitors which are trustworthy enough for a government and it signals the overall trend one can observe how European governments detach from US companies. With some lower hanging fruit and some larger projects.

aiisjustanif · 2026-06-06T06:45:10 1780728310

Most government processing is prohibited at Stripe, which is why it always tends to be small. The risk of being exposed to more regulatory scrutiny is not worth it for a payments processor.

> “Prohibited Businesses… Government services… Disbursement of government economic support, such as grant” [1]

[1]: https://stripe.com/en-br/legal/restricted-businesses

zuzululu · 2026-06-05T17:41:25 1780681285

Where are they going? If it's not self-hosted, I don't see it not ending up like GitHub.

crazysim · 2026-06-05T17:54:34 1780682074

codeberg

I had a repo with more than a dozen forks banned on GitHub for some unclear TOS violations. Ticket has been sitting for a week plus now, asking for clarification and guidance.

So, it lives in codeberg now. https://codeberg.org/nelsonjchen/op-replay-clipper

zuzululu · 2026-06-05T18:09:48 1780682988

this just looks like a reskinned gitea

crazysim · 2026-06-05T18:49:06 1780685346

It's running a fork (codeberg specific) of a fork of gitea called forgejo (https://codeberg.org/forgejo/forgejo) so it's not surprising. The people behind it were a bit miffed at Gitea doing what they viewed as some questionable commercial endeavors and also not dog-fooding Gitea for Gitea.

zuzululu · 2026-06-05T20:39:04 1780691944

huh, i did not know that. thanks for forgejo, guess i'm moving

throawayonthe · 2026-06-05T23:17:53 1780701473

it's the "main" instance of forgejo, a gitea fork

https://codeberg.org/forgejo/forgejo

arealaccount · 2026-06-05T19:11:19 1780686679

Why do people not like gitlab? I’ve always found it a better experience than github

dwedge · 2026-06-06T07:34:47 1780731287

I tried self hosting gitlab. I installed it and got miffed that it wouldn't let me change password complexity requirements for a user, so I left it but left it running for "maybe later".

Two weeks later it had spammed 50GB of logs to the disk and was idling at 11GB RAM. With zero repos and zero active users. I don't want a git interface to be full of bloat.

That's why I don't like it. I'm moving a client from gitlab to forgejo at the moment.

parliament32 · 2026-06-05T22:35:39 1780698939

Personally, they're going wayyy too hard on the AI stuff. I just want an interface to git and maybe an issue tracker.

selfhoster1312 · 2026-06-05T19:52:07 1780689127

Gitlab's UI changes every now and then, for seemingly no reason. The UI is very full of stuff (hard to find your way around), and very slow. Notably in the past months, they've changed the issues/tickets board into a "work items" board which feels infinitely slower to load, has such a vague meaning that nobody can find it (especially when translated), and brings exactly 0 use to anyone i know. They just seem to be doing that with every feature and every part of the interface.

On the server side, gitlab was always very hard to selfhost, with many moving parts, many requirements, and heavy resource usage. gitlab-runner is not very explicit about things when you're not on the happy path (why is it not picking up jobs?).

I'm not even a minimalist. I've been running gitea/forgejo for the past 8 years or so and it's been a miracle in comparison: lightweight server, easy setup/upgrades, and super simple UI/UX that everybody understands on the first try. Forgejo (gitea community fork) learns from everything that Github historically did well (UX) without any enshittification in sight (developed by a non-profit). I highly recommend it.

plagiarist · 2026-06-05T20:46:59 1780692419

If you're leaving based on security failures, Gitlab is not the place to go.

stronglikedan · 2026-06-05T19:23:18 1780687398

same. so much more intuitive

phoronixrly · 2026-06-05T18:16:34 1780683394

There exist competent operations people and competent developers.

zuzululu · 2026-06-05T17:40:18 1780681218

I was impacted. found weird spam repos that later were deployed on cloudflare redirecting my domains.

meanwhile the gitea running on my metalbox for nearly a decade has seen no compromise and 100% uptime when cloudflare has gone down repeatedly

i'm rethinking the whole "go where the crowd is" idea: while great from an evolutionary point of view, online it's the complete opposite. Where the crowd gathers online is the most dangerous place.

em-bee · 2026-06-05T18:47:10 1780685230

it's the same with linux viruses. they were always a possibility, but because linux is not popular, they were never an issue.

LoganDark · 2026-06-05T20:41:25 1780692085

Linux is absolutely popular for servers. If you put a WordPress installation, or any other kind of PHP, on the IPv4 address space, you usually find a webshell has appeared after just a few minutes.

dwedge · 2026-06-06T07:29:46 1780730986

This totally isn't true. Sure, if you load it with vulnerable plugins, but otherwise this type of FUD helps nobody.

em-bee · 2026-06-05T21:48:20 1780696100

true, i get these attempts on my server daily. but here too you got less popular alternatives, so the same principle applies.