StarlaAtNight's comments | Hacker News

I’m not very well read on the topic and you seem to take a strong “con” stance. Curious to hear why you think it deserves such a demise.

People think that just because they have a way to prove that an image is AI-generated, their worries about misinformation are solved. Better to acknowledge that wherever you look, people will be trying to deceive you, even if their content won't have as obvious an indicator as SynthID.

Not GP, but I’m pretty “con” too.

Because it’s meaningless for what it’s being marketed for. It’s conceptually inverted. It’s a detector that will detect 100% of the stuff that doesn’t mind being detected, and only the dumbest fraction of stuff that doesn’t want to be detected.

No fault of the extremely smart and capable people who built it. It’s the underlying notion that an imperceptible watermark could survive contact with mass distribution… it gives the futile cat-and-mouse vibes of the DRM era.

Good guys register their guns or whatever, bad guys file off the serial numbers or make their own. Sometimes poorly, but still.

All of which would be fine as one imperfect layer of trust among many (good on Google for doing what they can today). The frustrating/dangerous part is that it seems to be holding itself out as reliable to laypeople (including regulators). Which is how we end up responding to real problems with stupid policy.

People really want to trust “detectors,” even when they know they’re flawed. Already credulous journalists report stuff like “according to LLMDetector.biz, 80% of the student essays were AI-generated.” Jerry Springer built an empire on lie detector tests. British defense contractor ATSC sold literal dowsing rods as “bomb detectors,” and got away with it for a while [2].

It’s backward to “assume it’s not AI-origin unless the detector detects a serial number, since we made the serial number hard to remove.” Instead, if we’re going to “detector” anything, normalize detecting provenance/attestation [e.g. 0]: “maybe it’s an original @alwa work, but she always signs her work, and I don’t see her signature on this one.”

Something without a provable source should be taken with a grain of salt. Make it easy for anyone to sign their work, and get audiences used to looking for that signature as their signal. Then they can decide how much they trust the author.

Do it through an open standards process that preserves room for anyone to play, and you don’t depend on Big Goog’s secret sauce as the arbiter of authenticity.

I hear that sort of thinking is pretty far along, with buy-in from pretty major names in media/photography/etc. The C2PA and CAI are places to look if you’re interested [1].

…and that is why I am “con.”

[0] https://contentcredentials.org/

[1] https://c2pa.org/ , https://contentauthenticity.org/

[2] https://en.wikipedia.org/wiki/ADE_651


Consultants or professional-services folks will be working in their own company’s GitHub account and in several clients’ accounts. That requires managing lots of git/GitHub identities.
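On the git side at least, some of that juggling can be automated with conditional includes, so each client directory picks up its own commit identity. A minimal sketch, assuming a made-up `~/clients/acme` layout and a hypothetical identity file:

```shell
# Sandbox HOME so this demo doesn't touch a real ~/.gitconfig
export HOME="$(mktemp -d)"

# An identity file used only for one client's work (contents are made up)
cat > "$HOME/.gitconfig-acme" <<'EOF'
[user]
    name = Jane Doe (Acme)
    email = jane@acme.example
EOF

# Conditionally include it for any repo under ~/clients/acme/
git config --global includeIf."gitdir:~/clients/acme/".path "~/.gitconfig-acme"

# New repos under that directory pick up the client identity automatically
mkdir -p "$HOME/clients/acme/project"
cd "$HOME/clients/acme/project"
git init -q
git config user.email   # prints jane@acme.example
```

GitHub-side authentication (SSH keys, tokens, SSO elevation) is a separate problem, but this at least keeps commit identities from crossing the streams.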


Simplifying for brevity* -- there are three levels in the GitHub entity model:

  - accounts (personal)
  - orgs (companies, directories, teams, roles etc.)
  - enterprises (sets of orgs)
Even with enterprise SSO, the initial connection to GitHub is typically "you" (just as you show the same driver's license at the front desk whether you're registering to visit a secured firm or a random hotel). You then elevate "you" into the org through SSO, and the policies that apply to you via your org can be governed at the enterprise level.

The idea behind this model is that no, you don't have to manage lots of accounts: you're just you, and each org you aim to use grants an elevation that the entity controls instead of you controlling it.

This ultimately results in far less key material floating around, and losing, leaking, or lousing up your own GitHub credential doesn't automatically give an attacker the SSO elevation.

• • •

Not incidentally, I have a slew of "accounts" given to me by companies that don't bother to make an org; they just invite individuals to repos or make individual accounts for their repo. I suppose it's cheaper in the short run. In the long run, roughly 90% of these accounts are still left active years, and (no kidding) sometimes a decade-plus, later. Better to just not do this: if you're a company, be an org.

---

* Expanded for more depth: https://docs.github.com/en/get-started/learning-about-github...


We should be able to pin to a version of training data history like we can pin to software package versions. Release new updates w/ SemVer and let the people decide if it’s worth upgrading to

I’m sure it will get there as this space matures, but it feels like model updates are very force-fed to users


If you talk to people who deal with inference using large, fungible datasets, this is an extremely difficult governance problem. SemVer is incredibly insufficient: there's no well-defined notion of what "upgrade" even means, let alone "major," "minor," and "patch."

It's a major disservice to the problem to act like it's solved, or even solvable, using code-revision language.


I think the models are so big that they can’t keep many old versions around because they would take away from the available GPUs they use to serve the latest models, and thereby reduce overall throughput. So they phase out older models over time. However, the major providers usually provide a time snapshot for each model, and keep the latest 2-3 available.


If you're an API customer, you can pin to a specific dated snapshot of the model.

See the "Snapshots" section on these pages for GPT-4o and 4.1, for example:

https://platform.openai.com/docs/models/gpt-4o https://platform.openai.com/docs/models/gpt-4.1

This is done so that application developers whose systems depend upon specific model snapshots don't have to worry about unexpected changes in behaviour.

You can access these snapshots through OpenRouter too, I believe.
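In practice, pinning just means recording the dated snapshot name in your own config rather than passing a floating alias. A minimal sketch (the snapshot strings below are published OpenAI names at the time of writing; check the models pages for current ones):

```python
# Keep model pins in one place rather than hard-coding aliases in calls.
# Snapshot names are examples from OpenAI's docs; verify current ones.
PINNED_MODELS = {
    "gpt-4o": "gpt-4o-2024-08-06",
    "gpt-4.1": "gpt-4.1-2025-04-14",
}

def resolve_model(alias: str) -> str:
    """Return the dated snapshot for an alias, failing loudly if unpinned."""
    try:
        return PINNED_MODELS[alias]
    except KeyError:
        raise ValueError(f"no pinned snapshot for {alias!r}") from None

# The resolved name then goes into the API call's `model` parameter, so an
# upstream re-pointing of the alias never changes your app's behavior.
print(resolve_model("gpt-4o"))  # prints gpt-4o-2024-08-06
```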


Every model update would be a breaking change; an honest application of SemVer has no place in AI model versioning.

I'm not saying a major.minor scheme keyed to architecture is a bad thing, but it wouldn't be SemVer, and that doesn't even cover all the different fine-tunes and flavors derived from those models, which generally have no natural ordering.


There's figurative and literal, though. Figurative SemVer (distinguishing a system-prompt update from a model retrain) would actually work OK... at least as build numbers.

I think you could actually pretty cleanly map semver onto more structured prompt systems ala modern agent harnesses.


That's not enough: the tool definitions change, the agent harness changes; you need to pin a lot of stuff.
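One hedged sketch of what "pinning a lot of stuff" could look like: a lockfile that records the model snapshot plus content hashes of the prompt and tool definitions, so any drift in the stack is at least detectable. All names and structures here are invented for illustration:

```python
# Hypothetical "lockfile" for an agent configuration: pin the model
# snapshot and hash the prompt and tool schemas so changes are visible.
import hashlib
import json

def lock_agent_config(model_snapshot: str, system_prompt: str, tools: list) -> dict:
    def digest(obj) -> str:
        # Canonical JSON so the same content always hashes the same way
        return hashlib.sha256(
            json.dumps(obj, sort_keys=True).encode()
        ).hexdigest()[:12]

    return {
        "model": model_snapshot,          # dated snapshot, not a floating alias
        "prompt_sha": digest(system_prompt),
        "tools_sha": digest(tools),
    }

lock = lock_agent_config(
    "gpt-4o-2024-08-06",
    "You are a helpful assistant.",
    [{"name": "search", "parameters": {"q": "string"}}],
)
# Comparing `lock` against a stored copy flags any change to the model,
# prompt, or tool schemas before it silently reaches production.
```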


Wonder if the YAML fixtures drew inspiration from dbt’s unit tests: https://docs.getdbt.com/docs/build/unit-tests#unit-testing-a...


If you build it, they will come


This headline sounds like a euphemism for something or one of those folksy sounding bits of wisdom


Same here, I thought it was going to be an analogy for a political story.

Now that I know it is literal leeches and that the options are scraping them off or waiting for them to finish, avoiding areas with leeches feels like the move.


One MILLION dollars *puts pinky to corner of mouth*


Nice try, AI!


Just curious, what made you go down that rabbit hole?


When I was about 10 I picked my first ever CD at a music shop, and it was a recording of the Gershwin piano rolls, because the cover photo caught my eye [1]. I didn't really understand what I was listening to, I assumed "piano roll" was a musical genre, like "rock'n'roll", until years later when my English became good enough to read the CD's booklet.

It was also a time when all these MIDI files started becoming available, like the 6,000 rolls from Terry Smythe [2], and I figured transcribing them could be a good way to learn old-school jazz, which is otherwise difficult to find as sheet music.

[1] https://www.youtube.com/watch?v=BX9MCyO6smk

[2] https://archive.org/details/terrysmythe.ca-archive/mp3s/Ampi...


Does a piano roll sound different (I assume it does)? Ie, is or was there a specific market for a CD of a piano roll specifically, not, of someone playing the piano?


In terms of the music being played, piano rolls can be different from "normal piano music" because it's not played live by a real human, so it can have complex parts with full chords, additional voices, all with perfect rhythm and no wrong notes. This can be very compelling when well executed on the right songs (and it can also sound "mechanical" on others).

There isn't a huge market for piano roll recordings, and these recordings are rare. It's a niche topic that can attract:

- Older people who knew the era of piano rolls firsthand (say, up to the 1950s)

- People nostalgic for old times in general (particularly the 1910s-1940s, the age of early jazz with stride piano and early Broadway)

- Music scholars, because some of these rolls are of historical/musical importance, in particular those "recorded" by George Gershwin or Fats Waller and other big names. A lot of material exists only as piano rolls.

For the example of the Gershwin CD I posted above: it was produced by musicologist Artis Wodehouse [1] in partnership with the Yamaha Disklavier pianos, IIRC [2], so my guess is this was a passion project above all, with a bit of Yamaha marketing.

[1] https://www.artiswodehouse.com/biography/

[2] https://usa.yamaha.com/products/musical_instruments/pianos/d...



I thought of purple drank (https://en.wikipedia.org/wiki/Lean_(drug)). It always seemed odd that they would name a proof assistant language after cough syrup.

