brey · 2026-04-23T07:41:41 1776930101

The next sentence after your quoted section:

“Even then, AI output is never treated as an authoritative source. Everything must be verified.”

applfanboysbgon · 2026-04-23T07:46:04 1776930364

Any verification process thorough enough to catch all LLM fabrications would take more work than simply not using the LLM in the first place. If anything, verifying what an LLM wrote is substantially more difficult than just reading the material it's "summarising", because you need to fully read and comprehend the material and then also keep in mind what the LLM generated to contrast against it. At that point, what the fuck are you even doing?

I believe this policy can never result in a positive outcome. The policy implicitly suggests that verification means taking shortcuts and letting fabrications slip through in the name of "efficiency", with the follow-up sentence existing solely so that Ars won't take accountability for enabling such a policy but instead place the blame entirely on the reporters it told to take shortcuts.

klausa · 2026-04-23T09:17:41 1776935861

The LLM can find material that would be hard or time-consuming for you to find yourself.

You still need to verify it, but "find the right things to read in the first place" is often a time intensive process in itself.

(You might, at that point, argue "what if the LLM fails to find a key article/paper/whatever", which I think is both a reasonable worry and an unreasonable standard to apply. "What if your Google search doesn't return it" is an obvious counterpoint, and I don't think you can make a reasonable argument that journalists should be forced to cross-compare SERPs from Google/Bing/DuckDuckGo/AltaVista or whatever.)

madamelic · 2026-04-23T12:54:51 1776948891

I believe what their point is is that if you give people a "extract-needle-from-haystack" machine and then tell them they have to manually find where in the haystack the needle was, it defeats the purpose of having the machine.

With that said, a good RAG solution would come with metadata to point to where it was sourced from.
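To make the provenance idea concrete, here is a minimal sketch of a retrieval step that returns each matching passage together with where it came from, so a reader can jump straight to the source and check it. Everything in it is hypothetical: the `Passage` type, the `index_documents` and `retrieve` functions, the sample documents, and the naive keyword-overlap scoring, which only stands in for whatever search a real RAG pipeline would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    doc_id: str   # hypothetical identifier of the source document
    line_no: int  # 1-based line number within that document
    text: str


def index_documents(docs: dict[str, str]) -> list[Passage]:
    """Split each document into lines, remembering where each line came from."""
    passages = []
    for doc_id, body in docs.items():
        for line_no, line in enumerate(body.splitlines(), start=1):
            if line.strip():
                passages.append(Passage(doc_id, line_no, line.strip()))
    return passages


def retrieve(passages: list[Passage], query: str, top_k: int = 3) -> list[Passage]:
    """Rank passages by naive keyword overlap (a stand-in for real search)."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(p.text.lower().split())), p) for p in passages]
    hits = [(score, p) for score, p in scored if score > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in hits[:top_k]]


# Made-up sample data, purely for illustration.
docs = {
    "memo_a.txt": "Quarterly budget review\nThe spill was reported on Tuesday",
    "memo_b.txt": "Lunch menu\nNo incidents this week",
}
for hit in retrieve(index_documents(docs), "when was the spill reported"):
    # The document id and line number are what make the hit checkable.
    print(f"{hit.doc_id}:{hit.line_no}: {hit.text}")
```

The point of carrying `doc_id` and `line_no` through the pipeline is that verification becomes a lookup rather than a second search.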

palmotea · 2026-04-23T14:14:20 1776953660

> I believe what their point is is that if you give people a "extract-needle-from-haystack" machine and then tell them they have to manually find where in the haystack the needle was, it defeats the purpose of having the machine.

We've got to be careful to not let the perfect be the enemy of the good.

I'm not an LLM enthusiast, but I think you have to actually compare it against what the alternative would really be. If you give the journalist a haystack but insufficient time to manually search it properly, they're going to have to take some shortcut. And using an LLM to sort through it and verifying it actually found a needle is probably better than sampling documents at random or searching for keywords.

klausa · 2026-04-23T14:38:10 1776955090

I don't want to come off as an AI-maximalist or whatever, but, I mean, at some point, skill issue, right?

You can use Google to find you results reinforcing your belief that the earth is flat too; but we don't condemn Google as a helpful tool during research.

If you trust whatever the LLM spits out unconditionally, that's sorta on you. But they _can_ be helpful when treated as research assistants, not as oracles.

Borealid · 2026-04-23T16:10:40 1776960640

This is a bogus analogy leading to a bogus conclusion.

If something points to the needle in the haystack (saying "this haystack has a needle positioned eighteen centimeters from the top and three left of center"), it's much easier to verify that indeed there is a needle there than it would be to find that needle in the first place.

If an LLM spits out a claim that something happened (citing a certain article), it's less work to read the article and verify the claim than it would be to DISCOVER the article in the first place.

In other words, LLMs can be a time-saving search engine, and the idea that it's just as much work to find+verify information as it is to have the LLM find it and then you verify it is hokum.

madeofpalk · 2026-04-23T14:42:21 1776955341

when you use the extract-needle-from-haystack machine, verify that it actually extracted a needle.

that's much easier than manually extracting the needle yourself

ihuman · 2026-04-23T15:07:04 1776956824

Another interpretation is if you have multiple haystacks, and the machine tells you which haystack likely has a needle in it. You still need to extract the needle yourself.

JumpCrisscross · 2026-04-23T07:59:51 1776931191

> Any verification process thorough enough to catch all LLM fabrications would take more work than simply not using the LLM in the first place

Sometimes you have a weak hunch that may take hours to validate. Putting an LLM to work on the preliminary investigation can be fruitful. Particularly if, as is often the case, you don't have a weak hunch, but a small basket of them.

Mordisquitos · 2026-04-23T09:40:55 1776937255

You can prompt LLMs to scan thousands of documents to generate text validating your hunches. In some cases those validated hunches may even be correct.

Eisenstein · 2026-04-23T13:01:29 1776949289

It's easy to get an LLM to make any argument you like based on whatever data is available. Those arguments are going to be trivially bad if that data is bad.

Jtarii · 2026-04-23T12:20:00 1776946800

It's more using LLMs like a metal detector, rather than digging through the entire beach by yourself.

You still need to check the junk you dig up using the metal detector.

Angostura · 2026-04-23T08:33:21 1776933201

Disagree. If I’m a reporter and I’m trawling through a mass data dump - say the Epstein files or WikiLeaks or statistics on environmental spills or something - using AI to pull out potential patterns in the data, or find specific references, can be useful. Obviously you then go and check the particular citations. This will still save a lot of time.

Paracompact · 2026-04-23T07:57:57 1776931077

> I believe this policy can never result in a positive outcome.

I get where you're coming from (I'm learning more and more over time that every sentence or line of code I "trust" an AI with, will eventually come back to bite me), but this is too absolutist. Really, no positive result, ever, in any context? We need more nuanced understanding of this technology than "always good" or "always bad."

bandrami · 2026-04-23T09:53:25 1776938005

If you need accuracy, an LLM is not the tool for that use case. LLMs are for when you need plausibility. There are real use cases for that, but journalism is not one of them.

applfanboysbgon · 2026-04-23T08:04:38 1776931478

I didn't say in any context. I'm specifically talking about this policy on journalistic research.

brey · on May 26, 2022

I work at Featurespace - we stop fraud, scams, money laundering and other financial crimes.

We build software that runs real-time machine learning models at scale, deep in the financial 'rails' that move money around, and we license it to banks and other financial institutions.

A surprisingly large number of my colleagues are here because of similar reasons - we like making a positive societal contribution.

We're hiring!

brey · on March 10, 2022

'permissible' isn't quite the point ... if it makes the jury think you're lying, maybe it isn't the best strategy.

Mordisquitos · on March 10, 2022

1. The jury should not base their decision on their belief whether either side is lying.

2. The defence is expected to lie, and if the prosecution cannot prove that every single one of the defence's arguments are lies, then the jury cannot convict beyond reasonable doubt.

3. The jury should assume that the prosecution is lying by default, and acquit if the prosecution does not convince them otherwise.

gpm · on March 10, 2022

> 2. The defence is expected to lie

The defence is expressly prohibited from lying. Lawyers have a so called duty of candor [1] outlining this. Defendants testify under oath to make this clear to them. Defence attorneys must disclose to the court if their client lies to the court (and they can't convince their client to voluntarily disclose it instead) [2].

That doesn't mean people are expected to take defendants at their word during trial, juries are allowed to decide they think that someone was lying, but they aren't expected to lie.

[1] https://definitions.uslegal.com/d/duty-of-candor/

[2] https://www.eiglarshlaw.com/when-clients-liewhat-must-you-do...

duxup · on March 10, 2022

The jury can decide that any given testimony is a lie and weigh it accordingly.

dragonwriter · on March 10, 2022

Defense theories are not testimony and, ideally, should not be considered in the evaluation of testimony.

(In practice, humans don't consistently compartmentalize well enough to reliably avoid this, though.)

duxup · on March 10, 2022

"defense theories" are going to have evidence associated with them. At that point my point applies.

prvc · on March 10, 2022

Certainly, but is there some other penalty for what might be construed as contemptuous and disreputable behavior?

thehappypm · on March 10, 2022

It might be perjury

brey · on Aug 12, 2021

The laws of physics permit (physically allow) me to punch someone in the face. The laws of my country say if I do that, there will be consequences.

This is all about the level of abstraction

The ‘laws of physics’ in a game may allow you to do unintended things, it’s a complex system. Doesn’t mean that it’s OK at a higher abstraction.

brey · on Feb 21, 2021

That does seem the better outcome - there’s no downside in being lenient here.

> I did a quick scan of the Alexa Top 1 Million list. Currently around 0,06 % are affected

Only affects a small minority of mailservers, and even then only 0.06% of domains.

brey · on Feb 5, 2021

I have, it's worth getting, it definitely scratches the same 'must ... optimise ... better ...' itches.

It's early access but very playable and feature complete, though it does feel less polished than Factorio (obviously can't compare to an 8 year development effort!)

Satisfactory is another great play for fans of the genre.

Both games are 3D, and both use the additional dimension well. You can play them fine just like Factorio (I did the first runthrough) and treat 3D as eye candy, but you'll do better if you 'cut with the grain' and learn how to build 3D factories.

brey · on Feb 1, 2021

> in an enclosure of chrome steel uprights, mounted on a wooden base, with a handle at each end

That sounds so much better than “built on an upturned coffee table” ;-)

But I jest - it’s very cool. I love the physicality - makes it feel much more real than 3D graphs represented on a screen.

brey · on June 7, 2020

> what is the best calculation to make when trading off code quality vs features?

> do most YC startups write tests and try to write cleanish code in V1 or does none of this matter?

It only matters when bad code hurts your overall business velocity - what that means, only you can answer.

Nobody's writing tests for their purist aesthetics, they're there to let you go faster - but there's an up-front cost you have to pay for them. Sometimes that's worth paying, sometimes the land grab is more important.

There's no single answer to this question.

_ix · on June 7, 2020

Tend to agree. Leadership needs to send strong, clear signals about quality and acknowledge existence of potential technical debt well before the team starts feeling crushed by it.

brey · on Nov 28, 2019

a few principles I've found helpful:

There are no coincidences, unless proven otherwise.

If something smells wrong, it probably is. Trust your gut.

Make sure you're building the right thing, before you build the thing right.

Don't be clever. Elegant one-liners that make you feel like a genius when writing it are probably not very maintainable.

The second best piece of code is the one you just deleted. The best one is the one you didn't write in the first place.

Plan to fail, and gracefully degrade.
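As one illustration of the last principle, here is a minimal sketch of graceful degradation: when a non-essential dependency fails, the caller logs the failure and falls back to a safe default instead of failing the whole request. The names `fetch_recommendations` and `recommendations_or_default`, the `"top-sellers"` default, and the simulated outage are all assumptions made up for this example.

```python
import logging

logger = logging.getLogger(__name__)


def fetch_recommendations(user_id: str) -> list[str]:
    """Hypothetical flaky dependency; it always fails here to simulate an outage."""
    raise TimeoutError("recommendation service did not respond")


def recommendations_or_default(user_id: str, fetch=fetch_recommendations) -> list[str]:
    """Return personalised results when possible, a safe default otherwise."""
    try:
        return fetch(user_id)
    except (TimeoutError, ConnectionError) as exc:
        # Degrade instead of failing the whole page: log it and fall back.
        logger.warning("falling back to defaults for %s: %s", user_id, exc)
        return ["top-sellers"]


print(recommendations_or_default("user-42"))
```

Only the failure modes that were planned for are caught; anything unexpected still propagates, so real bugs are not silently hidden behind the fallback.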

brey · on Oct 27, 2019

I've been through deals with large financial institutions that have taken 2-3 years from an initial incredibly positive demo and PoC and strong buying signals, through to a signed contract.

IMO going through that successfully is also sending a signal that you have sufficient endurance - there's a large risk for a corporate to sign up for a long contract with a startup, they don't know how long you're going to be in business for. If the startup doesn't have the appetite to spend 2-3 years to get a contract over the line, then they aren't going to be a stable partner for the long term.

You need to also have sufficient numbers of small/medium scale deals so you're not 100% relying on elephant hunting.