>Do you think grokking is consistent with implicit regularization as compression
Pretty sure it's been shown that grokking requires L1 regularization which pushes model parameters towards zero. This can be viewed as compression in the sense of encoding the distribution in the fewest bits possible, which happens to correspond to better generalization.
Couldn't have said it better, although this is only for grokking with the modular addition task on networks with suitable architectures. L1 regularization is absolutely a clear form of compression. The modular addition example is one of the best cases to see the phenomenon in action.
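For concreteness, here is a minimal sketch (PyTorch-style; the model, data, and `l1_lambda` value are made up for illustration) of what adding an L1 penalty to a training loss looks like, and how it pushes parameters toward zero:

```
import torch

# Toy model and optimizer; only the L1 term matters for the point being made.
model = torch.nn.Linear(128, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
l1_lambda = 1e-4  # strength of the sparsity-inducing penalty

def training_step(x, y):
    optimizer.zero_grad()
    task_loss = torch.nn.functional.cross_entropy(model(x), y)
    # L1 penalty: sum of absolute parameter values; its gradient pushes
    # every weight toward zero, favoring a sparser (more compressed) solution.
    l1_penalty = sum(p.abs().sum() for p in model.parameters())
    loss = task_loss + l1_lambda * l1_penalty
    loss.backward()
    optimizer.step()
    return loss.item()
```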
It's interesting how much some of you expect us to ignore gut feelings and statistics to avoid the appearance of bigotry. We should at the very least be able to acknowledge statistical reality then we can debate what is an appropriate response. Hell, I don't even need to know the backgrounds of the immigrants. We know that males engage in almost all the violent/forcible sexual assaults. We know that a lack of community engagement increases the chance for anti-social behavior. We know that access is a prerequisite for interpersonal crime. That itself is enough to warrant heightened concern.
The breathless fearmongering over an age field at account setup is just completely over the top. This is probably the least bad out of all possible ways to implement age checking. The benefit of this is that it can short-circuit support for more onerous age verification. The writing has been on the wall for some time now: the era of completely unrestricted internet is coming to an end. The question is how awful will the new normal be? Legislation like this is a win all around, a complete nothingburger. We should be celebrating it, not fighting it tooth and nail.
The tech crowd's utter derangement over this minor mandate is truly a sight to behold.
Let's try to be a little bit sensible here. Presumably the requirement to check depends on the nature of the application. A completely offline app for example has no use for an age check and thus wouldn't need to read it.
```
(b) (1) A developer shall request a signal with respect to a particular user from an operating system provider or a covered application store when the application is downloaded and launched.
```
That should be read as "when the application is (downloaded and launched)".
If it were meant as "when the application is downloaded and every time the application is launched" it would probably have been written as "when the application is downloaded or launched".
Also, there would be no point in mentioning downloads if that was a separate check because the app developer cannot request the signal upon download because their app is not running then.
The most reasonable conclusion is that the app must check the first time it is launched.
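If that reading is right, the developer-side logic is trivial: request the signal once, on first launch, and remember that it has been done. A rough sketch of that logic, where `request_age_signal()` is an entirely hypothetical stand-in for whatever API the OS provider or app store would actually expose:

```
import json, os

STATE_FILE = "app_state.json"

def request_age_signal():
    # Hypothetical placeholder for the OS/app-store signal API the bill mandates.
    return "adult"  # e.g. a binned age bracket

def on_launch():
    state = {}
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            state = json.load(f)
    if not state.get("age_signal_checked"):
        # First launch after download: request the signal exactly once.
        state["age_signal"] = request_age_signal()
        state["age_signal_checked"] = True
        with open(STATE_FILE, "w") as f:
            json.dump(state, f)
```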
This simply needs to be fought, because it's a measure meant to overcome society's reluctance, not to address the actual problem. Against the actual problem it's ineffective. That will come as a surprise once it's fully implemented, and new, worse measures will be proposed. Hence it needs to be cut off as early as possible to spare everyone the trouble.
This bill requires actual verification and leaves it up to the politically controlled FTC to determine how this should happen. It’s a disaster.
> The Parents Decide Act solves the self-reported-birthday problem by demanding something verifiable, which in practice means a government ID, a credit card, a biometric scan, or some combination.
> However, Gottheimer has not specified which. The bill does not either. It’s up to the FTC to decide.
The article's analysis doesn't appear to be accurate. From the bill:
```
(a) Requirements.—An operating system provider, with respect to any operating system of such provider, shall carry out the following:
(1) Require any user of the operating system to provide the date of birth of the user in order to—
(A) set up an account on the operating system; and
(B) use the operating system.
(2) If the relevant user of the operating system is under 18 years of age, require a parent or legal guardian of the user to verify the date of birth of the user.
(3) Develop a system to allow an app developer to access any information as is necessary, collected by the operating system to carry out this section and any regulation promulgated under this section, to verify the date of birth of a user of an app of the app developer.
```
The only requirement for "verification" is to enter a birthdate at account setup, and for underage accounts a parent "verifies" the birthdate. There is certainly some ambiguity in the bill, which is not good, but efforts should go toward resolving that ambiguity in favor of minimal intrusiveness. From the same bill:
```
(1) IN GENERAL.—Not later than 180 days after the date of the enactment of this Act, the Commission shall promulgate, under section 553 of title 5, United States Code, regulations to carry out this section, including regulations relating to the following:
(A) How an operating system provider can—
(i) verify the date of birth of a parent or legal guardian described in subsection (a)(2); and
(ii) carry out the requirements described in subsection (a) with respect to an operating system of such provider that may be shared by individuals of varying ages.
(B) Data protection standards related to how an operating system provider shall ensure a date of birth collected by the operating system provider from a user, or the parent or legal guardian of the user, to carry out this section and any regulation promulgated under this section—
(i) is collected in a secure manner to maintain the privacy of the user or the parent or legal guardian of the user; and
```
No, derangement is declaring "The writing has been on the wall for some time now: the era of completely unrestricted internet is coming to an end." without fighting it at all and just mindlessly accepting it because you were told it was going to happen.
It should be really easy to get your bank account information then. You're just going to give it to me, right? What is this? You're fighting me tooth and nail instead of celebrating giving me your banking info?
Well, perhaps your mental model of the actual objections to it is incomplete. There are a few problems, and I'm curious what you have to say about them.

First, "The benefit of this is that it can short-circuit support for more onerous age verification". Do you think that it "can" or that it "will"? Big difference. It could also go the other way, right? Opening the door to a more onerous version? Why do you think that isn't worth considering?

Secondly, "This is probably the least bad out of all possible ways to implement age checking". What about the parental controls that already exist? Someone seriously tried to tell me last time that parental controls "suck", but that's irrelevant; they don't have to suck, and in fact anything can suck. That's just happenstance. So, assuming parental controls were correctly implemented, why do you think this is "least bad" including parental controls?

Thirdly, this "age verification" doesn't actually verify anything, because underage people can just choose "adult" anyway. What do you have to say to that? In that case, parental controls actually give you more power, and make this new age check completely obsolete. Thoughts?

Lastly, maybe you're not from the USA, but we have a concept of "free speech" which includes the idea that people cannot be "compelled" to certain speech. If people were required to add a "sign here to confirm you're an adult" in every romance novel, that would be fine right? It's also a nothingburger, right? But then, you've compelled people to put something in every published book. Actually, that's a bad analogy. Better to say that ALL BOOKS require this signature field on the first page. After all, we don't know what kinds of expletives and horrible things people might have written in the margins of the book (assuming it's being sold second-hand). That would be okay with you, right? Nothingburger? But it compels people to write something, and that's a door most legal scholars know not to open.
> The writing has been on the wall for some time now: the era of completely unrestricted internet is coming to an end.
And books..? And the newspaper? What if a child reads about a horrible murder in the newspaper that keeps them up at night? What if the government outlaws books and newspapers because they can contain bad things? We'd better add an "adult / not adult" checkbox to the first page to "short-circuit support for more onerous age verification".
This was a great comment; you challenged them, but in a reasonable way and with really good questions.
I wish public discourse were more like this: if someone is arguing in good faith, actually answering what you asked moves the conversation forward; it's then on the other person to give you a serious answer.
>It could also go the other way, right? Opening the door to a more onerous version?
I don't see a plausible scenario where the implementation of this mandate makes further mandates easier to get passed. An age field and an API to access it is as trivial as it gets. More onerous age checking is not an extension of, nor somehow made easier by, the pre-existence of an age field. No argument against more onerous checking is undermined or rendered less severe due to an age field already existing. There is no slippery slope here.
>So, assuming parental controls were correctly implemented, why do you think this is "least bad" including parental controls?
There is already a pretty significant market for parental controls, so presumably if their quality were a limiting factor in their adoption the market would have responded already. Parents simply aren't interested enough or savvy enough to apply them. Parental controls also just intrinsically suck for a lot of reasons. They are either mostly ineffective or wildly intrusive, like giving total access to children's communications and internet activity to external companies.
>Thirdly, this "age verification" doesn't actually verify anything, because underage people can just choose "adult" anyway. What do you have to say to that?
Presumably an adult is involved in purchasing devices and setting up accounts for their young children. Putting an account-holder age field into the account setup workflow seems pretty effective. It's not 100%, but it doesn't need to be for it to be a major improvement over the status quo. The lack of verification is a feature of this mandate, not a bug.
>we have a concept of "free speech" which includes the idea that people cannot be "compelled" to certain speech. If people were required to add a "sign here to confirm you're an adult" in every romance novel, that would be fine right?
As those pushing this kind of legislation are fond of pointing out, we have age checks for buying alcohol or purchasing adult magazines in shops. Presumably these don't run afoul of the first amendment. This idea that we can't or shouldn't mandate age checking in some form to access content deemed inappropriate to children is just a losing argument. Again, the writing is on the wall here.
>No argument against more onerous checking is undermined or rendered less severe due to an age field already existing
From your point of view.
What I can tell you is that there are definitely people who will argue that this is, by the fact of being written into law, now the spirit of the law.
Then these people will argue that the spirit of the law is being broken, and the implementation needs to be better and tighter. Not that it needs to be repealed! Because clearly this is something that was wanted. And to many, many people, this will be sufficient argument not to complain about further measures.
At this point anything that makes computers less usable is a good thing, time we go back to the real world. It was extremely unpleasant while it lasted.
>We know that they do not reason because we know the algorithm behind the curtain.
In other words, we didn't put the "reasoning algorithm" into LLMs, therefore they do not reason. But what is this reasoning algorithm that is a necessary condition for reasoning, and how do you know an LLM's parameters didn't converge on it in the process of pre-training?
Model parameters are weights, not algorithms. The LLM algorithm is (relatively) fixed: generate the next token according to the existing context, the model weights, and some randomization. That’s it. There is no more algorithm than that. The trained parameters can shift the probabilities for predicting a token given the context, but there’s no more to it than that. There is no “reasoning algorithm” in the weights to converge to.
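To make that concrete, here is a minimal sketch of the loop being described, assuming a pretrained `model` that maps a batch of token ids to next-token logits of shape (batch, sequence, vocabulary):

```
import torch

def generate(model, context, n_tokens, temperature=1.0):
    # context: list of token ids; model returns logits over the vocabulary.
    tokens = list(context)
    for _ in range(n_tokens):
        logits = model(torch.tensor(tokens).unsqueeze(0))[0, -1]
        probs = torch.softmax(logits / temperature, dim=-1)
        # The "randomization": sample the next token from the predicted distribution.
        next_token = torch.multinomial(probs, num_samples=1).item()
        tokens.append(next_token)
    return tokens
```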
This overly reductive description of LLMs misses the forest for the trees. LLMs are circuit builders, the converged parameters pick out specific paths through the network that define programs. In other words, LLMs are differentiable computers[1]. Analogous to how a CPU is configured by the program state to execute arbitrary programs, the parameters of a converged LLM configure the high level matmul sequences towards a wide range of information dynamics.
Statistics has little relevance to LLM operation. The statistics of the training corpus impart constraints on the converged circuit dynamics, but otherwise have no representation internal to the LLM.
I see nothing to preclude a foundation model being augmented by a smaller model that serializes particulars about an individual's cumulative interaction with the model and then streamlines it into the execution thread of the foundation model.
>this LLM behavior can be analogized in terms of some human behavior, thus it follows that LLMs are human-like
No, the argument is "this behavior is similar enough to human behavior that using it as evidence against <claim regarding LLM capability that humans have> is specious"
>"Confabulation" in LLMs and "confabulation" in humans have basically nothing in common
I don't know why you think this. They seem to have a lot in common. I call it sensible nonsense. Humans are prone to this when self-reflective neural circuits break down; LLMs are characterized by a lack of self-reflective information. When critical input is missing, the algorithm will craft a narrative around the available but insufficient information, resulting in sensible nonsense (e.g. neural disorders such as somatoparaphrenia).
> No, the argument is "this behavior is similar enough to human behavior that using it as evidence against <claim regarding LLM capability that humans have> is specious"
I'm not really following. LLM capabilities are self-evident, comparing them to a human doesn't add any useful information in that context.
> LLMs are characterized by a lack of self-reflective information. When critical input is missing, the algorithm will craft a narrative around the available but insufficient information, resulting in sensible nonsense (e.g. neural disorders such as somatoparaphrenia)
You're just drawing lines between superficial descriptions from disparate concepts that have a metaphorical overlap. It's also wrong. LLMs do not "craft a narrative around available information when critical input is missing", LLM confabulations are statistical, not a consequence of missing information or damage.
This is undermined by all the disagreement about what LLMs can do and/or how to characterize it.
>LLM confabulations are statistical, not a consequence of missing information or damage.
LLMs aren't statistical in any substantive sense. LLMs are a general purpose computing paradigm. They are circuit builders, the converged parameters define pathways through the architecture that pick out specific programs. Or as Karpathy puts it, LLMs are a differentiable computer[1]. So yes, narrative crafting in terms of leveraging available putative facts into a narrative is an apt characterization of what LLMs do.
>Structurally a transformer model is so unrelated to the shape of the brain there's no reason to think they'd have many similarities.
Substrate dissimilarities will mask computational similarities. Attention surfaces affinities between nearby tokens; dendrites strengthen and weaken connections to surrounding neurons according to correlations in firing rates. Not all that dissimilar.
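For reference, the affinity part of attention is small enough to write out. A minimal numpy sketch, where each token's query is scored against every token's key and the softmaxed scores determine how strongly tokens influence one another:

```
import numpy as np

def attention_affinities(Q, K):
    # Q, K: (seq_len, d) query and key vectors, one row per token.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # pairwise token-to-token affinities
    # Softmax over keys (numerically stabilized) gives the attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return weights / weights.sum(axis=-1, keepdims=True)
```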
Linear regression has well characterized mathematical properties. But we don't know the computational limits of stacked transformers. And so declaring what LLMs can't do is wildly premature.
> And so declaring what LLMs can't do is wildly premature.
The opposite is true as well. Emergent complexity isn’t limitless. Just like early physicists tried to explain the emergent complexity of the universe through experimentation and theory, so should we try to explain the emergent complexity of LLMs through experimentation and theory.
If you say "not pseudoscience" and then make up pseudoscience anyway, then what's the point? The field has not advanced anywhere near enough in understanding for convoluted explanations about how LLMs can never do x to be anything but pseudoscience.
Sure, that's true as well. But I don't see this as a substantive response given that the only people making unsupported claims in this thread are those trying to deflate LLM capabilities.
- OP asked for someone to make a logical argument for the separation of “training” from “model”
- I made the argument
- You cherry picked an argument against my specific example and made an appeal to emergent complexity
- I pointed out that emergent complexity isn’t limitless
- “the only people making unsupported claims in this thread are those trying to deflate LLM capabilities”
You made a pretty nonsensical argument, which pretty much seems to be the bog standard for these arguments.
What does linear regression have to do with the limitations of a stacked transformer? Absolutely nothing. This is the problem here. You don't know shit and just make up whatever. You can see people doing the same thing in GPT-1, 2, 3, 4 threads all telling us why LLMs will never be able to do things they manage to do later.
lol. Why so emotionally charged? Are you perhaps worried that you’ve invested too much time and effort into a technology that may not deliver what influencers have been promising for years? Like a proverbial bagholder?
> What does linear regression have to do with the limitations of a stacked transformer? Absolutely nothing. This is the problem here.
We’re talking about fundamental concepts of modeling in this subthread. LLMs, despite what influencers may tell you, are simply models. I’ll even throw you a bone and admit they are models for intelligence. But they are still models, and therefore all of the things that we have learned about “models” since Plato are still relevant. Most importantly, since Plato we’ve known that “models” have fundamental limits vs. what they try to represent, otherwise they would be a facsimile, not a model.
> You can see people doing the same thing in GPT-1, 2, 3, 4 threads all telling us why LLMs will never be able to do things they manage to do later.
I hope you enjoy winning these imaginary arguments against these imaginary comments. The fundamental limitations of LLMs discussed since GPT-1 have never been addressed by changing the architecture of the underlying model. All of the improvements we’ve experienced have been due to (1) improvements in training regime and (2) harnesses / heuristics (e.g. Agents).
Now, care to provide a counterargument that shows you know a little more than “shit”?
>We’re talking about fundamental concepts of modeling in this subthread. LLMs, despite what influencers may tell you, are simply models. I’ll even throw you a bone and admit they are models for intelligence. But they are still models, and therefore all of the things that we have learned about “models” since Plato are still relevant. Most importantly, since Plato we’ve known that “models” have fundamental limits vs. what they try to represent, otherwise they would be a facsimile, not a model.
Okay, but the brain is also “just a model” of the world in any meaningful sense, so that framing does not really get you anywhere. Calling something a model does not, by itself, establish a useful limit on what it can or cannot do. Invoking Plato here just sounds like pseudo-profundity rather than an actual argument.
>I hope you enjoy winning these imaginary arguments against these imaginary comments. The fundamental limitations of LLMs discussed since GPT-1 have never been addressed by changing the architecture of the underlying model. All of the improvements we’ve experienced have been due to (1) improvements in training regime and (2) harnesses / heuristics (e.g. Agents).
If a capability appears once training improves, scale increases, or better inference-time scaffolding is added, then it was not demonstrated to be a 'fundamental impossibility'.
That is the core issue with your argument: you keep presenting provisional limits as permanent ones, and then dressing that up as theory. A lot of people have done that before, and they have repeatedly been wrong.
To be clear, you are confusing me with other commenters in this thread. All I want is for those that liken LLMs to stochastic parrots and other deflationary claims to offer an argument that engages with the actual structure of LLMs and what we know about them. No one seems to be up to that challenge. But then I can't help but wonder where people's confident claims come from. I'm just tired of the half-baked claims and generic handwavy allusions that do nothing but short-circuit the potential for genuine insight.
Why would you want every site on the internet to traffic in government IDs? This is by far the least bad out of all possible ways to implement age checking. The benefit of this is that it can short-circuit support for more onerous age verification. The writing has been on the wall for some time now: the era of completely unrestricted internet is coming to an end. The question is how awful will the new normal be? This implementation is a win all around, a complete nothingburger. We should be celebrating it, not fighting it tooth and nail.
The tech crowd's utter derangement over this minor mandate is truly a sight to behold.
> This is by far the least bad out of all possible ways to implement age checking.
Not quite. The least bad (that I'm aware of) is to mandate RTA headers (or an equivalent, more comprehensive self-categorization system) and to also mandate that major platforms (presumably OSes and browsers, based on MAU or some such) implement support for filtering on those headers.
But sending a binned age as per the California law is the next best thing to that.
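For reference, RTA labelling amounts to exposing a fixed string in a response header or meta tag, and a filter on the OS or browser side only has to look for it. A rough sketch (the label string is the published RTA value; the header name and exact mechanics here are illustrative):

```
import urllib.request

RTA_LABEL = "RTA-5042-1996-1400-1577-RTA"  # published RTA label string

def is_rta_labelled(url):
    # A client-side filter only needs to spot the label in the response
    # headers or in a meta tag near the top of the returned HTML.
    with urllib.request.urlopen(url) as resp:
        if RTA_LABEL in resp.headers.get("Rating", ""):
            return True
        head = resp.read(65536).decode("utf-8", errors="ignore")
        return RTA_LABEL in head
```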