The problem jj is trying to solve is not entirely clear to me, but I guess there are enough people who can't find their way with Git that switching to jj probably looks appealing to them. At least that's my first impression, without going too deep into the documentation.
I wouldn't say it's that people are not able to find their way with Git. I was a competent Git user and would carefully rebase and squash my commits. It's just easier and nicer with Jujutsu.
The way all changes (except those in the ignore file) are automatically incorporated into the current commit means I don't have to faff about with stash when I need to pivot and then try to remember which commit to pop against. I can just hop around the tree with gay abandon. That alone is invaluable.
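A hedged sketch of what that pivot looks like in practice (assuming jj is installed; the scratch repo, file, and messages here are made up, and `jj git init` syntax can vary between jj versions):

```shell
# Guarded so the sketch degrades gracefully when jj isn't available.
if command -v jj >/dev/null 2>&1; then
  tmp=$(mktemp -d)
  (
    cd "$tmp" &&
    jj git init demo >/dev/null 2>&1 &&
    cd demo &&
    echo "wip" > feature.txt &&            # edits are snapshotted automatically
    jj describe -m "half-finished feature" &&
    jj new -m "urgent fix"                 # pivot: no stash, just a new change
    # later, hop back with: jj edit <change id of the feature>
  )
else
  echo "jj not installed; commands are illustrative"
fi
pivot_demo=1
```

Nothing gets staged or stashed here; the working copy itself is a commit, so switching away and back is just moving between revisions.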
Then add in the fact that a change to history gets rippled down to the descendant commits. And the fact that conflicts are recorded in the history and can be dealt with at your leisure. Or the fact that `jj undo` is a thing.
There must be some kind of split in how people work or something. I’ve never had the desire to jump around the git tree. I never squash commits. I basically never stash changes. All the things that people say jj makes easier are things I never even want to do. Not because they’re not easy with git, but because it sounds hard to keep straight in my head.
Maybe. Different organisations work at different paces and with different contention rates. If you're on a small team and are being tugged about less, then you might not find value in this stuff.
But I frequently have cases where I'm making changes to repo `acme`. I'll put a PR up for review and then start on a second PR for the same repo, stacking those commits on top of my previous PR. If I then notice a problem in the earlier work, I can easily fix it and have the changes ripple down to the later PR. And if somebody else merges something into `main`, a single `jj rebase` moves all of my stacked PRs onto the new version of `main`, with every stacked PR updated.
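A hedged sketch of that stacked-PR flow; the repo, messages, and the rebase target in the comment are hypothetical, and exact flags can differ between jj versions:

```shell
if command -v jj >/dev/null 2>&1; then
  tmp=$(mktemp -d)
  (
    cd "$tmp" &&
    jj git init acme >/dev/null 2>&1 &&
    cd acme &&
    jj describe -m "base" &&
    jj new -m "PR 1: refactor" &&   # first PR
    jj new -m "PR 2: feature"       # second PR, stacked on the first
    # When origin's main moves, one command re-parents the whole stack
    # (the change id below is illustrative):
    #   jj git fetch
    #   jj rebase -s <change id of "PR 1"> -d main@origin
  )
else
  echo "jj not installed; commands are illustrative"
fi
stack_demo=1
```

Because `jj rebase -s` carries all descendants of the named change with it, the second PR follows the first automatically.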
Yes, exactly that. In Jujutsu you don't have branches the way you do in Git. You have branches in the sense that there are forks in the tree, and you can place a "bookmark" against any revision in that tree. (When exporting to a Git repo, those bookmarks are mapped to Git branch heads.)
So yeah if I have revision `a` with two children `b` and `c`, and even if those children have their own children, a change to `a` will get rippled down to `b` and `c` and any further children. It's a bit like Git rerere if you've used it, except you're not forced to fix every conflict immediately.
Any conflicts along the way are marked on those revisions; you just fix the earliest conflict first, and quite often that'll ripple down and fix everything up. Or maybe there'll be a second conflict further down the stack of commits, and you fix that one the same way.
To fix a conflict you typically create a new revision off the conflicted one (effectively forking the tree at that point) using `jj new c` (let's call the result `cxy`), fix things up in that commit, and then `jj squash` the revision `cxy` back into `c`. This, again, ripples down and fixes up all of the descendant commits.
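As a command-by-command sketch, with `c` and `cxy` standing in for real change ids (those names come from the comment above, not from jj itself):

```shell
# 1. Fork a new working-copy change off the conflicted revision:
#      jj new c          # this new change is what's called cxy above
# 2. Edit the files to resolve the conflict markers; jj snapshots the edits.
# 3. Fold the fix back into the conflicted revision:
#      jj squash         # by default, squashes the working copy into its parent c
# Descendants of c are rebased automatically, so later conflicts in the
# stack often disappear once the earliest one is fixed.
conflict_demo=1
```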
Yes, I understand that, but what I'm saying is that the problem definition isn't completely clear to me. I'm not saying there is none; it's just not obvious on a first read.
You're condescending for no valid reason, and I will tell you that what you say is not correct. Models moved past "plumbing" tasks and well into engineering ground a generation or two ago. Evidence is plentiful. We see models perfectly capable of reasoning about kernel code, yet you're convinced that game engines are somehow more special. Why? There are plenty of examples where AI is successfully applied to hard engineering tasks (database kernels), and where it became obvious that the models are almost perfectly capable of reasoning about that, to be honest, quite difficult code. I think you should reevaluate your stance and become more humble.
Link me the research on the hard engineering tasks they've done on database kernels, I'd love to see it, sounds interesting.
As long as people comment, "Only bad/stupid engineers hand-write code because LLMs are better in every way," and that's objectively not true in various engineering circles, I'll keep trolling them and being just as hyperbolic in the inverse because it amuses me. Don't take things too seriously on the internet; you'll have a bad time ;)
I am an experienced C++ developer and I know what happens in this particular case, but this type of minutiae is only interesting to developers who have never had an actually hard problem to solve, so it's a red flag to me as well. 10 years ago I would have thought differently, but today I do not. High-performance teams do not care about this stuff.
> Maybe we should, but requiring the use of a new low level facility that was introduced in the 7.0 kernel, to address a regression that exists only in 7.0+, seems not great.
... so that leaves me confused. My understanding is that the regression is triggered with the 7.0+ kernel and can be mitigated with huge pages turned on.
My question therefore was how come this regression hasn't been visible with huge pages turned off with older kernel versions? You say that it was but I can't find this data point.
> ... so that leaves me confused. My understanding is that the regression is triggered with the 7.0+ kernel and can be mitigated with huge pages turned on.
It gets a bit worse with preempt_lazy - for me just 15% or so - because the lock holder is scheduled out a bit more often. But it was bad before.
> My question therefore was how come this regression hasn't been visible with huge pages turned off with older kernel versions? You say that it was but I can't find this data point.
I mean it wasn't a regression before, because this is how it has behaved for a long time.
This workload is not something anybody would encounter in this form in the real world. Even without the contention - which only happens the first time the buffer pool is filled - you lose so much by not using huge pages with a 100 GB buffer pool that you will have many other issues.
We (Postgres, and me personally) were concerned enough about potential contention in this path that we got rid of that lock half a year ago (buffer replacement selection has been lock-free for close to a decade; only unused buffers were found via a list protected by this lock).
But the performance gains we saw were relatively small; we didn't measure large buffer pools without huge pages, though.
And at least I didn't test with this many connections doing small random reads into a cold buffer pool, just because it doesn't seem that interesting.
No… I’m assuming that they didn’t use the same automation that creates RDS clusters for actual customers. No doubt that automation configures the EC2 nodes sanely, with hugepages turned on. Leaving them turned off in this benchmark could have been accidental, but some accident of that kind was bound to happen as soon as the tests use any kind of setup that is different from what customers actually get.
You're again assuming that having huge pages turned on always brings a net benefit, which it doesn't. I have at least one example where it didn't bring any observable benefit, while at the same time it incurred extra code complexity and server administration overhead, and necessitated extra documentation.
It is a system-wide toggle in the sense that it requires you to first enable huge pages and then set them up, even if you just want to use explicit huge pages from within your own code (madvise, mmap). I wasn't talking about THP.
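For concreteness, this is roughly what that system-wide setup looks like on Linux; the inspection commands below are read-only, and the actual reservation step (shown only as a comment) needs root:

```shell
# Read-only look at the explicit hugepage pool (safe to run on any Linux box):
grep -E '^(HugePages_Total|HugePages_Free|Hugepagesize)' /proc/meminfo || true
# Reserving pages is a host-wide, admin-level decision, e.g.:
#   sysctl -w vm.nr_hugepages=512
# Only after that can a process map them explicitly, e.g.:
#   mmap(NULL, len, PROT_READ|PROT_WRITE,
#        MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0)
# (madvise(addr, len, MADV_HUGEPAGE) is the separate THP path.)
hp_total=$(awk '/^HugePages_Total/ {print $2}' /proc/meminfo)
echo "HugePages_Total=${hp_total:-unknown}"
```

The point being: the application alone can't opt in; an administrator has to provision the pool first, which is exactly the deployment friction described above.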
When you deploy software all around the globe, and not only on servers you fully control, this becomes problematic. Even in the latter case, it is frowned upon by admins/teams if you can't prove the benefit.
Yes, there are workloads where huge pages do not bring any measurable benefit; I don't understand why that would be questionable. Even if they don't hurt runtime performance, which they could, the extra work and complexity they incur is suboptimal compared to the baseline of not using huge pages.
> Yes, there are workloads where huge pages do not bring any measurable benefit
I really doubt it, except of course for workloads that use a trivial amount of memory to begin with. In systems I've seen, anywhere from 5% to 15% of CPU time is spent waiting on TLB misses. It's obvious, then, that huge pages can be hugely beneficial if properly used; by definition they hugely relieve TLB pressure.
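One way to sanity-check a figure like that on your own workload (event names vary by CPU; `dTLB-load-misses` is the common x86 spelling, and `perf` may need elevated permissions, so the sketch tolerates failure):

```shell
if command -v perf >/dev/null 2>&1; then
  # Count data-TLB activity over a short sample; replace `sleep 1`
  # with the actual workload you want to profile.
  perf stat -e dTLB-loads,dTLB-load-misses -- sleep 1 2>&1 || true
else
  echo "perf not installed; command shown for illustration"
fi
tlb_demo=1
```

The counters give raw miss counts; correlating them with stalled cycles is what yields estimates like the 5-15% above, and larger pages cut the miss count by covering more memory per TLB entry.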
You can of course end up in situations where transparent huge page scanning is worse than nothing, but that's exactly why I pointed out there's a variety of ways to use huge pages.
You don't seem to understand that both can be true simultaneously: the CPU spends time on TLB misses, yet there is no measurable effect on end-to-end performance because a much larger bottleneck is elsewhere. In database kernels with large and unpredictable workloads and a high IO and memory footprint, this is certainly easy to demonstrate.