The problem jj is trying to solve is not entirely clear to me, but I guess there are enough people who can't find their way with Git that switching to jj probably looks appealing to them. At least that's my first impression, without going too deep into the documentation.
I wouldn't say it's that people are not able to find their way with Git. I was a competent Git user and would carefully rebase and squash my commits. It's just easier and nicer with Jujutsu.
The way all changes (except those in the ignore file) are automatically incorporated into the current commit means I don't have to faff about with stash when I need to pivot and then try to remember which commit to pop against. I can just hop around the tree with gay abandon. That alone is invaluable.
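A hedged sketch of what that pivot looks like in practice (assuming jj is installed; the scratch repo, file, and messages here are made up, and `jj git init` syntax can vary between jj versions):

```shell
# Guarded so the sketch degrades gracefully when jj isn't available.
if command -v jj >/dev/null 2>&1; then
  tmp=$(mktemp -d)
  (
    cd "$tmp" &&
    jj git init demo >/dev/null 2>&1 &&
    cd demo &&
    echo "wip" > feature.txt &&            # edits are snapshotted automatically
    jj describe -m "half-finished feature" &&
    jj new -m "urgent fix"                 # pivot: no stash, just a new change
    # later, hop back with: jj edit <change id of the feature>
  )
else
  echo "jj not installed; commands are illustrative"
fi
pivot_demo=1
```

Nothing gets staged or stashed here; the working copy itself is a commit, so switching away and back is just moving between revisions.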
Then add in the fact that a change to history gets rippled down to the descendant commits. And the fact that conflicts are recorded in the history and can be dealt with at your leisure. Or the fact that `jj undo` is a thing.
There must be some kind of split in how people work or something. I’ve never had the desire to jump around the git tree. I never squash commits. I basically never stash changes. All the things that people say jj makes easier are things I never even want to do. Not because they’re not easy with git, but because it sounds hard to keep straight in my head.
Maybe. Different organisations work at different paces and with different contention rates. If you're on a small team and are being tugged about less, then you might not find value in this stuff.
But I frequently have cases where I'm making changes to repo `acme`. I'll put a PR up for review and then start on a second PR for the same repo, stacking those commits on top of my previous PR. If I then notice a problem in the earlier work, I can easily fix it and have the changes ripple down to the later PR. And if somebody else merges something into `main`, a single `jj rebase` moves all of my stacked PRs onto the new version of `main`, with every stacked PR updated.
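A hedged sketch of that stacked-PR flow; the repo, messages, and the rebase target in the comment are hypothetical, and exact flags can differ between jj versions:

```shell
if command -v jj >/dev/null 2>&1; then
  tmp=$(mktemp -d)
  (
    cd "$tmp" &&
    jj git init acme >/dev/null 2>&1 &&
    cd acme &&
    jj describe -m "base" &&
    jj new -m "PR 1: refactor" &&   # first PR
    jj new -m "PR 2: feature"       # second PR, stacked on the first
    # When origin's main moves, one command re-parents the whole stack
    # (the change id below is illustrative):
    #   jj git fetch
    #   jj rebase -s <change id of "PR 1"> -d main@origin
  )
else
  echo "jj not installed; commands are illustrative"
fi
stack_demo=1
```

Because `jj rebase -s` carries all descendants of the named change with it, the second PR follows the first automatically.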
Yes, exactly that. In Jujutsu you don't have branches the way you do in Git. You have branches in the sense that there are forks in the tree, and you can place a "bookmark" against any revision in that tree. (When exporting to a Git repo, those bookmarks are mapped to Git branch heads.)
So yeah if I have revision `a` with two children `b` and `c`, and even if those children have their own children, a change to `a` will get rippled down to `b` and `c` and any further children. It's a bit like Git rerere if you've used it, except you're not forced to fix every conflict immediately.
Any conflicts along the way are marked on those revisions; you just fix the earliest conflict first, and quite often that'll ripple down and fix everything up. Or maybe there'll be a second conflict further down the stack of commits, and you fix that one the same way.
To fix a conflict you typically create a new revision off the conflicted one (effectively forking the tree at that point) using `jj new c` (let's call the result `cxy`), fix things up in that commit, and then `jj squash` the revision `cxy` back into `c`. This, again, ripples down and fixes up all of the descendant commits.
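As a command-by-command sketch, with `c` and `cxy` standing in for real change ids (those names come from the comment above, not from jj itself):

```shell
# 1. Fork a new working-copy change off the conflicted revision:
#      jj new c          # this new change is what's called cxy above
# 2. Edit the files to resolve the conflict markers; jj snapshots the edits.
# 3. Fold the fix back into the conflicted revision:
#      jj squash         # by default, squashes the working copy into its parent c
# Descendants of c are rebased automatically, so later conflicts in the
# stack often disappear once the earliest one is fixed.
conflict_demo=1
```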
Yes, I understand that, but what I'm saying is that the problem definition isn't completely clear to me. I'm not saying there is none; it's just not obvious on a first read.
You're condescending for no valid reason, and I will tell you that what you say is not correct. Models moved past "plumbing" tasks and well into engineering ground a generation or two ago. Evidence is plentiful. We see models perfectly capable of reasoning about kernel code, yet you're convinced that game engines are somehow more special. Why? There are plenty of examples where AI is successfully applied to hard engineering tasks (database kernels), and where it became obvious that the models are almost perfectly capable of reasoning about that, to be honest, quite difficult code. I think you should reevaluate your stance and become more humble.
Link me the research on the hard engineering tasks they've done on database kernels, I'd love to see it, sounds interesting.
As long as people comment, "Only bad/stupid engineers hand-write code because LLMs are better in every way," and that's objectively not true in various engineering circles, I'll keep trolling them and being just as hyperbolic in the inverse because it amuses me. Don't take things too seriously on the internet; you'll have a bad time ;)
I am an experienced C++ developer and I know what happens in this particular case, but this type of minutiae is only interesting to developers who have never had an actually hard problem to solve, so it's a red flag to me as well. 10 years ago I would have thought differently, but today I do not. High-performance teams do not care about this stuff.
> Maybe we should, but requiring the use of a new low level facility that was introduced in the 7.0 kernel, to address a regression that exists only in 7.0+, seems not great.
... so that leaves me confused. My understanding is that the regression is triggered with the 7.0+ kernel and can be mitigated with huge pages turned on.
My question therefore was how come this regression hasn't been visible with huge pages turned off with older kernel versions? You say that it was but I can't find this data point.
> ... so that leaves me confused. My understanding is that the regression is triggered with the 7.0+ kernel and can be mitigated with huge pages turned on.
It gets a bit worse with preempt_lazy - for me just 15% or so - because the lock holder is scheduled out a bit more often. But it was bad before.
> My question therefore was how come this regression hasn't been visible with huge pages turned off with older kernel versions? You say that it was but I can't find this data point.
I mean it wasn't a regression before, because this is how it has behaved for a long time.
This workload is not something anybody would encounter in this form in the real world. Even without the contention - which only happens the first time the buffer pool is filled - you lose so much by not using huge pages with a 100 GB buffer pool that you will have many other issues.
We (Postgres, and me personally) were concerned enough about potential contention in this path that we got rid of that lock half a year ago (buffer replacement selection has been lock-free for close to a decade; only unused buffers were found via a list protected by this lock).
But the performance gains we saw were relatively small; we didn't measure large buffer pools without huge pages, though.
And at least I didn't test with this many connections doing small random reads into a cold buffer pool, just because it doesn't seem that interesting.
No… I’m assuming that they didn’t use the same automation that creates RDS clusters for actual customers. No doubt that automation configures the EC2 nodes sanely, with hugepages turned on. Leaving them turned off in this benchmark could have been accidental, but some accident of that kind was bound to happen as soon as the tests use any kind of setup that is different from what customers actually get.
You're again assuming that having huge pages turned on always brings a net benefit, which it doesn't. I have at least one example where it didn't bring any observable benefit, while at the same time it incurred extra code complexity and server administration overhead, and necessitated extra documentation.
It is a system-wide toggle in the sense that it requires you to first enable huge pages and then set them up, even if you just want to use explicit huge pages from within your own code (madvise, mmap). I wasn't talking about THP.
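For concreteness, this is roughly what that system-wide setup looks like on Linux; the inspection commands below are read-only, and the actual reservation step (shown only as a comment) needs root:

```shell
# Read-only look at the explicit hugepage pool (safe to run on any Linux box):
grep -E '^(HugePages_Total|HugePages_Free|Hugepagesize)' /proc/meminfo || true
# Reserving pages is a host-wide, admin-level decision, e.g.:
#   sysctl -w vm.nr_hugepages=512
# Only after that can a process map them explicitly, e.g.:
#   mmap(NULL, len, PROT_READ|PROT_WRITE,
#        MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0)
# (madvise(addr, len, MADV_HUGEPAGE) is the separate THP path.)
hp_total=$(awk '/^HugePages_Total/ {print $2}' /proc/meminfo)
echo "HugePages_Total=${hp_total:-unknown}"
```

The point being: the application alone can't opt in; an administrator has to provision the pool first, which is exactly the deployment friction described above.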
When you deploy software all around the globe, and not only on servers you fully control, this becomes problematic. Even in the latter case, it is frowned upon by admins/teams if you can't prove the benefit.
Yes, there are workloads where huge pages do not bring any measurable benefit; I don't understand why that would be questionable. Even if they don't hurt runtime performance, which they could, the extra work and complexity they incur is suboptimal compared to the baseline of not using huge pages.
> Yes, there are workloads where huge pages do not bring any measurable benefit
I really doubt it, except of course for workloads that use a trivial amount of memory to begin with. In systems I've seen, anywhere from 5% to 15% of CPU time is spent waiting on TLB misses. It's obvious, then, that huge pages can be hugely beneficial if properly used; by definition they hugely relieve TLB pressure.
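One way to sanity-check a figure like that on your own workload (event names vary by CPU; `dTLB-load-misses` is the common x86 spelling, and `perf` may need elevated permissions, so the sketch tolerates failure):

```shell
if command -v perf >/dev/null 2>&1; then
  # Count data-TLB activity over a short sample; replace `sleep 1`
  # with the actual workload you want to profile.
  perf stat -e dTLB-loads,dTLB-load-misses -- sleep 1 2>&1 || true
else
  echo "perf not installed; command shown for illustration"
fi
tlb_demo=1
```

The counters give raw miss counts; correlating them with stalled cycles is what yields estimates like the 5-15% above, and larger pages cut the miss count by covering more memory per TLB entry.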
You can of course end up in situations where transparent huge page scanning is worse than nothing, but that's exactly why I pointed out there's a variety of ways to use huge pages.
You don't seem to understand that both can be true simultaneously: the CPU spends time on TLB misses, yet there is no measurable effect on end-to-end performance because a much larger bottleneck is elsewhere. In database kernels with large and unpredictable workloads and a high IO and memory footprint, this is certainly easy to demonstrate.