I think they just swapped out LuaJIT's modified built-in dlmalloc[1] for some standard allocator, then set some of that allocator's tuning values to make it more eager to return pages with no live allocations to the OS.
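To make "tuning values" concrete: I don't know which allocator or knobs they actually picked, so this is only a sketch that assumes glibc's malloc. `mallopt`, `M_TRIM_THRESHOLD` and `malloc_trim` are real glibc calls; `make_malloc_eager` and the 64 KiB figure are names and numbers I made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>   /* on glibc this also defines __GLIBC__ */
#ifdef __GLIBC__
#include <malloc.h>   /* mallopt, malloc_trim (glibc-specific) */
#endif

/* Hypothetical helper: returns 1 if the tuning was applied, 0 otherwise. */
static int make_malloc_eager(int trim_threshold_bytes)
{
    assert(trim_threshold_bytes >= 0);
#ifdef __GLIBC__
    /* Shrink the heap once this many bytes are free at its top. */
    if (mallopt(M_TRIM_THRESHOLD, trim_threshold_bytes) != 1)
        return 0;
    /* Also ask glibc to release whatever free pages it can right now. */
    malloc_trim(0);
    return 1;
#else
    /* Assumption: other allocators need their own knobs; do nothing here. */
    (void)trim_threshold_bytes;
    return 0;
#endif
}
```

Calling `make_malloc_eager(64 * 1024)` once at startup, before creating any Lua states, would be the obvious place for it. Other allocators (jemalloc, mimalloc, and so on) expose similar settings under different names.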
LuaJIT has always had a pluggable allocator system you can set at state construction time[2]. It did have a restriction that 64-bit builds without the GC64 build option could only use the built-in allocator, but GC64 has been enabled by default for a while now.
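For anyone who hasn't used the hook: the allocator is a single function with the `lua_Alloc` shape from the Lua C API, `void *(*)(void *ud, void *ptr, size_t osize, size_t nsize)`. Here is a minimal sketch that forwards to the C library and counts live bytes. `AllocStats` and `counting_alloc` are my own hypothetical names, and I've left out `lua.h` so the snippet compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping, handed to the allocator through `ud`. */
typedef struct {
    size_t live_bytes;   /* bytes currently allocated through this hook */
} AllocStats;

/* Same signature as lua_Alloc: nsize == 0 means free, anything else
 * means (re)allocate to nsize bytes. */
static void *counting_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
{
    AllocStats *stats = ud;
    assert(stats != NULL);

    if (nsize == 0) {
        if (ptr != NULL)
            stats->live_bytes -= osize;
        free(ptr);
        return NULL;
    }

    void *p = realloc(ptr, nsize);
    if (p == NULL)
        return NULL;             /* old block is still valid and counted */
    if (ptr != NULL)
        stats->live_bytes -= osize;
    stats->live_bytes += nsize;
    return p;
}
```

With the real headers you would pass it at construction time, something like `lua_State *L = lua_newstate(counting_alloc, &stats);`. Swapping `realloc`/`free` for another allocator's equivalents is all it takes to replace the built-in one.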
Someone already created that[1] using a custom kernel driver and their own CDN, but they seem to have abandoned it[2], maybe because they would have attracted Valve's wrath by trying to monetize it.
That's actually quite interesting. Not entirely what I had in mind but close! My version would have only the first boot be a bit slow, but the aspect of dynamically replacing local content there is cool.
This would be extra cool for LAN parties with good network hardware.
There was a commercial fork of clang, zapcc[1], that cached headers and template instantiations using an in-memory client-server system[2], but idk whether they solved all the correctness issues before abandoning it.
AMD also switched to 16K (4 x 4K), down from 8 in Zen 1, for their PTE coalescing system, which is effectively a run-length-like compression of page table entries with sequential addresses into one TLB slot.
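A rough sketch of what "sequential entries into one slot" means, in case it helps. This is not AMD's actual logic: the `Pte` struct, the alignment rule and the matching-flags rule are all assumptions on my part; only the group size of 4 comes from the figure above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GROUP_PAGES 4   /* 4 x 4K = 16K */

/* Hypothetical, simplified page table entry. */
typedef struct {
    uint64_t vpn;     /* virtual page number */
    uint64_t pfn;     /* physical frame number */
    uint32_t flags;   /* permission/attribute bits */
} Pte;

/* True if GROUP_PAGES entries could be described by one TLB slot:
 * the run starts on a group boundary, and every entry is exactly one
 * page after the previous one, virtually and physically, with the
 * same flags. */
static bool can_coalesce(const Pte *ptes, size_t n)
{
    assert(ptes != NULL);
    if (n != GROUP_PAGES)
        return false;
    if (ptes[0].vpn % GROUP_PAGES != 0 || ptes[0].pfn % GROUP_PAGES != 0)
        return false;
    for (size_t i = 1; i < n; i++) {
        if (ptes[i].vpn != ptes[0].vpn + i)
            return false;
        if (ptes[i].pfn != ptes[0].pfn + i)
            return false;
        if (ptes[i].flags != ptes[0].flags)
            return false;
    }
    return true;
}
```

The practical upshot is that the OS only gets the benefit if it happens to hand out physically contiguous, aligned frames for contiguous virtual pages, so one stray frame in the group means four separate TLB entries instead of one.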
You can sort of do that with some of LLVM's JIT systems (https://llvm.org/docs/JITLink.html). I'm surprised that no one has made an edit-and-continue system using it yet.
Maybe of interest: https://github.com/clasp-developers/clasp/ (Lisp env. that uses LLVM for compilation; new-ish, actively developed.) However, my impression (I didn't measure it) is that the compilation speed is an order of magnitude slower than in SBCL, never mind CCL.
I personally just went down the route of stripping down the FFI system when integrating LuaJIT. That included removing the ability to define new FFI types/functions or load libraries, as well as removing most casting and pointer indexing.
Make sure to also remove any way to load bytecode. Luau has a good page on what they've done in pursuit of sandboxing: https://luau-lang.org/sandbox (it's also a good alternative to consider if ever you don't need LuaJIT specifically)
[1]: https://github.com/LuaJIT/LuaJIT/blob/v2.1/src/lj_alloc.c
[2]: https://github.com/LuaJIT/LuaJIT/blob/707c12bf00dafdfd3899b1...