The state of the system can be cached after the system prompt is calculated, and all new chats start from that state. O(n^2) is not great, but apparently it's fine at these context lengths, and I'm sure this is a factor in their minimum prompt cost. Advances like grouped-query or multi-head attention or sparse attention will eventually get rid of that quadratic term, hopefully.
That's not how it works. The system prompt doesn't "get calculated first" or anything. You combine it with the user prompt and then run the generation for the first new token on that thing, which basically boils down to one huge matmul that runs in parallel. So you can literally just cache a part of the input matrices for the first step and then you'll very quickly run into n^2 complexity.
The system prompt will always match in the prefix cache. I just meant it could be prefilled before any user queries, on completely different hardware. Then you are dealing with the n^2 only for the actual user prompt. We're in agreement, I think.
> And it gets churned by every single request they receive.
Not true. It gets calculated once, is essentially baked into the initial state, and gets stored in a standard K/V prefix cache. Processing only happens on new input (except for attention, which still has to contend with the tokens from the prompt).
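To make the prefix-cache point concrete, here is a minimal sketch in plain Python. It is a toy, not how any real inference server is written: it assumes a single attention head and a single layer, and each token vector acts as its own query, key and value (no learned projections). The names `attend` and `prefill` and the 50/10 token split are made up for illustration. It counts query-key dot products with and without a cached system prompt.

```python
import math
import random

def attend(q, keys, values):
    """Softmax attention of one query over the given keys/values."""
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q)) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(len(values[0]))]

def prefill(tokens, cache=None):
    """Causal attention over `tokens`, continuing from an optional K/V cache.

    Toy assumption: each token vector is its own query, key and value.
    Returns (outputs, new_cache, number_of_query_key_dot_products).
    """
    # Copy the cached lists so the shared prefix cache is never mutated.
    keys, values = (list(cache[0]), list(cache[1])) if cache else ([], [])
    outputs, dots = [], 0
    for tok in tokens:
        keys.append(tok)
        values.append(tok)
        outputs.append(attend(tok, keys, values))  # sees prefix + itself only
        dots += len(keys)
    return outputs, (keys, values), dots

random.seed(0)
DIM = 4
system_prompt = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(50)]
user_prompt = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(10)]

# Without a cache: the whole sequence is processed for every request.
full_out, _, full_dots = prefill(system_prompt + user_prompt)

# With a prefix cache: the system prompt is prefilled once and reused.
_, prefix_cache, prefix_dots = prefill(system_prompt)
user_out, _, user_dots = prefill(user_prompt, prefix_cache)

print(full_dots, prefix_dots, user_dots)
```

In this toy the per-request work drops from (s+u)(s+u+1)/2 dot products to u*s + u(u+1)/2, with s = 50 system tokens and u = 10 user tokens. The new tokens still attend to every cached token, so the cost is not just u^2, which is the caveat in the parenthetical above. The outputs for the user tokens are the same either way, because a causal mask means the prefix never depends on what comes after it.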
In my area street parking is banned on collection day until 5pm. This is also when they do street cleaning. Somehow everyone finds room for all their cars on this day. Otherwise it's similar to how you describe.
Guilty of garage as a storage shed, but it's also crazy to me that people don't store their second most expensive asset inside their garage.
I have space in garage for car at times of year when plowing may be needed. But plenty of space outside on driveway at times of year when it's not. Live in a very safe area and it's easier to just pull up in front of the garage door. Not sure what's crazy about that.
Read every word. I liked this detail in the footnotes:
> The Astro Compass needed to know approximately where in the sky to find the star, in order to point its sensor in the right direction. The direction didn't need to be exact because the Astro Compass performed a spiral search pattern to find the star. This search pattern covered ±4° in bearing and ±2.5° in altitude. In comparison, the Moon is 0.5° wide, so it's a fairly large target area.
Laser trackers (used for metrology) can also use spiral search to find retroreflectors. Although I believe newer models generate a flash of infrared and find them via bright spots in the resulting image from a camera. I imagine modern celestial navigation systems evolved similarly (minus the infrared flash, not very useful for stars).
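For anyone curious what a spiral search like the one in the quoted footnote might look like, here is a rough sketch. Only the ±4° bearing and ±2.5° altitude limits come from the footnote. The elliptical spiral shape, the number of turns, the step count and the name `spiral_offsets` are my own assumptions, not the actual Astro Compass scan.

```python
import math

def spiral_offsets(half_bearing=4.0, half_altitude=2.5, turns=5, steps_per_turn=36):
    """Yield (bearing, altitude) offsets in degrees from the expected star position.

    Hypothetical pattern: an elliptical spiral that starts at the expected
    position and widens linearly until it reaches the search limits.
    """
    total = turns * steps_per_turn
    for i in range(total + 1):
        frac = i / total                           # 0 at the centre, 1 at the edge
        angle = 2 * math.pi * i / steps_per_turn   # one full turn per steps_per_turn
        yield (half_bearing * frac * math.cos(angle),
               half_altitude * frac * math.sin(angle))

points = list(spiral_offsets())
print(len(points), points[0], points[-1])
```

The sensor would step through these offsets and stop at the first one where it detects the star, so a rough initial pointing estimate is enough.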
Honestly that footnote really stood out to me too! The spiral search detail makes the whole system feel a lot more alive than I expected, like it’s actively hunting for the star rather than just pointing and hoping.
And? Is that a hurdle or something? You know homeless people are allowed to go on the internet? Smartphones? You'll find other homeless or desolate people here on HN - I won't name anyone out of respect but if you read enough comments here over time you would recognize them.
I'd suspect bitter people who embellish and tell falsehoods in an attempt to smear the thing they're bitter against. At least, more so than a principled person would.