Prompt caching is done on the provider side. If you send two requests to a provider in quick succession and the beginning of the second request is identical to the first (for example, because the second request continues an ongoing chat), the repeated tokens are billed at a much lower rate the second time.
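To make the billing effect concrete, here is a rough sketch of the idea, not any provider's actual pricing logic. The names `shared_prefix_len` and `request_cost`, the per-token `price`, and the `cached_discount` of 0.1 are all made up for illustration; real providers have their own rates, minimum prefix lengths, and cache lifetimes.

```python
def shared_prefix_len(prev_tokens, tokens):
    """Count how many leading tokens the two requests have in common."""
    n = 0
    for a, b in zip(prev_tokens, tokens):
        if a != b:
            break
        n += 1
    return n


def request_cost(prev_tokens, tokens, price=1.0, cached_discount=0.1):
    """Toy cost model: tokens in the shared prefix are billed at a discount.

    `price` and `cached_discount` are hypothetical values, not real rates.
    """
    cached = shared_prefix_len(prev_tokens, tokens)
    fresh = len(tokens) - cached
    return cached * price * cached_discount + fresh * price


# The second request repeats the first and appends two new tokens.
first = ["system", "hello"]
second = ["system", "hello", "answer", "followup"]
cost_without_cache = request_cost([], second)
cost_with_cache = request_cost(first, second)
```

In this toy model only the two new tokens are billed at full price. If anything near the start of the request changes, the shared prefix shrinks and the discount disappears with it.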
Obviously, your tool does not provide this. But I think GP is undervaluing the UX advantages of having your conversation history.
Yes, that's it. I actually just ask Codex/Claude Code to look up the session ID when I want to resume a session across harnesses. The sessions are just JSONL files stored locally, so it can access the full conversation history when needed.
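For anyone who wants to read those files directly, a minimal sketch might look like the following. It only assumes the JSONL convention of one JSON object per line; the function name `load_session` is hypothetical, and the file location and record fields depend on the tool, so neither is assumed here.

```python
import json
from pathlib import Path


def load_session(path):
    """Parse a JSONL file into a list of records, one per non-empty line."""
    records = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        records.append(json.loads(line))
    return records
```

The result is a plain list, so what each record contains (roles, messages, timestamps) is whatever the harness chose to write.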
I know you're just sharing a single sample, but is this even the same test? In the article, the model is being inspected while generating the next token(s), and the probabilities are listed.
Here, you're asking the model to retrospectively fill in a missing word, and it's answering your prompt. We have no idea what the actual token probability in Claude is and no way of probing it by asking it.
FWIW eviction was what I immediately thought would fill in the blank, and without the Trump presidency, I think deportation would probably be a lot less common of a choice despite fitting quite fine.
When the bad guys are too impatient to wait until you leave the computer but not fast enough to stop you before 30 degrees while keeping the convenience of life.
That sounds likely to increase their costs and create new opportunities to get caught. Not a silver bullet but not "absolutely nothing". Like how anti-money laundering laws don't wipe out all crime, but are still worthwhile.
If the API costs are gonna be thousands, or the subscription will be $20/month, is it really that expensive to pay some guy on Discord a $50 gift card to verify the account as a one-time setup? Better yet, we'll probably start seeing fake porn websites and other phishing sites that ask to verify your age but end up proxy verifying a bunch of these services in an automated manner with minimal costs, and you'll be able to buy verified Claude accounts for tens of cents on account marketplaces. Just as you have been able to buy verified Discord accounts, aged Steam accounts, etc...
This is so cool! One thing that occurred to me while watching the video: would it potentially make more sense to just do this in the kernel? That way, you don't have to fight virtual addressing, and I imagine (?) you could even know for sure which channel you're on instead of guessing.