Your LLM's Cache Charges You Double to Save You Money (And It Makes Sense)

A few weeks ago, I published an article explaining why 99% of what you send to Claude is already cached: KV tensors, VRAM, local SSDs, the full internal machinery. But I left out the part that hurts the most: the bill. Prompt caching seems like a sweet deal until you look closely at the numbers, and then you realize that you're paying to save.

The cost paradox

Let's crunch the numbers. With Claude Sonnet: ...
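To make the paradox concrete, here's a minimal sketch of the arithmetic, assuming Anthropic's published prompt-caching multipliers (cache writes billed at 1.25x the base input rate, cache hits at 0.10x) and an illustrative $3 per million input tokens as the Sonnet base rate; the exact rates may differ for your model and tier.

```python
BASE_RATE = 3.00   # $ per million input tokens (illustrative assumption)
WRITE_MULT = 1.25  # cache write premium over the base input rate
READ_MULT = 0.10   # cache hit discount relative to the base input rate

def uncached_cost(calls: int, mtok: float = 1.0) -> float:
    """Cost of resending the same mtok-million-token prompt `calls` times."""
    return calls * mtok * BASE_RATE

def cached_cost(calls: int, mtok: float = 1.0) -> float:
    """First call pays the write premium; every later call pays the hit rate."""
    if calls == 0:
        return 0.0
    first_call = mtok * BASE_RATE * WRITE_MULT
    later_calls = (calls - 1) * mtok * BASE_RATE * READ_MULT
    return first_call + later_calls

if __name__ == "__main__":
    for n in (1, 2, 10):
        print(f"{n} call(s): uncached ${uncached_cost(n):.2f}, "
              f"cached ${cached_cost(n):.2f}")
```

Under these assumptions, a prompt sent only once actually costs 25% more with caching enabled; the savings only start from the second call onward, which is exactly the "paying to save" tension.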

March 10, 2026 · Fernando