Claude Code cache confusion as Anthropic tweaks defaults, but quotas still drain
Anthropic recently reduced the time-to-live (TTL) for its Claude Code prompt cache feature from one hour to five minutes for many requests, a technical adjustment that the company claimed would not increase costs for users. However, developers are reporting that their usage quotas are depleting significantly faster during extended work sessions, contradicting Anthropic's assurances and suggesting the change may have unintended consequences.
Prompt caching stores a processed version of a long, repeated prompt prefix (such as a system prompt or codebase context) so that subsequent requests can reuse it at a reduced token rate instead of reprocessing it from scratch. By shortening the TTL, cached content expires more quickly, so more requests miss the cache and pay the full reprocessing cost. This extra reprocessing appears to be consuming more tokens—the units by which API usage and costs are measured—than users expected based on Anthropic's communications about the change.
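The quota effect can be illustrated with a toy cost model. The sketch below is not Anthropic's billing logic; the multipliers (cache writes billed above the base input rate, cache reads well below it) and the request spacing are illustrative assumptions. The point it demonstrates is structural: once the TTL drops below the typical gap between requests, every request becomes a cache miss.

```python
def simulate_cost(request_gaps_s, context_tokens, ttl_s,
                  write_mult=1.25, read_mult=0.10):
    """Toy model of TTL-based prompt caching.

    Each request reuses the cached context if the cache is still
    live (cheap read), otherwise re-caches it (expensive write).
    The cache's expiry is refreshed on every request.
    Multipliers are illustrative, not Anthropic's actual pricing.
    """
    charged = 0.0
    cache_expires_at = None
    t = 0.0
    for gap in request_gaps_s:
        t += gap
        if cache_expires_at is not None and t <= cache_expires_at:
            charged += context_tokens * read_mult   # cache hit
        else:
            charged += context_tokens * write_mult  # miss: re-cache
        cache_expires_at = t + ttl_s
    return charged

# Ten requests spaced ten minutes apart, each carrying a
# 50,000-token context, over a long working session:
gaps = [600] * 10
long_ttl = simulate_cost(gaps, 50_000, ttl_s=3600)  # 1-hour TTL
short_ttl = simulate_cost(gaps, 50_000, ttl_s=300)  # 5-minute TTL
```

Under the one-hour TTL, only the first request pays the write cost and the rest are cheap reads; under the five-minute TTL, the ten-minute gaps mean the cache has always expired, so every request pays full price. In this model the short TTL charges roughly six times as many tokens for the identical workload, which is consistent with the pattern developers are describing.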
The discrepancy between Anthropic's claims and user experiences highlights ongoing confusion about how the cache change affects actual costs and usage patterns. This matters because developers relying on Claude for code-related tasks depend on predictable pricing and resource consumption. The situation underscores the importance of clear communication around technical changes that directly impact billing and service performance.
Key Takeaways
- Anthropic cut the time-to-live (TTL) for its Claude Code prompt cache from one hour to five minutes for many requests, saying the change would not increase costs for users.
- Developers nonetheless report usage quotas draining significantly faster during extended work sessions, contradicting those assurances.
- A shorter TTL means cached context expires sooner, so more requests pay the full reprocessing cost instead of cheaper cache reads, consuming more tokens.
Read the full article on The Register