What are the best caching strategies for AI agent web reads?

Question

Accepted Answer

The best caching strategies for AI agent web reads stack three layers: **(1) in-session cache** keyed by URL — same agent run, same URL, return cached markdown; **(2) shared TTL cache** keyed by `(URL, content-hash)` with a 1-24h TTL depending on source volatility; **(3) ETag/Last-Modified conditional GET** to let the publisher confirm whether to refetch. AgentFetch implements all three out of the box — in-session is automatic, shared TTL defaults to 1 hour (configurable per call), and conditional GETs are sent whenever the upstream supports them. Concrete TTL guidance: news sources, 15-60 minutes; documentation, 6-24 hours; static reference (Wikipedia, MDN), 24 hours - 7 days; pricing pages, 1 hour (for monitoring agents); status pages, 60 seconds; arXiv abstracts, 30 days. Don't cache: paywalled pages (cookies tied to session), POST responses, search-results pages with personalization. Cache invalidation: AgentFetch supports `cache_bust=true` per call when the agent needs freshness, and emits `X-AgentFetch-Cache: HIT|MISS` headers for observability. For self-hosted deployments, swap the default in-memory cache for Redis or a CDN edge cache — both work with AgentFetch's standard HTTP semantics. The single biggest win: caching `fetch_url` results across agent invocations for documentation reads, which typically have 80%+ hit rates.