Caching
- Caching is layered; each layer has different cost, granularity and invalidation.
In day-to-day conversation “cache it” can refer to:
- A CPU register that avoids a RAM fetch
- A Redis hash that avoids a SQL query
- A CDN edge that avoids a trans-Atlantic round-trip
- A browser memory object that avoids a 30 ms HTTP call
- A Service-Worker entry that lets the app start while the phone is in airplane mode
All of those are “caching”, yet each layer has different granularity, lifetime, invalidation mechanism and failure mode.
If you treat them as one concept you will over-cache, under-cache, or ship invisible bugs that only show up on the customer’s machine.
- Four decisions follow: where to cache, how fresh, how to invalidate, and where the app sits on the static-vs-dynamic spectrum.
1. Decide where the bits may live (caching layers)
| Layer | Typical latency saved | Who owns it? | How to wipe it? |
|---|---|---|---|
| Browser memory (tab open) | 0-10 ms | JS code | F5, close tab |
| Browser HTTP cache | 10-300 ms | Browser + HTTP headers | Ctrl-F5, Clear-Site-Data header |
| Service-Worker cache | 0-∞ (offline) | Your JS | caches.delete() or new SW install |
| CDN / edge | 50-500 ms | Ops / Cloud console | Purge API |
| Reverse-proxy (Nginx, Varnish, Envoy) | 1-20 ms | DevOps | Restart or ban URL regex |
| Application memory (in-process) | 0.1-1 ms | Server code | Pod restart |
| Distributed cache (Redis, Memcached) | 1-5 ms | Backend | DEL key or TTL expiry |
| Database buffer pool | 0.05-0.5 ms | DBA | DB restart |
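Rule of thumb: the closer a cache sits to the user, the more latency it saves but the harder it is to invalidate (you cannot purge a browser you do not control). Give long TTLs only to content you can rename or to caches you can purge.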
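To make the layering concrete, here is a minimal sketch of a nearest-first lookup across two layers with different TTLs. `TtlCache`, `layered_get` and the injected `clock` are hypothetical names invented for illustration; a real deployment would put Redis or a CDN behind the same interface.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class TtlCache:
    """A tiny TTL cache standing in for one layer (in-process, Redis, ...)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the example can be tested without sleeping
        self._data: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._data[key]  # expired: behave like a miss
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._data[key] = (self.clock() + self.ttl, value)


def layered_get(key: str, layers: List[TtlCache], load_from_origin: Callable[[str], str]) -> str:
    """Check layers nearest-first; backfill every nearer layer that missed."""
    missed = []
    for layer in layers:
        value = layer.get(key)
        if value is not None:
            break
        missed.append(layer)
    else:
        # No layer had the key: pay the full cost once.
        value = load_from_origin(key)
    for layer in missed:
        layer.set(key, value)
    return value
```

Each layer keeps its own TTL, so the short-lived near layer can expire while the shared layer still answers without touching the origin.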
2. Pick a freshness strategy (not only TTL)
| Strategy | When to choose | HTTP headers | Example |
|---|---|---|---|
| Cache forever + fingerprint | Static assets whose content never changes without the name changing | max-age=31536000, immutable (1 year; HTTP takes seconds, "1y" is shorthand) | main.3b14c8.js |
| Stale-while-revalidate | User sees data instantly, update in background | Cache-Control: max-age=300, stale-while-revalidate=86400 | Stock-price widget |
| Must-revalidate | Content may be cached, but once it is older than X it must not be shown without checking the server | max-age=600, must-revalidate | News article |
| No-cache | Always revalidate with server, keep copy for speed | no-cache (NOT no-store) | Personalized dashboard |
| No-store | Do not write to any cache (privacy or legal) | no-store | Medical record PDF |
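The table can be read as a small decision procedure that a cache runs on every request. The sketch below is a simplification of what real HTTP caches do (it ignores `s-maxage`, `Age`, heuristics and validators); `decide` and its return labels are hypothetical names chosen for this example.

```python
from typing import Dict, Union


def parse_cache_control(header: str) -> Dict[str, Union[str, bool]]:
    """Split a Cache-Control value into {directive: value-or-True}."""
    directives: Dict[str, Union[str, bool]] = {}
    for part in header.split(","):
        name, _, value = part.strip().lower().partition("=")
        if name:
            directives[name.strip()] = value.strip().strip('"') or True
    return directives


def _seconds(directives: Dict[str, Union[str, bool]], name: str) -> int:
    raw = directives.get(name)
    return int(raw) if isinstance(raw, str) and raw.isdigit() else 0


def decide(header: str, age_seconds: int) -> str:
    """Return what a cache may do with a stored copy that is age_seconds old."""
    d = parse_cache_control(header)
    if "no-store" in d:
        return "fetch"  # nothing may be stored, so there is nothing to reuse
    if "no-cache" in d:
        return "revalidate"  # a copy exists but must be checked every time
    max_age = _seconds(d, "max-age")
    if age_seconds < max_age:
        return "serve"  # still fresh: zero round trips
    swr = _seconds(d, "stale-while-revalidate")
    if "must-revalidate" not in d and age_seconds < max_age + swr:
        return "serve-stale-and-refresh"  # answer now, update in background
    return "revalidate"
```

For example, `decide("max-age=300, stale-while-revalidate=86400", 1000)` returns `"serve-stale-and-refresh"`, while the same age under `max-age=600, must-revalidate` returns `"revalidate"`.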
3. Choose an invalidation model
- TTL only – simplest, good for quarterly releases
- Key versioning – change URL → new cache key (`v2/endpoint`)
- Purging API – keep URL, evict on deploy (CloudFront, Azure CDN, Fastly)
- Event-driven – DB write → publish event → wipe Redis/CDN key
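Two of these models are easy to sketch in a few lines: key versioning and event-driven eviction. Everything here is an illustrative assumption: `EventBus`, `UserStore`, the `user.updated` topic and the `user:<id>` key format are made up, and a plain dict stands in for Redis or a CDN purge call.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List


def versioned_key(version: int, name: str) -> str:
    """Key versioning: bumping the version makes every old entry unreachable."""
    return f"v{version}/{name}"


class EventBus:
    """Minimal in-process publish/subscribe, standing in for a real broker."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, payload: str) -> None:
        for handler in self._handlers[topic]:
            handler(payload)


class UserStore:
    """Hypothetical write path: persist first, then publish so caches can evict."""

    def __init__(self, bus: EventBus) -> None:
        self.rows: Dict[str, str] = {}
        self.bus = bus

    def write(self, user_id: str, name: str) -> None:
        self.rows[user_id] = name
        self.bus.publish("user.updated", user_id)


# Wiring: the cache layer subscribes and wipes the matching key on every write.
bus = EventBus()
cache: Dict[str, str] = {"user:7": "old name"}
bus.subscribe("user.updated", lambda uid: cache.pop(f"user:{uid}", None))
UserStore(bus).write("7", "new name")  # "user:7" is now gone from the cache
```

Versioning needs no coordination but leaves old entries to age out by TTL; event-driven eviction is immediate but only as reliable as the event delivery.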
4. Tune for static-vs-dynamic spectrum
A. “Brochure” site – changes quarterly
- Cache rules:
  - `*.html`: `Cache-Control: no-cache` (so the browser revalidates)
  - `*.js/css/png`: `max-age=31536000, immutable` (1 year)
- Invalidate only `index.html` (or rename) on each release
- Use Service-Worker for offline, but skip runtime caching of API calls (there are none)
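The brochure-site rules above can be sketched as a build step that fingerprints assets plus a function that picks the header. This is one possible scheme, not a standard: the 6-character SHA-256 prefix, the `HASHED_NAME` pattern and both function names are assumptions made for this example.

```python
import hashlib
import re
from pathlib import PurePosixPath

# Assumed naming scheme: name.<6+ hex chars>.<ext>, e.g. main.3b14c8.js
HASHED_NAME = re.compile(r"\.[0-9a-f]{6,}\.(js|css|png)$")


def fingerprint(filename: str, content: bytes, length: int = 6) -> str:
    """Embed a content hash in the file name so new content gets a new URL."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


def cache_control_for(filename: str) -> str:
    """HTML is always revalidated; only hashed assets are cached for a year."""
    if filename.endswith(".html"):
        return "no-cache"
    if HASHED_NAME.search(filename):
        return "max-age=31536000, immutable"
    return "no-cache"  # unhashed asset: not safe to cache forever
```

Because `index.html` is revalidated on every visit, a release only has to publish new hashed files and a new `index.html` that points at them.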
B. “Dashboard” app – widgets refresh every 5-30 s
- Keep shell static (cache 1 year)
- Each widget hits its own JSON endpoint
- Endpoints use stale-while-revalidate (SWR) or Server-Sent Events / WebSocket for live data
- Browser memory cache (SWR libraries: React-Query, SvelteKit `load`, VueUse, urql, Apollo) keeps last response for zero-jump navigation
- Redis/CDN caches shared expensive queries (rankings, analytics) with TTL 30-300 s
- Version the query (`/api/v2/chart?interval=1h`) so you can roll out new logic instantly
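The in-memory stale-while-revalidate behaviour that those libraries provide can be approximated as follows. This is a simplified, single-threaded sketch and not how any of the named libraries is implemented; `SwrCache`, `run_pending` and the injected `clock` are hypothetical.

```python
import time
from typing import Callable, Dict, List, Tuple


class SwrCache:
    """Serve the last known value immediately; refresh stale keys later."""

    def __init__(self, fetch: Callable[[str], str], max_age: float,
                 clock: Callable[[], float] = time.monotonic):
        self.fetch = fetch
        self.max_age = max_age
        self.clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}
        self.pending: List[str] = []  # keys waiting for a background refresh

    def get(self, key: str) -> str:
        entry = self._entries.get(key)
        if entry is None:
            return self._refresh(key)  # first load: the caller has to wait
        stored_at, value = entry
        if self.clock() - stored_at >= self.max_age and key not in self.pending:
            self.pending.append(key)  # stale: show the old value, refresh later
        return value

    def _refresh(self, key: str) -> str:
        value = self.fetch(key)
        self._entries[key] = (self.clock(), value)
        return value

    def run_pending(self) -> None:
        """Stand-in for the background task; call it from a timer or worker."""
        while self.pending:
            self._refresh(self.pending.pop(0))
```

A widget would call `get()` on every render and let a timer call `run_pending()`, so navigation never blocks on the network after the first load.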
Quick decision cheat-sheet (copy into exam note)
| Question to ask | If YES → do this | If NO → do this |
|---|---|---|
| Can user tolerate 5-min old data? | Set max-age=300 | Use no-cache or WebSocket |
| Is asset name hashed? | max-age=31536000 (1 year), immutable | no-cache |
| Does data vary per user? | CDN: bypass cache OR use Vary: Cookie | Cache at edge with s-maxage |
| Must work offline? | Install Service-Worker, runtime cache API responses | Skip SW, rely on HTTP cache |
| Deploy <1× per week? | TTL only is enough | Add purge API or key versioning |
Traps that kill assessment answers
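- “no-store” means private – wrong. no-store tells every cache, the browser’s included, not to write the response; private only bars shared caches such as CDNs. Neither hides the data, which is still readable by anyone who sniffs memory.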
- 304 Not Modified is slower than 200 (from cache) – true: a 304 skips the body but always costs one round trip (1 RTT), whereas a 200 served from a still-fresh cache entry costs none (0 RTT). Revalidation is cheap, not free.
- CDN hit ratio = performance – not if your TTL is so long that users see stale data and file bug tickets.
- “Cache busting query string” (`main.js?v=2`) – some proxies ignore the query string when caching; always change the filename instead.
- Forgetting `Vary` – if you compress, serve WebP by content negotiation, or return localized content, send `Vary: Accept-Encoding, Accept, Accept-Language` (one header name per negotiated dimension) or you will poison the cache.
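The `Vary` trap is easiest to see by looking at how a shared cache builds its key. The sketch below is a simplified model (real caches also normalize header values); `cache_key` is a hypothetical helper, not part of any library.

```python
from typing import Dict, Tuple


def cache_key(url: str, request_headers: Dict[str, str], vary: str) -> Tuple:
    """Build a cache key from the URL plus every request header named in Vary."""
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((name, lowered.get(name, "")) for name in names)


gzip_client = {"Accept-Encoding": "gzip"}
plain_client = {"Accept-Encoding": "identity"}

# Without Vary both clients share one entry, so one of them can receive
# a body encoded for the other: this is the poisoning described above.
same = cache_key("/a.css", gzip_client, "") == cache_key("/a.css", plain_client, "")

# With Vary: Accept-Encoding each encoding gets its own entry.
split = cache_key("/a.css", gzip_client, "Accept-Encoding") != cache_key(
    "/a.css", plain_client, "Accept-Encoding"
)
```

The cost of a correct `Vary` is a lower hit ratio, since every distinct header value creates a separate entry.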