Caching Strategies: Redis, CDN, Browser, and Application-Level
The slowest query in our application took 4.2 seconds. It joined six tables, aggregated a month of analytics data, and ran every time a user loaded their dashboard. We spent two weeks trying to optimize the SQL. We reduced it to 1.8 seconds. Still too slow. Then we added a single line of Redis caching with a 5-minute TTL. Response time: 3 milliseconds. The 4.2-second query ran once every 5 minutes instead of hundreds of times.
Caching is the most impactful performance optimization in web development. It's also the most misunderstood. "Just add a cache" sounds simple until you're dealing with stale data, cache stampedes, invalidation bugs, and mysterious inconsistencies that only appear under load.
This guide covers caching at every layer of the stack: browser, CDN, application, and database. For each layer, I'll explain when to use it, how to implement it correctly, and the pitfalls that catch experienced engineers.
The Caching Hierarchy
Modern web applications have multiple caching layers. A request might hit all of them:
| Layer | Location | Speed | Scope | Controls |
|---|---|---|---|---|
| Browser Cache | User's device | ~0ms | Single user | HTTP headers (Cache-Control) |
| CDN Cache | Edge servers worldwide | 5-50ms | All users in a region | HTTP headers + CDN config |
| Application Cache | Application memory (in-process) | <1ms | Single application instance | Application code |
| Distributed Cache | Redis/Memcached cluster | 1-5ms | All application instances | Application code |
| Database Cache | Database memory (buffer pool) | 1-10ms | Database-level | Database config |
The golden rule: cache at the outermost layer possible. A browser cache hit costs nothing. A CDN cache hit costs a network round trip. An application cache hit costs a function call. A database cache hit costs a query. Each layer inward is slower and more expensive.
Browser Caching
Cache-Control Header
The Cache-Control HTTP header is the primary mechanism for browser caching. The most important directives:
| Directive | Meaning | Use When |
|---|---|---|
| max-age=3600 | Cache for 3600 seconds (1 hour) | Static assets with versioned URLs |
| no-cache | Cache but revalidate before each use | HTML pages, API responses |
| no-store | Don't cache at all | Sensitive data (banking, health) |
| public | CDN and browser can cache | Public content (images, CSS, JS) |
| private | Only browser can cache (not CDN) | User-specific content |
| immutable | Content will never change | Versioned assets (app.a1b2c3.js) |
| stale-while-revalidate=60 | Serve stale while fetching fresh | Content where slight staleness is OK |
The Best Practice for Static Assets
Modern build tools (Webpack, Vite, Next.js) generate files with content hashes in the filename (e.g., main.a1b2c3d4.js). For these files, use:
Cache-Control: public, max-age=31536000, immutable
This tells browsers to cache the file for 1 year and never revalidate. When you change the file, the hash changes, creating a new URL. The old cached file is never used again. According to Google's web.dev documentation, this pattern eliminates revalidation requests and gives the fastest possible repeat loads.
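As a minimal sketch, a server can choose the header value from the filename shape. The hash pattern below is an assumption about your bundler's output (8+ hex characters before the extension); adjust it to match your build tool.

```python
import re

def cache_control_for(path):
    """Pick a Cache-Control value for a static asset path.

    Assumes hashed filenames look like main.a1b2c3d4.js (8+ hex chars);
    this pattern is illustrative, not universal.
    """
    if re.search(r"\.[0-9a-f]{8,}\.(js|css|woff2?|png|svg)$", path):
        # Content-hashed filename: cache for a year, never revalidate.
        return "public, max-age=31536000, immutable"
    # Unhashed entry points (e.g. index.html): cache but always revalidate.
    return "no-cache"
```

The key design point: the aggressive policy is only safe because a content change produces a new URL, so the old cached file can never be served for new content.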
The ETag and Last-Modified Dance
For content that changes (HTML pages, API responses), use conditional caching with ETag or Last-Modified. The browser sends the cached version's ETag in the If-None-Match header. If the content hasn't changed, the server returns 304 Not Modified (no body), saving bandwidth. If it has changed, the server returns the new content.
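The conditional flow can be sketched in a few lines. This toy handler derives a strong ETag from a hash of the body; real servers often use mtime/size or a version column instead, so treat the details as assumptions.

```python
import hashlib

def handle_request(body, if_none_match):
    """Conditional-GET sketch: returns (status, headers, body).

    The ETag is a truncated SHA-256 of the body; any stable
    fingerprint of the content would work.
    """
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    if if_none_match == etag:
        # Client's cached copy is still valid: 304, no body sent.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Cache-Control": "no-cache"}, body

# First request returns 200 plus an ETag; replaying that ETag
# in If-None-Match yields a bodyless 304.
status1, headers, _ = handle_request(b"<h1>hello</h1>", None)
status2, _, body2 = handle_request(b"<h1>hello</h1>", headers["ETag"])
```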
CDN Caching
A Content Delivery Network caches content on edge servers distributed worldwide, serving users from the nearest location. The major CDN providers:
| CDN | Best For | Edge Locations | Pricing Model |
|---|---|---|---|
| Cloudflare | General purpose, free tier | 310+ | Free tier + paid plans |
| AWS CloudFront | AWS ecosystem | 600+ | Pay per GB + request |
| Fastly | Real-time purging, edge compute | 90+ | Pay per GB + request |
| GCP Cloud CDN | GCP ecosystem | 180+ | Pay per GB + request |
| Vercel Edge Network | Next.js applications | 100+ | Included in plan |
What to Cache on a CDN
- Always cache: Images, CSS, JavaScript, fonts, videos, PDFs
- Cache with caution: HTML pages (use short TTLs or stale-while-revalidate)
- Cache selectively: API responses (only if public and not user-specific)
- Never cache: Authentication endpoints, user-specific data, form submissions
Cache Invalidation at the CDN
CDN cache invalidation is the process of telling edge servers to discard cached content. Different CDNs handle this differently:
- Cloudflare: Purge by URL, by tag (Enterprise), or purge everything. Propagation: 30 seconds globally.
- CloudFront: Create an "invalidation" for specific paths. Propagation: up to 15 minutes. Use versioned URLs to avoid needing invalidations.
- Fastly: Purge by URL, by surrogate key (tag), or purge all. Propagation: under 150ms. This is Fastly's killer feature.
According to Fastly's engineering blog, surrogate key purging allows content owners to invalidate all related content (e.g., all pages that display a specific product) in milliseconds. This is the gold standard for CDN cache management.
Application-Level Caching with Redis
Redis is the dominant choice for application-level distributed caching. As of 2025, DB-Engines ranks it as the #1 key-value store by a wide margin.
Caching Patterns
Cache-Aside (Lazy Loading)
The most common pattern. The application checks the cache first. On a miss, it queries the database, stores the result in the cache, and returns it.
- Check Redis for the key
- If found (cache hit): return the cached value
- If not found (cache miss): query the database
- Store the result in Redis with a TTL
- Return the result
Pros: Simple, only caches data that's actually requested, cache failures don't break the application.
Cons: First request is always slow (cache miss). Potential for cache stampede.
Write-Through
Every write goes to both the cache and the database simultaneously. The cache is always up to date.
Pros: Cache is always consistent with the database.
Cons: Write latency increases (two writes per operation). Caches data that may never be read.
Write-Behind (Write-Back)
Writes go to the cache first. The cache asynchronously writes to the database in batches.
Pros: Fastest writes. Batching reduces database load.
Cons: Risk of data loss if the cache fails before writing to the database. Complex to implement correctly.
Read-Through
Similar to cache-aside, but the cache itself is responsible for loading data from the database on a miss. The application only talks to the cache.
Pros: Simpler application code.
Cons: Requires cache infrastructure that supports data loading.
| Pattern | Read Performance | Write Performance | Consistency | Complexity |
|---|---|---|---|---|
| Cache-Aside | Good (after first miss) | Normal | Eventual (TTL-based) | Low |
| Write-Through | Excellent | Slower (dual write) | Strong | Medium |
| Write-Behind | Excellent | Fastest | Weak (async) | High |
| Read-Through | Good | Normal | Eventual | Medium |
Cache Invalidation: The Hard Problem
Phil Karlton famously said: "There are only two hard things in Computer Science: cache invalidation and naming things." He wasn't joking.
Strategies for Cache Invalidation
1. Time-Based (TTL)
Set an expiration time on every cached value. After the TTL expires, the next request fetches fresh data. This is the simplest and most common approach.
How to choose a TTL:
- User profile data: 5-15 minutes (changes infrequently, slight staleness acceptable)
- Product catalog: 1-5 minutes (changes sometimes, staleness visible to users)
- Analytics dashboards: 5-30 minutes (expensive to compute, staleness expected)
- Session data: 30 minutes - 24 hours (align with session timeout)
- API rate limit counters: Exact window size (must be precise)
2. Event-Based Invalidation
When data changes, publish an event that triggers cache invalidation. For example, when a user updates their profile, publish a "user.updated" event that invalidates the cached user profile.
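A minimal in-process sketch of that flow is below. The event bus here is a plain dict of handlers; in production this would be Redis pub/sub, Kafka, or your framework's signal system, and the handler would delete the key from Redis rather than a local dict.

```python
# Toy event bus: event name -> list of handler callables.
subscribers = {}

def subscribe(event, handler):
    subscribers.setdefault(event, []).append(handler)

def publish(event, payload):
    for handler in subscribers.get(event, []):
        handler(payload)

cache = {"user:42": {"name": "Old Name"}}

# Invalidate the cached profile whenever user.updated fires.
subscribe("user.updated", lambda p: cache.pop("user:%s" % p["id"], None))

publish("user.updated", {"id": 42})  # the stale entry is now gone
```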
3. Version-Based Invalidation
Append a version number or hash to the cache key. When data changes, increment the version. Old cache entries become orphaned and eventually expire.
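A sketch of versioned keys, with a plain dict standing in for Redis and TTLs omitted for brevity:

```python
versions = {"product:7": 1}   # version counter, bumped on every write
cache = {}

def cache_key(entity):
    # Readers always build the key from the current version.
    return "%s:v%d" % (entity, versions.get(entity, 0))

cache[cache_key("product:7")] = {"price": 10}   # cached as product:7:v1

versions["product:7"] += 1    # data changed: bump the version

# Readers now look for product:7:v2 and miss; the v1 entry is
# orphaned and would fall out of Redis once its TTL expires.
stale_still_present = "product:7:v1" in cache
fresh_missing = cache.get(cache_key("product:7")) is None
```

The trade-off: no explicit deletes are needed, but orphaned entries consume memory until their TTLs expire, so every versioned entry should carry a TTL.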
Common Caching Problems and Solutions
Cache Stampede (Thundering Herd)
When a popular cache entry expires, hundreds of requests simultaneously query the database to repopulate it. This can overwhelm the database.
Solutions:
- Locking: Use a distributed lock (Redis SETNX). Only one request queries the database; others wait for the cache to be repopulated.
- Stale-while-revalidate: Serve stale data while one request refreshes the cache in the background.
- Early expiration: Before the TTL expires, randomly refresh the cache (probabilistic early expiration).
Cache Penetration
Requests for data that doesn't exist bypass the cache every time (because there's nothing to cache). An attacker can exploit this by requesting non-existent keys to overwhelm the database.
Solutions:
- Cache negative results: Cache "not found" with a short TTL (e.g., 1 minute).
- Bloom filter: Use a bloom filter to check if a key could exist before querying the database.
Cache Avalanche
Many cache entries expire at the same time, causing a sudden spike in database queries.
Solutions:
- Jittered TTLs: Add random jitter to TTLs (e.g., 300 +/- 30 seconds) so entries expire at different times.
- Tiered expiration: Use different TTLs for different data types.
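Jitter is a one-line helper. The 10% jitter fraction below is an arbitrary starting point; tune it to your traffic.

```python
import random

def jittered_ttl(base_seconds, jitter_fraction=0.1):
    """Return base TTL +/- up to jitter_fraction random spread,
    so entries written together do not all expire together."""
    jitter = base_seconds * jitter_fraction
    return int(base_seconds + random.uniform(-jitter, jitter))

# 1000 entries cached "simultaneously" now expire across a 60s window.
ttls = [jittered_ttl(300) for _ in range(1000)]
```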
Redis vs. Memcached vs. In-Process Cache
| Feature | Redis | Memcached | In-Process (e.g., Node.js Map) |
|---|---|---|---|
| Data structures | Strings, hashes, lists, sets, sorted sets, streams | Strings only | Anything |
| Persistence | Yes (RDB, AOF) | No | No |
| Replication | Yes (primary-replica) | No (client-side sharding) | No |
| Pub/Sub | Yes | No | N/A |
| Max memory | Practical limit ~100GB | Practical limit ~64GB | Limited by process memory |
| Multi-threaded | Partially (I/O threads in 6.0+) | Yes | Single-threaded (Node.js) |
| Latency | ~0.5-1ms | ~0.2-0.5ms | <0.01ms |
| Shared across instances | Yes | Yes | No |
My recommendation: Use Redis. Unless you have a specific reason to use Memcached (higher throughput for simple key-value pairs) or in-process caching (single instance, sub-millisecond latency needed), Redis is the right choice. Its data structures, persistence, and pub/sub capabilities make it versatile beyond just caching.
According to Redis benchmarks, a single Redis instance can handle 100,000+ operations per second on commodity hardware, which is sufficient for most applications.
My Opinionated Take
1. Start with browser caching. Before you add Redis or a CDN, make sure your Cache-Control headers are correct. Proper browser caching is free and eliminates requests entirely. I've seen applications add Redis caching without fixing their browser caching, which is like optimizing your database queries while your CDN serves every request uncached.
2. TTL-based invalidation is almost always enough. Event-based invalidation is theoretically superior but operationally complex. For most applications, a 5-minute TTL means your data is at most 5 minutes stale. Ask yourself: does your business actually need fresher data than that? Usually not.
3. Cache miss penalty matters more than cache hit speed. Everyone optimizes for cache hits. But the user experience is determined by cache misses, since those are the requests users wait for. Optimize your database queries and set reasonable TTLs to minimize the impact of misses.
4. Monitor your cache hit ratio. If your cache hit ratio is below 90%, something is wrong. Either your TTLs are too short, your cache is too small, or your access patterns don't benefit from caching. According to Datadog's 2025 infrastructure report, the median Redis cache hit ratio across their customers is 94%.
5. Don't cache everything. Caching adds complexity. Every cached value is a potential source of stale data. Cache the expensive, frequently-accessed, rarely-changing data. Don't cache cheap queries or data that changes every request.
Action Plan: Building Your Caching Strategy
Phase 1: Browser + CDN (Week 1)
- Audit your Cache-Control headers using Chrome DevTools
- Add immutable and max-age=31536000 to versioned static assets
- Set up a CDN (Cloudflare free tier is a great start)
- Configure CDN caching for static assets
Phase 2: Application Cache (Week 2-3)
- Identify your 5 slowest database queries
- Set up Redis (managed service recommended: AWS ElastiCache, Redis Cloud)
- Implement cache-aside pattern for the top 3 slowest queries
- Add jittered TTLs to prevent cache avalanche
- Add cache hit/miss monitoring
Phase 3: Optimization (Month 2)
- Add cache stampede protection (locking or stale-while-revalidate)
- Cache negative results for frequently-queried non-existent keys
- Review and tune TTLs based on actual access patterns
- Set up alerting on cache hit ratio drops
Phase 4: Advanced (Month 3+)
- Consider event-based invalidation for critical data
- Evaluate CDN edge computing for dynamic content caching
- Implement cache warming for predictable traffic patterns
- Load test your cache under realistic conditions
Key Takeaways
- Cache at the outermost layer possible: browser > CDN > application > database.
- Use Cache-Control: public, max-age=31536000, immutable for versioned static assets.
- Cache-aside with Redis is the right default pattern for application-level caching.
- TTL-based invalidation is simpler and sufficient for most applications.
- Protect against cache stampedes with locking or stale-while-revalidate.
- Add jitter to TTLs to prevent cache avalanche.
- Monitor your cache hit ratio. Target 90%+.
- Don't cache everything. Cache expensive, frequent, and stable data.
Sources
- Google web.dev - HTTP Cache
- Redis Benchmarks
- Cloudflare - What Is Caching?
- Fastly Engineering Blog
- AWS ElastiCache Documentation
- DB-Engines Key-Value Store Ranking
- Datadog - 2025 Infrastructure Report
I'm Ismat, and I build BirJob — Azerbaijan's job aggregator scraping 80+ sources daily.
