Caching Strategies: Redis, Memcached, CDN Patterns (2025)
Caching is one of the highest-ROI performance tools when applied carefully. This guide provides concrete patterns and pitfalls.
At a glance
- Choose consistent keys; sensible TTLs; prevent stampedes; partition hot keys
- Select strategy per use case: write-through/around/back; read-through; negative caching
Stampede prevention
- Request coalescing; jittered TTL; soft TTL + background refresh; locks
Invalidation
- Event-driven invalidation; tag-based; versioned keys; fallbacks
Observability
- Hit ratio, latency, key size distributions, eviction rates; sample payloads safely
CDN
- Cache-control headers; immutable assets; signed URLs; edge functions
FAQ
Q: Is Redis better than Memcached?
A: Redis offers richer data types and persistence; Memcached is a simple, fast in-memory cache; choose by feature needs and ops.
Executive Summary
This guide provides a production-focused blueprint for caching in 2025: Redis/Memcached/CDN/edge, cache patterns (cache-aside, write-through, write-back, refresh-ahead), invalidation strategies, stampede prevention, metrics/observability, HA/DR, and cost modeling.
Caching Fundamentals
Patterns
- Cache-Aside: app reads from cache; on miss, load from source and populate
- Write-Through: writes go to cache and source synchronously
- Write-Back (Write-Behind): write to cache, flush to source asynchronously
- Refresh-Ahead: refresh items before TTL expires for hot keys
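For concreteness, a minimal cache-aside sketch in TypeScript, assuming a connected node-redis v4 client (`redis`) and a hypothetical `loadOrder` source-of-truth loader:
// Cache-aside: check the cache first; on miss, load from the source and populate.
type Order = { id: string; amount: number }
async function getOrder(id: string): Promise<Order> {
  const key = `order:v1:${id}`
  const hit = await redis.get(key)
  if (hit) return JSON.parse(hit) as Order                   // cache hit
  const order = await loadOrder(id)                          // miss: read the source of truth
  await redis.set(key, JSON.stringify(order), { EX: 300 })   // populate with a TTL
  return order
}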
TTL and Eviction
- TTL per key or per namespace
- Evictions: LRU/LFU/Random; size-based limits
- Soft TTL (serve stale) vs Hard TTL (strict expiry)
Keys, Namespacing, Versioning
function key(ns: string, id: string, ver = 'v1'){ return `${ns}:${ver}:${id}` }
- Namespaces per tenant/service/route
- Version bump to invalidate entire namespace on deploy
Consistency and Invalidation Patterns
- Source of truth: database/object store
- Invalidate on write; publish invalidation events
- Patterns: time-based TTL, explicit delete, version key, write-through update
// Write normally; bump the namespace version only on invalidation. Readers derive
// keys from the current version, so one INCR atomically orphans the old namespace.
async function setWithVersion(k: string, v: string, ttl: number){
  await redis.set(k, v, { EX: ttl });
}
const bumpVersion = (ns: string) => redis.incr(`${ns}:version`);
Stampede/Dogpile Prevention
// single-flight: only one loader per key
const inflight = new Map<string, Promise<any>>();
export async function cached<T>(k: string, loader: () => Promise<T>, ttl = 300){
const cv = await redis.get(k); if (cv) return JSON.parse(cv) as T;
if (inflight.has(k)) return inflight.get(k)! as Promise<T>;
const p = loader().then(async v => { await redis.set(k, JSON.stringify(v), { EX: ttl }); inflight.delete(k); return v; })
.catch(e => { inflight.delete(k); throw e; });
inflight.set(k, p); return p;
}
// mutex lock
async function withLock(lockKey: string, fn: () => Promise<void>, ttlMs = 2000){
  const token = crypto.randomUUID()  // unique token so we never delete someone else's lock
  const ok = await redis.set(lockKey, token, { NX: true, PX: ttlMs })
  if (!ok) return
  // Unlock only if we still hold the lock (see the safe-unlock Lua script in Appendix N)
  try { await fn() } finally { if (await redis.get(lockKey) === token) await redis.del(lockKey) }
}
Hot Key Mitigation
- Shard key with suffixes; client-side consistent hashing
- Use a local in-process cache for ultra-hot items
- Cap TTL and use probabilistic early refresh
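A sketch of the suffix-sharding idea, assuming a node-redis v4 client; reads pick a random replica while writes fan out to all of them (the shard count is illustrative):
const SHARDS = 8
// Reads spread load across N copies of the hot value
const readHot = (k: string) => redis.get(`${k}:${Math.floor(Math.random() * SHARDS)}`)
// Writes must update every copy so no reader sees a missing shard
async function setHot(k: string, v: string, ttl: number){
  await Promise.all(Array.from({ length: SHARDS }, (_, i) => redis.set(`${k}:${i}`, v, { EX: ttl })))
}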
Negative and Partial Caching
- Negative caching: cache not-found (404) for short TTL to prevent backend hits
- Partial responses: cache fragments (e.g., GraphQL fields) with keys
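A negative-caching sketch: not-found results are stored as a sentinel with a short TTL so repeated lookups for missing IDs skip the backend (`loadUser`, the sentinel, and the TTLs are illustrative):
const NOT_FOUND = '__nf__' // sentinel marking a cached miss
async function findUser(id: string) {
  const key = `user:v1:${id}`
  const hit = await redis.get(key)
  if (hit === NOT_FOUND) return null        // cached 404: skip the backend
  if (hit) return JSON.parse(hit)
  const user = await loadUser(id)           // hypothetical backend loader
  if (!user) { await redis.set(key, NOT_FOUND, { EX: 30 }); return null }
  await redis.set(key, JSON.stringify(user), { EX: 300 })
  return user
}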
Stale-While-Revalidate (SWR)
export async function swr<T>(k: string, loader: () => Promise<T>, ttl = 60, swrTtl = 300){
  const v = await redis.get(k)
  if (v) {
    // Entries are written with EX = ttl + swrTtl. Once the remaining TTL drops
    // below swrTtl the fresh window has passed: refresh in the background.
    const remaining = await redis.ttl(k)
    if (remaining < swrTtl) loader().then(nv => redis.set(k, JSON.stringify(nv), { EX: ttl + swrTtl })).catch(() => {})
    return JSON.parse(v) as T
  }
  const nv = await loader()
  await redis.set(k, JSON.stringify(nv), { EX: ttl + swrTtl })
  return nv
}
Rate Limiting Backed by Caches
import { RateLimiterRedis } from 'rate-limiter-flexible'
const rl = new RateLimiterRedis({ storeClient: redis, keyPrefix: 'rl', points: 60, duration: 60 })
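A possible Express middleware wiring for the limiter above; `consume` rejects once a key's points are spent within the window:
app.use(async (req, res, next) => {
  try { await rl.consume(req.ip ?? 'anonymous'); next() }   // 1 point per request
  catch { res.set('Retry-After', '60').sendStatus(429) }    // budget spent: back off
})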
Probabilistic Data Structures
// Bloom filter for existence checks (use RedisBloom module in prod)
// HyperLogLog for approximate unique counts
await redis.pfadd('hll:users', userId)
const approx = await redis.pfcount('hll:users')
Redis: Topologies and Persistence
- Standalone for dev, Sentinel for HA failover, Cluster for sharding
- Persistence: AOF (append-only) vs RDB snapshots; combine for durability
- TLS and ACLs; network policies; isolate from public internet
# redis.conf snippets
aof-use-rdb-preamble yes
maxmemory 4gb
maxmemory-policy allkeys-lfu
port 0              # disable the plaintext listener when serving TLS on 6379
tls-port 6379
# Sentinel
sentinel monitor mymaster 10.0.1.10 6379 2
Redis Pub/Sub and Streams
// Pub/Sub invalidation
await redis.publish('cache:invalidate', key)
// Streams for events
await redis.xadd('events', '*', 'type', 'order_created', 'order_id', orderId)
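On the consuming side, each app instance might subscribe and drop its local copy; a sketch assuming node-redis v4 (subscribers need a duplicated connection) and an in-process `localCache` map:
const sub = redis.duplicate()   // Pub/Sub requires a dedicated connection in node-redis v4
await sub.connect()
await sub.subscribe('cache:invalidate', (key) => {
  localCache.delete(key)        // the writer already deleted the Redis entry
})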
Memcached Basics
- LRU eviction; no persistence; simple strings; multi-get support
- Use for ephemeral app-level caching where durability not required
import memjs from 'memjs'
const mc = memjs.Client.create()
await mc.set(key, Buffer.from(JSON.stringify(v)), { expires: 300 })
CDN Caching (CloudFront/Cloudflare/Fastly)
Cache-Control: public, max-age=600, s-maxage=1200, stale-while-revalidate=300
ETag: "abc123"
Vary: Accept-Encoding, Accept-Language
- Signed URLs/cookies to protect private content
- Invalidate on deploy; use versioned asset names
# CloudFront invalidation
aws cloudfront create-invalidation --distribution-id D123 --paths "/app/*"
Reverse Proxies: NGINX/Varnish
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=STATIC:100m inactive=60m use_temp_path=off;
server {
location / {
proxy_cache STATIC;
proxy_cache_key "$scheme$request_method$host$request_uri";
proxy_cache_valid 200 302 10m;
proxy_cache_valid 404 1m;
add_header X-Cache-Status $upstream_cache_status;
proxy_pass http://app;
}
}
sub vcl_backend_response {
set beresp.ttl = 10m;
if (beresp.status == 404) { set beresp.ttl = 60s; }
}
Edge Caching and API Caching
- Use Cloudflare Workers/CloudFront Functions for header manipulation and SWR
- Cache API GET responses with short TTL, vary by auth/tenant
// Cache API responses by tenant
function apiKey(tenant: string, path: string){ return `api:${tenant}:${path}` }
GraphQL Caching
- Persisted queries; cache query+variables signature
- Per-field caching and dataloaders to batch backend calls
const cacheKey = `gql:${hash(query+JSON.stringify(variables))}`
Database and Materialized Views
CREATE MATERIALIZED VIEW mv_orders_1h AS
SELECT date_trunc('hour', created_at) AS h, SUM(amount) AS revenue
FROM orders WHERE created_at >= now() - interval '7 days'
GROUP BY 1;
- Refresh policies aligned to freshness SLAs; invalidate dependent caches
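One way to align both layers, as a sketch: a scheduled job refreshes the view, then bumps a version key so dependent cache entries rebuild lazily (`db.query` and the key name are illustrative):
async function refreshOrdersRollup(){
  await db.query('REFRESH MATERIALIZED VIEW CONCURRENTLY mv_orders_1h')
  await redis.incr('orders:version')  // readers derive keys from this version, so old entries are orphaned
}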
Application Caches
// In-process LRU cache for a small hot set (lru-cache v10+ uses a named export)
import { LRUCache } from 'lru-cache'
const lru = new LRUCache<string, any>({ max: 5000, ttl: 1000 * 60 })
Multi-Tenant Namespacing and Security
function tenantKey(tenant: string, k: string){ return `${tenant}:${k}` }
- TLS to cache servers; ACL roles; network isolation; avoid PII storage
Observability: Metrics and Dashboards
import client from 'prom-client'
const hits = new client.Counter({ name: 'cache_hits_total', help: 'hits', labelNames: ['cache'] })
const misses = new client.Counter({ name: 'cache_misses_total', help: 'misses', labelNames: ['cache'] })
const backends = new client.Counter({ name: 'backend_requests_total', help: 'backend' })
sum(rate(cache_hits_total[5m])) / (sum(rate(cache_hits_total[5m])) + sum(rate(cache_misses_total[5m])))
{
"title": "Cache Ops",
"panels": [
{"type":"stat","title":"Hit Ratio","targets":[{"expr":"sum(rate(cache_hits_total[5m]))/(sum(rate(cache_hits_total[5m]))+sum(rate(cache_misses_total[5m])))"}]},
{"type":"timeseries","title":"Backend Offload","targets":[{"expr":"sum(rate(backend_requests_total[5m]))"}]}
]
}
OTEL Traces for Cache Layers
span.addEvent('cache.get', { key: k })
span.addEvent('cache.miss', { key: k })
span.addEvent('backend.fetch', { ms: 45 })
HA/DR and Autoscaling
- Redis Sentinel or managed services (ElastiCache/Memorystore/Azure Cache)
- Redis Cluster for sharding; reshard on growth
- Autoscale based on memory usage, CPU, and latency
- Set proper maxmemory and eviction policy
Cost Modeling
provider,tier,memory_gb,usd_month
elasticache,cache.r6g.large,12.3,110
memorystore,standard-1,12.0,120
azure,standard_c3,12.0,115
- Improve hit ratio to offload backends; tune TTL and keys
Runbooks and SOPs
Stampede Event
- Identify hot key; enable single-flight; prewarm cache; increase TTL; add SWR
Hit Ratio Drop
- Review keys; increase TTL; reduce fragmentation; cache partials; add negative caching
Latency Spike
- Inspect network and CPU; adjust maxmemory-policy; scale up/out; reduce serialization overhead
Related Posts
- AWS Architecture Patterns: Well-Architected Framework (2025)
- API Security: OWASP Top 10 Prevention Guide (2025)
- ClickHouse Analytics Database Performance Guide (2025)
Call to Action
Need help optimizing caching? We design, benchmark, and operate caches at scale with robust observability and cost controls.
Extended FAQ (1–150)
- Cache size? Size to hold the hot working set; monitor hit ratio.
- TTL length? Balance freshness vs offload; start with minutes for dynamic content.
- LRU vs LFU? LFU for frequency-heavy workloads; LRU is simpler.
- When to use negative caching? For expensive lookups that frequently miss.
- SWR for APIs? Yes; serve stale while refreshing in the background.
- Stampede edges? Jitter TTLs; request coalescing; locks.
- Hot key? Shard and prewarm; local LRU.
- Redis persistence? AOF + RDB for durability; test failover.
- Sentinel vs Managed? Managed for lower ops burden; Sentinel for DIY.
- CDN ETag or versioned assets? Versioned assets preferred; ETags for validation.
Appendix A — Reference Implementations by Language
A.1 Node.js (Express/Fastify)
import express from 'express'
import { createClient } from 'redis'
import { LRUCache } from 'lru-cache'
const app = express()
app.use(express.json()) // required for req.body in the POST handler below
const redis = createClient({ url: process.env.REDIS_URL, socket: { tls: true } })
await redis.connect()
const localCache = new LRUCache<string, any>({ max: 5000, ttl: 60_000 })
function k(ns: string, id: string, ver = 'v1'){ return `${ns}:${ver}:${id}` }
async function getCached<T>(key: string, loader: () => Promise<T>, ttl = 120){
const lc = localCache.get(key); if (lc) return lc as T
const rv = await redis.get(key); if (rv) { const v = JSON.parse(rv) as T; localCache.set(key, v); return v }
const value = await loader();
await redis.set(key, JSON.stringify(value), { EX: ttl });
localCache.set(key, value)
return value
}
app.get('/products/:id', async (req, res) => {
const id = req.params.id
const key = k('product', id)
const data = await getCached(key, async () => fetchProductFromDB(id))
res.json(data)
})
app.post('/products/:id', async (req, res) => {
const id = req.params.id
const body = req.body
await updateProductInDB(id, body)
await Promise.all([
redis.del(k('product', id)),
redis.publish('cache:invalidate', k('product', id))
])
res.sendStatus(204)
})
// SWR helper with probabilistic early refresh
function shouldRefresh(ttlRemaining: number){
const p = Math.exp(-ttlRemaining / 30) // refresh more likely near expiry
return Math.random() < p
}
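Wiring `shouldRefresh` into a read path might look like this sketch; requests near expiry occasionally trigger a background reload while still returning the cached value:
async function getWithEarlyRefresh<T>(key: string, loader: () => Promise<T>, ttl = 120){
  const cached = await redis.get(key)
  if (cached) {
    if (shouldRefresh(await redis.ttl(key))) {
      // fire-and-forget; refresh errors must not affect this request
      loader().then(v => redis.set(key, JSON.stringify(v), { EX: ttl })).catch(() => {})
    }
    return JSON.parse(cached) as T
  }
  const v = await loader()
  await redis.set(key, JSON.stringify(v), { EX: ttl })
  return v
}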
A.2 Python (FastAPI/Django)
import os, json
from redis import asyncio as aioredis  # aioredis was merged into redis-py as redis.asyncio
from fastapi import FastAPI
app = FastAPI()
# Use a rediss:// URL to enable TLS
redis = aioredis.from_url(os.getenv('REDIS_URL'), encoding='utf-8', decode_responses=True)
async def cache_get_or_set(key: str, loader, ttl: int = 120):
v = await redis.get(key)
if v: return json.loads(v)
data = await loader()
await redis.set(key, json.dumps(data), ex=ttl)
return data
@app.get('/users/{uid}')
async def user(uid: str):
key = f'user:v1:{uid}'
return await cache_get_or_set(key, lambda: load_user(uid))
A.3 Go (Gin/Fiber)
var rdb = redis.NewClient(&redis.Options{Addr: os.Getenv("REDIS_ADDR"), TLSConfig: &tls.Config{InsecureSkipVerify: false}})
func CacheGetOrSet(ctx context.Context, key string, ttl time.Duration, loader func() (any, error)) (any, error) {
if val, err := rdb.Get(ctx, key).Result(); err == nil {
var v any; json.Unmarshal([]byte(val), &v); return v, nil
}
v, err := loader(); if err != nil { return nil, err }
b, _ := json.Marshal(v); rdb.Set(ctx, key, string(b), ttl)
return v, nil
}
A.4 Java (Spring Boot)
@EnableCaching
@SpringBootApplication
public class App {}
@Service
public class ProductService {
@Cacheable(value = "product", key = "#id", cacheManager = "redisCacheManager")
public Product getProduct(String id) { return repo.load(id); }
}
A.5 .NET (ASP.NET Core)
builder.Services.AddStackExchangeRedisCache(options => { options.Configuration = redisConn; });
public class CachedService {
private readonly IDistributedCache _cache;
public CachedService(IDistributedCache cache){ _cache = cache; }
public async Task<T> GetOrSet<T>(string key, Func<Task<T>> loader, TimeSpan ttl){
var v = await _cache.GetStringAsync(key);
if (v != null) return JsonSerializer.Deserialize<T>(v)!;
var data = await loader();
await _cache.SetStringAsync(key, JsonSerializer.Serialize(data), new DistributedCacheEntryOptions{ AbsoluteExpirationRelativeToNow = ttl });
return data;
}
}
A.6 Rust (Actix)
let client = redis::Client::open(redis_url).unwrap();
let mut con = client.get_connection().unwrap();
redis::cmd("SET").arg(&key).arg(&payload).arg("EX").arg(ttl).execute(&mut con);
Appendix B — Redis Configuration Cookbook
# Memory and eviction
maxmemory 12gb
maxmemory-policy allkeys-lfu
# Persistence
appendonly yes
appendfsync everysec
save 900 1 300 10 60 10000
# Security
requirepass ${REDIS_PASSWORD}
aclfile /etc/redis/users.acl
protected-mode yes
# TLS (port 0 disables the plaintext listener when TLS is served on 6379)
port 0
tls-port 6379
tls-cert-file /etc/redis/tls/tls.crt
tls-key-file /etc/redis/tls/tls.key
tls-ca-cert-file /etc/redis/tls/ca.crt
# Cluster
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
Appendix C — Cloudflare Workers/Edge Examples
export default {
async fetch(req, env, ctx) {
const url = new URL(req.url)
const key = `edge:v1:${url.pathname}`
let v = await env.CACHE_KV.get(key)
if (v) return new Response(v, { headers: { 'Cache-Control': 'public, max-age=60, stale-while-revalidate=300' }})
const origin = await fetch(`https://origin.example.com${url.pathname}`)
v = await origin.text()
ctx.waitUntil(env.CACHE_KV.put(key, v, { expirationTtl: 60 }))
return new Response(v, origin)
}
}
Appendix D — Benchmarks and Load Testing
# load.js (run with: k6 run load.js)
import http from 'k6/http'
import { sleep, check } from 'k6'
export const options = { vus: 200, duration: '5m' }
export default function(){
const r = http.get('https://api.example.com/products/42')
check(r, { 'status 200': (res) => res.status === 200 })
sleep(1)
}
Measure:
- P50/P95 latency
- Backend offload (origin QPS vs edge QPS)
- Cache hit ratio (layer-specific)
- CPU/mem for caches
Appendix E — Advanced Patterns
Soft TTL + Background Refresh
type Entry<T> = { data: T, hardExp: number, softExp: number }
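A sketch of how this Entry shape drives soft TTL: past softExp the value is served stale while a background refresh runs; past hardExp it counts as a miss (times in ms, node-redis v4 assumed):
async function getSoft<T>(key: string, loader: () => Promise<T>, softMs = 60_000, hardMs = 300_000): Promise<T> {
  const make = (data: T): Entry<T> => ({ data, softExp: Date.now() + softMs, hardExp: Date.now() + hardMs })
  const raw = await redis.get(key)
  const entry: Entry<T> | null = raw ? JSON.parse(raw) : null
  if (entry && Date.now() < entry.hardExp) {
    if (Date.now() >= entry.softExp) {
      // Stale but servable: refresh in the background and keep serving
      loader().then(d => redis.set(key, JSON.stringify(make(d)), { PX: hardMs })).catch(() => {})
    }
    return entry.data
  }
  const d = await loader()
  await redis.set(key, JSON.stringify(make(d)), { PX: hardMs })
  return d
}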
Jittered Expiration
const jitter = (ttl: number) => Math.floor(ttl * (0.9 + Math.random()*0.2))
Request Coalescing with Abort
const inFlight = new Map<string, { p: Promise<any>, c: AbortController }>()
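A coalescing sketch on top of that registry: concurrent callers share one in-flight load, and outstanding loads can be aborted (the loader is assumed to accept an AbortSignal):
async function coalesced<T>(key: string, load: (signal: AbortSignal) => Promise<T>): Promise<T> {
  const existing = inFlight.get(key)
  if (existing) return existing.p as Promise<T>              // join the in-flight load
  const c = new AbortController()
  const p = load(c.signal).finally(() => inFlight.delete(key))
  inFlight.set(key, { p, c })
  return p
}
const abortAll = () => inFlight.forEach(({ c }) => c.abort())  // e.g., on shutdown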
Appendix F — Security and Compliance
- Do not store PII in caches; if unavoidable, encrypt at rest and in transit.
- Rotate credentials; short-lived tokens; scoped ACLs.
- Audit logs of cache administrative commands.
Appendix G — Observability Playbook
# Layered hit ratio
sum by(layer) (rate(cache_hits_total[5m])) / (sum by(layer) (rate(cache_hits_total[5m])) + sum by(layer) (rate(cache_misses_total[5m])))
# OTEL semantic attributes
cache.system: redis|memcached|cdn
cache.op: get|set|del
cache.hit: true|false
Appendix H — HA/DR Scenarios
- Primary node failure: sentinel/managed failover within 3–10s.
- Region outage: active-active with global traffic manager; eventual consistency for caches.
- Cold start: prewarm hot keys via job.
Appendix I — Cost Optimization
- Prefer LFU to retain long-tail hotset.
- Compress large JSON blobs or store fields separately.
- Use short TTL for low-repeat endpoints.
- Offload at CDN/edge when possible.
Appendix J — API Gateway and GraphQL
// Apollo Server persisted queries + cache
// REST: vary by auth scope/tenant
Appendix K — Database Integration
-- PostgreSQL: refresh MV and bump version key
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_orders_1h;
Appendix L — NGINX/Varnish Advanced
# NGINX: flag authenticated requests so they skip the cache
map $http_authorization $skip_cache { default 1; "" 0; }
# Varnish (vcl_recv): bypass the cache when an Authorization header is present
if (req.http.Authorization) { return (pass); }
Appendix M — Runbooks (Detailed)
Cache Stampede Mitigation
- Enable single-flight per key
- Roll out SWR for hot namespaces
- Increase TTL with jitter
- Warm keys via a background job
Hit Ratio Regression
- Compare key cardinality before/after deploy
- Inspect top-miss endpoints
- Add negative caching for common 404s
- Evaluate per-field caching in GraphQL
Latency Regression
- Inspect the network path to the cache
- Raise connection pool size and enable pipelining
- Reduce payload size; compress JSON
Extended FAQ (continued)
- How to choose TTLs per route? Model by freshness SLA and consumption patterns; start small, measure.
- Should I cache POST responses? Rarely; only if idempotent and safe; prefer GET.
- Can I cache authenticated content at the CDN? Yes, with signed cookies/headers and a cache key scoped to user/session when safe.
- How to avoid stale writes with write-behind? Enforce a durable queue, retries, and idempotency keys; monitor lag.
- How big can values be in Redis? Keep under 512KB ideally; split large docs into fragments.
- JSON vs MessagePack? MessagePack is smaller/faster; ensure language support.
- How to cache GraphQL? Persisted queries; per-field cache; dataloaders; federation-layer cache.
- Which Redis eviction policy? LFU for recency+frequency; validate with production traces.
- Should I cluster or scale up? Cluster at >75% memory or CPU saturation; also for multi-tenant isolation.
- TLS overhead? Negligible within a VPC; use keep-alive and pooling.
- Prevent thundering herd at the edge? Use SWR, serve-stale, and coalescing in the edge worker.
- Normalize keys? Yes; sort query params; lowercase host; consistent delimiters.
- Version invalidation? Bump the namespace version on deploy to wipe the entire space atomically.
- Can I rely only on TTL? Often; combine with explicit delete for critical writes.
- Rate limiting in cache? Token bucket with Redis scripts or libraries; per-tenant keys.
- Cache warming? Precompute hot pages on deploy; schedule jobs.
- What is stale-if-error? Serve stale content if the backend returns 5xx; mark and alert.
- What is dogpile protection? Mechanisms that stop multiple requests from regenerating the same item.
- Negative cache TTL? Short: typically 10–60s.
- Multi-region cache? Region-local caches with async replication if necessary; beware consistency.
- Cross-DC invalidation? Pub/Sub via a global bus; CRDT-style counters for idempotency.
- CDN and APIs? Cache GET with conservative TTL; vary by headers.
- How to log cache events? Emit structured logs with key namespace and hit/miss outcome.
- Cache poisoning risks? Strict key normalization and validation; signed keys if user-influenced.
- Avoid key collisions? Namespace + version + delimiter discipline; hash long parts.
- Per-user cache? Consider session-local caches; purge on logout or profile update.
- Redis Streams for invalidation? Yes; consumers per service; ensure delivery semantics.
- Async cache population? A background job consumes a queue and populates keys.
- Binary values? Store as base64 or raw buffers; mind size.
- ETag vs Last-Modified? ETag is stronger; Last-Modified is easier; support both.
- CDN private content? Use signed URLs and short TTL with origin auth.
- Mobile clients and caching? Cache-Control headers; respect offline behavior; service workers.
- Service Worker caching? Cache-first for static; network-first with fallback for dynamic.
- Vary pitfalls? It explodes cache cardinality; keep Vary minimal.
- Prefetching? Predictive prefetch for next pages when bandwidth is idle.
- Backpressure? Limit concurrent refreshers; queue overflow policies.
- Redis timeouts? Tune socket/connect timeouts; circuit-break on failures.
- Circuit breakers with cache? Trip if the backend is unhealthy; serve stale and degrade gracefully.
- Observability golden signals? Hit ratio, latency, errors, capacity, evictions, offload.
- Alert thresholds? Hit-ratio drop >10% for 10m; latency above the P95 SLO; evictions > N/min.
- What is the one rule of caching? There are two hard things in CS: cache invalidation and naming things.
Appendix N — Redis Lua Scripts (Atomic Ops)
-- Rate limit: N requests per window
-- KEYS[1] = key, ARGV[1] = windowSeconds, ARGV[2] = limit
local current = redis.call('INCR', KEYS[1])
if tonumber(current) == 1 then
redis.call('EXPIRE', KEYS[1], ARGV[1])
end
if tonumber(current) > tonumber(ARGV[2]) then
return {err = 'rate_limited'}
end
return current
-- Mutex with TTL
-- KEYS[1] = lockKey, ARGV[1] = token, ARGV[2] = ttl
if redis.call('SET', KEYS[1], ARGV[1], 'NX', 'PX', ARGV[2]) then
return 'OK'
else
return nil
end
-- Safe unlock
-- KEYS[1] = lockKey, ARGV[1] = token
if redis.call('GET', KEYS[1]) == ARGV[1] then
return redis.call('DEL', KEYS[1])
else
return 0
end
-- SWR gate: set a short-lived key to signal refresh-in-progress
-- KEYS[1] = swrKey, ARGV[1] = ttlSeconds
return redis.call('SET', KEYS[1], '1', 'NX', 'EX', ARGV[1])
Appendix O — Terraform Modules (Managed Redis)
# modules/elasticache/main.tf
variable "name" { type = string }
variable "node_type" { type = string }
variable "engine_version" { type = string }
variable "num_cache_nodes" { type = number }
resource "aws_elasticache_replication_group" "this" {
replication_group_id = var.name
engine = "redis"
engine_version = var.engine_version
node_type = var.node_type
automatic_failover_enabled = true
multi_az_enabled = true
transit_encryption_enabled = true
at_rest_encryption_enabled = true
parameter_group_name = "default.redis7"
num_cache_clusters = var.num_cache_nodes
}
output "primary_endpoint" { value = aws_elasticache_replication_group.this.primary_endpoint_address }
# modules/memorystore/main.tf
data "google_project" "this" {}
resource "google_redis_instance" "this" {
name = var.name
tier = "STANDARD_HA"
memory_size_gb = var.memory_gb
region = var.region
transit_encryption_mode = "SERVER_AUTHENTICATION"
}
# modules/azure-cache/main.tf
resource "azurerm_redis_cache" "this" {
name = var.name
location = var.location
resource_group_name = var.rg
capacity = 3
family = "C"
sku_name = "Standard"
minimum_tls_version = "1.2"
enable_non_ssl_port = false
}
Appendix P — Helm Chart Values (Redis)
# values.yaml
architecture: replication
auth:
enabled: true
password: ${REDIS_PASSWORD}
master:
persistence:
enabled: true
size: 20Gi
replica:
replicaCount: 2
persistence:
enabled: true
size: 20Gi
resources:
requests:
cpu: 500m
memory: 2Gi
limits:
cpu: 2
memory: 8Gi
Appendix Q — Kubernetes Manifests (Sentinel/Cluster)
apiVersion: apps/v1
kind: Deployment
metadata:
name: redis
spec:
replicas: 1
selector:
matchLabels: { app: redis }
template:
metadata:
labels: { app: redis }
spec:
containers:
- name: redis
image: redis:7
ports: [{ containerPort: 6379 }]
args: ["--appendonly", "yes", "--maxmemory", "4gb", "--maxmemory-policy", "allkeys-lfu"]
resources:
requests: { cpu: "500m", memory: "1Gi" }
limits: { cpu: "1", memory: "4Gi" }
---
apiVersion: v1
kind: Service
metadata:
name: redis
spec:
selector: { app: redis }
ports:
- name: redis
port: 6379
targetPort: 6379
Appendix R — CI/CD Pipelines (GitHub Actions)
name: cache-pipeline
on:
push: { branches: [ main ] }
jobs:
test-and-benchmark:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with: { node-version: '20' }
- run: npm ci
- run: npm run test
- name: Run k6
uses: grafana/k6-action@v0.3.1
with:
filename: k6/load.js
- name: Invalidate CDN
run: aws cloudfront create-invalidation --distribution-id ${{ secrets.DIST }} --paths "/app/*"
Appendix S — CDN Configuration Samples
{
"Behaviors": [
{
"PathPattern": "/api/*",
"AllowedMethods": ["GET", "HEAD"],
"CachedMethods": ["GET", "HEAD"],
"ForwardedValues": {
"QueryString": true,
"Headers": { "Quantity": 2, "Items": ["Authorization", "Accept-Language"] }
},
"MinTTL": 0, "DefaultTTL": 30, "MaxTTL": 300
}
]
}
# Cloudflare rules (pseudo)
rules:
- action: cache
conditions: [ path_matches:/images/* ]
ttl: 86400
- action: bypass_cache
conditions: [ header_present:Authorization ]
Appendix T — Service Worker and Workbox
// sw.js
importScripts('https://storage.googleapis.com/workbox-cdn/releases/6.5.4/workbox-sw.js')
self.addEventListener('install', (e) => { self.skipWaiting() })
self.addEventListener('activate', (e) => { e.waitUntil(self.clients.claim()) })
workbox.precaching.precacheAndRoute(self.__WB_MANIFEST || [])
workbox.routing.registerRoute(
({ request }) => request.destination === 'document',
new workbox.strategies.StaleWhileRevalidate({ cacheName: 'pages' })
)
Appendix U — Next.js Server-Side Caching Patterns
// app/api/products/[id]/route.ts
import { NextResponse } from 'next/server'
// `redis` and `db` are assumed to be shared module-level clients
export const revalidate = 60 // route-segment revalidation interval (ISR-style)
export async function GET(_: Request, { params }: { params: { id: string } }){
const id = params.id
const key = `product:v1:${id}`
const cached = await redis.get(key)
if (cached) return NextResponse.json(JSON.parse(cached), { headers: { 'X-Cache': 'HIT' } })
const data = await db.product.findUnique({ where: { id } })
await redis.set(key, JSON.stringify(data), { EX: 120 })
return NextResponse.json(data, { headers: { 'X-Cache': 'MISS' } })
}
Appendix V — Troubleshooting Matrix
Symptom: Low hit ratio
- Check key normalization; verify TTLs; add negative cache
Symptom: High Redis CPU
- Hot key; enable local LRU; shard key; increase network buffers
Symptom: Latency spikes
- Nagle disabled; pipeline enabled; pool size; TLS session reuse; co-location
Symptom: Excessive evictions
- Increase memory; LFU; reduce key cardinality; compress values
Appendix W — Prometheus Alerts
- alert: CacheHitRatioDrop
expr: (sum(rate(cache_hits_total[10m])) / (sum(rate(cache_hits_total[10m])) + sum(rate(cache_misses_total[10m])))) < 0.7
for: 15m
labels: { severity: warning }
annotations:
description: "Cache hit ratio below 70% for 15m"
- alert: CacheLatencyHigh
expr: histogram_quantile(0.95, sum(rate(cache_request_duration_seconds_bucket[5m])) by (le)) > 0.050
for: 10m
Appendix X — Full Grafana Dashboard JSON (Excerpt)
{
"title": "Caching Overview",
"panels": [
{ "type": "stat", "title": "Hit Ratio", "targets": [{ "expr": "sum(rate(cache_hits_total[5m]))/(sum(rate(cache_hits_total[5m]))+sum(rate(cache_misses_total[5m])))" }] },
{ "type": "timeseries", "title": "Redis CPU", "targets": [{ "expr": "avg(rate(process_cpu_seconds_total{job='redis'}[1m]))" }] },
{ "type": "timeseries", "title": "Evictions", "targets": [{ "expr": "sum(rate(redis_evicted_keys_total[5m]))" }] }
]
}
Appendix Y — Security Checklists
- TLS 1.2+ for all cache connections
- ACLs with least privilege; rotate tokens
- No secrets/PII in cache; if needed, encrypt values
- Audit command usage; disable dangerous commands where possible
- Private networks; firewall rules; deny public access
Appendix Z — SLOs and Error Budgets
- Hit ratio SLO: >= 85% for static assets, >= 65% for APIs
- P95 latency SLO: < 50ms for cache GET
- Availability SLO: 99.95%
- Error budget policies: slow-roll changes when burn rate > 2x over 1h
Long-form FAQ (51–180)
- What is SWR versus ISR? SWR: serve stale while revalidating; ISR: rebuild static pages on an interval in Next.js.
- Can I cache WebSocket data? Generally no; cache the REST endpoints feeding WebSocket publishes.
- Should I cache search results? Yes, with normalized queries; short TTL; respect personalization.
- Why normalize query params? Prevents duplicate keys causing low hit ratios.
- Is gzip worth it? Yes for large text; the CPU tradeoff is minimal at the edge.
- JSON compaction? Remove whitespace; consider binary formats.
- Hash keys? Hash long components to meet key-length limits and avoid PII.
- How to avoid cache poisoning? Validate inputs; rigid key construction; whitelist Vary headers.
- Use RedisJSON? When partial updates/reads are frequent; watch memory overhead.
- RedisGraph? Not for general caching; it is a specialized workload.
- Can LFU degrade? Yes, under scan-heavy workloads; tune lfu-decay-time; monitor.
- Memcached vs Redis latency? Comparable; Redis offers richer ops; test with your stack.
- Multi-tier caches? Edge → regional → app → in-process; ensure coherence or accept staleness.
- Write amplification? Batch writes; use a write-behind queue; compress payloads.
- Quotas per tenant? Namespace limits; monitor memory usage per tenant via key patterns.
- Key scans safe? Use SCAN with a small COUNT in maintenance windows; never KEYS in prod.
- Lua versus transactions? Lua for atomic multi-step logic; MULTI/EXEC for simpler sets.
- Redis Cluster resharding impact? Clients need a cluster-aware driver; expect temporary latency spikes.
- Eviction storms? Raise memory; use LFU; stagger expirations with jitter.
- CDN origin shield? Yes, to reduce origin load and cache misses.
- Origin auth with CDN? Use signed headers at the edge; verify at the origin.
- Cache invalidation bus? A Pub/Sub topic; consumers delete local/in-process entries.
- JSON-LD impact on caching? Treat it as a static asset with a long TTL; invalidate on content change.
- ESI with Varnish? Edge Side Includes allow partial page caching and composition.
- Hybrid rendering? Cache SSR output; hydrate the client with SWR for fresh data.
- Canary cache configs? Roll out to a subset of traffic; measure hit ratio and latency.
- Backfill cache after an incident? Run prewarm jobs for hot routes; monitor backend load.
- LZ4 vs zstd? zstd compresses better; LZ4 is faster; choose per route.
- Redis I/O threads? Enable if CPU-bound on networking; measure.
- Pipelining vs batching? Both reduce RTT; pipelining sends without waiting; batching groups ops.
- Multi-get pattern? Use MGET and fill the local cache; reduces per-key calls.
- Write coalescing? Aggregate frequent small writes in a buffer, then set.
- Cache bloat? Remove unused namespaces; expire old versions; report cardinality.
- Safe TTL increase? Yes; it decreases origin load; beware stale content.
- Cache transactional data? Usually no; it must maintain strict consistency.
- Eventual consistency acceptable? For content and most read-heavy APIs, yes, with guardrails.
- Dedup across tenants? If allowed, use a shared cache with tenant-aware identity.
- Priority-based eviction? Store priority as a score; evict low-priority first via a maintenance job.
- Rate limiter storage? Prefer Redis with Lua; accurate and atomic.
- Sliding window limits? Use sorted sets or leaky-bucket approximations.
- Geo-replication? Managed offerings provide it; evaluate write-latency impacts.
- Cache consistency testing? Replay writes; verify reads match the DB; chaos tests.
- Backoff on backend failures? Exponential backoff; serve stale; trip the circuit.
- Is binary safe in Redis? Yes; drivers support buffers.
- Max key length? Redis allows long keys; keep under ~1KB in practice.
- CDN compression? Enable Brotli; fall back to gzip.
- Cache-busting best practice? Content-hash filenames; never mutable URLs for static assets.
- Dynamic image resizing cache? Cache variants by width/quality; long TTL.
- Cache headers for APIs? Cache-Control with s-maxage and stale-while-revalidate.
- SDP and cache? Service Data Policy: ensure retention rules and PII handling.
- Signed cookies versus headers? Headers are simpler for APIs; cookies for web assets.
- Bot traffic? Rate-limit at the edge; bypass expensive dynamic routes.
- SSR and user-specific data? Split into a cached frame plus client-side fetch for private data.
- gRPC caching? Typically at the application level; limited CDN support.
- HTTP/3 QUIC? Improves edge performance; caching semantics are unchanged.
- HTTP/2 push deprecated? Yes; prefer prefetch/preload hints.
- Preconnect benefits? Reduces connection setup time to CDN/origin.
- Warm TLS sessions? Enable session resumption and keep-alive.
- Origin concurrency? Limit it to protect the DB; rely on the cache to queue demand.
- Idempotency keys? Essential for write-through and retryable ops.
- Key rotation? Versioned namespaces; scheduled cleanup of old versions.
- Payload encryption? Only when needed; understand the CPU overhead and key management.
- Hash collisions? Use robust hashes (SHA-256); include the namespace; low risk.
- Cache line alignment? Not applicable; optimize serialization instead.
- Redis modules? RedisBloom, RedisJSON, RediSearch when applicable; watch memory.
- Local cache coherence? Invalidate via Pub/Sub; cap TTLs; accept brief staleness.
- DB caching versus app caching? Combine them: materialized views plus app/edge caches.
- Blue/green cache migration? Run both; mirror writes; cut over when warm.
- Canary invalidation? Test purge strategies on a subset of routes.
- Cache layer ownership? The platform team owns infra; product teams own keys/policies.
- Tagged invalidation? Maintain a tag→keys mapping; purge by tag.
- Redis memory fragmentation? Restart off-peak; upgrade the allocator; measure.
- Worker pools? Size to CPU; avoid sync I/O; use async clients.
- Backpressure to clients? Return 429 with Retry-After when rate limited.
- Cacheable errors? Short TTL for 404/410; avoid caching 500s unless using stale-if-error.
- ETag weak vs strong? Strong for exact match; weak for semantically equivalent content.
- Mobile bandwidth saver? Longer TTL for large assets; respect the Save-Data header.
- Privacy mode? Skip caching when DNT or private browsing is detected (policy dependent).
- Feature flags in cache keys? Only when content differs; otherwise keep them out to avoid bloat.
- Structured logging fields? cache_layer, cache_op, key_ns, hit, latency_ms, size_bytes.
- Key compression? Hash long tails; store a mapping for debugging.
- Cache chooser? Policy-based: memory threshold, latency SLO, tenant priority.
- Request collapse across instances? Use Redis locks or a shared single-flight registry.
- How to track top keys? Keyspace notifications plus sampling; external telemetry.
- Eviction policy per namespace? Run separate instances or databases per class.
- Cache warmup duration? Measure QPS to hot keys; complete before peak traffic.
- Multi-tenant cost allocation? Track per-tenant bytes and ops; showback/chargeback.
- Do I need Ristretto/ARC? Try LFU first; specialized algorithms only if the workload demands them.
- Zstandard levels? Levels 3–6 for balanced performance.
- Redis latency percentile goals? P95 < 5–10ms intra-region.
- CDN stale-if-error value? 300–600s is typical.
- Does Vary: User-Agent make sense? Avoid unless critical; huge cardinality.
- Cache TTL in the DB? Store per-record freshness hints for app logic.
- Compress HTML? Yes; minify; ensure no layout shifts from inline CSS changes.
- Image formats? Prefer AVIF/WebP; negotiate via the Accept header, and Vary when necessary.
- COOP/COEP and caching? Security headers; caching behavior is unchanged.
- Cache debug endpoint? Internal only; returns hit/miss metrics per route.
- Per-route revalidation? HEAD with If-None-Match for cheap validation.
- Client hints? Use Accept-CH to guide asset variants.
- Serve stale during backend deploys? Yes; avoid spikes; purge incrementally once healthy.
- Time to live versus time to idle? TTI resets on access; Redis TTL is absolute unless app-managed.
- Redis SCAN schedule? Nightly off-peak; small COUNT batches.
- KV store limits? Know provider quotas for key count, size, throughput.
- CDN shielding layers? Edge → shield → origin; reduces origin load.
- Locality-aware routing? Send users to the nearest edge/region for latency.
- Key leaks in logs? Mask sensitive parts; never log values.
- IP-based caching? Avoid; unstable and privacy-sensitive.
- IPv6 differences? None specific to caching.
- Cache control for previews? No-store; bypass all caches; add X-Robots-Tag: noindex.
- RFC compliance? Honor RFC 7234 semantics for shared caches.
- Redis geo-redundancy? Active-active with CRDTs is not typical for caching; prefer per-region caches.
- KV service versus Redis? Simple KV (DynamoDB DAX/Cloudflare KV) for long-lived items; Redis for hot paths.
- Split large values? Chunk them; store an index; fetch partials.
- CDN functions for personalization? Edge compute injects headers, not full user data.
- Preconnect versus DNS-prefetch? Preconnect includes the TLS handshake; more impactful.
- CDN cache key normalization? Lowercase, strip tracking params, sort query params.
- Replacement-policy tests? Simulate memory pressure; compare LRU/LFU hit ratios.
- Cache layering anti-pattern? Too many tiers without observability; debugging complexity.
- Redis eviction telemetry? redis_evicted_keys_total, used_memory, keyspace_hits.
- Binary protocol benefits? Memcached's binary protocol is efficient; Redis RESP3 improved too.
- Cache warming on blue/green? Warm both; cut over when both are stable.
- Retry storms? Cap retries; jitter; circuit-break.
- CDN purge strategies? Prefix-purge cautiously; prefer versioned assets.
- PCI/GDPR concerns? No PAN/PII in caches; DSR workflows to purge.
- S3 + CloudFront OAC? Yes; private buckets with Origin Access Control; cache long.
- Headers to avoid in Vary? Authorization, Cookie (unless necessary), User-Agent.
- Max stale budget? Define per route; e.g., 5m for blogs, 30s for prices.
- Backend pressure forecast? Use offload metrics to predict capacity needs.
- Instrumentation overhead? Sample at 1–10% if volume is high.
- Final advice? Measure, iterate, and treat caching as a product with owners and SLOs.
Appendix AA — Client Configuration Guide (Advanced)
- TCP settings: keepalive, nodelay, backoff, pool size, pipeline length
- Serialization: JSON vs MessagePack vs Protobuf; field-level compression
- Retries: exponential backoff with jitter, max attempts, idempotency keys
- Timeouts: connect/read/write; set sane defaults (50–200ms)
- Circuit breakers: half-open probes; serve-stale on open
// Node Redis advanced
const client = createClient({
url: process.env.REDIS_URL,
socket: { reconnectStrategy: (retries) => Math.min(retries * 50, 1000), keepAlive: 5000, noDelay: true }
})
Appendix AB — Compression Strategies
- Content-aware: compress HTML/JSON; skip already-compressed (JPEG/MP4)
- Dictionary-based (zstd DICT) for repetitive JSON structures
- Per-route compression levels; monitor CPU cost
import { compress, decompress } from 'lzutf8'
const enc = (o: any) => Buffer.from(compress(JSON.stringify(o)))
const dec = (b: Buffer) => JSON.parse(decompress(b))
Appendix AC — Testing Harness
// Jest example: ensure cache hit on second call
it('caches product', async () => {
await redis.flushall()
const first = await getProduct('42')
const second = await getProduct('42')
expect(second).toEqual(first)
const hits = await prom.getSingleValue('cache_hits_total') // prom: hypothetical test helper over the metrics registry
expect(hits).toBeGreaterThan(0)
})
# enforce cache hit-ratio budget gates in CI
scripts/assert-metrics.sh --metric cache_hit_ratio --gte 0.75 --window 5m
Appendix AD — Migration Playbook
- Phase 0: observe baseline (origin-only metrics)
- Phase 1: introduce read-only cache for GETs; measure offload
- Phase 2: add SWR and negative caching; watch stampedes
- Phase 3: write-through for specific hot writes
- Phase 4: edge CDN policies and versioned assets
- Phase 5: per-field GraphQL cache and dataloaders
Appendix AE — Security Auditing
- Quarterly credential rotation; verify all services redeploy
- Pen-test cache poisoning via query param injection
- Verify ACL denies CONFIG/FLUSH on app users
Appendix AF — Runbook Deep Dives
Cache node memory saturation
- Action: raise maxmemory or reduce TTL; prioritize LFU
- Validate: evictions drop, hit ratio stable
CDN purge loop detected
- Action: throttle purges; switch to versioned assets; add backoff
- Validate: origin QPS normalizes
Appendix AG — Case Studies (Summarized)
E-commerce PDP: +35% hit ratio with SWR and negative 404s; P95 -42%
News homepage: edge composition with ESI; origin offload 78%
API pricing: per-tenant keys + LFU; stabilized hot keys; CPU -30%
Appendix AH — Governance and Ownership
- Product teams own cache keys and TTL policies per domain
- Platform SRE owns infra, SLOs, and observability
- Change management: canary cache config, rollback in < 10 minutes
Appendix AI — Full HTTP Header Cookbook
Cache-Control: public, max-age=120, s-maxage=600, stale-while-revalidate=300, stale-if-error=600
Surrogate-Control: max-age=600
ETag: "W/\"a1b2c3\""
Vary: Accept-Encoding, Accept-Language
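Applied in an app, the cookbook headers might be attached per route class; a minimal Express sketch (`renderPost` is illustrative):
app.get('/blog/:slug', (req, res) => {
  res.set({
    'Cache-Control': 'public, max-age=120, s-maxage=600, stale-while-revalidate=300, stale-if-error=600',
    'Vary': 'Accept-Encoding, Accept-Language'
  })
  res.send(renderPost(req.params.slug))
})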
Appendix AJ — Key Normalization Rules
- Lowercase hostnames
- Sort query params; drop tracking params (utm_*, fbclid, gclid)
- Collapse duplicate slashes; ensure trailing slash policy
- Strip fragments (#...)
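The rules above as one function; a sketch that turns a raw URL into a stable cache key (the tracking-param pattern is illustrative):
const TRACKING = /^(utm_|fbclid$|gclid$)/
function normalizeCacheKey(rawUrl: string): string {
  const u = new URL(rawUrl)
  u.hostname = u.hostname.toLowerCase()                  // lowercase host
  u.hash = ''                                            // strip fragments
  u.pathname = u.pathname.replace(/\/{2,}/g, '/')        // collapse duplicate slashes
  const params = [...u.searchParams.entries()]
    .filter(([k]) => !TRACKING.test(k))                  // drop tracking params
    .sort(([a], [b]) => a.localeCompare(b))              // sort for stability
  u.search = new URLSearchParams(params).toString()
  return u.toString()
}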
Appendix AK — SLA Matrix (Examples)
- Static assets: TTL 365d, SWR 30d, versioned filenames
- Blog pages: TTL 10m, SWR 60m, revalidate on publish
- Product detail: TTL 2m, SWR 10m, purge on update
- Search results: TTL 30s, SWR 2m, normalized params
Mega FAQ (181–260)
- Can I cache GraphQL mutations? Generally no; cache the derived read models instead.
- How to debug cache keys quickly? Expose an X-Cache-Key header in non-prod; log samples.
- Should I use consistent hashing? Yes, for client-side sharding and hot-key distribution.
- Can I mix Redis and Memcached? Yes; Memcached for ephemeral objects; Redis for rich features.
- How to prevent stale reads after a user update? Purge the per-user namespace; short TTL for per-user caches.
- Are distributed locks safe? Use Redlock carefully; prefer single-flight and idempotency.
- Best TTL for feature flags? Very short (5–30s), or subscribe to a change stream.
- Handle clock skew? Use server-side expiry (EX/PEX), not client clocks; avoid absolute timestamps.
- Throttle invalidations? Batch and debounce; avoid purge storms.
- Cache admin access? Separate credentials; audit; IP allowlist.
- Data residency? Region-local caches; avoid cross-border PII storage.
- Blue/green cache versions? Use versioned namespaces: v42 vs v43.
- Can I cache 302 redirects? Short TTL; ensure downstream behavior stays correct.
- When to bypass the cache? Admin endpoints, preview modes, personalized pages.
- Hash-busting without redeploy? Support a runtime alias map that points logical → hashed paths.
- What about queues as caches? Different purpose; use caches for random-access reads.
- Cache coherence with WebSockets? Push invalidation events to clients when keys change.
- Should I store JWTs in the cache? Prefer stateless; if needed, store a short-lived blacklist/allowlist.
- Handle GDPR deletion? Index keys by user ID; purge on DSR request.
- Cache images or generate on demand? Both: generate variants on first request and cache with a long TTL.
- Origin 500s and caching? Serve stale-if-error; alert and degrade gracefully.
- Can edge compute write to the origin cache? Yes, via an authenticated API; rate-limit writes.
- Track key age? Use TTL or store a timestamp alongside the value.
- Cache-thrashing detection? High set rate with a low hit ratio; investigate TTLs and key space.
- Split read/write clusters? Yes, at high scale; writes funnel to a subset; replicate to reads.
- Key tag strategy? Maintain a tag index; purge on tag changes (e.g., category).
- Cache schema evolution? Version bump; dual-read during migration; clean up old versions.
- Are Bloom filters production-safe? Yes, with a known false-positive rate; avoid for critical auth.
- HSTS impact on cache? Unrelated; it is a transport security policy.
- Cookies and shared caches? Cookies often disable caching; strip them where possible.
- Partial personalization? ESI or client-side personalization over a cached shell.
- Coalesce backend retries? Yes; centralize via a cache-aware client.
- Cache fragmentation across languages? Normalize serialization; define canonical JSON ordering.
- Are Redis pipelines ordered? Yes; responses arrive in request order.
- Use RESP3? If the driver supports it and the benefits are measured.
- Monitor key sizes? Sample and record the distribution; alert on outliers.
- Multi-tenant noisy neighbor? Quota per tenant; dedicated DB or cluster per tier.
- Cache warmers and rollbacks? A rollback should also revert warmer jobs to the previous version.
- Alternative stores (Hazelcast, Aerospike)? Viable; evaluate ops overhead and latency.
- Invalidate on cron? Avoid blind purges; tie them to content changes.
- Cache invalidation APIs? Provide an internal service with auth, audit, and rate limits.
- Audit schema for caches? Who set a key, when, size, TTL; for troubleshooting.
- Rolling restarts? Stagger them; preserve connection pools; monitor latency.
- Timeout budget? Distribute it among DNS/TLS/connect/request phases.
- TCP versus UNIX sockets? In Kubernetes, TCP; on a single host, UNIX sockets can reduce overhead.
- Multi-get fallback? On partial hits, fetch the misses in parallel and merge.
- Managing consistency with DB replicas? Staleness windows; read-after-write consistency patterns.
- Cacheable authz decisions? Short TTL, scoped to resource+user; invalidate on policy change.
- ETL precompute caches? Yes, for dashboards; refresh on a schedule.
- Shard by tenant vs hash? Tenant for isolation; hash for balance; hybrids are possible.
- Backpressure to the edge? Return 429 with Retry-After; protect the origin.
- Async delete failures? Retry with a DLQ; reconcile periodically.
- System limits? File descriptors, ephemeral ports; tune the kernel.
- NUMA impacts? Pin threads; measure only at extreme scale.
- Service meshes? mTLS adds a little latency; caches are unaffected.
- Lambda@Edge limits? Consider CloudFront Functions for lightweight header logic.
- CDN dedupe? Origin shield helps; edge POPs may still re-fetch.
- Payload canonicalization? Stable JSON field ordering for better compression and diffs.
- Prefetch on hover? Yes, for links; cap concurrency.
- Browser cache control? Short max-age with a longer s-maxage for shared caches.
- Headless CMS and caching? Purge on publish; webhook-triggered invalidation.
- Rate-limit buckets across regions? Choose regional isolation or global state, with tradeoffs.
- API gateway cache? Use cautiously; prefer app-level awareness.
- GraphQL persisted query store? Version and sign it; purge on schema change.
- CDN ACLs? Restrict purge APIs; RBAC.
- Origin auth rotation? Rotate secrets; test the edge → origin integration.
- Warm the service worker cache? Precache the shell; runtime-cache data.
- Device-specific variants? Use Client Hints; avoid full UA Vary.
- Cache high-cardinality metrics? Aggregate server-side; cache aggregates, not raw data.
- JSON-LD caches? Long TTL; purge on content update.
- Datadog/OTEL span attributes? cache.hit, cache.key_ns, cache.layer.
- Binary vs text protocols? Negligible difference for most workloads.
- Redis multi-tenant DB index? Yes; map tenants to DBs; beware the limits.
- Upgrade Redis 6→7? Plan with a replica; test AOF/RDB compatibility.
- AOF rewrite pauses? Monitor; schedule off-peak; tune auto-aof-rewrite.
- S3 as a blob-store sidecar? Store large blobs in S3; cache pointers in Redis.
- Live migration between providers? Dual-write plus backfill; flip reads when warm.
- Cache value checksums? Store CRC32/SHA-256 to detect corruption.
- Per-endpoint budgets? Define a cache SLO per route and monitor adherence.
- When not to cache? Highly dynamic, security-sensitive, or strictly consistent data.