API Gateway Rate Limiting Strategies in 2026: Ocelot, YARP, and Kong

By Elysiate · Updated Apr 3, 2026

api gateway · rate limiting · ocelot · yarp · kong · distributed systems

Level: advanced · ~17 min read · Intent: informational

Audience: backend engineers, platform engineers, security engineers, .NET teams

Prerequisites

  • basic familiarity with API gateways and HTTP middleware
  • working knowledge of distributed systems and caching
  • some experience with production traffic management

Key takeaways

  • Rate limiting protects backend services, improves fairness, and reduces abuse, but the right strategy depends on traffic shape, client identity, and deployment topology.
  • Ocelot, YARP, and Kong all support rate limiting well, but they differ in built-in features, distributed options, and how much custom implementation work teams must own.
  • Distributed storage, clear headers, monitoring, and well-chosen algorithms matter more in production than simply turning on a default limit.

FAQ

Why is rate limiting important at the API gateway?
The gateway is the ideal enforcement point because it can stop abusive or excessive traffic before it reaches backend services, while also applying consistent limits across routes, clients, and environments.
Which rate limiting algorithm is best?
There is no single best algorithm. Fixed windows are simple, sliding windows are fairer, token buckets handle bursts well, and leaky buckets smooth traffic. The right choice depends on your workload and client behavior.
When do I need distributed rate limiting?
You need distributed rate limiting when the gateway runs on multiple instances or nodes and you want limits enforced consistently across the whole deployment rather than per process.
Should I rate limit by IP, API key, or user?
In practice, many systems use a combination. IP-based limits help with DDoS and anonymous traffic, API key limits work well for integrations, and user-based limits are useful for fairness and account-level policy enforcement.
What is the biggest mistake teams make with rate limiting?
A common mistake is enabling simple rate limits without thinking through client identity, burst behavior, distributed consistency, observability, and the user experience of 429 responses.

Rate limiting is one of the most important controls an API gateway can enforce.

It protects backend services from overload, slows down abusive clients, reduces the cost of uncontrolled traffic, and helps ensure that one noisy consumer does not degrade the experience for everyone else. In a microservices environment, that matters even more because a badly managed spike at the edge can cascade into retries, queue buildup, service instability, and full platform incidents.

That is why rate limiting belongs at the gateway.

The gateway is where traffic first enters the platform, which makes it the natural place to apply:

  • shared request quotas,
  • per-client rules,
  • burst controls,
  • and defensive throttling patterns.

But good rate limiting is not just a number in a config file.

Teams need to decide:

  • what identity to rate limit against,
  • which algorithm matches the traffic pattern,
  • how limits are enforced across multiple gateway instances,
  • what headers and retry guidance clients receive,
  • and how rate limiting should interact with authentication, billing, service tiers, and DDoS protection.

This guide explains how to design API gateway rate limiting in 2026, how Ocelot, YARP, and Kong approach the problem, which algorithms matter most, how distributed enforcement changes the design, and what production patterns separate useful controls from fragile ones.

Executive Summary

API gateway rate limiting exists to control request volume before traffic reaches backend services.

It is useful for:

  • backend protection,
  • cost control,
  • abuse mitigation,
  • fairness between clients,
  • and maintaining service quality during spikes.

The main algorithm choices are:

  • fixed window for simplicity,
  • sliding window for smoother fairness,
  • token bucket for controlled bursts,
  • and leaky bucket for more stable traffic shaping.

The practical gateway differences are:

  • Ocelot is easy for basic .NET-centric gateway limits and route-level rules.
  • YARP is strong when teams want full control and are comfortable building on .NET rate limiting middleware.
  • Kong is strongest when teams want richer built-in policies, plugin-based controls, and mature distributed enforcement options.

For most production systems:

  • use gateway rate limiting with a distributed store when multiple instances are involved,
  • decide whether identity is based on IP, user, API key, or tenant,
  • include standard limit headers,
  • and monitor both allowed and blocked traffic continuously.

The hardest part is not enabling rate limiting. It is choosing limits that are protective without breaking legitimate traffic.

Who This Is For

This guide is for:

  • backend engineers running APIs behind a gateway,
  • platform teams managing shared traffic controls,
  • security teams reviewing abuse protection,
  • and .NET teams choosing between Ocelot, YARP, and Kong implementations.

It is especially useful if you are working on:

  • multi-instance gateways,
  • API key or tier-based limits,
  • edge protection,
  • or production rate limiting policies that need to hold up under real traffic.

Why Rate Limiting Matters

A lot of teams think about rate limiting only as an anti-abuse feature.

It is broader than that.

What Rate Limiting Protects

Backend Capacity

Without limits, one aggressive or broken client can flood internal services and cause:

  • latency spikes,
  • timeouts,
  • retry storms,
  • and wasted compute.

Fairness

Rate limits prevent a small subset of consumers from taking a disproportionate share of resources.

Cost

If your platform pays for requests, compute time, storage, or downstream third-party usage, uncontrolled API volume becomes a direct financial problem.

Security

Rate limiting helps reduce:

  • brute-force attempts,
  • credential stuffing,
  • endpoint probing,
  • and some layers of DDoS pressure.

Quality of Service

Well-designed limits create more stable traffic patterns and help preserve performance for well-behaved clients.

That is why rate limiting is as much an availability feature as a security feature.

Choosing the Right Algorithm

Different traffic patterns need different algorithms.

This is one of the most important design decisions.

1. Fixed Window

Fixed window is the simplest model.

A client gets a maximum number of requests in a fixed interval such as:

  • 100 requests per minute,
  • or 1,000 requests per hour.

Strengths

  • simple to understand
  • easy to implement
  • cheap to evaluate

Weaknesses

  • can allow bursts at window boundaries
  • can feel unfair when requests cluster at the end and start of adjacent windows

Fixed window is good for:

  • simple internal APIs
  • lower-complexity enforcement
  • early-stage systems that need easy observability
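
The boundary-burst weakness is easy to demonstrate. A minimal Python sketch (limits illustrative, not tied to any particular gateway):

```python
import math

class FixedWindowLimiter:
    """Allow at most `limit` requests per `window` seconds, keyed by window index."""
    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.counts = {}  # window index -> request count

    def allow(self, now):
        idx = math.floor(now / self.window)
        self.counts[idx] = self.counts.get(idx, 0) + 1
        return self.counts[idx] <= self.limit

limiter = FixedWindowLimiter(limit=100, window=60)

# 100 requests at t=59s and 100 more at t=60s are ALL allowed:
# 200 requests land within roughly one second across the window boundary.
burst = [limiter.allow(59.0) for _ in range(100)] + \
        [limiter.allow(60.0) for _ in range(100)]
print(sum(burst))  # 200 allowed
```

Both batches fall into different window indexes, so each gets its own full quota even though they are one second apart.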

2. Sliding Window

Sliding window tracks requests over a rolling period rather than a hard reset boundary.

Strengths

  • smoother fairness
  • fewer burst artifacts at window boundaries
  • better approximation of actual sustained rate

Weaknesses

  • more complex
  • more expensive to implement at scale

Sliding window is often better when fairness and burst control matter more than simplicity.
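
A popular middle ground is the sliding window counter, which approximates the rolling rate by weighting the previous fixed window by its remaining overlap. A Python sketch (this weighting is the widely used approximation, not an exact request log):

```python
import math

class SlidingWindowCounter:
    """Approximate sliding window: weight the previous fixed window's count by
    its overlap with the rolling window, then add the current window's count."""
    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.counts = {}  # window index -> count

    def allow(self, now):
        idx = math.floor(now / self.window)
        elapsed = (now % self.window) / self.window      # fraction into current window
        prev = self.counts.get(idx - 1, 0)
        curr = self.counts.get(idx, 0)
        estimated = prev * (1 - elapsed) + curr          # weighted rolling estimate
        if estimated >= self.limit:
            return False
        self.counts[idx] = curr + 1
        return True

sw = SlidingWindowCounter(limit=100, window=60)
burst_allowed = sum(sw.allow(59.0) for _ in range(100))
print(burst_allowed)   # 100: the initial burst fits
print(sw.allow(60.0))  # False: the previous window still counts against the rate
```

Unlike the fixed window, a request just after the boundary still sees nearly the full weight of the previous window's burst, so the double-quota artifact disappears.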

3. Token Bucket

Token bucket is one of the most practical production algorithms.

A bucket fills at a steady rate, and requests consume tokens. Bursts are allowed as long as tokens are available.

Strengths

  • supports legitimate bursts
  • smooths long-term usage
  • good fit for real API traffic, which is often uneven

Weaknesses

  • more stateful than basic fixed windows
  • can be misunderstood if the refill model is not documented clearly

Token bucket is often the best default for APIs with spiky but valid traffic.
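
The refill model is worth pinning down precisely, since that is exactly what clients tend to misunderstand. A minimal lazy-refill sketch in Python (capacity and rate are illustrative):

```python
class TokenBucket:
    """Refill `rate` tokens per second up to `capacity`; each request consumes one."""
    def __init__(self, capacity, rate, now=0.0):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)   # start full: an idle client may burst
        self.updated = now

    def try_consume(self, now, n=1):
        # Refill lazily based on elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False

bucket = TokenBucket(capacity=20, rate=5)   # 5 req/s sustained, bursts of 20
print(all(bucket.try_consume(0.0) for _ in range(20)))  # True: burst drains the bucket
print(bucket.try_consume(0.0))                          # False: the 21st is rejected
print(bucket.try_consume(1.0))                          # True: one second refills 5 tokens
```

Capacity bounds the burst, rate bounds the sustained throughput, and the two can be tuned independently.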

4. Leaky Bucket

Leaky bucket smooths traffic by allowing requests into a queue and leaking them out at a fixed pace.

Strengths

  • stabilizes downstream load
  • smooths spikes aggressively

Weaknesses

  • less forgiving for bursty clients
  • can add queuing behavior that complicates user experience

Leaky bucket is often more useful for traffic shaping than for simple quota enforcement.
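
The rejecting variant, sometimes called "leaky bucket as a meter", skips the queue and simply refuses requests that would overflow. A Python sketch of that variant (parameters illustrative):

```python
class LeakyBucket:
    """Leaky bucket as a meter: the level rises by one per request and drains
    at `leak_rate` per second; a request that would overflow is rejected."""
    def __init__(self, capacity, leak_rate, now=0.0):
        self.capacity = capacity
        self.leak_rate = leak_rate
        self.level = 0.0
        self.updated = now

    def allow(self, now):
        # Drain the bucket for the time elapsed since the last request.
        self.level = max(0.0, self.level - (now - self.updated) * self.leak_rate)
        self.updated = now
        if self.level + 1 > self.capacity:
            return False
        self.level += 1
        return True

lb = LeakyBucket(capacity=10, leak_rate=2)
print(all(lb.allow(0.0) for _ in range(10)))  # True: the burst fills the bucket
print(lb.allow(0.0))                          # False: it would overflow
print(lb.allow(0.5))                          # True: half a second of leak frees one slot
```

Note how admission tracks the fixed leak rate rather than a quota window, which is why it shapes downstream load so smoothly.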

Ocelot Rate Limiting

Ocelot is one of the easiest ways for .NET teams to add basic route-level rate limiting at the gateway.

It is configuration-friendly and works well when the rate limiting model is relatively straightforward.

Basic Ocelot Configuration

{
  "Routes": [
    {
      "DownstreamPathTemplate": "/api/{everything}",
      "UpstreamPathTemplate": "/api/{everything}",
      "RateLimitOptions": {
        "ClientWhitelist": [],
        "EnableRateLimiting": true,
        "Period": "1m",
        "PeriodTimespan": 60,
        "Limit": 100
      }
    }
  ]
}

In this config, Period is the quota window, Limit is the request ceiling, and PeriodTimespan is how many seconds a client must wait after exceeding the limit, a detail that is often misread as the window length.

This is useful for:

  • simple per-route fixed-window enforcement,
  • getting a baseline control in place quickly,
  • and smaller gateway footprints.

Per-Client Rate Limiting in Ocelot

Ocelot identifies clients and shapes the rejection response through RateLimitOptions in the GlobalConfiguration section of ocelot.json rather than through code:

{
  "GlobalConfiguration": {
    "RateLimitOptions": {
      "DisableRateLimitHeaders": false,
      "ClientIdHeader": "X-Client-Id",
      "HttpStatusCode": 429,
      "QuotaExceededMessage": "API rate limit exceeded"
    }
  }
}

This works well when:

  • a client identifier is already available,
  • partner integrations use a stable ID,
  • or API keys are mapped to client-level quotas.

Ocelot Trade-Offs

Ocelot is strong for basic and medium-complexity use cases, but it becomes less attractive when:

  • rate limit logic must be deeply custom,
  • distributed consistency needs more advanced control,
  • or adaptive, tiered, and highly dynamic policies become central.

Custom Ocelot Rate Limiting

If the default policy is not enough, the counter logic can be replaced. The sketch below is illustrative rather than Ocelot's exact extension surface: the interface and rule types are simplified, and GetOrCreateCounterAsync is omitted.

public class CustomRateLimitProcessor : IRateLimitProcessor
{
    private readonly IMemoryCache _cache;
    private readonly ILogger<CustomRateLimitProcessor> _logger;

    public CustomRateLimitProcessor(
        IMemoryCache cache,
        ILogger<CustomRateLimitProcessor> logger)
    {
        _cache = cache;
        _logger = logger;
    }

    public async Task<RateLimitCounter> ProcessRequestAsync(
        ClientRequestIdentity identity,
        RateLimitRule rule)
    {
        var key = $"{identity.ClientId}:{rule.Endpoint}";
        var counter = await GetOrCreateCounterAsync(key, rule);

        if (counter.Count >= rule.Limit)
        {
            _logger.LogWarning(
                "Rate limit exceeded for {ClientId} on {Endpoint}",
                identity.ClientId, rule.Endpoint);

            throw new RateLimitExceededException(
                $"Rate limit of {rule.Limit} exceeded");
        }

        counter.Count++;
        // IMemoryCache is synchronous, so Set (not SetAsync) is the correct call.
        _cache.Set(key, counter, TimeSpan.FromSeconds(rule.PeriodTimespan));

        return counter;
    }
}

This allows:

  • per-client custom keys,
  • endpoint-aware quotas,
  • and more tailored logging or rejection behavior.

But as soon as rate limiting becomes code-heavy, some teams start preferring YARP because it is more naturally framework-like.

YARP Rate Limiting

YARP does not package rate limiting the same way as Ocelot.

Instead, it benefits from .NET's rate limiting middleware, which makes it flexible and powerful for teams comfortable writing gateway behavior in code.

Built-In Middleware Approach

builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("api", opt =>
    {
        opt.Window = TimeSpan.FromMinutes(1);
        opt.PermitLimit = 100;
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 10;
    });
});

var app = builder.Build();

app.UseRateLimiter();

app.MapReverseProxy()
    .RequireRateLimiting("api");

This is a strong baseline because:

  • it integrates naturally with ASP.NET Core,
  • it keeps policy composition clear,
  • and it is easy to extend.

Per-Route YARP Policies

builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("api", opt =>
    {
        opt.Window = TimeSpan.FromMinutes(1);
        opt.PermitLimit = 100;
    });

    options.AddFixedWindowLimiter("strict", opt =>
    {
        opt.Window = TimeSpan.FromMinutes(1);
        opt.PermitLimit = 10;
    });
});

Each policy is then attached per route. YARP's route configuration has a first-class RateLimiterPolicy field, so the mapping lives in the ReverseProxy config rather than in custom middleware:

"ReverseProxy": {
  "Routes": {
    "admin-route": {
      "ClusterId": "backend",
      "RateLimiterPolicy": "strict",
      "Match": { "Path": "/api/admin/{**catch-all}" }
    },
    "default-route": {
      "ClusterId": "backend",
      "RateLimiterPolicy": "api",
      "Match": { "Path": "/api/{**catch-all}" }
    }
  }
}

This makes YARP especially attractive when:

  • some routes need stricter protection,
  • auth-sensitive endpoints must be throttled differently,
  • or rate limit logic needs to respond to route shape or claims.

Token Bucket in YARP

YARP becomes especially strong when you want algorithm-level control. .NET already ships a token bucket in System.Threading.RateLimiting, so a per-client bucket needs configuration rather than a custom limiter class:

builder.Services.AddRateLimiter(options =>
{
    options.AddPolicy("token", httpContext =>
        RateLimitPartition.GetTokenBucketLimiter(
            // One bucket per client; fall back to IP for anonymous traffic.
            partitionKey: httpContext.Request.Headers["X-Client-Id"].FirstOrDefault()
                ?? httpContext.Connection.RemoteIpAddress?.ToString()
                ?? "anonymous",
            factory: _ => new TokenBucketRateLimiterOptions
            {
                TokenLimit = 100,                             // burst capacity
                TokensPerPeriod = 10,                         // sustained refill
                ReplenishmentPeriod = TimeSpan.FromSeconds(1),
                AutoReplenishment = true,
                QueueLimit = 0
            }));
});

app.MapReverseProxy()
    .RequireRateLimiting("token");

This is where YARP's code-first model shines.

It lets you build:

  • custom client identity rules,
  • route-aware policies,
  • burst handling,
  • and integration with your broader ASP.NET Core telemetry and middleware stack.

YARP Trade-Offs

The price of this flexibility is that you often own more logic directly. That is excellent for strong teams, but less attractive if you want packaged policy behavior with minimal code.

Kong Rate Limiting

Kong is often the richest out-of-the-box option in this comparison.

Its plugin-based model makes rate limiting feel more like attachable policy than application code.

Basic Kong Rate Limiting

services:
  - name: api-service
    url: https://api.example.com
    routes:
      - name: api-route
        paths:
          - /api
        plugins:
          - name: rate-limiting
            config:
              minute: 100
              hour: 1000
              day: 10000
              policy: local

This is a strong fit when teams want:

  • route-level quota policies,
  • simple config-based control,
  • and less custom code inside the gateway.

Distributed Kong Rate Limiting

plugins:
  - name: rate-limiting
    config:
      minute: 100
      policy: redis
      redis_host: redis.example.com
      redis_port: 6379
      redis_password: secret
      redis_timeout: 2000
      redis_database: 1

This is one of Kong's strengths: distributed rate limiting is a natural part of the platform story rather than something teams must fully invent themselves.

Advanced Kong Options

plugins:
  - name: rate-limiting
    config:
      minute: 100
      hour: 1000
      policy: cluster
      limit_by: consumer
      hide_client_headers: false

This makes Kong attractive when you need:

  • consumer-based limits,
  • cluster-aware enforcement,
  • richer plugin-driven policy composition,
  • and stronger enterprise control surfaces.

Kong Trade-Offs

Kong is operationally heavier than Ocelot or YARP for many .NET teams, but it becomes more compelling when:

  • the environment is larger,
  • the platform is polyglot,
  • or rate limiting is part of a broader API governance strategy.

Distributed Rate Limiting

The moment you have more than one gateway instance, local counters stop being enough.

That is where distributed storage matters.

Why Distributed Rate Limiting Matters

Without shared state, each gateway instance enforces its own limit independently.

That means a client may effectively get 100 requests per minute per node instead of 100 requests per minute globally. With three gateway instances behind a load balancer, the real ceiling is 300.

That is a serious policy gap.
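
A toy model makes the gap concrete (a Python sketch assuming round-robin load balancing; function and parameter names are illustrative):

```python
def allowed_requests(node_count, per_limit, total_requests, shared):
    """Count how many of `total_requests` pass in one window when each of
    `node_count` gateway nodes enforces `per_limit` locally, versus one shared counter."""
    counters = [0] if shared else [0] * node_count
    allowed = 0
    for i in range(total_requests):
        # A load balancer spreads requests round-robin across nodes.
        c = 0 if shared else i % node_count
        if counters[c] < per_limit:
            counters[c] += 1
            allowed += 1
    return allowed

print(allowed_requests(3, 100, 500, shared=False))  # 300: limit multiplied by node count
print(allowed_requests(3, 100, 500, shared=True))   # 100: the intended policy
```

With local counters the effective limit scales with the fleet size, which is rarely what the policy author meant.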

Redis-Based Sliding Window Example

public class RedisRateLimiter : IRateLimiter
{
    private readonly IConnectionMultiplexer _redis;
    private readonly ILogger<RedisRateLimiter> _logger;

    public RedisRateLimiter(
        IConnectionMultiplexer redis,
        ILogger<RedisRateLimiter> logger)
    {
        _redis = redis;
        _logger = logger;
    }

    public async Task<RateLimitResult> CheckRateLimitAsync(
        string key,
        int limit,
        TimeSpan window)
    {
        var db = _redis.GetDatabase();
        var now = DateTimeOffset.UtcNow.ToUnixTimeSeconds();
        var windowStart = now - (long)window.TotalSeconds;

        // Drop entries that have aged out of the rolling window.
        await db.SortedSetRemoveRangeByScoreAsync(key, 0, windowStart);

        var count = await db.SortedSetCountAsync(key, windowStart, now);

        if (count >= limit)
        {
            return new RateLimitResult
            {
                Allowed = false,
                Remaining = 0,
                ResetAt = DateTimeOffset.FromUnixTimeSeconds(now + (long)window.TotalSeconds)
            };
        }

        // Unique member so concurrent requests in the same second are all counted.
        await db.SortedSetAddAsync(key, $"{now}:{Guid.NewGuid():N}", now);
        await db.KeyExpireAsync(key, window);

        return new RateLimitResult
        {
            Allowed = true,
            Remaining = limit - (int)count - 1,
            ResetAt = DateTimeOffset.FromUnixTimeSeconds(now + (long)window.TotalSeconds)
        };
    }
}

This is a strong pattern because it supports:

  • rolling-window fairness,
  • multiple gateway instances,
  • and shared limit state.

Lua for Atomicity

Distributed rate limiting often needs atomic operations so multiple concurrent requests do not race incorrectly.

That is why Redis plus Lua is such a common production pattern.

It makes:

  • removing expired entries,
  • checking the current count,
  • adding the current request,
  • and computing the reset time

happen as one atomic unit, so two concurrent requests cannot both read "99 of 100 used" and both be admitted.
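
The same steps can be sketched in process-local Python to show the shape of the script (a plain dict stands in for the Redis sorted set; the function name is illustrative, and in production the body runs as a single Lua script via EVAL):

```python
import time
import uuid

def check_sliding_window(store, key, limit, window, now=None):
    """Sliding-window log check. These four steps must execute atomically;
    here a dict stands in for the Redis sorted set (member -> timestamp score)."""
    now = time.time() if now is None else now
    entries = store.setdefault(key, {})
    # 1. remove entries that have fallen outside the window
    cutoff = now - window
    for member in [m for m, ts in entries.items() if ts <= cutoff]:
        del entries[member]
    # 2. check the count against the limit
    if len(entries) >= limit:
        oldest = min(entries.values())
        return False, oldest + window          # blocked; when the window frees up
    # 3. add the current request with a unique member
    entries[uuid.uuid4().hex] = now
    # 4. compute the reset time
    return True, now + window

store = {}
results = [check_sliding_window(store, "client-1", 3, 60, now=100.0)[0] for _ in range(4)]
print(results)  # [True, True, True, False]
```

In Redis the same four steps map to ZREMRANGEBYSCORE, ZCARD, ZADD, and a computed TTL, all inside one script.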

What Identity Should You Limit By?

This is often more important than the algorithm itself.

Common Identity Keys

IP Address

Good for:

  • anonymous traffic
  • basic DDoS defense
  • public unauthenticated endpoints

Weakness:

  • many users may share one IP
  • mobile and proxy environments make it imperfect

API Key

Good for:

  • partner integrations
  • service accounts
  • paid plans
  • quota-based billing

User ID

Good for:

  • authenticated user fairness
  • account-level usage enforcement

Tenant ID

Good for:

  • SaaS environments
  • org-level quota controls
  • enterprise plans

Endpoint

Good for:

  • route-sensitive protections
  • expensive or admin APIs
  • login and auth endpoints

In practice, strong systems often combine multiple dimensions, such as:

  • IP + endpoint for anonymous protection
  • user or API key for fairness
  • tenant for billing-tier enforcement
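
One way to express that combination is as a list of (key, limit) pairs that a request must pass in full. A Python sketch (field names and limits are illustrative):

```python
def rate_limit_keys(request):
    """Build the (key, limit-per-minute) pairs a request is checked against.
    `request` is a plain dict here; the fields and quotas are illustrative."""
    keys = [
        # Anonymous / network-level protection, always on.
        (f"ip:{request['ip']}:{request['path']}", 60),
    ]
    if request.get("api_key"):
        keys.append((f"key:{request['api_key']}", 1000))        # integration quota
    if request.get("user_id"):
        keys.append((f"user:{request['user_id']}", 300))        # per-user fairness
    if request.get("tenant_id"):
        keys.append((f"tenant:{request['tenant_id']}", 10000))  # org-level plan
    # A request must pass EVERY applicable limit to be admitted.
    return keys

req = {"ip": "203.0.113.9", "path": "/api/orders", "user_id": "u42", "tenant_id": "t7"}
for key, limit in rate_limit_keys(req):
    print(key, limit)
```

Checking the cheapest, broadest key first (usually IP) keeps the common rejection path inexpensive.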

Rate Limit Headers

Clients need clear feedback when rate limiting is active.

That is why headers matter.

Standard Headers to Return

  • X-RateLimit-Limit
  • X-RateLimit-Remaining
  • X-RateLimit-Reset
  • Retry-After when blocked

Example Middleware

public class RateLimitHeaderMiddleware
{
    private readonly RequestDelegate _next;
    private readonly IRateLimiter _rateLimiter;

    public RateLimitHeaderMiddleware(RequestDelegate next, IRateLimiter rateLimiter)
    {
        _next = next;
        _rateLimiter = rateLimiter;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        // GetClientKey resolves API key, user, or IP; implementation omitted here.
        var key = GetClientKey(context);
        var result = await _rateLimiter.CheckRateLimitAsync(key, 100, TimeSpan.FromMinutes(1));

        // Indexer assignment avoids the duplicate-key exceptions Headers.Add can throw.
        context.Response.Headers["X-RateLimit-Limit"] = "100";
        context.Response.Headers["X-RateLimit-Remaining"] = result.Remaining.ToString();
        context.Response.Headers["X-RateLimit-Reset"] =
            result.ResetAt.ToUnixTimeSeconds().ToString();

        if (!result.Allowed)
        {
            context.Response.StatusCode = StatusCodes.Status429TooManyRequests;
            context.Response.Headers["Retry-After"] =
                ((int)(result.ResetAt - DateTimeOffset.UtcNow).TotalSeconds).ToString();
            await context.Response.WriteAsync("Rate limit exceeded");
            return;
        }

        await _next(context);
    }
}

Why This Matters

Headers improve:

  • client retry behavior,
  • developer experience,
  • and transparency during throttling.

Poorly handled 429 responses often lead to worse client behavior, not better.
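
On the client side, the fix is to honor those headers. A minimal retry loop (a Python sketch; `send` stands in for any HTTP call returning status, headers, and body):

```python
import time

def call_with_retry(send, max_attempts=5):
    """Retry on 429, honoring Retry-After, with capped exponential backoff as a
    fallback. `send` is any callable returning (status, headers, body)."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        retry_after = headers.get("Retry-After")
        if retry_after is not None:
            delay = float(retry_after)        # server-provided guidance
        else:
            delay = min(2 ** attempt, 30)     # exponential backoff, capped at 30s
        time.sleep(delay)
    raise RuntimeError("rate limited after max retries")
```

Without Retry-After, clients guess, and the usual guess is an immediate retry, which is exactly the behavior the limit was meant to stop.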

Advanced Patterns

Simple limits are often not enough for real production systems.

Tiered Rate Limiting

Different clients may have different quotas.

public class TieredRateLimiter
{
    public int GetLimitForTier(RateLimitTier tier)
    {
        return tier switch
        {
            RateLimitTier.Free => 100,
            RateLimitTier.Basic => 1000,
            RateLimitTier.Premium => 10000,
            RateLimitTier.Enterprise => 100000,
            _ => 100
        };
    }
}

This is useful for:

  • SaaS plans,
  • partner tiers,
  • and commercial API products.

Adaptive Rate Limiting

Sometimes limits should respond to system load.

public class AdaptiveRateLimiter
{
    private readonly ISystemLoadMonitor _loadMonitor;

    public AdaptiveRateLimiter(ISystemLoadMonitor loadMonitor)
    {
        _loadMonitor = loadMonitor;
    }

    public async Task<int> GetAdaptiveLimitAsync(string clientId)
    {
        // GetBaseLimit looks up the client's normal quota; implementation omitted.
        var baseLimit = GetBaseLimit(clientId);
        var systemLoad = await _loadMonitor.GetSystemLoadAsync();

        if (systemLoad > 0.8)
        {
            return (int)(baseLimit * 0.5);    // shed half the quota under heavy load
        }
        if (systemLoad > 0.6)
        {
            return (int)(baseLimit * 0.75);
        }

        return baseLimit;
    }
}

This can be useful in:

  • overload protection,
  • degraded-mode traffic shaping,
  • and cost-sensitive systems.

DDoS and Abuse Protection

Not all rate limiting is about fairness.

Some of it is about defense.

public class DdosProtectionRateLimiter
{
    public async Task<bool> IsDdosAttackAsync(string clientIp)
    {
        // GetRequestCountAsync and BlockIpAsync wrap your shared store; omitted here.
        var requestCount = await GetRequestCountAsync(clientIp, TimeSpan.FromSeconds(1));

        if (requestCount > 100)
        {
            // More than 100 requests in one second from one IP: block for 10 minutes.
            await BlockIpAsync(clientIp, TimeSpan.FromMinutes(10));
            return true;
        }

        return false;
    }
}

This kind of aggressive short-window protection is useful for:

  • auth endpoints,
  • anonymous APIs,
  • and suspicious traffic bursts.

Monitoring and Observability

Rate limiting should be measured continuously.

Otherwise teams end up with:

  • limits that are too weak,
  • limits that block legitimate users,
  • or no visibility into who is being throttled and why.

What to Track

Track:

  • total requests
  • blocked requests
  • block rate by route
  • block rate by client tier
  • top throttled clients
  • peak usage windows
  • retry behavior after 429s

Example Metrics

public class RateLimitMetrics
{
    private readonly Counter<long> _rateLimitHits;
    private readonly Counter<long> _rateLimitExceeded;

    public RateLimitMetrics(IMeterFactory meterFactory)
    {
        var meter = meterFactory.Create("Gateway.RateLimiting");
        _rateLimitHits = meter.CreateCounter<long>("rate_limit.requests");
        _rateLimitExceeded = meter.CreateCounter<long>("rate_limit.rejected");
    }

    public void RecordRateLimitHit(string clientId)
    {
        // client_id is a high-cardinality tag; bucket or sample it in large fleets.
        _rateLimitHits.Add(1, new KeyValuePair<string, object?>("client_id", clientId));
    }

    public void RecordRateLimitExceeded(string clientId)
    {
        _rateLimitExceeded.Add(1, new KeyValuePair<string, object?>("client_id", clientId));
    }
}

This is how teams learn:

  • whether limits are too low,
  • whether abusive traffic is concentrated,
  • and whether premium tiers actually use their higher quotas.

Best Practices

1. Match the Algorithm to the Traffic

Do not choose an algorithm only because it is easy. Choose one that matches:

  • burst tolerance
  • fairness needs
  • client behavior
  • and operational complexity

2. Use Distributed Storage for Multi-Node Gateways

If multiple nodes exist, local counters are rarely enough.

3. Rate Limit by the Right Identity

IP is not enough for everything. User, API key, tenant, and endpoint often matter more.

4. Return Clear Headers and 429 Responses

Clients should be told:

  • how much quota remains,
  • when to retry,
  • and what happened.

5. Monitor Before and After Tuning

Rate limiting is not set-and-forget. It should be tuned against real usage.

6. Protect Sensitive Routes More Aggressively

Login, auth, admin, export, and expensive query endpoints often deserve their own stricter policies.

Common Mistakes to Avoid

Teams often make the same avoidable errors:

  • using local in-memory counters across multiple instances
  • choosing fixed windows where burst behavior matters a lot
  • rate limiting only by IP in authenticated systems
  • skipping headers and returning unhelpful 429 responses
  • applying one global limit to every route
  • ignoring rate limit telemetry
  • and setting limits before measuring real client behavior

The algorithm matters. The identity key matters. The deployment topology matters even more.

Practical Selection Advice

Choose Ocelot When

  • the system is .NET-centric
  • you want simple route-level limits
  • built-in gateway configuration is attractive
  • the traffic policy is not highly custom

Choose YARP When

  • you want deep control in .NET code
  • custom rate limit behavior matters
  • token bucket or advanced dynamic logic is needed
  • performance and middleware composition matter

Choose Kong When

  • you want richer built-in policy handling
  • distributed limits are central
  • the platform is broader than .NET
  • plugin-based API governance is part of the roadmap

Conclusion

Rate limiting is one of the most important controls an API gateway can provide because it protects backend services, preserves fairness, and reduces abuse before requests move deeper into the platform.

But good rate limiting is not only about picking a number.

It is about:

  • choosing the right algorithm,
  • enforcing limits with the right identity key,
  • making the rules consistent across instances,
  • exposing clear headers,
  • and tuning policies based on real traffic rather than guesswork.

Ocelot, YARP, and Kong can all support strong rate limiting strategies, but they do so in different ways:

  • Ocelot is simpler and configuration-friendly,
  • YARP is flexible and code-driven,
  • Kong is rich and policy-oriented.

The best choice depends on the platform you are actually operating, not just the feature list you like most.

A good gateway rate limiting system should feel predictable under normal traffic, firm under abuse, and measurable at all times.

That is the standard to aim for.

About the author

Elysiate publishes practical guides and privacy-first tools for data workflows, developer tooling, SEO, and product engineering.
