API Gateway Rate Limiting Strategies: Ocelot, YARP, and Kong Complete Guide 2025

Nov 15, 2025
api-gateway, rate-limiting, ocelot, yarp

Rate limiting is essential for API gateways to protect backend services from overload, prevent abuse, and ensure fair resource allocation. Different API gateway solutions implement rate limiting differently, each with unique capabilities and trade-offs. Understanding these differences is crucial for choosing the right strategy for your architecture.

This comprehensive guide covers rate limiting strategies for Ocelot, YARP, and Kong API gateways. You'll learn how to implement distributed rate limiting, configure token bucket algorithms, handle rate limit headers, and deploy production-ready rate limiting configurations.

Understanding Rate Limiting

Why Rate Limiting Matters

Rate limiting provides:

  • Protection: Prevents backend overload
  • Fairness: Ensures equitable resource access
  • Cost Control: Limits API usage costs
  • Security: Mitigates DDoS and abuse
  • Quality of Service: Maintains performance for all users

Rate Limiting Algorithms

1. Fixed Window

  • Simple time-based window
  • Resets at fixed intervals
  • Can admit up to twice the limit in a short span when requests cluster on both sides of a window boundary

2. Sliding Window

  • More accurate than fixed window
  • Smooths out bursts
  • More complex to implement

3. Token Bucket

  • Allows bursts up to bucket capacity
  • Refills at constant rate
  • Good for variable traffic patterns

4. Leaky Bucket

  • Smooths out traffic spikes
  • Constant output rate
  • Prevents bursts
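
The boundary burst noted under Fixed Window is easy to reproduce. The sketch below is a minimal in-memory fixed-window counter; the class name FixedWindowCounter and the explicit timestamp parameter are assumptions made for the demonstration and do not come from any of the gateways. It is not thread-safe and is meant only to show the counting behavior.

```csharp
using System;

// Minimal fixed-window counter (hypothetical name). The caller passes the
// current time explicitly so the behavior is deterministic. Not thread-safe.
public sealed class FixedWindowCounter
{
    private readonly int _limit;
    private readonly TimeSpan _window;
    private long _currentWindow = -1;
    private int _count;

    public FixedWindowCounter(int limit, TimeSpan window)
    {
        _limit = limit;
        _window = window;
    }

    public bool TryAcquire(DateTime utcNow)
    {
        // Window index: number of whole windows since DateTime.MinValue.
        long window = utcNow.Ticks / _window.Ticks;
        if (window != _currentWindow)
        {
            // A new window has started, so the counter resets.
            _currentWindow = window;
            _count = 0;
        }

        if (_count >= _limit) return false;
        _count++;
        return true;
    }
}
```

With a limit of 100 per minute, 100 calls stamped 00:00:59 and 100 more stamped 00:01:00 are all admitted, because the two timestamps fall in different windows. That is 200 requests within about one second, which is the case the sliding window and token bucket algorithms are designed to avoid.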

Ocelot Rate Limiting

Basic Configuration

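Configure rate limiting per route in ocelot.json. Period is the window the limit applies to (for example 1s, 5m, 1h, 1d), Limit is the maximum number of requests in that window, PeriodTimespan is the number of seconds a client must wait before retrying after exceeding the limit, and ClientWhitelist lists client IDs that are exempt: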

{
  "Routes": [
    {
      "DownstreamPathTemplate": "/api/{everything}",
      "UpstreamPathTemplate": "/api/{everything}",
      "RateLimitOptions": {
        "ClientWhitelist": [],
        "EnableRateLimiting": true,
        "Period": "1m",
        "PeriodTimespan": 60,
        "Limit": 100
      }
    }
  ]
}

Per-Client Rate Limiting

Limit by client identifier. Ocelot reads the client ID from a request header; the header name and the rejection response are set globally in the GlobalConfiguration section of ocelot.json:

{
  "GlobalConfiguration": {
    "RateLimitOptions": {
      "ClientIdHeader": "X-Client-Id",
      "HttpStatusCode": 429,
      "QuotaExceededMessage": "API rate limit exceeded",
      "DisableRateLimitHeaders": false
    }
  }
}

Custom Rate Limiting

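Implement custom rate limiting logic. The interface and types in this example (IRateLimitProcessor, ClientRequestIdentity, RateLimitRule, RateLimitCounter, RateLimitExceededException) are application-defined for illustration rather than part of Ocelot's public API:

public class CustomRateLimitProcessor : IRateLimitProcessor
{
    private readonly IMemoryCache _cache;
    private readonly ILogger<CustomRateLimitProcessor> _logger;

    public CustomRateLimitProcessor(
        IMemoryCache cache,
        ILogger<CustomRateLimitProcessor> logger)
    {
        _cache = cache;
        _logger = logger;
    }

    public Task<RateLimitCounter> ProcessRequestAsync(
        ClientRequestIdentity identity,
        RateLimitRule rule)
    {
        var key = $"{identity.ClientId}:{rule.Endpoint}";

        // The expiry is set once, when the window opens. Resetting it on every
        // request would keep extending the window under steady traffic.
        var counter = _cache.GetOrCreate(key, entry =>
        {
            entry.AbsoluteExpirationRelativeToNow =
                TimeSpan.FromSeconds(rule.PeriodTimespan);
            return new RateLimitCounter();
        });

        // Assumes RateLimitCounter is a class, so the lock guards the shared instance.
        lock (counter)
        {
            if (counter.Count >= rule.Limit)
            {
                _logger.LogWarning(
                    "Rate limit exceeded for {ClientId} on {Endpoint}",
                    identity.ClientId, rule.Endpoint);

                throw new RateLimitExceededException(
                    $"Rate limit of {rule.Limit} exceeded");
            }

            counter.Count++;
        }

        return Task.FromResult(counter);
    }
}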

YARP Rate Limiting

Built-in Rate Limiting

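YARP has no rate limiter of its own. It relies on the ASP.NET Core rate limiting middleware (Microsoft.AspNetCore.RateLimiting, available from .NET 7):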

builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("api", opt =>
    {
        opt.Window = TimeSpan.FromMinutes(1);
        opt.PermitLimit = 100;
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 10;
    });
});

var app = builder.Build();

app.UseRateLimiter();

app.MapReverseProxy()
    .RequireRateLimiting("api");
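
The named policy above keeps one shared counter for every caller. The same middleware also supports partitioned policies, which keep a separate limiter per key. The sketch below is one way to give each client its own token bucket; the policy name "per-client", the X-Client-Id header, the "ReverseProxy" configuration section, and all numeric values are assumptions chosen for illustration.

```csharp
using System;
using System.Linq;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

// Assumes YARP routes and clusters live in the "ReverseProxy" config section.
builder.Services.AddReverseProxy()
    .LoadFromConfig(builder.Configuration.GetSection("ReverseProxy"));

builder.Services.AddRateLimiter(options =>
{
    // The middleware rejects with 503 by default; 429 is the conventional status.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.AddPolicy("per-client", httpContext =>
    {
        // Partition key: client header first, then remote IP, then a shared bucket.
        string key = httpContext.Request.Headers["X-Client-Id"].FirstOrDefault()
            ?? httpContext.Connection.RemoteIpAddress?.ToString()
            ?? "anonymous";

        return RateLimitPartition.GetTokenBucketLimiter(key, _ =>
            new TokenBucketRateLimiterOptions
            {
                TokenLimit = 20,                               // burst capacity
                TokensPerPeriod = 10,                          // refill amount
                ReplenishmentPeriod = TimeSpan.FromSeconds(1), // refill interval
                QueueLimit = 0,                                // reject instead of queueing
                AutoReplenishment = true
            });
    });
});

var app = builder.Build();

app.UseRateLimiter();
app.MapReverseProxy().RequireRateLimiting("per-client");

app.Run();
```

A client-supplied header is easy to rotate, so keying on it alone lets a caller dodge the limit by changing the value. In practice the key would more likely come from an authenticated identity or a validated API key.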

Per-Route Rate Limiting

Configure different limits per route:

builder.Services.AddRateLimiter(options =>
{
    // General API limit
    options.AddFixedWindowLimiter("api", opt =>
    {
        opt.Window = TimeSpan.FromMinutes(1);
        opt.PermitLimit = 100;
    });

    // Strict limit for sensitive endpoints
    options.AddFixedWindowLimiter("strict", opt =>
    {
        opt.Window = TimeSpan.FromMinutes(1);
        opt.PermitLimit = 10;
    });
});

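app.UseRateLimiter();
app.MapReverseProxy();

Then assign a policy to each route with the RateLimiterPolicy setting in the YARP route configuration (the route and cluster names here are placeholders):

{
  "ReverseProxy": {
    "Routes": {
      "admin-route": {
        "ClusterId": "backend",
        "RateLimiterPolicy": "strict",
        "Match": { "Path": "/api/admin/{**catch-all}" }
      },
      "api-route": {
        "ClusterId": "backend",
        "RateLimiterPolicy": "api",
        "Match": { "Path": "/api/{**catch-all}" }
      }
    }
  }
}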

Token Bucket with YARP

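Implement the token bucket algorithm. ASP.NET Core also ships a built-in token bucket limiter (AddTokenBucketLimiter); the hand-rolled version below shows the mechanics. IRateLimiter, SuccessLease, and FailureLease are application-defined types, not part of System.Threading.RateLimiting: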

public class TokenBucketRateLimiter : IRateLimiter
{
    private readonly ConcurrentDictionary<string, TokenBucket> _buckets;
    private readonly int _capacity;
    private readonly int _refillRate;

    public TokenBucketRateLimiter(int capacity, int refillRate)
    {
        _buckets = new ConcurrentDictionary<string, TokenBucket>();
        _capacity = capacity;
        _refillRate = refillRate;
    }

    public ValueTask<RateLimitLease> AcquireAsync(
        HttpContext context,
        int permitCount = 1,
        CancellationToken cancellationToken = default)
    {
        var key = GetClientKey(context);
        var bucket = _buckets.GetOrAdd(key, _ => new TokenBucket(_capacity, _refillRate));

        if (bucket.TryConsume(permitCount))
        {
            return new ValueTask<RateLimitLease>(new SuccessLease());
        }

        return new ValueTask<RateLimitLease>(new FailureLease());
    }

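    private string GetClientKey(HttpContext context)
    {
        // Check the client header first. The remote IP is almost always present,
        // so testing it first would make the header fallback unreachable.
        return context.Request.Headers["X-Client-Id"].FirstOrDefault()
            ?? context.Connection.RemoteIpAddress?.ToString()
            ?? "anonymous";
    }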
}

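public class TokenBucket
{
    private readonly object _gate = new object();
    private readonly int _capacity;
    private readonly int _refillRate; // tokens per second
    private int _tokens;
    private DateTime _lastRefill;

    public TokenBucket(int capacity, int refillRate)
    {
        _capacity = capacity;
        _refillRate = refillRate;
        _tokens = capacity;
        _lastRefill = DateTime.UtcNow;
    }

    public bool TryConsume(int count)
    {
        // A bucket is shared by concurrent requests from the same client.
        lock (_gate)
        {
            Refill();

            if (_tokens >= count)
            {
                _tokens -= count;
                return true;
            }

            return false;
        }
    }

    private void Refill()
    {
        var now = DateTime.UtcNow;
        var earned = (now - _lastRefill).TotalSeconds * _refillRate;

        // Less than one whole token: leave _lastRefill alone so the partial
        // interval keeps accumulating instead of being discarded on each call.
        if (earned < 1) return;

        if (earned >= _capacity)
        {
            _tokens = _capacity;
            _lastRefill = now;
            return;
        }

        var whole = (int)earned;
        _tokens = Math.Min(_capacity, _tokens + whole);
        // Advance only by the time that produced whole tokens.
        _lastRefill = _lastRefill.AddSeconds((double)whole / _refillRate);
    }
}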
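
For comparison, System.Threading.RateLimiting already provides a TokenBucketRateLimiter, so a hand-written bucket is not strictly required. The sketch below exercises it directly, outside of any HTTP pipeline; the class name BuiltInTokenBucketDemo and the parameter values are assumptions for the demonstration. AutoReplenishment is turned off so that no background timer refills the bucket during the loop.

```csharp
using System;
using System.Threading.RateLimiting;

public static class BuiltInTokenBucketDemo
{
    // Returns how many of `attempts` back-to-back requests are admitted
    // by a bucket that starts full with `capacity` tokens.
    public static int CountAdmitted(int capacity, int attempts)
    {
        using var limiter = new TokenBucketRateLimiter(new TokenBucketRateLimiterOptions
        {
            TokenLimit = capacity,
            TokensPerPeriod = 1,
            ReplenishmentPeriod = TimeSpan.FromSeconds(1),
            QueueLimit = 0,           // reject immediately instead of queueing
            AutoReplenishment = false // refill only when TryReplenish() is called
        });

        int admitted = 0;
        for (int i = 0; i < attempts; i++)
        {
            using RateLimitLease lease = limiter.AttemptAcquire(1);
            if (lease.IsAcquired) admitted++;
        }

        return admitted;
    }
}
```

Calling BuiltInTokenBucketDemo.CountAdmitted(5, 8) admits the first five requests and rejects the remaining three, since the bucket starts full and is never replenished during the loop.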

Kong Rate Limiting

Basic Plugin Configuration

Configure rate limiting in Kong:

services:
  - name: api-service
    url: https://api.example.com
    routes:
      - name: api-route
        paths:
          - /api
        plugins:
          - name: rate-limiting
            config:
              minute: 100
              hour: 1000
              day: 10000
              policy: local

Distributed Rate Limiting

Use Redis for distributed rate limiting:

plugins:
  - name: rate-limiting
    config:
      minute: 100
      policy: redis
      redis_host: redis.example.com
      redis_port: 6379
      redis_password: secret
      redis_timeout: 2000
      redis_database: 1

Advanced Rate Limiting

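Configure advanced options. The header_name setting is only used when limit_by is set to header, where it names the request header to key on, so it is omitted here:

plugins:
  - name: rate-limiting
    config:
      minute: 100
      hour: 1000
      policy: cluster
      limit_by: consumer
      hide_client_headers: false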

Distributed Rate Limiting

Redis-Based Implementation

Implement distributed rate limiting with Redis:

public class RedisRateLimiter : IRateLimiter
{
    private readonly IConnectionMultiplexer _redis;
    private readonly ILogger<RedisRateLimiter> _logger;

    public RedisRateLimiter(
        IConnectionMultiplexer redis,
        ILogger<RedisRateLimiter> logger)
    {
        _redis = redis;
        _logger = logger;
    }

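    public async Task<RateLimitResult> CheckRateLimitAsync(
        string key,
        int limit,
        TimeSpan window)
    {
        var db = _redis.GetDatabase();
        var now = DateTimeOffset.UtcNow.ToUnixTimeSeconds();
        var windowSeconds = (long)window.TotalSeconds;

        // Sliding window log in a sorted set: score = request time in seconds.
        // These calls are separate round trips, so concurrent requests can slip
        // past the limit; the Lua script in the next section makes the check atomic.
        await db.SortedSetRemoveRangeByScoreAsync(key, 0, now - windowSeconds);
        var count = await db.SortedSetLengthAsync(key);

        if (count >= limit)
        {
            // The next slot opens when the oldest entry leaves the window.
            var oldest = await db.SortedSetRangeByRankWithScoresAsync(key, 0, 0);
            var resetAt = oldest.Length > 0
                ? (long)oldest[0].Score + windowSeconds
                : now + windowSeconds;

            return new RateLimitResult
            {
                Allowed = false,
                Remaining = 0,
                ResetAt = DateTimeOffset.FromUnixTimeSeconds(resetAt)
            };
        }

        // A unique member keeps requests that arrive in the same second
        // from collapsing into a single sorted-set entry.
        await db.SortedSetAddAsync(key, Guid.NewGuid().ToString("N"), now);
        await db.KeyExpireAsync(key, window);

        return new RateLimitResult
        {
            Allowed = true,
            Remaining = limit - (int)count - 1,
            ResetAt = DateTimeOffset.FromUnixTimeSeconds(now + windowSeconds)
        };
    }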
}

Lua Script for Atomic Operations

Use a Lua script so that trimming, counting, and adding run as one atomic step:

-- rate-limit.lua
local key = KEYS[1]
local limit = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local member = ARGV[4] -- unique per request, so same-second requests are not merged

-- Remove expired entries
redis.call('ZREMRANGEBYSCORE', key, 0, now - window)

-- Count current requests
local count = redis.call('ZCARD', key)

if count < limit then
    -- Add current request
    redis.call('ZADD', key, now, member)
    redis.call('EXPIRE', key, window)
    return {1, limit - count - 1, now + window}
else
    -- Rate limit exceeded: the next slot opens when the oldest entry leaves the window
    local oldest = redis.call('ZRANGE', key, 0, 0, 'WITHSCORES')
    local reset = oldest[2] and (tonumber(oldest[2]) + window) or (now + window)
    return {0, 0, reset}
end

Call the script from C#:

public async Task<RateLimitResult> CheckRateLimitLuaAsync(
    string key,
    int limit,
    TimeSpan window)
{
    var db = _redis.GetDatabase();
    var now = DateTimeOffset.UtcNow.ToUnixTimeSeconds();

    // LoadLuaScriptAsync is an application helper that returns the script text.
    var script = await LoadLuaScriptAsync("rate-limit.lua");
    var result = await db.ScriptEvaluateAsync(script, new RedisKey[] { key },
        new RedisValue[]
        {
            limit, (long)window.TotalSeconds, now, Guid.NewGuid().ToString("N")
        });

    var values = (RedisValue[])result;
    return new RateLimitResult
    {
        Allowed = (int)values[0] == 1,
        Remaining = (int)values[1],
        ResetAt = DateTimeOffset.FromUnixTimeSeconds((long)values[2])
    };
}

Rate Limit Headers

Standard Headers

Include rate limit information in responses:

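public class RateLimitHeaderMiddleware
{
    private const int Limit = 100;

    private readonly RequestDelegate _next;
    private readonly IRateLimiter _rateLimiter;

    public RateLimitHeaderMiddleware(RequestDelegate next, IRateLimiter rateLimiter)
    {
        _next = next;
        _rateLimiter = rateLimiter;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        var key = GetClientKey(context);
        var result = await _rateLimiter.CheckRateLimitAsync(
            key, Limit, TimeSpan.FromMinutes(1));

        // The indexer overwrites an existing header; Headers.Add throws on duplicates.
        context.Response.Headers["X-RateLimit-Limit"] = Limit.ToString();
        context.Response.Headers["X-RateLimit-Remaining"] = result.Remaining.ToString();
        context.Response.Headers["X-RateLimit-Reset"] =
            result.ResetAt.ToUnixTimeSeconds().ToString();

        if (!result.Allowed)
        {
            // Round up and never send zero or a negative delay.
            var seconds = (result.ResetAt - DateTimeOffset.UtcNow).TotalSeconds;
            var retryAfter = Math.Max(1, (int)Math.Ceiling(seconds));

            context.Response.StatusCode = StatusCodes.Status429TooManyRequests;
            context.Response.Headers["Retry-After"] = retryAfter.ToString();
            await context.Response.WriteAsync("Rate limit exceeded");
            return;
        }

        await _next(context);
    }

    private static string GetClientKey(HttpContext context) =>
        context.Request.Headers["X-Client-Id"].FirstOrDefault()
        ?? context.Connection.RemoteIpAddress?.ToString()
        ?? "anonymous";
}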

Best Practices

1. Choose Appropriate Algorithm

  • Fixed Window: Simple, good for basic use cases
  • Sliding Window: More accurate, prevents boundary bursts
  • Token Bucket: Allows controlled bursts
  • Leaky Bucket: Smooths traffic spikes

2. Use Distributed Storage

For multi-instance deployments:

  • Use Redis for shared state
  • Ensure atomic operations
  • Handle Redis failures gracefully
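
One way to handle store failures gracefully is to fail open: if the limiter's backing store throws or does not answer in time, let the request through and record the failure. The sketch below wraps any asynchronous check in that policy. FailOpenRateLimit and its parameters are hypothetical names, and whether failing open or failing closed is right depends on what the limit protects.

```csharp
using System;
using System.Threading.Tasks;

public static class FailOpenRateLimit
{
    // `check` is any async limiter call (for example a Redis-backed one) that
    // returns true when the request is allowed. If it throws or takes longer
    // than `timeout`, the request is allowed and `onFailure` is notified.
    public static async Task<bool> IsAllowedAsync(
        Func<Task<bool>> check,
        TimeSpan timeout,
        Action<Exception> onFailure)
    {
        try
        {
            // WaitAsync throws TimeoutException if the check does not finish in time.
            return await check().WaitAsync(timeout);
        }
        catch (Exception ex)
        {
            onFailure(ex);
            return true; // fail open
        }
    }
}
```

Failing open keeps the API available during a Redis outage at the cost of temporarily unenforced limits. For limits that guard against abuse or cost overruns, returning false in the catch block (failing closed) or falling back to a local in-memory limiter may be the safer choice.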

3. Configure Per-Client Limits

  • Different limits for different clients
  • Higher limits for premium users
  • Stricter limits for anonymous users

4. Monitor Rate Limiting

Track metrics:

  • Rate limit hits
  • Client distribution
  • Peak usage patterns
  • False positives

5. Handle Rate Limit Errors

Provide clear error messages:

  • Include a Retry-After header
  • Explain rate limit policy
  • Suggest alternatives
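
On the client side, these headers are only useful if callers honor them. The sketch below shows how a .NET client might read Retry-After from a 429 response; RetryAfterHelper is a hypothetical helper, and the current time is passed in so the result is deterministic.

```csharp
using System;
using System.Net;
using System.Net.Http;

public static class RetryAfterHelper
{
    // Returns how long the client should wait before retrying, or null if the
    // response is not a 429 or carries no usable Retry-After header.
    public static TimeSpan? GetRetryDelay(HttpResponseMessage response, DateTimeOffset now)
    {
        if (response.StatusCode != HttpStatusCode.TooManyRequests) return null;

        var retryAfter = response.Headers.RetryAfter;
        if (retryAfter == null) return null;

        // Retry-After is either a number of seconds or an HTTP date.
        if (retryAfter.Delta is TimeSpan delta) return delta;
        if (retryAfter.Date is DateTimeOffset date)
            return date > now ? date - now : TimeSpan.Zero;

        return null;
    }
}
```

A caller would typically wait for the returned delay (for example with Task.Delay) before retrying, and cap the number of retries so a persistent 429 does not turn into an endless loop.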

Advanced Rate Limiting Patterns

Adaptive Rate Limiting

Adjust limits based on system load:

public class AdaptiveRateLimiter
{
    private readonly ISystemLoadMonitor _loadMonitor;

    public async Task<int> GetAdaptiveLimitAsync(string clientId)
    {
        var baseLimit = GetBaseLimit(clientId);
        var systemLoad = await _loadMonitor.GetSystemLoadAsync();
        
        // Reduce limit when system is under stress
        if (systemLoad > 0.8)
        {
            return (int)(baseLimit * 0.5); // 50% reduction
        }
        else if (systemLoad > 0.6)
        {
            return (int)(baseLimit * 0.75); // 25% reduction
        }
        
        return baseLimit;
    }
}

Tiered Rate Limiting

Implement different limits for different client tiers:

public class TieredRateLimiter
{
    private readonly Dictionary<string, RateLimitTier> _tiers;

    public RateLimitTier GetTierForClient(string clientId)
    {
        return _tiers.GetValueOrDefault(clientId, RateLimitTier.Free);
    }

    public int GetLimitForTier(RateLimitTier tier)
    {
        return tier switch
        {
            RateLimitTier.Free => 100,
            RateLimitTier.Basic => 1000,
            RateLimitTier.Premium => 10000,
            RateLimitTier.Enterprise => 100000,
            _ => 100
        };
    }
}

Real-World Scenarios

Scenario 1: DDoS Protection

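Block a single IP address that floods the gateway. Per-IP blocking only stops single-source floods; a distributed attack spreads its load across many addresses and also needs protection upstream of the gateway, such as a CDN or WAF: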

public class DdosProtectionRateLimiter
{
    public async Task<bool> IsDdosAttackAsync(string clientIp)
    {
        var requestCount = await GetRequestCountAsync(clientIp, TimeSpan.FromSeconds(1));
        
        // More than 100 requests per second from single IP
        if (requestCount > 100)
        {
            await BlockIpAsync(clientIp, TimeSpan.FromMinutes(10));
            return true;
        }
        
        return false;
    }
}

Scenario 2: API Key-Based Limits

Different limits per API key:

public class ApiKeyRateLimiter
{
    public async Task<RateLimitResult> CheckLimitAsync(string apiKey)
    {
        var keyInfo = await GetApiKeyInfoAsync(apiKey);
        var limit = GetLimitForKeyType(keyInfo.Type);
        
        return await CheckRateLimitAsync(apiKey, limit);
    }
}

Extended FAQ

Q: How do I handle rate limit bursts?

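A: Use a token bucket, which lets a client spend saved-up tokens in a short burst while capping the long-run rate. A condensed version of the earlier implementation (Refill omitted):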

public class TokenBucketRateLimiter
{
    private readonly int _capacity;
    private readonly int _refillRate;
    private int _tokens;

    public bool TryConsume(int tokens)
    {
        Refill();
        
        if (_tokens >= tokens)
        {
            _tokens -= tokens;
            return true;
        }
        
        return false;
    }
}

Q: Should I rate limit by IP or user?

A: Use both:

  • IP-based for DDoS protection
  • User-based for fair resource allocation
  • API key-based for service tiers
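
Combining the layers can be as simple as running the checks in order and stopping at the first rejection. The sketch below assumes each layer is supplied as a name plus a function that returns true when the request fits; LayeredRateLimit and FirstRejection are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class LayeredRateLimit
{
    // Runs the checks in order and stops at the first rejection.
    // Returns the rejecting layer's name, or "" when every layer allows the request.
    public static string FirstRejection(
        IEnumerable<KeyValuePair<string, Func<bool>>> layers)
    {
        foreach (var layer in layers)
        {
            if (!layer.Value()) return layer.Key;
        }

        return "";
    }
}
```

Putting the IP check first means abusive traffic is rejected before any per-user or per-key state is touched. One consequence of this simple ordering is that a permit consumed by an earlier layer is not returned when a later layer rejects the request.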

Conclusion

Effective rate limiting is crucial for API gateway deployments. By understanding the different approaches in Ocelot, YARP, and Kong, and implementing distributed rate limiting with proper algorithms, you can protect your backend services while ensuring fair resource allocation.

Key Takeaways:

  1. Choose the right algorithm for your use case
  2. Use distributed storage for multi-instance deployments
  3. Configure per-client limits for fairness
  4. Include rate limit headers in responses
  5. Monitor rate limiting metrics
  6. Handle errors gracefully with clear messages
  7. Adapt limits to system load
  8. Use tiered limits for different client types
  9. Apply aggressive per-IP limits as a first line of DDoS protection
  10. Handle bursts with a token bucket

Next Steps:

  1. Evaluate your rate limiting requirements
  2. Choose appropriate algorithm
  3. Implement distributed rate limiting
  4. Configure rate limit headers
  5. Set up monitoring
  6. Implement adaptive limiting
  7. Configure tiered limits

Rate Limiting Implementation Guide

Step-by-Step Implementation

  1. Analyze Requirements

    • Determine rate limits per client
    • Identify rate limit tiers
    • Define burst allowances
    • Set up monitoring
  2. Choose Algorithm

    • Fixed window for simple cases
    • Sliding window for smooth limiting
    • Token bucket for burst handling
    • Leaky bucket for constant rate
  3. Implement Storage

    • In-memory for single instance
    • Redis for distributed
    • Database for persistence
    • Hybrid for complex scenarios
  4. Configure Headers

    • X-RateLimit-Limit
    • X-RateLimit-Remaining
    • X-RateLimit-Reset
    • Retry-After
  5. Set Up Monitoring

    • Track rate limit hits
    • Monitor client usage
    • Alert on abuse
    • Generate reports

Rate Limiting Patterns

Per-Client Limiting

public class PerClientRateLimiter
{
    public async Task<bool> CheckLimitAsync(string clientId, int limit)
    {
        var key = $"ratelimit:client:{clientId}";
        return await CheckRateLimitAsync(key, limit);
    }
}

Per-Endpoint Limiting

public class PerEndpointRateLimiter
{
    public async Task<bool> CheckLimitAsync(string endpoint, int limit)
    {
        var key = $"ratelimit:endpoint:{endpoint}";
        return await CheckRateLimitAsync(key, limit);
    }
}

Tiered Limiting

public class TieredRateLimiter
{
    public int GetLimitForTier(string tier) // no awaits, so no need for async
    {
        return tier switch
        {
            "free" => 100,
            "basic" => 1000,
            "premium" => 10000,
            "enterprise" => 100000,
            _ => 100
        };
    }
}

Advanced Scenarios

Dynamic Rate Limiting

Adjust limits based on system load:

public class DynamicRateLimiter
{
    public async Task<int> GetDynamicLimitAsync(string clientId)
    {
        var baseLimit = await GetBaseLimitAsync(clientId);
        var systemLoad = await GetSystemLoadAsync();
        
        if (systemLoad > 0.8)
        {
            return (int)(baseLimit * 0.5); // Reduce by 50%
        }
        
        return baseLimit;
    }
}

Whitelist/Blacklist

Handle special clients:

public class WhitelistRateLimiter
{
    public async Task<bool> IsWhitelistedAsync(string clientId)
    {
        var whitelist = await GetWhitelistAsync();
        return whitelist.Contains(clientId);
    }

    public async Task<bool> IsBlacklistedAsync(string clientId)
    {
        var blacklist = await GetBlacklistAsync();
        return blacklist.Contains(clientId);
    }
}

Rate Limiting Configuration Examples

Ocelot Configuration

Configure rate limiting in Ocelot:

{
  "Routes": [
    {
      "DownstreamPathTemplate": "/api/{everything}",
      "DownstreamScheme": "https",
      "DownstreamHostAndPorts": [
        {
          "Host": "api.example.com",
          "Port": 443
        }
      ],
      "UpstreamPathTemplate": "/api/{everything}",
      "UpstreamHttpMethod": [ "GET", "POST" ],
      "RateLimitOptions": {
        "ClientWhitelist": [],
        "EnableRateLimiting": true,
        "Period": "1s",
        "PeriodTimespan": 1,
        "Limit": 100
      }
    }
  ]
}

YARP Configuration

Configure rate limiting in YARP:

builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("api", opt =>
    {
        opt.Window = TimeSpan.FromMinutes(1);
        opt.PermitLimit = 100;
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 50;
    });
});

app.UseRateLimiter();

app.MapReverseProxy()
    .RequireRateLimiting("api");

Kong Configuration

Configure rate limiting in Kong:

plugins:
- name: rate-limiting
  config:
    minute: 100
    hour: 1000
    policy: local

Rate Limiting Monitoring

Metrics Collection

Track rate limiting metrics:

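public class RateLimitMetrics
{
    // System.Diagnostics.Metrics; the meter and instrument names are examples.
    private static readonly Meter Meter = new Meter("Gateway.RateLimiting");

    private readonly Counter<long> _rateLimitHits =
        Meter.CreateCounter<long>("rate_limit.hits");
    private readonly Counter<long> _rateLimitExceeded =
        Meter.CreateCounter<long>("rate_limit.exceeded");

    // Tagging by client ID can produce many time series; consider tagging
    // by tier or route instead if the client population is large.
    public void RecordRateLimitHit(string clientId)
    {
        _rateLimitHits.Add(1, new KeyValuePair<string, object?>("client_id", clientId));
    }

    public void RecordRateLimitExceeded(string clientId)
    {
        _rateLimitExceeded.Add(1, new KeyValuePair<string, object?>("client_id", clientId));
    }
}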

Rate Limiting Best Practices

Configuration Management

Manage rate limit configuration:

public class RateLimitConfigManager
{
    public async Task<RateLimitConfig> GetConfigAsync(string clientId)
    {
        var baseConfig = await GetBaseConfigAsync();
        var clientConfig = await GetClientConfigAsync(clientId);
        
        return new RateLimitConfig
        {
            Limit = clientConfig?.Limit ?? baseConfig.Limit,
            Window = clientConfig?.Window ?? baseConfig.Window,
            Burst = clientConfig?.Burst ?? baseConfig.Burst
        };
    }
}

Rate Limit Headers

Include rate limit information in responses:

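public class RateLimitHeaderMiddleware
{
    private readonly RequestDelegate _next;

    public RateLimitHeaderMiddleware(RequestDelegate next) => _next = next;

    public async Task InvokeAsync(HttpContext context)
    {
        var rateLimitInfo = await GetRateLimitInfoAsync(context);

        // Headers become read-only once the response has started, so set them
        // before calling the next middleware rather than after it returns.
        context.Response.Headers["X-RateLimit-Limit"] =
            rateLimitInfo.Limit.ToString();
        context.Response.Headers["X-RateLimit-Remaining"] =
            rateLimitInfo.Remaining.ToString();
        context.Response.Headers["X-RateLimit-Reset"] =
            rateLimitInfo.ResetTime.ToUnixTimeSeconds().ToString();

        await _next(context);
    }
}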

Rate Limiting Monitoring

Metrics Dashboard

Create rate limiting dashboard:

public class RateLimitDashboard
{
    public async Task<DashboardData> GetDashboardDataAsync(TimeSpan period)
    {
        return new DashboardData
        {
            TotalRequests = await GetTotalRequestsAsync(period),
            RateLimitedRequests = await GetRateLimitedRequestsAsync(period),
            TopClients = await GetTopClientsAsync(period),
            RateLimitHits = await GetRateLimitHitsAsync(period)
        };
    }
}

Rate Limiting Algorithms Deep Dive

Token Bucket Implementation

Implement token bucket algorithm:

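public class TokenBucketRateLimiter
{
    private readonly object _gate = new object();
    private readonly int _capacity;
    private readonly int _refillRate; // tokens per second
    private int _tokens;
    private DateTime _lastRefill;

    public TokenBucketRateLimiter(int capacity, int refillRate)
    {
        _capacity = capacity;
        _refillRate = refillRate;
        _tokens = capacity;
        _lastRefill = DateTime.UtcNow;
    }

    // Nothing is awaited, so the method returns a completed task
    // instead of being marked async.
    public Task<bool> TryAcquireAsync(int tokens = 1)
    {
        lock (_gate)
        {
            RefillTokens();

            if (_tokens >= tokens)
            {
                _tokens -= tokens;
                return Task.FromResult(true);
            }

            return Task.FromResult(false);
        }
    }

    private void RefillTokens()
    {
        var now = DateTime.UtcNow;
        var earned = (now - _lastRefill).TotalSeconds * _refillRate;

        // Less than one whole token: keep _lastRefill so partial intervals
        // accumulate instead of being discarded on every call.
        if (earned < 1) return;

        if (earned >= _capacity)
        {
            _tokens = _capacity;
            _lastRefill = now;
            return;
        }

        var whole = (int)earned;
        _tokens = Math.Min(_capacity, _tokens + whole);
        _lastRefill = _lastRefill.AddSeconds((double)whole / _refillRate);
    }
}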

Leaky Bucket Implementation

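Implement the leaky bucket algorithm in its meter form. Each request adds one unit to the bucket, the bucket drains at a constant rate of one unit per leak interval, and a request is rejected when the bucket is full:

public class LeakyBucketRateLimiter
{
    private readonly object _gate = new object();
    private readonly int _capacity;
    private readonly TimeSpan _leakInterval; // time for one request to drain
    private double _level;
    private DateTime _lastLeak;

    public LeakyBucketRateLimiter(int capacity, TimeSpan leakInterval)
    {
        _capacity = capacity;
        _leakInterval = leakInterval;
        _lastLeak = DateTime.UtcNow;
    }

    public Task<bool> TryAcquireAsync()
    {
        lock (_gate)
        {
            Leak();

            if (_level + 1 <= _capacity)
            {
                _level += 1;
                return Task.FromResult(true);
            }

            return Task.FromResult(false);
        }
    }

    private void Leak()
    {
        // Drain in proportion to elapsed time, independent of when requests arrived.
        var now = DateTime.UtcNow;
        var leaked = (now - _lastLeak).TotalSeconds / _leakInterval.TotalSeconds;

        _level = Math.Max(0, _level - leaked);
        _lastLeak = now;
    }
}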

Rate Limiting Configuration Management

Dynamic Configuration

Update rate limits dynamically:

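public class DynamicRateLimitConfig : IDisposable
{
    private readonly IConfiguration _config;
    private readonly Timer _refreshTimer; // System.Threading.Timer

    public DynamicRateLimitConfig(IConfiguration config)
    {
        _config = config;
        // Refresh immediately, then every five minutes.
        _refreshTimer = new Timer(RefreshConfig, null,
            TimeSpan.Zero, TimeSpan.FromMinutes(5));
    }

    private void RefreshConfig(object? state)
    {
        try
        {
            // Reload configuration from source
            var newConfig = LoadConfigFromSource();
            UpdateRateLimits(newConfig);
        }
        catch (Exception)
        {
            // Keep the previous limits. An unhandled exception in a timer
            // callback would otherwise terminate the process.
        }
    }

    public void Dispose() => _refreshTimer.Dispose();
}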

Rate Limit Analytics

Analyze rate limit usage:

public class RateLimitAnalytics
{
    public async Task<AnalyticsReport> GenerateReportAsync(TimeSpan period)
    {
        var data = await CollectAnalyticsDataAsync(period);
        
        return new AnalyticsReport
        {
            TotalRequests = data.TotalRequests,
            RateLimitedRequests = data.RateLimitedRequests,
            // Guard against an empty period, which would otherwise yield NaN.
            RateLimitHitRate = data.TotalRequests == 0
                ? 0
                : (double)data.RateLimitedRequests / data.TotalRequests,
            TopClients = data.TopClients,
            PeakUsage = data.PeakUsage,
            AverageUsage = data.AverageUsage
        };
    }
}

Rate Limiting Best Practices Summary

Implementation Checklist

  • Choose appropriate algorithm (fixed window, sliding window, token bucket)
  • Implement distributed storage (Redis, database)
  • Configure per-client limits
  • Set up rate limit headers
  • Monitor rate limiting metrics
  • Handle rate limit errors gracefully
  • Test rate limiting scenarios
  • Document rate limit policies
  • Review and optimize limits
  • Set up alerting

Production Deployment

Before deploying rate limiting:

  1. Test all rate limiting algorithms
  2. Verify distributed storage works
  3. Test rate limit headers
  4. Validate error handling
  5. Set up monitoring
  6. Load test rate limiting
  7. Document procedures
  8. Review security settings

Rate Limiting Troubleshooting

Common Issues

Troubleshoot rate limiting issues:

public class RateLimitTroubleshooter
{
    public async Task<DiagnosticsResult> DiagnoseAsync(string clientId)
    {
        var diagnostics = new DiagnosticsResult();

        // Check rate limit configuration
        diagnostics.ConfigExists = await CheckConfigExistsAsync(clientId);

        // Check storage
        diagnostics.StorageWorking = await TestStorageAsync();

        // Check rate limit state
        diagnostics.CurrentState = await GetRateLimitStateAsync(clientId);

        return diagnostics;
    }
}

Rate Limiting Implementation Examples

Fixed Window Rate Limiter

Fixed window implementation. IDistributedCache has no atomic increment, so the read-then-write below can undercount under concurrent requests; with Redis, prefer INCR or a Lua script:

public class FixedWindowRateLimiter
{
    private readonly IDistributedCache _cache;
    private readonly int _limit;
    private readonly TimeSpan _window;

    public FixedWindowRateLimiter(IDistributedCache cache, int limit, TimeSpan window)
    {
        _cache = cache;
        _limit = limit;
        _window = window;
    }

    public async Task<bool> CheckRateLimitAsync(string key)
    {
        var cacheKey = $"ratelimit:fixed:{key}:{GetWindowIndex()}";
        var current = await _cache.GetStringAsync(cacheKey);
        var count = current == null ? 0 : int.Parse(current);

        if (count >= _limit)
        {
            return false;
        }

        await _cache.SetStringAsync(cacheKey, (count + 1).ToString(),
            new DistributedCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = _window
            });
        return true;
    }

    // Number of whole windows since DateTime.MinValue. Unlike minute-based
    // arithmetic, this works for windows shorter than a minute or longer than an hour.
    private long GetWindowIndex()
    {
        return DateTime.UtcNow.Ticks / _window.Ticks;
    }
}

Sliding Window Rate Limiter

Complete sliding window implementation:

public class SlidingWindowRateLimiter
{
    private readonly IDistributedCache _cache;
    private readonly int _limit;
    private readonly TimeSpan _window;

    public async Task<bool> CheckRateLimitAsync(string key)
    {
        var now = DateTime.UtcNow;
        var windowStart = now - _window;

        // Remove old entries
        await RemoveOldEntriesAsync(key, windowStart);

        // Count current entries
        var count = await GetCurrentCountAsync(key, windowStart, now);

        if (count < _limit)
        {
            await AddEntryAsync(key, now);
            return true;
        }

        return false;
    }
}
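
The helper methods in the class above are left abstract. To make the algorithm concrete, here is a minimal in-memory sliding window log for a single key; SlidingWindowLog is a hypothetical name, and timestamps are passed in explicitly so the behavior is deterministic. A distributed version would keep the same log in a shared store, as in the Redis sorted-set example earlier.

```csharp
using System;
using System.Collections.Generic;

// In-memory sliding window log for one key (hypothetical name).
public sealed class SlidingWindowLog
{
    private readonly Queue<DateTime> _timestamps = new Queue<DateTime>();
    private readonly object _gate = new object();
    private readonly int _limit;
    private readonly TimeSpan _window;

    public SlidingWindowLog(int limit, TimeSpan window)
    {
        _limit = limit;
        _window = window;
    }

    public bool TryAcquire(DateTime utcNow)
    {
        lock (_gate)
        {
            // Remove old entries: anything at least one full window old.
            while (_timestamps.Count > 0 && utcNow - _timestamps.Peek() >= _window)
            {
                _timestamps.Dequeue();
            }

            // Count current entries and admit only while under the limit.
            if (_timestamps.Count >= _limit) return false;

            _timestamps.Enqueue(utcNow);
            return true;
        }
    }
}
```

With a limit of 2 per 10 seconds, requests at t, t+1s, and t+5s yield allow, allow, reject; a request at t+10s is allowed again because the entry from t has just left the window. Memory use is proportional to the limit per key, which is the main cost of the log approach compared with counter-based windows.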

Rate Limiting Configuration Management

Dynamic Rate Limit Configuration

Configure rate limits dynamically:

public class DynamicRateLimitConfigurator
{
    public async Task UpdateRateLimitAsync(
        string endpoint, 
        RateLimitConfig config)
    {
        // Update configuration
        await UpdateConfigAsync(endpoint, config);

        // Notify rate limiters
        await NotifyRateLimitersAsync(endpoint, config);

        // Update monitoring
        await UpdateMonitoringAsync(endpoint, config);
    }
}

Rate Limit Analytics

Analyze rate limit usage:

public class RateLimitAnalytics
{
    public async Task<RateLimitReport> GetRateLimitReportAsync(
        string endpoint, 
        TimeSpan period)
    {
        var data = await GetRateLimitDataAsync(endpoint, period);

        return new RateLimitReport
        {
            Endpoint = endpoint,
            Period = period,
            TotalRequests = data.Sum(d => d.RequestCount),
            AllowedRequests = data.Count(d => d.Allowed),
            BlockedRequests = data.Count(d => !d.Allowed),
            // Percentage of blocked entries; guard against an empty data set.
            BlockRate = data.Count == 0
                ? 0
                : (double)data.Count(d => !d.Allowed) / data.Count * 100,
            PeakUsage = data.Max(d => d.RequestCount)
        };
    }
}

Rate Limiting Advanced Patterns

Adaptive Rate Limiting

Implement adaptive rate limiting:

public class AdaptiveRateLimiter
{
    public async Task<bool> CheckAdaptiveLimitAsync(string key)
    {
        var currentLoad = await GetCurrentLoadAsync();
        var baseLimit = await GetBaseLimitAsync(key);

        // Adjust limit based on load
        var adjustedLimit = AdjustLimitForLoad(baseLimit, currentLoad);

        return await CheckLimitAsync(key, adjustedLimit);
    }

    private int AdjustLimitForLoad(int baseLimit, double currentLoad)
    {
        if (currentLoad > 0.8)
        {
            return (int)(baseLimit * 0.7); // Reduce by 30%
        }
        else if (currentLoad < 0.3)
        {
            return (int)(baseLimit * 1.2); // Increase by 20%
        }

        return baseLimit;
    }
}

Distributed Rate Limiting

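Implement distributed rate limiting. A counter read followed by a separate write lets two gateway instances see the same value and both admit a request, so this version uses the Redis INCR command (through StackExchange.Redis), which increments and returns the new value atomically:

public class DistributedRateLimiter
{
    private readonly IConnectionMultiplexer _redis;

    public DistributedRateLimiter(IConnectionMultiplexer redis)
    {
        _redis = redis;
    }

    public async Task<bool> CheckDistributedLimitAsync(
        string key,
        int limit,
        TimeSpan window)
    {
        var db = _redis.GetDatabase();
        var cacheKey = $"ratelimit:distributed:{key}";

        // INCR creates the key with value 1 if it does not exist yet.
        var count = await db.StringIncrementAsync(cacheKey);

        if (count == 1)
        {
            // The first request of a window starts the expiry clock. If the
            // process dies between INCR and EXPIRE the key never expires;
            // a Lua script that does both closes that gap.
            await db.KeyExpireAsync(cacheKey, window);
        }

        return count <= limit;
    }
}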

Rate Limiting Best Practices Summary

Key Takeaways

  1. Choose the Right Algorithm: Fixed window for simplicity, sliding window for accuracy, token bucket for burst handling
  2. Implement Distributed Limiting: Use distributed cache for multi-instance deployments
  3. Monitor and Adjust: Continuously monitor rate limit effectiveness and adjust as needed
  4. Handle Edge Cases: Consider timezone issues, clock skew, and distributed system challenges
  5. Provide Clear Feedback: Return appropriate HTTP status codes and headers to clients

Common Pitfalls to Avoid

  • Not considering distributed system challenges
  • Setting limits too low or too high without monitoring
  • Not handling edge cases like clock skew
  • Failing to provide clear error messages to clients
  • Not implementing proper monitoring and alerting

Rate Limiting Implementation Checklist

Pre-Implementation

  • Identify endpoints that need rate limiting
  • Determine appropriate rate limits for each endpoint
  • Choose rate limiting algorithm (fixed window, sliding window, token bucket)
  • Select storage mechanism (in-memory, distributed cache)
  • Plan for distributed system challenges

Implementation

  • Implement rate limiting middleware
  • Configure rate limits per endpoint/client
  • Add rate limit headers to responses
  • Implement proper error handling
  • Add logging and monitoring

Post-Implementation

  • Monitor rate limit effectiveness
  • Adjust limits based on usage patterns
  • Review and optimize performance
  • Update documentation
  • Train team on rate limiting policies

Rate Limiting Troubleshooting Guide

Common Issues

Issue 1: Rate limits too restrictive

  • Symptom: Legitimate users getting blocked
  • Solution: Increase limits or implement whitelist
  • Prevention: Monitor usage patterns before setting limits

Issue 2: Rate limits not working in distributed environment

  • Symptom: Limits not enforced across instances
  • Solution: Use distributed cache for rate limit storage
  • Prevention: Test in distributed environment before production

Issue 3: Clock skew causing issues

  • Symptom: Inconsistent rate limit behavior
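  • Solution: Use NTP for time synchronization, or take timestamps from the shared store (for example, the Redis TIME command) instead of each gateway's own clock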
  • Prevention: Implement time synchronization checks

For more API gateway guidance, explore our API Gateway Comparison or YARP Production Guide.
