CargoWise eAdapter Error Handling & Retry Patterns: Production-Ready Strategies
Building production-ready CargoWise eAdapter integrations requires robust error handling and retry mechanisms. Unlike a simple REST API call, eAdapter's message-based architecture surfaces failures at several layers: network faults, message validation errors, business logic rejections, and system timeouts. Without proper error handling, your integration can lose critical shipment data, create duplicate records, or fail silently.
This guide covers production-tested error handling and retry patterns for CargoWise eAdapter integrations. You'll learn how to implement exponential backoff, circuit breakers, dead-letter queues, idempotency checks, and error recovery strategies that keep your logistics integrations reliable when failures inevitably occur.
Understanding eAdapter Error Scenarios
Common Error Types
CargoWise eAdapter integrations face several categories of errors, each requiring different handling strategies:
1. Network Errors
- Connection timeouts
- DNS resolution failures
- SSL/TLS handshake failures
- Network partitions
2. Message Validation Errors
- Invalid XML structure
- Schema validation failures
- Missing required fields
- Data type mismatches
3. Business Logic Errors
- Invalid shipment data
- Duplicate records
- Business rule violations
- State conflicts
4. System Errors
- CargoWise service unavailability
- Database locks
- Resource exhaustion
- Authentication failures
Error Response Structure
Error responses from eAdapter are structured XML. A typical validation failure looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<ErrorResponse>
<ErrorCode>VALIDATION_ERROR</ErrorCode>
<ErrorMessage>Shipment ID is required</ErrorMessage>
<ErrorDetails>
<Field>ShipmentId</Field>
<Reason>Required field missing</Reason>
</ErrorDetails>
<Timestamp>2025-11-15T10:30:00Z</Timestamp>
<MessageId>MSG-2025-001</MessageId>
</ErrorResponse>
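In code, it helps to turn this XML into a typed object before deciding how to handle it. Below is a minimal parsing sketch using XDocument; the element names follow the sample above, so adjust them to the payloads your eAdapter instance actually returns.
using System;
using System.Xml.Linq;

public record EAdapterError(
    string ErrorCode,
    string ErrorMessage,
    string Field,
    string Reason,
    string MessageId);

public static class EAdapterErrorParser
{
    // Maps the sample error XML above into a typed object.
    public static EAdapterError Parse(string errorXml)
    {
        var root = XDocument.Parse(errorXml).Root
            ?? throw new ArgumentException("Empty error response", nameof(errorXml));
        var details = root.Element("ErrorDetails");

        return new EAdapterError(
            ErrorCode: (string)root.Element("ErrorCode") ?? "UNKNOWN",
            ErrorMessage: (string)root.Element("ErrorMessage") ?? string.Empty,
            Field: (string)details?.Element("Field"),
            Reason: (string)details?.Element("Reason"),
            MessageId: (string)root.Element("MessageId"));
    }
}
With the response parsed, the error classifier shown later in this guide can match on ErrorCode values such as VALIDATION_ERROR instead of searching the raw XML string.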
Retry Strategy Fundamentals
When to Retry
Not all errors should trigger retries. Understanding which errors are transient versus permanent is crucial:
Transient Errors (Should Retry):
- Network timeouts
- Service unavailable (503)
- Rate limiting (429)
- Database locks
- Temporary service degradation
Permanent Errors (Should Not Retry):
- Validation errors (400)
- Authentication failures (401)
- Authorization failures (403)
- Not found errors (404)
- Business logic violations
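This classification translates directly into a reusable predicate that the retry policies below can consume as their shouldRetry argument. A minimal sketch, assuming .NET 5 or later where HttpRequestException exposes the response StatusCode:
using System;
using System.Net;
using System.Net.Http;
using System.Net.Sockets;

public static class TransientErrorDetector
{
    // True only for failures that are worth retrying.
    public static bool IsTransient(Exception ex) => ex switch
    {
        TimeoutException => true,
        SocketException => true,
        HttpRequestException http => IsTransientStatus(http.StatusCode),
        _ => false
    };

    public static bool IsTransientStatus(HttpStatusCode? statusCode) => statusCode switch
    {
        HttpStatusCode.RequestTimeout => true,     // 408
        HttpStatusCode.TooManyRequests => true,    // 429
        HttpStatusCode.BadGateway => true,         // 502
        HttpStatusCode.ServiceUnavailable => true, // 503
        HttpStatusCode.GatewayTimeout => true,     // 504
        null => true, // no response received: connection-level failure
        _ => false    // 400/401/403/404 and other client errors: fail fast
    };
}
The retry policy introduced in the next section can then be invoked as ExecuteWithRetryAsync(operation, TransientErrorDetector.IsTransient).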
Exponential Backoff Pattern
Exponential backoff spaces out retries so a struggling system is not overwhelmed, while still giving transient faults time to clear before the next attempt:
public class ExponentialBackoffRetryPolicy
{
private readonly int _maxRetries;
private readonly TimeSpan _initialDelay;
private readonly double _backoffMultiplier;
private readonly TimeSpan _maxDelay;
private readonly Random _jitter;
public ExponentialBackoffRetryPolicy(
int maxRetries = 5,
TimeSpan? initialDelay = null,
double backoffMultiplier = 2.0,
TimeSpan? maxDelay = null)
{
_maxRetries = maxRetries;
_initialDelay = initialDelay ?? TimeSpan.FromSeconds(1);
_backoffMultiplier = backoffMultiplier;
_maxDelay = maxDelay ?? TimeSpan.FromMinutes(5);
_jitter = new Random();
}
public async Task<T> ExecuteWithRetryAsync<T>(
Func<Task<T>> operation,
Func<Exception, bool> shouldRetry)
{
TimeSpan delay = _initialDelay;
for (int attempt = 1; ; attempt++)
{
try
{
return await operation();
}
catch (Exception ex) when (shouldRetry(ex) && attempt < _maxRetries)
{
// Add up to 10% jitter to prevent thundering herd
var jitteredDelay = delay.Add(TimeSpan.FromMilliseconds(
_jitter.Next(0, (int)(delay.TotalMilliseconds / 10) + 1)));
await Task.Delay(jitteredDelay);
// Log retry attempt
Console.WriteLine($"Retry attempt {attempt} after {jitteredDelay.TotalSeconds:F1}s");
// Exponential backoff, capped at the configured maximum delay
delay = TimeSpan.FromMilliseconds(Math.Min(
delay.TotalMilliseconds * _backoffMultiplier, _maxDelay.TotalMilliseconds));
}
}
// The last permitted attempt rethrows via the when filter above,
// so the caller always sees the final failure.
}
}
Circuit Breaker Pattern
Circuit breakers prevent cascading failures by stopping requests to failing services:
public enum CircuitState
{
Closed, // Normal operation
Open, // Failing, reject requests
HalfOpen // Testing if service recovered
}
public class CircuitBreaker
{
private CircuitState _state = CircuitState.Closed;
private int _failureCount = 0;
private DateTime _lastFailureTime = DateTime.MinValue;
private readonly int _failureThreshold;
private readonly TimeSpan _timeout;
private readonly object _lock = new object();
public CircuitBreaker(int failureThreshold = 5, TimeSpan? timeout = null)
{
_failureThreshold = failureThreshold;
_timeout = timeout ?? TimeSpan.FromMinutes(1);
}
public async Task<T> ExecuteAsync<T>(Func<Task<T>> operation)
{
if (_state == CircuitState.Open)
{
if (DateTime.UtcNow - _lastFailureTime > _timeout)
{
_state = CircuitState.HalfOpen;
}
else
{
throw new InvalidOperationException("Circuit breaker is open");
}
}
try
{
var result = await operation();
OnSuccess();
return result;
}
catch
{
OnFailure();
throw;
}
}
private void OnSuccess()
{
lock (_lock)
{
_failureCount = 0;
_state = CircuitState.Closed;
}
}
private void OnFailure()
{
lock (_lock)
{
_failureCount++;
_lastFailureTime = DateTime.UtcNow;
// A failed trial call in HalfOpen reopens the circuit immediately;
// otherwise open once the failure threshold is reached
if (_state == CircuitState.HalfOpen || _failureCount >= _failureThreshold)
{
_state = CircuitState.Open;
}
}
}
}
Dead-Letter Queue Implementation
Dead-letter queues (DLQ) capture messages that fail after all retry attempts, enabling manual review and reprocessing:
public class DeadLetterQueue
{
private readonly ILogger<DeadLetterQueue> _logger;
private readonly IMessageStore _messageStore;
private readonly INotificationService _notificationService;
public DeadLetterQueue(
ILogger<DeadLetterQueue> logger,
IMessageStore messageStore,
INotificationService notificationService)
{
_logger = logger;
_messageStore = messageStore;
_notificationService = notificationService;
}
public async Task SendToDlqAsync(
string messageId,
string originalMessage,
Exception exception,
int retryCount,
Dictionary<string, object> metadata = null)
{
var dlqMessage = new DeadLetterMessage
{
MessageId = messageId,
OriginalMessage = originalMessage,
ErrorMessage = exception.Message,
ErrorType = exception.GetType().Name,
StackTrace = exception.StackTrace,
RetryCount = retryCount,
Timestamp = DateTime.UtcNow,
Metadata = metadata ?? new Dictionary<string, object>()
};
await _messageStore.SaveDlqMessageAsync(dlqMessage);
_logger.LogError(
"Message {MessageId} sent to DLQ after {RetryCount} retries. Error: {Error}",
messageId, retryCount, exception.Message);
// Notify operations team
await _notificationService.NotifyDlqMessageAsync(dlqMessage);
}
public async Task<bool> ReprocessDlqMessageAsync(string messageId)
{
var dlqMessage = await _messageStore.GetDlqMessageAsync(messageId);
if (dlqMessage == null)
{
return false;
}
// Attempt reprocessing
try
{
// Your reprocessing logic here
await ProcessMessageAsync(dlqMessage.OriginalMessage);
// Remove from DLQ on success
await _messageStore.DeleteDlqMessageAsync(messageId);
return true;
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to reprocess DLQ message {MessageId}", messageId);
return false;
}
}
}
Idempotency Implementation
Idempotency ensures that processing the same message multiple times produces the same result, critical for retry scenarios:
public class IdempotencyHandler
{
private readonly IIdempotencyStore _idempotencyStore;
private readonly ILogger<IdempotencyHandler> _logger;
public IdempotencyHandler(
IIdempotencyStore idempotencyStore,
ILogger<IdempotencyHandler> logger)
{
_idempotencyStore = idempotencyStore;
_logger = logger;
}
public async Task<T> ExecuteIdempotentAsync<T>(
string messageId,
Func<Task<T>> operation)
{
// Check if message was already processed
var existingResult = await _idempotencyStore.GetResultAsync<T>(messageId);
if (existingResult != null)
{
_logger.LogInformation("Message {MessageId} already processed, returning cached result", messageId);
return existingResult;
}
// Process message
try
{
var result = await operation();
// Store result for idempotency
await _idempotencyStore.StoreResultAsync(messageId, result);
return result;
}
catch (Exception ex)
{
_logger.LogError(ex, "Error processing message {MessageId}", messageId);
throw;
}
}
public string GenerateIdempotencyKey(string messageId, string messageContent)
{
// Generate deterministic key from message content
using var sha256 = SHA256.Create();
var hash = sha256.ComputeHash(Encoding.UTF8.GetBytes($"{messageId}:{messageContent}"));
return Convert.ToBase64String(hash);
}
}
Complete Error Handling Pipeline
Here's a complete error handling pipeline that combines all patterns:
public class CargoWiseEAdapterErrorHandler
{
private readonly ExponentialBackoffRetryPolicy _retryPolicy;
private readonly CircuitBreaker _circuitBreaker;
private readonly DeadLetterQueue _dlq;
private readonly IdempotencyHandler _idempotencyHandler;
private readonly ICargoWiseEAdapterClient _cargoWiseClient; // eAdapter send client (interface name illustrative)
private readonly ILogger<CargoWiseEAdapterErrorHandler> _logger;
public CargoWiseEAdapterErrorHandler(
ExponentialBackoffRetryPolicy retryPolicy,
CircuitBreaker circuitBreaker,
DeadLetterQueue dlq,
IdempotencyHandler idempotencyHandler,
ICargoWiseEAdapterClient cargoWiseClient,
ILogger<CargoWiseEAdapterErrorHandler> logger)
{
_retryPolicy = retryPolicy;
_circuitBreaker = circuitBreaker;
_dlq = dlq;
_idempotencyHandler = idempotencyHandler;
_cargoWiseClient = cargoWiseClient;
_logger = logger;
}
public async Task<CargoWiseResponse> ProcessMessageAsync(
string messageId,
string messageContent)
{
return await _idempotencyHandler.ExecuteIdempotentAsync(
messageId,
async () =>
{
return await _circuitBreaker.ExecuteAsync(async () =>
{
return await _retryPolicy.ExecuteWithRetryAsync(
async () => await SendToCargoWiseAsync(messageContent),
ShouldRetry);
});
});
}
private async Task<CargoWiseResponse> SendToCargoWiseAsync(string messageContent)
{
try
{
// Your eAdapter send logic
var response = await _cargoWiseClient.SendMessageAsync(messageContent);
return response;
}
catch (HttpRequestException ex) when (IsTransientError(ex))
{
_logger.LogWarning("Transient error sending message: {Error}", ex.Message);
throw; // Will be caught by retry policy
}
catch (Exception ex)
{
_logger.LogError(ex, "Permanent error sending message");
throw;
}
}
private bool ShouldRetry(Exception ex)
{
return ex switch
{
HttpRequestException httpEx => IsTransientHttpError(httpEx),
TimeoutException => true,
SocketException => true,
_ => false
};
}
private bool IsTransientHttpError(HttpRequestException ex)
{
// HttpRequestException exposes the response status code on .NET 5+;
// 408, 429, 502, 503 and 504 are typically retryable, and a null
// status code usually means a connection-level failure
return ex.StatusCode is null
or HttpStatusCode.RequestTimeout
or HttpStatusCode.TooManyRequests
or HttpStatusCode.BadGateway
or HttpStatusCode.ServiceUnavailable
or HttpStatusCode.GatewayTimeout;
}
private bool IsTransientError(Exception ex)
{
return ex is TimeoutException ||
ex is SocketException ||
(ex is HttpRequestException httpEx && IsTransientHttpError(httpEx));
}
}
Error Classification and Handling
Error Classification Service
Classify errors to determine appropriate handling:
public enum ErrorCategory
{
Transient, // Network, timeouts - retry
Validation, // Invalid data - don't retry
Business, // Business rule violation - don't retry
Authentication, // Auth failure - don't retry
System // System error - may retry
}
public class ErrorClassifier
{
public ErrorCategory ClassifyError(Exception exception, string errorResponse = null)
{
return exception switch
{
HttpRequestException httpEx => ClassifyHttpError(httpEx, errorResponse),
TimeoutException => ErrorCategory.Transient,
SocketException => ErrorCategory.Transient,
XmlException => ErrorCategory.Validation,
ValidationException => ErrorCategory.Validation,
UnauthorizedAccessException => ErrorCategory.Authentication,
_ => ErrorCategory.System
};
}
private ErrorCategory ClassifyHttpError(HttpRequestException ex, string errorResponse)
{
// Parse error response if available
if (errorResponse?.Contains("VALIDATION_ERROR") == true)
return ErrorCategory.Validation;
if (errorResponse?.Contains("AUTHENTICATION_FAILED") == true)
return ErrorCategory.Authentication;
// Default to transient for HTTP errors
return ErrorCategory.Transient;
}
}
Monitoring and Observability
Structured Logging
Implement comprehensive logging for error tracking:
public class CargoWiseMessageProcessor
{
private readonly ILogger<CargoWiseMessageProcessor> _logger;
public async Task ProcessMessageAsync(CargoWiseMessage message)
{
using var scope = _logger.BeginScope(new Dictionary<string, object>
{
["MessageId"] = message.MessageId,
["MessageType"] = message.MessageType,
["CorrelationId"] = message.CorrelationId
});
_logger.LogInformation(
"Processing CargoWise message {MessageId} of type {MessageType}",
message.MessageId, message.MessageType);
try
{
await ProcessMessageInternalAsync(message);
_logger.LogInformation(
"Successfully processed message {MessageId}",
message.MessageId);
}
catch (Exception ex)
{
_logger.LogError(ex,
"Failed to process message {MessageId} after {RetryCount} retries",
message.MessageId, message.RetryCount);
throw;
}
}
}
Metrics Collection
Track key metrics for error handling:
public class ErrorHandlingMetrics
{
private readonly IMetricsCollector _metrics;
public ErrorHandlingMetrics(IMetricsCollector metrics)
{
_metrics = metrics;
}
public void RecordRetry(string messageType, int attemptNumber)
{
_metrics.IncrementCounter("eadapter.retry.count", new Dictionary<string, string>
{
["message_type"] = messageType,
["attempt"] = attemptNumber.ToString()
});
}
public void RecordError(string messageType, ErrorCategory category)
{
_metrics.IncrementCounter("eadapter.error.count", new Dictionary<string, string>
{
["message_type"] = messageType,
["category"] = category.ToString()
});
}
public void RecordDlqMessage(string messageType, string errorType)
{
_metrics.IncrementCounter("eadapter.dlq.count", new Dictionary<string, string>
{
["message_type"] = messageType,
["error_type"] = errorType
});
}
public void RecordCircuitBreakerState(CircuitState state)
{
_metrics.SetGauge("eadapter.circuit_breaker.state",
state == CircuitState.Open ? 1 : 0);
}
}
Production Best Practices
1. Implement Comprehensive Logging
Log all error scenarios with sufficient context:
- Message IDs and correlation IDs
- Retry attempt numbers
- Error types and stack traces
- Processing timestamps
- Business context (shipment IDs, etc.)
2. Set Appropriate Retry Limits
- Network errors: 5-10 retries with exponential backoff
- Validation errors: 0 retries (immediate failure)
- Business errors: 0 retries (immediate failure)
- System errors: 3-5 retries with longer delays
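One way to keep these limits explicit and auditable is a small per-category configuration that your retry policy consults before attempting another send. The RetrySettings type and the numbers below are illustrative defaults, not CargoWise-mandated values; tune them against your SLA.
using System;
using System.Collections.Generic;

public sealed record RetrySettings(int MaxRetries, TimeSpan InitialDelay);

public static class RetryLimits
{
    // Illustrative defaults mirroring the guidance above.
    private static readonly Dictionary<ErrorCategory, RetrySettings> ByCategory = new()
    {
        [ErrorCategory.Transient]      = new RetrySettings(8, TimeSpan.FromSeconds(1)),
        [ErrorCategory.System]         = new RetrySettings(4, TimeSpan.FromSeconds(5)),
        [ErrorCategory.Validation]     = new RetrySettings(0, TimeSpan.Zero),
        [ErrorCategory.Business]       = new RetrySettings(0, TimeSpan.Zero),
        [ErrorCategory.Authentication] = new RetrySettings(0, TimeSpan.Zero)
    };

    public static RetrySettings For(ErrorCategory category) =>
        ByCategory.TryGetValue(category, out var settings)
            ? settings
            : new RetrySettings(0, TimeSpan.Zero); // unknown categories fail fast
}
The ExponentialBackoffRetryPolicy shown earlier can then be constructed from RetryLimits.For(category) rather than a single global maximum.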
3. Use Dead-Letter Queues
Always implement DLQ for messages that fail after all retries:
- Enables manual review and correction
- Prevents message loss
- Allows reprocessing after fixes
4. Implement Idempotency
Ensure message processing is idempotent:
- Use message IDs for deduplication
- Store processing results
- Return cached results for duplicate messages
5. Monitor Error Rates
Set up alerts for:
- High error rates (>5% of messages)
- Circuit breaker openings
- DLQ message accumulation
- Retry exhaustion
6. Test Error Scenarios
Comprehensive testing should include:
- Network failures
- Timeout scenarios
- Invalid message formats
- Service unavailability
- Partial failures
Testing Error Handling
Unit Tests
[Fact]
public async Task ShouldRetryOnTransientError()
{
var retryPolicy = new ExponentialBackoffRetryPolicy(maxRetries: 3);
var attemptCount = 0;
var result = await retryPolicy.ExecuteWithRetryAsync(
async () =>
{
attemptCount++;
if (attemptCount < 3)
throw new TimeoutException("Transient error");
return "Success";
},
ex => ex is TimeoutException);
Assert.Equal(3, attemptCount);
Assert.Equal("Success", result);
}
[Fact]
public async Task ShouldNotRetryOnValidationError()
{
var retryPolicy = new ExponentialBackoffRetryPolicy(maxRetries: 3);
var attemptCount = 0;
await Assert.ThrowsAsync<ValidationException>(async () =>
{
await retryPolicy.ExecuteWithRetryAsync<string>(
async () =>
{
attemptCount++;
throw new ValidationException("Invalid data");
},
ex => ex is TimeoutException); // Only retry timeouts
});
Assert.Equal(1, attemptCount); // Should not retry
}
Integration Tests
[Fact]
public async Task ShouldSendToDlqAfterMaxRetries()
{
var dlq = new DeadLetterQueue(/* ... */);
var processor = new CargoWiseMessageProcessor(/* ... */);
// Simulate persistent failure
var message = CreateFailingMessage();
await Assert.ThrowsAsync<Exception>(() => processor.ProcessMessageAsync(message));
// Verify message in DLQ
var dlqMessage = await dlq.GetMessageAsync(message.MessageId);
Assert.NotNull(dlqMessage);
Assert.Equal(5, dlqMessage.RetryCount);
}
Advanced Error Handling Patterns
Partial Failure Handling
Handle scenarios where only part of a message fails:
public class PartialFailureHandler
{
public async Task<ProcessingResult> ProcessMessageWithPartialFailureAsync(
CargoWiseMessage message)
{
var results = new List<ItemProcessingResult>();
var failedItems = new List<string>();
foreach (var item in message.Items)
{
try
{
await ProcessItemAsync(item);
results.Add(new ItemProcessingResult
{
ItemId = item.Id,
Success = true
});
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to process item {ItemId}", item.Id);
failedItems.Add(item.Id);
results.Add(new ItemProcessingResult
{
ItemId = item.Id,
Success = false,
Error = ex.Message
});
}
}
return new ProcessingResult
{
TotalItems = message.Items.Count,
SuccessfulItems = results.Count(r => r.Success),
FailedItems = failedItems,
Results = results
};
}
}
Compensating Transactions
Implement compensating transactions for rollback scenarios:
public class CompensatingTransactionHandler
{
private readonly List<CompensationAction> _compensationActions = new();
public async Task ProcessWithCompensationAsync(Func<Task> operation)
{
_compensationActions.Clear();
try
{
await operation();
}
catch (Exception ex)
{
_logger.LogError(ex, "Operation failed, executing compensation");
// Execute compensation actions in reverse order
for (int i = _compensationActions.Count - 1; i >= 0; i--)
{
try
{
await _compensationActions[i].CompensateAsync();
}
catch (Exception compEx)
{
_logger.LogError(compEx, "Compensation action failed");
}
}
throw;
}
}
public void RegisterCompensation(Func<Task> compensationAction)
{
_compensationActions.Add(new CompensationAction
{
CompensateAsync = compensationAction
});
}
}
Real-World Scenarios
Scenario 1: Network Partition Recovery
Handle network partitions gracefully:
public class NetworkPartitionHandler
{
private readonly IHealthChecker _healthChecker;
private readonly IMessageStore _messageStore;
public async Task HandleNetworkPartitionAsync()
{
var hasConnectivity = await _healthChecker.CheckConnectivityAsync();
if (!hasConnectivity)
{
_logger.LogWarning("Network partition detected, storing messages locally");
// Store messages locally until connectivity restored
await _messageStore.StorePendingMessagesAsync();
// Start monitoring for connectivity restoration
_ = Task.Run(async () => await MonitorConnectivityRestorationAsync());
}
}
private async Task MonitorConnectivityRestorationAsync()
{
while (true)
{
await Task.Delay(TimeSpan.FromSeconds(30));
if (await _healthChecker.CheckConnectivityAsync())
{
_logger.LogInformation("Connectivity restored, processing pending messages");
await ProcessPendingMessagesAsync();
break;
}
}
}
}
Scenario 2: High-Volume Error Handling
Handle errors during high-volume periods:
public class HighVolumeErrorHandler
{
private readonly SemaphoreSlim _concurrencyLimiter;
private readonly IErrorRateMonitor _errorRateMonitor;
public HighVolumeErrorHandler(IErrorRateMonitor errorRateMonitor, int maxConcurrency = 10)
{
_errorRateMonitor = errorRateMonitor;
_concurrencyLimiter = new SemaphoreSlim(maxConcurrency);
}
public async Task ProcessWithBackpressureAsync(
IEnumerable<CargoWiseMessage> messages)
{
var errorRate = await _errorRateMonitor.GetErrorRateAsync();
if (errorRate > 0.1) // 10% error rate
{
_logger.LogWarning("High error rate detected: {ErrorRate}, reducing concurrency", errorRate);
await ReduceConcurrencyAsync();
}
var tasks = messages.Select(async message =>
{
await _concurrencyLimiter.WaitAsync();
try
{
await ProcessMessageAsync(message);
}
finally
{
_concurrencyLimiter.Release();
}
});
await Task.WhenAll(tasks);
}
}
Scenario 3: Message Validation Cascade
Handle validation errors that cascade across related messages:
public class ValidationCascadeHandler
{
public async Task HandleValidationCascadeAsync(
CargoWiseMessage message,
ValidationError error)
{
// Identify related messages that may be affected
var relatedMessages = await FindRelatedMessagesAsync(message);
foreach (var relatedMessage in relatedMessages)
{
// Mark related messages for revalidation
await MarkForRevalidationAsync(relatedMessage);
// If message is in-flight, attempt to cancel
if (await IsInFlightAsync(relatedMessage))
{
await CancelMessageAsync(relatedMessage);
}
}
// Log cascade effect
_logger.LogWarning(
"Validation error in message {MessageId} affected {Count} related messages",
message.MessageId, relatedMessages.Count);
}
}
Performance Optimization
Optimizing Retry Performance
Optimize retry logic for performance:
public class OptimizedRetryPolicy
{
private readonly ConcurrentDictionary<string, RetryState> _retryStates;
private readonly Timer _cleanupTimer;
public OptimizedRetryPolicy()
{
_retryStates = new ConcurrentDictionary<string, RetryState>();
// Cleanup old retry states periodically
_cleanupTimer = new Timer(
_ => CleanupOldStates(),
null,
TimeSpan.FromMinutes(5),
TimeSpan.FromMinutes(5));
}
public async Task<T> ExecuteWithOptimizedRetryAsync<T>(
string operationKey,
Func<Task<T>> operation)
{
var state = _retryStates.GetOrAdd(operationKey, _ => new RetryState());
// Use adaptive retry based on success rate
var delay = CalculateAdaptiveDelay(state);
try
{
var result = await operation();
state.RecordSuccess();
return result;
}
catch (Exception ex) when (ShouldRetry(ex, state))
{
state.RecordFailure();
await Task.Delay(delay);
return await ExecuteWithOptimizedRetryAsync(operationKey, operation);
}
}
private TimeSpan CalculateAdaptiveDelay(RetryState state)
{
// Adaptive delay based on success rate
var successRate = state.SuccessRate;
var baseDelay = TimeSpan.FromSeconds(1);
if (successRate < 0.5)
{
return baseDelay * 4; // Longer delay for low success rate
}
else if (successRate < 0.8)
{
return baseDelay * 2; // Medium delay
}
return baseDelay; // Short delay for high success rate
}
}
Batch Error Processing
Process errors in batches for efficiency:
public class BatchErrorProcessor
{
private readonly int _batchSize;
private readonly TimeSpan _batchTimeout;
public async Task ProcessErrorsInBatchesAsync(
IEnumerable<FailedMessage> failedMessages)
{
var batches = failedMessages
.Chunk(_batchSize) // Enumerable.Chunk (built in since .NET 6)
.Select(batch => ProcessBatchAsync(batch));
await Task.WhenAll(batches);
}
private async Task ProcessBatchAsync(IEnumerable<FailedMessage> batch)
{
var tasks = batch.Select(async message =>
{
try
{
await RetryMessageAsync(message);
}
catch (Exception ex)
{
await SendToDlqAsync(message, ex);
}
});
await Task.WhenAll(tasks);
}
}
Security Considerations
Secure Error Messages
Avoid exposing sensitive information in errors:
public class SecureErrorHandler
{
public Exception SanitizeException(Exception ex)
{
// Remove sensitive data from exception messages
var sanitizedMessage = SanitizeMessage(ex.Message);
// Create new exception without sensitive data
return new Exception(sanitizedMessage)
{
Source = ex.Source,
HelpLink = ex.HelpLink
};
}
private string SanitizeMessage(string message)
{
// Remove API keys, passwords, connection strings
var patterns = new[]
{
@"(api[_-]?key\s*[:=]\s*)([^\s]+)",
@"(password\s*[:=]\s*)([^\s]+)",
@"(connection[_-]?string\s*[:=]\s*)([^\s]+)"
};
var sanitized = message;
foreach (var pattern in patterns)
{
sanitized = Regex.Replace(
sanitized,
pattern,
"$1***REDACTED***",
RegexOptions.IgnoreCase);
}
return sanitized;
}
}
Error Logging Security
Secure error logging practices:
public class SecureErrorLogger
{
private readonly ILogger _logger;
private readonly IDataMasker _dataMasker;
public void LogErrorSecurely(Exception ex, object context)
{
// Mask sensitive data in context
var sanitizedContext = _dataMasker.MaskSensitiveData(context);
// Log with sanitized context
_logger.LogError(ex,
"Error occurred with context: {Context}",
JsonSerializer.Serialize(sanitizedContext));
}
}
Troubleshooting Guide
Common Issues and Solutions
Issue 1: Messages Stuck in Retry Loop
Symptoms:
- Messages repeatedly retrying
- High CPU usage
- No messages reaching DLQ
Solution:
// Add maximum retry limit
public class RetryLimiter
{
private const int MaxRetries = 10;
public async Task<T> ExecuteWithLimitAsync<T>(
Func<Task<T>> operation,
string messageId)
{
var retryCount = await GetRetryCountAsync(messageId);
if (retryCount >= MaxRetries)
{
await SendToDlqAsync(messageId, "Max retries exceeded");
throw new MaxRetriesExceededException();
}
try
{
var result = await operation();
await ResetRetryCountAsync(messageId);
return result;
}
catch (Exception ex)
{
await IncrementRetryCountAsync(messageId);
throw;
}
}
}
Issue 2: Circuit Breaker Not Opening
Symptoms:
- Circuit breaker stays closed despite failures
- Errors not being caught
Solution:
// Ensure proper error classification
public class ImprovedCircuitBreaker
{
public void RecordFailure(Exception ex)
{
// Only count retryable errors
if (IsRetryableError(ex))
{
_failureCount++;
if (_failureCount >= _threshold)
{
OpenCircuit();
}
}
}
private bool IsRetryableError(Exception ex)
{
// Classify by exception type and status code rather than message text
return ex is TimeoutException ||
ex is SocketException ||
(ex is HttpRequestException httpEx &&
httpEx.StatusCode is null
or HttpStatusCode.RequestTimeout
or HttpStatusCode.BadGateway
or HttpStatusCode.ServiceUnavailable
or HttpStatusCode.GatewayTimeout);
}
}
Issue 3: Dead-Letter Queue Growing Rapidly
Symptoms:
- DLQ accumulating messages quickly
- No reprocessing happening
Solution:
// Implement DLQ monitoring and auto-reprocessing
public class DlqMonitor
{
public async Task MonitorAndReprocessAsync()
{
var dlqCount = await GetDlqMessageCountAsync();
if (dlqCount > 100)
{
_logger.LogWarning("DLQ has {Count} messages, starting reprocessing", dlqCount);
// Analyze common errors
var commonErrors = await AnalyzeCommonErrorsAsync();
// Fix root causes if possible
await FixRootCausesAsync(commonErrors);
// Reprocess messages
await ReprocessDlqMessagesAsync();
}
}
}
Case Studies
Case Study 1: E-commerce Integration
Challenge: An e-commerce platform needed to integrate with CargoWise for shipment processing. The integration experienced high failure rates during peak shopping periods.
Solution: Implemented adaptive retry with backpressure:
public class EcommerceIntegrationHandler
{
private readonly AdaptiveRetryPolicy _retryPolicy;
private readonly BackpressureController _backpressure;
public async Task<CargoWiseResponse> ProcessShipmentAsync(Shipment shipment)
{
// Check system load
var load = await GetSystemLoadAsync();
if (load > 0.8)
{
await _backpressure.ThrottleAsync();
}
return await _retryPolicy.ExecuteWithRetryAsync(
() => SendToCargoWiseAsync(shipment));
}
}
Results:
- 95% reduction in failed shipments
- 60% improvement in throughput
- Zero data loss during peak periods
Case Study 2: Multi-Region Deployment
Challenge: A logistics company needed to handle CargoWise integration across multiple regions with varying network conditions.
Solution: Implemented region-aware error handling:
public class RegionalErrorHandler
{
private readonly Dictionary<string, RegionalRetryPolicy> _regionalPolicies;
public async Task<CargoWiseResponse> ProcessWithRegionalPolicyAsync(
CargoWiseMessage message,
string region)
{
var policy = _regionalPolicies[region];
// Adjust retry strategy based on region
return await policy.ExecuteWithRetryAsync(
() => SendToCargoWiseAsync(message));
}
}
Results:
- Region-specific retry strategies
- Improved reliability in high-latency regions
- Better error recovery
Migration Guide
Migrating from Basic to Advanced Error Handling
Step 1: Add error classification:
// Before
try
{
await SendMessageAsync(message);
}
catch (Exception ex)
{
_logger.LogError(ex, "Error sending message");
throw;
}
// After
try
{
await SendMessageAsync(message);
}
catch (Exception ex)
{
var category = _errorClassifier.ClassifyError(ex);
if (category == ErrorCategory.Transient)
{
await RetryAsync(message);
}
else
{
await SendToDlqAsync(message, ex);
}
}
Step 2: Implement retry logic:
// Add retry policy
builder.Services.AddSingleton<IRetryPolicy, ExponentialBackoffRetryPolicy>();
// Use in message processor
public async Task<CargoWiseResponse> ProcessMessageAsync(CargoWiseMessage message)
{
return await _retryPolicy.ExecuteWithRetryAsync(
() => SendToCargoWiseAsync(message));
}
Step 3: Add circuit breaker:
builder.Services.AddSingleton<ICircuitBreaker, CircuitBreaker>();
public async Task<CargoWiseResponse> ProcessMessageAsync(CargoWiseMessage message)
{
return await _circuitBreaker.ExecuteAsync(
() => _retryPolicy.ExecuteWithRetryAsync(
() => SendToCargoWiseAsync(message)));
}
Extended FAQ
Q: How do I determine the optimal retry count?
A: Optimal retry count depends on:
- Message criticality
- System reliability
- Timeout requirements
- Business SLA
Start with 3-5 retries and adjust based on monitoring data.
Q: Should I retry all errors?
A: No. Only retry transient errors:
- Network timeouts
- Service unavailable (503)
- Rate limiting (429)
- Temporary database locks
Don't retry:
- Validation errors (400)
- Authentication failures (401)
- Not found (404)
- Business rule violations
Q: How do I handle message ordering with retries?
A: Use message sessions or sequence numbers:
public class OrderedMessageProcessor
{
public async Task ProcessOrderedMessageAsync(
CargoWiseMessage message)
{
// Wait for previous messages to complete
await WaitForSequenceAsync(message.SequenceNumber - 1);
// Process message
await ProcessMessageAsync(message);
// Mark sequence as complete
await MarkSequenceCompleteAsync(message.SequenceNumber);
}
}
Q: What's the difference between retry and DLQ?
A:
- Retry: Automatic retry of transient failures
- DLQ: Manual review of persistent failures
Use retry for errors that might succeed on retry. Use DLQ for errors that need human intervention.
Q: How do I test error handling?
A: Use fault injection:
[Fact]
public async Task ShouldHandleNetworkFailure()
{
// Inject network failure
_networkSimulator.SimulateFailure();
var result = await _processor.ProcessMessageAsync(message);
// Verify retry occurred
Assert.True(_retryMonitor.RetryOccurred);
// Verify eventual success or DLQ
Assert.True(result.Success || _dlq.Contains(message));
}
Q: How do I monitor error handling effectiveness?
A: Track key metrics:
- Retry success rate
- DLQ growth rate
- Average retry count
- Error recovery time
- Circuit breaker state changes
Q: Can I use different retry strategies for different message types?
A: Yes, implement message-type-specific policies:
public class MessageTypeRetryPolicy
{
private readonly Dictionary<string, IRetryPolicy> _policies;
public IRetryPolicy GetPolicyForMessageType(string messageType)
{
return _policies.GetValueOrDefault(
messageType,
_defaultPolicy);
}
}
Best Practices Summary
- Error Classification First: Always classify errors before handling
- Exponential Backoff: Use exponential backoff with jitter
- Circuit Breakers: Implement circuit breakers for external dependencies
- Dead-Letter Queues: Always have DLQ for failed messages
- Idempotency: Ensure all operations are idempotent
- Comprehensive Logging: Log all error scenarios with context
- Monitoring: Set up alerts for error conditions
- Testing: Test all error scenarios thoroughly
- Documentation: Document error handling strategies
- Review: Regularly review and optimize error handling
Conclusion
Implementing robust error handling and retry patterns for CargoWise eAdapter integrations is essential for production reliability. By combining exponential backoff, circuit breakers, dead-letter queues, and idempotency checks, you can build integrations that gracefully handle failures while maintaining data integrity.
Key Takeaways:
- Classify Errors: Distinguish between transient and permanent errors
- Implement Retry Logic: Use exponential backoff with jitter
- Use Circuit Breakers: Prevent cascading failures
- Dead-Letter Queues: Capture failed messages for review
- Idempotency: Ensure safe retries
- Comprehensive Logging: Track all error scenarios
- Monitor Metrics: Set up alerts for error conditions
- Performance Optimization: Optimize retry logic for efficiency
- Security: Sanitize error messages and logs
- Testing: Test all error scenarios
Next Steps:
- Implement error classification in your eAdapter integration
- Add retry policies with appropriate backoff strategies
- Set up dead-letter queue processing
- Configure monitoring and alerting
- Test error scenarios thoroughly
- Review and optimize based on production data
For more CargoWise integration guidance, explore our CargoWise eAdapter Integration Patterns guide or contact our team for enterprise integration support.