CargoWise eAdapter Error Handling & Retry Patterns: Production-Ready Strategies
Building production-ready CargoWise eAdapter integrations requires robust error handling and retry mechanisms. Unlike a simple REST API call, eAdapter's message-based architecture surfaces failures at several layers: network faults, message validation errors, business logic rejections, and system timeouts. Without proper error handling, your integration can lose critical shipment data, create duplicate records, or fail silently.
This guide covers production-tested error handling and retry patterns for CargoWise eAdapter integrations. You'll learn how to implement exponential backoff, circuit breakers, dead-letter queues, idempotency checks, and error recovery strategies that keep your logistics integrations reliable when failures inevitably occur.
Understanding eAdapter Error Scenarios
Common Error Types
CargoWise eAdapter integrations face several categories of errors, each requiring different handling strategies:
1. Network Errors
- Connection timeouts
- DNS resolution failures
- SSL/TLS handshake failures
- Network partitions
2. Message Validation Errors
- Invalid XML structure
- Schema validation failures
- Missing required fields
- Data type mismatches
3. Business Logic Errors
- Invalid shipment data
- Duplicate records
- Business rule violations
- State conflicts
4. System Errors
- CargoWise service unavailability
- Database locks
- Resource exhaustion
- Authentication failures
Error Response Structure
Error responses from eAdapter are structured XML. A typical validation failure looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<ErrorResponse>
<ErrorCode>VALIDATION_ERROR</ErrorCode>
<ErrorMessage>Shipment ID is required</ErrorMessage>
<ErrorDetails>
<Field>ShipmentId</Field>
<Reason>Required field missing</Reason>
</ErrorDetails>
<Timestamp>2025-11-15T10:30:00Z</Timestamp>
<MessageId>MSG-2025-001</MessageId>
</ErrorResponse>
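In code, it helps to turn this XML into a typed object before deciding how to handle it. Below is a minimal parsing sketch using XDocument; the element names follow the sample above, so adjust them to the payloads your eAdapter instance actually returns.
using System;
using System.Xml.Linq;

public record EAdapterError(
    string ErrorCode,
    string ErrorMessage,
    string Field,
    string Reason,
    string MessageId);

public static class EAdapterErrorParser
{
    // Maps the sample error XML above into a typed object.
    public static EAdapterError Parse(string errorXml)
    {
        var root = XDocument.Parse(errorXml).Root
            ?? throw new ArgumentException("Empty error response", nameof(errorXml));
        var details = root.Element("ErrorDetails");

        return new EAdapterError(
            ErrorCode: (string)root.Element("ErrorCode") ?? "UNKNOWN",
            ErrorMessage: (string)root.Element("ErrorMessage") ?? string.Empty,
            Field: (string)details?.Element("Field"),
            Reason: (string)details?.Element("Reason"),
            MessageId: (string)root.Element("MessageId"));
    }
}
With the response parsed, the error classifier shown later in this guide can match on ErrorCode values such as VALIDATION_ERROR instead of searching the raw XML string.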
Retry Strategy Fundamentals
When to Retry
Not all errors should trigger retries. Understanding which errors are transient versus permanent is crucial:
Transient Errors (Should Retry):
- Network timeouts
- Service unavailable (503)
- Rate limiting (429)
- Database locks
- Temporary service degradation
Permanent Errors (Should Not Retry):
- Validation errors (400)
- Authentication failures (401)
- Authorization failures (403)
- Not found errors (404)
- Business logic violations
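This classification translates directly into a reusable predicate that the retry policies below can consume as their shouldRetry argument. A minimal sketch, assuming .NET 5 or later where HttpRequestException exposes the response StatusCode:
using System;
using System.Net;
using System.Net.Http;
using System.Net.Sockets;

public static class TransientErrorDetector
{
    // True only for failures that are worth retrying.
    public static bool IsTransient(Exception ex) => ex switch
    {
        TimeoutException => true,
        SocketException => true,
        HttpRequestException http => IsTransientStatus(http.StatusCode),
        _ => false
    };

    public static bool IsTransientStatus(HttpStatusCode? statusCode) => statusCode switch
    {
        HttpStatusCode.RequestTimeout => true,     // 408
        HttpStatusCode.TooManyRequests => true,    // 429
        HttpStatusCode.BadGateway => true,         // 502
        HttpStatusCode.ServiceUnavailable => true, // 503
        HttpStatusCode.GatewayTimeout => true,     // 504
        null => true, // no response received: connection-level failure
        _ => false    // 400/401/403/404 and other client errors: fail fast
    };
}
The retry policy introduced in the next section can then be invoked as ExecuteWithRetryAsync(operation, TransientErrorDetector.IsTransient).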
Exponential Backoff Pattern
Exponential backoff spaces out retries so a struggling system is not overwhelmed, while still giving transient faults time to clear before the next attempt:
public class ExponentialBackoffRetryPolicy
{
private readonly int _maxRetries;
private readonly TimeSpan _initialDelay;
private readonly double _backoffMultiplier;
private readonly TimeSpan _maxDelay;
private readonly Random _jitter;
public ExponentialBackoffRetryPolicy(
int maxRetries = 5,
TimeSpan? initialDelay = null,
double backoffMultiplier = 2.0,
TimeSpan? maxDelay = null)
{
_maxRetries = maxRetries;
_initialDelay = initialDelay ?? TimeSpan.FromSeconds(1);
_backoffMultiplier = backoffMultiplier;
_maxDelay = maxDelay ?? TimeSpan.FromMinutes(5);
_jitter = new Random();
}
public async Task<T> ExecuteWithRetryAsync<T>(
Func<Task<T>> operation,
Func<Exception, bool> shouldRetry)
{
TimeSpan delay = _initialDelay;
for (int attempt = 1; ; attempt++)
{
try
{
return await operation();
}
catch (Exception ex) when (shouldRetry(ex) && attempt < _maxRetries)
{
// Add up to 10% jitter to prevent thundering herd
var jitteredDelay = delay.Add(TimeSpan.FromMilliseconds(
_jitter.Next(0, (int)(delay.TotalMilliseconds / 10) + 1)));
await Task.Delay(jitteredDelay);
// Log retry attempt
Console.WriteLine($"Retry attempt {attempt} after {jitteredDelay.TotalSeconds:F1}s");
// Exponential backoff, capped at the configured maximum delay
delay = TimeSpan.FromMilliseconds(Math.Min(
delay.TotalMilliseconds * _backoffMultiplier, _maxDelay.TotalMilliseconds));
}
}
// The last permitted attempt rethrows via the when filter above,
// so the caller always sees the final failure.
}
}
Circuit Breaker Pattern
Circuit breakers prevent cascading failures by stopping requests to failing services:
public enum CircuitState
{
Closed, // Normal operation
Open, // Failing, reject requests
HalfOpen // Testing if service recovered
}
public class CircuitBreaker
{
private CircuitState _state = CircuitState.Closed;
private int _failureCount = 0;
private DateTime _lastFailureTime = DateTime.MinValue;
private readonly int _failureThreshold;
private readonly TimeSpan _timeout;
private readonly object _lock = new object();
public CircuitBreaker(int failureThreshold = 5, TimeSpan? timeout = null)
{
_failureThreshold = failureThreshold;
_timeout = timeout ?? TimeSpan.FromMinutes(1);
}
public async Task<T> ExecuteAsync<T>(Func<Task<T>> operation)
{
if (_state == CircuitState.Open)
{
if (DateTime.UtcNow - _lastFailureTime > _timeout)
{
_state = CircuitState.HalfOpen;
}
else
{
throw new InvalidOperationException("Circuit breaker is open");
}
}
try
{
var result = await operation();
OnSuccess();
return result;
}
catch
{
OnFailure();
throw;
}
}
private void OnSuccess()
{
lock (_lock)
{
_failureCount = 0;
_state = CircuitState.Closed;
}
}
private void OnFailure()
{
lock (_lock)
{
_failureCount++;
_lastFailureTime = DateTime.UtcNow;
// A failed trial call in HalfOpen reopens the circuit immediately;
// otherwise open once the failure threshold is reached
if (_state == CircuitState.HalfOpen || _failureCount >= _failureThreshold)
{
_state = CircuitState.Open;
}
}
}
}
Dead-Letter Queue Implementation
Dead-letter queues (DLQ) capture messages that fail after all retry attempts, enabling manual review and reprocessing:
public class DeadLetterQueue
{
private readonly ILogger<DeadLetterQueue> _logger;
private readonly IMessageStore _messageStore;
private readonly INotificationService _notificationService;
public DeadLetterQueue(
ILogger<DeadLetterQueue> logger,
IMessageStore messageStore,
INotificationService notificationService)
{
_logger = logger;
_messageStore = messageStore;
_notificationService = notificationService;
}
public async Task SendToDlqAsync(
string messageId,
string originalMessage,
Exception exception,
int retryCount,
Dictionary<string, object> metadata = null)
{
var dlqMessage = new DeadLetterMessage
{
MessageId = messageId,
OriginalMessage = originalMessage,
ErrorMessage = exception.Message,
ErrorType = exception.GetType().Name,
StackTrace = exception.StackTrace,
RetryCount = retryCount,
Timestamp = DateTime.UtcNow,
Metadata = metadata ?? new Dictionary<string, object>()
};
await _messageStore.SaveDlqMessageAsync(dlqMessage);
_logger.LogError(
"Message {MessageId} sent to DLQ after {RetryCount} retries. Error: {Error}",
messageId, retryCount, exception.Message);
// Notify operations team
await _notificationService.NotifyDlqMessageAsync(dlqMessage);
}
public async Task<bool> ReprocessDlqMessageAsync(string messageId)
{
var dlqMessage = await _messageStore.GetDlqMessageAsync(messageId);
if (dlqMessage == null)
{
return false;
}
// Attempt reprocessing
try
{
// Your reprocessing logic here
await ProcessMessageAsync(dlqMessage.OriginalMessage);
// Remove from DLQ on success
await _messageStore.DeleteDlqMessageAsync(messageId);
return true;
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to reprocess DLQ message {MessageId}", messageId);
return false;
}
}
}
Idempotency Implementation
Idempotency ensures that processing the same message multiple times produces the same result, critical for retry scenarios:
public class IdempotencyHandler
{
private readonly IIdempotencyStore _idempotencyStore;
private readonly ILogger<IdempotencyHandler> _logger;
public IdempotencyHandler(
IIdempotencyStore idempotencyStore,
ILogger<IdempotencyHandler> logger)
{
_idempotencyStore = idempotencyStore;
_logger = logger;
}
public async Task<T> ExecuteIdempotentAsync<T>(
string messageId,
Func<Task<T>> operation)
{
// Check if message was already processed
var existingResult = await _idempotencyStore.GetResultAsync<T>(messageId);
if (existingResult != null)
{
_logger.LogInformation("Message {MessageId} already processed, returning cached result", messageId);
return existingResult;
}
// Process message
try
{
var result = await operation();
// Store result for idempotency
await _idempotencyStore.StoreResultAsync(messageId, result);
return result;
}
catch (Exception ex)
{
_logger.LogError(ex, "Error processing message {MessageId}", messageId);
throw;
}
}
public string GenerateIdempotencyKey(string messageId, string messageContent)
{
// Generate deterministic key from message content
using var sha256 = SHA256.Create();
var hash = sha256.ComputeHash(Encoding.UTF8.GetBytes($"{messageId}:{messageContent}"));
return Convert.ToBase64String(hash);
}
}
Complete Error Handling Pipeline
Here's a complete error handling pipeline that combines all patterns:
public class CargoWiseEAdapterErrorHandler
{
private readonly ExponentialBackoffRetryPolicy _retryPolicy;
private readonly CircuitBreaker _circuitBreaker;
private readonly DeadLetterQueue _dlq;
private readonly IdempotencyHandler _idempotencyHandler;
private readonly ICargoWiseEAdapterClient _cargoWiseClient; // eAdapter send client (interface name illustrative)
private readonly ILogger<CargoWiseEAdapterErrorHandler> _logger;
public CargoWiseEAdapterErrorHandler(
ExponentialBackoffRetryPolicy retryPolicy,
CircuitBreaker circuitBreaker,
DeadLetterQueue dlq,
IdempotencyHandler idempotencyHandler,
ICargoWiseEAdapterClient cargoWiseClient,
ILogger<CargoWiseEAdapterErrorHandler> logger)
{
_retryPolicy = retryPolicy;
_circuitBreaker = circuitBreaker;
_dlq = dlq;
_idempotencyHandler = idempotencyHandler;
_cargoWiseClient = cargoWiseClient;
_logger = logger;
}
public async Task<CargoWiseResponse> ProcessMessageAsync(
string messageId,
string messageContent)
{
return await _idempotencyHandler.ExecuteIdempotentAsync(
messageId,
async () =>
{
return await _circuitBreaker.ExecuteAsync(async () =>
{
return await _retryPolicy.ExecuteWithRetryAsync(
async () => await SendToCargoWiseAsync(messageContent),
ShouldRetry);
});
});
}
private async Task<CargoWiseResponse> SendToCargoWiseAsync(string messageContent)
{
try
{
// Your eAdapter send logic
var response = await _cargoWiseClient.SendMessageAsync(messageContent);
return response;
}
catch (HttpRequestException ex) when (IsTransientError(ex))
{
_logger.LogWarning("Transient error sending message: {Error}", ex.Message);
throw; // Will be caught by retry policy
}
catch (Exception ex)
{
_logger.LogError(ex, "Permanent error sending message");
throw;
}
}
private bool ShouldRetry(Exception ex)
{
return ex switch
{
HttpRequestException httpEx => IsTransientHttpError(httpEx),
TimeoutException => true,
SocketException => true,
_ => false
};
}
private bool IsTransientHttpError(HttpRequestException ex)
{
// HttpRequestException exposes the response status code on .NET 5+;
// 408, 429, 502, 503 and 504 are typically retryable, and a null
// status code usually means a connection-level failure
return ex.StatusCode is null
or HttpStatusCode.RequestTimeout
or HttpStatusCode.TooManyRequests
or HttpStatusCode.BadGateway
or HttpStatusCode.ServiceUnavailable
or HttpStatusCode.GatewayTimeout;
}
private bool IsTransientError(Exception ex)
{
return ex is TimeoutException ||
ex is SocketException ||
(ex is HttpRequestException httpEx && IsTransientHttpError(httpEx));
}
}
Error Classification and Handling
Error Classification Service
Classify errors to determine appropriate handling:
public enum ErrorCategory
{
Transient, // Network, timeouts - retry
Validation, // Invalid data - don't retry
Business, // Business rule violation - don't retry
Authentication, // Auth failure - don't retry
System // System error - may retry
}
public class ErrorClassifier
{
public ErrorCategory ClassifyError(Exception exception, string errorResponse = null)
{
return exception switch
{
HttpRequestException httpEx => ClassifyHttpError(httpEx, errorResponse),
TimeoutException => ErrorCategory.Transient,
SocketException => ErrorCategory.Transient,
XmlException => ErrorCategory.Validation,
ValidationException => ErrorCategory.Validation,
UnauthorizedAccessException => ErrorCategory.Authentication,
_ => ErrorCategory.System
};
}
private ErrorCategory ClassifyHttpError(HttpRequestException ex, string errorResponse)
{
// Parse error response if available
if (errorResponse?.Contains("VALIDATION_ERROR") == true)
return ErrorCategory.Validation;
if (errorResponse?.Contains("AUTHENTICATION_FAILED") == true)
return ErrorCategory.Authentication;
// Default to transient for HTTP errors
return ErrorCategory.Transient;
}
}
Monitoring and Observability
Structured Logging
Implement comprehensive logging for error tracking:
public class CargoWiseMessageProcessor
{
private readonly ILogger<CargoWiseMessageProcessor> _logger;
public async Task ProcessMessageAsync(CargoWiseMessage message)
{
using var scope = _logger.BeginScope(new Dictionary<string, object>
{
["MessageId"] = message.MessageId,
["MessageType"] = message.MessageType,
["CorrelationId"] = message.CorrelationId
});
_logger.LogInformation(
"Processing CargoWise message {MessageId} of type {MessageType}",
message.MessageId, message.MessageType);
try
{
await ProcessMessageInternalAsync(message);
_logger.LogInformation(
"Successfully processed message {MessageId}",
message.MessageId);
}
catch (Exception ex)
{
_logger.LogError(ex,
"Failed to process message {MessageId} after {RetryCount} retries",
message.MessageId, message.RetryCount);
throw;
}
}
}
Metrics Collection
Track key metrics for error handling:
public class ErrorHandlingMetrics
{
private readonly IMetricsCollector _metrics;
public ErrorHandlingMetrics(IMetricsCollector metrics)
{
_metrics = metrics;
}
public void RecordRetry(string messageType, int attemptNumber)
{
_metrics.IncrementCounter("eadapter.retry.count", new Dictionary<string, string>
{
["message_type"] = messageType,
["attempt"] = attemptNumber.ToString()
});
}
public void RecordError(string messageType, ErrorCategory category)
{
_metrics.IncrementCounter("eadapter.error.count", new Dictionary<string, string>
{
["message_type"] = messageType,
["category"] = category.ToString()
});
}
public void RecordDlqMessage(string messageType, string errorType)
{
_metrics.IncrementCounter("eadapter.dlq.count", new Dictionary<string, string>
{
["message_type"] = messageType,
["error_type"] = errorType
});
}
public void RecordCircuitBreakerState(CircuitState state)
{
_metrics.SetGauge("eadapter.circuit_breaker.state",
state == CircuitState.Open ? 1 : 0);
}
}
Production Best Practices
1. Implement Comprehensive Logging
Log all error scenarios with sufficient context:
- Message IDs and correlation IDs
- Retry attempt numbers
- Error types and stack traces
- Processing timestamps
- Business context (shipment IDs, etc.)
2. Set Appropriate Retry Limits
- Network errors: 5-10 retries with exponential backoff
- Validation errors: 0 retries (immediate failure)
- Business errors: 0 retries (immediate failure)
- System errors: 3-5 retries with longer delays
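One way to keep these limits explicit and auditable is a small per-category configuration that your retry policy consults before attempting another send. The RetrySettings type and the numbers below are illustrative defaults, not CargoWise-mandated values; tune them against your SLA.
using System;
using System.Collections.Generic;

public sealed record RetrySettings(int MaxRetries, TimeSpan InitialDelay);

public static class RetryLimits
{
    // Illustrative defaults mirroring the guidance above.
    private static readonly Dictionary<ErrorCategory, RetrySettings> ByCategory = new()
    {
        [ErrorCategory.Transient]      = new RetrySettings(8, TimeSpan.FromSeconds(1)),
        [ErrorCategory.System]         = new RetrySettings(4, TimeSpan.FromSeconds(5)),
        [ErrorCategory.Validation]     = new RetrySettings(0, TimeSpan.Zero),
        [ErrorCategory.Business]       = new RetrySettings(0, TimeSpan.Zero),
        [ErrorCategory.Authentication] = new RetrySettings(0, TimeSpan.Zero)
    };

    public static RetrySettings For(ErrorCategory category) =>
        ByCategory.TryGetValue(category, out var settings)
            ? settings
            : new RetrySettings(0, TimeSpan.Zero); // unknown categories fail fast
}
The ExponentialBackoffRetryPolicy shown earlier can then be constructed from RetryLimits.For(category) rather than a single global maximum.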
3. Use Dead-Letter Queues
Always implement DLQ for messages that fail after all retries:
- Enables manual review and correction
- Prevents message loss
- Allows reprocessing after fixes
4. Implement Idempotency
Ensure message processing is idempotent:
- Use message IDs for deduplication
- Store processing results
- Return cached results for duplicate messages
5. Monitor Error Rates
Set up alerts for:
- High error rates (>5% of messages)
- Circuit breaker openings
- DLQ message accumulation
- Retry exhaustion
6. Test Error Scenarios
Comprehensive testing should include:
- Network failures
- Timeout scenarios
- Invalid message formats
- Service unavailability
- Partial failures
Testing Error Handling
Unit Tests
[Fact]
public async Task ShouldRetryOnTransientError()
{
var retryPolicy = new ExponentialBackoffRetryPolicy(maxRetries: 3);
var attemptCount = 0;
var result = await retryPolicy.ExecuteWithRetryAsync(
async () =>
{
attemptCount++;
if (attemptCount < 3)
throw new TimeoutException("Transient error");
return "Success";
},
ex => ex is TimeoutException);
Assert.Equal(3, attemptCount);
Assert.Equal("Success", result);
}
[Fact]
public async Task ShouldNotRetryOnValidationError()
{
var retryPolicy = new ExponentialBackoffRetryPolicy(maxRetries: 3);
var attemptCount = 0;
await Assert.ThrowsAsync<ValidationException>(async () =>
{
await retryPolicy.ExecuteWithRetryAsync<string>(
async () =>
{
attemptCount++;
throw new ValidationException("Invalid data");
},
ex => ex is TimeoutException); // Only retry timeouts
});
Assert.Equal(1, attemptCount); // Should not retry
}
Integration Tests
[Fact]
public async Task ShouldSendToDlqAfterMaxRetries()
{
var dlq = new DeadLetterQueue(/* ... */);
var processor = new CargoWiseMessageProcessor(/* ... */);
// Simulate persistent failure
var message = CreateFailingMessage();
await Assert.ThrowsAsync<Exception>(() => processor.ProcessMessageAsync(message));
// Verify message in DLQ
var dlqMessage = await dlq.GetMessageAsync(message.MessageId);
Assert.NotNull(dlqMessage);
Assert.Equal(5, dlqMessage.RetryCount);
}
Advanced Error Handling Patterns
Partial Failure Handling
Handle scenarios where only part of a message fails:
public class PartialFailureHandler
{
public async Task<ProcessingResult> ProcessMessageWithPartialFailureAsync(
CargoWiseMessage message)
{
var results = new List<ItemProcessingResult>();
var failedItems = new List<string>();
foreach (var item in message.Items)
{
try
{
await ProcessItemAsync(item);
results.Add(new ItemProcessingResult
{
ItemId = item.Id,
Success = true
});
}
catch (Exception ex)
{
_logger.LogError(ex, "Failed to process item {ItemId}", item.Id);
failedItems.Add(item.Id);
results.Add(new ItemProcessingResult
{
ItemId = item.Id,
Success = false,
Error = ex.Message
});
}
}
return new ProcessingResult
{
TotalItems = message.Items.Count,
SuccessfulItems = results.Count(r => r.Success),
FailedItems = failedItems,
Results = results
};
}
}
Compensating Transactions
Implement compensating transactions for rollback scenarios:
public class CompensatingTransactionHandler
{
private readonly List<CompensationAction> _compensationActions = new();
public async Task ProcessWithCompensationAsync(Func<Task> operation)
{
_compensationActions.Clear();
try
{
await operation();
}
catch (Exception ex)
{
_logger.LogError(ex, "Operation failed, executing compensation");
// Execute compensation actions in reverse order
for (int i = _compensationActions.Count - 1; i >= 0; i--)
{
try
{
await _compensationActions[i].CompensateAsync();
}
catch (Exception compEx)
{
_logger.LogError(compEx, "Compensation action failed");
}
}
throw;
}
}
public void RegisterCompensation(Func<Task> compensationAction)
{
_compensationActions.Add(new CompensationAction
{
CompensateAsync = compensationAction
});
}
}
Real-World Scenarios
Scenario 1: Network Partition Recovery
Handle network partitions gracefully:
public class NetworkPartitionHandler
{
private readonly IHealthChecker _healthChecker;
private readonly IMessageStore _messageStore;
public async Task HandleNetworkPartitionAsync()
{
var hasConnectivity = await _healthChecker.CheckConnectivityAsync();
if (!hasConnectivity)
{
_logger.LogWarning("Network partition detected, storing messages locally");
// Store messages locally until connectivity restored
await _messageStore.StorePendingMessagesAsync();
// Start monitoring for connectivity restoration
_ = Task.Run(async () => await MonitorConnectivityRestorationAsync());
}
}
private async Task MonitorConnectivityRestorationAsync()
{
while (true)
{
await Task.Delay(TimeSpan.FromSeconds(30));
if (await _healthChecker.CheckConnectivityAsync())
{
_logger.LogInformation("Connectivity restored, processing pending messages");
await ProcessPendingMessagesAsync();
break;
}
}
}
}
Scenario 2: High-Volume Error Handling
Handle errors during high-volume periods:
public class HighVolumeErrorHandler
{
private readonly SemaphoreSlim _concurrencyLimiter;
private readonly IErrorRateMonitor _errorRateMonitor;
public HighVolumeErrorHandler(IErrorRateMonitor errorRateMonitor, int maxConcurrency = 10)
{
_errorRateMonitor = errorRateMonitor;
_concurrencyLimiter = new SemaphoreSlim(maxConcurrency);
}
public async Task ProcessWithBackpressureAsync(
IEnumerable<CargoWiseMessage> messages)
{
var errorRate = await _errorRateMonitor.GetErrorRateAsync();
if (errorRate > 0.1) // 10% error rate
{
_logger.LogWarning("High error rate detected: {ErrorRate}, reducing concurrency", errorRate);
await ReduceConcurrencyAsync();
}
var tasks = messages.Select(async message =>
{
await _concurrencyLimiter.WaitAsync();
try
{
await ProcessMessageAsync(message);
}
finally
{
_concurrencyLimiter.Release();
}
});
await Task.WhenAll(tasks);
}
}
Scenario 3: Message Validation Cascade
Handle validation errors that cascade across related messages:
public class ValidationCascadeHandler
{
public async Task HandleValidationCascadeAsync(
CargoWiseMessage message,
ValidationError error)
{
// Identify related messages that may be affected
var relatedMessages = await FindRelatedMessagesAsync(message);
foreach (var relatedMessage in relatedMessages)
{
// Mark related messages for revalidation
await MarkForRevalidationAsync(relatedMessage);
// If message is in-flight, attempt to cancel
if (await IsInFlightAsync(relatedMessage))
{
await CancelMessageAsync(relatedMessage);
}
}
// Log cascade effect
_logger.LogWarning(
"Validation error in message {MessageId} affected {Count} related messages",
message.MessageId, relatedMessages.Count);
}
}
Performance Optimization
Optimizing Retry Performance
Optimize retry logic for performance:
public class OptimizedRetryPolicy
{
private readonly ConcurrentDictionary<string, RetryState> _retryStates;
private readonly Timer _cleanupTimer;
public OptimizedRetryPolicy()
{
_retryStates = new ConcurrentDictionary<string, RetryState>();
// Cleanup old retry states periodically
_cleanupTimer = new Timer(
_ => CleanupOldStates(),
null,
TimeSpan.FromMinutes(5),
TimeSpan.FromMinutes(5));
}
public async Task<T> ExecuteWithOptimizedRetryAsync<T>(
string operationKey,
Func<Task<T>> operation)
{
var state = _retryStates.GetOrAdd(operationKey, _ => new RetryState());
// Use adaptive retry based on success rate
var delay = CalculateAdaptiveDelay(state);
try
{
var result = await operation();
state.RecordSuccess();
return result;
}
catch (Exception ex) when (ShouldRetry(ex, state))
{
state.RecordFailure();
await Task.Delay(delay);
return await ExecuteWithOptimizedRetryAsync(operationKey, operation);
}
}
private TimeSpan CalculateAdaptiveDelay(RetryState state)
{
// Adaptive delay based on success rate
var successRate = state.SuccessRate;
var baseDelay = TimeSpan.FromSeconds(1);
if (successRate < 0.5)
{
return baseDelay * 4; // Longer delay for low success rate
}
else if (successRate < 0.8)
{
return baseDelay * 2; // Medium delay
}
return baseDelay; // Short delay for high success rate
}
}
Batch Error Processing
Process errors in batches for efficiency:
public class BatchErrorProcessor
{
private readonly int _batchSize;
private readonly TimeSpan _batchTimeout;
public async Task ProcessErrorsInBatchesAsync(
IEnumerable<FailedMessage> failedMessages)
{
var batches = failedMessages
.Chunk(_batchSize) // Enumerable.Chunk (built in since .NET 6)
.Select(batch => ProcessBatchAsync(batch));
await Task.WhenAll(batches);
}
private async Task ProcessBatchAsync(IEnumerable<FailedMessage> batch)
{
var tasks = batch.Select(async message =>
{
try
{
await RetryMessageAsync(message);
}
catch (Exception ex)
{
await SendToDlqAsync(message, ex);
}
});
await Task.WhenAll(tasks);
}
}
Security Considerations
Secure Error Messages
Avoid exposing sensitive information in errors:
public class SecureErrorHandler
{
public Exception SanitizeException(Exception ex)
{
// Remove sensitive data from exception messages
var sanitizedMessage = SanitizeMessage(ex.Message);
// Create new exception without sensitive data
return new Exception(sanitizedMessage)
{
Source = ex.Source,
HelpLink = ex.HelpLink
};
}
private string SanitizeMessage(string message)
{
// Remove API keys, passwords, connection strings
var patterns = new[]
{
@"(api[_-]?key\s*[:=]\s*)([^\s]+)",
@"(password\s*[:=]\s*)([^\s]+)",
@"(connection[_-]?string\s*[:=]\s*)([^\s]+)"
};
var sanitized = message;
foreach (var pattern in patterns)
{
sanitized = Regex.Replace(
sanitized,
pattern,
"$1***REDACTED***",
RegexOptions.IgnoreCase);
}
return sanitized;
}
}
Error Logging Security
Secure error logging practices:
public class SecureErrorLogger
{
private readonly ILogger _logger;
private readonly IDataMasker _dataMasker;
public void LogErrorSecurely(Exception ex, object context)
{
// Mask sensitive data in context
var sanitizedContext = _dataMasker.MaskSensitiveData(context);
// Log with sanitized context
_logger.LogError(ex,
"Error occurred with context: {Context}",
JsonSerializer.Serialize(sanitizedContext));
}
}
Troubleshooting Guide
Common Issues and Solutions
Issue 1: Messages Stuck in Retry Loop
Symptoms:
- Messages repeatedly retrying
- High CPU usage
- No messages reaching DLQ
Solution:
// Add maximum retry limit
public class RetryLimiter
{
private const int MaxRetries = 10;
public async Task<T> ExecuteWithLimitAsync<T>(
Func<Task<T>> operation,
string messageId)
{
var retryCount = await GetRetryCountAsync(messageId);
if (retryCount >= MaxRetries)
{
await SendToDlqAsync(messageId, "Max retries exceeded");
throw new MaxRetriesExceededException();
}
try
{
var result = await operation();
await ResetRetryCountAsync(messageId);
return result;
}
catch (Exception ex)
{
await IncrementRetryCountAsync(messageId);
throw;
}
}
}
Issue 2: Circuit Breaker Not Opening
Symptoms:
- Circuit breaker stays closed despite failures
- Errors not being caught
Solution:
// Ensure proper error classification
public class ImprovedCircuitBreaker
{
public void RecordFailure(Exception ex)
{
// Only count retryable errors
if (IsRetryableError(ex))
{
_failureCount++;
if (_failureCount >= _threshold)
{
OpenCircuit();
}
}
}
private bool IsRetryableError(Exception ex)
{
// Classify by exception type and status code rather than message text
return ex is TimeoutException ||
ex is SocketException ||
(ex is HttpRequestException httpEx &&
httpEx.StatusCode is null
or HttpStatusCode.RequestTimeout
or HttpStatusCode.BadGateway
or HttpStatusCode.ServiceUnavailable
or HttpStatusCode.GatewayTimeout);
}
}
Issue 3: Dead-Letter Queue Growing Rapidly
Symptoms:
- DLQ accumulating messages quickly
- No reprocessing happening
Solution:
// Implement DLQ monitoring and auto-reprocessing
public class DlqMonitor
{
public async Task MonitorAndReprocessAsync()
{
var dlqCount = await GetDlqMessageCountAsync();
if (dlqCount > 100)
{
_logger.LogWarning("DLQ has {Count} messages, starting reprocessing", dlqCount);
// Analyze common errors
var commonErrors = await AnalyzeCommonErrorsAsync();
// Fix root causes if possible
await FixRootCausesAsync(commonErrors);
// Reprocess messages
await ReprocessDlqMessagesAsync();
}
}
}
Case Studies
Case Study 1: E-commerce Integration
Challenge: An e-commerce platform needed to integrate with CargoWise for shipment processing. The integration experienced high failure rates during peak shopping periods.
Solution: Implemented adaptive retry with backpressure:
public class EcommerceIntegrationHandler
{
private readonly AdaptiveRetryPolicy _retryPolicy;
private readonly BackpressureController _backpressure;
public async Task<CargoWiseResponse> ProcessShipmentAsync(Shipment shipment)
{
// Check system load
var load = await GetSystemLoadAsync();
if (load > 0.8)
{
await _backpressure.ThrottleAsync();
}
return await _retryPolicy.ExecuteWithRetryAsync(
() => SendToCargoWiseAsync(shipment));
}
}
Results:
- 95% reduction in failed shipments
- 60% improvement in throughput
- Zero data loss during peak periods
Case Study 2: Multi-Region Deployment
Challenge: A logistics company needed to handle CargoWise integration across multiple regions with varying network conditions.
Solution: Implemented region-aware error handling:
public class RegionalErrorHandler
{
private readonly Dictionary<string, RegionalRetryPolicy> _regionalPolicies;
public async Task<CargoWiseResponse> ProcessWithRegionalPolicyAsync(
CargoWiseMessage message,
string region)
{
var policy = _regionalPolicies[region];
// Adjust retry strategy based on region
return await policy.ExecuteWithRetryAsync(
() => SendToCargoWiseAsync(message));
}
}
Results:
- Region-specific retry strategies
- Improved reliability in high-latency regions
- Better error recovery
Migration Guide
Migrating from Basic to Advanced Error Handling
Step 1: Add error classification:
// Before
try
{
await SendMessageAsync(message);
}
catch (Exception ex)
{
_logger.LogError(ex, "Error sending message");
throw;
}
// After
try
{
await SendMessageAsync(message);
}
catch (Exception ex)
{
var category = _errorClassifier.ClassifyError(ex);
if (category == ErrorCategory.Transient)
{
await RetryAsync(message);
}
else
{
await SendToDlqAsync(message, ex);
}
}
Step 2: Implement retry logic:
// Add retry policy
builder.Services.AddSingleton<IRetryPolicy, ExponentialBackoffRetryPolicy>();
// Use in message processor
public async Task<CargoWiseResponse> ProcessMessageAsync(CargoWiseMessage message)
{
return await _retryPolicy.ExecuteWithRetryAsync(
() => SendToCargoWiseAsync(message));
}
Step 3: Add circuit breaker:
builder.Services.AddSingleton<ICircuitBreaker, CircuitBreaker>();
public async Task<CargoWiseResponse> ProcessMessageAsync(CargoWiseMessage message)
{
return await _circuitBreaker.ExecuteAsync(
() => _retryPolicy.ExecuteWithRetryAsync(
() => SendToCargoWiseAsync(message)));
}
Extended FAQ
Q: How do I determine the optimal retry count?
A: Optimal retry count depends on:
- Message criticality
- System reliability
- Timeout requirements
- Business SLA
Start with 3-5 retries and adjust based on monitoring data.
Q: Should I retry all errors?
A: No. Only retry transient errors:
- Network timeouts
- Service unavailable (503)
- Rate limiting (429)
- Temporary database locks
Don't retry:
- Validation errors (400)
- Authentication failures (401)
- Not found (404)
- Business rule violations
Q: How do I handle message ordering with retries?
A: Use message sessions or sequence numbers:
public class OrderedMessageProcessor
{
public async Task ProcessOrderedMessageAsync(
CargoWiseMessage message)
{
// Wait for previous messages to complete
await WaitForSequenceAsync(message.SequenceNumber - 1);
// Process message
await ProcessMessageAsync(message);
// Mark sequence as complete
await MarkSequenceCompleteAsync(message.SequenceNumber);
}
}
Q: What's the difference between retry and DLQ?
A:
- Retry: Automatic retry of transient failures
- DLQ: Manual review of persistent failures
Use retry for errors that might succeed on retry. Use DLQ for errors that need human intervention.
Q: How do I test error handling?
A: Use fault injection:
[Fact]
public async Task ShouldHandleNetworkFailure()
{
// Inject network failure
_networkSimulator.SimulateFailure();
var result = await _processor.ProcessMessageAsync(message);
// Verify retry occurred
Assert.True(_retryMonitor.RetryOccurred);
// Verify eventual success or DLQ
Assert.True(result.Success || _dlq.Contains(message));
}
Q: How do I monitor error handling effectiveness?
A: Track key metrics:
- Retry success rate
- DLQ growth rate
- Average retry count
- Error recovery time
- Circuit breaker state changes
Q: Can I use different retry strategies for different message types?
A: Yes, implement message-type-specific policies:
public class MessageTypeRetryPolicy
{
private readonly Dictionary<string, IRetryPolicy> _policies;
public IRetryPolicy GetPolicyForMessageType(string messageType)
{
return _policies.GetValueOrDefault(
messageType,
_defaultPolicy);
}
}
Best Practices Summary
- Error Classification First: Always classify errors before handling
- Exponential Backoff: Use exponential backoff with jitter
- Circuit Breakers: Implement circuit breakers for external dependencies
- Dead-Letter Queues: Always have DLQ for failed messages
- Idempotency: Ensure all operations are idempotent
- Comprehensive Logging: Log all error scenarios with context
- Monitoring: Set up alerts for error conditions
- Testing: Test all error scenarios thoroughly
- Documentation: Document error handling strategies
- Review: Regularly review and optimize error handling
Conclusion
Implementing robust error handling and retry patterns for CargoWise eAdapter integrations is essential for production reliability. By combining exponential backoff, circuit breakers, dead-letter queues, and idempotency checks, you can build integrations that gracefully handle failures while maintaining data integrity.
Key Takeaways:
- Classify Errors: Distinguish between transient and permanent errors
- Implement Retry Logic: Use exponential backoff with jitter
- Use Circuit Breakers: Prevent cascading failures
- Dead-Letter Queues: Capture failed messages for review
- Idempotency: Ensure safe retries
- Comprehensive Logging: Track all error scenarios
- Monitor Metrics: Set up alerts for error conditions
- Performance Optimization: Optimize retry logic for efficiency
- Security: Sanitize error messages and logs
- Testing: Test all error scenarios
Next Steps:
- Implement error classification in your eAdapter integration
- Add retry policies with appropriate backoff strategies
- Set up dead-letter queue processing
- Configure monitoring and alerting
- Test error scenarios thoroughly
- Review and optimize based on production data
For more CargoWise integration guidance, explore our CargoWise eAdapter Integration Patterns guide or contact our team for enterprise integration support.