Data & Database Workflows (page 13 of 40)
PostgreSQL, SQL, CSV, JSON, Excel, PDF, and conversion pipelines: practical workflows for handling structured data safely.
- Data Quality Metrics for Recurring CSV Feeds
A practical guide to measuring the health of recurring CSV feeds with metrics that catch breakages before dashboards, imports, and downstream jobs go wrong.
- Date Formats in CSV: ISO-8601 vs Locale-Specific Landmines
A practical guide to designing date columns in CSV files without creating ambiguity for imports, analytics, spreadsheets, or downstream systems.
- Deduplication Keys: Choosing Stable Business Identifiers
A practical guide to choosing deduplication keys that stay stable across CSV imports, warehouse loads, and recurring data workflows.
- Delimiter Checker: How to Interpret Mixed-Separator Files
A practical guide to understanding mixed-separator files and deciding whether to normalize, reject, quarantine, or split them before they break downstream imports.
- Detecting Delimiter Switches Mid-File (Yes, It Happens)
A practical guide to finding and handling delimiter changes that happen inside the same file before they silently break imports, analytics, or downstream pipelines.
- Anomaly Detection on CSV Arrival Volumes and Row Counts
A practical guide to monitoring CSV arrivals, row-count changes, volume spikes, missing files, and ingestion anomalies before they break downstream systems.
- Archiving CSV: Retention, Encryption, and Retrieval Testing
A detailed guide to retaining, encrypting, verifying, and testing archived CSV files so they stay usable when you need them later.
- Arrow and CSV: Columnar Benefits for Analytics Workloads
A detailed guide to Arrow vs CSV for analytics workloads, including columnar performance, interoperability, and when CSV still makes sense.
- Best Practices for CSV Data Contracts Between Vendors and Engineering
A practical guide to making vendor CSV feeds predictable with clear data contracts, schema rules, versioning, and validation workflows.
- BigQuery CSV Load Jobs: Autodetect vs Explicit Schema
A practical guide to choosing between BigQuery CSV schema autodetect and explicit schema for ad hoc imports, repeatable pipelines, and lower-risk production loads.