Data & Database Workflows (page 9 of 40)
PostgreSQL, SQL, CSV, JSON, Excel, PDF, and conversion pipelines: practical workflows for handling structured data safely.
- Mixed encodings in one file: detection heuristics
A practical guide to spotting mixed encodings inside one file without pretending charset detection is more certain than it really is.
- Documentation Templates for Internal CSV Exports
A practical guide to documenting internal CSV exports so teams stop relying on tribal knowledge, Slack messages, and guesswork.
- DuckDB Reading CSV: Types, Headers, and Strict Modes
A practical guide to reading CSV files in DuckDB without getting surprised by type inference, missing headers, or strict parsing behavior.
- DuckDB vs Pandas for Big CSV: When Each Wins
A practical guide to choosing DuckDB or pandas for big CSV files, with clear advice on where each tool wins and when using both together is the better move.
- Duplicate Column Names in CSV: Import Strategies That Survive
A practical guide to surviving duplicate CSV headers without losing meaning, breaking imports, or quietly mapping values to the wrong columns.
- E-commerce Returns CSV: SKU Normalization and Refunds Mapping
A practical guide to turning messy returns CSV exports into clean SKU and refund data that inventory, finance, and analytics teams can actually trust.
- Email Column Validation Beyond “Contains @”
A practical guide to validating email fields in CSV files without relying on naive string checks that break downstream systems and degrade data quality.
- Empty Last Line in CSV: Harmless or a Schema Trap?
A practical guide to understanding whether a trailing blank line in CSV is harmless whitespace or a real schema and import problem.
- Escaped Quotes Inside CSV Fields: Parsing Rules in Plain English
A practical guide to understanding escaped quotes in CSV files without relying on fragile regexes or guesswork.
- Excel “Save as CSV” Encoding Options Explained for Importers
A practical guide to understanding Excel CSV encoding options so importers stop guessing and pipelines stop breaking on text that looked fine in the spreadsheet.