Power Query: robust CSV import settings for messy vendors

By Elysiate · Updated Apr 9, 2026
Tags: power-query, csv, excel, power-bi, data-import, etl

Level: intermediate · ~16 min read

Audience: developers, data analysts, ops engineers, BI engineers, technical teams

Prerequisites

  • basic familiarity with CSV files
  • basic familiarity with Excel or Power BI
  • optional understanding of ETL workflows

Key takeaways

  • For messy vendor CSV files, the safest Power Query workflow is to reduce guessing: set delimiter and file origin explicitly, keep quote-aware parsing on, and avoid trusting auto-generated type steps.
  • Power Query can automatically detect headers and types for unstructured sources by inspecting the first 200 rows, which is useful for clean files but risky for vendor exports with drift, IDs, or late-row anomalies.
  • Land fragile columns such as SKUs, account numbers, postal codes, and invoice IDs as text first, then apply typed conversions intentionally with locale.
  • When combining multiple CSV files, remember the sample or first file shapes the transformation logic, so a bad example file can create schema drift for the whole folder.



Vendor CSV files are rarely “just CSV.”

They are often:

  • exported from ERP systems with old encoding defaults
  • resaved in Excel with a different delimiter
  • generated in one locale and imported in another
  • inconsistent across months, regions, or subsidiaries
  • technically structured, but only if your importer guesses the same rules the exporter used

That is why Power Query CSV import settings matter so much.

When people search for help here, they are usually not searching for theory. They are searching for one of these pain points:

  • Power Query CSV import all data in one column
  • Power Query semicolon CSV import
  • Power Query preserve leading zeros
  • Power Query wrong date format after import
  • Power Query UTF-8 BOM file origin
  • Power Query quoted line breaks CSV
  • Power Query combine CSV files with schema drift
  • Excel Power Query vendor file import problems
  • Power BI Power Query CSV delimiter and locale issues

This guide is built for those real-world cases.

The core idea is simple:

treat vendor CSV import as a contract, not a convenience feature.

For clean internal data, Power Query’s defaults are often fine. For messy vendor files, robust imports come from reducing guesses:

  • explicit delimiter
  • explicit file origin or encoding expectation
  • quote-aware parsing
  • intentional header promotion
  • text-first landing for fragile columns
  • locale-aware conversions only after validation

If your team receives recurring third-party exports, these settings are often the difference between stable refreshes on one side and silent corruption or late-night spreadsheet debugging on the other.

Why vendor CSVs break Power Query imports

CSV is common precisely because it is simple. CSV is unreliable precisely because it is simple.

RFC 4180 documents the common CSV shape most implementations roughly follow: rows, commas, quotes, escaped quotes, and quoted fields for commas or line breaks. In practice, though, many tools only partly follow the same rules. Some exports use semicolons. Some emit UTF-8 with BOM. Some emit other encodings. Some mix quoted and unquoted fields. Some let people edit files manually before upload.

Power Query sits in the middle of that mess.

Power Query’s Text/CSV connector tries to help by inferring structure. It can:

  • attempt to infer delimiters
  • let you choose file origin
  • let you configure delimiter and data type detection
  • treat CSV as a structured text source

That convenience is useful until the file is only mostly normal.

For messy vendors, the common failure pattern is not “Power Query is bad.” It is:

Power Query guessed a different contract than the source system used.

That mismatch usually appears as one of five symptoms:

  • the file loads into one column
  • dates flip month and day or error out
  • leading zeros disappear
  • rows split in the middle of a quoted field
  • combine-files workflows break when new files arrive

What Power Query actually guesses for you

Before choosing settings, it helps to know what Power Query is doing automatically.

1. It tries to infer whether a text file is delimited

If Power Query can infer delimiter-separated structure, it treats the file as a structured source. That is convenient for basic CSV files, but it means delimiter mistakes are one of the first places to look when imports go wrong.

2. Character set detection is limited

In the connector preview, File Origin controls the character set used to read the file. Microsoft’s documentation notes that the character set is not generally inferred, and UTF-8 is only inferred automatically if the file starts with a UTF-8 BOM.

That matters because many “random symbol” or “accented characters broke” issues are not row issues at all. They are encoding issues.

3. Automatic type and header detection is based on the first 200 rows

For unstructured sources such as CSV and text files, Power Query can automatically inspect the first 200 rows and add two steps:

  • Promote headers
  • Changed type

This is one of the most important behaviors to understand.

If row 201 introduces:

  • a longer SKU
  • a text value in a mostly numeric column
  • a new locale-specific date
  • an invoice number with leading zeros
  • a null or malformed value pattern not seen earlier

then the query may already have chosen the wrong schema.

4. Locale affects how text becomes dates, numbers, and currency

Power Query uses locale when interpreting text into typed values. On Desktop, it typically uses your operating system’s regional format unless you override it. This is why the same CSV can “work on my machine” and fail on someone else’s.

5. Combine-files workflows are shaped by a sample file

When you combine CSV files from a folder, Power Query generates helper queries and routes each file through a Transform Sample File query. Microsoft's combine-files documentation notes that the connector and its defaults are determined from the first file or the chosen sample file.

That means a bad example file can poison the transformation logic for the whole folder.

The safest default import profile for messy vendor CSVs

If the vendor file is unreliable, start with this operating posture:

  1. Choose Transform Data, not Load
  2. Set File Origin explicitly
  3. Set Delimiter explicitly
  4. Avoid trusting automatic Changed Type
  5. Promote headers only after checking the first row is really headers
  6. Keep IDs and codes as text
  7. Use locale explicitly for dates, decimals, and currency
  8. Validate structure before business rules
  9. Keep the original file untouched
  10. Document the source contract once the file is stable

This is the repeatable pattern that tends to survive new vendors, regional exports, and hand-edited spreadsheets.

Symptom-to-setting map

  • Everything lands in one column. Likely cause: wrong delimiter or wrong file origin. First check: set delimiter and file origin explicitly.
  • Accented characters look broken. Likely cause: wrong encoding or file origin. First check: change File Origin and compare the preview.
  • Dates become errors or swap month/day. Likely cause: locale mismatch or a bad auto type step. First check: remove Changed Type and use locale-aware conversion.
  • Leading zeros disappear. Likely cause: IDs were inferred as numeric. First check: set those columns to text before conversion.
  • Rows break inside addresses or notes. Likely cause: quote handling or malformed quoting. First check: keep quote-aware parsing and validate file structure.
  • Combine folder import breaks on new files. Likely cause: unrepresentative sample file or schema drift. First check: fix the Transform Sample File logic and enforce the schema.
  • Some numeric columns parse, others do not. Likely cause: mixed locale formatting. First check: normalize text, then convert with explicit culture.
  • Extra columns appear or vanish. Likely cause: drifted vendor export or wrong column expectations. First check: validate headers against a contract.

The import settings that matter most

1. File Origin and encoding

If a vendor export contains names, addresses, symbols, or non-English characters, File Origin is not a minor setting.

Common signs of encoding trouble:

  • mojibake or corrupted accented characters
  • currency symbols rendered incorrectly
  • smart quotes becoming nonsense characters
  • headers that look almost right but do not match downstream logic

For messy vendor files:

  • do not assume UTF-8 unless you know the source system emits UTF-8
  • if a file worked last month and breaks this month, compare the raw bytes or at least re-check file origin
  • treat “opens fine in Excel” as insufficient evidence, because viewer behavior and import behavior are not the same thing

Practical rule: if text looks wrong before typing or transformation, stop there and fix file origin first.

2. Delimiter

Delimiter problems create some of the most common support tickets:

  • semicolon-delimited exports opening as one giant column
  • European files using semicolon because comma is used as a decimal separator
  • tab-delimited files mislabeled as CSV
  • hand-edited files mixing delimiters

Power Query’s connector supports explicit delimiter selection, including common options and custom delimiters. For robust imports, do not let the preview guess if the vendor is inconsistent.

Good practice

  • Set the delimiter explicitly for recurring imports.
  • Document it in the query or runbook.
  • If the vendor changes delimiter without notice, treat that as schema drift.
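The delimiter does not have to be one of the dropdown options; Csv.Document accepts any single character. A sketch for a pipe-delimited vendor file (the path is a placeholder):

```powerquery
let
    Source =
        Csv.Document(
            File.Contents("C:\Imports\vendor-export.txt"),
            [
                Delimiter = "|",            // explicit, never inferred
                Encoding = 65001,           // UTF-8
                QuoteStyle = QuoteStyle.Csv // keep quoted fields intact
            ]
        )
in
    Source
```

Setting the delimiter in code also documents it, which pays off the day the vendor silently switches separators.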

3. Header promotion

Header promotion seems harmless until the first row is not actually a clean header row.

Typical vendor problems:

  • preamble rows above the real header
  • merged-report exports with title lines
  • blank rows before headers
  • duplicate or unstable column names
  • date-like or numeric values in the header row

Power Query’s automatic header detection can add a Promote headers step. That is fine when the first row is trustworthy. It is risky when the first rows contain metadata or report decorations.

Safer pattern

  • remove report preamble rows first
  • verify the real header row visually
  • then promote headers
  • normalize column names if the vendor is inconsistent

If a vendor occasionally adds a banner row like “Generated on 2026-04-01,” automatic header promotion can derail the whole query.
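One way to make header promotion survive an occasional banner row is to locate the real header row by a known header label instead of assuming it is row 1. A sketch, assuming the vendor's first header cell is always "Order ID" (path and label are placeholders, and the header row is assumed to exist):

```powerquery
let
    Source =
        Csv.Document(
            File.Contents("C:\Imports\vendor-export.csv"),
            [Delimiter = ",", Encoding = 65001]
        ),
    // Find the row whose first cell matches the known header label
    HeaderPos = List.PositionOf(Table.Column(Source, "Column1"), "Order ID"),
    // Drop everything above it: titles, blank rows, "Generated on ..." lines
    Trimmed = Table.Skip(Source, HeaderPos),
    Promoted = Table.PromoteHeaders(Trimmed, [PromoteAllScalars = true])
in
    Promoted
```

In production you would also want to fail loudly when List.PositionOf returns -1, since that means the expected header never appeared.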

4. Automatic type detection

Automatic type detection is one of the best convenience features for clean files and one of the biggest hidden risks for vendor files.

Because Power Query’s automatic detection for unstructured sources inspects the first 200 rows, it can easily miss late-row anomalies such as:

  • rare alphanumeric IDs
  • text placeholders like N/A
  • localized currency strings
  • dates that appear in multiple formats
  • long account numbers with leading zeros
  • unexpected blank rows or footer content

Safer pattern for messy files

Use a text-first landing zone:

  • import structure first
  • preserve risky columns as text
  • validate values
  • only then convert selected columns to number, date, currency, or logical

This approach reduces silent corruption.

For identifiers, “text until proven otherwise” is almost always the right move.

5. Quote handling and quoted line breaks

Real vendor CSVs frequently include:

  • commas inside descriptions
  • embedded quotes in comments
  • line breaks inside address fields
  • long note fields exported from CRMs or support systems

If your parser stops being quote-aware, rows can split in the middle of a field and everything downstream shifts.

Power Query’s M layer gives you important controls here:

  • CsvStyle
  • QuoteStyle

For robust imports, quoted fields and quoted line breaks should be treated as real CSV structure, not as arbitrary text noise.
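The difference is easy to see on a tiny inline example, since Csv.Document also accepts raw text. With QuoteStyle.Csv a quoted line break stays inside one field; with QuoteStyle.None it starts a new row. A sketch:

```powerquery
let
    // One header row plus one record whose "note" field contains a line break
    Raw = "id,note#(cr)#(lf)1,""line one#(cr)#(lf)line two""",
    QuoteAware = Csv.Document(Raw, [Delimiter = ",", QuoteStyle = QuoteStyle.Csv]),
    NotQuoteAware = Csv.Document(Raw, [Delimiter = ",", QuoteStyle = QuoteStyle.None])
in
    // QuoteAware keeps 2 rows (header + record);
    // NotQuoteAware splits the record across rows
    [QuoteAware = QuoteAware, NotQuoteAware = NotQuoteAware]
```

This is a useful diagnostic when you suspect a vendor file of splitting rows mid-field: reproduce a single suspect record inline and compare both parses.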

Watch for this failure pattern

A file looks fine until a customer comment or address contains:

  • a comma
  • a quote
  • a line break

Then suddenly:

  • columns shift right
  • one record becomes two rows
  • headers still look fine, but line-level data is corrupted

That is not a business-rule problem. That is a parsing problem.

6. Locale

Locale is where many “mysterious” CSV bugs actually live.

Examples:

  • 03/04/2026 is parsed as March 4 on one machine and April 3 on another
  • 1,25 is read as one-and-a-quarter in one locale and as text in another
  • currency values with thousand separators import inconsistently
  • day-month-year exports fail on systems expecting month-day-year

Power Query supports changing types using locale, and the M function Table.TransformColumnTypes also supports culture input.

Safer pattern

  • keep raw date and decimal columns as text at first
  • inspect real source format
  • apply locale-aware conversion intentionally
  • do not assume your workstation locale matches the vendor export locale

If your imports feed finance, billing, operations, or regulatory reporting, locale drift is a serious risk, not a cosmetic issue.
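At the scalar level, the same culture parameter appears on Number.FromText and Date.FromText, which is handy for spot-checking a vendor's format before committing to a table-wide conversion. A sketch with hypothetical values:

```powerquery
let
    // German-style decimal comma, read with a German culture
    Amount = Number.FromText("1,25", "de-DE"),
    // Day-first date, read with a UK culture: 3 April 2026, not March 4
    Invoiced = Date.FromText("03/04/2026", "en-GB")
in
    [Amount = Amount, Invoiced = Invoiced]
```

If a spot check like this fails or gives a surprising value, that is the moment to stop and confirm the export locale with the vendor, before any Changed Type step touches the column.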

7. Column expectations and schema drift

Messy vendors do not only break values. They break structure.

Common drift patterns:

  • added optional columns
  • renamed columns
  • reordered columns
  • duplicate headers
  • deleted columns without notice
  • extra delimiters at line ends
  • report footers inserted into the file

For recurring imports, define what must stay stable:

  • required column names
  • allowed optional columns
  • expected delimiter
  • expected encoding
  • expected header row position
  • which columns are text-only identifiers
  • which columns can be typed later

If you never write that contract down, your Power Query logic becomes the contract by accident.

A robust Power Query workflow for messy vendor CSVs

This is the workflow that holds up best over time.

Step 1. Keep the original file untouched

Never “fix” the only copy in Excel first. Save the original export as received.

Step 2. Open through Text/CSV and choose Transform Data

Do not load directly into the sheet or model if the file is messy.

Step 3. Set File Origin and Delimiter explicitly

Use preview quality as a structural check:

  • does text render correctly?
  • are columns splitting correctly?
  • do row counts feel plausible?

Step 4. Remove non-data rows before promoting headers

Strip titles, blank rows, disclaimers, preambles, or footers first.

Step 5. Promote headers intentionally

Only after the real header row is visible.

Step 6. Delete or avoid unsafe auto-typing

If the file is messy, remove the automatic Changed Type step and rebuild types intentionally.

Step 7. Convert fragile columns to text first

Typical text-first columns:

  • order IDs
  • invoice numbers
  • account numbers
  • SKUs
  • postal codes
  • phone numbers
  • tracking numbers
  • external reference codes

Step 8. Convert business columns with locale

Dates, decimals, and currency should be converted explicitly with the right locale.

Step 9. Validate structure

Check:

  • row count
  • column count
  • header names
  • duplicate headers
  • null spikes
  • obvious parse drift in wide text columns

Step 10. Only then load or combine

Do not combine a folder full of files until a single-file import path is robust.

Use this as a starting template when you want explicit control.

let
    Source =
        Csv.Document(
            // M string literals do not use backslash escapes,
            // so single backslashes are correct in Windows paths
            File.Contents("C:\Imports\vendor-export.csv"),
            [
                Delimiter = ",",
                Encoding = 65001, // UTF-8
                CsvStyle = CsvStyle.QuoteAfterDelimiter,
                QuoteStyle = QuoteStyle.Csv
            ]
        ),
    // Increase the 0 if the vendor adds preamble or banner rows
    RemoveTopRows = Table.Skip(Source, 0),
    PromoteHeaders = Table.PromoteHeaders(RemoveTopRows, [PromoteAllScalars = true]),
    // Land fragile columns as text; no culture is needed when typing to text
    KeepFragileColumnsAsText =
        Table.TransformColumnTypes(
            PromoteHeaders,
            {
                {"Order ID", type text},
                {"Vendor SKU", type text},
                {"Postal Code", type text},
                {"Tracking Number", type text},
                {"Order Date", type text},
                {"Amount", type text}
            }
        )
in
    KeepFragileColumnsAsText

Why this pattern is safer:

  • delimiter is explicit
  • encoding expectation is explicit
  • quote-aware parsing is explicit
  • header promotion is explicit
  • fragile columns stay text until validated

For many vendor imports, this alone avoids the most expensive mistakes.

A safer pattern for dates and amounts

After validation, add a second conversion step instead of forcing everything into types immediately.

let
    Source =
        Csv.Document(
            File.Contents("C:\Imports\vendor-export.csv"),
            [
                Delimiter = ";", // semicolon export from a comma-decimal locale
                Encoding = 65001,
                CsvStyle = CsvStyle.QuoteAfterDelimiter,
                QuoteStyle = QuoteStyle.Csv
            ]
        ),
    PromoteHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // Stage 1: land everything fragile as text
    AsText =
        Table.TransformColumnTypes(
            PromoteHeaders,
            {
                {"Invoice ID", type text},
                {"Customer Code", type text},
                {"Invoice Date", type text},
                {"Net Amount", type text}
            }
        ),
    // Stage 2: convert business columns with an explicit culture
    Typed =
        Table.TransformColumnTypes(
            AsText,
            {
                {"Invoice Date", type date},
                {"Net Amount", Currency.Type}
            },
            "en-GB"
        )
in
    Typed

This two-stage pattern is especially useful when:

  • dates are ambiguous
  • decimal separators vary by locale
  • IDs must keep leading zeros
  • you need a clean place to add validation or cleanup rules

How to preserve leading zeros in Power Query

This is one of the most common and most expensive problems with vendor CSV imports.

The usual victims are:

  • postal codes
  • product IDs
  • customer numbers
  • banking references
  • invoice IDs
  • tracking numbers
  • part codes

If Power Query or Excel treats them as numbers, leading zeros can disappear.

Best rule

If the value is an identifier rather than a quantity, treat it as text.

Good examples of text-first columns

  • 00012345
  • 001-44-A
  • 02108
  • 000000789312
  • 01-APR-REF-002

These are not “numbers” in the business sense. They are labels.

Once leading zeros are lost, downstream matching, joins, and audits get messy fast.
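If zeros were already stripped upstream and the identifier has a known fixed width, Text.PadStart can restore them after converting back to text. This is a recovery sketch, not a substitute for importing as text in the first place, and the 5-digit width is an assumption about the identifier format:

```powerquery
let
    // Postal codes that arrived as numbers, losing their leading zeros
    Source = #table({"Postal Code"}, {{2108}, {90210}}),
    // Convert back to text, then left-pad to the known fixed width
    Restored =
        Table.TransformColumns(
            Source,
            {{"Postal Code", each Text.PadStart(Text.From(_), 5, "0"), type text}}
        )
in
    Restored // "02108", "90210"
```

Padding only works when every value shares one fixed width; variable-width identifiers that lost zeros are generally unrecoverable and must be re-exported.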

How to fix Power Query importing everything into one column

This is almost always one of three things:

Wrong delimiter

A semicolon file is being read as comma-delimited, or vice versa.

Wrong file origin

The parser is reading the text with the wrong character set, which can interfere with structure.

It is not truly a normal CSV

The file may be:

  • tab-delimited
  • pipe-delimited
  • fixed-width
  • quoted inconsistently
  • exported with mixed separators

Fast recovery checklist

  1. Check the raw file in a plain-text editor.
  2. Confirm the actual delimiter.
  3. Re-import with explicit delimiter.
  4. Confirm File Origin.
  5. Inspect for quoted commas and embedded line breaks.
  6. Validate row consistency before typing.

How to fix wrong dates and decimals in Power Query

Date and number corruption is the other major failure class in vendor CSV imports.

Typical problems:

  • 04/03/2026 interpreted in the wrong order
  • decimal commas treated as delimiters
  • currency symbols mixed with numeric values
  • date columns that work for some rows and fail later

Robust pattern

  • import as text
  • inspect actual source format
  • convert with Using Locale
  • only then rely on date or numeric transforms

If you skip this and leave auto-typing in place, errors may appear far downstream, where they are harder to trace.

How to handle semicolon CSV files from European vendors

Semicolon CSV imports deserve their own section because they are so common.

Why they happen:

  • comma is used as decimal separator in many locales
  • exporting systems use semicolon to avoid delimiter conflicts
  • users assume “CSV” means comma even when the file does not

Safe approach

  • inspect the raw file
  • set delimiter to semicolon explicitly
  • convert amounts using the vendor’s locale
  • do not assume commas inside numbers mean bad data

This is a classic case where delimiter and locale are linked.

Combine files: how the first file can sabotage the whole import

Folder-based imports are powerful, but they are also where hidden assumptions multiply.

Power Query’s combine-files flow uses:

  • a sample file
  • helper queries
  • transformations applied to each file before combining

That means combine logic can fail when:

  • the first file is not representative
  • the sample file has different columns
  • one month includes an extra preamble row
  • one region includes different headers
  • later files add optional columns
  • a footer appears in only some files

Safer combine strategy

  • harden a single-file import first
  • choose a representative sample file
  • clean structure in the Transform Sample File query
  • validate headers across all files before final load
  • do not assume a folder contains only files with the same contract

If the vendor cannot keep structure stable, treat the folder as a collection of versioned contracts, not one reliable source.
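Before trusting a folder import, a validation query like this can flag files whose headers deviate from the contract. A sketch; the folder path, delimiter, and expected column names are placeholders:

```powerquery
let
    // The documented contract for this feed
    Expected = {"Invoice ID", "Invoice Date", "Net Amount"},
    Files = Folder.Files("C:\Imports\Vendor"),
    // Parse each file just far enough to read its header row
    WithHeaders =
        Table.AddColumn(
            Files,
            "Headers",
            each Table.ColumnNames(
                Table.PromoteHeaders(
                    Csv.Document([Content], [Delimiter = ";", Encoding = 65001]),
                    [PromoteAllScalars = true]
                )
            )
        ),
    // Any file whose header list differs (names or order) from the contract
    BadFiles = Table.SelectRows(WithHeaders, each [Headers] <> Expected)
in
    Table.SelectColumns(BadFiles, {"Name", "Headers"})
```

Running this before the combine step turns silent schema drift into a visible list of offending files.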

What to document once the import is stable

When you finally get a messy vendor CSV import working, write down the contract.

At minimum, document:

  • source system
  • expected delimiter
  • expected file origin or encoding
  • expected header row position
  • required columns
  • optional columns
  • columns that must remain text
  • locale for dates and decimals
  • known preamble or footer rows
  • validation checks
  • failure examples and fixes

This reduces future debugging dramatically.

The biggest long-term mistake is solving the import once and leaving the logic undocumented.

Before you trust the data, validate:

Structural checks

  • consistent column counts
  • expected headers
  • no duplicate header names
  • no accidental blank rows in the middle of data
  • no obvious quote-related row splits

Semantic checks

  • IDs still match expected formats
  • dates parse under the intended locale
  • amounts convert cleanly
  • row counts are plausible against source expectations
  • unexpected null spikes are investigated

Operational checks

  • refresh succeeds consistently
  • sample files do not differ materially from production files
  • combined folder imports do not silently exclude files
  • helper queries still point to the right schema

If the data is sensitive, it also helps to run privacy-first preflight checks locally, so structural issues are isolated before deeper transformation work touches real records.

Common anti-patterns

Loading first and cleaning later

This makes it easier for incorrect types to spread into formulas, models, and reports.

Trusting Changed Type on fragile vendor files

Convenient, but risky when the file has drift or mixed formatting.

Using numeric types for business identifiers

If the value is an identifier, text is often safer.

Blaming Power Query for a broken vendor contract

The real issue is often undocumented delimiter, encoding, or locale behavior upstream.

Combining files before stabilizing one file

Folder logic multiplies the pain of a weak single-file import.

Treating “opens in Excel” as proof the file is fine

Display success is not import correctness.

Search intents this guide covers

This guide is deliberately written to address several related query families, including:

Core intent

  • power query csv import settings
  • power query robust csv import settings
  • excel power query csv import settings
  • power bi power query csv import settings

Failure-mode intent

  • power query csv one column
  • power query wrong delimiter csv
  • power query preserve leading zeros
  • power query csv date format wrong
  • power query encoding problem csv
  • power query quoted line breaks csv
  • power query combine files error sample file

Workflow intent

  • best power query settings for vendor csv
  • how to import messy csv into power query
  • power query change type using locale
  • power query semicolon csv import
  • power query file origin utf-8 bom

That breadth matters because the people who need this page rarely search the exact same phrase twice. They search by symptom, not by feature name.

Final recommendation

If you only remember one operating rule from this guide, let it be this:

for messy vendor CSVs, reduce guesses before you add transformations.

That means:

  • explicit delimiter
  • explicit file origin
  • quote-aware parsing
  • cautious header promotion
  • no blind trust in auto Changed Type
  • text-first handling for IDs and risky columns
  • locale-aware conversion only when intentional
  • contract documentation once stable

That is how you turn Power Query from a preview tool into a reliable import boundary.

FAQ

What are the safest Power Query CSV import settings for messy vendor files?

Start with Transform Data, then set File Origin and Delimiter explicitly, keep parsing quote-aware, avoid trusting the default Changed Type step for fragile columns, and apply typed conversions only after you validate structure. For recurring vendor feeds, treat identifiers as text and use locale-aware conversion for dates and decimals.

Why does Power Query import some vendor CSV files into one column?

Usually because the delimiter is wrong, the file origin is wrong, or the file is not a normal comma-delimited CSV at all. Semicolon exports, tab-separated files, and files with mixed structure are common causes. Check the raw text first and then force the delimiter explicitly.

How do I preserve leading zeros in Power Query?

Import business identifiers as text before any numeric conversion happens. This is especially important for postal codes, order IDs, invoice numbers, customer numbers, and tracking references. Once numeric typing is applied, leading zeros can disappear.

Why are my dates wrong after importing a CSV into Power Query?

Most of the time, it is a locale mismatch. Power Query may interpret text dates using your current regional settings, which breaks day-month-year or comma-decimal formats from other locales. Keep the column as text first, then convert it using the correct locale.

What should I do if quotes and line breaks break rows?

Suspect parsing before transformation. Vendor CSV files often contain commas, quotes, and line breaks inside note fields, comments, or addresses. Use quote-aware parsing and validate structure before applying business rules.

Why does Power Query combine-files logic break when a new file arrives?

Because the combine workflow depends on helper queries and a representative sample file. If the first or example file does not match later files, schema drift or header differences can break the output. Harden the sample-file logic before trusting the folder import.

Should I disable automatic type detection in Power Query?

For clean internal files, automatic detection is often fine. For messy vendor files, it is safer to reduce auto-guessing, especially when identifiers, mixed locale formatting, or late-row anomalies are common. The more inconsistent the source, the more valuable a text-first import becomes.

About the author

Elysiate publishes practical guides and privacy-first tools for data workflows, developer tooling, SEO, and product engineering.
