"CSV" That Is Actually Semicolon-Separated European Excel
Level: intermediate · ~12 min read · Intent: informational
Audience: developers, data analysts, ops engineers, finance and operations teams
Prerequisites
- basic familiarity with CSV files
- optional: SQL or ETL concepts
Key takeaways
- A file can be called CSV and still be semicolon-separated because spreadsheet tools often follow regional list-separator settings rather than a strict comma-only interpretation.
- European-style decimal commas and thousands separators are one of the main reasons semicolon-separated exports appear.
- The safest workflow is to detect the actual delimiter and numeric locale first, then validate structure and only after that cast types or load into downstream systems.
FAQ
- Why does Excel sometimes save a CSV with semicolons instead of commas?
- Because Excel often follows the operating system's regional list-separator settings. In locales where comma is used as the decimal separator, semicolon is commonly used as the field separator.
- Is a semicolon-separated file still really CSV?
- In practice, many teams still call it CSV, but it is better thought of as delimiter-separated text that may not match a strict comma-separated expectation.
- Why do decimal commas matter so much here?
- Because if comma is used inside numeric values, using comma as the field delimiter becomes ambiguous, so semicolon is often chosen instead.
- What is the safest way to import European Excel CSV files?
- Keep the original file, detect delimiter and encoding, inspect numeric locale patterns, and then normalize or configure the importer explicitly before loading.
"CSV" That Is Actually Semicolon-Separated European Excel
One of the most common CSV support tickets is not really about CSV syntax at all.
It sounds more like this:
- “The file is called CSV, but my parser only sees one column.”
- “Excel opened it perfectly, but the database import failed.”
- “The numbers use commas for decimals, and now everything is split wrong.”
- “This vendor says it is CSV, but the rows are separated by semicolons.”
That situation is extremely common in European spreadsheet workflows.
A file can absolutely be delivered as “CSV” and still be semicolon-separated in practice. That does not mean the producer is crazy. It means the producer and consumer are operating under different assumptions about locale, list separators, and what the word “CSV” implies.
This guide explains why semicolon-separated European Excel exports are so common, why they break parsers, and how to validate them safely before loading them into anything downstream.
If you want the practical tools first, start with the CSV Validator, CSV Format Checker, CSV Delimiter Checker, CSV Header Checker, CSV Row Checker, or Malformed CSV Checker.
Why this happens in the first place
RFC 4180 documents a common CSV format where fields are separated by commas and records are separated by line breaks. That is the standards baseline many developers have in mind when they hear “CSV.” But RFC 4180 also exists because CSV has long had many interpretations in the real world. It does not magically force every spreadsheet export to be comma-delimited.
In Microsoft Excel and related import/export flows, regional settings matter. Microsoft’s support documentation says the Windows regional settings let you change the list separator used for text export and import behavior, and the Excel import/export documentation explicitly points users to regional settings for controlling separators. Microsoft also documents changing decimal and thousands separators separately in Excel.
That means the file format users casually call “CSV” is often really:
- delimiter-separated text
- shaped by the local spreadsheet environment
- influenced by decimal separator conventions
- only loosely aligned with a strict comma-separated assumption
The key driver: decimal comma versus comma delimiter
This is the most important concept in the whole topic.
In many European locales:
- comma is used as the decimal separator
- dot is used as the thousands separator
- semicolon becomes the practical list separator
That means a row like this is natural in those environments:
account;amount;status
4001;577,50;active
4002;12,30;pending
If your importer assumes comma delimiter, the numeric values no longer look like numbers. They look like broken columns.
This is why semicolon-separated “CSV” appears so often in European Excel exports: semicolon avoids the ambiguity that would come from using comma both inside numeric values and between fields.
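As a minimal sketch of the happy path, Python's standard csv module reads such a row correctly once the delimiter is stated explicitly (the sample data here mirrors the rows above and is purely illustrative):

```python
import csv
import io

# A semicolon-delimited export with decimal-comma amounts, as raw text.
raw = "account;amount;status\n4001;577,50;active\n4002;12,30;pending\n"

# Declaring the delimiter up front keeps "577,50" intact as one field.
rows = list(csv.reader(io.StringIO(raw), delimiter=";"))
print(rows[1])  # ['4001', '577,50', 'active']
```

The decimal comma survives as part of the field; converting it to a number is a separate, later step.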
Why Excel users still call it CSV
From the user’s perspective, nothing weird happened.
They used:
- Save as CSV
- Export as CSV
- a spreadsheet tool that labels the format as CSV
- a regional environment where semicolon is the expected list separator
So the file is still understood socially as a CSV file.
From a parser’s perspective, the file may instead be:
- semicolon-delimited
- locale-sensitive numeric text
- a dialect that does not match RFC-4180-style comma assumptions
- something that needs explicit delimiter selection or auto-detection
That is why both sides feel correct and the pipeline still breaks.
Excel and system separators are part of the contract whether you like it or not
Microsoft’s documentation about changing decimal and thousands separators in Excel, plus its guidance on changing Windows regional settings and list separators, tells you something important:
spreadsheet exports are influenced by environment settings, not just by abstract file-format ideals.
So if your pipeline ingests files produced by Excel users in multiple regions, you should assume that at least some of them will hand you:
- semicolon-separated files
- decimal-comma numbers
- locale-shaped dates
- encodings that follow spreadsheet/export defaults rather than your backend defaults
The export contract should acknowledge that reality instead of pretending every file will be cleanly comma-separated.
Power Query is helpful, but it proves the point
Power Query’s Text/CSV connector docs say that if you choose a text file, Power Query automatically attempts to determine whether it has delimiter-separated values and what the delimiter is. That is useful because it reflects real-world CSV diversity. Microsoft’s docs also show delimiter and file-origin options explicitly in import flows.
That helps users, but it also highlights why downstream systems disagree:
- some tools auto-detect
- some assume comma
- some let users override delimiter manually
- some are stricter than spreadsheets
- some interpret numeric locale differently
So when a user says “Excel imported it fine,” they may really mean:
- Excel or Power Query made a locale-aware guess
- the tool inferred semicolon correctly
- the decimal commas stayed intact
- the importer used the user’s regional context
Your backend parser may not be doing any of that.
The file is not broken just because it uses semicolons
This is worth saying explicitly.
A semicolon-separated export is not necessarily a bad file. It is only a bad file for your pipeline if your pipeline expects something else.
That is why the right question is not:
Is this a real CSV?
The better question is:
Does this file match the dialect, locale assumptions, and delimiter rules our importer is configured to accept?
That framing makes debugging much easier.
Common failure modes
1. Everything lands in one column
This is the classic symptom.
The importer assumes comma delimiter, but the file uses semicolons. Every row comes in as one giant string.
2. Numeric columns split incorrectly
If someone tries to “fix” the delimiter without understanding the decimal convention, values like 577,50 may become:
- two fields
- strings that fail numeric casting
- incorrectly normalized amounts
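Both of these failure modes can be reproduced in a few lines. Feeding the semicolon-delimited sample from earlier to a comma-assuming parser collapses the header into one field and splits the data row at the decimal comma (sample values are illustrative):

```python
import csv
import io

raw = "account;amount;status\n4001;577,50;active\n"

# Default csv.reader assumes delimiter="," — the wrong dialect for this file.
rows = list(csv.reader(io.StringIO(raw)))
print(rows[0])  # ['account;amount;status']  — one giant field
print(rows[1])  # ['4001;577', '50;active'] — split at the decimal comma
```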
3. A spreadsheet re-save changes the dialect again
A user opens the file in one environment and re-saves it in another. Now the delimiter, encoding, or numeric rendering may have changed.
4. Parser auto-detection disagrees with spreadsheet behavior
Power Query may infer one thing. A backend ETL tool may infer another. A warehouse load step may need explicit delimiter and numeric handling instead of any guessing at all.
5. Header rows look correct while values are misread
The importer may parse the header row acceptably and still misinterpret all the numeric fields later.
That is especially common when delimiter detection and type inference are both in play.
Why “just replace semicolons with commas” is dangerous
This is one of the worst quick fixes.
If the file also contains decimal commas, a blind replacement creates ambiguity or corruption immediately.
For example, this valid semicolon-style row:
account;amount;status
4001;577,50;active
cannot be safely turned into comma-delimited CSV by a naive search-and-replace, because now the amount field itself contains a comma that needs to be quoted or normalized first.
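The corruption is easy to demonstrate. A blind replace promotes the decimal comma inside the amount to a field separator, so the row gains a phantom column:

```python
row = "4001;577,50;active"

# A blind replace turns the decimal comma into a field separator too:
broken = row.replace(";", ",")
print(broken)             # 4001,577,50,active
print(broken.split(","))  # ['4001', '577', '50', 'active'] — four fields, not three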
A correct transformation requires deliberate rules:
- identify the delimiter
- identify the decimal separator
- normalize numeric fields if needed
- then re-emit the file in the target dialect
Anything less is guesswork.
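The four rules above can be sketched as a small, repeatable transform. This is only an illustration: the choice of which columns are numeric (here, column 1) is an assumption you would replace with knowledge from your own schema or contract:

```python
import csv
import io

def normalize_row(row, numeric_cols):
    # Convert decimal-comma values in the known numeric columns to dot-decimal.
    return [value.replace(",", ".") if i in numeric_cols else value
            for i, value in enumerate(row)]

src = "account;amount;status\n4001;577,50;active\n4002;12,30;pending\n"

reader = csv.reader(io.StringIO(src), delimiter=";")   # rule 1: known delimiter
out = io.StringIO()
writer = csv.writer(out)  # rule 4: re-emit in the comma-delimited target dialect

header = next(reader)
writer.writerow(header)
for row in reader:
    # rules 2 and 3: decimal separator is known, numeric fields are normalized
    writer.writerow(normalize_row(row, numeric_cols={1}))  # column 1 = amount

print(out.getvalue())
```

Because csv.writer handles quoting, any field that legitimately contains a comma after normalization would be quoted rather than silently split.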
A practical validation workflow
1. Preserve the original file
Do not start by opening and re-saving it in Excel.
Keep the original bytes so you can answer later:
- what delimiter did the source really use?
- what encoding was exported?
- were decimal commas present in the raw text?
- did a human or another tool rewrite the file after the original export?
2. Detect the actual delimiter
Use a CSV-aware validator or import tool to inspect whether the file is:
- comma-delimited
- semicolon-delimited
- tab-delimited
- something stranger
Do not assume from the .csv extension alone.
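One lightweight way to sketch this in Python is the standard library's csv.Sniffer, which guesses a dialect from a sample; restricting the candidate delimiters makes the guess more reliable (the sample text is illustrative, and a real pipeline should still verify the guess against a few parsed rows):

```python
import csv

sample = "account;amount;status\n4001;577,50;active\n4002;12,30;pending\n"

# Sniffer inspects the sample and guesses the dialect, including the delimiter.
dialect = csv.Sniffer().sniff(sample, delimiters=",;\t")
print(dialect.delimiter)  # ';'
```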
3. Inspect numeric patterns before type casting
Look for fields like:
- 577,50
- 1.234,56
- 1,234.56
- 12,30
- 12.30
This tells you whether the file is using a European-style numeric locale, an English-style numeric locale, or something mixed and dangerous.
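A rough heuristic for that inspection can be sketched with two regular expressions. These patterns are assumptions about the two common conventions, not a complete locale detector; genuinely ambiguous values (such as a bare 1,234) still need human or contract-level judgment:

```python
import re

# European style: optional dot-grouped thousands, comma decimal (e.g. 1.234,56)
EURO = re.compile(r"^\d{1,3}(\.\d{3})*,\d+$")
# English style: optional comma-grouped thousands, dot decimal (e.g. 1,234.56)
ENGLISH = re.compile(r"^\d{1,3}(,\d{3})*\.\d+$")

def classify(value):
    if EURO.match(value):
        return "european"
    if ENGLISH.match(value):
        return "english"
    return "ambiguous-or-other"

for v in ["577,50", "1.234,56", "1,234.56", "12.30"]:
    print(v, "->", classify(v))
```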
4. Confirm encoding
Spreadsheet exports may differ on UTF-8, BOM presence, or other locale-sensitive defaults. If your loader cares about encoding, document and validate it explicitly.
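A minimal byte-order-mark check on the raw bytes looks like this; it is a sketch, since a BOM is only one encoding signal and its absence proves nothing about the rest of the file:

```python
def sniff_bom(data: bytes) -> str:
    # Check the leading bytes for the common BOM signatures.
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8-sig"
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le"
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be"
    return "no-bom"

print(sniff_bom("account;amount\n".encode("utf-8-sig")))  # utf-8-sig
```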
5. Validate structure before business rules
Once delimiter and encoding are known, then validate:
- row consistency
- quoting
- header presence
- malformed rows
- duplicate headers
Only after that should you move to type casts or domain rules.
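The first of those structural checks, row consistency, can be sketched as a field-count comparison against the header (the sample deliberately includes a short row; line numbers are 1-based):

```python
import csv
import io

def check_row_consistency(text, delimiter):
    # Report (line_number, field_count) for rows that do not match the header width.
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    bad = [(line, len(row)) for line, row in enumerate(reader, start=2)
           if len(row) != len(header)]
    return len(header), bad

src = "account;amount;status\n4001;577,50;active\n4002;12,30\n"
width, bad_rows = check_row_consistency(src, ";")
print(width, bad_rows)  # 3 [(3, 2)] — line 3 has 2 fields instead of 3
```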
6. Normalize or configure, do not improvise
At this point you have two main safe options:
- configure the importer to accept semicolon delimiter and the expected locale patterns
- normalize the file into your canonical CSV dialect using a repeatable transform
Both are valid. Ad hoc manual edits are not.
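For the first option, pandas is a common illustration: pandas.read_csv exposes documented sep and decimal parameters, so the European dialect can be declared rather than guessed (sample data is illustrative):

```python
import io
import pandas as pd

raw = "account;amount;status\n4001;577,50;active\n4002;12,30;pending\n"

# sep declares the semicolon delimiter; decimal declares the decimal comma,
# so "577,50" is parsed directly as the float 577.5.
df = pd.read_csv(io.StringIO(raw), sep=";", decimal=",")
print(df["amount"].tolist())  # [577.5, 12.3]
```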
The best long-term fix is a data contract
If semicolon-separated European Excel exports are recurring in your workflow, you do not have a one-off file problem. You have a contract problem.
A useful contract should document:
- accepted delimiter
- accepted encoding
- decimal separator expectations
- thousands separator expectations
- whether the producer may use Excel-generated exports
- whether semicolon-separated files are valid input
- whether files must be normalized before warehouse load
- which tool or parser is the source of truth for validation
This is what turns recurring support tickets into a stable ingestion rule.
Common mistakes to avoid
Assuming .csv means comma-delimited
In real-world spreadsheet workflows, it often does not.
Treating semicolon-separated files as invalid by definition
They may be valid for the producing environment, even if your pipeline does not accept them yet.
Replacing semicolons blindly
This breaks decimal-comma numbers immediately.
Using Excel as the only validator
Excel import behavior is helpful, but it is not the same thing as validating against the production parser contract.
Ignoring numeric locale
Delimiter issues and numeric parsing issues are tightly connected in this topic.
FAQ
Why does Excel sometimes save a CSV with semicolons instead of commas?
Because Excel often follows the system’s regional list-separator settings, and in locales where comma is the decimal separator, semicolon is commonly used as the field separator.
Is a semicolon-separated file still really CSV?
In practice many teams still call it CSV, but it is safer to think of it as a delimiter-separated text file whose dialect may not match a strict comma-separated assumption.
Why do decimal commas matter so much?
Because if comma is used inside numeric values, using comma as the field delimiter becomes ambiguous, so semicolon is often chosen instead.
Can Power Query handle these files?
Often yes, because Microsoft documents that it attempts delimiter detection and exposes delimiter and file-origin settings. But that convenience does not mean every backend parser will behave the same way.
What is the safest way to import these files?
Keep the original, detect the real delimiter and numeric locale, validate structure, and then either configure the importer explicitly or normalize the file with a repeatable transform.
Related tools and next steps
If you are dealing with European Excel exports that do not match strict comma-delimited assumptions, these are the best next steps:
- CSV Validator
- CSV Format Checker
- CSV Delimiter Checker
- CSV Header Checker
- CSV Row Checker
- Malformed CSV Checker
- CSV tools hub
Final takeaway
A semicolon-separated European Excel export is not weird. It is normal in the environment that produced it.
The real problem starts when a downstream system assumes:
- comma delimiter
- dot decimal separator
- spreadsheet behavior equals parser behavior
- .csv means one universal dialect
Once you treat delimiter and numeric locale as part of the data contract, these files become much easier to handle and much less mysterious to debug.
About the author
Elysiate publishes practical guides and privacy-first tools for data workflows, developer tooling, SEO, and product engineering.