SQL UNION vs UNION ALL

Data & Database Workflows

Apr 4, 2026·Updated Apr 4, 2026·

sqldatabasequery-languageset-operationssql tutorialdata-analysis

·

Level: intermediate · ~17 min read · Intent: informational

Audience: backend developers, data analysts, data engineers, technical teams, software engineers, students

Prerequisites

basic familiarity with databases
basic understanding of SELECT statements

Key takeaways

UNION combines result sets and removes duplicate rows, while UNION ALL combines result sets and keeps every row, including duplicates.
UNION ALL is usually faster and should be the default when you do not explicitly need duplicate removal, because UNION adds extra work to deduplicate the combined result.

FAQ

What is the difference between UNION and UNION ALL in SQL?: UNION removes duplicate rows from the combined result, while UNION ALL keeps all rows, including duplicates.
Is UNION ALL faster than UNION?: Often yes. UNION ALL is usually faster because it does not need to do the extra work required to identify and remove duplicate rows.
When should I use UNION instead of UNION ALL?: Use UNION when duplicate rows across the combined result should appear only once. Use UNION ALL when every row should be preserved or when you know duplicates are not a problem.
Do UNION and UNION ALL require the same columns?: Yes. The SELECT statements being combined must return the same number of columns in the same logical order, and the data types should be compatible.

0

SQL UNION and UNION ALL are two of the most useful set operators because they let you combine the results of multiple queries into one result set.

That matters because real SQL work often involves questions like:

combine current customers and archived customers
merge active products and draft products into one report
pull users from multiple sources into one export
stack monthly summary queries together
combine two filtered result sets into one output
or build one dashboard table from multiple query branches

At first glance, UNION and UNION ALL look almost identical.

That is why people confuse them.

The syntax is very similar. The output often looks similar. But the difference is important because it affects:

whether duplicates stay or disappear
whether row counts change unexpectedly
and whether the query does extra work that may hurt performance

This guide explains SQL UNION vs UNION ALL clearly, including:

what each one does
how duplicates are handled
why performance differs
what column rules must match
how ORDER BY works with combined results
and how to choose the right one in real SQL work

Why UNION and UNION ALL matter

A lot of SQL queries read from one table. But many real problems require combining multiple result sets.

For example:

one query finds paid orders
another query finds refunded orders
and the final report needs both in one combined output

Or:

one table stores current employees
another stores contractors
and you want one people list

Or:

one branch of the query handles online sales
another handles retail sales
and the report needs both in one result

This is exactly the kind of situation where UNION or UNION ALL becomes useful.

They are not join tools. They are result-stacking tools.

That distinction is very important.

The most important rule

Before anything else, remember this:

Use UNION ALL when you want to keep every row. Use UNION only when you explicitly want duplicate rows removed.

That is the single most important rule in this topic.

A lot of people use UNION by default because it sounds cleaner. But that is often the wrong habit.

Why?

Because UNION does extra work to remove duplicates. If you do not actually need that behavior, then UNION ALL is often the better and faster choice.

So the right question is not:

which one sounds nicer?

The right question is:

should duplicate rows be removed here or not?

That is the practical decision.

What UNION does

UNION combines the results of two or more SELECT queries and removes duplicate rows from the final combined result.

Example:

SELECT city
FROM customers

UNION

SELECT city
FROM suppliers;

This returns a single list of cities from both queries, but if the same city appears in both result sets, it appears only once in the final output.

That is the key behavior of UNION:

combine
then deduplicate

What UNION ALL does

UNION ALL also combines the results of two or more SELECT queries, but it keeps all rows exactly as they come in.

Example:

SELECT city
FROM customers

UNION ALL

SELECT city
FROM suppliers;

Now if the same city appears in both queries, it appears multiple times in the final result.

That is the key behavior of UNION ALL:

combine
keep everything

The simplest difference

A very simple summary is:

UNION

Combines results and removes duplicates.

UNION ALL

Combines results and keeps duplicates.

That is the whole core difference.

Everything else in this topic comes from that one distinction.

Why duplicate handling matters so much

A lot of SQL users think duplicates are always bad.

That is not true.

Sometimes duplicates are:

expected
meaningful
and correct

For example:

if two different systems both produced the same row values, maybe you want to keep both
if two sales happened with identical displayed values, those are still two sales
if two customers share the same city, that repetition is meaningful data

So the key question is not:

do duplicates exist?

It is:

are identical rows across these two result sets supposed to appear once or multiple times in the final result?

That is the decision UNION versus UNION ALL controls.

A basic example with names

Suppose query one returns:

name
Alice
Bob

And query two returns:

name
Bob
Cara

With UNION

Result:

name
Alice
Bob
Cara

Bob appears once.

With UNION ALL

Result:

name
Alice
Bob
Bob
Cara

Bob appears twice.

That is the easiest possible example of the difference.

Why UNION can change row counts in surprising ways

Because UNION removes duplicate rows, the final number of rows may be lower than the sum of the two input queries.

For example:

first query returns 100 rows
second query returns 100 rows
final UNION result might return 170 rows instead of 200 if 30 rows were duplicates

This is important because it means:

row count changes are part of UNION logic
not necessarily a bug

So if you use UNION and your row count drops, the first question should be:

were duplicates removed?

That is usually the explanation.

Why UNION ALL is often faster

UNION ALL is usually faster because it does not need to remove duplicates.

It simply stacks the results together.

UNION, on the other hand, must do extra work to identify duplicate rows across the combined result.

That usually means more processing, especially on:

large result sets
wide rows
or queries already doing expensive work

This is why UNION ALL is often the better default when duplicate removal is not required.

A good practical rule is:

do not pay for deduplication unless you actually need deduplication

That is one of the most important performance lessons here.

Why UNION is not always “better quality”

Some people assume:

UNION removes duplicates
therefore it must be the safer or cleaner choice

Not necessarily.

If duplicate rows are meaningful, then removing them is actually wrong.

For example, suppose one query returns:

one order row

and another query returns:

another different order row

If the selected columns happen to match exactly, UNION may collapse them into one visible row even though the business meaning says they were two separate events.

That is why you should not use UNION automatically. It can change the meaning of the data.

UNION and UNION ALL are set operators, not join operators

This is one of the most important conceptual distinctions.

UNION and UNION ALL:

stack rows vertically

JOIN:

combines columns horizontally based on relationships

That means UNION is for:

combining similar result sets

JOIN is for:

relating tables by keys

If you confuse these jobs, queries become harder to reason about.

A good question is:

do I need to add more columns to each row using a relationship? Use a join.

Or:

do I need to append more rows from another compatible query? Use UNION or UNION ALL.

That distinction matters a lot.

Column compatibility rules

For UNION and UNION ALL to work, the SELECT queries must be compatible.

That usually means:

1. Same number of columns

Each SELECT must return the same count of columns.

2. Same column order

The first column from query one aligns with the first column from query two, and so on.

3. Compatible data types

The aligned columns should be type-compatible or convertible.

Example:

SELECT customer_id, customer_name
FROM customers

UNION ALL

SELECT supplier_id, supplier_name
FROM suppliers;

This works because both queries return:

an ID-like column
a name-like column

The column names do not have to be identical. The positions are what matter.

Why column order matters

This is easy to overlook.

Consider:

SELECT customer_id, customer_name
FROM customers

UNION ALL

SELECT supplier_name, supplier_id
FROM suppliers;

This is logically wrong if the intention was:

ID then name

Because now the second query returns:

name in position 1
ID in position 2

The union aligns by column position, not by semantic label.

That is why result shape discipline matters. Always make sure the columns line up correctly.

Output column names usually come from the first query

Another practical detail is that the final result usually takes its output column names from the first SELECT.

Example:

SELECT customer_id AS entity_id, customer_name AS entity_name
FROM customers

UNION ALL

SELECT supplier_id, supplier_name
FROM suppliers;

The combined result will usually use:

entity_id
entity_name

from the first query branch.

That is why the first SELECT is often where people define the final presentation names.

ORDER BY with UNION and UNION ALL

If you want to sort the combined result, the ORDER BY usually applies to the entire final unioned output, not to each query separately.

Example:

SELECT customer_name AS name
FROM customers

UNION ALL

SELECT supplier_name AS name
FROM suppliers

ORDER BY name;

This sorts the final combined result by name.

That is the standard pattern.

If you put ordering ideas inside individual branches casually, that usually does not mean the final combined result will stay ordered the way you expect.

So a good rule is:

if the final merged result needs sorting, apply ORDER BY once at the end

Can you use ORDER BY in each branch?

Some databases allow branch-level ordering only when combined with subqueries or specific constructs, but for ordinary UNION-style result building, what matters most is:

the final ORDER BY on the full result

That is the safest mental model.

So in practice, think:

combine first
sort last

That is the cleanest way to reason about unioned results.

UNION can hide data issues if used carelessly

Because UNION removes duplicates, it can sometimes hide problems you should have noticed.

For example:

accidental overlap between sources
duplicate feed rows
repeated ingestion
faulty joins producing identical output rows
or business records that should have been examined instead of silently collapsed

That is why UNION should be used intentionally. It is not just a harmless convenience feature.

If you remove duplicates, you are making a semantic decision about the data.

That deserves deliberate thought.

UNION ALL is often better for auditability

If you are combining sources and want to see exactly what each source contributed, UNION ALL is often better because it preserves:

every row
every duplication
every repeated appearance

This can be useful in:

audits
reconciliation
data validation
integration troubleshooting
and pipeline debugging

You can always deduplicate later if needed. But once a UNION collapses rows, that detail is gone from the result.

This is one reason many engineers prefer to start with UNION ALL during debugging and only move to UNION when the business rule clearly requires deduplication.

Practical use case: active and archived data

A very common use case is combining current and archived tables.

Example:

SELECT order_id, customer_id, total_amount, created_at
FROM current_orders

UNION ALL

SELECT order_id, customer_id, total_amount, created_at
FROM archived_orders;

This is often a good UNION ALL use case because:

current and archived rows are both real rows
duplicates are usually not expected if the archival process is clean
and if duplicates do appear, you may actually want to know

Using plain UNION here could mask problems in the archival process.

That is a good example of why the choice matters.

Now consider this:

SELECT category
FROM products

UNION

SELECT category
FROM services;

Here, UNION may be the correct choice because the real question is:

what unique categories exist across both sources?

If Consulting appears in both queries, the final answer may indeed need it only once.

This is a classic good fit for UNION.

UNION ALL for event-style data

If you are combining:

events from system A
and events from system B

then UNION ALL is often more appropriate because each event row is meaningful, even if some displayed fields match.

Example:

SELECT event_time, event_type, user_id
FROM app_events

UNION ALL

SELECT event_time, event_type, user_id
FROM web_events;

These are usually separate event streams. If identical-looking rows occur, they may still represent two different real events.

That makes UNION ALL the safer and more semantically correct choice in many event scenarios.

UNION is often better for distinct lookup-style outputs

UNION is often more appropriate when the output itself is meant to behave like a unique set.

Examples:

unique city list
unique email domains
unique tags
unique status names
unique product categories
unique region names across two sources

These are set-style outputs where duplicate removal is part of the goal.

That is why the word “union” fits so naturally in set-like reporting problems.

Common mistake: using UNION when UNION ALL is the real intent

This is probably the most common mistake.

People often write UNION simply because they know it, not because they need duplicate removal.

That causes two problems:

extra work
possible silent row loss

A better default habit is:

start by asking whether duplicates should remain
if yes or maybe yes, use UNION ALL
only use UNION when duplicate removal is explicitly correct

This is the single most practical usage habit for this topic.

Common mistake: assuming duplicates are compared by only one column

UNION removes duplicate rows, not duplicate values in just one column.

That means all selected columns matter together.

Example:

Query result one:

name	city
Alice	Cape Town

Query result two:

name	city
Alice	Durban

These are not duplicates as full rows. So UNION keeps both.

This is important because duplicate removal applies to the whole row shape of the combined SELECT result.

Common mistake: mismatched column meaning

A UNION query may run successfully even when the columns are semantically mismatched.

Example:

SELECT customer_id, country
FROM customers

UNION ALL

SELECT supplier_name, supplier_id
FROM suppliers;

This might be syntactically allowed if types are compatible enough, but it is logically nonsense.

That is why column alignment is not only about type compatibility. It is also about meaning compatibility.

Strong UNION queries combine:

similar columns
in the same positions
with a shared logical purpose

UNION ALL in ETL and data pipelines

In ETL and data engineering, UNION ALL is extremely common because data pipelines often need to:

stack source partitions
combine daily loads
merge feeds from multiple systems
or append snapshots before later deduplication or validation

This is another reason UNION ALL often appears more frequently in performance-sensitive or engineering-heavy SQL work.

It preserves the raw truth of the inputs, which is often what pipelines need first.

If you need deduplication later, do it deliberately

Sometimes the best design is:

first use UNION ALL
then explicitly deduplicate in a later step if needed

For example, you may want to:

combine all source rows
inspect duplicates
label source system
apply rules
then decide what counts as a true duplicate

That is often better than using UNION too early and silently collapsing rows before you fully understand the overlap.

This is especially valuable in data integration work.

Performance summary: why UNION ALL usually wins

If both result sets are already logically compatible and you do not need duplicate removal, UNION ALL is usually better because it:

does less work
preserves row truth
avoids deduplication overhead
and is often easier to reason about in data pipelines

UNION is still the right choice when uniqueness is the actual business requirement. But it should be used because you want deduplication, not by habit.

That is the most practical performance lesson here.

Practical examples

Example 1: unique city list

SELECT city
FROM customers

UNION

SELECT city
FROM suppliers;

Use this when:

you want one deduplicated list of cities

Example 2: combine all events

SELECT event_time, event_type, user_id
FROM app_events

UNION ALL

SELECT event_time, event_type, user_id
FROM web_events;

Use this when:

every event row matters
and duplicates should not be removed automatically

Example 3: combine active and archived users

SELECT user_id, email, created_at
FROM active_users

UNION ALL

SELECT user_id, email, created_at
FROM archived_users;

Use this when:

you want the full combined history
and deduplication is not the immediate goal

Example 4: unique product categories across tables

SELECT category AS category_name
FROM products

UNION

SELECT category_name
FROM services;

Use this when:

the output should be one distinct category list

Example 5: performance-conscious stacking of monthly results

SELECT 'January' AS month_name, total_revenue
FROM january_summary

UNION ALL

SELECT 'February' AS month_name, total_revenue
FROM february_summary

UNION ALL

SELECT 'March' AS month_name, total_revenue
FROM march_summary;

Use this when:

each monthly summary row is meaningful
and no deduplication is needed

FAQ

What is the difference between UNION and UNION ALL in SQL?

UNION removes duplicate rows from the combined result, while UNION ALL keeps all rows, including duplicates.

Is UNION ALL faster than UNION?

Often yes. UNION ALL is usually faster because it does not need to do the extra work required to identify and remove duplicate rows.

When should I use UNION instead of UNION ALL?

Use UNION when duplicate rows across the combined result should appear only once. Use UNION ALL when every row should be preserved or when you know duplicates are not a problem.

Do UNION and UNION ALL require the same columns?

Yes. The SELECT statements being combined must return the same number of columns in the same logical order, and the data types should be compatible.

Final thoughts

SQL UNION and UNION ALL are simple once you focus on the one difference that matters most:

UNION removes duplicate rows
UNION ALL keeps everything

That sounds small, but it affects:

correctness
row counts
performance
and the meaning of the final result

The strongest practical habit is:

use UNION ALL by default when duplicate removal is not explicitly needed
use UNION only when one deduplicated combined result is the actual business goal

If you keep that distinction clear, you will avoid a lot of accidental performance cost and a lot of silent result-shaping mistakes in real SQL work.

Free, privacy-first utilities in your browser — no uploads required for most workflows.

SQL guides

Explore the connected SQL guides on fundamentals, joins, analytics, performance, interviews, and practical workflows.

Pillar guide

SQL Complete Guide for Beginners and Developers

A complete SQL guide for beginners and developers covering databases, tables, SELECT, WHERE, JOINs, GROUP BY, CASE, subqueries, CTEs, inserts, updates, deletes, indexes, and practical query patterns.

View all SQL guides →

SQL UNION vs UNION ALL

Prerequisites

Key takeaways

FAQ

Why UNION and UNION ALL matter

The most important rule

What UNION does

What UNION ALL does

The simplest difference

UNION

UNION ALL

Why duplicate handling matters so much

A basic example with names

With UNION

With UNION ALL

Why UNION can change row counts in surprising ways

Why UNION ALL is often faster

Why UNION is not always “better quality”

UNION and UNION ALL are set operators, not join operators

Column compatibility rules

1. Same number of columns

2. Same column order

3. Compatible data types

Why column order matters

Output column names usually come from the first query

ORDER BY with UNION and UNION ALL

Can you use ORDER BY in each branch?

UNION can hide data issues if used carelessly

UNION ALL is often better for auditability

Practical use case: active and archived data

Practical use case: unique category list across sources

UNION ALL for event-style data

UNION is often better for distinct lookup-style outputs

Common mistake: using UNION when UNION ALL is the real intent

Common mistake: assuming duplicates are compared by only one column

Common mistake: mismatched column meaning

UNION ALL in ETL and data pipelines

If you need deduplication later, do it deliberately

Performance summary: why UNION ALL usually wins

Practical examples

Example 1: unique city list

Example 2: combine all events

Example 3: combine active and archived users

Example 4: unique product categories across tables

Example 5: performance-conscious stacking of monthly results

FAQ

What is the difference between UNION and UNION ALL in SQL?

Is UNION ALL faster than UNION?

When should I use UNION instead of UNION ALL?

Do UNION and UNION ALL require the same columns?

Final thoughts

Use these tools

SQL guides

Related posts