SQL Subqueries Explained

Data & Database Workflows

Apr 4, 2026·By Elysiate·Updated Apr 4, 2026·

sqldatabasequery-languagesubqueriessql tutorialintermediate sql

·

Level: intermediate · ~18 min read · Intent: informational

Audience: backend developers, data analysts, data engineers, technical teams, software engineers, students

Prerequisites

basic familiarity with databases
basic understanding of SELECT, WHERE, JOIN, and aggregate functions

Key takeaways

A SQL subquery is simply a query inside another query, but the real skill is understanding what the inner query should return and how the outer query uses that result.
Subqueries are powerful for filtering, comparison, existence checks, and staged logic, but they should be chosen deliberately because some problems are clearer or faster with joins, CTEs, or window functions.

FAQ

What is a subquery in SQL?: A subquery is a query nested inside another SQL query. It can return a single value, a list of values, or a result set that the outer query uses for filtering, comparison, or further querying.
What is the difference between a correlated subquery and a normal subquery?: A normal subquery can run independently of the outer query, while a correlated subquery depends on values from the current row of the outer query.
When should I use a subquery instead of a JOIN?: Use a subquery when it expresses the logic more clearly, especially for value comparisons, EXISTS checks, or staged filtering. Use a JOIN when you need columns from related tables or when a join-based approach is clearer and easier to optimize.
Are subqueries bad for performance?: Not automatically. Some subqueries perform very well, but correlated subqueries or poorly structured nested logic can become expensive. The right choice depends on the database engine, the data shape, the indexes, and the execution plan.

0

SQL subqueries are one of the most useful SQL features because they let you answer a question by first answering a smaller question inside it.

That sounds simple, and it is.

But it is also where many SQL learners get stuck.

They can write:

SELECT
WHERE
JOIN
GROUP BY

But once they see a query inside another query, it starts to feel more advanced than it really is.

The good news is that a subquery is not a separate kind of SQL universe. It is just:

one query producing something
that another query uses

That is all.

This guide explains SQL subqueries clearly, including:

what a subquery is
the main subquery types
scalar subqueries
IN and EXISTS
correlated subqueries
subqueries in SELECT
subqueries in FROM
when subqueries are a great fit
and when a join, CTE, or window function may be the better choice

Why subqueries matter

A lot of SQL problems are really two-step problems.

For example:

find products priced above the average product price
return customers who placed at least one order
show employees who earn more than their department average
return the latest order date per customer
list users whose email appears in a suspicious source list
find products that were never ordered

Each of those has a natural smaller question inside it:

what is the average price?
which customers have orders?
what is the average salary in this department?
what is the latest order date?
which emails are in the suspicious list?
which products appear in order items?

That inner question is often a subquery.

This is why subqueries matter so much. They let you break logic into nested steps that often match how people naturally think.

The most important rule

Before anything else, remember this:

To understand a subquery, first ask what the inner query returns.

That is the most important rule in this topic.

A lot of confusion disappears once you ask:

does the inner query return one value?
a list of values?
or a table-like result set?

Because that determines how the outer query can use it.

For example:

one value can be used with =, >, <
a list of values can be used with IN
an existence check can be used with EXISTS
a table-like result can be used in FROM

If you get the return shape of the inner query clear first, the rest becomes much easier.

What a subquery is

A subquery is a query inside another query.

That is the basic definition.

Example:

SELECT *
FROM products
WHERE price > (
    SELECT AVG(price)
    FROM products
);

The inner query is:

SELECT AVG(price)
FROM products

The outer query uses that result to filter rows.

That is a subquery.

It is sometimes also called:

nested query
inner query

But “subquery” is the most common term.

A simple mental model

A very useful mental model is:

Outer query

The main question.

Inner query

A smaller helper question.

For example:

Main question:

which products cost more than average?

Helper question:

what is the average price?

That is exactly how subqueries often work in practice.

The main kinds of subqueries

Subqueries are usually easiest to understand when grouped by what they return.

The most common practical types are:

1. Scalar subquery

Returns one value.

2. Multi-row subquery

Returns a list of values.

3. Correlated subquery

Depends on the current row of the outer query.

4. Derived-table subquery

Returns a table-like result used in FROM.

These are the main patterns you will see again and again.

Scalar subquery explained

A scalar subquery returns one value.

That makes it useful in comparisons like:

=
>
<
>=
<=

Example:

SELECT
    product_id,
    product_name,
    price
FROM products
WHERE price > (
    SELECT AVG(price)
    FROM products
);

The inner query returns:

one value
the average price

The outer query then compares each product price to that one value.

This is one of the easiest and most common subquery patterns.

Why scalar subqueries are useful

Scalar subqueries are useful when the outer query needs:

one threshold
one benchmark
one summary value
one lookup value

Examples:

above average salary
below minimum passing score
equal to latest date
more expensive than average product
greater than total from another calculation

These are all natural one-value comparison problems.

Multi-row subquery explained

A multi-row subquery returns multiple values.

That means the outer query often uses it with:

IN
NOT IN
ANY
ALL

The most common one by far is IN.

Example:

SELECT
    customer_id,
    customer_name
FROM customers
WHERE customer_id IN (
    SELECT customer_id
    FROM orders
);

The inner query returns:

many customer IDs from orders

The outer query then keeps only customers whose customer_id appears in that set.

This is a very common and useful SQL pattern.

Why IN works so naturally with subqueries

IN is useful because the logic reads almost like English.

Example:

return customers whose ID is in the set of customer IDs from orders

That is very intuitive.

It is often used for:

filtering by related IDs
matching against a result set
narrowing rows based on another table
and quick membership logic

That is why IN subqueries are common in both learning and real production SQL.

Example: products that were ordered

SELECT
    product_id,
    product_name
FROM products
WHERE product_id IN (
    SELECT product_id
    FROM order_items
);

This returns products whose IDs appear in the order_items table.

The inner query returns:

the set of ordered product IDs

The outer query uses that set for filtering.

This is another very practical subquery pattern.

Subqueries with NOT IN

You can also use:

NOT IN

Example:

SELECT
    product_id,
    product_name
FROM products
WHERE product_id NOT IN (
    SELECT product_id
    FROM order_items
);

This looks like:

return products not present in order items

Conceptually, that is fine.

But there is an important caveat with NULL values, which we will cover later, because NOT IN can behave unexpectedly if the subquery returns NULL.

That is one reason many people prefer NOT EXISTS for these patterns.

EXISTS explained

EXISTS checks whether the subquery returns at least one row.

That is different from comparing a value to a returned set. It is purely an existence check.

Example:

SELECT
    c.customer_id,
    c.customer_name
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
);

This means:

keep customers for whom at least one matching order exists

This is one of the most important SQL patterns.

Why EXISTS is so useful

EXISTS is useful when the real question is:

does a related row exist?

Not:

what exact values should I compare?

Examples:

customers with at least one order
users with at least one login
products with at least one sale
employees with subordinates
accounts with unpaid invoices

These are all existence questions.

That is why EXISTS is often one of the cleanest ways to express them.

EXISTS versus IN

A very common SQL question is:

should I use IN or EXISTS?

The practical answer is:

both can be correct
but they express slightly different logic

IN

Often feels natural when comparing one value to a returned set.

EXISTS

Often feels natural when asking whether at least one related row exists.

Example with IN:

WHERE customer_id IN (
    SELECT customer_id
    FROM orders
)

Example with EXISTS:

WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
)

Both may solve similar problems. But EXISTS is often clearer for relationship existence logic.

Correlated subquery explained

A correlated subquery is a subquery that depends on values from the outer query.

That means the inner query cannot be understood independently in the same way as a normal subquery. It is tied to the current outer row.

Example:

SELECT
    e.employee_id,
    e.employee_name,
    e.department,
    e.salary
FROM employees e
WHERE e.salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.department = e.department
);

The inner query depends on:

e.department

which comes from the outer query.

So for each employee row, the inner query calculates:

average salary in that employee’s department

That is a correlated subquery.

Why correlated subqueries are powerful

Correlated subqueries are powerful because they let you compare each row against:

its own group
its own related context
or its own dynamic benchmark

Examples:

employee versus department average
order versus customer’s average order size
product versus category average price
row versus previous maximum within a group

They are very expressive and often match the business logic well.

Why correlated subqueries can be harder

Correlated subqueries are often harder because:

they are more conceptually nested
the inner query runs with reference to outer-row values
performance can become more sensitive on larger datasets
and the same logic might sometimes be expressible with joins or window functions more clearly

That does not make them bad. It just means they require more deliberate thinking.

A good habit is:

use them when they make the logic clearer
but still inspect performance and readability

Example: customers whose total order value is above average customer total

You can solve this in several ways, but a staged subquery approach can look natural.

SELECT
    customer_id,
    total_spend
FROM (
    SELECT
        customer_id,
        SUM(total_amount) AS total_spend
    FROM orders
    GROUP BY customer_id
) customer_totals
WHERE total_spend > (
    SELECT AVG(total_spend)
    FROM (
        SELECT
            customer_id,
            SUM(total_amount) AS total_spend
        FROM orders
        GROUP BY customer_id
    ) avg_source
);

This is more complex than some alternatives, but it shows a key point:

subqueries can be used to create intermediate result sets
and then query those intermediate results further

That is very useful in reporting logic.

Subquery in the FROM clause

A subquery can also appear in the FROM clause.

When this happens, it acts like a temporary table or derived table.

Example:

SELECT
    customer_id,
    total_spend
FROM (
    SELECT
        customer_id,
        SUM(total_amount) AS total_spend
    FROM orders
    GROUP BY customer_id
) customer_totals;

The inner query creates:

a customer totals result set

The outer query selects from it as though it were a table.

This is a very useful SQL technique.

Why subqueries in FROM are useful

They are useful when:

you want to build an intermediate result
give it a name
then query it further

This helps with tasks like:

filtering aggregated results
joining summarized data
restructuring row grain before the next step
simplifying more complex logic

A lot of SQL reporting becomes easier when you think in stages like this.

Subquery in the SELECT list

A subquery can also appear in the SELECT list itself.

Example:

SELECT
    c.customer_id,
    c.customer_name,
    (
        SELECT COUNT(*)
        FROM orders o
        WHERE o.customer_id = c.customer_id
    ) AS order_count
FROM customers c;

This returns:

customer data
plus a derived order count per customer

This is another form of correlated subquery, because the inner query depends on:

c.customer_id

This pattern can be useful, but like other correlated forms, it should be used thoughtfully.

When a SELECT-list subquery is useful

It is useful when you want:

one main row
plus a derived related value per row

Examples:

customer plus order count
product plus total sales
employee plus number of direct reports
ticket plus latest comment time

This can be very readable for small or moderate use cases.

But on large workloads, you should still consider whether:

a join plus grouping
or a window function would be clearer or more efficient

Subqueries versus joins

This is one of the biggest practical SQL design questions.

Sometimes a subquery is clearer. Sometimes a join is clearer.

There is no universal rule that one is always better.

A good practical difference is:

Subquery

Useful when the inner query feels like:

a helper result
a membership set
an existence check
a benchmark value
or a staged intermediate result

Join

Useful when you need:

columns from related tables
direct row combination
or a flatter relationship-based query structure

The best choice is usually the one that makes the logic easiest to understand and maintain while still performing well.

Example: subquery versus join for ordered products

Subquery style:

SELECT
    product_id,
    product_name
FROM products
WHERE product_id IN (
    SELECT product_id
    FROM order_items
);

Join style:

SELECT DISTINCT
    p.product_id,
    p.product_name
FROM products p
INNER JOIN order_items oi
    ON p.product_id = oi.product_id;

Both may answer the same logical question.

The subquery version says:

keep products whose ID appears in order_items

The join version says:

join matching ordered products

Sometimes the subquery feels more direct for the business question. That is why subqueries remain so useful.

Subqueries versus CTEs

A CTE is often just a clearer, named way to express staged subquery logic.

Example using a derived table:

SELECT
    customer_id,
    total_spend
FROM (
    SELECT
        customer_id,
        SUM(total_amount) AS total_spend
    FROM orders
    GROUP BY customer_id
) customer_totals;

Equivalent idea with a CTE:

WITH customer_totals AS (
    SELECT
        customer_id,
        SUM(total_amount) AS total_spend
    FROM orders
    GROUP BY customer_id
)
SELECT
    customer_id,
    total_spend
FROM customer_totals;

Both are valid.

The CTE version is often easier to read in larger queries. That is why subqueries and CTEs are closely related in practical SQL design.

Subqueries versus window functions

Some problems that used to be solved with subqueries are often cleaner with window functions.

Example: latest order per customer.

Subquery solutions exist. But often ROW_NUMBER() or MAX() OVER (...) style patterns are clearer.

That does not make subqueries obsolete. It just means modern SQL gives you several tools, and the best one depends on:

readability
row shape
performance
and the business question

A strong SQL user knows multiple ways to solve a problem.

Common subquery patterns

Here are the most common patterns you will see.

1. Scalar comparison

WHERE price > (
    SELECT AVG(price)
    FROM products
)

2. Membership test

WHERE customer_id IN (
    SELECT customer_id
    FROM orders
)

3. Existence check

WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
)

4. Derived table in FROM

FROM (
    SELECT customer_id, SUM(total_amount) AS total_spend
    FROM orders
    GROUP BY customer_id
) customer_totals

5. Correlated calculation

WHERE salary > (
    SELECT AVG(salary)
    FROM employees e2
    WHERE e2.department = e.department
)

If you understand these five patterns, you understand a large share of real SQL subquery use.

Common mistake: not knowing what the subquery returns

This is the biggest one.

If the outer query expects:

one value

but the subquery returns:

many rows

the query may fail or behave differently than expected.

That is why the first question is always:

what shape does the subquery return?

One value? One column with many rows? A whole table-like result?

That determines the correct outer usage.

Common mistake: using = with a multi-row subquery

This is a classic beginner issue.

Wrong pattern if multiple rows can return:

WHERE customer_id = (
    SELECT customer_id
    FROM orders
)

If the subquery returns many customer IDs, that is not appropriate.

The correct pattern may be:

WHERE customer_id IN (
    SELECT customer_id
    FROM orders
)

This is why matching the comparison operator to the subquery shape is so important.

Common mistake: NOT IN with NULL values

This is one of the most important practical warnings.

Example:

SELECT *
FROM products
WHERE product_id NOT IN (
    SELECT product_id
    FROM order_items
);

If the subquery can return NULL values, NOT IN may behave in surprising ways.

That is why many teams prefer:

WHERE NOT EXISTS (...)

for anti-join style logic such as:

rows with no related match

This is a very important real-world SQL caution.

Common mistake: overusing correlated subqueries where clearer options exist

Correlated subqueries can be elegant. But sometimes they make a query:

harder to read
harder to optimize
or harder to explain

If the same logic is clearer with:

a join
a CTE
or a window function

that may be the better option.

This is not about banning correlated subqueries. It is about using the clearest tool for the problem.

Performance thoughts on subqueries

Subqueries are not automatically bad for performance.

Some perform extremely well. Others do not.

The real performance depends on:

the database engine
the indexes
whether the subquery is correlated
row counts
join paths
and the overall execution plan

A strong rule is:

do not judge performance by syntax alone
inspect the plan and measure the real query

That is especially important for correlated subqueries and very large result sets.

Practical examples

Example 1: products above average price

SELECT
    product_id,
    product_name,
    price
FROM products
WHERE price > (
    SELECT AVG(price)
    FROM products
);

Example 2: customers with orders

SELECT
    customer_id,
    customer_name
FROM customers
WHERE customer_id IN (
    SELECT customer_id
    FROM orders
);

Example 3: customers with at least one order using EXISTS

SELECT
    c.customer_id,
    c.customer_name
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
);

Example 4: employees above department average

SELECT
    e.employee_id,
    e.employee_name,
    e.department,
    e.salary
FROM employees e
WHERE e.salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.department = e.department
);

Example 5: summarized data from a derived table

SELECT
    customer_id,
    total_spend
FROM (
    SELECT
        customer_id,
        SUM(total_amount) AS total_spend
    FROM orders
    GROUP BY customer_id
) customer_totals
WHERE total_spend > 1000;

These examples cover the main styles you will use most often.

How to decide whether a subquery is a good fit

A practical decision framework looks like this:

Use a subquery when:

the inner query feels like a natural helper question
you need one benchmark value
you need a membership set
you need an existence check
or you want a clean intermediate result

Consider a join when:

you need columns from related tables directly
the row relationship is central to the logic
or the join makes the result shape clearer

Consider a CTE when:

the logic has multiple stages
naming intermediate results improves readability
or the query is getting long

Consider window functions when:

the problem is about row ranking, row comparison, or per-group latest logic

This is a very useful way to think practically about SQL design.

FAQ

What is a subquery in SQL?

A subquery is a query nested inside another SQL query. It can return a single value, a list of values, or a result set that the outer query uses for filtering, comparison, or further querying.

What is the difference between a correlated subquery and a normal subquery?

A normal subquery can run independently of the outer query, while a correlated subquery depends on values from the current row of the outer query.

When should I use a subquery instead of a JOIN?

Use a subquery when it expresses the logic more clearly, especially for value comparisons, EXISTS checks, or staged filtering. Use a JOIN when you need columns from related tables or when a join-based approach is clearer and easier to optimize.

Are subqueries bad for performance?

Not automatically. Some subqueries perform very well, but correlated subqueries or poorly structured nested logic can become expensive. The right choice depends on the database engine, the data shape, the indexes, and the execution plan.

Final thoughts

SQL subqueries are powerful because they let you build a larger query from smaller logical questions.

That is why they feel natural once you stop thinking of them as “advanced syntax” and start thinking of them as:

helper queries inside bigger queries

The most important ideas to remember are:

first understand what the inner query returns
scalar subqueries return one value
multi-row subqueries often work with IN
EXISTS is great for relationship existence checks
correlated subqueries depend on the outer row
subqueries in FROM act like temporary derived tables
and joins, CTEs, and window functions are sometimes better alternatives depending on the problem

If you keep those ideas clear, subqueries stop feeling confusing and start becoming one of the most natural and useful parts of writing real SQL.

About the author

Elysiate publishes practical guides and privacy-first tools for data workflows, developer tooling, SEO, and product engineering.

View author profile Read editorial policy

Free, privacy-first utilities in your browser — no uploads required for most workflows.

SQL guides

Explore the connected SQL guides on fundamentals, joins, analytics, performance, interviews, and practical workflows.

Pillar guide

SQL Complete Guide for Beginners and Developers

A complete SQL guide for beginners and developers covering SELECT, WHERE, JOINs, GROUP BY, CTEs, indexes, and practical query patterns.

View all SQL guides →

SQL Subqueries Explained

Prerequisites

Key takeaways

FAQ

Why subqueries matter

The most important rule

What a subquery is

A simple mental model

Outer query

Inner query

The main kinds of subqueries

1. Scalar subquery

2. Multi-row subquery

3. Correlated subquery

4. Derived-table subquery

Scalar subquery explained

Why scalar subqueries are useful

Multi-row subquery explained

Why IN works so naturally with subqueries

Example: products that were ordered

Subqueries with NOT IN

EXISTS explained

Why EXISTS is so useful

EXISTS versus IN

IN

EXISTS

Correlated subquery explained

Why correlated subqueries are powerful

Why correlated subqueries can be harder

Example: customers whose total order value is above average customer total

Subquery in the FROM clause

Why subqueries in FROM are useful

Subquery in the SELECT list

When a SELECT-list subquery is useful

Subqueries versus joins

Subquery

Join

Example: subquery versus join for ordered products

Subqueries versus CTEs

Subqueries versus window functions

Common subquery patterns

1. Scalar comparison

2. Membership test

3. Existence check

4. Derived table in FROM

5. Correlated calculation

Common mistake: not knowing what the subquery returns

Common mistake: using = with a multi-row subquery

Common mistake: NOT IN with NULL values

Common mistake: overusing correlated subqueries where clearer options exist

Performance thoughts on subqueries

Practical examples

Example 1: products above average price

Example 2: customers with orders

Example 3: customers with at least one order using EXISTS

Example 4: employees above department average

Example 5: summarized data from a derived table

How to decide whether a subquery is a good fit

Use a subquery when:

Consider a join when:

Consider a CTE when:

Consider window functions when:

FAQ

What is a subquery in SQL?

What is the difference between a correlated subquery and a normal subquery?

When should I use a subquery instead of a JOIN?

Are subqueries bad for performance?

Final thoughts

About the author

Use these tools

SQL guides

Related posts