SQL Subqueries Explained

·Updated Apr 4, 2026·
sqldatabasequery-languagesubqueriessql tutorialintermediate sql
·

Level: intermediate · ~18 min read · Intent: informational

Audience: backend developers, data analysts, data engineers, technical teams, software engineers, students

Prerequisites

  • basic familiarity with databases
  • basic understanding of SELECT, WHERE, JOIN, and aggregate functions

Key takeaways

  • A SQL subquery is simply a query inside another query, but the real skill is understanding what the inner query should return and how the outer query uses that result.
  • Subqueries are powerful for filtering, comparison, existence checks, and staged logic, but they should be chosen deliberately because some problems are clearer or faster with joins, CTEs, or window functions.

FAQ

What is a subquery in SQL?
A subquery is a query nested inside another SQL query. It can return a single value, a list of values, or a result set that the outer query uses for filtering, comparison, or further querying.
What is the difference between a correlated subquery and a normal subquery?
A normal subquery can run independently of the outer query, while a correlated subquery depends on values from the current row of the outer query.
When should I use a subquery instead of a JOIN?
Use a subquery when it expresses the logic more clearly, especially for value comparisons, EXISTS checks, or staged filtering. Use a JOIN when you need columns from related tables or when a join-based approach is clearer and easier to optimize.
Are subqueries bad for performance?
Not automatically. Some subqueries perform very well, but correlated subqueries or poorly structured nested logic can become expensive. The right choice depends on the database engine, the data shape, the indexes, and the execution plan.
0

SQL subqueries are one of the most useful SQL features because they let you answer a question by first answering a smaller question inside it.

That sounds simple, and it is.

But it is also where many SQL learners get stuck.

They can write:

  • SELECT
  • WHERE
  • JOIN
  • GROUP BY

But once they see a query inside another query, it starts to feel more advanced than it really is.

The good news is that a subquery is not a separate kind of SQL universe. It is just:

  • one query producing something
  • that another query uses

That is all.

This guide explains SQL subqueries clearly, including:

  • what a subquery is
  • the main subquery types
  • scalar subqueries
  • IN and EXISTS
  • correlated subqueries
  • subqueries in SELECT
  • subqueries in FROM
  • when subqueries are a great fit
  • and when a join, CTE, or window function may be the better choice

Why subqueries matter

A lot of SQL problems are really two-step problems.

For example:

  • find products priced above the average product price
  • return customers who placed at least one order
  • show employees who earn more than their department average
  • return the latest order date per customer
  • list users whose email appears in a suspicious source list
  • find products that were never ordered

Each of those has a natural smaller question inside it:

  • what is the average price?
  • which customers have orders?
  • what is the average salary in this department?
  • what is the latest order date?
  • which emails are in the suspicious list?
  • which products appear in order items?

That inner question is often a subquery.

This is why subqueries matter so much. They let you break logic into nested steps that often match how people naturally think.

The most important rule

Before anything else, remember this:

To understand a subquery, first ask what the inner query returns.

That is the most important rule in this topic.

A lot of confusion disappears once you ask:

  • does the inner query return one value?
  • a list of values?
  • or a table-like result set?

Because that determines how the outer query can use it.

For example:

  • one value can be used with =, >, <
  • a list of values can be used with IN
  • an existence check can be used with EXISTS
  • a table-like result can be used in FROM

If you get the return shape of the inner query clear first, the rest becomes much easier.

What a subquery is

A subquery is a query inside another query.

That is the basic definition.

Example:

SELECT *
FROM products
WHERE price > (
    SELECT AVG(price)
    FROM products
);

The inner query is:

SELECT AVG(price)
FROM products

The outer query uses that result to filter rows.

That is a subquery.

It is sometimes also called:

  • nested query
  • inner query

But “subquery” is the most common term.

A simple mental model

A very useful mental model is:

Outer query

The main question.

Inner query

A smaller helper question.

For example:

Main question:

  • which products cost more than average?

Helper question:

  • what is the average price?

That is exactly how subqueries often work in practice.

The main kinds of subqueries

Subqueries are usually easiest to understand when grouped by what they return.

The most common practical types are:

1. Scalar subquery

Returns one value.

2. Multi-row subquery

Returns a list of values.

3. Correlated subquery

Depends on the current row of the outer query.

4. Derived-table subquery

Returns a table-like result used in FROM.

These are the main patterns you will see again and again.

Scalar subquery explained

A scalar subquery returns one value.

That makes it useful in comparisons like:

  • =
  • >
  • <
  • >=
  • <=

Example:

SELECT
    product_id,
    product_name,
    price
FROM products
WHERE price > (
    SELECT AVG(price)
    FROM products
);

The inner query returns:

  • one value
  • the average price

The outer query then compares each product price to that one value.

This is one of the easiest and most common subquery patterns.

Why scalar subqueries are useful

Scalar subqueries are useful when the outer query needs:

  • one threshold
  • one benchmark
  • one summary value
  • one lookup value

Examples:

  • above average salary
  • below minimum passing score
  • equal to latest date
  • more expensive than average product
  • greater than total from another calculation

These are all natural one-value comparison problems.

Multi-row subquery explained

A multi-row subquery returns multiple values.

That means the outer query often uses it with:

  • IN
  • NOT IN
  • ANY
  • ALL

The most common one by far is IN.

Example:

SELECT
    customer_id,
    customer_name
FROM customers
WHERE customer_id IN (
    SELECT customer_id
    FROM orders
);

The inner query returns:

  • many customer IDs from orders

The outer query then keeps only customers whose customer_id appears in that set.

This is a very common and useful SQL pattern.

Why IN works so naturally with subqueries

IN is useful because the logic reads almost like English.

Example:

  • return customers whose ID is in the set of customer IDs from orders

That is very intuitive.

It is often used for:

  • filtering by related IDs
  • matching against a result set
  • narrowing rows based on another table
  • and quick membership logic

That is why IN subqueries are common in both learning and real production SQL.

Example: products that were ordered

SELECT
    product_id,
    product_name
FROM products
WHERE product_id IN (
    SELECT product_id
    FROM order_items
);

This returns products whose IDs appear in the order_items table.

The inner query returns:

  • the set of ordered product IDs

The outer query uses that set for filtering.

This is another very practical subquery pattern.

Subqueries with NOT IN

You can also use:

NOT IN

Example:

SELECT
    product_id,
    product_name
FROM products
WHERE product_id NOT IN (
    SELECT product_id
    FROM order_items
);

This looks like:

  • return products not present in order items

Conceptually, that is fine.

But there is an important caveat with NULL values, which we will cover later, because NOT IN can behave unexpectedly if the subquery returns NULL.

That is one reason many people prefer NOT EXISTS for these patterns.

EXISTS explained

EXISTS checks whether the subquery returns at least one row.

That is different from comparing a value to a returned set. It is purely an existence check.

Example:

SELECT
    c.customer_id,
    c.customer_name
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
);

This means:

  • keep customers for whom at least one matching order exists

This is one of the most important SQL patterns.

Why EXISTS is so useful

EXISTS is useful when the real question is:

  • does a related row exist?

Not:

  • what exact values should I compare?

Examples:

  • customers with at least one order
  • users with at least one login
  • products with at least one sale
  • employees with subordinates
  • accounts with unpaid invoices

These are all existence questions.

That is why EXISTS is often one of the cleanest ways to express them.

EXISTS versus IN

A very common SQL question is:

  • should I use IN or EXISTS?

The practical answer is:

  • both can be correct
  • but they express slightly different logic

IN

Often feels natural when comparing one value to a returned set.

EXISTS

Often feels natural when asking whether at least one related row exists.

Example with IN:

WHERE customer_id IN (
    SELECT customer_id
    FROM orders
)

Example with EXISTS:

WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
)

Both may solve similar problems. But EXISTS is often clearer for relationship existence logic.

Correlated subquery explained

A correlated subquery is a subquery that depends on values from the outer query.

That means the inner query cannot be understood independently in the same way as a normal subquery. It is tied to the current outer row.

Example:

SELECT
    e.employee_id,
    e.employee_name,
    e.department,
    e.salary
FROM employees e
WHERE e.salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.department = e.department
);

The inner query depends on:

  • e.department

which comes from the outer query.

So for each employee row, the inner query calculates:

  • average salary in that employee’s department

That is a correlated subquery.

Why correlated subqueries are powerful

Correlated subqueries are powerful because they let you compare each row against:

  • its own group
  • its own related context
  • or its own dynamic benchmark

Examples:

  • employee versus department average
  • order versus customer’s average order size
  • product versus category average price
  • row versus previous maximum within a group

They are very expressive and often match the business logic well.

Why correlated subqueries can be harder

Correlated subqueries are often harder because:

  • they are more conceptually nested
  • the inner query runs with reference to outer-row values
  • performance can become more sensitive on larger datasets
  • and the same logic might sometimes be expressible with joins or window functions more clearly

That does not make them bad. It just means they require more deliberate thinking.

A good habit is:

  • use them when they make the logic clearer
  • but still inspect performance and readability

Example: customers whose total order value is above average customer total

You can solve this in several ways, but a staged subquery approach can look natural.

SELECT
    customer_id,
    total_spend
FROM (
    SELECT
        customer_id,
        SUM(total_amount) AS total_spend
    FROM orders
    GROUP BY customer_id
) customer_totals
WHERE total_spend > (
    SELECT AVG(total_spend)
    FROM (
        SELECT
            customer_id,
            SUM(total_amount) AS total_spend
        FROM orders
        GROUP BY customer_id
    ) avg_source
);

This is more complex than some alternatives, but it shows a key point:

  • subqueries can be used to create intermediate result sets
  • and then query those intermediate results further

That is very useful in reporting logic.

Subquery in the FROM clause

A subquery can also appear in the FROM clause.

When this happens, it acts like a temporary table or derived table.

Example:

SELECT
    customer_id,
    total_spend
FROM (
    SELECT
        customer_id,
        SUM(total_amount) AS total_spend
    FROM orders
    GROUP BY customer_id
) customer_totals;

The inner query creates:

  • a customer totals result set

The outer query selects from it as though it were a table.

This is a very useful SQL technique.

Why subqueries in FROM are useful

They are useful when:

  • you want to build an intermediate result
  • give it a name
  • then query it further

This helps with tasks like:

  • filtering aggregated results
  • joining summarized data
  • restructuring row grain before the next step
  • simplifying more complex logic

A lot of SQL reporting becomes easier when you think in stages like this.

Subquery in the SELECT list

A subquery can also appear in the SELECT list itself.

Example:

SELECT
    c.customer_id,
    c.customer_name,
    (
        SELECT COUNT(*)
        FROM orders o
        WHERE o.customer_id = c.customer_id
    ) AS order_count
FROM customers c;

This returns:

  • customer data
  • plus a derived order count per customer

This is another form of correlated subquery, because the inner query depends on:

  • c.customer_id

This pattern can be useful, but like other correlated forms, it should be used thoughtfully.

When a SELECT-list subquery is useful

It is useful when you want:

  • one main row
  • plus a derived related value per row

Examples:

  • customer plus order count
  • product plus total sales
  • employee plus number of direct reports
  • ticket plus latest comment time

This can be very readable for small or moderate use cases.

But on large workloads, you should still consider whether:

  • a join plus grouping
  • or a window function would be clearer or more efficient

Subqueries versus joins

This is one of the biggest practical SQL design questions.

Sometimes a subquery is clearer. Sometimes a join is clearer.

There is no universal rule that one is always better.

A good practical difference is:

Subquery

Useful when the inner query feels like:

  • a helper result
  • a membership set
  • an existence check
  • a benchmark value
  • or a staged intermediate result

Join

Useful when you need:

  • columns from related tables
  • direct row combination
  • or a flatter relationship-based query structure

The best choice is usually the one that makes the logic easiest to understand and maintain while still performing well.

Example: subquery versus join for ordered products

Subquery style:

SELECT
    product_id,
    product_name
FROM products
WHERE product_id IN (
    SELECT product_id
    FROM order_items
);

Join style:

SELECT DISTINCT
    p.product_id,
    p.product_name
FROM products p
INNER JOIN order_items oi
    ON p.product_id = oi.product_id;

Both may answer the same logical question.

The subquery version says:

  • keep products whose ID appears in order_items

The join version says:

  • join matching ordered products

Sometimes the subquery feels more direct for the business question. That is why subqueries remain so useful.

Subqueries versus CTEs

A CTE is often just a clearer, named way to express staged subquery logic.

Example using a derived table:

SELECT
    customer_id,
    total_spend
FROM (
    SELECT
        customer_id,
        SUM(total_amount) AS total_spend
    FROM orders
    GROUP BY customer_id
) customer_totals;

Equivalent idea with a CTE:

WITH customer_totals AS (
    SELECT
        customer_id,
        SUM(total_amount) AS total_spend
    FROM orders
    GROUP BY customer_id
)
SELECT
    customer_id,
    total_spend
FROM customer_totals;

Both are valid.

The CTE version is often easier to read in larger queries. That is why subqueries and CTEs are closely related in practical SQL design.

Subqueries versus window functions

Some problems that used to be solved with subqueries are often cleaner with window functions.

Example: latest order per customer.

Subquery solutions exist. But often ROW_NUMBER() or MAX() OVER (...) style patterns are clearer.

That does not make subqueries obsolete. It just means modern SQL gives you several tools, and the best one depends on:

  • readability
  • row shape
  • performance
  • and the business question

A strong SQL user knows multiple ways to solve a problem.

Common subquery patterns

Here are the most common patterns you will see.

1. Scalar comparison

WHERE price > (
    SELECT AVG(price)
    FROM products
)

2. Membership test

WHERE customer_id IN (
    SELECT customer_id
    FROM orders
)

3. Existence check

WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
)

4. Derived table in FROM

FROM (
    SELECT customer_id, SUM(total_amount) AS total_spend
    FROM orders
    GROUP BY customer_id
) customer_totals

5. Correlated calculation

WHERE salary > (
    SELECT AVG(salary)
    FROM employees e2
    WHERE e2.department = e.department
)

If you understand these five patterns, you understand a large share of real SQL subquery use.

Common mistake: not knowing what the subquery returns

This is the biggest one.

If the outer query expects:

  • one value

but the subquery returns:

  • many rows

the query may fail or behave differently than expected.

That is why the first question is always:

  • what shape does the subquery return?

One value? One column with many rows? A whole table-like result?

That determines the correct outer usage.

Common mistake: using = with a multi-row subquery

This is a classic beginner issue.

Wrong pattern if multiple rows can return:

WHERE customer_id = (
    SELECT customer_id
    FROM orders
)

If the subquery returns many customer IDs, that is not appropriate.

The correct pattern may be:

WHERE customer_id IN (
    SELECT customer_id
    FROM orders
)

This is why matching the comparison operator to the subquery shape is so important.

Common mistake: NOT IN with NULL values

This is one of the most important practical warnings.

Example:

SELECT *
FROM products
WHERE product_id NOT IN (
    SELECT product_id
    FROM order_items
);

If the subquery can return NULL values, NOT IN may behave in surprising ways.

That is why many teams prefer:

WHERE NOT EXISTS (...)

for anti-join style logic such as:

  • rows with no related match

This is a very important real-world SQL caution.

Common mistake: overusing correlated subqueries where clearer options exist

Correlated subqueries can be elegant. But sometimes they make a query:

  • harder to read
  • harder to optimize
  • or harder to explain

If the same logic is clearer with:

  • a join
  • a CTE
  • or a window function

that may be the better option.

This is not about banning correlated subqueries. It is about using the clearest tool for the problem.

Performance thoughts on subqueries

Subqueries are not automatically bad for performance.

Some perform extremely well. Others do not.

The real performance depends on:

  • the database engine
  • the indexes
  • whether the subquery is correlated
  • row counts
  • join paths
  • and the overall execution plan

A strong rule is:

  • do not judge performance by syntax alone
  • inspect the plan and measure the real query

That is especially important for correlated subqueries and very large result sets.

Practical examples

Example 1: products above average price

SELECT
    product_id,
    product_name,
    price
FROM products
WHERE price > (
    SELECT AVG(price)
    FROM products
);

Example 2: customers with orders

SELECT
    customer_id,
    customer_name
FROM customers
WHERE customer_id IN (
    SELECT customer_id
    FROM orders
);

Example 3: customers with at least one order using EXISTS

SELECT
    c.customer_id,
    c.customer_name
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
);

Example 4: employees above department average

SELECT
    e.employee_id,
    e.employee_name,
    e.department,
    e.salary
FROM employees e
WHERE e.salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.department = e.department
);

Example 5: summarized data from a derived table

SELECT
    customer_id,
    total_spend
FROM (
    SELECT
        customer_id,
        SUM(total_amount) AS total_spend
    FROM orders
    GROUP BY customer_id
) customer_totals
WHERE total_spend > 1000;

These examples cover the main styles you will use most often.

How to decide whether a subquery is a good fit

A practical decision framework looks like this:

Use a subquery when:

  • the inner query feels like a natural helper question
  • you need one benchmark value
  • you need a membership set
  • you need an existence check
  • or you want a clean intermediate result

Consider a join when:

  • you need columns from related tables directly
  • the row relationship is central to the logic
  • or the join makes the result shape clearer

Consider a CTE when:

  • the logic has multiple stages
  • naming intermediate results improves readability
  • or the query is getting long

Consider window functions when:

  • the problem is about row ranking, row comparison, or per-group latest logic

This is a very useful way to think practically about SQL design.

FAQ

What is a subquery in SQL?

A subquery is a query nested inside another SQL query. It can return a single value, a list of values, or a result set that the outer query uses for filtering, comparison, or further querying.

What is the difference between a correlated subquery and a normal subquery?

A normal subquery can run independently of the outer query, while a correlated subquery depends on values from the current row of the outer query.

When should I use a subquery instead of a JOIN?

Use a subquery when it expresses the logic more clearly, especially for value comparisons, EXISTS checks, or staged filtering. Use a JOIN when you need columns from related tables or when a join-based approach is clearer and easier to optimize.

Are subqueries bad for performance?

Not automatically. Some subqueries perform very well, but correlated subqueries or poorly structured nested logic can become expensive. The right choice depends on the database engine, the data shape, the indexes, and the execution plan.

Final thoughts

SQL subqueries are powerful because they let you build a larger query from smaller logical questions.

That is why they feel natural once you stop thinking of them as “advanced syntax” and start thinking of them as:

  • helper queries inside bigger queries

The most important ideas to remember are:

  • first understand what the inner query returns
  • scalar subqueries return one value
  • multi-row subqueries often work with IN
  • EXISTS is great for relationship existence checks
  • correlated subqueries depend on the outer row
  • subqueries in FROM act like temporary derived tables
  • and joins, CTEs, and window functions are sometimes better alternatives depending on the problem

If you keep those ideas clear, subqueries stop feeling confusing and start becoming one of the most natural and useful parts of writing real SQL.

SQL guides

Explore the connected SQL guides on fundamentals, joins, analytics, performance, interviews, and practical workflows.

Pillar guide

SQL Complete Guide for Beginners and Developers

A complete SQL guide for beginners and developers covering databases, tables, SELECT, WHERE, JOINs, GROUP BY, CASE, subqueries, CTEs, inserts, updates, deletes, indexes, and practical query patterns.

View all SQL guides →

Related posts