SQL GROUP BY vs HAVING
Level: intermediate · ~17 min read · Intent: informational
Audience: backend developers, data analysts, data engineers, technical teams, students, software engineers
Prerequisites
- basic familiarity with databases
- basic understanding of SELECT, WHERE, and aggregate functions
Key takeaways
- GROUP BY creates groups of rows so SQL can summarize them, while HAVING filters those grouped results after aggregate calculations have been applied.
- The most common mistake is using HAVING when WHERE should be used, or using WHERE when the condition depends on an aggregate like COUNT, SUM, or AVG.
FAQ
- What is the difference between GROUP BY and HAVING in SQL?
- GROUP BY groups rows together so aggregate functions can summarize them, while HAVING filters the grouped results after those aggregate calculations are complete.
- When should I use HAVING instead of WHERE?
- Use HAVING when your filter depends on an aggregate result such as COUNT, SUM, AVG, MIN, or MAX. Use WHERE when you are filtering raw rows before grouping.
- Can I use GROUP BY without HAVING?
- Yes. GROUP BY is often used without HAVING when you want grouped summaries but do not need to filter the groups afterward.
- Can I use HAVING without GROUP BY?
- In some SQL databases, yes, especially when aggregate expressions are used over the full result set. But most of the time HAVING is used together with GROUP BY.
GROUP BY and HAVING are closely related in SQL, which is exactly why so many people mix them up.
They often appear in the same query. They both matter in reporting and grouped analysis. And they both become important the moment you start using aggregate functions like:
COUNTSUMAVGMINMAX
But they do not do the same job.
That is the key point.
A lot of SQL confusion happens because developers and analysts know that grouped queries need both concepts somewhere, but they are not always clear on:
- what GROUP BY actually does
- what HAVING actually does
- and where WHERE fits into the picture
This guide explains SQL GROUP BY vs HAVING clearly, including:
- what each one does
- how they work together
- when to use WHERE instead
- common mistakes
- and practical examples you can reuse in real SQL work
Why this topic matters
The difference between GROUP BY and HAVING is not just a syntax detail.
It affects whether your query:
- returns the right result
- runs at the right stage of filtering
- communicates its intent clearly
- and works correctly with grouped metrics
For example, if you want to answer questions like:
- which customers placed more than five orders?
- which departments have an average salary above 70000?
- which product categories produced more than 100000 in revenue?
- which support teams have more than 50 open tickets?
then you need to understand not only grouping, but also group-level filtering.
That is where HAVING comes in.
If you do not understand the distinction, you often end up with:
- invalid SQL
- incorrect filtering
- confusing query structure
- or grouped logic that is much harder to read than it needs to be
The most important rule
Before anything else, remember this:
GROUP BY creates the groups. HAVING filters the groups after aggregation.
That is the single most important idea in this entire topic.
A useful way to think about it is:
WHEREfilters raw rows before groupingGROUP BYcreates groups of rows- aggregate functions summarize each group
HAVINGfilters the grouped summaries
If you remember that sequence, most of the confusion disappears.
What GROUP BY does
GROUP BY divides rows into groups based on one or more columns.
Once the rows are grouped, aggregate functions can calculate summary values for each group.
Example:
SELECT
country,
COUNT(*) AS user_count
FROM users
GROUP BY country;
This means:
- take all rows in
users - group them by
country - count how many rows are in each country group
The result is one row per country, not one row per user.
That is the core purpose of GROUP BY.
What HAVING does
HAVING filters groups after the grouped result has been calculated.
Example:
SELECT
country,
COUNT(*) AS user_count
FROM users
GROUP BY country
HAVING COUNT(*) > 100;
This means:
- group users by country
- count the rows in each group
- keep only countries where the count is greater than 100
That is the core purpose of HAVING.
It filters grouped results, not raw rows.
GROUP BY without HAVING
You can use GROUP BY without HAVING very often.
Example:
SELECT
department_id,
AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id;
This produces average salary per department.
There is no need for HAVING unless you also want to filter the grouped results afterward.
This is an important point because many grouped queries do not need HAVING at all.
They just need:
- grouping
- and aggregate calculations
HAVING without GROUP BY
In some database systems, HAVING can be used without GROUP BY when aggregate functions apply to the entire result as one overall group.
Example conceptually:
SELECT COUNT(*) AS total_orders
FROM orders
HAVING COUNT(*) > 1000;
This is valid in some engines and contexts.
But in everyday SQL usage, HAVING is most commonly used together with GROUP BY.
So for practical learning, it is best to think of HAVING primarily as:
- the filter for grouped results
That mental model will be correct most of the time.
Where WHERE fits into the picture
A lot of confusion around GROUP BY vs HAVING really comes from WHERE.
That is because people often want to filter something, but they are not sure whether the filter belongs:
- before grouping or
- after grouping
The difference is simple:
WHERE
Filters rows before grouping.
HAVING
Filters groups after grouping and aggregation.
That is the most important distinction in practical SQL.
Example: WHERE before GROUP BY
Suppose you want the average salary by department, but only for employees with salary above 30000.
SELECT
department_id,
AVG(salary) AS avg_salary
FROM employees
WHERE salary > 30000
GROUP BY department_id;
This means:
- first remove all rows where salary is 30000 or below
- then group the remaining rows by department
- then calculate the average for those remaining rows
That is row-level filtering.
So WHERE is the correct tool.
Example: HAVING after GROUP BY
Now suppose you want only departments whose average salary is above 70000.
SELECT
department_id,
AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
HAVING AVG(salary) > 70000;
This means:
- first group rows by department
- calculate average salary for each group
- then keep only departments where the average exceeds 70000
That is group-level filtering.
So HAVING is the correct tool.
The most common beginner mistake
The most common mistake is trying to use an aggregate function inside WHERE.
For example:
SELECT
customer_id,
COUNT(*) AS order_count
FROM orders
WHERE COUNT(*) > 5
GROUP BY customer_id;
This is wrong because:
WHEREhappens before groupingCOUNT(*)does not exist yet at the WHERE stage
The correct version is:
SELECT
customer_id,
COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5;
This is one of the simplest and most important examples of the difference.
A practical sequence for understanding grouped queries
A useful mental order for grouped queries is:
FROMWHEREGROUP BY- aggregate calculations
HAVINGORDER BY
That is not always how every database engine physically executes the query internally, but it is a very good conceptual model for writing and understanding SQL.
It explains why:
- WHERE cannot use grouped aggregates
- HAVING can
- and GROUP BY sits between them
Example: orders per customer
Let’s say you want to count how many orders each customer placed.
SELECT
customer_id,
COUNT(*) AS order_count
FROM orders
GROUP BY customer_id;
This is a pure GROUP BY query.
It:
- groups rows by customer
- counts the rows in each group
Now suppose you only want customers with more than five orders.
SELECT
customer_id,
COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5;
Now HAVING filters the grouped result.
This is one of the clearest practical examples of when HAVING is needed.
Example: revenue by category
Suppose you want total revenue by category.
SELECT
category_id,
SUM(total_amount) AS total_revenue
FROM orders
GROUP BY category_id;
Now suppose you only want categories with revenue above 100000.
SELECT
category_id,
SUM(total_amount) AS total_revenue
FROM orders
GROUP BY category_id
HAVING SUM(total_amount) > 100000;
Again:
- GROUP BY creates the category groups
- SUM calculates revenue per group
- HAVING filters the grouped output
That is the pattern.
Example: WHERE plus HAVING in the same query
This is very common in real work.
Suppose you want:
- paid orders only
- grouped by customer
- but only customers whose paid revenue exceeds 5000
SELECT
customer_id,
SUM(total_amount) AS paid_revenue
FROM orders
WHERE status = 'Paid'
GROUP BY customer_id
HAVING SUM(total_amount) > 5000;
This query does two different types of filtering:
WHERE
Keeps only rows where status = 'Paid'
HAVING
Keeps only grouped customers whose sum exceeds 5000
This is one of the best ways to understand how the two work together.
GROUP BY controls the level of the result
Another important idea is that GROUP BY determines what one output row represents.
For example:
SELECT
country,
COUNT(*) AS user_count
FROM users
GROUP BY country;
Each output row now represents:
- one country
If you instead write:
SELECT
country,
city,
COUNT(*) AS user_count
FROM users
GROUP BY country, city;
Each output row now represents:
- one country-city combination
That is why GROUP BY is so important in reporting. It defines the grouping grain of the result.
HAVING then filters those groups after they are formed.
HAVING depends on the grouped result
A good way to think about HAVING is:
HAVING filters what GROUP BY produced.
That means it usually works with:
- aggregate expressions
- grouped output logic
- or conditions that only make sense after grouping
Examples:
HAVING COUNT(*) > 10HAVING SUM(revenue) > 50000HAVING AVG(score) >= 75HAVING MAX(order_total) > 1000
All of these depend on group summaries, not raw rows.
That is why HAVING exists.
GROUP BY and aggregate functions belong together
A lot of the time, GROUP BY is paired with aggregate functions.
Common combinations:
GROUP BY+COUNTGROUP BY+SUMGROUP BY+AVGGROUP BY+MINGROUP BY+MAX
Examples:
Count users by country
SELECT
country,
COUNT(*) AS user_count
FROM users
GROUP BY country;
Average salary by department
SELECT
department_id,
AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id;
Maximum order value by customer
SELECT
customer_id,
MAX(total_amount) AS max_order_amount
FROM orders
GROUP BY customer_id;
HAVING becomes relevant when you want to filter those grouped summaries.
GROUP BY without aggregates can work, but is not always the clearest choice
Sometimes people write:
SELECT country
FROM users
GROUP BY country;
This may work and can return one row per unique country.
But if your real goal is only:
- unique values
then DISTINCT is often clearer:
SELECT DISTINCT country
FROM users;
That is because GROUP BY implies:
- grouping logic
- possible aggregation
- one row per group
while DISTINCT more directly says:
- remove duplicate result rows
This is not always a correctness issue, but it is a readability and intent issue.
Common real-world GROUP BY vs HAVING patterns
There are a few patterns that appear constantly in SQL work.
1. Customers with more than N orders
SELECT
customer_id,
COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5;
2. Products with revenue above a threshold
SELECT
product_id,
SUM(total_amount) AS total_revenue
FROM order_items
GROUP BY product_id
HAVING SUM(total_amount) > 10000;
3. Departments with average salary above a threshold
SELECT
department_id,
AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
HAVING AVG(salary) > 70000;
4. Countries with more than 100 active users
SELECT
country,
COUNT(*) AS active_user_count
FROM users
WHERE is_active = true
GROUP BY country
HAVING COUNT(*) > 100;
This last example is especially useful because it combines:
- row-level filtering with WHERE
- group-level filtering with HAVING
That is exactly how these clauses are meant to work together.
When to use WHERE instead of HAVING
Use WHERE when:
- the condition applies to individual rows
- the aggregate result is not needed yet
- filtering early makes logical and performance sense
Examples:
- paid orders only
- active users only
- events from the last 30 days
- salaries above 30000 before calculating department averages
- rows where the country is South Africa
These are all row conditions.
That means WHERE is the right tool.
Example:
SELECT
department_id,
AVG(salary) AS avg_salary
FROM employees
WHERE salary > 30000
GROUP BY department_id;
When to use HAVING instead of WHERE
Use HAVING when:
- the condition depends on the grouped result
- you need to filter based on aggregate values
- the question is about which groups qualify
Examples:
- customers with more than 5 orders
- departments with average salary above 70000
- countries with more than 100 active users
- months with revenue above 50000
- products with minimum stock movement above a threshold
These are all group conditions.
That means HAVING is the right tool.
Can HAVING repeat an aggregate alias?
This depends on the database.
For example, some people want to write:
SELECT
customer_id,
COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
HAVING order_count > 5;
In some databases this may work. In others it may not.
The most portable and reliable style is often to repeat the aggregate expression:
HAVING COUNT(*) > 5
That avoids ambiguity across engines.
For SQL writing that may move between systems, this is usually the safer style.
Performance note: WHERE usually reduces rows earlier
Although this article is mostly about correctness and clarity, there is also a practical performance angle.
Because WHERE filters rows before grouping, it often reduces the amount of data that reaches the GROUP BY stage.
That can matter a lot on larger datasets.
For example:
SELECT
customer_id,
SUM(total_amount) AS paid_revenue
FROM orders
WHERE status = 'Paid'
GROUP BY customer_id;
This is generally better than trying to group all orders first and then somehow filter paid totals later in a more complicated way.
So a useful performance-minded rule is:
- put row-level filters in WHERE whenever possible
- reserve HAVING for true group-level filters
This is usually both clearer and more efficient.
Common mistakes with GROUP BY
There are a few GROUP BY mistakes that show up often.
1. Forgetting to group selected non-aggregated columns
Bad pattern:
SELECT
customer_id,
customer_name,
COUNT(*) AS order_count
FROM orders
GROUP BY customer_id;
This is usually invalid unless customer_name is also grouped or otherwise handled correctly.
The basic rule is:
- selected columns in grouped queries usually must be either grouped or aggregated
2. Grouping at the wrong level
For example:
- grouping by customer and order date when you only wanted customer-level totals
This can create too many result rows and wrong reporting grain.
3. Using GROUP BY when DISTINCT would be clearer
This is not always wrong, but it can make intent less obvious.
Common mistakes with HAVING
There are also a few classic HAVING mistakes.
1. Using HAVING for ordinary row filtering
Bad pattern:
SELECT
customer_id,
COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
HAVING customer_id = 42;
This may work in some cases, but if customer_id = 42 is just a row-level filter, it is usually clearer and better in WHERE:
WHERE customer_id = 42
The rule is:
- use HAVING only when the filter depends on grouped results
2. Trying to use HAVING instead of understanding WHERE
This often makes queries harder to read and can process more rows than needed.
3. Writing HAVING conditions that do not reflect the business question
For example:
- filtering on raw row conditions after grouping instead of filtering the intended aggregate metric
A practical mental model
A simple mental model that works very well is this:
WHERE
Which raw rows belong in the analysis?
GROUP BY
How should those rows be grouped?
HAVING
Which grouped results should remain?
That sequence makes a lot of SQL much easier to think about.
Practical examples side by side
Example 1: Filter rows before grouping
Question:
- average salary by department, but only for employees with salary above 30000
SELECT
department_id,
AVG(salary) AS avg_salary
FROM employees
WHERE salary > 30000
GROUP BY department_id;
Why WHERE?
- because the filter applies to rows before grouping
Example 2: Filter groups after grouping
Question:
- departments whose average salary is above 70000
SELECT
department_id,
AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
HAVING AVG(salary) > 70000;
Why HAVING?
- because the filter applies to the grouped average
Example 3: Use both WHERE and HAVING
Question:
- customers with more than 5 paid orders
SELECT
customer_id,
COUNT(*) AS paid_order_count
FROM orders
WHERE status = 'Paid'
GROUP BY customer_id
HAVING COUNT(*) > 5;
Why both?
- WHERE filters raw orders to paid ones
- HAVING filters grouped customers to those with more than five paid orders
This is one of the best example patterns to remember.
FAQ
What is the difference between GROUP BY and HAVING in SQL?
GROUP BY groups rows together so aggregate functions can summarize them, while HAVING filters the grouped results after those aggregate calculations are complete.
When should I use HAVING instead of WHERE?
Use HAVING when your filter depends on an aggregate result such as COUNT, SUM, AVG, MIN, or MAX. Use WHERE when you are filtering raw rows before grouping.
Can I use GROUP BY without HAVING?
Yes. GROUP BY is often used without HAVING when you want grouped summaries but do not need to filter the groups afterward.
Can I use HAVING without GROUP BY?
In some SQL databases, yes, especially when aggregate expressions are used over the full result set. But most of the time HAVING is used together with GROUP BY.
Final thoughts
GROUP BY and HAVING are closely connected, but they do different jobs.
The clearest way to remember them is:
GROUP BYcreates the groupsHAVINGfilters the groupsWHEREfilters the raw rows before grouping starts
That is the conceptual core.
Once you understand that sequence, grouped SQL becomes much easier to write and debug.
And that matters because a huge amount of practical SQL depends on grouped reporting:
- counts by category
- revenue by month
- averages by department
- customers above a threshold
- and every other case where you need one row per group instead of one row per raw record
If you keep the rule simple, you will usually stay correct:
- row conditions go in WHERE
- aggregate conditions go in HAVING
- and GROUP BY defines what one result row represents