WHERE vs HAVING
WHERE vs HAVING
Technical Explanation
WHERE and HAVING both filter data, but they operate at different stages of query execution and serve different purposes.
The Execution Order
SQL queries don't execute in the order you write them. The logical execution order is:
1. FROM
2. WHERE
3. GROUP BY
4. HAVING
5. SELECT
6. ORDER BY
This means WHERE filters rows before aggregation, while HAVING filters groups after aggregation.
When to Use Each
| Operator | Filters | Can use Aggregates? | Use When |
|---|---|---|---|
| WHERE | Individual rows | No | Filtering raw data before grouping |
| HAVING | Groups (after GROUP BY) | Yes | Filtering based on aggregate results |
Key Differences
- WHERE goes first — It evaluates each row's condition before any grouping happens
- HAVING evaluates after — It can only see the result of aggregations, not individual rows
- Performance — WHERE can use indexes; HAVING typically cannot
Using WHERE for row-level filtering and HAVING for group-level filtering is not just about correctness—it's about performance and readability.
Code Examples
Using the CatCafe dataset:
-- Dataset: customers
-- id, name, email, registration_date, city, total_spent
-- WHERE: Filter rows BEFORE grouping
-- Find customers from "Brooklyn" who spent more than $50
SELECT
city,
COUNT(*) as customer_count,
SUM(total_spent) as total_revenue
FROM customers
WHERE city = 'Brooklyn' -- Filters individual rows
GROUP BY city;
-- Result: Only Brooklyn customers, aggregated
-- HAVING: Filter groups AFTER grouping
-- Find cities with more than 10 customers
SELECT
city,
COUNT(*) as customer_count,
SUM(total_spent) as total_revenue
FROM customers
GROUP BY city
HAVING COUNT(*) > 10; -- Filters groups after aggregation
-- Result: Only cities with 10+ customers
-- Combining both: Use WHERE AND HAVING
-- Cities with 10+ customers from customers who registered in 2024
SELECT
city,
COUNT(*) as customer_count,
AVG(total_spent) as avg_spent
FROM customers
WHERE registration_date >= '2024-01-01' -- Row-level filter
GROUP BY city
HAVING COUNT(*) > 10; -- Group-level filter
Common Mistake
-- WRONG: Trying to filter on aggregate in WHERE
SELECT
city,
COUNT(*) as customer_count
FROM customers
WHERE COUNT(*) > 10 -- ERROR: aggregate not allowed in WHERE
GROUP BY city;
-- CORRECT: Use HAVING for aggregate filters
SELECT
city,
COUNT(*) as customer_count
FROM customers
GROUP BY city
HAVING COUNT(*) > 10; -- This works
The Cat Analogy
Imagine you're organizing a cat show.
WHERE is like sorting cats as they arrive:
- "Only let cats with clean fur into the main hall"
- "Only cats with collars can enter"
- Each cat is evaluated individually before the show begins
HAVING is like deciding which groups get awards:
- After cats are grouped by breed, you say "Only groups with more than 5 cats qualify"
- After grouping by color, you say "Only groups where the average weight is above 5kg get a prize"
- You're filtering based on properties of the group itself, not individual cats
You can't say "only groups with more than 5 cats" before you know how many cats are in each group—that's why HAVING comes after GROUP BY.
Exercises
Exercise 1
Given this reservations table:
-- id, cat_id, customer_id, reservation_date, duration_hours, status
Write a query that shows average duration per status, but only for reservations made in 2024, and only for statuses with more than 100 reservations.
Exercise 2
Explain why this query is problematic:
SELECT city, COUNT(*) as cnt
FROM customers
WHERE COUNT(*) > 5
GROUP BY city;
Exercise 3
Write a query using both WHERE and HAVING that:
- Filters catcafe orders to only those with
status = 'completed' - Groups by
customer_id - Shows only customers with more than 5 completed orders
- Includes their total spending
Exercise 4
In the CatCafe scenario, would you use WHERE or HAVING to:
- Exclude cancelled reservations?
- Find customers whose average order is over $50?
- Filter to a specific city before grouping?
- Find cities where total revenue exceeds $10,000?
Key Takeaways
- WHERE filters rows before GROUP BY; HAVING filters groups after GROUP BY
- WHERE cannot use aggregate functions; HAVING is designed for them
- For best performance, filter early with WHERE whenever possible
- Use both together: WHERE for row conditions, HAVING for aggregate conditions