NULL Handling
NULL Handling
Technical Explanation
NULL represents the absence of a value—it's not zero, not an empty string, not false. NULL means "this value is unknown or not applicable." Understanding NULL behavior is critical because it affects comparisons, aggregations, and JOINs.
The Three-Valued Logic
SQL uses three-valued logic with NULL:
| Expression | Result |
|---|---|
| TRUE AND NULL | UNKNOWN |
| FALSE AND NULL | FALSE |
| TRUE OR NULL | TRUE |
| FALSE OR NULL | UNKNOWN |
| NOT NULL | UNKNOWN |
This means WHERE conditions with NULL often exclude rows you might expect.
NULL Comparison Rules
-- NULL is never equal to anything
NULL = NULL -- Returns UNKNOWN (not TRUE!)
NULL != NULL -- Returns UNKNOWN (not TRUE!)
NULL <> NULL -- Returns UNKNOWN (not TRUE!)
-- Correct ways to check for NULL
column IS NULL
column IS NOT NULL
NULL in Aggregations
| Function | NULL Behavior |
|---|---|
| COUNT(column) | Ignores NULLs |
| SUM(column) | Ignores NULLs |
| AVG(column) | Ignores NULLs |
| COUNT(*) | Counts all rows including NULL |
| MIN/MAX | Ignores NULLs |
NULL in JOINs
-- NULL does NOT match NULL in standard joins
SELECT *
FROM a
JOIN b
ON a.id = b.id;
-- If a.id = NULL and b.id = NULL, they WON'T match!
Code Examples
Using the CatCafe dataset:
-- Table: customers
-- id, name, email, city, registration_date, phone, total_spent
-- Finding NULLs: Customers without a phone number
SELECT *
FROM customers
WHERE phone IS NULL;
-- Handling NULLs: Replace with default value
SELECT
name,
COALESCE(city, 'Unknown') as city,
COALESCE(phone, 'No phone') as phone
FROM customers;
-- NULLIF: Handle division by zero
SELECT
SUM(total_spent) / NULLIF(COUNT(*), 0) as avg_spent
FROM customers;
-- Returns NULL instead of error when COUNT(*) = 0
-- CASE WHEN with NULL handling
SELECT
name,
total_spent,
CASE
WHEN total_spent IS NULL THEN 'No purchases'
WHEN total_spent < 50 THEN 'Low value'
WHEN total_spent < 200 THEN 'Medium value'
ELSE 'High value'
END as customer_tier
FROM customers;
-- Aggregate with NULL filtering
SELECT
COUNT(*) as total_customers,
COUNT(phone) as customers_with_phone,
COUNT(*) - COUNT(phone) as customers_without_phone,
AVG(total_spent) as avg_spent_all,
AVG(total_spent) as avg_spent_with_purchases
FROM customers;
-- NULLs in JOINs: Customers with their latest order
SELECT
customers.name,
orders.order_date,
orders.total_amount
FROM customers
LEFT JOIN orders
ON customers.id = orders.customer_id
AND orders.order_date = (
SELECT MAX(order_date)
FROM orders o2
WHERE o2.customer_id = orders.customer_id
);
-- This ensures we get only the latest order, even if customer has NULL orders
-- Using DISTINCT ON for latest per group (PostgreSQL)
SELECT DISTINCT ON (customers.id)
customers.name,
orders.total_amount,
orders.order_date
FROM customers
LEFT JOIN orders
ON customers.id = orders.customer_id
ORDER BY customers.id, orders.order_date DESC;
The Cat Analogy
Think of NULL as a cat's mood:
- NULL is NOT "the cat is calm"
- NULL is NOT "the cat is angry"
- NULL is "we don't know what the cat's mood is"
Comparing NULLs:
"Is Whiskers' mood the same as Mittens' mood?"
"We don't know if Whiskers is calm" = "We don't know if Mittens is calm"
→ UNKNOWN (not "yes, they're the same")
NULL in aggregations: If you're tallying up cat snacks:
- SUM(snacks) ignores bowls with NULL (no snack recorded)
- COUNT(*) counts all bowls, including empty ones
- COUNT(snacks) counts only bowls where snacks were recorded
NULL in comparisons:
-- "Show me all unassigned cats"
WHERE assigned_to IS NULL -- Correct: find cats with no assignment
WHERE assigned_to = NULL -- Wrong: this never matches anything!
-- "Show me cats with assignments"
WHERE assigned_to IS NOT NULL
Common Pitfalls
Comparing with NULL
-- WRONG: This never returns rows!
SELECT *
FROM cats
WHERE adopted_date = NULL;
-- RIGHT:
SELECT *
FROM cats
WHERE adopted_date IS NULL;
NULL in String Concatenation
-- Concatenation with NULL produces NULL
SELECT 'Name: ' || name || ', Age: ' || age
FROM cats;
-- If age is NULL, entire row becomes NULL!
-- Solution: Use COALESCE
SELECT 'Name: ' || name || ', Age: ' || COALESCE(age::text, 'Unknown')
FROM cats;
NOT IN with NULLs
-- Dangerous: If subquery returns NULL, NOT IN fails
SELECT *
FROM cats
WHERE id NOT IN (SELECT cat_id FROM adoption_fees WHERE cat_id IS NULL);
-- This returns nothing if any cat_id is NULL!
-- Safer: Use NOT EXISTS or exclude NULLs explicitly
SELECT *
FROM cats
WHERE id NOT IN (
SELECT cat_id
FROM adoption_fees
WHERE cat_id IS NOT NULL
);
-- Or better:
SELECT *
FROM cats c
WHERE NOT EXISTS (
SELECT 1 FROM adoption_fees af WHERE af.cat_id = c.id
);
Exercises
Exercise 1
Write a query to find all customers who:
- Have never placed an order (no matching records in orders table)
- Have placed orders but never provided a phone number
Exercise 2
Explain why these return different results:
-- Query A
SELECT COUNT(*) FROM orders WHERE customer_id NOT IN (1, 2, NULL);
-- Query B
SELECT COUNT(*) FROM orders WHERE customer_id NOT IN (1, 2) AND customer_id IS NOT NULL;
Exercise 3
Write a query showing each customer's name, total spent, and a "spending tier" (use CASE WHEN with COALESCE for NULL total_spent).
Exercise 4
Why does this query return no results?
SELECT *
FROM customers
WHERE total_spent > 100
OR total_spent = NULL;
Exercise 5
Write a query to find the percentage of customers with NULL phone numbers.
Key Takeaways
- NULL means "absence of value," not zero or empty string
- Always use
IS NULL/IS NOT NULLfor NULL comparisons - NULL comparisons always return UNKNOWN, never TRUE or FALSE directly
- COALESCE and NULLIF are essential for NULL handling
- Aggregates (SUM, AVG, COUNT(col)) ignore NULLs; COUNT(*) counts them
- Be careful with NOT IN and subqueries that might contain NULLs
- String concatenation with NULL produces NULL—use COALESCE