Metric Drop Investigation
Metric Drop Investigation
Technical Explanation
When a key metric drops unexpectedly, investigating systematically prevents misdiagnosis and wrong fixes. The drop might be a real problem, a data issue, a statistical anomaly, or expected seasonal variation.
Investigation Framework
| Step | Action |
|---|---|
| 1. Verify | Confirm the drop is real (not a reporting error) |
| 2. Scope | Is it all segments or specific ones? |
| 3. Timeline | When did it start? Any correlating events? |
| 4. Root Cause | What changed that could cause this? |
| 5. Fix vs Monitor | Is it actionable or just noise? |
Questions to Ask
- Did the data pipeline change?
- Was there a site/app issue?
- Did we launch a new feature?
- Is this seasonal/normal pattern?
- Did a competitor do something?
Code Examples
Using the CatCafe dataset:
-- Step 1: Verify - Check if drop is in raw data or aggregation issue
SELECT
DATE_TRUNC('day', order_date) as day,
COUNT(*) as orders,
COUNT(DISTINCT customer_id) as unique_customers,
SUM(total_amount) as revenue
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY DATE_TRUNC('day', order_date)
ORDER BY day;
-- Step 2: Scope - Is it specific segments or all?
SELECT
DATE_TRUNC('day', order_date) as day,
segment,
COUNT(*) as orders,
SUM(total_amount) as revenue
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE order_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY DATE_TRUNC('day', order_date), segment
ORDER BY day, segment;
-- Step 3: Compare to same period last period (week-over-week)
WITH daily_metrics AS (
SELECT
DATE_TRUNC('day', order_date) as day,
SUM(total_amount) as revenue,
COUNT(DISTINCT customer_id) as customers
FROM orders
WHERE status = 'completed'
GROUP BY DATE_TRUNC('day', order_date)
)
SELECT
current.day,
current.revenue as revenue,
previous.revenue as prev_week_revenue,
current.revenue - previous.revenue as revenue_diff,
(current.revenue - previous.revenue) * 100.0 / NULLIF(previous.revenue, 0) as wow_change_pct
FROM daily_metrics current
LEFT JOIN daily_metrics previous
ON current.day - previous.day = INTERVAL '7 days'
WHERE current.day >= CURRENT_DATE - INTERVAL '14 days'
ORDER BY current.day;
-- Step 4: Compare to same day last year ( YoY )
WITH daily_revenue AS (
SELECT
DATE_TRUNC('day', order_date) as day,
SUM(total_amount) as revenue
FROM orders
WHERE status = 'completed'
GROUP BY DATE_TRUNC('day', order_date)
)
SELECT
current.day,
current.revenue,
previous.revenue as same_day_last_year,
(current.revenue - previous.revenue) * 100.0 / NULLIF(previous.revenue, 0) as yoy_change
FROM daily_revenue current
LEFT JOIN daily_revenue previous
ON current.day - previous.revenue = INTERVAL '1 year' -- Wrong, should be day of year
ORDER BY current.day;
-- Step 4 corrected: Compare day of week
WITH dow_metrics AS (
SELECT
TO_CHAR(order_date, 'Dy') as day_name,
EXTRACT(DOW FROM order_date) as dow,
AVG(SUM(total_amount)) OVER (
PARTITION BY EXTRACT(DOW FROM order_date)
ORDER BY DATE_TRUNC('week', order_date)
) as rolling_avg
FROM orders
WHERE status = 'completed'
GROUP BY DATE_TRUNC('week', order_date), EXTRACT(DOW FROM order_date)
)
SELECT
day_name,
AVG(rolling_avg) as expected_revenue
FROM dow_metrics
GROUP BY day_name, dow
ORDER BY dow;
-- Step 5: Check for data pipeline issues
SELECT
DATE_TRUNC('day', created_at) as day,
COUNT(*) as records_processed,
COUNT(*) FILTER (WHERE status = 'failed') as failed_records
FROM etl_pipeline_logs
WHERE created_at >= CURRENT_DATE - INTERVAL '7 days'
GROUP BY DATE_TRUNC('day', created_at)
ORDER BY day;
The Cat Analogy
Your cat's weight dropped from 10 lbs to 9 lbs. Investigate systematically:
Step 1: Verify
- Is the scale working? (Check data accuracy)
- Did someone else feed the cat? (Check if it's real)
Step 2: Scope
- All cats dropped or just Whiskers? (Is it specific or universal?)
- Only morning weights or all day? (Time-based anomaly?)
Step 3: Timeline
- Did it happen suddenly or gradually? (Gradual = health issue, sudden = diet change)
- Did we change food recently? (Correlating event)
Step 4: Root Cause
- Less food available?
- More exercise?
- Worms?
- Stress?
Step 5: Fix vs Monitor
- If worms → treat immediately
- If normal after moving → monitor only
Common Pitfalls
Jumping to Conclusions
-- WRONG: "Revenue dropped, must be the marketing campaign"
SELECT *
FROM metrics
WHERE date = '2024-03-15';
-- Without checking: seasonality, data issues, other changes!
-- RIGHT: Check multiple potential causes
SELECT
date,
revenue,
page_views,
new_signups,
support_tickets
FROM daily_metrics
ORDER BY date DESC
LIMIT 30;
Exercises
Exercise 1
Revenue dropped 20% on Tuesday. Write a query to compare Tuesday vs Monday, vs same day last week.
Exercise 2
List 5 possible reasons for a sudden revenue drop that are NOT real problems.
Exercise 3
Write a query to identify if a drop is isolated to a specific segment or universal.
Exercise 4
What signals would indicate the drop is a data pipeline issue vs a real business change?
Exercise 5
Write a query to check if a metric drop correlates with any site changes or launches.
Key Takeaways
- Always verify before investigating (rule out data errors)
- Scope: Is it all segments or specific ones?
- Timeline: Gradual decline vs sudden drop = different causes
- Check correlating events: launches, outages, seasonality
- Fix actionable causes; monitor normal variation
- Document findings to prevent future misdiagnosis