Home/Data Analyst/Business Metrics

Metric Drop Investigation

Metric Drop Investigation

Technical Explanation

When a key metric drops unexpectedly, investigating systematically prevents misdiagnosis and wrong fixes. The drop might be a real problem, a data issue, a statistical anomaly, or expected seasonal variation.

Investigation Framework

Step Action
1. Verify Confirm the drop is real (not a reporting error)
2. Scope Is it all segments or specific ones?
3. Timeline When did it start? Any correlating events?
4. Root Cause What changed that could cause this?
5. Fix vs Monitor Is it actionable or just noise?

Questions to Ask

  • Did the data pipeline change?
  • Was there a site/app issue?
  • Did we launch a new feature?
  • Is this seasonal/normal pattern?
  • Did a competitor do something?

Code Examples

Using the CatCafe dataset:

-- Step 1: Verify - Check if drop is in raw data or aggregation issue
SELECT
    DATE_TRUNC('day', order_date) as day,
    COUNT(*) as orders,
    COUNT(DISTINCT customer_id) as unique_customers,
    SUM(total_amount) as revenue
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY DATE_TRUNC('day', order_date)
ORDER BY day;

-- Step 2: Scope - Is it specific segments or all?
SELECT
    DATE_TRUNC('day', order_date) as day,
    segment,
    COUNT(*) as orders,
    SUM(total_amount) as revenue
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE order_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY DATE_TRUNC('day', order_date), segment
ORDER BY day, segment;

-- Step 3: Compare to same period last period (week-over-week)
WITH daily_metrics AS (
    SELECT
        DATE_TRUNC('day', order_date) as day,
        SUM(total_amount) as revenue,
        COUNT(DISTINCT customer_id) as customers
    FROM orders
    WHERE status = 'completed'
    GROUP BY DATE_TRUNC('day', order_date)
)
SELECT
    current.day,
    current.revenue as revenue,
    previous.revenue as prev_week_revenue,
    current.revenue - previous.revenue as revenue_diff,
    (current.revenue - previous.revenue) * 100.0 / NULLIF(previous.revenue, 0) as wow_change_pct
FROM daily_metrics current
LEFT JOIN daily_metrics previous
    ON current.day - previous.day = INTERVAL '7 days'
WHERE current.day >= CURRENT_DATE - INTERVAL '14 days'
ORDER BY current.day;

-- Step 4: Compare to same day last year ( YoY )
WITH daily_revenue AS (
    SELECT
        DATE_TRUNC('day', order_date) as day,
        SUM(total_amount) as revenue
    FROM orders
    WHERE status = 'completed'
    GROUP BY DATE_TRUNC('day', order_date)
)
SELECT
    current.day,
    current.revenue,
    previous.revenue as same_day_last_year,
    (current.revenue - previous.revenue) * 100.0 / NULLIF(previous.revenue, 0) as yoy_change
FROM daily_revenue current
LEFT JOIN daily_revenue previous
    ON current.day - previous.revenue = INTERVAL '1 year'  -- Wrong, should be day of year
ORDER BY current.day;

-- Step 4 corrected: Compare day of week
WITH dow_metrics AS (
    SELECT
        TO_CHAR(order_date, 'Dy') as day_name,
        EXTRACT(DOW FROM order_date) as dow,
        AVG(SUM(total_amount)) OVER (
            PARTITION BY EXTRACT(DOW FROM order_date)
            ORDER BY DATE_TRUNC('week', order_date)
        ) as rolling_avg
    FROM orders
    WHERE status = 'completed'
    GROUP BY DATE_TRUNC('week', order_date), EXTRACT(DOW FROM order_date)
)
SELECT
    day_name,
    AVG(rolling_avg) as expected_revenue
FROM dow_metrics
GROUP BY day_name, dow
ORDER BY dow;

-- Step 5: Check for data pipeline issues
SELECT
    DATE_TRUNC('day', created_at) as day,
    COUNT(*) as records_processed,
    COUNT(*) FILTER (WHERE status = 'failed') as failed_records
FROM etl_pipeline_logs
WHERE created_at >= CURRENT_DATE - INTERVAL '7 days'
GROUP BY DATE_TRUNC('day', created_at)
ORDER BY day;

The Cat Analogy

Your cat's weight dropped from 10 lbs to 9 lbs. Investigate systematically:

Step 1: Verify

  • Is the scale working? (Check data accuracy)
  • Did someone else feed the cat? (Check if it's real)

Step 2: Scope

  • All cats dropped or just Whiskers? (Is it specific or universal?)
  • Only morning weights or all day? (Time-based anomaly?)

Step 3: Timeline

  • Did it happen suddenly or gradually? (Gradual = health issue, sudden = diet change)
  • Did we change food recently? (Correlating event)

Step 4: Root Cause

  • Less food available?
  • More exercise?
  • Worms?
  • Stress?

Step 5: Fix vs Monitor

  • If worms → treat immediately
  • If normal after moving → monitor only

Common Pitfalls

Jumping to Conclusions

-- WRONG: "Revenue dropped, must be the marketing campaign"
SELECT *
FROM metrics
WHERE date = '2024-03-15';
-- Without checking: seasonality, data issues, other changes!

-- RIGHT: Check multiple potential causes
SELECT
    date,
    revenue,
    page_views,
    new_signups,
    support_tickets
FROM daily_metrics
ORDER BY date DESC
LIMIT 30;

Exercises

Exercise 1

Revenue dropped 20% on Tuesday. Write a query to compare Tuesday vs Monday, vs same day last week.

Exercise 2

List 5 possible reasons for a sudden revenue drop that are NOT real problems.

Exercise 3

Write a query to identify if a drop is isolated to a specific segment or universal.

Exercise 4

What signals would indicate the drop is a data pipeline issue vs a real business change?

Exercise 5

Write a query to check if a metric drop correlates with any site changes or launches.


Key Takeaways

  • Always verify before investigating (rule out data errors)
  • Scope: Is it all segments or specific ones?
  • Timeline: Gradual decline vs sudden drop = different causes
  • Check correlating events: launches, outages, seasonality
  • Fix actionable causes; monitor normal variation
  • Document findings to prevent future misdiagnosis