Supply Chain · Service Analyst Case Study
A distribution network that couldn’t see its own blind spots.
A network of 6 distribution centers, 126 stores, and 20 products, simulated across 90 days. Sales were quietly leaking to stockouts no one saw coming — buried under messy, real-world data. This is the end-to-end analysis that found the money, then built a control tower to keep watch.
- $1.53M
- estimated lost sales to stockouts
- 4.1%
- of CEDIS-days ran short
- 6
- distribution centers (CEDIS)
All data is 100% synthetic. Lost sales is a labeled estimate.
6 CEDIS · simulated national network
02 · The Raw Data
Real operational data never arrives clean.
These are raw shipment records, straight from the source — the kind of messy operational data an analyst actually inherits. Of 13,715 rows, 2,583 (18.8%) carried at least one quality issue. A representative sample, every problem flagged:
| Shipment | Store | SKU | Ordered | Received | Promised | Actual | Flags |
|---|---|---|---|---|---|---|---|
| SH-000002 | ST-0001 | sku-005 | 65 | 65 | 2024-01-06 | 2024-01-06 | SKU typo |
| SH-000009 | ST-0001 | sku-019 | 36 | 36 | 2024-01-29 | 2024-02-02 | SKU typolate delivery |
| SH-000005 | ST-0001 | SKU-011 | 25 | 14 | 2024-01-16 | 2024-01-16 | incomplete load |
| SH-000008 | ST-0001 | SKU-017 | 57 | 49 | 2024-01-26 | 2024-01-26 | incomplete load |
| SH-000220 | ST-9008 | SKU-007 | 58 | 58 | 2024-02-20 | 2024-02-20 | unknown store |
| SH-000350 | ST-9000 | SKU-015 | 82 | 82 | 2024-03-29 | 2024-03-29 | unknown store |
| SH-000002 | ST-0001 | SKU-017 | 49 | 49 | 2024-01-06 | 2024-01-07 | late delivery |
| SH-000005 | ST-0001 | SKU-018 | 47 | 47 | 2024-01-16 | 2024-01-18 | late delivery |
| SH-000059 | ST-0003 | SKU-020 | -50 | 50 | 2024-02-03 | 2024-02-03 | negative qty |
| SH-000025 | ST-0001 | SKU-006 | 49 | null | 2024-03-21 | 2024-03-21 | missing qty |
| SH-001394 | ST-0071 | SKU-001 | 48 | 48 | 2024-01-31 | 2024-01-31 | duplicate |
| SH-001394 | ST-0071 | SKU-001 | 48 | 48 | 2024-01-31 | 2024-01-31 | duplicate |
Note that one shipment ID can legitimately repeat across its SKUs — which is exactly why duplicates are detected on shipment + SKU, not the ID alone. Nothing is discarded yet: next, how each defect is detected and resolved — recovering what we can, dropping only the unusable.
03 · The Cleaning
Most problems get fixed, not thrown away.
Every injected defect was detected, then resolved by the cheapest honest means: recover in place where the data allows, flag and keep the real service events, and drop only what is genuinely unusable. 456 issues were repaired without losing a row; just 279 rows were dropped.
Raw intake
13,715
rows received
98.0%
retained
Clean output
13,436
rows kept
Recovered
456
fixed in place — kept
- SKU typos394
Standardized to canonical SKU (uppercase; O→0, I/L→1). Rows retained.
- Negative quantities62
Sign error corrected to absolute value. Rows retained.
Flagged
2,000
real events — kept for follow-up
- Incomplete loads801
Flagged is_incomplete=True for follow-up. Rows retained.
- Late deliveries1,199
Flagged is_late=True for follow-up. Rows retained.
Removed
279
unusable — dropped
- Duplicate shipment lines203
Dropped exact duplicate (shipment_id, SKU) lines; kept first occurrence.
- Phantom shipments12
Dropped — store_id absent from the store master.
- Missing received quantity64
Dropped — received quantity missing and cannot be honestly imputed.
✓ Detection check: every injected defect was found — 738/738 matched (typos, duplicates, phantom stores, missing/negative quantities).
04 · The Findings
Where the sales leaked — and why.
An estimated $1.53M (est.) in sales never converted — demand that arrived to empty shelves. Three questions: where did the loss concentrate, when do deliveries fail, and how good is the forecast?
Three CEDIS — Mexico City, Guadalajara, Monterrey — drive 62% of lost sales.
Lost sales concentrate sharply. Fixing the top three distribution centers first captures most of the recoverable upside — that is where to act.
Orders placed Thursday–Saturday slip the most.
Friday orders run late 12.4% of the time vs 6.2% on Tuesday — a weekend dispatch backlog. The dashed line marks the 8.9% network average.
Forecast error is ~18.9% across the board.
Only ~2 points higher for the event-driven categories (Snacks, Beverages) — worth noting, not a headline.
05 · The Control Tower
Everything above — live.
Filter by region, brand, and date. Every KPI and chart recomputes from the same daily aggregate. With no filter applied, the numbers equal the analysis exactly — the same data, made interactive.
Loading live data…
06 · The Prediction
Now — see it coming.
Everything so far looked backward. The last move is forward: a stock-out risk score for every SKU at every CEDIS — and, deliberately, one you can read. Not a black box; a transparent scoring heuristic where every point traces to a cause a planner can act on. That legibility is the point, not a limitation.
How the score works
A 0–100 score from four drivers, each adding points:
- Days of cover — the dominant driver. Under 1 day adds 50, tiering to 0 by ~7 days.
- Below safety stock — +20 when stock is under the buffer.
- Demand volatility (CV) — up to +20 for erratic demand.
- SKU rotation — +10 high / +5 medium; a fast-mover costs more to lose.
Capped at 100, bucketed Low <35 · Medium 35–59 · High 60+.
Worked example
SKU-015 · FrutaViva @ Guadalajara
- Days of cover (1.48 days)
- +40
- Below safety stock (619 < 1,042)
- +20
- Demand volatility (CV 0.247)
- +20
- High rotation (fast-mover)
- +10
- Total
- 90/100 → High
The 7-day risk view
Of 120 SKU×CEDIS positions, the model flags a deliberately tight critical list — 13 High, 27 Medium, 80 Low. Most of the network is fine; attention goes where it’s earned.
x: days of cover · y: score · dashed = Medium (35) & High (60)
| SKU | CEDIS | Cover | Cov | Saf | Vol | Rot | Score |
|---|---|---|---|---|---|---|---|
| SKU-015 | GDL | 1.48d | 40 | 20 | 20 | 10 | 90 |
| SKU-010 | GDL | 0.58d | 50 | 20 | 6 | 5 | 81 |
| SKU-017 | MER | 2.29d | 30 | 20 | 20 | 10 | 80 |
| SKU-001 | GDL | 2.46d | 30 | 20 | 20 | 10 | 80 |
| SKU-011 | MTY | 0.71d | 50 | 20 | 0 | 5 | 75 |
| SKU-006 | GDL | 1.55d | 40 | 20 | 6 | 5 | 71 |
| SKU-006 | TIJ | 1.65d | 40 | 20 | 6 | 5 | 71 |
| SKU-002 | MTY | 2.45d | 30 | 20 | 12 | 5 | 67 |
| SKU-014 | MX | 1.64d | 40 | 20 | 6 | 0 | 66 |
| SKU-009 | GDL | 1.66d | 40 | 20 | 6 | 0 | 66 |
| SKU-007 | GDL | 1.18d | 40 | 20 | 6 | 0 | 66 |
| SKU-012 | MER | 1.93d | 40 | 20 | 6 | 0 | 66 |
| SKU-017 | GDL | 2.94d | 30 | 0 | 20 | 10 | 60 |
The backward and forward views agree
Guadalajara drove 20% of historical lost sales — the #2 CEDIS in the Pareto — and now holds 7 of the 13 highest forward-looking risk scores. The place the analysis flagged in hindsight is exactly where the model says to look next. Two independent methods, one answer.
07 · The Recommendation
What I’d do Monday morning.
Three moves, in priority order. Each ties back to a finding above — none is generic, and the numbers are the analysis’s, not a wishlist.
- 01
Concentrate replenishment where the loss is.
↓ Findings — Three CEDIS — Mexico City, Guadalajara, Monterrey — drive 62% of an estimated $1.53M in lost sales.
Review safety stock, reorder points, and inbound lead times for those three centers first. The loss is concentrated, so the earliest fixes capture most of the recoverable upside before touching the long tail.
- 02
Move the weekend dispatch cutoff earlier.
↓ Findings — Friday orders run late 12.4% of the time vs 6.2% on Tuesday — a weekend dispatch backlog.
Bring the Thursday/Friday cutoff forward, or add weekend dispatch capacity, so late-week orders clear before the backlog builds. A scheduling change, not a capital one — cheap to pilot.
- 03
Make the forecast event-aware, and run the risk watchlist weekly.
↓ Findings + Prediction — Forecast error of 18.9% that misses the event surge, and 13 positions already flagged High-risk.
Add an event- and payday-aware uplift to the forecast for beverages & snacks, and stand up the High-risk list as a standing weekly report — so planners pre-position stock instead of reacting after the shelf is empty.
If I could run only one play first: the Guadalajara replenishment review. It is the #2 historical loss and 7 of the 13 forward risks — the one place hindsight and the model agree. Highest confidence, first hour.