Supply Chain · Service Analyst Case Study

A distribution network that couldn’t see its own blind spots.

A network of 6 distribution centers, 126 stores, and 20 products, simulated across 90 days. Sales were quietly leaking to stockouts no one saw coming — buried under messy, real-world data. This is the end-to-end analysis that found the money, then built a control tower to keep watch.

$1.53M
estimated lost sales to stockouts: 4.1%
of CEDIS-days ran short: 6
distribution centers (CEDIS)

All data is 100% synthetic. Lost sales is a labeled estimate.

6 CEDIS · simulated national network

02 · The Raw Data

Real operational data never arrives clean.

These are raw shipment records, straight from the source — the kind of messy operational data an analyst actually inherits. Of 13,715 rows, 2,583 (18.8%) carried at least one quality issue. A representative sample, every problem flagged:

Sample of raw shipment rows. Cells with data-quality issues are highlighted and named in the Flags column: SKU typos, incomplete loads, late deliveries, missing or negative quantities, unknown (phantom) stores, and duplicate records.
Shipment	Store	SKU	Ordered	Received	Promised	Actual	Flags
SH-000002	ST-0001	sku-005	65	65	2024-01-06	2024-01-06	SKU typo
SH-000009	ST-0001	sku-019	36	36	2024-01-29	2024-02-02	SKU typolate delivery
SH-000005	ST-0001	SKU-011	25	14	2024-01-16	2024-01-16	incomplete load
SH-000008	ST-0001	SKU-017	57	49	2024-01-26	2024-01-26	incomplete load
SH-000220	ST-9008	SKU-007	58	58	2024-02-20	2024-02-20	unknown store
SH-000350	ST-9000	SKU-015	82	82	2024-03-29	2024-03-29	unknown store
SH-000002	ST-0001	SKU-017	49	49	2024-01-06	2024-01-07	late delivery
SH-000005	ST-0001	SKU-018	47	47	2024-01-16	2024-01-18	late delivery
SH-000059	ST-0003	SKU-020	-50	50	2024-02-03	2024-02-03	negative qty
SH-000025	ST-0001	SKU-006	49	null	2024-03-21	2024-03-21	missing qty
SH-001394	ST-0071	SKU-001	48	48	2024-01-31	2024-01-31	duplicate
SH-001394	ST-0071	SKU-001	48	48	2024-01-31	2024-01-31	duplicate

amber — recoverable (SKU typo, incomplete load, duplicate)red — hard error (late, missing / negative qty, phantom store)

Note that one shipment ID can legitimately repeat across its SKUs — which is exactly why duplicates are detected on shipment + SKU, not the ID alone. Nothing is discarded yet: next, how each defect is detected and resolved — recovering what we can, dropping only the unusable.

03 · The Cleaning

Most problems get fixed, not thrown away.

Every injected defect was detected, then resolved by the cheapest honest means: recover in place where the data allows, flag and keep the real service events, and drop only what is genuinely unusable. 456 issues were repaired without losing a row; just 279 rows were dropped.

Raw intake

13,715

rows received

98.0%

retained

Clean output

13,436

rows kept

456 recovered · 2,000flagged & kept279 removed

Recovered

456

fixed in place — kept

SKU typos394
Standardized to canonical SKU (uppercase; O→0, I/L→1). Rows retained.
Negative quantities62
Sign error corrected to absolute value. Rows retained.

Flagged

2,000

real events — kept for follow-up

Incomplete loads801
Flagged is_incomplete=True for follow-up. Rows retained.
Late deliveries1,199
Flagged is_late=True for follow-up. Rows retained.

Removed

279

unusable — dropped

Duplicate shipment lines203
Dropped exact duplicate (shipment_id, SKU) lines; kept first occurrence.
Phantom shipments12
Dropped — store_id absent from the store master.
Missing received quantity64
Dropped — received quantity missing and cannot be honestly imputed.

✓ Detection check: every injected defect was found — 738/738 matched (typos, duplicates, phantom stores, missing/negative quantities).

04 · The Findings

Where the sales leaked — and why.

An estimated $1.53M (est.) in sales never converted — demand that arrived to empty shelves. Three questions: where did the loss concentrate, when do deliveries fail, and how good is the forecast?

Three CEDIS — Mexico City, Guadalajara, Monterrey — drive 62% of lost sales.

Lost sales concentrate sharply. Fixing the top three distribution centers first captures most of the recoverable upside — that is where to act.

Orders placed Thursday–Saturday slip the most.

Friday orders run late 12.4% of the time vs 6.2% on Tuesday — a weekend dispatch backlog. The dashed line marks the 8.9% network average.

Forecast error is ~18.9% across the board.

Only ~2 points higher for the event-driven categories (Snacks, Beverages) — worth noting, not a headline.

05 · The Control Tower

Everything above — live.

Filter by region, brand, and date. Every KPI and chart recomputes from the same daily aggregate. With no filter applied, the numbers equal the analysis exactly — the same data, made interactive.

Loading live data…

06 · The Prediction

Now — see it coming.

Everything so far looked backward. The last move is forward: a stock-out risk score for every SKU at every CEDIS — and, deliberately, one you can read. Not a black box; a transparent scoring heuristic where every point traces to a cause a planner can act on. That legibility is the point, not a limitation.

How the score works

A 0–100 score from four drivers, each adding points:

Days of cover — the dominant driver. Under 1 day adds 50, tiering to 0 by ~7 days.
Below safety stock — +20 when stock is under the buffer.
Demand volatility (CV) — up to +20 for erratic demand.
SKU rotation — +10 high / +5 medium; a fast-mover costs more to lose.

Capped at 100, bucketed Low <35 · Medium 35–59 · High 60+.

Worked example

SKU-015 · FrutaViva @ Guadalajara

Days of cover (1.48 days): +40
Below safety stock (619 < 1,042): +20
Demand volatility (CV 0.247): +20
High rotation (fast-mover): +10
Total: 90/100 → High

The 7-day risk view

Of 120 SKU×CEDIS positions, the model flags a deliberately tight critical list — 13 High, 27 Medium, 80 Low. Most of the network is fine; attention goes where it’s earned.

Risk landscape

x: days of cover · y: score · dashed = Medium (35) & High (60)

Critical list — 13 highest-risk positions
SKU	CEDIS	Cover	Cov	Saf	Vol	Rot	Score
SKU-015	GDL	1.48d	40	20	20	10	90
SKU-010	GDL	0.58d	50	20	6	5	81
SKU-017	MER	2.29d	30	20	20	10	80
SKU-001	GDL	2.46d	30	20	20	10	80
SKU-011	MTY	0.71d	50	20	0	5	75
SKU-006	GDL	1.55d	40	20	6	5	71
SKU-006	TIJ	1.65d	40	20	6	5	71
SKU-002	MTY	2.45d	30	20	12	5	67
SKU-014	MX	1.64d	40	20	6	0	66
SKU-009	GDL	1.66d	40	20	6	0	66
SKU-007	GDL	1.18d	40	20	6	0	66
SKU-012	MER	1.93d	40	20	6	0	66
SKU-017	GDL	2.94d	30	0	20	10	60

The backward and forward views agree

Guadalajara drove 20% of historical lost sales — the #2 CEDIS in the Pareto — and now holds 7 of the 13 highest forward-looking risk scores. The place the analysis flagged in hindsight is exactly where the model says to look next. Two independent methods, one answer.

07 · The Recommendation

What I’d do Monday morning.

Three moves, in priority order. Each ties back to a finding above — none is generic, and the numbers are the analysis’s, not a wishlist.

01
Concentrate replenishment where the loss is.
↓ Findings — Three CEDIS — Mexico City, Guadalajara, Monterrey — drive 62% of an estimated $1.53M in lost sales.
Review safety stock, reorder points, and inbound lead times for those three centers first. The loss is concentrated, so the earliest fixes capture most of the recoverable upside before touching the long tail.
02
Move the weekend dispatch cutoff earlier.
↓ Findings — Friday orders run late 12.4% of the time vs 6.2% on Tuesday — a weekend dispatch backlog.
Bring the Thursday/Friday cutoff forward, or add weekend dispatch capacity, so late-week orders clear before the backlog builds. A scheduling change, not a capital one — cheap to pilot.
03
Make the forecast event-aware, and run the risk watchlist weekly.
↓ Findings + Prediction — Forecast error of 18.9% that misses the event surge, and 13 positions already flagged High-risk.
Add an event- and payday-aware uplift to the forecast for beverages & snacks, and stand up the High-risk list as a standing weekly report — so planners pre-position stock instead of reacting after the shelf is empty.

If I could run only one play first: the Guadalajara replenishment review. It is the #2 historical loss and 7 of the 13 forward risks — the one place hindsight and the model agree. Highest confidence, first hour.