# Live State Layer M1 Ship Summary

Generated: 2026-05-07

## Ladder Summary

| Step | Commit | Primary artifact |
| --- | --- | --- |
| Step 1 | `9aac193` | `scripts/shared/fetch_live_mlb_state.py`, `MLBStatsAPIClient.get_standings()` |
| Step 2 | `3e9a23f` | `scripts/shared/fetch_statcast_seasonal_current.py` |
| Step 3 | `f2f2a28` | `scripts/shared/compute_actual_vs_projected_deltas.py` |
| Step 4 | `e831d05` | `scripts/shared/compute_regression_candidates.py` |
| Step 5 | `50401d7` | `scripts/shared/surface_daily_stories.py`, live loader functions |
| Step 6 | this commit | Verification, route smoke, ship summary |

## What Shipped

Live State Layer M1 adds a cached current-state data layer. Home, Daily, Analytics, and future design-refresh pages can now show current-season results and compare them to preseason projections and underlying Statcast indicators.

## Scheduler Tasks

| Task | Cadence | Probe metric on 2026-05-07 |
| --- | --- | ---: |
| `fetch_live_mlb_state` | daily `04:15` | 615 rows |
| `fetch_statcast_seasonal_current` | daily `04:30` | 2,309 rows |
| `compute_actual_vs_projected_deltas` | daily `05:00` | 600 rows |
| `compute_regression_candidates` | daily `05:15` | 387 rows |
| `surface_daily_stories` | daily `05:30` | 25 rows |

Forced scheduler verification passed for all five tasks on 2026-05-07.

## Data Artifacts

| Artifact | Rows | Contents |
| --- | ---: | --- |
| `data/derived/live/standings_2026-05-07.parquet` | 30 | Current standings from MLB Stats API |
| `data/derived/live/team_records_2026-05-07.parquet` | 30 | Team record summary, games back, run diff, streaks |
| `data/derived/live/leaderboards_hitting_2026-05-07.parquet` | 177 | Cached current hitting leaderboards |
| `data/derived/live/leaderboards_pitching_2026-05-07.parquet` | 84 | Cached current pitching leaderboards |
| `data/derived/live/leaderboards_fielding_2026-05-07.parquet` | 294 | Cached current fielding leaderboards |
| `data/historical/statcast_hitter_seasonal/hitter_seasonal_2026.parquet` | 1,335 | Current-season Statcast hitter seasonal table, schema-aligned with 2025 |
| `data/historical/statcast_pitcher_seasonal/pitcher_seasonal_2026.parquet` | 974 | Current-season Statcast pitcher seasonal table, schema-aligned with 2025 |
| `data/derived/live/deltas_team_2026-05-07.parquet` | 30 | Actual 162-game win pace vs projected wins |
| `data/derived/live/deltas_player_hitter_2026-05-07.parquet` | 368 | 50+ PA hitter actual/underlying deltas vs projection |
| `data/derived/live/deltas_player_pitcher_2026-05-07.parquet` | 202 | 20+ IP pitcher deltas vs projection |
| `data/derived/live/regression_candidates_2026-05-07.parquet` | 387 | 100+ PA hitter and 30+ IP pitcher surface-vs-underlying gaps |
| `data/derived/live/daily_stories_2026-05-07.parquet` | 25 | Templated current-state story candidates |
| `data/derived/live/latest.json` | n/a | Latest snapshot pointer for loader functions |

All live parquets write JSON manifest sidecars.
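
The team delta artifact compares a 162-game win pace against projected wins. The pipeline's exact formula is not reproduced in this summary, so the following is only a minimal sketch assuming a simple linear extrapolation of the current record. The function names and the projected-win value are hypothetical.

```python
def win_pace(wins: int, losses: int, season_games: int = 162) -> float:
    """Extrapolate a current record to a full-season win total (assumed linear)."""
    games_played = wins + losses
    if games_played == 0:
        raise ValueError("win pace is undefined before any games are played")
    return wins / games_played * season_games


def pace_delta(wins: int, losses: int, projected_wins: float) -> float:
    """Positive means the team is ahead of its projection."""
    return win_pace(wins, losses) - projected_wins


# A 26-12 record extrapolates to roughly 110.8 wins over 162 games.
print(round(win_pace(26, 12), 1))

# Hypothetical projection of 90 wins, used only to illustrate the delta.
print(round(pace_delta(26, 12, projected_wins=90.0), 1))
```

Under this assumption, each additional win through 36 games moves the pace by 162 / 36 = 4.5 wins, which is why early-season deltas can look large.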

## Loader Functions

New loader functions in `src/pitcher_card_engine/web/loader.py`:

| Function | Return shape |
| --- | --- |
| `load_live_standings(date=None)` | `{"snapshot_date", "rows"}` |
| `load_live_leaders(group, date=None)` | `{"snapshot_date", "group", "rows"}` |
| `load_team_records(date=None)` | `{"snapshot_date", "rows"}` |
| `load_actual_vs_projected_team_deltas(date=None)` | `{"snapshot_date", "rows"}` |
| `load_actual_vs_projected_player_deltas(group, date=None)` | `{"snapshot_date", "group", "rows"}` |
| `load_regression_candidates(group=None, date=None)` | `{"snapshot_date", "group", "rows"}` |
| `load_daily_stories(date=None)` | `{"snapshot_date", "rows"}` |
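
Each loader accepts `date=None`, and `data/derived/live/latest.json` is described as the latest-snapshot pointer. One plausible way these fit together is sketched below. This is not the code in `loader.py`: the helper names and the `snapshot_date` key inside `latest.json` are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Optional

LIVE_DIR = Path("data/derived/live")


def resolve_snapshot_date(date: Optional[str] = None, live_dir: Path = LIVE_DIR) -> str:
    """Return the explicit date, or fall back to the pointer file."""
    if date is not None:
        return date
    pointer = json.loads((live_dir / "latest.json").read_text(encoding="utf-8"))
    # Hypothetical key: the real pointer schema is not shown in this summary.
    return pointer["snapshot_date"]


def artifact_path(name: str, date: Optional[str] = None, live_dir: Path = LIVE_DIR) -> Path:
    """Build a dated artifact path such as standings_2026-05-07.parquet."""
    return live_dir / f"{name}_{resolve_snapshot_date(date, live_dir)}.parquet"
```

With an explicit date, `artifact_path("standings", "2026-05-07")` yields `data/derived/live/standings_2026-05-07.parquet` without touching the pointer file.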

`_load_leaderboard_sections()` now prefers cached `data/derived/live/leaderboards_<group>_<date>.parquet` when the cache is younger than 24 hours, falling back to the live Stats API only when the cache is missing or stale.
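
The 24-hour freshness rule can be sketched as follows. This is an illustration rather than the actual `_load_leaderboard_sections()` logic: it assumes cache age is measured from the file's modification time, and both helper names are hypothetical.

```python
import time
from pathlib import Path
from typing import Optional

MAX_CACHE_AGE_SECONDS = 24 * 60 * 60  # "younger than 24 hours"


def is_fresh(mtime: float, now: float) -> bool:
    """True when the cache was modified less than 24 hours before `now`."""
    return (now - mtime) < MAX_CACHE_AGE_SECONDS


def choose_leaderboard_source(cache_path: Path, now: Optional[float] = None) -> str:
    """Return "cache" for a fresh cache file, otherwise "api"."""
    if not cache_path.exists():
        return "api"  # missing cache: fall back to the live Stats API
    current = time.time() if now is None else now
    if is_fresh(cache_path.stat().st_mtime, current):
        return "cache"
    return "api"  # stale cache: fall back to the live Stats API
```

Passing `now` explicitly makes the stale-cache path easy to exercise in a test without waiting a day.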

## Sample Outputs

| Check | Sample |
| --- | --- |
| Current standings | `NYY 26-12, +81 run diff, 0.0 GB in AL East` |
| Team projection delta | `TB +35.0 wins over projected pace through 36 games` |
| Hitter projection delta | `Dane Myers +0.137 xwOBA over projection` |
| Pitcher projection delta | `Antonio Senzatela -3.79 ERA vs projection` |
| Regression candidate | `Trent Grisham: .331 wOBA vs .365 xwOBA, positive-regression small band` |
| Daily story | `NYM is -27.8 wins off projected pace through 36 games` |
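
The regression-candidate sample above compares a surface stat (wOBA) with its underlying indicator (xwOBA). A minimal sketch of that comparison follows, assuming the gap is simply xwOBA minus wOBA and using the 100+ PA hitter threshold listed under Data Artifacts. The function names are hypothetical, and the band logic (for example "small band") is not modeled because its cutoffs are not stated here.

```python
MIN_HITTER_PA = 100  # hitter eligibility threshold from the artifact table


def is_eligible_hitter(plate_appearances: int) -> bool:
    """Hitters need at least 100 PA to be considered."""
    return plate_appearances >= MIN_HITTER_PA


def regression_gap(woba: float, xwoba: float) -> float:
    """Underlying minus surface; positive means results lag contact quality."""
    return xwoba - woba


def regression_direction(woba: float, xwoba: float) -> str:
    """Label the expected direction of regression for a hitter."""
    gap = regression_gap(woba, xwoba)
    if gap > 0:
        return "positive"
    if gap < 0:
        return "negative"
    return "neutral"


# The sample row: .331 wOBA vs .365 xwOBA.
print(regression_direction(0.331, 0.365))
```

For the sample row the gap is about +.034, so the hitter is flagged as a positive-regression candidate, matching the label in the table.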

## Verification

| Check | Result |
| --- | --- |
| Forced live-state scheduler cycle | Passed; all five tasks succeeded |
| Scheduler health check | Live-state tasks passed; unrelated pre-existing stale/failing tasks remain for betting/CLV/projection refreshes |
| Loader smoke test | Passed: standings 30, hitting leaders 177, team deltas 30, hitter deltas 368, pitcher deltas 202, regression 387, stories 25 |
| `/leaderboards` regular-season default | Passed; default is now `regular` on May 7 |
| Route smoke | `/`, `/leaderboards`, `/leaderboards?group=pitching`, `/projections` all returned 200 |
| Tests | With `PYTHONPATH=src;.` set, `python -m unittest discover tests`: 51 tests passed |

Without `PYTHONPATH`, two pre-existing tests that import `pitcher_card_engine` directly still fail under raw `unittest discover`. The project already relies on `PYTHONPATH=src;.` or the scheduler's environment setup for those imports; with that environment, the suite passes.

Pre-existing health-check failures observed on 2026-05-07 include stale CLV, futures, Chadwick/Savant reference pulls, betting board refreshes, and projection/scorecard refreshes. The five Live State M1 tasks all reported fresh successful runs and passed their thresholds.

## How Design Refresh Consumes This

Home v2 / Daily / Analytics can now avoid preseason-only claims by reading:

| Page need | Data source |
| --- | --- |
| Current standings and division gaps | `load_team_records()` / `load_live_standings()` |
| Projection-vs-actual ledes | `load_actual_vs_projected_team_deltas()` |
| Breakouts and slumps | `load_actual_vs_projected_player_deltas("hitter")` and `("pitcher")` |
| Trent Grisham-style luck candidates | `load_regression_candidates("hitter")` |
| Templated bulletin/feed | `load_daily_stories()` |
| Live leaderboards | `load_live_leaders(group)` or existing `/leaderboards` route |

## Carried Debts

- The Statcast current-season pull takes roughly 1-5 minutes depending on cache state. That is acceptable for daily early-morning automation but too slow to run at page-render time.
- MLB Stats API `/v1/standings` requires `leagueId=103,104`; `sportId=1` alone returned zero rows. The client now encodes the working league-ID behavior.
- The leaderboard cache fallback has not yet been exercised against a deliberately stale cache; a later verification pass should test it.
- Story heuristics are deterministic and useful, but still first-pass. They rank current-state gaps; they do not perform manual news judgment or external news enrichment.
- Projection accountability forward logs do not exist yet; current "what went wrong"-style accountability remains betting-first until projection logs are added.
- Sample-size thresholds are conservative but should be revisited after more live data accumulates.
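
The standings note above can be illustrated with a small URL builder. This is a sketch only, not `MLBStatsAPIClient.get_standings()`: the `statsapi.mlb.com` host and the helper name are assumptions, and no request is actually sent. League IDs 103 and 104 are the American and National Leagues.

```python
from typing import Iterable
from urllib.parse import urlencode

# Assumed base URL; the summary only names the /v1/standings path.
STANDINGS_URL = "https://statsapi.mlb.com/api/v1/standings"


def build_standings_url(league_ids: Iterable[int] = (103, 104)) -> str:
    """Build a standings URL that always carries explicit league IDs."""
    params = {"leagueId": ",".join(str(league_id) for league_id in league_ids)}
    # safe="," keeps the comma literal so the query reads leagueId=103,104.
    return f"{STANDINGS_URL}?{urlencode(params, safe=',')}"


print(build_standings_url())
```

Defaulting the league IDs in one place keeps callers from repeating the `sportId=1`-only request that returned zero rows.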

## Explicitly Not Included

- No new ML training.
- No template or visual changes.
- No manual story-by-story fact-checking.
- No changes to K Valinor files or betting-domain strikeout files.
