CVN-N001-EI-S03 — Split / regime reconstruction (Block 3b)¶
Implementation summary for the S41 split-ablation diagnostic and the gated production split.
Full rationale + review history: the plan dossier
(committee plan_review 0854dc31 PASSED, OP Meeting #234) and the
regime-signal decision.
What it answers¶
Is the weak signal (AUC ≈ 0.64, best_iter = 1) driven by how we split train/test (validation
design, hypothesis C-d — the program's leading hypothesis) rather than the model? Two jobs:
- Job A: resolve the S02 reserve (gate). S02 returned SYSTEMATIC_LEAK driven solely by the
  temporal-autocorrelation probe under degraded proxies; the leakage probes were clean. S41
  re-measures the autocorrelation under a real purge + embargo split. A genuine temporal leak → HALT;
  the expected label-overlap autocorrelation neutralised by embargo → clear the reserve.
- Job B: the split ablation. Re-fit under six split families and measure two deltas:
  - Δ_leak = AUC(purged-WF) − AUC(naive). Material negative ⇒ the naive AUC was leak-inflated
    (BASELINE_LEAK_INFLATED).
  - Δ_cd = AUC(regime family) − AUC(purged-WF). Material positive ⇒ validation design is the
    driver (SPLIT_MOVES_VALIDATION).
Verdicts are AUC-gated: a delta counts as material only when ΔAUC ≥ 0.02 and its CI excludes 0.
best_iter corroborates but never vetoes. A power gate separates evidence of absence from absence of
evidence: SPLIT_STABLE is issued only at ≥ 0.80 power, otherwise the verdict is
INCONCLUSIVE_UNDERPOWERED.
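The gating rules above can be sketched as a small pure function. This is a minimal illustration under stated assumptions, not the project's code: `Delta`, `classify_cell`, and the argument names are hypothetical, the order in which the checks are applied is assumed, and applying the 0.02 threshold to the magnitude of a negative Δ_leak is an interpretation. Only the threshold, the CI-excludes-zero rule, the 0.80 power floor, and the verdict names come from the text.

```python
from dataclasses import dataclass

AUC_MATERIAL = 0.02  # materiality threshold on the AUC delta (from the text)
POWER_FLOOR = 0.80   # minimum power required to claim SPLIT_STABLE (from the text)


@dataclass(frozen=True)
class Delta:
    """A delta-AUC point estimate with its confidence interval (hypothetical type)."""
    value: float
    ci_low: float
    ci_high: float

    def material_positive(self) -> bool:
        # Large enough, and the whole CI sits above zero.
        return self.value >= AUC_MATERIAL and self.ci_low > 0.0

    def material_negative(self) -> bool:
        # Assumption: the same threshold applies to the magnitude of a negative delta.
        return self.value <= -AUC_MATERIAL and self.ci_high < 0.0


def classify_cell(delta_leak: Delta, delta_cd: Delta, power: float) -> str:
    """Map the two deltas and the achieved power to a cell verdict."""
    # Assumed precedence: a leak-inflated baseline is reported before anything else.
    if delta_leak.material_negative():
        return "BASELINE_LEAK_INFLATED"
    if delta_cd.material_positive():
        return "SPLIT_MOVES_VALIDATION"
    # No material effect: only call it stable if the test could have seen one.
    if power >= POWER_FLOOR:
        return "SPLIT_STABLE"
    return "INCONCLUSIVE_UNDERPOWERED"
```

The sketch leaves out best_iter, which only corroborates, and INCONCLUSIVE_TOOLING, whose trigger is not described here.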
Architecture (two-layer, Hamilton-native)¶
s41_io.load_cell_inputs (Airflow-owned I/O: run_s22a1 anchor → SHA-drift guard → captured-parquet
load → PG-resolved LGB hp, ADR-90) feeds the Hamilton graph in hamilton/s41_nodes.py:
combined → regime_signals → family_splits → family_auc → {delta_leak, delta_cd, acf_reserve} → cell_verdict
cell_verdict → group_verdict (strict-majority)
The heavy LGB fit is a single family_auc node (one fit per family); the Δ probes derive from it.
Split constructors live in src/training/cv/regime_split.py (pure, leak-guarded). Six families: naive
(Δ_leak reference, unguarded), purged_wf (honest baseline), strict_temporal, market_period,
volatility_bucket, regime_ab, plus the group-level crypto_loo.
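The strict-majority reduction at the group layer can be illustrated as follows. This is a sketch only: `strict_majority` is a hypothetical helper, and how a winning cell verdict maps onto a group-level verdict name is not specified here, so the function simply returns the winning cell verdict or `None`.

```python
from collections import Counter
from typing import Optional, Sequence


def strict_majority(cell_verdicts: Sequence[str]) -> Optional[str]:
    """Return the verdict held by more than half of the cells, else None."""
    if not cell_verdicts:
        return None
    verdict, count = Counter(cell_verdicts).most_common(1)[0]
    # Strict majority: the count must exceed half, so a 50/50 tie yields None.
    return verdict if 2 * count > len(cell_verdicts) else None
```

Returning `None` on a tie keeps the reduction conservative: no group-level claim is made unless most cells agree.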
Decided design points (regime-signal decision)¶
- Option B, unconditional: the regime signal comes from captured features (price_volatility_20 vol,
  distance_SMA_192 trend, trend_strength structure), never raw close. This keeps the instrument a
  pure function of the S07-pinned fold (no external data, no drift, no dual path).
- Multivariate family-6: the regime code is (vol_band × trend_sign × structure_band) with absolute
  pre-registered bands (stable identity across folds), so regime_ab tests a genuine regime, not a
  single seen-axis slice.
- Q-feature label-correlation gate: a regime feature with |Spearman corr(feature, label)| > 0.10 is
  rejected / flagged DIFFICULTY_CONFOUND (else the split confounds difficulty with transfer).
- Leak-safety: every guarded family passes assert_no_leak (no train/test overlap + embargo gap;
  violation → event=s41_embargo_violation severity=error, clean verdict, never a crash).
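The regime code and the label-correlation gate described above can be sketched in plain Python. Everything here is illustrative: the band edges are hypothetical placeholders (the real pre-registered bands are not given in this summary), and `regime_code`, `spearman`, and `is_difficulty_confound` are assumed names. Only the three-axis structure and the 0.10 limit come from the text.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

# Hypothetical absolute cut points; the real pre-registered bands are not given here.
VOL_EDGES = (0.02, 0.05)        # -> vol_band in {0, 1, 2}
STRUCTURE_EDGES = (0.3, 0.6)    # -> structure_band in {0, 1, 2}
LABEL_CORR_LIMIT = 0.10         # from the text


def regime_code(vol: float, trend: float, structure: float) -> Tuple[int, int, int]:
    """Map one row to (vol_band, trend_sign, structure_band) using fixed edges."""
    vol_band = bisect_right(VOL_EDGES, vol)
    trend_sign = (trend > 0) - (trend < 0)  # -1, 0 or +1
    structure_band = bisect_right(STRUCTURE_EDGES, structure)
    return vol_band, trend_sign, structure_band


def _average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # constant input: undefined, treated as 0 in this sketch
    return cov / (vx * vy) ** 0.5


def is_difficulty_confound(feature: Sequence[float], label: Sequence[float]) -> bool:
    """True when the feature tracks the label too closely to serve as a regime axis."""
    return abs(spearman(feature, label)) > LABEL_CORR_LIMIT
```

Because the edges are absolute rather than per-fold quantiles, the same input row maps to the same regime code in every fold, which is the stable-identity property the design calls for.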
Gated production split (CVN_SPLIT_MODE)¶
The purged walk-forward split is implemented in ablation_runner._generate_folds as an
off-by-default, A/B-testable capability (ADR-56). Default (off/unset) is byte-identical to the prior
behaviour (train_end == test_start); CVN_SPLIT_MODE=purged_wf inserts a CVN_SPLIT_EMBARGO_DAYS gap
(ftf_config, ADR-59). It is not promoted to the production default in S03 (ADR-2 — awaits the S03 + S04
verdicts); promotion would be a separate Story with its own staged rollout.
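The off-by-default behaviour can be pictured with a short sketch. It is not the code in ablation_runner._generate_folds: `generate_folds` and `has_embargo_gap` are hypothetical, reading both settings from a plain mapping is an assumption (the text places the embargo length in ftf_config), and so is the choice to pull train_end back rather than push test_start forward.

```python
from datetime import date, timedelta
from typing import List, Mapping, Sequence, Tuple

Fold = Tuple[date, date]  # (train_end, test_start)


def generate_folds(test_starts: Sequence[date], settings: Mapping[str, str]) -> List[Fold]:
    """Build (train_end, test_start) pairs, adding an embargo only when gated on."""
    mode = settings.get("CVN_SPLIT_MODE", "")
    if mode == "purged_wf":
        embargo = timedelta(days=int(settings.get("CVN_SPLIT_EMBARGO_DAYS", "0")))
    else:
        # Off or unset: prior behaviour, train_end == test_start.
        embargo = timedelta(0)
    # Assumption: the gap is created by ending training earlier.
    return [(start - embargo, start) for start in test_starts]


def has_embargo_gap(fold: Fold, embargo_days: int) -> bool:
    """Check that training ends at least embargo_days before testing starts."""
    train_end, test_start = fold
    return (test_start - train_end).days >= embargo_days
```

For example, `generate_folds(starts, os.environ)` would pick up the mode from the process environment, while passing an empty mapping reproduces the default.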
Scope¶
Every verdict is specific to the defi_top5 control group — it informs the program but is not a
program-wide claim. A positive S03 (SPLIT_MOVES_VALIDATION) triggers a generalisation Story, not a
direct program-wide split ship; BASELINE_LEAK_INFLATED escalates to a program-level leak investigation.
Operator surface¶
- DAG: diagnostic__s41 (two-layer, crypto_group=defi_top5 fan-out, reuses the S07 pinned fold via
  the run_s22a1 anchor + SHA-drift guard; schedule=None, paused).
- Verdict catalogue: SPLIT_MOVES_VALIDATION / SPLIT_STABLE / BASELINE_LEAK_INFLATED /
  INCONCLUSIVE_UNDERPOWERED / INCONCLUSIVE_TOOLING; group: SYSTEMATIC_VALIDATION_DESIGN /
  PER_ASSET_SPLIT_EFFECT.
- Observability: event=s41_* (Loki). See the MLOps readiness.