
Plan dossier — Track 1 : BTC cross-asset features

Date : 2026-04-30
Story : CVN-N001-EE-S04 (OP wp#43)
GH issue : #715
Author : Dominique (operator) + Claude
Session type : plan_review (per ADR-68)
Severity : P2, quick-win bundle Track 4, tier 1 data lever (different lever from the Track 5/6 ABANDONED loss-function attempts and the Track 9 In testing calibration tier).
Sequencing : per F1_BUY_BOOST_PLAN.md §6 Phase 1, Track 1 is the next pick after Track 9 enters In testing.
Cross-track lesson : training signal manipulation (Track 5+6) is not the productive lever ; Track 1 + Track 12 are the data-tier alternatives.

1. Context — why now, why this lever

Tracks 5 (label smoothing + cleanlab, both branches ABANDONED) and 6 (focal loss, all 4 variants ABANDONED) showed that training signal manipulation does not help at the current dataset / labelling regime. Track 9 (per-regime threshold) is in testing — calibration-tier lever, distinct from training signal. Track 1 is the first data-tier lever : it expands the input space rather than tweaking the loss / labels / threshold. The hypothesis is independent of whether Track 9 LOCKs.

CVNTrade currently trades altcoins on 15m candles using own-asset features only (~300 enriched columns from OHLCV + technical indicators per cvntrade_enrich.py). The model is blind to the BTC macro state — yet altcoins are highly BTC-correlated. The F1 plan §5 Track 1 hypothesises that adding a small set of BTC cross-asset features lifts f1_buy materially because the model gains visibility on the dominant regime driver.

2. Hypothesis (falsifiable)

Adding 6 BTC cross-asset features lifts f1_buy materially over the BTC-blind baseline at the current dataset / model regime. Specifically :

  • H0 (null) : mean(f1_buy | btc_features=enabled) - mean(f1_buy | disabled) is indistinguishable from 0 (CI95 includes 0) → ABANDON.
  • H1 (alternative) : Δf1_buy ≥ +0.020 (Story-specific bar — higher than the +0.015 baseline because Track 1 has the largest expected lift per F1 plan §5 : +0.03 to +0.06) with 95 % bootstrap CI excluding 0, AND ≥ 4/5 cryptos individually improve, AND Cohen's d ≥ 0.3.

The hypothesis is falsifiable per the same gate criteria as Tracks 5 / 6 / 9, with a tightened f1_buy bar (+0.020 instead of the standard +0.015) reflecting the higher expected effect size.
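The H1 criteria above can be checked mechanically. The following is a minimal sketch, not the FTF implementation: it assumes per-fold paired deltas `f1_buy(enabled) - f1_buy(disabled)` are already available per crypto, uses a percentile bootstrap, and takes Cohen's d in its paired form (mean delta over standard deviation of deltas). The function names and the input layout are hypothetical.

```python
import random
import statistics


def bootstrap_ci95(deltas, n_resamples=2000, seed=0):
    """Percentile bootstrap CI95 for the mean of paired f1_buy deltas."""
    rng = random.Random(seed)
    n = len(deltas)
    means = sorted(
        statistics.fmean(rng.choices(deltas, k=n)) for _ in range(n_resamples)
    )
    lo = means[int(0.025 * n_resamples)]
    hi = means[int(0.975 * n_resamples) - 1]
    return lo, hi


def cohens_d_paired(deltas):
    """Assumed paired form of Cohen's d: mean delta / std of the deltas."""
    return statistics.fmean(deltas) / statistics.stdev(deltas)


def h1_holds(deltas_by_crypto, min_lift=0.020, min_d=0.3, min_improved=4):
    """deltas_by_crypto: {symbol: [per-fold f1_buy delta, ...]} (hypothetical layout)."""
    all_deltas = [d for ds in deltas_by_crypto.values() for d in ds]
    lo, _hi = bootstrap_ci95(all_deltas)
    improved = sum(statistics.fmean(ds) > 0 for ds in deltas_by_crypto.values())
    return (
        statistics.fmean(all_deltas) >= min_lift  # Story-specific +0.020 bar
        and lo > 0                                # CI95 excludes 0 on the positive side
        and improved >= min_improved              # >= 4/5 cryptos improve
        and cohens_d_paired(all_deltas) >= min_d  # effect size bar
    )
```

A sweep where the deltas centre on 0 fails the first condition and lands on the H0 / ABANDON branch, regardless of the other three checks.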

3. Variant matrix

6 variants : the 5 unique configs per FTF factor (including baseline) of the F1 plan §4.2 convention, plus the btc_full_purge10 sensitivity variant added per CR pass 1 reco #11 :

| Variant | What it does | Features added | Notes |
|---|---|---|---|
| none (baseline) | Existing FE pipeline, BTC-blind | 0 | Reference |
| btc_min | Minimal BTC features, directionality only | btc_return_1h, btc_return_4h, btc_return_24h | Cheapest variant ; tests "BTC direction is enough" |
| btc_full | Full F1-plan feature set | btc_return_1h, btc_return_4h, btc_return_24h, btc_realized_vol_24h, btc_z_score_close, btc_correlation_15m_lag5 | Canonical Track 1 variant per F1 plan §5 |
| btc_full_purge0 | Same as btc_full but no purging (purge_bars=0) | Same 6 | Pre-registered "leakage-permitted" sanity check : proves purging is doing real work (CR rec #2). Informational only, NOT a candidate for lock |
| btc_full_purge10 | Same as btc_full with purge_bars=10 (sensitivity) | Same 6 | Empirically justifies purge_bars=20 default (CR pass 1 reco #11). Could become the locked variant if it dominates. |
| btc_vol_only | Volatility-only features (no direction) | btc_realized_vol_24h, btc_z_score_close, btc_correlation_15m_lag5 | Tests "regime detection without direction" |

6 variants (the 5 original configs + 1 purge_bars sensitivity variant). Aggregation across folds : standard FTF protocol, i.e. bootstrap CI95 + Cohen's d + BH-corrected p-values per F1 plan §7.

Why the purge0 variant is a sanity check, not a candidate : if btc_full_purge0 outperforms btc_full on f1_buy, that is evidence of leakage (without the purge, the model exploits BTC information it could not have at prediction time in production). If it underperforms or matches, no leakage is detected and the locked variant is btc_full. The variant ships explicitly in the FTF matrix so that the leakage check is part of the gate review, not an afterthought.

4. Implementation path

4.1 Cross-asset data path (enrichment_api.py, cvntrade_enrich.py)

The current enrichment pipeline (CVNTrade_Enrich.process(df: pd.DataFrame, feature_name: str, mode: str = "train") -> pd.DataFrame) is single-asset only. Track 1 extends it to accept an optional cross-asset reference :

  1. Extend EnrichmentConfig (in src/commun/pipeline/contracts.py) with the BTC configuration fields : btc_features_enabled: bool = False, btc_features_set: Literal["min", "full", "vol_only"] = "full", btc_purge_bars: int = 20, btc_embargo_bars: int = 10. The BTC OHLCV DataFrame itself is NOT an EnrichmentConfig field : it is runtime data, passed as a separate parameter to the enrichment functions (configuration vs. data separation).
  2. Extend CVNTrade_Enrich.process() signature to accept an optional btc_ohlcv: Optional[pd.DataFrame] kwarg (default None). When btc_ohlcv is None AND btc_features_enabled is False, behaviour is bit-identical to the current code (regression bar). When btc_features_enabled is True, btc_ohlcv MUST be passed (ADR-25 fail-fast).
  3. New module commun/pipeline/btc_features.py with compute_btc_features(target_ohlcv, btc_ohlcv, feature_set, purge_bars) invoked from the enrichment pipeline ONLY when btc_features_enabled=True. Otherwise no BTC columns appear in the output.
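The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the contents of contracts.py: the real EnrichmentConfig carries other fields that are omitted here, and `check_btc_inputs` is a hypothetical helper standing in for the fail-fast check inside `process()`.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass(frozen=True)
class EnrichmentConfig:
    # Only the four BTC fields named in the plan; existing fields are omitted.
    btc_features_enabled: bool = False
    btc_features_set: Literal["min", "full", "vol_only"] = "full"
    btc_purge_bars: int = 20
    btc_embargo_bars: int = 10


def check_btc_inputs(config: EnrichmentConfig, btc_ohlcv: Optional[object]) -> bool:
    """Return True when BTC columns must be computed; raise on a missing frame.

    Hypothetical helper: btc_ohlcv is runtime data, kept out of the config.
    """
    if config.btc_features_enabled and btc_ohlcv is None:
        raise RuntimeError(
            "btc_features_enabled=True but btc_ohlcv=None (ADR-25 fail-fast)"
        )
    return config.btc_features_enabled
```

With the defaults, `check_btc_inputs(EnrichmentConfig(), None)` returns False and the BTC branch is never entered, which is the regression bar for pre-Track-1 behaviour.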

4.1bis Feature contract pinning via MLflow artefact (CR pass 1 BLOCKER #2 resolution)

⚠️ Status note (CR pass 4) : the persistence + inference loader of enrichment_config.json is OUT-OF-SCOPE for this PR — this section describes the target architecture that the follow-up PR wires. PR #792 ships the dataclass extension + feature computation + FTF factor + tests + docs ; the autotrainer write + InferenceAPI.from_mlflow(run_id) read land in the follow-up. Until follow-up merges, models trained with btc_features_enabled=True MUST NOT be deployed to inference (the InferenceAPI doesn't yet know how to read the pinned config). PR description's "Out of scope" + mlops_readiness.md §7 + EnrichmentConfig docstring .. note:: cover this constraint.

The CVN_BTC_FEATURES_* env vars are a TRAINING-TIME signal only. Per ADR-23 (features version-pinned, fail-fast), the inference path must NOT read these env vars : the contract travels with the model artefact (when the follow-up wires the persistence path) :

  1. At training time (follow-up PR), the autotrainer reads the env vars to decide which EnrichmentConfig to use. The trained model's MLflow artefacts include a new enrichment_config.json capturing :
      • btc_features_enabled: bool
      • btc_features_set: str
      • btc_purge_bars: int
      • btc_embargo_bars: int
      • feature_names_with_btc: list[str] (pinned ordered list of BTC columns produced by this model)
  2. At inference time (follow-up PR), InferenceAPI loads enrichment_config.json from the model's MLflow artefacts (alongside the existing feature_names). The EnrichmentConfig for the inference call is derived from the model's pinned config, not from the runtime env. If the env disagrees with the model's pinned config, fail fast per ADR-25 with an explicit error listing the mismatch (no silent imputation).
  3. feature_names validation (follow-up PR) : at inference, the input DataFrame's columns are compared to the model's pinned feature_names. Missing columns OR extra columns → RuntimeError. This is the regression bar that catches the case where someone deploys a BTC-enabled model under a BTC-blind enrichment pipeline (or vice versa).

This mirrors the pattern already used for ThresholdCalibrator and PerRegimeThresholdCalibrator (the calibrator artefact pins its regime_detector_version ; loading under a different version raises RuntimeError per ADR-25).
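A minimal sketch of the pinning contract described in this section, using plain JSON and no MLflow calls. All three function names are hypothetical; the follow-up PR decides the actual write and load paths. The point is only that a mismatch raises instead of being imputed.

```python
import json


def pin_enrichment_config(config: dict, btc_columns: list) -> str:
    """Serialise the training-time config plus the ordered BTC column list."""
    payload = dict(config)
    payload["feature_names_with_btc"] = list(btc_columns)
    return json.dumps(payload, sort_keys=True)


def check_env_against_pin(pinned_json: str, env_config: dict) -> dict:
    """Return the pinned config; raise if the runtime env disagrees with it."""
    pinned = json.loads(pinned_json)
    mismatches = {
        key: {"pinned": pinned[key], "env": value}
        for key, value in env_config.items()
        if key in pinned and pinned[key] != value
    }
    if mismatches:
        # Explicit error listing the mismatch, no silent fallback.
        raise RuntimeError(f"enrichment_config mismatch: {mismatches}")
    return pinned


def validate_feature_names(pinned_names: list, actual_columns: list) -> None:
    """Missing OR extra columns both violate the pinned contract."""
    missing = [c for c in pinned_names if c not in actual_columns]
    extra = [c for c in actual_columns if c not in pinned_names]
    if missing or extra:
        raise RuntimeError(f"feature_names mismatch: missing={missing} extra={extra}")
```

Because the inference config is derived from the returned pinned dict, an env var left over from a training shell can only trigger an error, never change which features are computed.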

4.2 BTC OHLCV loading (cvntrade_etl_pipeline.py)

The existing ETL pipeline already loads BTC OHLCV via _fetch_binance_data("BTCUSDT", mode=..., timeframe=...). For training, BTC and target-asset OHLCV are fetched on the same window + same timeframe, then both passed to enrichment. For paper/live, the streaming kernel pre-fetches a rolling BTC window and passes it on every candle.

Loading happens at the ETL orchestration layer — the enrichment layer never reaches out for BTC data itself (per ADR-25 fail-fast : if btc_features_enabled=True and btc_ohlcv=None, raise RuntimeError).

4.3 ADR-14 purging invariant (committee F1 plan v2 rec #2)

The hard contract per ADR-14 : at training time t, BTC features may use only data ≤ t - purge_bars. This prevents look-ahead leakage from BTC's future state into the altcoin's training labels.

Implementation : _ajouter_features_btc computes the 6 features on the BTC OHLCV, then shifts every BTC feature column by purge_bars bars before joining to the target altcoin's index. The shift is the formal proof of the invariant ; a regression test asserts btc_return_1h.iloc[i] at time t_i was computed from BTC data ≤ t_i - purge_bars × bar_duration (15 min × 20 = 5 hours for default).

Defaults : purge_bars=20 and embargo_bars=10 (per F1 plan §5 Track 1). The btc_full_purge0 variant overrides purge_bars=0 for the leakage-detection sanity check.

4.4 The 6 BTC features — exact definitions

| Feature | Definition | Window |
|---|---|---|
| btc_return_1h | pct_change(4) (4 bars on 15m = 1 h) | 4 bars |
| btc_return_4h | pct_change(16) | 16 bars |
| btc_return_24h | pct_change(96) | 96 bars |
| btc_realized_vol_24h | pct_change(1).rolling(96).std() (raw 15m-return standard deviation over 24 h ; un-annualised for direct comparability with the existing FE pipeline's per-bar vol features) | 96 bars |
| btc_z_score_close | (close - rolling_mean(96)) / rolling_std(96) | 96 bars |
| btc_correlation_15m_lag5 | target.pct_change(1).rolling(96).corr(BTC.pct_change(1).shift(5)) | 96 bars + 5-bar lag |

The lagged correlation feature is computed on the target altcoin × BTC pair (not on BTC alone). It uses the target's own OHLCV that's already in scope.

After computation, every column is .shift(purge_bars) before being joined to the target index — the output column at row i has only BTC information from row i - purge_bars and earlier.
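The definitions above translate almost directly into pandas. The sketch below assumes the target and BTC close series share the same 15m index and takes close prices only; `btc_features_sketch` is a hypothetical name, not the `_ajouter_features_btc` or `compute_btc_features` implementation, and it ignores the feature_set switch.

```python
import pandas as pd


def btc_features_sketch(
    target_close: pd.Series, btc_close: pd.Series, purge_bars: int = 20
) -> pd.DataFrame:
    """Six BTC features on a shared 15m index, shifted by purge_bars rows."""
    btc_ret = btc_close.pct_change(1)
    feats = pd.DataFrame(index=btc_close.index)
    feats["btc_return_1h"] = btc_close.pct_change(4)    # 4 bars = 1 h
    feats["btc_return_4h"] = btc_close.pct_change(16)   # 16 bars = 4 h
    feats["btc_return_24h"] = btc_close.pct_change(96)  # 96 bars = 24 h
    feats["btc_realized_vol_24h"] = btc_ret.rolling(96).std()  # un-annualised
    feats["btc_z_score_close"] = (
        btc_close - btc_close.rolling(96).mean()
    ) / btc_close.rolling(96).std()
    # BTC returns are lagged by 5 bars BEFORE the 96-bar rolling correlation.
    feats["btc_correlation_15m_lag5"] = (
        target_close.pct_change(1).rolling(96).corr(btc_ret.shift(5))
    )
    # Row i now only carries BTC information from row i - purge_bars and earlier.
    return feats.shift(purge_bars).reindex(target_close.index)
```

Applying the shift once, on the whole frame, keeps the purge in a single place, which is what makes the shift-invariant unit test a meaningful check.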

4.5 FTF factor + guardrail

Add factor=btc_features to src/commun/finetune/ablation_matrix.py under DATA_FACTORS (per ADR-56) with the 6 variants. Each variant sets the CVN_BTC_* env vars as follows :

| Variant | env vars |
|---|---|
| none | CVN_BTC_FEATURES_ENABLED=0 |
| btc_min | CVN_BTC_FEATURES_ENABLED=1, CVN_BTC_FEATURES_SET=min, CVN_BTC_PURGE_BARS=20, CVN_BTC_EMBARGO_BARS=10 |
| btc_full | CVN_BTC_FEATURES_ENABLED=1, CVN_BTC_FEATURES_SET=full, CVN_BTC_PURGE_BARS=20, CVN_BTC_EMBARGO_BARS=10 |
| btc_full_purge0 | CVN_BTC_FEATURES_ENABLED=1, CVN_BTC_FEATURES_SET=full, CVN_BTC_PURGE_BARS=0, CVN_BTC_EMBARGO_BARS=0 |
| btc_full_purge10 | CVN_BTC_FEATURES_ENABLED=1, CVN_BTC_FEATURES_SET=full, CVN_BTC_PURGE_BARS=10, CVN_BTC_EMBARGO_BARS=5 |
| btc_vol_only | CVN_BTC_FEATURES_ENABLED=1, CVN_BTC_FEATURES_SET=vol_only, CVN_BTC_PURGE_BARS=20, CVN_BTC_EMBARGO_BARS=10 |

Guardrail in src/commun/finetune/guardrails.py (per ADR-58) — _validate_btc_features :

  • CVN_BTC_FEATURES_SET ∈ {min, full, vol_only} : reject other values.
  • CVN_BTC_FEATURES_ENABLED=1 ⇒ the ETL pipeline MUST pass btc_ohlcv to enrichment. Fail-fast at training entry-point if the BTC dataframe is missing.
  • CVN_BTC_FEATURES_ENABLED=0 with CVN_BTC_FEATURES_SET set ⇒ orphaned override, reject (typical copy-paste leak).
  • CVN_BTC_PURGE_BARS and CVN_BTC_EMBARGO_BARS ∈ [0, 200] : 0 is the sanity-check variant, anything > 200 is most likely a typo (15 min × 200 = 50 hours, far more than the model's horizon). The BTC_ prefix avoids collision with CVN_PURGE_BARS used by the global purged k-fold infrastructure (src/training/cv/purged_kfold.py).
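The env-var checks listed above could look like the sketch below. It is not the guardrails.py code: the function takes a plain dict instead of reading `os.environ`, the exception type (ValueError) is an assumption, and the strict "0"/"1" check on CVN_BTC_FEATURES_ENABLED is an addition not stated in the plan. The btc_ohlcv presence check is left to the training entry-point.

```python
ALLOWED_SETS = {"min", "full", "vol_only"}
BAR_VARS = ("CVN_BTC_PURGE_BARS", "CVN_BTC_EMBARGO_BARS")


def validate_btc_features(env: dict) -> None:
    """Raise ValueError on any inconsistent CVN_BTC_* combination (sketch)."""
    enabled_raw = env.get("CVN_BTC_FEATURES_ENABLED", "0")
    if enabled_raw not in {"0", "1"}:  # assumption: only 0/1 are accepted
        raise ValueError(f"CVN_BTC_FEATURES_ENABLED must be 0 or 1, got {enabled_raw!r}")
    enabled = enabled_raw == "1"

    feature_set = env.get("CVN_BTC_FEATURES_SET")
    if not enabled and feature_set is not None:
        # Orphaned override: a set is configured but the factor is disabled.
        raise ValueError("CVN_BTC_FEATURES_SET set while CVN_BTC_FEATURES_ENABLED=0")
    if feature_set is not None and feature_set not in ALLOWED_SETS:
        raise ValueError(f"CVN_BTC_FEATURES_SET must be in {sorted(ALLOWED_SETS)}")

    for name in BAR_VARS:
        raw = env.get(name)
        if raw is None:
            continue
        value = int(raw)  # non-integer input also raises ValueError
        if not 0 <= value <= 200:
            raise ValueError(f"{name}={value} outside [0, 200]")
```

Passing the mapping explicitly keeps the function pure, so each rejected combination can be covered by a one-line unit test case.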

4.6 Tests

  • tests/unit/test_enrich_btc_features.py : unit tests for _ajouter_features_btc :
      • happy path : 6 features computed correctly on synthetic OHLCV pair
      • shape : output has same row count as input target
      • shift invariant : assert btc_return_1h at row i equals btc_pct_change(4) at row i - purge_bars (formal proof of ADR-14)
      • missing BTC ohlcv with btc_features_enabled=True raises RuntimeError (ADR-25)
      • btc_features_enabled=False produces zero BTC columns (regression bar : pre-Track-1 behaviour bit-identical)
  • tests/integration/test_track1_btc_features.py : 6-variant FTF matrix end-to-end on a small synthetic dataset, asserts per-variant determinism + correct env var routing
  • tests/unit/test_ftf_guardrails.py : extend with the new env var validation (5+ test cases per _validate_btc_features checks)

4.7 Observability + MLOps readiness

  • New event event=btc_features_applied feature_set=... purge_bars=... n_features=... indexed in Loki (per ADR-32) — emitted once per enrichment run.
  • Grafana panel "BTC features purge invariant" : checks the lag between btc_return_1h non-null first row and the target's first row (must equal purge_bars). Sanity-check on the running pipeline.
  • MLOps readiness file documentation/stories/CVN-N001-EE-S04/mlops_readiness.md filled per ADR-70 before merge.
  • New runbook documentation/runbooks/runbook_btc_features_drift.md (P2) added per committee CR pass 2 reco v2.5 — covers KS-test alerts, BTC-altcoin correlation drift, BTC OHLCV quality alerts, enrichment_config_mismatch (P1 fail-fast), pre-LOCK rollback dry-run failure handling.
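For the btc_features_applied event, a minimal sketch of the key=value line format follows. `format_event` is a hypothetical helper, and the exact field order and logger wiring used by the project under ADR-32 are assumptions.

```python
def format_event(event: str, **fields) -> str:
    """Render an event as a single 'event=... key=value ...' log line."""
    parts = [f"event={event}"] + [f"{key}={value}" for key, value in fields.items()]
    return " ".join(parts)


line = format_event(
    "btc_features_applied", feature_set="full", purge_bars=20, n_features=6
)
# line == "event=btc_features_applied feature_set=full purge_bars=20 n_features=6"
```

Emitting the line once per enrichment run keeps the Loki cardinality bounded by the number of runs rather than the number of candles.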

5. Acceptance gate (per F1 plan §6)

The 6 official gates apply, with one tightening :

| Gate | Threshold |
|---|---|
| F1_buy lift | mean Δf1_buy ≥ +0.020 with 95 % bootstrap CI excluding 0 (Story-specific, tightened from the standard +0.015) |
| Joint metric | Δexpectancy ≥ 0 AND Δsortino ≥ 0 AND Δmax_drawdown ≤ +1 % |
| Stability | per-fold variance of f1_buy ≤ 0.05 |
| Per-asset | f1_buy improves on ≥ 4/5 cryptos |
| Sample size | ≥ 50 BUY trades / fold |
| MLOps | documentation/stories/CVN-N001-EE-S04/mlops_readiness.md complete |

Mandatory leakage check (committee F1 plan v2 rec #2 + CR pass 1 reco #9) — replace the arbitrary +0.005 threshold with a paired t-test (BH-corrected across 5 cryptos × 5 folds = 25 paired observations) on f1_buy(btc_full_purge0) - f1_buy(btc_full). If the paired difference is significantly positive (BH-corrected p < 0.05) → leakage suspected → ABANDON Track 1 pending root-cause investigation. The statistical bar replaces the arbitrary effect-size threshold with a pre-registered hypothesis test, immune to noise floor calibration. This is a hard gate independent of the other 6.
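The building blocks of this leakage gate can be sketched with the standard library only. Two assumptions are made loudly here: the paired statistic is the usual t = mean(d) / (sd(d) / √n) on the deltas d = f1_buy(btc_full_purge0) - f1_buy(btc_full), and the BH correction is applied to a family of p-values (for example one one-sided paired test per crypto), which is one possible reading of the plan. Converting t to a p-value needs a Student t distribution and is left out.

```python
import math
import statistics


def paired_t_statistic(deltas):
    """t statistic of a paired t-test on per-observation deltas (n - 1 dof)."""
    n = len(deltas)
    return statistics.fmean(deltas) / (statistics.stdev(deltas) / math.sqrt(n))


def benjamini_hochberg(p_values):
    """BH-adjusted p-values, returned in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def leakage_suspected(p_values, alpha=0.05):
    """True if any BH-adjusted one-sided p-value falls below alpha."""
    return any(p < alpha for p in benjamini_hochberg(p_values))
```

For instance, `benjamini_hochberg([0.01, 0.04, 0.03, 0.20])` adjusts the smallest p-value to 0.04, so `leakage_suspected` returns True for that family at alpha = 0.05.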

If every gate clears → operator decision lock (Console flip the chosen variant in ftf_config.base_env, ADR-59). If any gate fails on every variant → abandon. If a variant clears AT MOST one gate beyond the F1_buy gate → keep available.

6. Out of scope

  • Order book microstructure features (F1 plan Track 2) — separate track, deferred to big-bet bundle, gated on Track 1 + Track 9 outcome.
  • Other cross-asset references (ETH, SOL, BTC dominance, total market cap) — Track 1 is BTC-only by design ; if it LOCKs, expanding to ETH could be a follow-up Story under the same Epic.
  • Adaptive purge_bars (per-regime or volatility-adaptive purging) — premature ; constant purge_bars=20 per ADR-14 standard is the contract.
  • Online BTC feature computation in paper/live — the streaming kernel needs a rolling BTC window. v1 ships as backtest-only ; paper/live integration is the natural next sprint if Track 1 LOCKs (separate Story under CVN-N001-EE).
  • Cross-asset in inference cache — cache key extension to include BTC OHLCV hash. v1 disables cache when btc_features_enabled=True (fail-safe ; Track 12 will revisit).

7. Falsifiability + rollback

  • Falsifiability : the gate criteria above (especially the +0.020 f1_buy bar with CI95 excluding 0 + per-asset 4/5) are pre-registered. If the FTF sweep produces Δf1 ∈ [-0.01, +0.015] with CI95 including 0, that's the H0 outcome — ABANDON cleanly. If Δf1 ≥ +0.015 but < +0.020, that's "encouraging but doesn't meet bar" — keep available (could be combined with a future Track for joint lift).
  • Rollback (CR pass 1 BLOCKER #1 resolution : model-switching, not env-flag flipping) : if Track 1 LOCKs and a production regression appears, the rollback path is switching the deployed model artefact to a baseline-trained model (i.e. trained without BTC features, with btc_features_enabled=False pinned in its enrichment_config.json).
      • The MLOps promotion workflow already handles model artefact swaps per ADR-15 + ADR-42 (atomic per-crypto promotion). The operator promotes the previous BTC-blind champion via the standard Console flow on mlflow_promotion, NOT via CVN_BTC_FEATURES_ENABLED=0.
      • The runtime env var CVN_BTC_FEATURES_ENABLED is training-time only ; flipping it on a deployed BTC-enabled model would cause a feature_names shape mismatch at inference (caught by the §4.1bis ADR-23 contract, raises RuntimeError).
      • Mandatory pre-LOCK artefact : every Track-1 LOCK must keep the previous BTC-blind champion as a deployable rollback target in MLflow Registry (tagged champion_btc_blind). The promotion script enforces this : no Track-1 model becomes champion without a fallback model registered.
      • Hot-fix path for code bugs : standard PR, retrain the model with the fix, atomic promotion. No runtime env-flag toggle.
      • Why this is safer than env-flag rollback : env-flag flipping at inference would either dimension-mismatch the model (caught) or silently impute zero (not caught ; this is the ADR-23 violation flagged by committee). Model-switching keeps the feature contract intact end-to-end.

8. Risks

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| BTC OHLCV gaps (Binance feed outages) propagate to altcoin training | medium | medium | The pd.merge left-join on target index handles missing BTC bars by emitting NaN ; the existing FE pipeline drops rows with > X% NaN. Add a Loki alert if BTC NaN rate > 5% over a fold |
| Look-ahead leakage despite the purge | low | high | Mandatory btc_full_purge0 variant in the FTF matrix surfaces leakage as a gate violation. Plus the unit test that formally verifies the shift invariant |
| Cross-asset features cause MLflow artefact bloat | low | low | The 6 BTC columns are negligible vs the 300+ existing FE columns. Drop one with --cov-report if it ever matters |
| Concept drift : BTC's relationship with altcoins changes (e.g., post-halving regime shift) | medium | high | Track-level review at every quarterly model retrain. If correlations drift > 3σ, file an issue to revisit Track 1 ; quarterly cadence per #709 MLOps readiness template §3 |
| Purging too aggressive eats predictive signal | low | medium | The btc_min (3 features × purge_bars=20) variant tests a thinner feature set ; if it dominates btc_full, suggests over-engineering not over-purging |
| Operator forgets to load BTC OHLCV in the streaming kernel | medium | high | ADR-25 fail-fast at construction time : btc_features_enabled=True + btc_ohlcv=None raises with explicit error |
| BTC OHLCV cost (Binance API quota) blows up the training pipeline budget | low | low | BTC is fetched once per training run (already cached at the ETL level) and added as a join ; minimal overhead |

9. Why this is not the next loss-function attempt

Per the cross-track lesson recorded in F1_BUY_BOOST_PLAN.md §6 Outcomes : Tracks 5 + 6 closed ABANDONED on training signal manipulation (label engineering + loss function). Track 1 is :

  • Data-tier (tier 1 of the F1 plan, distinct from tier 2 LABEL ENGINEERING + tier 3 LOSS FUNCTION + tier 5 CALIBRATION).
  • Input-space expansion (the model sees more, not different signal).
  • Pre-training (operates on the model's input, not its training signal nor decision rule).

If Track 1 also abandons, the lesson generalizes more strongly across all tiers and the next pick should pivot to Track 12 (fractional differentiation + feature interactions) per the F1 plan §6 implication block. If Track 1 locks, data-tier becomes the productive lever and Track 12 is naturally aligned.

10. Cross-references

  • F1 plan §5 Track 1 + §6 sequencing
  • ADRs : ADR-14 (purging+embargo standard), ADR-25 (no silent fallback), ADR-32 (event=key=value structured logs), ADR-56 (every change FTF-testable), ADR-58 (every factor → guardrail + integration test), ADR-70 (MLOps readiness mandatory)
  • Existing infra :
      • src/ETL/cvntrade_enrich.py:59 (entry-point : CVNTrade_Enrich.process)
      • src/commun/pipeline/enrichment_api.py:46 (modern wrapper : EnrichmentAPI)
      • src/commun/pipeline/contracts.py:20 (extends EnrichmentConfig with the 4 BTC config fields ; BTC OHLCV is passed as a separate parameter to enrichment, not stored on the config)
      • src/training/cv/purged_kfold.py:41-46 (canonical purge_bars pattern, env var CVN_PURGE_BARS)
      • src/ETL/cvntrade_etl_pipeline.py:365 (BTC OHLCV loader : _fetch_binance_data("BTCUSDT", ...))
      • src/commun/finetune/ablation_matrix.py:89 (DATA_FACTORS list ; will register btc_features factor here)
      • tests/unit/test_enrichment_service.py:29 (SAMPLE_OHLCV fixture)
  • Sister Tracks : Track 5 results (ABANDON), Track 6 results (ABANDON), Track 9 results (pending FTF sweep verdict)
  • Production filter chain : architecture/FILTER_FUNNEL.md — Track 1 sits at the FE step (pre-inference)

11. Committee plan_review v1 triage (session 62d756a9, 2026-04-30)

v1 verdict : REJECTED / EXECUTION_RISK — split consensus across 5 experts (architect 7.5, ml-engineer 7.5, ops 7.0, data-scientist 8.0, crypto-trader 7.5 — avg 7.5). 2 blockers + 11 recommendations.

Reason cited : "The proposed rollback mechanism is fundamentally flawed and violates ADR-23, posing a severe silent degradation risk due to feature contract mismatch between trained models and the inference pipeline."

11.1 Blockers (architectural — required pre-impl)

| # | Blocker | Source | Resolution |
|---|---|---|---|
| 1 | Rollback via env-flag (CVN_BTC_FEATURES_ENABLED=0) violates ADR-23 : flipping at inference creates feature dimension mismatch or silent imputation against a model trained with BTC features | expert-ops + expert-architect | §7 rewritten : rollback is now model-artefact switching via the existing MLOps promotion workflow (ADR-15 + ADR-42). The env var is downgraded to training-time only. Mandatory : every Track-1 LOCK keeps the previous BTC-blind champion as a registered fallback. |
| 2 | Global env vars create a brittle feature contract : runtime misconfig → wrong-shape input or runtime errors | expert-architect | §4.1bis added : feature contract is pinned in MLflow artefact metadata (enrichment_config.json + feature_names_with_btc). At inference, InferenceAPI derives EnrichmentConfig from the model's pinned config, NOT from runtime env. ADR-25 fail-fast on env↔artefact mismatch. Mirrors the existing regime_detector_version pinning pattern. |

11.2 Recommendations integrated pre-impl (locked into the plan)

| # | Recommendation | Source | Integration |
|---|---|---|---|
| 1 | Revise rollback to model-switching | expert-ops + expert-architect | §7 rewritten (also blocker resolution) |
| 2 | Strengthen feature contract via MLflow metadata | expert-architect | §4.1bis added (also blocker resolution) |
| 4 | Plan deployment_review for live | expert-ops + expert-crypto-trader | §6 amended : explicit acknowledgement that paper/live integration requires a separate deployment_review session before live promotion. v1 ships as backtest-only. |
| 5 | Document BTC OHLCV provenance + quality monitoring | expert-ml-eng + expert-architect + expert-crypto-trader | §4.2 amended + new §4.2bis : BTC source = Binance via existing _fetch_binance_data("BTCUSDT", ...) ; document known limitations (Binance only, no exchange aggregation, post-2017 data, no listings prior to BTC-USDT pair availability). New monitoring : outlier detection on returns (> 5σ flagged) + volume anomalies (drop > 80 % vs 30d median) + wick-to-body ratios (> 5 flagged), emitted as Loki events btc_ohlcv_quality_alert reason=.... |
| 6 | Verify backtest cost model realism | expert-crypto-trader | §5 amended : explicit confirmation that expectancy_net and sortino gates use the F1 plan §4 cost formula : gross_pnl - taker_fee - spread - slippage - funding (round-trip ≈ 45 bps interim per the v3 cost assumption). Pinned in the FTF results dossier template. Updated cost model from Track 2 dynamic slippage will retroactively apply when it lands. |
| 9 | Replace +0.005 leakage threshold with statistical test | expert-data-scientist | §5 leakage gate rewritten : paired t-test with BH correction across 25 paired observations. Hypothesis-test bar replaces the effect-size threshold. |
| 11 | Sensitivity for purge_bars | expert-architect + expert-data-scientist | §3 matrix extended : added btc_full_purge10 variant. Now 6 variants : none, btc_min, btc_full, btc_full_purge0 (leakage check), btc_full_purge10 (sensitivity), btc_vol_only. |

11.3 Recommendations applied at impl time

| # | Recommendation | When |
|---|---|---|
| 3 | Continuous concept/data drift monitoring | Phase 4 : extends §4.7 observability with feature distribution monitoring (KS test on each BTC feature vs training distribution, weekly window) + Grafana panel "BTC features drift" + alert if KS p < 0.01 over 14 days |
| 7 | Specific tests : NaN propagation, window alignment, no-future-leak | Phase 1 : extends §4.6 unit test list ; tests below in §11.4 |
| 10 | Per-asset metrics in FTF results | Phase 5 : extends FTF results dossier table (already mandated by F1 plan §6 per-asset gate, just makes the per-asset trade count + variance explicit) |

11.4 Tests added per CR pass 1 reco #7

Beyond the 5 unit tests already listed in §4.6 :

  • NaN propagation test : input target with random NaN gaps in BTC OHLCV → assert that the target altcoin's row at time t is dropped only if BTC's row at t - purge_bars was NaN ; downstream rows unaffected.
  • Window alignment test : assert that btc_correlation_15m_lag5 at row i uses target.pct_change(1).rolling(96) from rows [i-95, i] paired with BTC.pct_change(1).shift(5).rolling(96) from rows [i-100, i-5] — formal proof that BTC's 5-bar lag is applied BEFORE the rolling window, not after.
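  • No-future-leak test : compute _ajouter_features_btc on synthetic data where BTC's last 50 rows have a step-function spike. Assert that the spike does NOT appear in any target row at index < n - 50 + purge_bars (the spike starts at BTC row n - 50 and the shift delays it by purge_bars rows, so this is the tight bound). Catches subtle off-by-one errors and wrong-direction shifts in the shift logic.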
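A minimal sketch of the no-future-leak test, written against a stand-in shift helper rather than `_ajouter_features_btc` (whose signature is not shown in this dossier). It asserts the tight bound implied by the shift: a spike that starts at BTC row n - 50 must first surface at row n - 50 + purge_bars, which also guarantees it is absent from every earlier row. `shift_btc_feature` and `check_no_future_leak` are hypothetical names.

```python
import pandas as pd


def shift_btc_feature(feature: pd.Series, purge_bars: int) -> pd.Series:
    """Stand-in for the purge step: delay a BTC feature by purge_bars rows."""
    return feature.shift(purge_bars)


def check_no_future_leak(n: int = 300, spike_len: int = 50, purge_bars: int = 20) -> None:
    # Flat BTC price, then a step up over the last spike_len rows.
    btc_close = pd.Series(100.0, index=range(n))
    btc_close.iloc[n - spike_len:] = 200.0

    feature = shift_btc_feature(btc_close.pct_change(4), purge_bars)

    first_allowed = n - spike_len + purge_bars
    # No trace of the spike before the first allowed row (NaN warm-up counts as clean).
    assert feature.iloc[:first_allowed].fillna(0.0).eq(0.0).all()
    # The spike surfaces exactly at the first allowed row: no off-by-one.
    assert feature.iloc[first_allowed] == 1.0


check_no_future_leak()
```

Asserting both sides of the boundary is deliberate: the first assertion alone would pass for an over-purged feature, and a looser bound would pass for a feature shifted in the wrong direction.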

11.5 Recommendations deferred (out of scope for S04)

  • Reco #8 — Cache re-enablement : flagged as Track 12 work. v1 disables cache when btc_features_enabled=True (fail-safe ; Track 12 will revisit cache key extension to include BTC OHLCV hash). Documented in §6 out-of-scope.

11.6 §6 amendment — deployment_review for paper/live

Online BTC feature computation in paper/live — the streaming kernel needs a rolling BTC window. v1 ships as backtest-only ; paper/live integration requires a dedicated deployment_review committee session covering staged rollout (canary → shadow → live), real-time drift detection, live feedback loops, kill-switch validation. This is a separate Story under CVN-N001-EE if Track 1 LOCKs.

11.7 Net effect on §4 implementation path

  • 2 blockers fixed (model-artefact feature contract + model-switching rollback).
  • 7 recos applied directly (rollback rewrite, feature contract pinning, BTC provenance + quality monitoring, statistical leakage test, purge_bars sensitivity variant, deployment_review acknowledgement, cost model realism confirmation).
  • 3 recos applied at impl time (drift monitoring, NaN/window/no-future-leak tests, per-asset metrics).
  • 1 reco deferred (cache re-enablement — Track 12).

Verdict re-submitted to committee in v2 round (session ID TBD) with this triage section explicit. Re-submission expected to upgrade to PASSED EXECUTION_RISK.


11bis. Committee plan_review v2 triage (session 6519ed97, 2026-04-30)

v2 verdict : PASSED / EXECUTION_RISK — strong consensus across 5 experts, 0 blockers, 7 new recos.

Reason cited : "The plan successfully addresses the v1 blockers regarding ADR-23 compliance and rollback mechanisms, but new execution risks related to cache integrity, MLflow artefact validation, and live model swap atomicity are identified."

The two architectural rewrites (§4.1bis feature contract + §7 model-switching rollback) are accepted. Implementation may proceed.

11bis.1 Recommendations integrated pre-impl (5 of 7)

| # | Recommendation | Source | Integration |
|---|---|---|---|
| v2.1 | Checksum validation for MLflow artefacts | all 5 experts | §4.1bis amended : enrichment_config.json and feature_names_with_btc ship with their SHA256 in the MLflow registry tags. At load, InferenceAPI recomputes the hashes and raises RuntimeError per ADR-25 on any drift (catches partial uploads + tampering). Pre-promotion hook in the MLOps workflow validates the hashes before the artefact is registered. |
| v2.2 | Cache key includes BTC features state | all 5 experts | §4.bis added : when btc_features_enabled=True, the L2 cache key is extended with + btc_first_ts + btc_last_ts + btc_features_set so a BTC-enabled enrichment cannot collide with a BTC-blind one for the same target window. 5-line patch in commun/cache/. Track 12 will revisit a hashed-window key for a tighter contract. |
| v2.5 | Drift response runbook | expert-ops + ml-eng + crypto-trader | §4.7 amended : new documentation/runbooks/runbook_btc_features_drift.md (P2) covers KS-test alert response : revert to BTC-blind champion if KS p < 0.01 over 14 days OR per-feature distribution drift > 3σ from training distribution, with quantitative thresholds for revisiting Track 1. |
| v2.6 | Stress-case tests | expert-crypto-trader | §4.6 amended : new tests for synthetic BTC flash crash (-30% spike in 4 bars) + halving-like step (10% baseline shift) ; assert that the per-fold f1_buy doesn't degrade > 0.05 vs a no-stress run. |
| v2.7 | ≥ 50 BUY trades/fold pre-FTF validation | expert-crypto-trader | §5 amended : operator runs 1 fold of btc_full on BTCUSDC (acts as a pre-flight check) and verifies sample size BEFORE triggering the full FTF sweep. If fail, the FTF sweep is aborted and the run gets a sample-size diagnostic dossier, not a verdict dossier. |
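The checksum validation of reco v2.1 can be sketched with hashlib alone. How the expected digest is stored in and read back from the MLflow registry tags is not shown here and is left as an assumption; `sha256_hex` and `verify_artefact` are hypothetical helper names.

```python
import hashlib


def sha256_hex(payload: bytes) -> str:
    """Hex SHA256 digest of an artefact's raw bytes."""
    return hashlib.sha256(payload).hexdigest()


def verify_artefact(name: str, payload: bytes, expected_sha256: str) -> None:
    """Fail fast if the loaded bytes do not match the digest pinned at training."""
    actual = sha256_hex(payload)
    if actual != expected_sha256:
        raise RuntimeError(
            f"{name}: SHA256 mismatch (expected {expected_sha256}, got {actual})"
        )


# Training side: compute the digest that would be stored as a registry tag.
config_bytes = b'{"btc_features_enabled": true, "btc_purge_bars": 20}'
pinned_digest = sha256_hex(config_bytes)

# Inference side: recompute on load; a truncated upload would raise here.
verify_artefact("enrichment_config.json", config_bytes, pinned_digest)
```

Hashing the raw bytes rather than the parsed JSON means a partial upload is caught even when the truncated file still happens to parse.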

11bis.2 Recommendations applied at impl + deployment time (1 of 7)

| # | Recommendation | When |
|---|---|---|
| v2.3 | Atomic MLflow promotion + race conditions | Deferred to deployment_review session per §6 (pre-paper/live). The session covers blue/green, circuit breakers, pre-swap health checks. v1 ships backtest-only ; no live model swap on this Story. |

11bis.3 Recommendations applied at LOCK time (1 of 7)

| # | Recommendation | When |
|---|---|---|
| v2.4 | Pre-LOCK rollback dry run in staging | At LOCK decision time. Operator runs the registered champion_btc_blind for 24h shadow on the same day's data ; assert that its feature_names match the inference-time enrichment output (no schema drift) AND its f1_buy on the shadow window is ≥ baseline - 0.01 (basic sanity). Documented in mlops_readiness.md §5 rollback plan as a mandatory pre-promotion gate. |

11bis.4 Net path forward

  • 0 v2 blockers → impl proceeds.
  • 5 v2 recos applied pre-impl (checksum, cache key, drift runbook, stress tests, sample-size pre-flight).
  • 1 v2 reco deferred to deployment_review (race conditions during live swap).
  • 1 v2 reco applied at LOCK time (rollback dry run).

Together with v1 triage : 7 v1 + 5 v2 = 12 recos applied pre-impl, 3 v1 + 1 v2 = 4 recos at impl time, 1 v2 at LOCK time, 1 v1 deferred (Track 12 cache).

11bis.5 EXECUTION_RISK acknowledgment

The EXECUTION_RISK code remains in v2 because Track 1 introduces the first cross-asset feature in CVN history : the committee correctly flags that the architectural patterns (feature contract pinning, model-switching rollback, checksum validation) are new to the codebase and carry execution risk in their first integration. The risk is acknowledged and budgeted into the impl phase (extra rigor on tests + observability, NOT cutting corners on the patterns themselves).


Question for the committee (v2)

v1 verdict : REJECTED / EXECUTION_RISK due to ADR-23 violation in the rollback mechanism + brittle feature contract via runtime env vars. Both blockers addressed in §4.1bis (feature contract pinned in MLflow artefact, env var becomes training-time only) + §7 (rollback via model-artefact switching, not env-flag flipping). All 11 recos triaged in §11 — 7 applied pre-impl, 3 at impl time, 1 deferred to Track 12.

Re-validate : is the new feature contract pinning pattern (MLflow enrichment_config.json + feature_names_with_btc, derived from artefact at inference, ADR-25 fail-fast on env↔artefact mismatch) sufficient to satisfy ADR-23 ? Is the model-switching rollback path (atomic promotion of the BTC-blind champion via existing MLOps workflow) sufficient to safely revert ? Are there remaining hidden modes (e.g. cache key collision between BTC-blind and BTC-enabled enrichment outputs, race condition during live model swap) the v1 dossier missed ?