MLOps readiness — CVN-N001-EE-S04 — BTC cross-asset features (Track 1, F1_buy boost)

Story : CVN-N001-EE-S04 (wp#43) · GH issue #715
Owner : @dococeven (DRI for production behaviour of this change)
Filled on : 2026-04-30
Reviewed by committee :
  • v1 session 62d756a9 : REJECTED / EXECUTION_RISK (2 architectural blockers)
  • v2 session 6519ed97 : PASSED / EXECUTION_RISK (strong consensus, 0 blockers, 7 forward-looking recos)


1. Production monitoring (MUST)

| Metric | Type | Source | Dashboard | Threshold (warn / crit) | Owner |
|---|---|---|---|---|---|
| event=btc_features_applied feature_set=... n_features=... purge_bars=... | counter (training-time) | commun.pipeline.btc_features.compute_btc_features (Loki) | Grafana cvntrade-track1-btc-features panel "Feature set distribution" | sample size mismatch vs FTF run config → warn | @dococeven |
| Per-feature distribution (mean, std, KS-statistic vs training distribution) | gauge | offline analysis on inference logs (committee CR pass 2 reco v2.5) | Grafana panel "BTC features drift" | KS p < 0.01 over 14 days → warn ; > 3σ per-feature → crit | @dococeven |
| event=btc_ohlcv_quality_alert reason=outlier_returns\|outlier_volume\|wick_to_body (committee CR pass 1 reco #5) | counter | ETL orchestration layer | Grafana panel "BTC OHLCV quality" | any reason fires > 5 times in 1 h → warn | @dococeven |
| event=enrichment_config_mismatch model_run_id=... env_value=... artefact_value=... | counter (failure) | InferenceAPI._enforce_btc_artefact_consistency (ADR-25 fail-fast) | Grafana panel "BTC artefact contract" | any occurrence → P1 (model deployed under wrong env config) | @dococeven |
| f1_buy per fold per crypto with BTC features | gauge | FTF results dossier table (committee reco #10) | offline ; not Grafana | per-fold variance > 0.05 → ABANDON variant (gate criterion 3) | operator |

Required minima covered :
  • ✅ prediction-rate metric : signals.buy_proba distribution (existing) + new event=btc_features_applied for training-time tagging
  • ✅ outcome metric : f1_buy per fold per crypto with BTC variant attribution
  • ✅ health metric : event=btc_ohlcv_quality_alert + event=enrichment_config_mismatch (ADR-25 fail-fast contract)

All metrics tagged with the FTF variant id (none / btc_min / btc_full / btc_full_purge0 / btc_full_purge10 / btc_vol_only) per ADR-30.
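As a minimal sketch of the tagging rule above, the helper below renders a logfmt-style event line and rejects unknown variant ids. The function name `format_event`, the `variant` key, and the sample field values are assumptions made for illustration ; the actual emitter in commun.pipeline.btc_features may look different.

```python
# Variant ids as listed in this document (per ADR-30).
ALLOWED_VARIANTS = (
    "none",
    "btc_min",
    "btc_full",
    "btc_full_purge0",
    "btc_full_purge10",
    "btc_vol_only",
)


def format_event(event: str, variant: str, **fields: object) -> str:
    """Render one logfmt-style line tagged with the FTF variant id.

    Hypothetical helper: the `variant` key name is an assumption.
    """
    if variant not in ALLOWED_VARIANTS:
        raise ValueError(f"unknown FTF variant id: {variant!r}")
    parts = [f"event={event}", f"variant={variant}"]
    # Keyword arguments keep their call order, so the line is stable.
    parts += [f"{key}={value}" for key, value in fields.items()]
    return " ".join(parts)


# Illustrative values only.
line = format_event(
    "btc_features_applied",
    "btc_full",
    feature_set="btc_full",
    n_features=6,
    purge_bars=10,
)
print(line)
```

Rejecting unknown ids at emit time keeps dashboard filters from silently splitting on typos.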

2. Alerting & runbooks (MUST)

  • Runbook P2 : runbook_btc_features_drift.md covers concept drift on the 6 BTC features (committee CR pass 2 reco v2.5), BTC OHLCV quality alerts, and enrichment artefact contract mismatch.
  • Alerts :
      • event=enrichment_config_mismatch fires → P1 alert routed to @dococeven (model deployed under wrong env, ADR-23 violation)
      • KS test p < 0.01 on any BTC feature over 14 days → P2 alert routed to @dococeven (drift, follow runbook §1)
      • event=btc_ohlcv_quality_alert reason=outlier_returns fires > 5 times / hour → P2 alert (BTC feed degradation)
      • btc_full_purge0 outperforms btc_full in FTF results (paired t-test, BH-corrected p < 0.05) → leakage suspected → ABANDON Track 1 (mandatory hard gate per dossier §5)
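The leakage gate in the last alert relies on Benjamini-Hochberg (BH) corrected p-values. The sketch below shows the correction step only, assuming the raw paired t-test p-values (for example one per crypto) have already been computed elsewhere. The function names, the choice of test family, and the sample numbers are hypothetical.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def leakage_suspected(mean_diff: float, adjusted_p: float, alpha: float = 0.05) -> bool:
    """True when purge0 beats full (mean_diff > 0) and the gap is significant.

    mean_diff is assumed to be mean f1_buy(btc_full_purge0) minus
    mean f1_buy(btc_full) over the paired folds.
    """
    return mean_diff > 0 and adjusted_p < alpha


# Illustrative raw p-values and mean differences, not real results.
raw_p = [0.004, 0.03, 0.20, 0.04, 0.60]
mean_diffs = [0.012, -0.003, 0.001, 0.004, -0.002]
adj_p = benjamini_hochberg(raw_p)
flags = [leakage_suspected(d, p) for d, p in zip(mean_diffs, adj_p)]
```

Requiring both a positive difference and a corrected p-value below 0.05 matches the wording of the gate: a significant result where btc_full_purge0 is worse than btc_full does not indicate leakage.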

3. Drift detection (MUST)

| Drift type | Detection method | Threshold | Action |
|---|---|---|---|
| Per-feature distribution drift (committee reco v2.5) | KS test on each BTC feature vs training distribution, weekly window | KS p < 0.01 over 14 days | runbook §1 : investigate or rollback to champion_btc_blind |
| BTC-altcoin correlation drift (committee reco #3) | rolling 30d Pearson correlation of target vs BTC returns | drift > 3σ from training-time correlation | runbook §2 : quarterly review trigger |
| BTC OHLCV quality | outlier detection on returns / volume / wick-to-body | per-bar threshold (5σ returns ; 80% volume drop ; wick-to-body > 5) | runbook §3 + alert |
| Cross-regime f1_buy variance | per-regime f1 in FTF results dossier | per-fold variance > 0.05 | gate 3 of F1 plan §6 : block lock |
| enrichment_config.json SHA256 mismatch (committee reco v2.1) | SHA256 of artefact recomputed on load vs MLflow registry tag | strict equality | RuntimeError per ADR-25 (catches partial uploads + tampering) |
| Class distribution drift (existing) | PSI on y_true per fold | PSI > 0.2 | existing playbook |
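The per-feature KS check in the first row can be sketched with the standard library alone. This is an illustration under assumptions, not the project's implementation: `ks_two_sample` is a hypothetical name, and the p-value uses the usual asymptotic Kolmogorov series with a small-sample correction. A maintained routine such as `scipy.stats.ks_2samp` is likely the better choice for production analysis.

```python
import math
from bisect import bisect_right


def ks_two_sample(reference: list[float], live: list[float]) -> tuple[float, float]:
    """Return (D, approximate p-value) for the two-sample KS test."""
    ref, cur = sorted(reference), sorted(live)
    n, m = len(ref), len(cur)
    # D is the largest gap between the two empirical CDFs; it can only
    # occur at an observed value, so those are the points evaluated.
    d = max(
        abs(bisect_right(ref, x) / n - bisect_right(cur, x) / m)
        for x in ref + cur
    )
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.3:
        # The series below converges poorly here and the p-value is ~1.
        return d, 1.0
    # Q(lam) = 2 * sum_{j>=1} (-1)^(j-1) * exp(-2 * j^2 * lam^2)
    total = 0.0
    for j in range(1, 101):
        term = 2.0 * (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
        total += term
        if abs(term) < 1e-12:
            break
    return d, min(1.0, max(0.0, total))


# Synthetic data: the live sample is shifted far from the reference.
training = [float(i) for i in range(100)]
shifted = [x + 100.0 for x in training]
d_stat, p_value = ks_two_sample(training, shifted)
drifted = p_value < 0.01  # threshold taken from the table above
```

In practice the test would run once per BTC feature, with the training distribution as `reference` and the recent inference window as `live`.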

4. Staged rollout (MUST)

| Stage | Surface | Duration | Gate |
|---|---|---|---|
| 1 | FTF sweep on defi_top5 (5 cryptos × 5 folds × 6 variants = 150 rows) | run-completion | every gate of F1 plan §6 + tightened f1_buy ≥ +0.020 + mandatory leakage check via paired t-test on purge0 vs full |
| 2 | Pre-FTF sample-size pre-flight (committee reco v2.7) | 1 fold of btc_full on BTCUSDC | ≥ 50 BUY trades / fold ; if fail, FTF sweep aborted |
| 3 | Pre-LOCK rollback dry run (committee reco v2.4) | 24h shadow on the champion_btc_blind fallback model | feature_names schema match + f1_buy ≥ baseline - 0.01 |
| 4 | Operator promotion → live for 1 crypto (BTCUSDC), 7 days | 7 d | f1_buy ≥ baseline + 0.020 ; max_drawdown ≤ baseline + 1 % |
| 5 | Rollout to all 5 defi_top5 cryptos | continuous | quarterly drift review per §3 |
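The stage 4 gate reduces to two comparisons against the baseline. The sketch below is one possible encoding ; the `LiveMetrics` type and `stage4_gate_passes` function are hypothetical, and it assumes that "baseline + 1 %" means one percentage point of drawdown expressed as a fraction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LiveMetrics:
    f1_buy: float
    max_drawdown: float  # fraction, e.g. 0.12 for 12 %


def stage4_gate_passes(
    live: LiveMetrics,
    baseline: LiveMetrics,
    min_f1_gain: float = 0.020,
    max_dd_slack: float = 0.01,  # assumption: 1 % read as 1 percentage point
) -> bool:
    """Check the 7-day single-crypto gate: f1_buy uplift and bounded drawdown."""
    f1_ok = live.f1_buy >= baseline.f1_buy + min_f1_gain
    dd_ok = live.max_drawdown <= baseline.max_drawdown + max_dd_slack
    return f1_ok and dd_ok


# Illustrative numbers only.
baseline = LiveMetrics(f1_buy=0.40, max_drawdown=0.10)
good = LiveMetrics(f1_buy=0.43, max_drawdown=0.105)
weak = LiveMetrics(f1_buy=0.41, max_drawdown=0.10)
risky = LiveMetrics(f1_buy=0.45, max_drawdown=0.13)
```

Both conditions must hold : a variant that improves f1_buy but exceeds the drawdown budget still fails the gate.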

Per ADR-59, the lock decision is a Console-driven model promotion (atomic per-crypto promotion per ADR-15 + ADR-42), NOT a runtime env-flag toggle. The artefact-pinned config (per ADR-23) means rollback = deploying the champion_btc_blind model, not flipping a switch.

Paper/live integration is NOT in this Story — covered by a separate deployment_review session (committee CR pass 1 reco #4).

5. Rollback plan (MUST)

| Symptom | Action | Reversal latency |
|---|---|---|
| event=enrichment_config_mismatch (model loaded under wrong env, ADR-25 fail-fast) | Console promotion of the registered champion_btc_blind model (atomic per-crypto promotion per ADR-15 + ADR-42) | < 5 minutes |
| Production f1_buy regression > 0.02 over 7 d | same Console promotion of champion_btc_blind | < 5 minutes |
| BTC OHLCV feed degraded > 5% NaN over a fold | revert to champion_btc_blind ; investigate Binance API status | < 5 minutes |
| Bug in compute_btc_features itself (e.g. window misalignment) | hot-fix PR ; DOES NOT require redeploying ; revert via Console promotion until fix lands | < 5 minutes for revert ; ~1 hour for fix-and-retrain |
| BTC-altcoin concept drift (correlation > 3σ from training) | quarterly re-fit cadence ; revert to champion_btc_blind if revert is needed before retrain completes | < 5 min revert ; 1 sprint to retrain |
| Pre-LOCK dry run fails (committee reco v2.4) | LOCK decision blocked ; investigation required before any promotion | 0 (LOCK not approved) |

The rollback path is symmetric : every Track-1 variant ships with a registered champion_btc_blind fallback as part of the LOCK gate. The env var CVN_BTC_FEATURES_ENABLED is training-time only ; flipping it on a deployed model would either cause a feature-dimension mismatch (caught by ADR-23) or silently impute zeros (ruled out by §4.1bis pinning).
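The fail-fast behaviour behind this can be sketched as two checks at model load : the artefact hash must match the registered value, and the runtime env flag must agree with the value pinned in the artefact. The function name `verify_enrichment_artefact` and the `btc_features_enabled` JSON key are assumptions ; the real check lives in InferenceAPI._enforce_btc_artefact_consistency and may differ.

```python
import hashlib
import json


def verify_enrichment_artefact(
    raw: bytes, expected_sha256: str, env_btc_enabled: bool
) -> dict:
    """Fail fast if the artefact is corrupted or disagrees with the env flag.

    `raw` is the content of enrichment_config.json; `expected_sha256` is
    assumed to come from the model registry tag.
    """
    actual = hashlib.sha256(raw).hexdigest()
    if actual != expected_sha256:
        raise RuntimeError(
            f"enrichment_config.json SHA256 mismatch: {actual} != {expected_sha256}"
        )
    config = json.loads(raw)
    pinned = bool(config["btc_features_enabled"])  # hypothetical key name
    if pinned != env_btc_enabled:
        raise RuntimeError(
            "event=enrichment_config_mismatch "
            f"env_value={env_btc_enabled} artefact_value={pinned}"
        )
    return config


# Illustrative artefact built in memory.
raw = json.dumps({"btc_features_enabled": True}).encode("utf-8")
tag = hashlib.sha256(raw).hexdigest()
config = verify_enrichment_artefact(raw, tag, env_btc_enabled=True)
```

Hashing before parsing means a truncated upload is rejected even if the partial file still happens to parse as JSON.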

6. Owner & DRI (MUST)

  • DRI : @dococeven
  • Backup : @cvntrade-ml
  • Escalation : @cvntrade-architect (architectural drift on the feature contract or rollback workflow) ; @cvntrade-ops (production incident impacting SL/TP behaviour or kill-switch)

7. Known follow-up — cache key extension (committee CR pass 2 reco v2.2)

The L2 cache key for enrichment outputs (in commun/cache/) currently does not include the BTC OHLCV identity. With btc_features_enabled=True mixing into a target's enrichment, two parallel runs (one BTC-blind, one BTC-enabled) for the same target window could collide cache entries.

v1 known debt : the FTF sweep runs the BTC-enabled and BTC-blind variants in separate runs (separate run_id) so cache collision between them within a single run is moot. Cross-run cache pollution is a Track 12 concern.

Track 12 follow-up : extend the cache key with btc_first_ts + btc_last_ts + btc_features_set so BTC-blind and BTC-enabled enrichments coexist safely in cache. To be filed as a separate issue before Track 1 LOCK ; deferred per committee CR pass 2 (reco "v1 OR Track 12" — operator chose Track 12 to keep the v1 PR scope tight).

Pre-LOCK gate : if Track 1 clears all 6 official gates, the cache extension MUST land before live promotion (dry-run check : same target window enriched twice with none and btc_full variants in the same cache namespace must not collide).

This is a known-debt acknowledgement, not a blocker — committee verdict v2 accepts the deferral with the pre-LOCK gate above.
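One way the Track 12 cache key extension could look is sketched below : the BTC identity fields are folded into the hashed key so BTC-blind and BTC-enabled enrichments of the same target window cannot collide. The function name, argument types (epoch-millisecond timestamps), and key layout are assumptions, not the existing commun/cache/ scheme.

```python
import hashlib
from typing import Optional


def enrichment_cache_key(
    target: str,
    window_start_ms: int,
    window_end_ms: int,
    btc_features_set: str = "none",
    btc_first_ts: Optional[int] = None,
    btc_last_ts: Optional[int] = None,
) -> str:
    """Build a cache key that includes the BTC OHLCV identity."""
    parts = [
        target,
        str(window_start_ms),
        str(window_end_ms),
        btc_features_set,
        str(btc_first_ts),
        str(btc_last_ts),
    ]
    # The unit separator keeps field boundaries unambiguous before hashing.
    payload = "\x1f".join(parts)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Dry-run style check: same target window, two variants, one namespace.
blind_key = enrichment_cache_key("ETHUSDC", 1_700_000_000_000, 1_700_086_400_000)
btc_key = enrichment_cache_key(
    "ETHUSDC",
    1_700_000_000_000,
    1_700_086_400_000,
    btc_features_set="btc_full",
    btc_first_ts=1_699_900_000_000,
    btc_last_ts=1_700_086_400_000,
)
```

The pre-LOCK dry-run check then amounts to asserting that the two keys differ for the none and btc_full variants.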


Sign-off checklist (gate before PR merge)

  • §1-§6 all complete
  • Plan dossier 2026-04-30-track1-btc-features-plan.md v2 PASSED committee plan_review (session 6519ed97)
  • Runbook runbook_btc_features_drift.md lands in this PR
  • All 6 official gates of F1 plan §6 met OR explicit keep available / abandon verdict — checked at FTF sweep completion, post-merge
  • Mandatory leakage check : btc_full_purge0 does NOT outperform btc_full (paired t-test BH p ≥ 0.05) — checked at FTF sweep completion
  • champion_btc_blind rollback model registered + 24h shadow dry run passed — checked at LOCK time, post-FTF
  • Expert Committee pr_review PASSED — runs against this PR before merge (mandatory per ADR-68 for substantial ML changes)