
CVNTrade — Tuning Protocol

Version: 1.0
Date: 2026-04-14
Issue: #499
Governing ADR: ADR-56 (every change A/B testable by design)


1. Purpose

This document defines the systematic process by which every parameter of the CVNTrade ML trading pipeline is selected, validated, and locked. No parameter is set by intuition, copied from another system, or left at a library default. Every choice is empirically validated through controlled ablation testing.

Audience: Anyone who needs to understand WHY the pipeline is configured the way it is — engineers, auditors, investors, regulators.

Guarantee: For every parameter in this document, there is:

1. A hypothesis explaining why this value was chosen
2. An FTF ablation run proving it's the best option
3. A statistical test (BH-corrected p < 0.05) or an explicit "no significant difference" verdict
4. A committee review validating the decision

Companion plan: see F1_BUY_BOOST_PLAN.md for the active 13-track plan to break the chronic f1_buy 0.40-0.46 plateau (committee-approved 2026-04-27 round 3, session 9d4942cb).


2. Methodology — Lock-and-Advance

Principle

Test ONE factor at a time (ceteris paribus). All other parameters are held at their locked baseline value. This isolates the effect of each change.

Process per factor

1. Define variants         → env var + FTF factor (ADR-56)
2. Run FTF ablation        → 5 folds × 5 cryptos × 3 costs × N variants
3. Analyze results         → Sortino, CI, pairwise BH comparison
4. Committee review        → score ≥ 8 to proceed
5. Lock winner             → lock_winner(factor, variant)
6. Update BASE_ENV         → winner becomes new baseline
7. Advance to next factor  → all subsequent tests use updated baseline
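The seven steps above can be sketched as a loop. This is a minimal illustration, not the FTF implementation: `run_matrix`, `lock_and_advance`, and `pick_winner` are hypothetical names, the asset and cost labels are placeholders, and the stub winner rule stands in for the real ablation run, analysis, and committee review.

```python
from itertools import product

FOLDS = [1, 2, 3, 4, 5]
CRYPTOS = ["asset_1", "asset_2", "asset_3", "asset_4", "asset_5"]  # placeholders
COST_LEVELS = ["low", "mid", "high"]  # placeholders for the 3 cost settings


def run_matrix(variants):
    """Every (fold, crypto, cost, variant) combination for one factor."""
    return list(product(FOLDS, CRYPTOS, COST_LEVELS, variants))


def lock_and_advance(base_env, factors, pick_winner):
    """Test one factor at a time; each winner joins the baseline.

    factors: ordered mapping of env var name -> list of candidate values.
    pick_winner: callable(env_var, variants, baseline) -> winning value
                 (hypothetical stand-in for steps 2 to 4).
    """
    baseline = dict(base_env)  # never mutate the caller's baseline
    locked = []
    for env_var, variants in factors.items():
        # Only env_var varies; everything else stays at the current baseline.
        winner = pick_winner(env_var, variants, dict(baseline))
        baseline[env_var] = winner  # steps 5 and 6: lock, update baseline
        locked.append((env_var, winner))
    return baseline, locked


# Toy usage: the stub "winner" is simply the last variant listed.
factors = {
    "CVN_TIMEFRAME": ["5m", "15m", "30m", "1h"],
    "CVN_ATR_PERIOD": ["10", "14", "20"],
}
baselines_seen = []


def stub_pick(env_var, variants, baseline):
    baselines_seen.append(baseline)
    return variants[-1]


final_env, locked = lock_and_advance({}, factors, stub_pick)
print(len(run_matrix(factors["CVN_TIMEFRAME"])))  # 5 * 5 * 3 * 4 = 300
print(baselines_seen[1])  # {'CVN_TIMEFRAME': '1h'}
```

The second print shows the "advance" step: when the second factor is tested, the baseline it sees already contains the first factor's winner.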

Statistical Standards

| Criterion | Threshold |
|---|---|
| Significance | BH-corrected p < 0.05 |
| Effect size | Cohen's d reported (no minimum, but d < 0.2 = negligible) |
| Power | ≥ 63 trades per variant (d=0.5, α=5%, power=80%) |
| Minimum trades per fold | ≥ 30 (below = underpowered warning) |
| Confidence intervals | Bootstrap 95% CI on all metrics |
| Outlier protection | Sortino capped ±20, PF capped 50, runs < 3 trades excluded |
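The 63-trade power threshold can be reproduced with the standard normal approximation for a two-sided, two-sample comparison, n = 2·((z₁₋α/₂ + z_power) / d)². The sketch below assumes that this approximation is the one behind the figure; the function name is illustrative. An exact t-based calculation gives a slightly larger number.

```python
from math import ceil
from statistics import NormalDist


def trades_per_variant(d, alpha=0.05, power=0.80):
    """Per-variant sample size, normal approximation, two-sided test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 5%
    z_power = z.inv_cdf(power)          # about 0.84 for power = 80%
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


print(trades_per_variant(0.5))  # 63
```

Because n scales with 1/d², detecting a "negligible" effect of d = 0.2 at the same α and power would need several hundred trades per variant.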

If no significant difference

Lock the simplest/cheapest variant (Occam's razor). Document the "no significant difference" verdict: the test is still valuable because it shows that the parameter has no detectable effect at the tested sample size.
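A minimal sketch of the decision rule, assuming the pairwise p-values compare the best-scoring variant against each rival. `bh_adjust` implements the standard Benjamini-Hochberg adjustment; `choose_variant` and its "any comparison significant" rule are illustrative assumptions, not the FTF logic.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def choose_variant(best, simplest, pairwise_p, alpha=0.05):
    """Lock `best` only if some comparison survives BH correction.

    pairwise_p: raw p-values for `best` versus each rival (assumed input).
    Otherwise fall back to the simplest variant.
    """
    if any(p < alpha for p in bh_adjust(pairwise_p)):
        return best
    return simplest


print(choose_variant("B", "A", [0.01, 0.04, 0.03, 0.20]))  # B
print(choose_variant("B", "A", [0.30, 0.45, 0.96]))        # A
```

In the first call the smallest raw p-value (0.01) adjusts to 0.04, which is still below 0.05, so the best variant is locked. In the second call nothing survives correction and the simplest variant wins by default.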


3. Protocol Phases

Phase 0 — Calibration Baseline

Purpose: Establish the starting point. Verify the pipeline works end-to-end.

| Factor | Variants tested | Winner | Evidence |
|---|---|---|---|
| calibration | none, isotonic, platt | isotonic | No significant difference (p=0.96). All three viable. |

Decision: Lock isotonic (user selection). Calibration has negligible impact on Sortino.


Phase 1a — Data Foundation

Purpose: How we prepare the data BEFORE training. These choices affect every downstream component.

| Factor | Question | Variants | Env var |
|---|---|---|---|
| timeframe | What candle resolution? | 5m, 15m, 30m, 1h | CVN_TIMEFRAME |
| fold_size | How long is each training window? | 6m, 9m, 12m, 18m | CVN_TRAIN_WINDOW_MONTHS |
| n_features | How many features does the model see? | top_30, top_50, top_100, full | CVN_MAX_FEATURES |
| atr_period | ATR lookback for label generation? | 10, 14, 20 bars | CVN_ATR_PERIOD |
| purge_embargo | Gap between train/test to prevent leakage? | various purge/embargo combos | CVN_PURGE_BARS, CVN_EMBARGO_BARS |

Status: ✅ LOCKED — Winners applied to BASE_ENV (2026-04-16)

Why these matter:

- timeframe: Determines the granularity of patterns the model can learn. 5m = noisy but granular. 1h = smoother but fewer samples.
- fold_size: Short folds = more recent data but less training signal. Long folds = more data but older patterns that may no longer hold.
- n_features: Too few = model can't learn complex patterns. Too many = overfitting risk (mitigated by XGBoost regularization).
- atr_period: Controls the TP/SL levels in the triple barrier. Shorter = more responsive to recent volatility. Longer = more stable.
- purge_embargo: Prevents label leakage between train and test. Too small = leakage risk. Too large = wasted data.
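To make the purge/embargo trade-off concrete, here is a sketch of one fold boundary. It assumes one plausible reading of the two settings: purge trims the tail of the training window, and embargo delays the start of the test window. The pipeline's actual semantics may differ, and `split_fold` is a hypothetical helper.

```python
def split_fold(train_start, train_end, test_bars, purge_bars=20, embargo_bars=10):
    """Return (train, test) bar-index ranges separated by a leakage gap.

    Assumed semantics: the last `purge_bars` of the nominal training window
    are dropped, and the test window starts `embargo_bars` after `train_end`.
    """
    if purge_bars < 0 or embargo_bars < 0:
        raise ValueError("purge_bars and embargo_bars must be non-negative")
    if train_end - purge_bars <= train_start:
        raise ValueError("purge removes the entire training window")
    train = range(train_start, train_end - purge_bars)
    test_start = train_end + embargo_bars
    test = range(test_start, test_start + test_bars)
    return train, test


train, test = split_fold(train_start=0, train_end=1000, test_bars=200)
print(train.stop, test.start)   # 980 1010
print(test.start - train.stop)  # 30 bars are used by neither side
```

With the defaults of 20 and 10 bars, 30 bars per fold boundary are discarded, which is the "wasted data" cost that larger settings increase.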


Phase 1b — Training Core

Purpose: How we configure the model training process itself. The highest-leverage phase.

| Factor | Question | Variants | Env var | Why it matters |
|---|---|---|---|---|
| cusum_training_mode | Do we filter data before training? | disabled, relaxed_1_5, legacy_3_0 | CVN_CUSUM_TRAINING_MODE | Critical: CUSUM removes 95% of training data. Disabling it gives 20× more samples. |
| class_balancing | Do we reweight minority classes? | OFF, ON | CVN_CLASS_BALANCING | 70% of labels are HOLD. Without balancing, model defaults to HOLD and misses BUY signals. |
| hpo_objective | What does the optimizer maximize? | fbeta_buy, precision_recall_auc, f1_macro, sortino_net | CVN_HPO_OBJECTIVE | Classification metrics ≠ trading profit. sortino_net optimizes what we actually care about. |
| early_stopping | When to stop training? | 50, 150, 300 rounds | CVN_EARLY_STOPPING_ROUNDS | Too early = undertrained. Too late = overfit. |
| hpo_budget | How many HPO trials? | 15, 30, 50 | CVN_HPO_N_TRIALS | More trials = better HP search but slower. Diminishing returns after ~30. |
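As an illustration of what class balancing does to a 70% HOLD label set, the sketch below uses inverse-frequency weights, n / (k × count), the common "balanced" heuristic. Whether CVN_CLASS_BALANCING uses this exact formula is an assumption, as is the 15/15 split between BUY and SELL.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weight per class: n_samples / (n_classes * count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


# 70% HOLD comes from the table above; the BUY/SELL split is illustrative.
labels = ["HOLD"] * 70 + ["BUY"] * 15 + ["SELL"] * 15
weights = balanced_class_weights(labels)
sample_weights = [weights[y] for y in labels]

for cls in ("HOLD", "BUY", "SELL"):
    print(cls, round(weights[cls], 3))
```

Each BUY sample ends up weighted roughly 4.7 times as heavily as a HOLD sample, while the total weight stays equal to the number of samples, so the overall loss scale is unchanged.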

Status: NEXT (Sprint 1 fixes deployed, cusum_training_mode triggered)

Market hypothesis behind these choices:

The system targets short-term mean-reversion at regime transition points in DeFi altcoins. CUSUM detects regime shifts. The ML model predicts which transitions will mean-revert profitably. The triple barrier captures the reversion (TP) or limits the loss (SL).

For this to work, the model needs:

- Enough training data to learn transition patterns (→ relax CUSUM during training)
- Balanced exposure to BUY signals (→ class balancing)
- Optimization for trading profit, not classification accuracy (→ sortino_net)
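Since the third requirement points at a Sortino-based objective, a minimal per-trade Sortino sketch may help. It assumes a target return of zero, no annualization, and the ±20 cap from the Statistical Standards table; the exact definition behind sortino_net (for example how costs are netted) is not specified here, so treat this as illustrative.

```python
from math import sqrt


def sortino(returns, target=0.0, cap=20.0):
    """Mean excess return divided by downside deviation, clipped to ±cap."""
    if not returns:
        raise ValueError("returns must not be empty")
    excess = [r - target for r in returns]
    mean_excess = sum(excess) / len(excess)
    # Downside deviation: only below-target returns contribute.
    downside = sqrt(sum(min(e, 0.0) ** 2 for e in excess) / len(excess))
    if downside == 0.0:
        # No below-target returns: the ratio is undefined, so use the cap
        # when there is any gain and 0.0 when every return equals the target.
        return cap if mean_excess > 0 else 0.0
    return max(-cap, min(cap, mean_excess / downside))


print(round(sortino([0.02, -0.01, 0.03, -0.02]), 4))  # 0.4472
print(sortino([0.01, 0.02]))                          # 20.0
```

The second call shows why the cap matters: a handful of winning trades with no losers would otherwise produce an unbounded score and dominate the comparison.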


Phase 2 — Model Architecture

Purpose: What type of model and classification scheme.

| Factor | Question | Variants | Env var | Why it matters |
|---|---|---|---|---|
| classification_mode | 2-class or 3-class? | 3class, binary_balanced, binary_precision | CVN_BINARY_CLASSIFICATION | 3-class wastes capacity on SELL (we only go long). Binary focuses 100% on the BUY decision. |
| model_type | Which ML algorithm? | xgboost, lightgbm, catboost | CVN_MODEL_TYPE | Different inductive biases. XGBoost = baseline. LightGBM = faster. CatBoost = better on categoricals. |
| objective_beta | Precision/recall trade-off? | β=0.5, 1.0, 2.0 | CVN_BUY_BETA | β<1 favors precision (fewer but better trades). β>1 favors recall (more trades). |
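The effect of β can be checked directly from the F-beta definition, F_β = (1 + β²)·P·R / (β²·P + R). The precision and recall values below are made-up inputs for illustration only.

```python
def fbeta(precision, recall, beta):
    """F-beta score from precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# A precise but low-recall BUY classifier (illustrative numbers).
p, r = 0.6, 0.3
for beta in (0.5, 1.0, 2.0):
    print(beta, round(fbeta(p, r, beta), 3))
# 0.5 0.5
# 1.0 0.4
# 2.0 0.333
```

The same classifier scores highest at β=0.5 and lowest at β=2.0, because a small β rewards its precision and a large β penalizes its low recall.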

Status: PLANNED (after Phase 1b locked)


Phase 3 — Signal Generation

Purpose: How the model's predictions are filtered before becoming trade signals.

| Factor | Question | Variants | Env var | Why it matters |
|---|---|---|---|---|
| meta_labeling | Secondary model validates primary? | OFF, ON_03, ON_05, ON_07 | CVN_USE_META_LABEL | Meta-label can filter false positives but adds complexity and requires separate training. |
| cusum_threshold | How sensitive is the regime detector? | h=2.0, 3.0, 5.0 | CVN_CUSUM_THRESHOLD_H | Lower h = more events (more trades, more noise). Higher h = fewer events (fewer trades, cleaner). |
| adaptive_event_engine | Dynamic CUSUM threshold? | OFF, ON | CVN_ADAPTIVE_EVENT_ENGINE | Static CUSUM doesn't adapt to regime changes. Adaptive adjusts h based on rolling volatility. |
| confidence_threshold | Minimum model confidence to trade? | 0.3, 0.4, 0.5, 0.6 | CVN_THRESHOLD_BUY | Lower = more trades (higher recall). Higher = fewer but better trades (higher precision). |
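How h controls event frequency can be seen with a symmetric CUSUM filter. The sketch assumes the inputs are returns already standardized to σ units, so that h is expressed in σ; the pipeline's actual detector and the adaptive variant are not shown, and the sample series is invented.

```python
def cusum_events(std_returns, h):
    """Indices where the positive or negative cumulative sum breaches h."""
    s_pos = s_neg = 0.0
    events = []
    for i, r in enumerate(std_returns):
        s_pos = max(0.0, s_pos + r)  # accumulates upward drift
        s_neg = min(0.0, s_neg + r)  # accumulates downward drift
        if s_pos > h:
            s_pos = 0.0              # reset after signalling an event
            events.append(i)
        elif s_neg < -h:
            s_neg = 0.0
            events.append(i)
    return events


series = [0.5, 1.0, 1.0, -0.2, -1.5, -1.5, 0.3, 2.5]  # illustrative data
print(cusum_events(series, h=2.0))  # [2, 5, 7]
print(cusum_events(series, h=3.0))  # [5]
```

On the same series, lowering h from 3.0 to 2.0 triples the number of events, which is the trade/noise trade-off described in the cusum_threshold row.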

Status: PLANNED


Phase 4 — Execution Rules

Purpose: How we structure each trade (entry, exit, time limit).

| Factor | Question | Variants | Env var | Why it matters |
|---|---|---|---|---|
| signal_mode | Instant or confirmed execution? | ldp (instant), legacy_confirm (2-candle) | CVN_USE_LDP_PIPELINE | LdP = faster execution, no missed trades. Legacy = confirmation reduces false signals but delays entry. |
| triple_barrier | SL/TP/Horizon settings? | Various ATR multiplier combos | CVN_SL_MULT, CVN_TP_MULT, CVN_HORIZON_HOURS | Defines the risk/reward of each trade. Wider SL = fewer stops but larger losses. Higher TP = larger wins but fewer hits. |
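A compact sketch of triple-barrier labeling for a long entry, using the label mapping from the glossary (TP hit → BUY, SL hit → SELL, timeout → HOLD). The defaults mirror the current baseline (1.5 ATR stop, 3.0 ATR target, 5 bars on the 1h timeframe). Checking the stop first when both barriers fall inside one bar is a conservative assumption of this sketch, not a documented rule.

```python
def triple_barrier_label(entry, atr, highs, lows,
                         sl_mult=1.5, tp_mult=3.0, horizon=5):
    """Label one long entry from the bars that follow it."""
    take_profit = entry + tp_mult * atr
    stop_loss = entry - sl_mult * atr
    for high, low in list(zip(highs, lows))[:horizon]:
        if low <= stop_loss:     # assumed tie-break: stop is hit first
            return "SELL"
        if high >= take_profit:
            return "BUY"
    return "HOLD"                # neither barrier hit within the horizon


# entry=100, atr=2 -> take profit at 106, stop loss at 97
print(triple_barrier_label(100, 2, [101, 103, 106.5], [99, 100, 102]))  # BUY
print(triple_barrier_label(100, 2, [101, 102], [99, 96.5]))             # SELL
print(triple_barrier_label(100, 2, [101] * 5, [99] * 5))                # HOLD
```

Widening `sl_mult` moves the stop further from entry, so fewer paths end in SELL but each stop costs more, which is the risk/reward trade-off the table describes.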

Status: PLANNED


Phase 5 — Signal Filters

Purpose: Post-inference filters that gate which signals become trades.

| Factor | Question | Variants | Env var | Why it matters |
|---|---|---|---|---|
| trend_filter | Only trade with the trend? | OFF, ON_EMA20, ON_EMA50 | CVN_USE_TREND_FILTER | Prevents counter-trend trades. May reduce drawdown but also reduces opportunity. |
| regime_filter | Block hostile regimes? | OFF, ON | CVN_USE_REGIME_FILTER | Prevents trading during high-volatility crashes. May miss recovery bounces. |
| cooldown_policy | Minimum time between trades? | none, 5min, 15min | CVN_TRADE_COOLDOWN_SECONDS | Prevents overtrading after stops, i.e. rapid re-entries that resemble revenge trading. |
| concurrency_limit | Max simultaneous positions? | 1, 2, 3 | CVN_MAX_CONCURRENT | 1 = concentrated bets (higher Sortino variance). 3 = diversified (lower variance but lower per-trade impact). |
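The cooldown and concurrency rows amount to a simple gate applied to each candidate signal. The sketch below is a hypothetical illustration; the function name, arguments, and the choice to measure cooldown from the last exit are assumptions rather than the pipeline's behavior.

```python
def allow_trade(now_s, last_exit_s, open_positions,
                cooldown_s=0, max_concurrent=1):
    """Return True if a new position may be opened right now."""
    if open_positions >= max_concurrent:
        return False  # concurrency limit reached
    if last_exit_s is not None and now_s - last_exit_s < cooldown_s:
        return False  # still inside the cooldown window
    return True


# Baseline settings (no cooldown, one position at a time).
print(allow_trade(now_s=1000, last_exit_s=990, open_positions=0))  # True
# A 5-minute cooldown blocks a re-entry 10 seconds after the last exit.
print(allow_trade(1000, 990, 0, cooldown_s=300))                   # False
# One open position blocks any new entry when max_concurrent is 1.
print(allow_trade(1000, None, 1))                                  # False
```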

Default policy: All filters disabled by default unless FTF ablation proves they improve Sortino (BH p < 0.05). Committee approval required to enable (ADR-52).

Status: PLANNED


Phase 6 — Cost & Risk

Purpose: How we model transaction costs and manage portfolio risk.

| Factor | Question | Variants | Env var | Why it matters |
|---|---|---|---|---|
| cost_model | Base transaction fee? | 10, 15, 30 bps | CVN_TRADE_FEE_BPS | Lower cost = more trades profitable. 15 bps is realistic for DeFi perps (maker+taker). |
| slippage_model | How much slippage? | none, linear, nonlinear | CVN_SLIPPAGE_IMPACT_FACTOR | Nonlinear: base + impact × √(size/volume). More realistic for illiquid DeFi tokens. |
| kelly_sizing | Position sizing method? | OFF, half_kelly, full_kelly | CVN_USE_KELLY | Kelly maximizes long-term growth. Half-Kelly is more conservative (lower drawdown). |
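The nonlinear slippage formula from the table, base + impact × √(size/volume), is easy to evaluate directly. The sketch assumes the result is a fraction of notional, that size and volume share the same unit, and that the base term defaults to zero; only impact=0.001 comes from the current baseline.

```python
from math import sqrt


def slippage_fraction(order_size, bar_volume, base=0.0, impact=0.001):
    """Square-root impact model: base + impact * sqrt(size / volume)."""
    if bar_volume <= 0:
        raise ValueError("bar_volume must be positive")
    if order_size < 0:
        raise ValueError("order_size must be non-negative")
    return base + impact * sqrt(order_size / bar_volume)


# An order worth 1% of bar volume costs about 1 bp with impact=0.001.
small = slippage_fraction(order_size=1_000, bar_volume=100_000)
# Quadrupling the order size only doubles the slippage (square-root law).
large = slippage_fraction(order_size=4_000, bar_volume=100_000)
print(round(small * 1e4, 3), "bps")
print(round(large * 1e4, 3), "bps")
```

The square root makes cost grow more slowly than order size, but it also means thin volume hurts quickly, which is why the model is described as more realistic for illiquid tokens.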

Status: PLANNED


Phase 7 — Operations

Purpose: Runtime behavior and safety mechanisms.

| Factor | Question | Variants | Env var |
|---|---|---|---|
| drift_detection | Monitor for model degradation? | OFF, ON | CVN_DRIFT_ACTION |
| system_status | Active trading or shadow mode? | active, shadow | CVN_SYSTEM_STATUS |

Status: PLANNED


Phase 8 — Holdout Validation

Purpose: Final out-of-sample validation before production deployment.

Process:

1. Run the fully locked configuration on holdout fold (fold 1, most recent 2 months)
2. Compare to baselines: buy-and-hold, random entry, naive (ADR-29)
3. Verify: Sortino > 1.5× random, net expectancy > 0, trades ≥ 30
4. Committee review (score ≥ 8)
5. Promote model to Production in MLflow (ADR-2: manual only)
6. Staged rollout: shadow → canary (10%) → full (ADR-42: atomic per crypto)
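The three numeric criteria in the verification step can be expressed as a small gate. This is a sketch with hypothetical names; it covers only the thresholds and leaves out the committee review and the baseline comparisons.

```python
def holdout_checks(sortino, random_sortino, net_expectancy, n_trades):
    """Evaluate the holdout thresholds and report each one separately."""
    checks = {
        # Note: if the random baseline's Sortino is negative, 1.5x of it is
        # a weaker bar; this sketch does not handle that case specially.
        "sortino_vs_random": sortino > 1.5 * random_sortino,
        "net_expectancy_positive": net_expectancy > 0,
        "enough_trades": n_trades >= 30,
    }
    checks["passed"] = all(checks.values())
    return checks


result = holdout_checks(sortino=1.2, random_sortino=0.5,
                        net_expectancy=0.003, n_trades=28)
print(result["passed"])         # False
print(result["enough_trades"])  # False: 28 trades is below the minimum of 30
```

Returning each check separately, rather than a single boolean, keeps the reason for a failed validation visible to reviewers.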


4. Current Baseline (BASE_ENV)

Every parameter below is the CURRENT production configuration. Changes require FTF validation + committee approval.

Data Preparation

| Parameter | Value | Locked by Phase |
|---|---|---|
| Timeframe | 1h | Phase 1a ✅ Locked |
| Train window | 18 months | Phase 1a ✅ Locked |
| History depth | 24 months | Fixed |
| Feature count | top 50 | Phase 1a ✅ Locked |
| ATR period | 20 bars | Phase 1a ✅ Locked |
| Purge bars | 20 (strict) | Phase 1a ✅ Locked |
| Embargo bars | 10 (strict) | Phase 1a ✅ Locked |
| CUSUM training mode | enabled (legacy) | Phase 1b (testing) |

Model Training

| Parameter | Value | Locked by Phase |
|---|---|---|
| Model type | XGBoost | Phase 2 |
| Classification | 3-class (SELL/HOLD/BUY) | Phase 2 |
| HPO objective | precision_recall_auc | Phase 1b (testing) |
| HPO trials | 30 | Phase 1b |
| Early stopping | 150 rounds | Phase 1b |
| Class balancing | ON | ADR-46 Phase 1b |
| Calibration | isotonic | Phase 0 ✅ Locked |
| Binary classification | OFF (3-class) | Phase 2 |

Signal Generation

| Parameter | Value | Locked by Phase |
|---|---|---|
| CUSUM filter (inference) | ON, h=3.0σ | Phase 3 |
| Meta-label | OFF | Phase 3 |
| Adaptive event engine | OFF | Phase 3 |
| Confidence threshold | 0.4 (HPO-tuned) | Phase 3 |

Execution

| Parameter | Value | Locked by Phase |
|---|---|---|
| Pipeline mode | LdP (instant) | Fixed |
| SL multiplier | 1.5 ATR | Phase 4 |
| TP multiplier | 3.0 ATR | Phase 4 |
| Horizon | 5 hours | Phase 4 |

Filters

| Parameter | Value | Locked by Phase |
|---|---|---|
| Trend filter | OFF | Phase 5 |
| Regime filter | OFF | Phase 5 |
| Meta-label filter | OFF | Phase 5 |
| Concurrency | max 1 | Phase 5 |
| Cooldown | 0s (none) | Phase 5 |

Cost & Risk

| Parameter | Value | Locked by Phase |
|---|---|---|
| Trade fee | 15 bps | Phase 6 |
| Slippage model | nonlinear (impact=0.001) | Phase 6 |
| Kelly sizing | OFF | Phase 6 |
| Max daily drawdown | 10% | Fixed |

5. Audit Trail

Every locked parameter has a traceable chain:

FTF Run (run_id) → PostgreSQL (finetune_results)
    → FTF Report (committee/reports/)
    → Committee Session (committee/sessions/{id}_committee.json)
    → lock_winner() call (results/ftf_locked_config.json)
    → BASE_ENV update (PR with CodeRabbit review)
    → Helm deploy (CI/CD)

All data is retained. Any decision can be replayed.


6. Glossary

| Term | Definition |
|---|---|
| FTF | Fine-Tuning Framework: the ablation testing engine |
| ADR-56 | Every pipeline change gated by env var + FTF factor |
| Lock | A parameter value validated by FTF and committee, set as new baseline |
| BASE_ENV | The current locked configuration in ablation_matrix.py |
| Ceteris paribus | "All else equal": only ONE factor varies per test |
| BH correction | Benjamini-Hochberg false discovery rate control for multiple comparisons |
| Sortino | Risk-adjusted return (downside deviation only). Primary trading metric. |
| CUSUM | Cumulative Sum control chart: detects regime changes in volatility |
| Triple barrier | Labeling method: TP hit → BUY, SL hit → SELL, timeout → HOLD |
| Walk-forward | OOS validation: train on past, test on future, slide window forward |
| Holdout | Final validation fold never seen during ablation (fold 1, most recent) |