ADR-63 — FTF Mission Mode is Binary BUY/NOT_BUY; 3-class is Legacy

Status: Active as of 2026-04-22 (issue #608 F1=0.80 mission).

Context: The FTF framework historically trains 3-class classifiers (SELL / HOLD / BUY). Operator directive 2026-04-22 mandates a focused push on f1_buy ≥ 0.80 on binary BUY/NOT_BUY. Running 3-class models in parallel dilutes the compute budget and makes variant comparisons ambiguous — a factor that improves the HOLD boundary doesn't translate to better BUY decisions. The binary path now has end-to-end infrastructure (label remap, HPO branch, ThresholdCalibrator), so the default should match the mission.
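
The dilution argument can be illustrated with a minimal sketch of the label remap. The integer encoding and the helper names `remap_to_binary` and `f1_positive` are assumptions for illustration and are not taken from the FTF codebase. Two 3-class predictions that differ only in how they split SELL from HOLD collapse to the same binary prediction, and therefore the same f1_buy.

```python
# Hypothetical integer encoding; the actual FTF label values are not stated in this ADR.
SELL, HOLD, BUY = 0, 1, 2


def remap_to_binary(labels):
    """Collapse SELL and HOLD into NOT_BUY (0); keep BUY as the positive class (1)."""
    return [1 if y == BUY else 0 for y in labels]


def f1_positive(y_true, y_pred):
    """F1 of the positive class (BUY) computed from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


y_true = [SELL, HOLD, BUY, BUY, HOLD]
pred_a = [SELL, HOLD, BUY, HOLD, BUY]  # SELL/HOLD boundary correct
pred_b = [HOLD, SELL, BUY, HOLD, BUY]  # SELL/HOLD boundary wrong

f1_a = f1_positive(remap_to_binary(y_true), remap_to_binary(pred_a))
f1_b = f1_positive(remap_to_binary(y_true), remap_to_binary(pred_b))
# f1_a == f1_b: the SELL/HOLD boundary has no effect on the BUY metric.
```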

Decision: The FTF baseline (ftf_config.base_env in PostgreSQL, ADR-59) pins the mission defaults:

CVN_BINARY_CLASSIFICATION=1
CVN_HPO_OBJECTIVE=f1_binary
CVN_HPO_N_TRIALS=50
CVN_THRESHOLD_METHOD=f1_binary

3-class mode is legacy — kept available via the classification_mode.3class ablation factor for audit comparisons, but never the default.
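
As a rough sketch of how the pinned defaults and factor overrides could interact: the function `resolve_env`, the `BINARY_ONLY` set, and the objective and threshold names `logloss` and `argmax` are hypothetical, and the exact compatibility rule enforced by the hyperoptimizer is not spelled out in this ADR. The sketch assumes that a binary-only objective or threshold method combined with 3-class mode counts as incompatible.

```python
# Mission defaults as listed above (values kept as strings, like environment variables).
BASE_ENV = {
    "CVN_BINARY_CLASSIFICATION": "1",
    "CVN_HPO_OBJECTIVE": "f1_binary",
    "CVN_HPO_N_TRIALS": "50",
    "CVN_THRESHOLD_METHOD": "f1_binary",
}

# Assumption: these values only make sense when binary mode is on.
BINARY_ONLY = {"f1_binary"}


def resolve_env(overrides=None):
    """Merge factor overrides onto the base env; return (env, legacy)."""
    env = dict(BASE_ENV)
    env.update(overrides or {})
    binary = env["CVN_BINARY_CLASSIFICATION"] == "1"
    if not binary:
        for key in ("CVN_HPO_OBJECTIVE", "CVN_THRESHOLD_METHOD"):
            if env[key] in BINARY_ONLY:
                # Fail fast instead of silently training with a mismatched metric.
                raise ValueError(f"{key}={env[key]} requires CVN_BINARY_CLASSIFICATION=1")
    # Any run that forces 3-class mode is an audit comparison, never a promotion candidate.
    return env, not binary


env, legacy = resolve_env()  # no overrides: binary mission defaults, not legacy

audit_env, audit_legacy = resolve_env({
    "CVN_BINARY_CLASSIFICATION": "0",
    "CVN_HPO_OBJECTIVE": "logloss",    # hypothetical 3-class objective name
    "CVN_THRESHOLD_METHOD": "argmax",  # hypothetical 3-class method name
})
```

Overriding only CVN_BINARY_CLASSIFICATION=0 while leaving the objective at f1_binary raises in this sketch, which mirrors the fail-fast intent rather than the exact production check.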

Invariants:

  • Binary is the default: new FTF runs without explicit factor overrides use CVN_BINARY_CLASSIFICATION=1.
  • HPO objective matches mode: when binary mode is on, CVN_HPO_OBJECTIVE=f1_binary is the default; the hyperoptimizer fails fast (ADR-25) if an incompatible combination is selected (PR β, commit bc523dc3).
  • Threshold method matches mode: post-HPO threshold picking uses CVN_THRESHOLD_METHOD=f1_binary (ThresholdCalibrator, PR δ-2) — aligned with the optimization metric.
  • n_trials is raised to 50 (from the previous 30), because the f1 landscape has more local optima than logloss (committee rec 11).
  • 3-class runs are marked legacy=true: any factor variant that forces CVN_BINARY_CLASSIFICATION=0 is treated as an audit comparison, not a contender for promotion.
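
The threshold invariant above can be sketched as a simple grid sweep that picks the decision threshold maximizing binary F1. This is an illustration only, not the ThresholdCalibrator implementation: the function names, the grid, and the toy data are assumptions.

```python
def f1_at_threshold(y_true, proba, theta):
    """Binary F1 for the BUY class when predicting BUY iff proba >= theta."""
    tp = fp = fn = 0
    for t, p in zip(y_true, proba):
        pred = 1 if p >= theta else 0
        if pred == 1 and t == 1:
            tp += 1
        elif pred == 1 and t == 0:
            fp += 1
        elif pred == 0 and t == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def pick_threshold(y_true, proba, grid=None):
    """Return the grid threshold with the highest F1 (lowest threshold wins ties)."""
    if grid is None:
        grid = [i / 100 for i in range(5, 100, 5)]  # 0.05, 0.10, ..., 0.95
    return max(grid, key=lambda theta: f1_at_threshold(y_true, proba, theta))


# Toy validation fold: 1 = BUY, 0 = NOT_BUY, with predicted BUY probabilities.
y_true = [0, 0, 1, 1, 0, 1]
proba = [0.10, 0.40, 0.35, 0.80, 0.55, 0.70]

theta = pick_threshold(y_true, proba)
best_f1 = f1_at_threshold(y_true, proba, theta)
```

Selecting the threshold with the same metric that HPO optimizes keeps the two stages from pulling toward different operating points.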

Rollback: to revert to 3-class as default, flip the base_env values via Console UI (no code change). The pipeline supports both modes — only the default moves.

Alternatives rejected:

  • Keep 3-class (SELL/HOLD/BUY) as the default: the mission metric is f1_buy, not multi-class accuracy. A factor that improves the HOLD boundary doesn't translate to better BUY decisions, so 3-class compute dilutes the budget without moving the needle on the target.
  • Hybrid routing (run both binary and 3-class per factor variant): doubles compute per FTF run (~6h instead of 3h) with no added signal for the mission — the 3-class results never drive a decision. Cheaper to run 3-class as an explicit audit variant on demand.
  • Keep HPO n_trials at 30: the f1 landscape has more local optima than logloss (committee rec 11). 30 trials often converge to a flat plateau around mid-range theta, whereas 50 trials consistently surface the f1-maximizing region. The added cost is ~10 minutes per fold, acceptable given the risk of under-sampling.
  • Keep fbeta_buy as the HPO objective (mathematically equivalent to f1_binary when CVN_BUY_BETA=1.0): same metric, different name. f1_binary is the explicit, mission-aligned name — reduces ambiguity in reports and PDFs (ADR-57: ADRs and labels in English, matching the metric, not aliased).
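
The equivalence claimed for fbeta_buy follows from the standard F-beta definition. Here P and R denote precision and recall of the BUY class, and beta corresponds to CVN_BUY_BETA (that mapping is assumed from the variable name):

```latex
F_\beta = \frac{(1+\beta^2)\, P R}{\beta^2 P + R}
\qquad\Longrightarrow\qquad
F_1 = \left. F_\beta \right|_{\beta = 1} = \frac{2 P R}{P + R}
```

At beta = 1 the two objectives are the same function, so the choice between them affects only naming and reporting, not the optimum.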

Files: ftf_config.base_env (PostgreSQL row id=1), managed via Console UI. No git change required (ADR-59).