ADR-0099 - Staged-proof architecture: the existence-vs-selection boundary, and inter-stage validity chains
Status: active (operator-mandated 2026-06-04; durable lesson-of-record of the CVN-N001-EI thread)
Context-of-record: the S04/s42 episode, in which a Step-1 existence diagnostic overstepped into a selection (recommending lr=0.025) and treated an existence proof (better validation AUC) as a transfer proof (better in deployment). The production f1_buy search refuted that transfer. See ADR-0098, S10, and the report.
Context
The program optimises an economic objective (net trading return) that is too noisy, path-dependent, and non-stationary to optimise directly. The sound response is a staged decomposition — separation of concerns, each stage provable before composition:
- Step 1, signal discrimination (on OHLCV, primarily Top5-DeFi): is there discriminable structure?
- Step 2, engine filter tuning: can the filters raise the win-rate / net edge? (The prediction is one non-deterministic input among others.)
- Step 3, portfolio management: can allocation raise net gains?
- Integration (composition): compose all three end-to-end.
This decomposition is correct and necessary: it is the only tractable way to attack an objective too noisy to optimise in one shot, and Step 1 is entitled to a proxy (AUC, F1) because its question is existence of signal, not economic value. The thread, however, exposed two failure modes — both realised in S04 — that the decomposition does not prevent on its own:
- Boundary violation (existence → selection). A stage's legitimate output is an existence claim of bounded magnitude ("discriminable signal exists, ≈ this big"). The moment Step 1 began to select and recommend a configuration (lr=0.025), it crossed into a decision, "which config serves the economic objective", that requires the downstream objective it does not have. A signal detector cannot say which signal is the right signal for downstream.
- Existence proof treated as transfer proof. "Improves the proxy" was taken to mean "improves the economic goal". Only composition can establish that. Local rigour ≠ global validity. Stage-by-stage gains compose into end-to-end gain only under a local-optimum → global-optimum assumption that is false wherever the pipeline has non-monotone interactions, and S04 is a concrete instance (a signal that improves Step-1 AUC which the downstream f1_buy search does not want).
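The gap between an existence proof and a transfer proof can be made concrete with a small check. The sketch below is illustrative only: the configuration names and every score in it are hypothetical placeholders, not results from S04, and the rank-correlation test is one possible way to probe transfer rather than the project's actual procedure. It compares how candidate configurations rank on a stage proxy against how they rank on a downstream metric.

```python
from typing import Dict, List


def average_ranks(values: List[float]) -> List[float]:
    """1-based ascending ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of 1-based positions in the tie
        for pos in range(start, end + 1):
            result[order[pos]] = shared
        start = end + 1
    return result


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman rank correlation, computed as Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical numbers, for illustration only.
proxy_auc: Dict[str, float] = {"cfg_a": 0.61, "cfg_b": 0.64, "cfg_c": 0.66, "cfg_d": 0.63}
downstream_score: Dict[str, float] = {"cfg_a": 0.40, "cfg_b": 0.42, "cfg_c": 0.31, "cfg_d": 0.41}

configs = sorted(proxy_auc)
rho = spearman([proxy_auc[c] for c in configs], [downstream_score[c] for c in configs])
best_by_proxy = max(configs, key=proxy_auc.get)
best_downstream = max(configs, key=downstream_score.get)

# Transfer is only supported if the proxy winner also wins downstream
# and the two rankings agree in direction.
transfer_supported = best_by_proxy == best_downstream and rho > 0
print(best_by_proxy, best_downstream, round(rho, 3), transfer_supported)
```

In this toy data the proxy winner is the worst configuration downstream, which is the non-monotone interaction described above: the existence claim ("the proxy can be improved") holds while the transfer claim fails.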
Decision
The staged architecture is adopted, with four load-bearing disciplines that make composition earned rather than hoped for:
- Boundary rule — existence, not selection. Each stage proves the existence (and bounded magnitude) of its scoped property using a legitimate proxy. A stage MUST NOT make a selection / recommendation that requires the downstream or economic objective; selection is deferred to composition. (Step 1 outputs "signal exists, magnitude X" — never "deploy config Y".)
- Inter-stage validity chain — prove the seam, don't assume it. Before a stage's output is consumed downstream, its local proxy MUST be validated as a faithful contributor to the next stage / the economic goal. Each seam carries an explicit validity link; without it, "every stage proven locally" + "the whole works" is a bet, not a proof.
- Magnitude-aware sub-objectives. Where the economic goal is magnitude-sensitive, stage proxies must be too. Win-rate and class-F1 are magnitude-blind (a high win-rate can lose money: many small gains, few large losses → negative expectancy). Stage 2+ targets expectancy / net edge (win_rate × avg_gain − loss_rate × avg_loss, net of costs), not bare win-rate. (CVN_THRESHOLD_METHOD=expectancy already provides the ingredient.)
- Top-down objective definition. The economic objective defines what each stage optimises (a backward dependency through the feed-forward order), rather than each stage independently choosing a generic proxy. "Good signal" at Step 1 is informed by what Steps 2–3 reward.
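The magnitude-blindness of win-rate can be shown with a few lines of arithmetic. The sketch below implements the expectancy expression quoted above under two assumptions that the ADR does not state: every trade is either a win or a loss (so loss_rate = 1 − win_rate), and costs are a flat per-trade amount. The function name, the cost model, and the input figures are hypothetical, and this is not the implementation behind CVN_THRESHOLD_METHOD=expectancy.

```python
def expectancy(win_rate: float, avg_gain: float, avg_loss: float,
               cost_per_trade: float = 0.0) -> float:
    """Expected net result per trade.

    avg_gain and avg_loss are positive magnitudes in the same unit.
    Assumes no flat trades, so loss_rate = 1 - win_rate (an assumption).
    """
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must lie in [0, 1]")
    if avg_gain < 0 or avg_loss < 0:
        raise ValueError("avg_gain and avg_loss are magnitudes, not signed values")
    loss_rate = 1.0 - win_rate
    return win_rate * avg_gain - loss_rate * avg_loss - cost_per_trade


# Hypothetical figures: many small gains, few large losses.
high_win_rate = expectancy(win_rate=0.80, avg_gain=1.0, avg_loss=5.0)

# Hypothetical figures: fewer wins, but gains outweigh losses.
low_win_rate = expectancy(win_rate=0.40, avg_gain=3.0, avg_loss=1.0)

print(f"80% win-rate: {high_win_rate:+.2f} per trade")
print(f"40% win-rate: {low_win_rate:+.2f} per trade")
```

Under these made-up inputs the 80% win-rate strategy has negative expectancy while the 40% one is positive, so ranking by win-rate alone would pick the losing strategy.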
Invariants
- Invariant 1 — stage output is an existence claim, not a selection. A stage deliverable states "property P exists with magnitude X (CI)", not "use configuration/filter/weight Y". Any selection is flagged as out-of-stage and deferred to composition.
- Invariant 2 — no downstream consumption without a demonstrated seam-validity link. A stage output used by the next stage MUST come with evidence that its proxy contributes faithfully to that stage / the economic goal. Assumed transfer is a defect.
- Invariant 3 — magnitude-aware proxies. Sub-objectives at magnitude-sensitive stages are expectancy/net-edge, not win-rate/F1 in isolation.
- Invariant 4 — objective is top-level. Stage proxies are derived from the economic objective, not chosen locally; the derivation is stated.
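One way to make the four invariants checkable is to give stage deliverables a fixed shape and run a gate over them before downstream consumption. The sketch below is a minimal illustration, not an existing component of the program: the StageClaim type, its field names, the MAGNITUDE_BLIND set, and the example values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Proxies treated as magnitude-blind in this sketch (an assumption).
MAGNITUDE_BLIND = {"win_rate", "f1"}


@dataclass(frozen=True)
class StageClaim:
    stage: int
    property_name: str                   # what is claimed to exist
    magnitude: float                     # bounded magnitude of the property
    ci: Tuple[float, float]              # confidence interval for the magnitude
    proxy: str                           # metric used to establish the claim
    proxy_derivation: str = ""           # how the proxy follows from the economic objective
    seam_evidence: Optional[str] = None  # evidence the proxy transfers downstream
    selections: Tuple[str, ...] = ()     # any "use config Y" statements


def violations(claim: StageClaim, magnitude_sensitive: bool) -> List[str]:
    """Return the invariants a claim breaks; an empty list means it may be consumed."""
    found = []
    if claim.selections:
        found.append("Invariant 1: selection is out-of-stage, defer to composition")
    if not claim.seam_evidence:
        found.append("Invariant 2: no demonstrated seam-validity link")
    if magnitude_sensitive and claim.proxy in MAGNITUDE_BLIND:
        found.append("Invariant 3: magnitude-blind proxy at a magnitude-sensitive stage")
    if not claim.proxy_derivation:
        found.append("Invariant 4: proxy derivation from the objective is not stated")
    return found


# Hypothetical example values.
ok_claim = StageClaim(
    stage=1, property_name="discriminable signal", magnitude=0.04, ci=(0.01, 0.07),
    proxy="auc", proxy_derivation="derived from net-edge objective",
    seam_evidence="proxy ranking agrees with downstream ranking",
)
bad_claim = StageClaim(
    stage=2, property_name="filter edge", magnitude=0.10, ci=(0.02, 0.18),
    proxy="win_rate", selections=("use configuration Y",),
)

print(violations(ok_claim, magnitude_sensitive=False))
for problem in violations(bad_claim, magnitude_sensitive=True):
    print(problem)
```

Keeping selections as an explicit field, rather than forbidding it structurally, lets an out-of-stage recommendation be recorded and flagged for composition instead of being silently dropped.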
Relation to other ADRs
ADR-0098 (a diagnostic must map the deployment regime — metric and regime — at plan time) is the specific instance of Invariant 2 at the diagnostic→deployment seam. This ADR is the general architecture; ADR-0098 is one seam of it. ADR-0097 (experiment-report template / pre-registration) and ADR-0095 (diagnostic-story template) are the artefacts in which the boundary and seam checks are recorded; they gain an "existence-vs-selection + seam-validity" section.
Consequences
- The CVN-N001-EI Epic is explicitly a Step-1 (existence/discrimination) program: its diagnostics output existence claims of bounded magnitude, not configuration selections. The S04 HP-swap overstep is the lesson-of-record; S04 closes on the documented non-transfer, and its selection ambition is recognised as out-of-stage.
- Each stage seam gains a validity gate (the proxy-contributes-faithfully check); composition is built on demonstrated links, not hope.
- Some attractive within-stage results (a config that wins on the stage proxy) will be withheld from downstream until their seam link is shown — that is the point, and it is what would have saved the S04 thread up front.
- A correctly-scoped Step-1 successor (S11) asks an existence question ("does the production HPO over-fit?") and makes no selection — the boundary held by design.
References
- Thread lesson-of-record: ADR-0098, S10, S11, the published report.
- Related: ADR-14 (multi-fold generalisation), ADR-15 (theta calibrated OOS — a seam ingredient), ADR-29 (naïve baseline — an existence-proof discipline), ADR-79/80 (FTF verdict → routing — composition-stage decisions).