ADR-0099 — Staged-proof architecture: the existence-vs-selection boundary, and inter-stage validity chains

Status: active (operator-mandated 2026-06-04; durable lesson-of-record of the CVN-N001-EI thread)

Context-of-record: the S04/s42 episode, in which a Step-1 existence diagnostic overstepped into a selection (recommending lr=0.025) and treated an existence proof (better validation AUC) as a transfer proof (better in deployment), which the production f1_buy search refuted. See ADR-0098, S10, and the report.

Context

The program optimises an economic objective (net trading return) that is too noisy, path-dependent, and non-stationary to optimise directly. The sound response is a staged decomposition — separation of concerns, each stage provable before composition:

  1. Signal discrimination (on OHLCV, primarily Top5-DeFi) — is there discriminable structure?
  2. Engine filter tuning — can the filters raise the win-rate / net edge? (the prediction is one non-deterministic input among others)
  3. Portfolio management — can allocation raise net gains?
  4. Integration — compose all three end-to-end.

This decomposition is correct and necessary: it is the only tractable way to attack an objective too noisy to optimise in one shot, and Step 1 is entitled to a proxy (AUC, F1) because its question is existence of signal, not economic value. The thread, however, exposed two failure modes — both realised in S04 — that the decomposition does not prevent on its own:

  • Boundary violation (existence → selection). A stage's legitimate output is an existence claim of bounded magnitude ("discriminable signal exists, ≈ this big"). The moment Step 1 began to select and recommend a configuration (lr=0.025), it crossed into a decision — "which config serves the economic objective" — that requires the downstream objective it does not have. A signal detector cannot say which signal is the right signal for downstream.
  • Existence proof treated as transfer proof. "Improves the proxy" was taken to mean "improves the economic goal". Only composition can establish that. Local rigour ≠ global validity. Stage-by-stage gains compose into end-to-end gain only under a local-optimum → global-optimum assumption that is false wherever the pipeline has non-monotone interactions. S04 is a concrete instance: a signal that improves Step-1 AUC but that the downstream f1_buy search does not want.
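
The second failure mode can be made concrete with a toy comparison. The sketch below uses made-up numbers (they are not results from S04) and hypothetical config names; it only shows that the configuration with the best stage proxy need not be the one the composed pipeline prefers.

```python
# Hypothetical numbers for illustration only; NOT results from S04.
# Stage-1 proxy (validation AUC) per candidate configuration.
stage1_auc = {"cfg_a": 0.61, "cfg_b": 0.64}

# Score of the composed pipeline after the downstream stage has been
# searched on top of each configuration (again, invented values).
downstream_score = {"cfg_a": 0.42, "cfg_b": 0.37}


def argmax(scores):
    """Return the key with the highest score."""
    return max(scores, key=scores.get)


local_pick = argmax(stage1_auc)           # what a Step-1 "selection" would recommend
composed_pick = argmax(downstream_score)  # what composition actually prefers

# Transfer holds only if the two picks agree; here they do not.
transfer_holds = local_pick == composed_pick
```

With these values `local_pick` is `cfg_b` while `composed_pick` is `cfg_a`: the stage proxy proves that signal exists, but it cannot rank configurations for the downstream objective.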

Decision

The staged architecture is adopted, with four load-bearing disciplines that make composition earned rather than hoped:

  1. Boundary rule — existence, not selection. Each stage proves the existence (and bounded magnitude) of its scoped property using a legitimate proxy. A stage MUST NOT make a selection / recommendation that requires the downstream or economic objective; selection is deferred to composition. (Step 1 outputs "signal exists, magnitude X" — never "deploy config Y".)
  2. Inter-stage validity chain — prove the seam, don't assume it. Before a stage's output is consumed downstream, its local proxy MUST be validated as a faithful contributor to the next stage / the economic goal. Each seam carries an explicit validity link; without it, "every stage proven locally" + "the whole works" is a bet, not a proof.
  3. Magnitude-aware sub-objectives. Where the economic goal is magnitude-sensitive, stage proxies must be too. Win-rate and class-F1 are magnitude-blind (a high win-rate can lose money: many small gains, few large losses → negative expectancy). Stage 2+ targets expectancy / net edge (win_rate × avg_gain − loss_rate × avg_loss, net of costs), not bare win-rate. (CVN_THRESHOLD_METHOD=expectancy already provides the ingredient.)
  4. Top-down objective definition. The economic objective defines what each stage optimises (a backward dependency through the feed-forward order), rather than each stage independently choosing a generic proxy. "Good signal" at Step 1 is informed by what Steps 2–3 reward.
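
The magnitude-blindness in discipline 3 can be checked with a few lines of arithmetic. The following is a minimal sketch of the expectancy formula quoted above. Two things are assumptions rather than anything this ADR specifies: `loss_rate` is taken as `1 - win_rate` (no break-even trades), and costs are modelled as a flat `cost_per_trade`. It is not the implementation behind `CVN_THRESHOLD_METHOD=expectancy`.

```python
def expectancy(win_rate, avg_gain, avg_loss, cost_per_trade=0.0):
    """Per-trade expectancy: win_rate * avg_gain - loss_rate * avg_loss - costs.

    avg_gain and avg_loss are positive magnitudes in the same unit.
    Assumption: every trade is either a win or a loss, so
    loss_rate = 1 - win_rate. cost_per_trade is a hypothetical flat cost.
    """
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must be in [0, 1]")
    loss_rate = 1.0 - win_rate
    return win_rate * avg_gain - loss_rate * avg_loss - cost_per_trade


# Many small gains, few large losses: 80% win-rate, yet negative expectancy.
high_win_rate = expectancy(win_rate=0.80, avg_gain=1.0, avg_loss=5.0)

# Few large gains, many small losses: 40% win-rate, positive expectancy.
low_win_rate = expectancy(win_rate=0.40, avg_gain=3.0, avg_loss=1.0)
```

The first case comes out at about -0.2 per trade and the second at about +0.6, so ranking by win-rate alone would prefer the losing configuration.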

Invariants

  • Invariant 1 — stage output is an existence claim, not a selection. A stage deliverable states "property P exists with magnitude X (CI)", not "use configuration/filter/weight Y". Any selection is flagged as out-of-stage and deferred to composition.
  • Invariant 2 — no downstream consumption without a demonstrated seam-validity link. A stage output used by the next stage MUST come with evidence that its proxy contributes faithfully to that stage / the economic goal. Assumed transfer is a defect.
  • Invariant 3 — magnitude-aware proxies. Sub-objectives at magnitude-sensitive stages are expectancy/net-edge, not win-rate/F1 in isolation.
  • Invariant 4 — objective is top-level. Stage proxies are derived from the economic objective, not chosen locally; the derivation is stated.
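
Invariants 1 and 2 could be carried by the shape of the stage deliverable itself. The sketch below is one possible encoding, not an existing artefact of the program: `ExistenceClaim`, `SeamValidityError`, `consume_downstream`, and all field names are hypothetical. The record deliberately has no field for a chosen configuration, and downstream consumption is refused while the seam-evidence reference is missing.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


class SeamValidityError(Exception):
    """Raised when a stage output is consumed without a seam-validity link."""


@dataclass(frozen=True)
class ExistenceClaim:
    """Hypothetical stage deliverable: 'property P exists with magnitude X (CI)'.

    There is intentionally no 'selected configuration' field (Invariant 1).
    """
    stage: str
    property_name: str
    proxy: str                            # e.g. "validation AUC"
    magnitude: float
    ci: Tuple[float, float]               # (lower, upper) bounds
    seam_evidence: Optional[str] = None   # reference to the demonstrated link

    def __post_init__(self):
        lower, upper = self.ci
        if lower > upper:
            raise ValueError("CI bounds are out of order")
        if not lower <= self.magnitude <= upper:
            raise ValueError("magnitude lies outside its own CI")


def consume_downstream(claim):
    """Gate for Invariant 2: refuse claims that only assume transfer."""
    if not claim.seam_evidence:
        raise SeamValidityError(
            f"{claim.stage}: proxy '{claim.proxy}' has no seam-validity link"
        )
    return claim
```

A claim built without `seam_evidence` is still a valid Step-1 result; it simply cannot pass `consume_downstream` until the link is recorded.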

Relation to other ADRs

ADR-0098 (a diagnostic must map the deployment regime — metric and regime — at plan time) is the specific instance of Invariant 2 at the diagnostic→deployment seam. This ADR is the general architecture; ADR-0098 is one seam of it. ADR-0097 (experiment-report template / pre-registration) and ADR-0095 (diagnostic-story template) are the artefacts in which the boundary and seam checks are recorded; they gain an "existence-vs-selection + seam-validity" section.

Consequences

  • The CVN-N001-EI Epic is explicitly a Step-1 (existence/discrimination) program: its diagnostics output existence claims of bounded magnitude, not configuration selections. The S04 HP-swap overstep is the lesson-of-record; S04 closes on the documented non-transfer, and its selection ambition is recognised as out-of-stage.
  • Each stage seam gains a validity gate (the proxy-contributes-faithfully check); composition is built on demonstrated links, not hope.
  • Some attractive within-stage results (a config that wins on the stage proxy) will be withheld from downstream until their seam link is shown — that is the point, and it is what would have saved the S04 thread up front.
  • A correctly-scoped Step-1 successor (S11) asks an existence question ("does the production HPO over-fit?") and makes no selection — the boundary held by design.

References

  • Thread lesson-of-record: ADR-0098, S10, S11, the published report.
  • Related: ADR-14 (multi-fold generalisation), ADR-15 (theta calibrated OOS — a seam ingredient), ADR-29 (naïve baseline — an existence-proof discipline), ADR-79/80 (FTF verdict → routing — composition-stage decisions).