ADR-0095 — Diagnostic Stories follow the canonical 5-artifact template + pre-registered methodological invariants

Status: active (operator-mandated 2026-06-02). Committee plan_review ae4ac80f REJECTED the plan on 1 blocker, a missing explicit multiple-testing-control invariant; the blocker was addressed via Invariant 7 on 2026-06-02 (architect/ops/ml-engineer/crypto-trader 9/9, data-scientist 7/9).

Context-of-record: CVN-N001-EI signal diagnostic program (S03/S04/S05). Reference implementation: CVN-N001-EI-S05 (the canonical pattern).

Context

The EI diagnostic program produced a high-quality, repeatable shape for documentation and method (S05): a plan that moves from non-technical problematisation → user stories → hypotheses → state-of-the-art → consolidation, plus pre-registered decision rules, a Hamilton-native architecture, an operator runbook, and a test strategy whose decision table caught a real logic bug on paper before any code was written. This quality emerged Story by Story, through individual craft rather than a mandated process. Without a mandated shape it would regress to ad-hoc dossiers, the same drift that ADR-77 (docs SSoT) and ADR-70 (MLOps readiness) prevent in their domains. Diagnostic Stories also share a methodological core (pre-registration, envelope bootstrap, multiple-testing control, significance-keyed decisions, honest inconclusives, no-crash). Skipping that core produces exactly the failure classes the program exists to avoid (data-snooping, selection bias, point-estimate misclassification, silent crashes).

Decision

Every diagnostic Story, meaning a Story that produces a verdict about a model/pipeline via the two-layer /diagnostic-scaffold pattern (sNN), MUST follow the canonical template documentation/templates/TEMPLATE_diagnostic_story.md: the 5 artifacts (hub · plan · architecture · runbook · test strategy) are published on docs.cvntrade.eu (ADR-77) and hubbed under the Epic landing page. Every diagnostic Story MUST also honour the methodological invariants below. This ADR complements ADR-68 (committee plan_review), ADR-70 (MLOps readiness), ADR-83 (test taxonomy), and ADR-0093/0094 (diagnostic-scaffold invariants). It does not apply to non-diagnostic Stories (feature/infra/tooling).
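A presence check for the five artifacts could look roughly like the sketch below. It is illustrative only: the ADR names stories/<cvn_id>/index.md as the hub, but the other four file names (plan.md, architecture.md, runbook.md, test_strategy.md), the docs_root layout, and both helper functions are assumptions made for this example, not the template's actual naming.

```python
from pathlib import Path

# Only index.md (the hub) is named in the ADR. The other four file names
# are hypothetical placeholders for the plan, architecture, runbook and
# test-strategy artifacts.
REQUIRED_ARTIFACTS = (
    "index.md",
    "plan.md",
    "architecture.md",
    "runbook.md",
    "test_strategy.md",
)


def missing_artifacts(present: set[str]) -> list[str]:
    """Return the required artifact names absent from `present`, in template order."""
    return [name for name in REQUIRED_ARTIFACTS if name not in present]


def missing_on_disk(docs_root: Path, cvn_id: str) -> list[str]:
    """Check stories/<cvn_id>/ under docs_root (assumed layout) for the five artifacts."""
    story_dir = docs_root / "stories" / cvn_id
    if not story_dir.is_dir():
        # No Story directory at all: every artifact is missing.
        return list(REQUIRED_ARTIFACTS)
    present = {path.name for path in story_dir.glob("*.md")}
    return missing_artifacts(present)
```

A CI job might fail the build whenever missing_on_disk returns a non-empty list. This covers presence only; reachability from the Epic hub would still rely on the docs-site strict build.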

Invariants

  • Invariant 1 — 5 artifacts present + linked: the Story has stories/<cvn_id>/index.md (hub), a plan dossier, an architecture doc, a runbook, and a test strategy, all rendered on docs.cvntrade.eu and reachable from the Epic hub (epics/<epic>-*.md). Testable: docs-site strict build + presence check.
  • Invariant 2 — decision rule pre-registered before the run: the verdict rule (pseudo-code), thresholds, tie-breaks, and inconclusive-priority live in the plan §1 before the run; never invented at test/code time (anti plan↔code divergence + anti data-snooping, Nosek 2018).
  • Invariant 3 — significance-keyed decisions: the decision keys on the CI position (significance), never on the point estimate (S05 r2.7 bug class: a non-significant point estimate > 0 is not an effect).
  • Invariant 4 — first-class inconclusives + no-crash: INCONCLUSIVE_* verdicts are valid outcomes; every error path yields a structured INCONCLUSIVE_TOOLING, never a raise to the operator UI (sharpens ADR-25), no NaN propagated, no print (ADR-31).
  • Invariant 5 — un-measured decisional inputs gate the full run: any value that drives the verdict but is not yet measured (e.g. real trading costs) gates the decisional full run; placeholders are allowed for the smoke (wiring) run only, and must be marked non-validated.
  • Invariant 6 — exhaustive decision table + test: the test strategy includes an exhaustiveness test (cartesian product of decision classes → exactly-one non-None verdict), not branch-coverage alone (which misses fall-throughs).
  • Invariant 7 — explicit multiple-testing control (committee plan_review ae4ac80f blocker, added): when a diagnostic runs more than one decisional test (across model families, ablation axes, or sweep thresholds), it MUST apply an explicit, pre-registered family-wise error control (Bonferroni or equivalent FWER procedure) to the decisional tests. The envelope statistic (Inv-2 method) controls the within-sweep selection bias; Invariant 7 controls the across-test multiplicity — both are required for statistical validity, and the corrected α (e.g. 0.05/k) is recorded in the plan. Testable: the plan states the number of decisional tests and the corrected α; a single-test diagnostic states k=1 explicitly.
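Invariants 3, 4, 6 and 7 can be illustrated together in a minimal sketch. It is not the S05 implementation. Only INCONCLUSIVE_TOOLING is a verdict name taken from this ADR; the other verdict names, the three CI-position classes, and every function name are hypothetical. The confidence interval passed to decide is assumed to have been computed at the Bonferroni-corrected level 1 − α/k.

```python
import math
from enum import Enum
from itertools import product


class Verdict(Enum):
    # Only INCONCLUSIVE_TOOLING is named in the ADR; the rest are hypothetical.
    EFFECT_POSITIVE = "EFFECT_POSITIVE"
    EFFECT_NEGATIVE = "EFFECT_NEGATIVE"
    INCONCLUSIVE_NOT_SIGNIFICANT = "INCONCLUSIVE_NOT_SIGNIFICANT"
    INCONCLUSIVE_TOOLING = "INCONCLUSIVE_TOOLING"


# Hypothetical decision classes for the position of the CI relative to zero.
POSITIONS = ("above_zero", "below_zero", "straddles_zero")

# Decision table as (predicate, verdict) rows over (tooling_ok, position).
RULES = [
    (lambda ok, pos: not ok, Verdict.INCONCLUSIVE_TOOLING),
    (lambda ok, pos: ok and pos == "above_zero", Verdict.EFFECT_POSITIVE),
    (lambda ok, pos: ok and pos == "below_zero", Verdict.EFFECT_NEGATIVE),
    (lambda ok, pos: ok and pos == "straddles_zero",
     Verdict.INCONCLUSIVE_NOT_SIGNIFICANT),
]


def bonferroni_alpha(alpha: float, k: int) -> float:
    """Invariant 7: per-test alpha for k decisional tests (plan-time helper)."""
    if k < 1:
        raise ValueError("k must be >= 1; a single-test diagnostic states k=1")
    return alpha / k


def classify(ci_low, ci_high):
    """Map raw CI bounds to (tooling_ok, position); bad inputs are never raised."""
    try:
        if math.isnan(ci_low) or math.isnan(ci_high) or ci_low > ci_high:
            return False, None
    except TypeError:  # e.g. None or a non-numeric bound
        return False, None
    if ci_low > 0:
        return True, "above_zero"
    if ci_high < 0:
        return True, "below_zero"
    return True, "straddles_zero"


def matching_verdicts(ok, pos):
    """All verdicts whose rule fires for this cell of the decision table."""
    return [verdict for predicate, verdict in RULES if predicate(ok, pos)]


def decide(ci_low, ci_high) -> Verdict:
    """Invariants 3 and 4: key on the CI position; no raise, no NaN propagated."""
    matches = matching_verdicts(*classify(ci_low, ci_high))
    return matches[0] if len(matches) == 1 else Verdict.INCONCLUSIVE_TOOLING


def check_exhaustive() -> int:
    """Invariant 6: every cell of the cartesian product fires exactly one rule."""
    cells = list(product((True, False), POSITIONS))
    for ok, pos in cells:
        matches = matching_verdicts(ok, pos)
        assert len(matches) == 1, (ok, pos, matches)
    return len(cells)
```

With this table, decide(-0.01, 0.04) yields INCONCLUSIVE_NOT_SIGNIFICANT even though the midpoint of the interval is positive, which is the S05 r2.7 bug class the significance-keyed rule guards against. Deleting or duplicating a row in RULES makes check_exhaustive fail on the affected cell, a fall-through that branch coverage alone would not flag.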

Alternatives rejected

  • OpenProject wiki as the doc store — rejected: two doc stores drift; ADR-77 makes the MkDocs site the SSoT and OP the orchestration layer that links to it.
  • Free-form per-Story dossiers — rejected: artisanal quality regresses; the bug-catching value (decision table on paper) depends on the mandated shape.
  • Bake the template into the scaffold generator only — insufficient: the scaffold emits code stubs, not the 5 narrative/spec artifacts; the invariants are about method + docs, not file generation.

Consequences

  • Diagnostic Stories are discoverable (Epic hub → Story hub → artifacts) and consistent; quality becomes systemic.
  • New diagnostic Stories copy TEMPLATE_diagnostic_story.md; S05 is the worked reference.
  • Slightly higher upfront doc cost per Story — offset by bugs caught on paper (S05 r2.7) and reusable structure.
  • Cross-references: the Epic hub + OP work package both link the docs.cvntrade.eu URLs (ADR-76).

Rollback

Revert the template + this ADR; diagnostic Stories fall back to free-form dossiers. No code impact (the invariants are method/doc conventions; the scaffold + committee gates remain).

References

  • documentation/templates/TEMPLATE_diagnostic_story.md (the template)
  • Reference Story: stories/CVN-N001-EI-S05/index.md, Epic hub epics/CVN-N001-EI-signal-diagnostic-program.md
  • Skill: /diagnostic-scaffold (CVN-N014-EC-S05)
  • Related ADRs: ADR-77 (docs SSoT), ADR-76 (OP orchestration), ADR-68 (committee), ADR-70 (MLOps readiness), ADR-83 (test taxonomy), ADR-25 (no silent fallback), ADR-0093/0094 (diagnostic-scaffold invariants)