CVN-N001-EK-S02: Test strategy

Story artifact required by ADR-0101. S02's deliverable is a set of derivations (or typed INFEASIBLE), so "test" here means three things: validating the derivations (reproducibility, evidence grading, contract completeness, absence of snooping); confirming that the guardrails fail correctly when bypassed (negative tests); and specifying the enforcement tests that S03/S04 must implement.

1. What is under test in S02

The subject under test is not runtime code but the derived charter values, their provenance, and the discipline that produced them. Validation means that (a) each value satisfies the plan DoD, (b) each value is reproducible from its signed provenance, (c) no analysis-only / anti-snooping boundary was breached, and (d) a violation of any of (a)–(c) is detected and blocked (§4 negative tests).

2. Plan-review baseline (not re-validated here)

The D2 plan has already passed committee plan_review in Meeting #273 (strong consensus, 5/5). This test strategy does not re-validate the plan decision. It validates the S02 implementation outputs: the derived charter values, typed INFEASIBLE records, provenance, reproducibility, and boundary compliance. Only V1–V16 (§3) are S02 implementation-acceptance checks; the negative checks (§4) are mandatory alongside them.

3. S02 validation table (implementation acceptance, runnable on the outputs)

#	Check	How	Pass criterion
V1	Reference capacity by the §9.1 rule	review derivation	rule pre-specified before cost; rejected alternatives recorded; non-deployment labelled
V2	P90 cost tier + downstream wiring	review	tier ∈ {A,B,C,D}; only A/B lockable; Tier-C completes S02 only as labelled non-lockable + marks S03 blocked; Tier-D → `INFEASIBLE-cost-data`
V3	Tier-B bounded	review adjustment model	error analysis + stress factor present; else downgraded to C
V4	`E_econ_min` ≠ `E_pred_min`	inspect values	two distinct stored values + documented mapping
V5	Mapping monotonicity	review	monotonicity/stability documented; else metric not used as primary gate
V6	Power feasibility	review power report	`MDE_available` computed; compared to `E_pred_min`; `N_min` if underpowered
V7	Power-sim contract complete	checklist vs §11.1	block design · deps · purge/embargo · stat · reps · seed · sensitivity present
V8	Null-gate justified	review §12	candidates compared; primary = most-conservative-valid; invalidity rationale recorded
V9	`INFEASIBLE` typed	review	single verdict + reason + required artifact + next action (§15)
V10	Signed derivations	provenance check	every value has the §16 fields (path/SHA/hashes/code/params/author/timestamp/repro)
V11	Reproducibility (tolerance pre-declared)	re-run from provenance	tolerance declared in the derivation record before rerun; deterministic → exact reproduction (numerical tolerance within ±0.1% only where float non-determinism is documented); stochastic → declared seed + replications + CI tolerance + acceptable numerical drift, and the re-run falls within them
V12	Anti-snooping (choice provenance)	audit choices vs §8.1	each calibration choice (capacity · primary metric · null-gate · budget · universe · label/horizon) has a non-performance rationale, or any exploratory influence is recorded as prior rationale + tuple-budgeted. **"Exploratory influence" includes any post-hoc adjustment of a calibration parameter based on an observed outcome**, not only the initial choice
V13	No run performed	audit	no training / sweep / Airflow launch / Phase-2 run executed (analysis-only attested)
V14	Docs build	`mkdocs build --strict`	green, no orphan, tables render
V15	Risk-owner Tier-C boundary	inspect handoff / risk note	risk-owner approval (if any) carries Tier-C only as non-lockable context; it does not convert Tier-C into lockable P90 evidence
V16	`INFEASIBLE` semantics	review verdict record	typed `INFEASIBLE` recorded as a successful S02 outcome, S03 blocked, allowed remediation listed; no placeholder written
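
As an illustration of how V10 and the deterministic branch of V11 could be checked mechanically, the sketch below tests a derivation record for the §16 fields and compares a re-run value against the original. The field names, the record shape, and both function names are assumptions made for this example; the authoritative field list is §16, and the stochastic branch of V11 (seed, replications, CI tolerance) is not covered.

```python
# Hypothetical field names standing in for the §16 list
# (path/SHA/hashes/code/params/author/timestamp/repro); the real schema may differ.
REQUIRED_FIELDS = ("path", "sha", "hashes", "code", "params", "author", "timestamp", "repro")


def missing_provenance_fields(record: dict) -> list:
    """V10: list the required fields that are absent or empty in a derivation record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


def reproduces(original: float, rerun: float, *,
               float_drift_documented: bool = False,
               rel_tol: float = 0.001) -> bool:
    """V11, deterministic case only.

    Exact equality is required unless float non-determinism is documented,
    in which case the re-run may differ by at most rel_tol (0.1% by default)
    of the original value.
    """
    if not float_drift_documented:
        return rerun == original
    return abs(rerun - original) <= rel_tol * abs(original)
```

With these definitions, a record carrying only `path`, `sha`, and `author` would report the other five fields as missing, and a re-run that differs from the original would pass only when drift is documented and the difference stays within the tolerance. The tolerance itself still has to be declared in the derivation record before the re-run; the default argument here is only a convenience.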

4. Negative / violation checks (the guardrails must break correctly)

#	Violation	Expected result
N1	A charter value has no signed derivation	value rejected; cannot feed S03
N2	Tier-C cost is marked lockable	validation fails; S03 blocked
N3	Tier-D / placeholder cost is used	typed `INFEASIBLE-cost-data` required; Tier-D cannot be carried even as non-lockable context (unlike Tier-C)
N4	Exploratory outcomes influenced capacity / metric / null-gate / universe / action policy / budget without prior-rationale registration	anti-snooping violation; affected derivation invalid
N5	MLflow run ID cited as training / predictive-run evidence	validation fails (MLflow is provenance-only)
N6	Power sim lacks seed / replications / block design / purge-embargo mechanics / sensitivity	V7 fails
N7	Primary null relies only on a diagnostic / random-entry null	V8 fails
N8	Any Airflow launch / training job / cluster job / Phase-2 predictive run occurred	analysis-only attestation fails; S02 invalidated + escalated
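
The first three rows of the table can be expressed as a small guardrail that returns the violated check IDs for a cost value. This is a minimal sketch: the `CostValue` shape and the `violations` function are hypothetical and exist only to show the expected behaviour, including the asymmetry between Tier-C (allowed as non-lockable context) and Tier-D (never allowed).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostValue:
    """Hypothetical record shape for a derived cost value."""
    tier: str                    # one of "A", "B", "C", "D"
    lockable: bool
    has_signed_derivation: bool


def violations(value: CostValue) -> list:
    """Return the IDs of the negative checks (N1-N3) that this value trips."""
    found = []
    if not value.has_signed_derivation:
        found.append("N1")       # value rejected; cannot feed S03
    if value.tier == "C" and value.lockable:
        found.append("N2")       # validation fails; S03 blocked
    if value.tier == "D":
        found.append("N3")       # typed INFEASIBLE-cost-data required
    return found
```

A negative test then asserts that a deliberately bad value is flagged, for example that `violations(CostValue("C", True, True))` contains `"N2"`, while the same Tier-C value marked non-lockable returns an empty list.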

5. Downstream enforcement tests (contractual; built in S03/S04)

Future test	Owner	Trigger	Blocking condition	Evidence required
Tier-A/B lockability gate	S03	charter-lock attempt	any cost value Tier-C/D or missing tier	signed cost derivation at Tier A/B
Charter immutability	S03	post-lock edit	edit without a recorded re-lock	re-lock record + joint sign-off
Reference-capacity non-deployment guard	S03	capacity referenced	capacity read as deployment/AUM	non-deployment label on the value
Power-contract presence	S04	Phase-2 run start	missing locked `E_pred_min` / null-gate	locked S02/S03 power values
Provenance integrity	S03/S04	value used	value not resolvable to immutable signed derivation	provenance record
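
For the provenance-integrity row, one possible shape of the S03/S04 check is sketched below: a value is usable only if it resolves to a provenance record whose stored digest matches the bytes of the derivation artifact. The use of SHA-256, the dictionary layouts, and the name `is_resolvable` are assumptions for this example. Signature verification is deliberately left out, since the signing scheme is not specified here.

```python
import hashlib


def is_resolvable(value_ref: str, records: dict, artifacts: dict) -> bool:
    """Return True only if value_ref maps to a record whose digest matches the artifact.

    records:   value_ref -> {"artifact": key, "sha256": hex digest}  (hypothetical shape)
    artifacts: key -> bytes of the derivation artifact
    """
    record = records.get(value_ref)
    if record is None:
        return False             # no provenance record at all
    payload = artifacts.get(record.get("artifact"))
    if payload is None:
        return False             # record points at a missing artifact
    # A mismatch means the artifact changed after the digest was recorded.
    return hashlib.sha256(payload).hexdigest() == record.get("sha256")
```

Returning `False` for a missing record, a missing artifact, and a digest mismatch alike keeps the blocking condition conservative: anything short of a full match is treated as not resolvable.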

6. Validation evidence matrix (required in the implementation PR)

The implementation PR MUST include one evidence row per V- and N-check:

Check ID	Evidence artifact / link	Reviewer	Result	Notes
V1			pass/fail/n/a
…
N1			pass/fail/n/a
…
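
The "one evidence row per V- and N-check" requirement lends itself to an automated completeness check on the matrix. The sketch below assumes the matrix has been parsed into a list of dictionaries with `check_id` and `result` keys; that row shape and the function name `matrix_problems` are illustrative, not part of the plan.

```python
# Expected check IDs per this strategy: V1-V16 and N1-N8.
EXPECTED_IDS = [f"V{i}" for i in range(1, 17)] + [f"N{i}" for i in range(1, 9)]
ALLOWED_RESULTS = {"pass", "fail", "n/a"}


def matrix_problems(rows: list) -> list:
    """Return human-readable problems; an empty list means the matrix is complete.

    rows: list of dicts with 'check_id' and 'result' keys (hypothetical shape).
    """
    problems = []
    seen = [row.get("check_id") for row in rows]
    for check_id in EXPECTED_IDS:
        count = seen.count(check_id)
        if count == 0:
            problems.append(f"{check_id}: missing row")
        elif count > 1:
            problems.append(f"{check_id}: {count} rows, expected 1")
    for row in rows:
        if row.get("check_id") not in EXPECTED_IDS:
            problems.append(f"{row.get('check_id')}: unknown check id")
        if row.get("result") not in ALLOWED_RESULTS:
            problems.append(f"{row.get('check_id')}: result must be pass/fail/n/a")
    return problems
```

This only verifies the shape of the matrix (coverage, uniqueness, allowed result values). Whether each linked evidence artifact actually supports the recorded result remains a reviewer judgment.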

7. Test data / fixtures

Validation uses existing, read-only inputs only: existing OHLCV cache · existing ATR-H4 labels · existing trade/cost logs · signed derivation artifacts · plan / architecture / runbook references. No synthetic predictive outputs and no fabricated run-like fixtures — any artifact resembling a model run violates the analysis-only boundary (N8).

8. Non-applicable

Performance / load / integration testing is N/A for S02 (no runtime code) — rationale recorded per ADR-0101 Invariant 3; reviewer to accept. N/A does not mean untested: S02 validation is documentary / provenance / control validation (§3–§4). Any runtime/integration test discovered as necessary belongs to S04+ and must not be executed under S02.

9. Definition of test-done (S02)

V1–V16 and N1–N8 pass (or the corresponding typed INFEASIBLE is recorded with its artifact); the implementation PR includes the validation evidence matrix (§6) with one row per check; the downstream enforcement table (§5) is carried into the S03/S04 plans so each contract has an owner.