Experiment report (ADR-0097)
Canonical experiment-report template (ADR-0097). Every experiment report under
documentation/reports/ MUST follow this structure. Italicised blocks beginning "Template note" explain each section and are deleted in a concrete report. A filled worked example: Multi-fold stability filtering of HP recommendations (2026-06-03).

Non-negotiable invariants (ADR-0097). A report is non-conformant without them:

1. Pre-registration BEFORE results (§3): hypotheses + decision rules registered with a link + timestamp prior to observing the out-of-sample / confirmatory data. A report that sets its decision rule after seeing results is exploratory, and must say so.
2. Effect sizes + confidence intervals, not bare p-values (§6) [Wasserstein2016].
3. Threats to validity / Limitations is first-class (§8), each naming the inference it threatens + its mitigation/follow-up.
4. Reproducibility statement (§10): run IDs, code commit/SHA, calendar windows (not fold indices), software versions.
5. Standalone + anonymisable: an outside reader follows it without project jargon (Glossary defines every internal term; assets anonymised for external circulation).
Abstract
Template note — Structured abstract, ≤250 words: Background, Objective, Methods, Results, Conclusion. Self-contained. State the pre-registered decision rule and report effect sizes with CIs, not bare p-values.
{{...}}
1. Introduction
Template note — Frame the problem for a reader outside the project: (i) the domain problem, (ii) why it matters, (iii) the specific question this experiment answers, (iv) the contribution. Define or avoid internal acronyms (see Glossary).
{{...}}
2. Background and Related Work
Template note — Situate the methods in the literature so the report stands alone. Keep to what is actually used (time-series CV, selection bias / backtest overfitting, pre-registration / multiplicity, bootstrap inference, equivalence testing, the model family). Cite, do not summarise at length.
{{...}}
3. Hypotheses and Pre-Registration
Template note — Load-bearing section (ADR-0097 Inv 1). State hypotheses and the decision rule before results; record when/where they were registered (link + timestamp). Do not edit after seeing confirmatory data. This is what separates a confirmatory report from an exploratory one.
Pre-registration. Hypotheses, scope, procedure, and decision rules below were registered
at {{LINK}} ({{COMMIT/HASH}}, {{TIMESTAMP}}), prior to {{the confirmatory data being
observed}}.
H1 … {{hypothesis}}
Decision rule. {{retain/reject criteria, operationalised}}
Pre-registered scope / multiplicity policy. {{which comparisons are confirmatory; what is exploratory; how family-wise error / forking-paths is controlled}}
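One way the multiplicity policy might be operationalised is Holm's step-down procedure over the confirmatory family. The sketch below is illustrative only: the hypothesis labels, the p-values, and the function name `holm_reject` are made up for the example, and the template does not mandate this particular correction. Per §6, any such decision would still be reported alongside effect sizes and intervals.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down: map each comparison name to True if it is rejected.

    p_values: dict of {comparison name: p-value} for the confirmatory family.
    Controls the family-wise error rate at `alpha`.
    """
    ordered = sorted(p_values.items(), key=lambda kv: kv[1])
    m = len(ordered)
    rejected = {name: False for name in p_values}
    for rank, (name, p) in enumerate(ordered):
        # The k-th smallest p-value (rank = k - 1) is compared to alpha / (m - k + 1).
        if p <= alpha / (m - rank):
            rejected[name] = True
        else:
            break  # step-down: the first failure stops all further rejections
    return rejected


# Hypothetical p-values for three pre-registered comparisons.
confirmatory = {"H1": 0.004, "H2": 0.040, "H3": 0.030}
decisions = holm_reject(confirmatory, alpha=0.05)
print(decisions)
```

Fixing the family (here three comparisons) and `alpha` in the registered design, before results, is what makes the correction meaningful; comparisons added afterwards belong in the exploratory set.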
4. Data and Experimental Setup
Template note — Everything needed to reproduce: universe, timeframe, fold construction (with exact calendar windows — pin windows, not indices), grid, seeds, budget, software versions, compute image. Flag any time-anchoring non-reproducibility as a caveat here.
| Item | Value |
|---|---|
| {{...}} | {{...}} |
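As a sketch of what "pin windows, not indices" could look like in practice, the snippet below builds expanding-window folds as explicit calendar dates and prints them as rows for the setup table. The `Fold` class, the `expanding_folds` helper, and all dates are hypothetical; a real report would substitute its own fold design.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Fold:
    train_start: date
    train_end: date  # inclusive
    test_start: date
    test_end: date   # inclusive


def expanding_folds(start, first_test_start, test_days, n_folds):
    """Expanding train window from `start`; consecutive test windows of `test_days`."""
    folds = []
    test_start = first_test_start
    for _ in range(n_folds):
        test_end = test_start + timedelta(days=test_days - 1)
        # Training always ends the day before the test window begins.
        folds.append(Fold(start, test_start - timedelta(days=1), test_start, test_end))
        test_start = test_end + timedelta(days=1)
    return folds


# Illustrative dates only.
folds = expanding_folds(date(2024, 1, 1), date(2025, 1, 1), test_days=90, n_folds=3)
for i, f in enumerate(folds, start=1):
    print(f"| Fold {i} train | {f.train_start.isoformat()} to {f.train_end.isoformat()} |")
    print(f"| Fold {i} test | {f.test_start.isoformat()} to {f.test_end.isoformat()} |")
```

Recording the resulting dates in the report, rather than the generating call, keeps the windows fixed even if the underlying data or the helper changes later.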
5. Methods
Template note — Precise enough to re-implement: aggregation, the interval estimator, the decision criteria operationalised, the multiplicity policy. Use paired statistics where the design is paired; report confidence intervals rather than significance stars.
{{...}}
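A minimal sketch of one possible interval estimator for a paired design: a percentile bootstrap on the per-fold differences between two configurations. The function name, the scores, and the defaults are assumptions made for illustration. The sketch resamples folds independently, which ignores any serial dependence between them; a report on time-series data may need a block bootstrap or another estimator, and should state which one was registered.

```python
import random
import statistics


def paired_bootstrap_ci(a, b, n_resamples=5000, level=0.95, seed=0):
    """Percentile-bootstrap CI for the mean paired difference a[i] - b[i].

    Returns (point_estimate, lower, upper). A fixed seed keeps the interval reproducible.
    """
    if len(a) != len(b):
        raise ValueError("paired samples must have equal length")
    diffs = [x - y for x, y in zip(a, b)]
    rng = random.Random(seed)
    n = len(diffs)
    # Resample the differences (not the raw scores) so the pairing is preserved.
    means = sorted(
        statistics.fmean(rng.choices(diffs, k=n)) for _ in range(n_resamples)
    )
    lo_idx = max(0, round((1 - level) / 2 * n_resamples))
    hi_idx = min(n_resamples - 1, round((1 + level) / 2 * n_resamples) - 1)
    return statistics.fmean(diffs), means[lo_idx], means[hi_idx]


# Made-up per-fold scores for a candidate and a baseline configuration.
candidate = [0.61, 0.58, 0.66, 0.63, 0.59, 0.64]
baseline = [0.57, 0.56, 0.60, 0.61, 0.58, 0.59]
est, lo, hi = paired_bootstrap_ci(candidate, baseline)
print(f"mean difference = {est:.4f}, 95% CI [{lo:.4f}, {hi:.4f}]")
```

The point estimate with its interval is what §6 asks to be tabulated; the seed and `n_resamples` belong in the reproducibility statement.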
6. Results
Template note — Report the registered comparisons first. Effect sizes + CIs in tables; figures for trajectories/timelines. Mark any pending analysis explicitly as a ⏳ placeholder rather than omitting it.
{{...}}
6.x Figures
- Figure 1. {{...}} (generate from data arrays; do not hand-draw)
7. Discussion
Template note — Interpret, connect to the literature, state what the result does and does not license. Resist over-claiming from limited samples.
{{...}}
8. Threats to Validity / Limitations
Template note — First-class (ADR-0097 Inv 3). Be exhaustive and specific; each limitation names the inference it threatens and the mitigation/follow-up.
- {{limitation}} ({{validity type}}). {{what it threatens; mitigation}}
9. Conclusion and Next Steps
Template note — One paragraph tied to the pre-registered question; then concrete, ordered next steps with their gates.
{{...}}
10. Reproducibility Statement
Template note — Consolidate everything to re-run and audit (ADR-0097 Inv 4): run identifiers, code commit, exact calendar windows, software versions, analysis scripts.
- Runs. {{run ids}}
- Code. Image / commit {{SHA}}; relevant paths {{...}}.
- Data windows. Calendar, not indices (see §4).
- Statistics. {{estimator, multiplicity correction}}.
- Environment. {{lib versions}}.
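The environment entry can be generated rather than typed by hand. The sketch below, assuming a Python analysis stack, collects the interpreter version, platform string, and installed package versions using only the standard library. The helper name `environment_manifest` and the package names passed to it are placeholders; a report would list the libraries its analysis actually imports.

```python
import json
import platform
from importlib import metadata


def environment_manifest(packages):
    """Collect interpreter, platform, and package versions for the report."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # record the absence rather than failing
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
    }


# Placeholder package names; the second is deliberately not installed.
manifest = environment_manifest(["pip", "a-package-that-is-not-installed"])
print(json.dumps(manifest, indent=2, sort_keys=True))
```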
Glossary
Template note — Define every internal term/acronym for outside readers. Anonymise asset/product names where the report circulates externally.
- {{term}} — {{definition}}
References
Template note — Consistent style; verify pagination/DOIs before external circulation. Minimal canon actually used by the methods.
{{...}}
Appendix A — Full numerical results
(Per-axis/per-fold/per-point tables; generated from data arrays.)
Appendix B — Pre-registration snapshot
(Verbatim copy or immutable link + hash of the registered design, with timestamp.)
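A minimal sketch of how the hash and timestamp for the snapshot might be produced, assuming the registered design is available as text. The function name `snapshot_record` and the sample design string are hypothetical. A locally generated timestamp is not independent evidence of ordering; committing the design to version control or an external registry before the confirmatory data exists is what provides that.

```python
import hashlib
from datetime import datetime, timezone


def snapshot_record(design_text):
    """Return the SHA-256 digest of the registered design plus a UTC timestamp."""
    digest = hashlib.sha256(design_text.encode("utf-8")).hexdigest()
    recorded_at = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return {"sha256": digest, "recorded_at_utc": recorded_at}


# Placeholder text standing in for the verbatim registered design.
design = "H1: <hypothesis text>\nDecision rule: <criteria>\n"
record = snapshot_record(design)
print(record["sha256"], record["recorded_at_utc"])
```

Any later edit to the design text, even whitespace, changes the digest, so a reader can check the appendix copy against the hash quoted in §3.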