
ADR-0084 — Foundation Epic test stack pick (pytest + xdist + Testcontainers + DVC)

Status: accepted (committee plan_review session 94aa2881 PASSED 2026-05-06; ratified by the same gate as the S03 architecture dossier; operator decisions A + D on wp#118)

Date: 2026-05-06

Introduced by: CVN-N015-EA-S03 / GH issue #838 / OP wp#118

Companion documents: documentation/strategy/CVN-N015-test-strategy.md (S01 strategy), documentation/reviews/2026-05-06-cvn-n015-ea-s03-architecture-plan.md (S03 architecture dossier §1 row A + D)


Context

CVN-N015-EA (test stack foundation) needs a concrete library set picked once, frozen, and used across S04-S09 implementation Stories. Without a single locked stack, every implementation Story would re-derive choices and contradict its siblings (the same drift S01's strategy doc was created to prevent at the taxonomy layer).

The chosen libraries must:

- support Python 3.12 (per S02 NF8),
- be open-source (per S02 NF7),
- support pytest's -n auto parallelism (per S02 NF4),
- support C-level time freezing for tz-aware pandas timestamps (per S02 F6 + decision C),
- support content-addressed dataset versioning with an S3-compatible backend (per S02 F8 + decision D).

Decision

Locked stack (frozen on this Story's merge; bump only via an amendment Story, per ADR-77 SSoT discipline):

| Library | Pinned to | Role |
| --- | --- | --- |
| pytest | 8.x (latest stable minor) | core test framework |
| pytest-xdist | 3.x | parallelism (-n auto); S02 NF4 |
| testcontainers-python | 4.x | service virtualisation; S02 F2 (5 services: PG / Redis / MinIO / MLflow / Airflow scheduler+webserver) |
| time_machine | >=2.13 | tz-aware time-freezing primitive; S02 F6 + decision C; replaces freezegun (which had pandas Timestamp.now() divergences in our legacy suite) |
| pytest-randomly | >=3.15 | seeded test ordering; S02 NF5 determinism |
| pytest-rerunfailures | >=14.0 | infra-flake re-runs; distinct from the F5 flaky-test detector (which acts on test-code flakes) |
| DVC (Data Version Control) | 3.x | versioning of datasets > 10 MB, with MinIO as the S3-compatible backend; decision D |
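The "Pinned to" column above can be read as a set of version constraints. The following is a minimal sketch of how an environment could be checked against them, not part of the ratified stack. The PyPI distribution names (for example `time-machine` for the time_machine module) and the helper names `parse_version`, `satisfies`, and `violations` are assumptions introduced for illustration.

```python
import re

# Constraints copied from the "Pinned to" column.
# Keys are assumed PyPI distribution names; the ADR names libraries, not packages.
LOCKED_STACK = {
    "pytest": "8.x",
    "pytest-xdist": "3.x",
    "testcontainers": "4.x",
    "time-machine": ">=2.13",
    "pytest-randomly": ">=3.15",
    "pytest-rerunfailures": ">=14.0",
    "dvc": "3.x",
}


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '8.3.2' into (8, 3, 2); stops at the first non-numeric component."""
    parts = []
    for chunk in text.split("."):
        match = re.match(r"\d+", chunk)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def satisfies(installed: str, constraint: str) -> bool:
    """Check one installed version against an 'N.x' or '>=N.M' constraint."""
    version = parse_version(installed)
    if constraint.endswith(".x"):
        # 'N.x' means: same major version, any minor.
        return version[:1] == parse_version(constraint[:-2])
    if constraint.startswith(">="):
        # Tuple comparison gives the usual component-wise ordering.
        return version >= parse_version(constraint[2:])
    raise ValueError(f"unsupported constraint: {constraint!r}")


def violations(installed: dict[str, str]) -> list[str]:
    """Return one message per library that is missing or outside its constraint."""
    problems = []
    for name, constraint in LOCKED_STACK.items():
        found = installed.get(name)
        if found is None:
            problems.append(f"{name}: not installed (want {constraint})")
        elif not satisfies(found, constraint):
            problems.append(f"{name}: {found} violates {constraint}")
    return problems
```

In a real environment the `installed` mapping could be built with `importlib.metadata.version(name)` for each key; an empty result from `violations` would mean the environment matches the locked stack.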

Pin policy: each library's minor version is committed to pyproject.toml (or equivalent) when this Story merges. Bumps require a dedicated amendment Story ("test stack version refresh") that runs the full test pyramid against the new versions; the acceptance bar is zero new test failures relative to the prior versions. Amendment Stories run on a quarterly cadence by default, and there is no automatic Dependabot float (predictability beats freshness for a test foundation).
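The acceptance bar for an amendment Story reduces to a set difference over failing tests. A minimal sketch, assuming failures from each run are collected as sets of pytest node IDs; the function names and any sample IDs are hypothetical and not part of the ADR.

```python
def new_failures(baseline_failed: set[str], candidate_failed: set[str]) -> set[str]:
    """Tests that fail on the new versions but did not fail on the prior ones."""
    return candidate_failed - baseline_failed


def bump_accepted(baseline_failed: set[str], candidate_failed: set[str]) -> bool:
    """Acceptance bar: zero new test failures relative to the prior versions.

    Failures already present in the baseline do not block the bump;
    they are not new.
    """
    return not new_failures(baseline_failed, candidate_failed)
```

A test that already failed under the prior pins does not count against the bump under this reading; whether such pre-existing failures should be tolerated is a policy question the sketch leaves open.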

Out of scope for this ADR: choice of factory library (factory_boy vs hand-rolled), specific Testcontainers image versions, fixture-scope policy (covered by ADR-0085), promotion-gate semantics (covered by ADR-0086), Story-phase × test-artefacts integration (covered by ADR-0087), and versioning conventions for test cases and datasets (covered by ADR-0088).

Consequences

Positive:

- Single locked stack across the foundation Epic: no per-Story re-derivation, no sibling contradiction
- Open-source only (S02 NF7 satisfied): no SaaS dependency, no commercial tier
- time_machine over freezegun resolves the pandas Timestamp.now() divergences that bit the legacy FTF tests (S02 F6 + memory ".values on Series timestamp pandas")
- DVC's git-native pointer files give content-addressed dataset provenance without a parallel infrastructure (the existing MinIO Testcontainers helper is the storage backend)

Negative / risks:

- DVC bootstrapping cost (S04 must run dvc init && dvc remote add -d minio s3://test-datasets); mitigated by the additive migration policy (existing test datasets in data/ migrate one by one as the Stories that touch them open, with no big-bang migration)
- Pin freeze means stale minors over 6+ months between amendment Stories; mitigated by the quarterly cadence and by the F5 flaky-test detector catching subtle regressions early
- Roll-our-own is ruled out for every component, which accepts the implicit cost of depending on these upstream projects' continued maintenance; for time_machine the cost of migrating back to freezegun is bounded (a few hours per test), so the lock-in risk is acceptable
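The DVC bootstrap named in the first risk can be sketched as a short command list. This is a dry-run illustration, assuming S04 drives the DVC CLI from Python: it only builds the commands and does not execute them. The remote name `minio` and bucket `s3://test-datasets` come from the text above; the `endpoint_url` default and the function name `dvc_bootstrap_commands` are assumptions, and a MinIO container started through Testcontainers would typically expose a dynamically mapped port, so the real URL would be passed in.

```python
import shlex


def dvc_bootstrap_commands(
    remote: str = "minio",
    bucket: str = "s3://test-datasets",
    endpoint_url: str = "http://localhost:9000",  # assumption: local MinIO endpoint
) -> list[list[str]]:
    """Build (but do not run) the DVC commands for a MinIO-backed default remote."""
    return [
        ["dvc", "init"],
        # -d marks the remote as the project default.
        ["dvc", "remote", "add", "-d", remote, bucket],
        # An S3-compatible store such as MinIO needs an explicit endpoint.
        ["dvc", "remote", "modify", remote, "endpointurl", endpoint_url],
    ]


def as_shell(commands: list[list[str]]) -> str:
    """Render the command list as a single '&&'-chained shell line for review."""
    return " && ".join(shlex.join(cmd) for cmd in commands)
```

Calling `as_shell(dvc_bootstrap_commands())` yields a line that can be reviewed before anything runs; credentials for the remote are deliberately left out of the sketch.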

Cross-references:

- S01 strategy doc §5 (performance budgets enforced by performance test type using these libraries)
- S02 requirements F1-F8 + NF1-NF11 (the contracts this stack must implement)
- S03 architecture dossier §1 row A + D + §3 service conventions
- ADR-0085 fixture scope discipline (decision B)
- ADR-0086 CI tier promotion gate (decision C)
- ADR-0087 Story-phase test integration (S01 §11 + S02 U7/F7)
- ADR-0088 test cases + datasets versioned + provenance-tracked (S02 F8 + NF10 + NF11)
- ADR-77 (MkDocs SSoT: this ADR + companion documents)