CVN-N015-EA-S04: pytest core (conftest + markers + scopes)
Story: CVN-N015-EA-S04 OpenProject: wp#119 GH issue: #839
Date : 2026-05-07
Story : CVN-N015-EA-S04 (OP wp#119)
Parent Epic : CVN-N015-EA — Test stack foundation (OP wp#107)
Depends on : S01 (wp#116 ✅ Closed), S02 (wp#117 ✅ Closed), S03 (wp#118 ✅ Closed) — foundation Epic upstream complete + 6 ADRs ratified (0083-0088)
Blocks : S05 (Testcontainers helpers), S06 (flaky-test detector), S07-impl downstream consumers
Operator decisions locked : 2026-05-07 — A=pyproject.toml [tool.pytest.ini_options] (modern PEP 621/518 single-source) ; B=grep+sed batch (single-PR migration)
Status : proposed (committee plan_review pending)
0. Intent + scope
S04 is the first implementation Story of the foundation Epic. It centralises pytest configuration, locks the marker catalogue from S01 §2, enforces the marker discipline from S03 §2 + ADR-0085 + ADR-0087, and migrates the ~200 existing tests to the right markers. Every subsequent Story (S05 Testcontainers, S06 flaky-test detector, S07-impl, S08-S09 …) consumes the contracts laid down here.
S04 deliverable summary :
- pyproject.toml [tool.pytest.ini_options] block with marker catalogue + collect rules + warning filters
- tests/conftest.py root with cross-cutting fixtures placeholders + the S04 pytest plugin (--story CLI flag per ADR-0087 Invariant 4)
- All ~200 existing tests carry the right @pytest.mark.<type> (mostly unit, some integration) — migrated via grep+sed batch
- --strict-markers enforced (unknown markers fail collection) + custom pytest_collection_finish hook in tests/conftest.py that raises pytest.UsageError if any collected test has zero TYPE_MARKERS (un-marked tests fail collection — no silent slip into the wrong tier)
Out of scope (covered by other Stories) :
- Testcontainers helpers (PG / Redis / MinIO / MLflow / Airflow) → S05
- Factories implementation under tests/factories/ → S05+
- Flaky-test detector implementation → S06
- Per-Story manifest tooling (G5/G6 CI guardrails per ADR-0087) → S08
- CI workflow YAML files (ci-fast.yml, ci-integration.yml, ci-nightly.yml) → S08
1. Operator decisions, locked 2026-05-07
| # | Decision | Locked value | Rationale |
|---|---|---|---|
| A | Marker location | pyproject.toml [tool.pytest.ini_options] | Modern PEP 621/518 convention. Already where black / isort / flake8 / bandit configs live in this repo (single-source-of-truth principle). pytest reads it natively (no plugin needed for pyproject.toml since pytest 6.0). Avoids creating a pytest.ini orphan file. conftest.py rejected because Python-based config is harder to grep / less discoverable than declarative TOML, and config-as-code isn't needed here. |
| B | Migration approach for ~200 tests | grep+sed batch (single-PR) | Pattern already validated by PR #858 (31 files, 220 lines, reviewable in one CR pass). Per-file PRs = 200 micro-PRs = operator overwhelmed by CR cycles + huge merge-order coordination overhead. Bulk diff stays reviewable when commit message lists the migration rule + grep query that produced it (e.g., "all tests/unit/test_*.py files get @pytest.mark.unit ; tests/integration/test_*.py get @pytest.mark.integration"). |
2. Marker catalogue, pinned in pyproject.toml
The marker set comes directly from S01 §2 (the 12-test-type taxonomy) + S03 §2 (directory layout per type) :
```toml
[tool.pytest.ini_options]
minversion = "8.0"
testpaths = ["tests"]
addopts = "--strict-markers --strict-config -ra"
markers = [
    # Tier: fast (PR-blocking, p95 ≤ 2 min per S02 NF1)
    "unit: pure-function logic, no I/O, no fixtures heavier than in-memory pandas",
    "property: hypothesis-style invariants on FE / labels / cache keys",
    "contract: per-API contract test (FE inputs / MLflow artefact schema / ETL output schema)",
    # Tier: medium (subsystem-PR-blocking, p95 ≤ 10 min per S02 NF2)
    "cache: L1/L2/L3 cache key correctness + invalidation per ADR-04",
    "integration: multi-component flows in-process (no docker, no k8s)",
    "dag_smoke: per-DAG dag.test(), verifying import + task discovery + 1-step execution under stub data",
    # Tier: nightly (post-merge, p95 ≤ 30 min per S02 NF3 ; populated by Epics EB-EI)
    "data_quality: Great Expectations OSS suites on OHLCV / L2 / KPI events (per Epic ED)",
    "ml_behaviour: drift / leakage / fairness / robustness checks (per Epic EE)",
    "performance: p95/p99 latency budgets per code path (per S02 §5 budget table)",
    "system_e2e: paper-trading kernel + kill-switch + risk gates with Testcontainers (per Epic EG)",
    # Tier: operator-driven (post-deploy / per-Story)
    "uat: hybrid Markdown + Playwright operator-driven scenarios (per S01 §6)",
    "post_deploy_smoke: k8s liveness + 1-prediction call + Grafana panel populated",
    # Story-phase integration (per ADR-0087 Invariant 4)
    "story: tag a test to its OP Story (parameter: cvn_id, e.g. @pytest.mark.story('CVN-N015-EA-S04')) for the --story CLI filter (S04 plugin)",
]
```
Discipline rules (enforced by --strict-markers + the S04 plugin) :
- --strict-markers : tests with unknown markers FAIL collection (pytest's native semantic — does NOT block tests with zero markers)
- Custom pytest_collection_finish(session) hook in tests/conftest.py (NEW in S04) walks session.items, finds items with no markers from the TYPE_MARKERS = {unit, property, contract, cache, integration, dag_smoke, data_quality, ml_behaviour, performance, system_e2e, uat, post_deploy_smoke} set, and raises pytest.UsageError("N un-marked tests found:\n<first 10>"). This is the actual "no silent slip" enforcement (per CR pass 16e6f2ba — --strict-markers alone does NOT block un-marked tests)
- pytest --collect-only --strict-markers is a CI fast-tier gate (smoke that catches missing markers BEFORE the suite runs)
- Every test file under tests/<type>/ SHOULD carry @pytest.mark.<type> ; mismatches caught by tests/unit/test_marker_discipline.py (NEW in S04, see §4)
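Because TYPE_MARKERS lives in tests/conftest.py while the registered names live in pyproject.toml, the two lists can drift apart. The following is a minimal sketch of a drift check and is not part of the S04 deliverable as specified: marker_name and unregistered_type_markers are hypothetical helper names, and both the catalogue and the type set are shortened and inlined for illustration (in the repo the catalogue lines would be read from pyproject.toml).

```python
# Hypothetical helpers; MARKER_LINES is a shortened stand-in for the
# markers list in pyproject.toml.
MARKER_LINES = [
    "unit: pure-function logic, no I/O",
    "cache: L1/L2/L3 cache key correctness",
    "story: tag a test to its OP Story",
]

# Shortened stand-in for TYPE_MARKERS in tests/conftest.py.
TYPE_MARKERS = frozenset({"unit", "cache", "integration"})


def marker_name(line):
    """Extract the marker name from a 'name: description' catalogue line.

    Splitting on '(' as well also handles a 'name(args): description' line.
    """
    return line.split(":")[0].split("(")[0].strip()


def unregistered_type_markers(marker_lines, type_markers):
    """Return the type markers that are missing from the catalogue, sorted."""
    registered = {marker_name(line) for line in marker_lines}
    return sorted(set(type_markers) - registered)


missing = unregistered_type_markers(MARKER_LINES, TYPE_MARKERS)
print(missing)  # ['integration']
```

A non-empty result would mean the hook accepts a type marker that --strict-markers then rejects as unknown, so a check of this shape would fit next to the catalogue test in §4.4.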
3. Directory layout consumed (from S03)
This Story does NOT redesign the tests/ directory layout — it implements what S03 §2 froze :
tests/
├── conftest.py # NEW in S04 — root fixtures + S04 plugin (--story CLI flag)
├── factories/ # placeholder ; populated by S05+
├── fixtures/ # placeholder ; populated by S05+
├── cases/ # placeholder ; populated by S05+ (per ADR-0088)
├── datasets/ # placeholder ; populated by S05+ (per ADR-0088)
├── unit/ # existing 100+ tests, all get @pytest.mark.unit
├── property/ # NEW dir, populated by S07-S09 hypothesis-driven specs
├── contract/ # NEW dir, populated by S05 (schema contracts)
├── cache/ # existing tests with @pytest.mark.cache
├── integration/ # existing tests with @pytest.mark.integration
├── dag_smoke/ # NEW dir, populated by S08
└── e2e/ # placeholder ; populated by Epic EG
S04 creates the placeholder dirs with __init__.py + a brief README.md so the layout exists from day 1. Most of them stay empty until future Stories populate them.
4. Implementation path
4.1 pyproject.toml block
Add [tool.pytest.ini_options] per §2 above. Test : pytest --collect-only -q | wc -l returns the same count as before. Run pytest --strict-markers ; any failure means a test carries an unregistered marker (note : --strict-markers does NOT block tests with zero markers — that case is covered by the custom pytest_collection_finish hook in §4.2, per CR pass 16e6f2ba).
4.2 tests/conftest.py root
```python
"""Project-root pytest configuration. S04 deliverable per ADR-0087 Invariant 4."""
import pytest


def pytest_addoption(parser):
    """ADR-0087 Invariant 4 : pytest plugin for the --story CLI flag.

    Filters test collection to a single Story id (e.g., CVN-N015-EA-S07).
    Required by the G5 CI guardrail that gates ADR-81 transitions.
    """
    parser.addoption(
        "--story",
        action="store",
        default=None,
        metavar="CVN_ID",
        help="Filter tests to a single Story id (e.g., CVN-N015-EA-S07). "
             "Reads @pytest.mark.story('<cvn_id>') marker arguments.",
    )


def pytest_collection_modifyitems(config, items):
    """Apply the --story filter set by pytest_addoption above."""
    story = config.getoption("--story")
    if not story:
        return
    selected, deselected = [], []
    for item in items:
        tagged = any(story in m.args for m in item.iter_markers(name="story"))
        (selected if tagged else deselected).append(item)
    if deselected:
        # Report filtered-out tests so the run summary shows them as deselected.
        config.hook.pytest_deselected(items=deselected)
    items[:] = selected


# Type-marker discipline (per CR pass 16e6f2ba) :
# `--strict-markers` alone fails on UNKNOWN markers but NOT on un-marked tests.
# This hook is the actual "no silent slip" enforcement.
TYPE_MARKERS = frozenset({
    "unit", "property", "contract", "cache", "integration", "dag_smoke",
    "data_quality", "ml_behaviour", "performance", "system_e2e",
    "uat", "post_deploy_smoke",
})


def pytest_collection_finish(session):
    """Fail collection if any test has zero TYPE_MARKERS markers."""
    unmarked = [
        item.nodeid for item in session.items
        if not (TYPE_MARKERS & {m.name for m in item.iter_markers()})
    ]
    if unmarked:
        sample = "\n - ".join(unmarked[:10])
        suffix = f"\n ... and {len(unmarked) - 10} more" if len(unmarked) > 10 else ""
        raise pytest.UsageError(
            f"{len(unmarked)} un-marked tests found (need ≥ 1 marker from "
            f"TYPE_MARKERS = {sorted(TYPE_MARKERS)}):\n - {sample}{suffix}"
        )
```
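The selection rule behind the --story flag can be exercised without running pytest. The sketch below is illustrative only: FakeMark, FakeItem and split_by_story are hypothetical names, and the two fake classes copy just the parts of pytest's mark and item objects that the filter relies on (a mark's name and args, and an item's iter_markers(name=...)).

```python
from dataclasses import dataclass, field


@dataclass
class FakeMark:
    """Hypothetical stand-in for a pytest mark: only name and args are used."""
    name: str
    args: tuple = ()


@dataclass
class FakeItem:
    """Hypothetical stand-in for a collected test item."""
    nodeid: str
    marks: list = field(default_factory=list)

    def iter_markers(self, name=None):
        # Same call shape as pytest's Node.iter_markers(name=...).
        return (m for m in self.marks if name is None or m.name == name)


def split_by_story(items, story):
    """Split items into (selected, deselected) for one Story id."""
    selected, deselected = [], []
    for item in items:
        tagged = any(story in m.args for m in item.iter_markers(name="story"))
        (selected if tagged else deselected).append(item)
    return selected, deselected


items = [
    FakeItem("tests/unit/test_a.py::test_one",
             [FakeMark("unit"), FakeMark("story", ("CVN-N015-EA-S04",))]),
    FakeItem("tests/unit/test_a.py::test_two", [FakeMark("unit")]),
    FakeItem("tests/cache/test_b.py::test_three",
             [FakeMark("cache"), FakeMark("story", ("CVN-N015-EA-S07",))]),
]
selected, deselected = split_by_story(items, "CVN-N015-EA-S04")
print([item.nodeid for item in selected])
```

Only the first item is selected. The un-tagged test and the test tagged with another Story are both filtered out, which matches what pytest --story CVN-N015-EA-S04 is expected to collect.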
4.3 Marker migration via grep+sed (decision B)
Migration rules (in order of precedence) :
tests/unit/test_*.py → @pytest.mark.unit
tests/integration/test_*.py → @pytest.mark.integration
tests/cache/test_*.py → @pytest.mark.cache
tests/dag_smoke/test_*.py → @pytest.mark.dag_smoke
tests/property/test_*.py → @pytest.mark.property
tests/contract/test_*.py → @pytest.mark.contract
tests/e2e/test_*.py → @pytest.mark.system_e2e
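For each rule : grep -L "pytest\.mark\.<type>" tests/<type>/test_*.py lists the files that do not yet reference the marker. Matching the bare pytest.mark.<type> string (rather than the @pytest.mark.<type> decorator form) keeps the query idempotent, because migrated files contain pytestmark = pytest.mark.<type> and no decorator. sed then inserts the marker AT THE FILE LEVEL (a top-level pytestmark assignment), which covers all tests in the file at once without per-test edits.
```python
# At the top of each tests/<type>/test_*.py file (after imports) :
import pytest

pytestmark = pytest.mark.<type>
```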
This is the canonical pytest pattern — applies the marker to every test in the file. pytest markers are ADDITIVE, not overriding : a per-test @pytest.mark.<type> decorator COMBINES with the file-level pytestmark rather than replacing it. So a test in tests/unit/test_x.py that carries @pytest.mark.integration ends up with BOTH unit and integration markers — which would mismatch the directory contract + confuse pytest -m selectors. Outliers MUST be handled via one of these 3 corrective patterns (per CR pass 16e6f2ba) :
- Move the outlier file into the directory matching its actual tier (e.g., move tests/unit/test_x.py → tests/integration/test_x.py so it inherits the proper pytestmark).
- Use CI selector expressions with documented conventions (e.g., pytest -m "integration and not unit" to filter additive-marker collisions explicitly).
- Skip file-level pytestmark for heterogeneous files and mark each test individually with @pytest.mark.<type> instead. This is the per-test option for legitimate mixed-tier files (rare ; documented as the file-level escape hatch).
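The file-level insertion that this section assigns to sed can also be sketched in Python, which makes the "after imports" placement explicit. This is an illustration under assumptions, not the locked grep+sed procedure: add_file_level_marker is a hypothetical helper, and it only considers top-level import statements.

```python
import ast


def add_file_level_marker(source, marker):
    """Return source with a file-level pytestmark added after the imports.

    Source that already assigns pytestmark is returned unchanged, so the
    function is safe to run twice on the same file.
    """
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.Assign) and any(
            isinstance(target, ast.Name) and target.id == "pytestmark"
            for target in node.targets
        ):
            return source

    imports = [n for n in tree.body if isinstance(n, (ast.Import, ast.ImportFrom))]
    imports_pytest = any(
        isinstance(n, ast.Import)
        and any(a.name == "pytest" and a.asname is None for a in n.names)
        for n in imports
    )

    # Anchor after the last top-level import, else after the module docstring,
    # else at the very top of the file.
    anchors = list(imports)
    if ast.get_docstring(tree) is not None:
        anchors.append(tree.body[0])
    insert_at = max((n.end_lineno for n in anchors), default=0)

    new_lines = [] if imports_pytest else ["import pytest"]
    new_lines += ["", f"pytestmark = pytest.mark.{marker}"]
    lines = source.splitlines()
    lines[insert_at:insert_at] = new_lines
    return "\n".join(lines) + "\n"


original = "import math\n\n\ndef test_sqrt():\n    assert math.sqrt(4) == 2\n"
migrated = add_file_level_marker(original, "unit")
print(migrated)
```

Running the helper a second time on its own output changes nothing, which is the same idempotence the grep -L query gives the batch migration.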
4.4 New test : tests/unit/test_marker_discipline.py
Validates that the marker catalogue is enforced :
```python
"""Marker-discipline smoke tests (S04)."""
import tomllib  # standard library from Python 3.11
from pathlib import Path

import pytest

# Resolve pyproject.toml from the repo root so the tests do not depend on the cwd.
PYPROJECT = Path(__file__).resolve().parents[2] / "pyproject.toml"


def _pytest_ini_options():
    """Load the [tool.pytest.ini_options] table from pyproject.toml."""
    with PYPROJECT.open("rb") as f:
        return tomllib.load(f)["tool"]["pytest"]["ini_options"]


@pytest.mark.unit
def test_every_test_has_a_type_marker():
    """--strict-markers MUST be configured in pyproject.toml."""
    # Un-marked tests are rejected at collection time by the
    # pytest_collection_finish hook in tests/conftest.py ; this test is a
    # unit-level smoke that asserts the strict-markers rule is in pyproject.toml.
    assert "--strict-markers" in _pytest_ini_options()["addopts"]


@pytest.mark.unit
def test_marker_catalogue_complete():
    """The catalogue MUST contain every type from S01 §2 (12 types) plus story."""
    expected = {
        "unit", "property", "contract", "cache", "integration", "dag_smoke",
        "data_quality", "ml_behaviour", "performance", "system_e2e",
        "uat", "post_deploy_smoke", "story",
    }
    found = {m.split(":")[0].strip() for m in _pytest_ini_options()["markers"]}
    assert expected <= found, f"Missing markers: {expected - found}"
```
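§2 states that directory/marker mismatches are caught by this module, while the two tests specified here only inspect pyproject.toml. A directory check could look like the sketch below. It assumes a strict reading of the contract (each test under tests/<dir>/ carries exactly one type marker, the one its directory implies), which is an assumption rather than a rule stated in this plan; heterogeneous files handled per §4.3 would need an allow-list. expected_marker and find_mismatches are hypothetical names, and the input is a plain mapping from node id to marker names rather than real pytest items.

```python
from pathlib import PurePosixPath

# Directory -> expected type marker, copied from the migration rules in §4.3.
DIR_TO_MARKER = {
    "unit": "unit",
    "integration": "integration",
    "cache": "cache",
    "dag_smoke": "dag_smoke",
    "property": "property",
    "contract": "contract",
    "e2e": "system_e2e",
}

TYPE_MARKERS = frozenset({
    "unit", "property", "contract", "cache", "integration", "dag_smoke",
    "data_quality", "ml_behaviour", "performance", "system_e2e",
    "uat", "post_deploy_smoke",
})


def expected_marker(nodeid):
    """Return the marker implied by the test's directory, or None if unmapped."""
    parts = PurePosixPath(nodeid.split("::")[0]).parts
    if len(parts) >= 3 and parts[0] == "tests":
        return DIR_TO_MARKER.get(parts[1])
    return None


def find_mismatches(collected):
    """Return (nodeid, expected, found) for tests that break the directory rule.

    collected maps each node id to the set of marker names on that test.
    """
    mismatches = []
    for nodeid in sorted(collected):
        expected = expected_marker(nodeid)
        if expected is None:
            continue  # file outside the mapped directories
        found = collected[nodeid] & TYPE_MARKERS
        if found != {expected}:
            mismatches.append((nodeid, expected, sorted(found)))
    return mismatches


collected = {
    "tests/unit/test_a.py::test_ok": {"unit", "story"},
    "tests/unit/test_b.py::test_mixed": {"unit", "integration"},
    "tests/e2e/test_c.py::test_flow": {"integration"},
}
for row in find_mismatches(collected):
    print(row)
```

The story marker is ignored because it is not a type marker. The additive unit + integration case from §4.3 is reported, as is the e2e test that carries the wrong tier.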
4.5 Empty placeholder dirs
```bash
for d in factories fixtures cases datasets property contract dag_smoke e2e; do
    mkdir -p "tests/$d"
    echo "# $d: populated by S05+ per S03 §2 directory layout" > "tests/$d/README.md"
    touch "tests/$d/__init__.py"
done
```
5. Acceptance criteria
| # | Criterion | Evidence |
|---|---|---|
| 1 | pyproject.toml [tool.pytest.ini_options] block matches §2 spec | grep block + diff against §2 |
| 2 | All 12 markers from S01 §2 + the story marker registered | tests/unit/test_marker_discipline.py::test_marker_catalogue_complete |
| 3 | --strict-markers enforced | addopts contains --strict-markers ; tests/unit/test_marker_discipline.py::test_every_test_has_a_type_marker |
| 4 | All ~200 existing tests carry @pytest.mark.<type> (file-level via pytestmark) | pytest --strict-markers --collect-only exits 0 |
| 5 | tests/conftest.py ships the --story plugin per ADR-0087 Invariant 4 | pytest --story CVN-XXX --collect-only works (plugin discovered) |
| 6 | Empty placeholder dirs exist for factories/ fixtures/ cases/ datasets/ property/ contract/ dag_smoke/ e2e/ with __init__.py + brief README | ls tests/<dir>/ returns 2 files (__init__.py + README.md) |
| 7 | Committee plan_review PASSED ; verdict body includes tests_strategy: PASSED per ADR-0087 §11.4 | session JSON link in PR body |
| 8 | wp#119 OP transition New → In specification → Specified | OP audit comment trail |
6. Out of scope (explicit)
- Implementation of any fixture / factory / container helper → S05+
- Per-Story documentation/stories/<cvn_id>/tests/ folder mechanism + manifest schema → S08 (per ADR-0087 + ADR-0088)
- CI workflow YAML files for fast/medium/nightly tiers → S08
- DVC bootstrap + tests/datasets/ content-addressed naming → S05+
- tests/cases/<cvn_id>/_schema.yaml test-case YAML schema → S05+
- Flaky-test detector implementation → S06
- G5/G6 CI guardrails (Story-phase enforcement + manifest immutability) → S08
- Performance budget canary jobs (per S02 §5) → S09
7. Risks
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Existing test files that don't fit the marker rule (e.g., a tests/unit/test_x.py that's actually integration in disguise) get the wrong file-level pytestmark | medium | low | Migration rule applies file-level marker by directory ; outlier handling (per §4.3) uses one of 3 patterns : (1) move the file to the right directory, (2) selector expression like -m "integration and not unit", (3) skip file-level pytestmark for heterogeneous files. NOT "per-test override" since pytest markers are additive (per CR pass 16e6f2ba). Tests that fail under the assigned tier surface during the first CI run after merge and are fixable as follow-up commits within S04. |
| pytest_collection_finish hook breaks an unmarked test that no one noticed | high | low | This is the desired effect : surfacing un-marked tests is the whole point. Migration step §4.3 marks every existing test BEFORE wiring the hook ; if anything slips through, CI fast-tier catches it on the first PR. (Note : --strict-markers alone fails only on UNKNOWN markers, NOT zero-marker tests ; the custom hook is the actual gate, per CR pass 16e6f2ba.) |
| The --story plugin in tests/conftest.py clashes with a future plugin | low | low | Plugin uses a unique CLI flag (--story) registered via pytest_addoption ; pytest's plugin system handles ordering. If a clash surfaces, namespace under --cvn-story in an amendment Story. |
| 200-file batch migration introduces a typo that breaks one file | medium | low | Migration is mechanical (grep + sed) ; CI runs the full suite before merge ; per-file rollback via git checkout on the offending file. |
| Operator missed a test type that needs a marker beyond the 12 from S01 §2 | low | medium | Discovery comes via pytest --strict-markers failures during S05+ implementation ; amendment Story adds the missing marker. The 12-type catalogue is from the S01 strategy doc, which already passed committee plan_review. |
8. Sequencing
S01 (✅ Closed) + S02 (✅ Closed) + S03 (✅ Closed) — foundation Epic upstream
↓
S04 (this Story, wp#119) :
PR draft → CR + committee plan_review → merge → wp#119 Specified → In progress → Developed → Tested → Closed
Single PR : pyproject.toml + tests/conftest.py + 8 placeholder dirs + ~200 marker insertions + new marker discipline tests
↓
S05 (Testcontainers helpers — 5 services) [unblocked]
S06 (flaky-test detector — uses #756) [unblocked]
S07 (existing tests layout migration follow-up) [unblocked]
S08 (CI workflow files + G5/G6 guardrails) [unblocked, depends on S05+S06]
S09 (Grafana wiring — per-tier latency dashboards) [unblocked, depends on S08]
9. References
- Parent Epic : CVN-N015-EA: Test stack foundation (OP wp#107 / GH #827)
- S01 strategy doc : ../strategy/CVN-N015-test-strategy.md (§2 marker catalogue source)
- S02 requirements : 2026-05-06-cvn-n015-ea-s02-requirements-plan.md (F1-F8 + NF1-NF11 + U1-U7 contracts)
- S03 architecture : 2026-05-06-cvn-n015-ea-s03-architecture-plan.md (§2 directory layout + §6 mechanism)
- ADR-0083 (test taxonomy), the 12-type catalogue
- ADR-0084 (foundation stack pick : pytest 8.x + xdist 3.x)
- ADR-0085 (fixture scope discipline : function default, session opt-in)
- ADR-0087 (Story-phase test integration : --story plugin + 4 invariants)
- ADR-0088 (test cases + datasets versioned : directory layout this Story instantiates)
10. Plan-review questions for committee
- Marker story registration : registered as bare story in the markers list rather than story(cvn_id) (per CR pass 147d4c81). pytest also accepts the name(args): description form, but the bare name keeps the catalogue check in §4.4 a simple split on ":". The cvn_id is a runtime parameter convention via @pytest.mark.story("CVN-..."). Does committee endorse the bare-name + runtime-arg pattern (the same usage pattern as built-in markers that take arguments, e.g. pytest.mark.parametrize) ?
- File-level pytestmark vs per-test decorator : §4.3 chooses file-level for the bulk migration. Does committee endorse this for consistency, or should some test types (e.g., property tests that are individually parameterised) carry per-test decorators by default ?
- --strict-config enforcement : we add it alongside --strict-markers. This means any unknown config key in pyproject.toml [tool.pytest.ini_options] fails. Is that the right strictness for v1, or should we run --strict-markers only and add --strict-config after S05+ confirms the config is stable ?
- Empty placeholder dirs with READMEs : §4.5 creates 8 placeholder dirs with brief READMEs. Acceptable hygiene, or should we wait until S05+ creates them on-demand to avoid empty-dir pollution ?
- test_marker_discipline.py location : we put it under tests/unit/. Should it move to the tests/ root (since it's about the suite itself, not a unit test of any code path) ?
- Migration commit granularity : single squash commit for the ~200 marker insertions, OR 12 commits (one per marker type) for finer git-blame ? Tradeoff is reviewability vs blame fidelity.