CVN-N015-EA-S04 — pytest core (conftest + markers + scopes)¶

Story: CVN-N015-EA-S04 OpenProject: wp#119 GH issue: #839

Date : 2026-05-07
Story : CVN-N015-EA-S04 (OP wp#119)
Parent Epic : CVN-N015-EA (Test stack foundation, OP wp#107)
Depends on : S01 (wp#116 ✅ Closed), S02 (wp#117 ✅ Closed), S03 (wp#118 ✅ Closed) ; foundation Epic upstream complete + 6 ADRs ratified (0083-0088)
Blocks : S05 (Testcontainers helpers), S06 (flaky-test detector), S07-impl downstream consumers
Operator decisions locked : 2026-05-07 ; A=pyproject.toml [tool.pytest.ini_options] (modern PEP 621/518 single-source) ; B=grep+sed batch (single-PR migration)
Status : proposed (committee plan_review pending)

0. Intent + scope¶

S04 is the first implementation Story of the foundation Epic. It centralises pytest configuration, locks the marker catalogue from S01 §2, enforces the marker discipline from S03 §2 + ADR-0085 + ADR-0087, and migrates the 200+ existing tests to the right markers. Every subsequent Story (S05 Testcontainers, S06 flaky-test detector, S07-impl, S08-S09 …) consumes the contracts laid down here.

S04 deliverable summary :
- pyproject.toml [tool.pytest.ini_options] block with marker catalogue + collect rules + warning filters
- tests/conftest.py root with cross-cutting fixture placeholders + the S04 pytest plugin (--story CLI flag per ADR-0087 Invariant 4)
- All ~200 existing tests carry the right @pytest.mark.<type> (mostly unit, some integration), migrated via grep+sed batch
- --strict-markers enforced (unknown markers fail collection) + custom pytest_collection_finish hook in tests/conftest.py that raises pytest.UsageError if any collected test has zero TYPE_MARKERS (un-marked tests fail collection ; no silent slip into the wrong tier)

Out of scope (covered by other Stories) :
- Testcontainers helpers (PG / Redis / MinIO / MLflow / Airflow) → S05
- Factories implementation under tests/factories/ → S05+
- Flaky-test detector implementation → S06
- Per-Story manifest tooling (G5/G6 CI guardrails per ADR-0087) → S08
- CI workflow YAML files (ci-fast.yml, ci-integration.yml, ci-nightly.yml) → S08

1. Operator decisions — locked 2026-05-07¶

#	Decision	Locked value	Rationale
A	Marker location	`pyproject.toml [tool.pytest.ini_options]`	Modern PEP 621/518 convention. Already where black / isort / flake8 / bandit configs live in this repo (single-source-of-truth principle). pytest reads it natively (no plugin needed for `pyproject.toml` since pytest 6.0). Avoids creating a `pytest.ini` orphan file. `conftest.py` rejected because Python-based config is harder to grep / less discoverable than declarative TOML, and config-as-code isn't needed here.
B	Migration approach for ~200 tests	grep+sed batch (single-PR)	Pattern already validated by PR #858 (31 files, 220 lines, reviewable in one CR pass). Per-file PRs = 200 micro-PRs = operator overwhelmed by CR cycles + huge merge-order coordination overhead. Bulk diff stays reviewable when the commit message lists the migration rule + grep query that produced it (e.g., "all `tests/unit/test_*.py` files get `@pytest.mark.unit` ; `tests/integration/test_*.py` get `@pytest.mark.integration`").

2. Marker catalogue — pinned in `pyproject.toml`¶

The marker set comes directly from S01 §2 (the 12-test-type taxonomy) + S03 §2 (directory layout per type) :

[tool.pytest.ini_options]
minversion = "8.0"
testpaths = ["tests"]
addopts = "--strict-markers --strict-config -ra"
markers = [
    # — Tier : fast (PR-blocking, p95 ≤ 2 min per S02 NF1) —
    "unit: pure-function logic, no I/O, no fixtures heavier than in-memory pandas",
    "property: hypothesis-style invariants on FE / labels / cache keys",
    "contract: per-API contract test (FE inputs / MLflow artefact schema / ETL output schema)",

    # — Tier : medium (subsystem-PR-blocking, p95 ≤ 10 min per S02 NF2) —
    "cache: L1/L2/L3 cache key correctness + invalidation per ADR-04",
    "integration: multi-component flows in-process (no docker, no k8s)",
    "dag_smoke: per-DAG dag.test() — verifies import + task discovery + 1-step execution under stub data",

    # — Tier : nightly (post-merge, p95 ≤ 30 min per S02 NF3 ; populated by Epics EB-EI) —
    "data_quality: Great Expectations OSS suites on OHLCV / L2 / KPI events (per Epic ED)",
    "ml_behaviour: drift / leakage / fairness / robustness checks (per Epic EE)",
    "performance: p95/p99 latency budgets per code path (per S02 §5 budget table)",
    "system_e2e: paper-trading kernel + kill-switch + risk gates with Testcontainers (per Epic EG)",

    # — Tier : operator-driven (post-deploy / per-Story) —
    "uat: hybrid Markdown + Playwright operator-driven scenarios (per S01 §6)",
    "post_deploy_smoke: k8s liveness + 1-prediction call + Grafana panel populated",

    # — Story-phase integration (per ADR-0087 Invariant 4) —
    "story: tag a test to its OP Story (parameter: cvn_id, e.g. @pytest.mark.story('CVN-N015-EA-S04')) for the --story CLI filter (S04 plugin)",
]

Discipline rules (enforced by --strict-markers + the S04 plugin) :
- --strict-markers : tests with unknown markers FAIL collection (pytest's native semantics ; it does NOT block tests with zero markers).
- Custom pytest_collection_finish(session) hook in tests/conftest.py (NEW in S04) : walks session.items, finds items carrying no marker from the set TYPE_MARKERS = {unit, property, contract, cache, integration, dag_smoke, data_quality, ml_behaviour, performance, system_e2e, uat, post_deploy_smoke}, and raises pytest.UsageError("N un-marked tests found:\n<first 10>"). This is the actual "no silent slip" enforcement (per CR pass 16e6f2ba : --strict-markers alone does NOT block un-marked tests).
- pytest --collect-only --strict-markers is a CI fast-tier gate (a smoke check that catches missing markers BEFORE the suite runs).
- Every test file under tests/<type>/ SHOULD carry @pytest.mark.<type> ; mismatches are caught by tests/unit/test_marker_discipline.py (NEW in S04, see §4).

3. Directory layout consumed (from S03)¶

This Story does NOT redesign the tests/ directory layout — it implements what S03 §2 froze :

tests/
├── conftest.py                # NEW in S04 — root fixtures + S04 plugin (--story CLI flag)
├── factories/                 # placeholder ; populated by S05+
├── fixtures/                  # placeholder ; populated by S05+
├── cases/                     # placeholder ; populated by S05+ (per ADR-0088)
├── datasets/                  # placeholder ; populated by S05+ (per ADR-0088)
├── unit/                      # existing 100+ tests, all get @pytest.mark.unit
├── property/                  # NEW dir, populated by S07-S09 hypothesis-driven specs
├── contract/                  # NEW dir, populated by S05 (schema contracts)
├── cache/                     # existing tests with @pytest.mark.cache
├── integration/               # existing tests with @pytest.mark.integration
├── dag_smoke/                 # NEW dir, populated by S08
└── e2e/                       # placeholder ; populated by Epic EG

S04 creates the EMPTY placeholder dirs with __init__.py + a brief README.md so the layout exists from day 1, even though most are empty. Future Stories populate them.

4. Implementation path¶

4.1 `pyproject.toml` block¶

Add [tool.pytest.ini_options] per §2 above. Test : pytest --collect-only -q | wc -l returns the same count as before. Run pytest --strict-markers ; any failure means a test carries an unregistered marker (note : --strict-markers does NOT block tests with zero markers — that case is covered by the custom pytest_collection_finish hook in §4.2, per CR pass 16e6f2ba).

4.2 `tests/conftest.py` root¶

"""Project-root pytest configuration. S04 deliverable per ADR-0087 Invariant 4."""

import pytest

def pytest_addoption(parser):
    """ADR-0087 Invariant 4 : pytest plugin for the --story CLI flag.

    Filters test collection to a single Story id (e.g., CVN-N015-EA-S07).
    Required by the G5 CI guardrail that gates ADR-81 transitions.
    """
    parser.addoption(
        "--story",
        action="store",
        default=None,
        metavar="CVN_ID",
        help="Filter tests to a single Story id (e.g., CVN-N015-EA-S07). "
             "Reads @pytest.mark.story('<cvn_id>') marker arguments.",
    )

def pytest_collection_modifyitems(config, items):
    """Apply the --story filter set by pytest_addoption above."""
    story = config.getoption("--story")
    if not story:
        return
    items[:] = [
        item for item in items
        if any(m.name == "story" and story in m.args for m in item.iter_markers())
    ]


# Type-marker discipline (per CR pass 16e6f2ba) :
# `--strict-markers` alone fails on UNKNOWN markers but NOT on un-marked tests.
# This hook is the actual "no silent slip" enforcement.
TYPE_MARKERS = frozenset({
    "unit", "property", "contract", "cache", "integration", "dag_smoke",
    "data_quality", "ml_behaviour", "performance", "system_e2e",
    "uat", "post_deploy_smoke",
})

def pytest_collection_finish(session):
    """Fail collection if any test has zero TYPE_MARKERS markers."""
    unmarked = [
        item.nodeid for item in session.items
        if not (TYPE_MARKERS & {m.name for m in item.iter_markers()})
    ]
    if unmarked:
        sample = "\n  - ".join(unmarked[:10])
        suffix = f"\n  ... and {len(unmarked) - 10} more" if len(unmarked) > 10 else ""
        raise pytest.UsageError(
            f"{len(unmarked)} un-marked tests found (need ≥ 1 marker from "
            f"TYPE_MARKERS = {sorted(TYPE_MARKERS)}):\n  - {sample}{suffix}"
        )
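The --story filter above drops non-matching items without reporting them, so the terminal summary will not count them as deselected. A hedged sketch of the selection step as a pure function follows. split_by_story, FakeMark and FakeItem are hypothetical names used only for illustration ; the stand-in classes mimic just the iter_markers() / name / args surface the hook relies on. In a real hook the second list would typically be passed to config.hook.pytest_deselected(items=...) before items[:] is reassigned.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMark:
    """Stand-in for a pytest mark: only .name and .args are used."""
    name: str
    args: tuple = ()


@dataclass
class FakeItem:
    """Stand-in for a collected test item."""
    nodeid: str
    marks: list = field(default_factory=list)

    def iter_markers(self):
        return iter(self.marks)


def split_by_story(items, story):
    """Partition items into (selected, deselected) for one Story id."""
    selected, deselected = [], []
    for item in items:
        tagged = any(
            m.name == "story" and story in m.args for m in item.iter_markers()
        )
        (selected if tagged else deselected).append(item)
    return selected, deselected


items = [
    FakeItem(
        "tests/unit/test_a.py::test_one",
        [FakeMark("unit"), FakeMark("story", ("CVN-N015-EA-S04",))],
    ),
    FakeItem("tests/unit/test_a.py::test_two", [FakeMark("unit")]),
]
kept, dropped = split_by_story(items, "CVN-N015-EA-S04")
print([i.nodeid for i in kept])
print([i.nodeid for i in dropped])
```

Keeping the partition logic free of pytest imports also makes it unit-testable without spinning up a nested pytest session.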

4.3 Marker migration via grep+sed (decision B)¶

Migration rules (in order of precedence) :

tests/unit/test_*.py          → @pytest.mark.unit
tests/integration/test_*.py   → @pytest.mark.integration
tests/cache/test_*.py         → @pytest.mark.cache
tests/dag_smoke/test_*.py     → @pytest.mark.dag_smoke
tests/property/test_*.py      → @pytest.mark.property
tests/contract/test_*.py      → @pytest.mark.contract
tests/e2e/test_*.py           → @pytest.mark.system_e2e

For each rule : grep -L "pytest.mark.<type>" tests/<type>/test_*.py lists the files that mention the marker in neither decorator nor pytestmark form ; sed then inserts the marker AT THE FILE LEVEL (a top-level pytestmark assignment, shown below), which covers all tests in the file at once without per-test edits.

# At the top of each tests/<type>/test_*.py file (after imports) :
import pytest
pytestmark = pytest.mark.<type>

This is the canonical pytest pattern — applies the marker to every test in the file. pytest markers are ADDITIVE, not overriding : a per-test @pytest.mark.<type> decorator COMBINES with the file-level pytestmark rather than replacing it. So a test in tests/unit/test_x.py that carries @pytest.mark.integration ends up with BOTH unit and integration markers — which would mismatch the directory contract + confuse pytest -m selectors. Outliers MUST be handled via one of these 3 corrective patterns (per CR pass 16e6f2ba) :

1. Move the outlier file into the directory matching its actual tier (e.g., move tests/unit/test_x.py → tests/integration/test_x.py so it inherits the proper pytestmark).
2. Use CI selector expressions with documented conventions (e.g., pytest -m "integration and not unit" to filter additive-marker collisions explicitly).
3. Skip file-level pytestmark for heterogeneous files and mark each test individually with @pytest.mark.<type> instead. This is the per-test option for legitimate mixed-tier files (rare ; documented as the file-level escape hatch).

4.4 New test : `tests/unit/test_marker_discipline.py`¶

Validates that the marker catalogue is enforced :

"""Marker-discipline smoke tests. tomllib requires Python 3.11+."""

import tomllib
from pathlib import Path

import pytest

# Resolve pyproject.toml from this file (tests/unit/ -> repo root) so the
# tests do not depend on the directory pytest is invoked from.
PYPROJECT = Path(__file__).resolve().parents[2] / "pyproject.toml"


def _pytest_ini_options():
    """Return the [tool.pytest.ini_options] table as a dict."""
    with PYPROJECT.open("rb") as f:
        return tomllib.load(f)["tool"]["pytest"]["ini_options"]


@pytest.mark.unit
def test_every_test_has_a_type_marker():
    """--strict-markers must stay in addopts (unknown markers fail collection)."""
    # The zero-marker case is enforced at collection time by the
    # pytest_collection_finish hook in tests/conftest.py ; this test only
    # asserts that the --strict-markers half of the rule is configured.
    assert "--strict-markers" in _pytest_ini_options()["addopts"]


@pytest.mark.unit
def test_marker_catalogue_complete():
    """The catalogue MUST contain every type from S01 §2 (12 types) + story."""
    expected = {
        "unit", "property", "contract", "cache", "integration", "dag_smoke",
        "data_quality", "ml_behaviour", "performance", "system_e2e",
        "uat", "post_deploy_smoke", "story",
    }
    found = {m.split(":")[0].strip() for m in _pytest_ini_options()["markers"]}
    assert expected <= found, f"Missing markers: {expected - found}"

4.5 Empty placeholder dirs¶

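for d in factories fixtures cases datasets property contract dag_smoke e2e; do
    mkdir -p "tests/$d"
    # The owning Story differs per directory (see §3), so the README stays generic.
    echo "# $d : placeholder, populated by a later Story per S03 §2 directory layout" > "tests/$d/README.md"
    touch "tests/$d/__init__.py"
done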

5. Acceptance criteria¶

#	Criterion	Evidence
1	`pyproject.toml [tool.pytest.ini_options]` block matches §2 spec	grep block + diff against §2
2	All 12 markers from S01 §2 + the `story` marker registered	`tests/unit/test_marker_discipline.py::test_marker_catalogue_complete`
3	`--strict-markers` enforced	`addopts` contains `--strict-markers` ; `tests/unit/test_marker_discipline.py::test_every_test_has_a_type_marker`
4	All ~200 existing tests carry `@pytest.mark.<type>` (file-level via `pytestmark`)	`pytest --strict-markers --collect-only` exits 0
5	`tests/conftest.py` ships the `--story` plugin per ADR-0087 Invariant 4	`pytest --story CVN-XXX --collect-only` works (plugin discovered)
6	Empty placeholder dirs exist for `factories/` `fixtures/` `cases/` `datasets/` `property/` `contract/` `dag_smoke/` `e2e/` with `__init__.py` + brief README	`ls tests/<dir>/` returns 2 files (`__init__.py` + `README.md`)
7	Committee `plan_review` PASSED — verdict body includes `tests_strategy: PASSED` per ADR-0087 §11.4	session JSON link in PR body
8	wp#119 OP transition `New → In specification → Specified`	OP audit comment trail

6. Out of scope (explicit)¶

- Implementation of any fixture / factory / container helper → S05+
- Per-Story documentation/stories/<cvn_id>/tests/ folder mechanism + manifest schema → S08 (per ADR-0087 + ADR-0088)
- CI workflow YAML files for fast/medium/nightly tiers → S08
- DVC bootstrap + tests/datasets/ content-addressed naming → S05+
- tests/cases/<cvn_id>/_schema.yaml test-case YAML schema → S05+
- Flaky-test detector implementation → S06
- G5/G6 CI guardrails (Story-phase enforcement + manifest immutability) → S08
- Performance budget canary jobs (per S02 §5) → S09

7. Risks¶

Risk	Likelihood	Impact	Mitigation
Existing test files that don't fit the marker rule (e.g., a `tests/unit/test_x.py` that's actually integration in disguise) get the wrong file-level `pytestmark`	medium	low	Migration rule applies file-level marker by directory ; outlier handling (per §4.3) uses one of 3 patterns : (1) move the file to the right directory, (2) selector expression like `-m "integration and not unit"`, (3) skip file-level pytestmark for heterogeneous files. NOT "per-test override" since pytest markers are additive (per CR pass 16e6f2ba). Tests that fail under the assigned tier surface during the first CI run after merge — fixable as follow-up commits within S04.
`pytest_collection_finish` hook breaks an unmarked test that no one noticed	high	low	This is the desired effect — surfacing un-marked tests is the whole point. Migration step §4.3 marks every existing test BEFORE wiring the hook ; if anything slips through, CI fast-tier catches it on the first PR. (Note : `--strict-markers` alone fails only on UNKNOWN markers, NOT zero-marker tests — the custom hook is the actual gate, per CR pass 16e6f2ba.)
The `--story` plugin in `tests/conftest.py` clashes with a future plugin	low	low	Plugin uses a unique CLI flag (`--story`) registered via `pytest_addoption` ; pytest's plugin system handles ordering. If a clash surfaces, namespace under `--cvn-story` in an amendment Story.
200-file batch migration introduces a typo that breaks one file	medium	low	Migration is mechanical (grep + sed) ; CI runs the full suite before merge ; per-file rollback via `git checkout` on the offending file.
Operator missed a test type that needs a marker beyond the 12 from S01 §2	low	medium	Discovery comes via `pytest --strict-markers` failures during S05+ implementation ; an amendment Story adds the missing marker. The 12-type catalogue comes from the S01 strategy doc, which already passed committee `plan_review`.

8. Sequencing¶

S01 (✅ Closed) + S02 (✅ Closed) + S03 (✅ Closed) — foundation Epic upstream
   ↓
S04 (this Story, wp#119) :
  PR draft → CR + committee plan_review → merge → wp#119 Specified → In progress → Developed → Tested → Closed
  Single PR : pyproject.toml + tests/conftest.py + 8 placeholder dirs + ~200 marker insertions + new marker discipline tests
   ↓
S05 (Testcontainers helpers — 5 services) [unblocked]
S06 (flaky-test detector — uses #756) [unblocked]
S07 (existing tests layout migration follow-up) [unblocked]
S08 (CI workflow files + G5/G6 guardrails) [unblocked, depends on S05+S06]
S09 (Grafana wiring — per-tier latency dashboards) [unblocked, depends on S08]

9. References¶

Parent Epic : CVN-N015-EA — Test stack foundation (OP wp#107 / GH #827)
S01 strategy doc : ../strategy/CVN-N015-test-strategy.md — §2 marker catalogue source
S02 requirements : 2026-05-06-cvn-n015-ea-s02-requirements-plan.md — F1-F8 + NF1-NF11 + U1-U7 contracts
S03 architecture : 2026-05-06-cvn-n015-ea-s03-architecture-plan.md — §2 directory layout + §6 mechanism
ADR-0083 (test taxonomy) — the 12-type catalogue
ADR-0084 (foundation stack pick — pytest 8.x + xdist 3.x)
ADR-0085 (fixture scope discipline — function default, session opt-in)
ADR-0087 (Story-phase test integration — --story plugin + 4 invariants)
ADR-0088 (test cases + datasets versioned — directory layout this Story instantiates)

10. Plan-review questions for committee¶

1. Marker story registration : registered as bare story in the markers list rather than story(cvn_id) (per CR pass 147d4c81). The cvn_id is a runtime parameter convention via @pytest.mark.story("CVN-..."). Does committee endorse the bare-name + runtime-arg pattern (markers such as pytest.mark.parametrize likewise take their arguments at the decoration site) ?
2. File-level pytestmark vs per-test decorator : §4.3 chooses file-level for the bulk migration. Does committee endorse this for consistency, or should some test types (e.g., property tests that are individually parameterised) carry per-test decorators by default ?
3. --strict-config enforcement : we add it alongside --strict-markers. This means any unknown config key in pyproject.toml [tool.pytest.ini_options] fails. Is that the right strictness for v1, or should we run --strict-markers only and add --strict-config after S05+ confirms the config is stable ?
4. Empty placeholder dirs with READMEs : §4.5 creates 8 placeholder dirs with brief READMEs. Acceptable hygiene, or should we wait until S05+ creates them on-demand to avoid empty-dir pollution ?
5. test_marker_discipline.py location : we put it under tests/unit/. Should it move to tests/ root (since it's about the suite itself, not a unit test of any code path) ?
6. Migration commit granularity : single squash commit for the ~200 marker insertions, OR 12 commits (one per marker type) for finer git-blame ? Tradeoff is reviewability vs blame fidelity.