CVN-N001-EI-S07 — Validation runs (Gate-3 size + clean-fail) — 2026-05-27

Two operator-triggered captures via diagnostic__s18_step1_4_chain, build 5b26d1b (PR #1080 merged + DAG-synced 09:43:53Z), defi_top5 fold 3. Serialized (max_active_runs=1, ADR-22): LDOUSDC then AAVEUSDC.

Results

Run         Step-0                  observed / expected f1_buy  |Δ| vs ε=0.005  parquet_bytes                     n_features  n_train / n_val  Phase-A elapsed   Task
LDOUSDC/3   PASS (reproduced)       0.3092 / 0.3092             0.0000          21 835 187 (20.8 MiB / 21.84 MB)  319         9060 / 1923      1340 s (~22 min)  GREEN
AAVEUSDC/3  FAIL (non-reproducing)  0.3591 / 0.3520             0.0070 > ε      24 370 983 (23.2 MiB / 24.37 MB)  320         9853 / 2093      1377 s (~23 min)  GREEN
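
The Step-0 and size columns can be re-derived from the raw numbers. A minimal sketch, assuming the check is a plain absolute difference compared with ε using `<=` (the exact comparison in the chain is not shown here); `step0_verdict` and `size_units` are hypothetical helper names, not functions from the DAG:

```python
EPSILON = 0.005  # tolerance ε from the table header

def step0_verdict(observed_f1: float, expected_f1: float, eps: float = EPSILON) -> str:
    """PASS when the observed f1_buy is within eps of the expected anchor.

    Assumption: inclusive comparison (<=); the real chain may differ.
    """
    return "PASS" if abs(observed_f1 - expected_f1) <= eps else "FAIL"

def size_units(n_bytes: int) -> tuple[float, float]:
    """Convert bytes to (MiB, MB): 1 MiB = 2**20 bytes, 1 MB = 10**6 bytes."""
    return n_bytes / 2**20, n_bytes / 10**6

# Values copied from the results table: (observed, expected, parquet_bytes).
runs = {
    "LDOUSDC/3": (0.3092, 0.3092, 21_835_187),
    "AAVEUSDC/3": (0.3591, 0.3520, 24_370_983),
}

for name, (observed, expected, n_bytes) in runs.items():
    mib, mb = size_units(n_bytes)
    print(f"{name}: {step0_verdict(observed, expected)} {mib:.1f} MiB / {mb:.2f} MB")
```

The thresholds in Finding 2 are stated in decimal MB, so the MB figure is the one to compare against the knee.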

Both lines carry the new parquet_bytes field → confirms the instrumented 5b26d1b build is what ran.

Finding 1 — Clean-fail validated (PR #1080) ✅

AAVEUSDC re-diverged (live data drift, the §2bis phenomenon: today's fetched data ≠ canonical anchor, observed 0.3591 vs expected 0.3520) — exactly the case that previously crashed with a RuntimeError traceback. With #1080 the chain now surfaces it cleanly:

ERROR - event=s18_chain_verdict severity=error outcome=PHASE_A_FAIL phase_a_status=FAIL
        observed_f1=0.3591 expected_f1=0.3520 next_action=ESCALATE reason=non_reproducing_baseline
  • Task stayed GREEN; no traceback from the chain task.
  • The no-Python-crash rule is satisfied: a non-reproducing baseline is a loud structured severity=error verdict (ADR-25/26/30 — Loki→Grafana is the alert channel), not a stacktrace.
  • LDOUSDC reproduced cleanly (NO_DIVERGENCE, path already fixed by #947) → its verdict is the happy-path control.
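
The clean-fail behaviour above amounts to logging a structured verdict and returning normally instead of raising. A minimal sketch of that pattern, not the #1080 implementation: the FAIL field values are copied from the log line above, while `emit_verdict`, the logger name, and every happy-path value are placeholders chosen for illustration.

```python
import logging

logger = logging.getLogger("s18_chain")  # hypothetical logger name

def emit_verdict(observed_f1: float, expected_f1: float, eps: float = 0.005) -> dict:
    """Log a key=value verdict line and return it; never raise on divergence."""
    reproduced = abs(observed_f1 - expected_f1) <= eps  # assumed comparison
    if reproduced:
        # Placeholder values: the report only shows the FAIL line.
        status = ("info", "PHASE_A_PASS", "PASS", "CONTINUE", "reproduced")
    else:
        # Values taken from the s18_chain_verdict line in this report.
        status = ("error", "PHASE_A_FAIL", "FAIL", "ESCALATE", "non_reproducing_baseline")
    severity, outcome, phase_a_status, next_action, reason = status
    verdict = {
        "event": "s18_chain_verdict",
        "severity": severity,
        "outcome": outcome,
        "phase_a_status": phase_a_status,
        "observed_f1": f"{observed_f1:.4f}",
        "expected_f1": f"{expected_f1:.4f}",
        "next_action": next_action,
        "reason": reason,
    }
    line = " ".join(f"{key}={value}" for key, value in verdict.items())
    # The task returns normally either way, so the orchestrator marks it GREEN;
    # the ERROR-level log line is what the alert channel picks up.
    logger.log(logging.INFO if reproduced else logging.ERROR, line)
    return verdict

aave = emit_verdict(0.3591, 0.3520)
ldo = emit_verdict(0.3092, 0.3092)
```

Returning the verdict rather than raising keeps task state (GREEN/RED) reserved for genuine crashes, which is the split the no-Python-crash rule describes.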

Finding 2 — Gate-3 (artifact size) — PASS, knee raised to 30 MB (safety buffer) ⚠️

Original threshold (design §274): p95 ≤ 25 MB/fold · audit ≤ 1 GB · read+verify ≤ 10 s. Knee raised 25 → 30 MB on 2026-05-27 (operator-directed, based on the evidence in this run): a 5 MB/fold safety buffer so the gate doesn't start at the limit.

  • Per-fold size: max observed 24.37 MB (AAVEUSDC). That was hugging the old 25 MB knee (~0.6 MB headroom); it is now ≤ 30 MB with ~5.6 MB headroom → PASS.
  • Audit budget: unchanged at ≤ 1 GB. Worst case at the new knee is 30 MB × ~30 folds ≈ 900 MB; observed is ~24 MB × 30 ≈ 720 MB → within budget.
  • Caveat A — bigger than the §9 estimate: the parametric estimate predicted ~3–6 MB/fold (snappy on float32 ~10k×320). Observed is ~22–24 MB = 4–8× larger. Likely drivers: train+val both persisted, label/weight/split columns, object/index columns, codec defaults. → flag for §9: revisit compression (codec/level), and whether dedup (§4b) holds at this real size. (The buffer is headroom, not a substitute for this.)
  • Caveat B — n=2, both fold 3: not a true p95. The 30 MB knee gives margin, but larger cells/folds should be re-checked on a wider sample during Gate 4.
  • Not measured here: read+verify ≤ 10 s and S3 upload/read latency — those are Gate 4 (cold→warm in-cluster), deferred.
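
The headroom and audit-budget arithmetic in the bullets above can be reproduced directly. A small sketch using decimal MB and taking "~30 folds" as exactly 30 (an approximation carried over from the text); the constant names are illustrative:

```python
MB = 10**6                      # decimal megabyte, as used for the knees
OLD_KNEE_MB, NEW_KNEE_MB = 25, 30
AUDIT_BUDGET_MB = 1000          # 1 GB audit budget
N_FOLDS = 30                    # "~30 folds" in the text, treated as exact here

observed_bytes = [21_835_187, 24_370_983]   # LDOUSDC/3, AAVEUSDC/3
max_mb = max(observed_bytes) / MB

headroom_old = OLD_KNEE_MB - max_mb         # margin under the old 25 MB knee
headroom_new = NEW_KNEE_MB - max_mb         # margin under the new 30 MB knee
worst_case_audit = NEW_KNEE_MB * N_FOLDS    # every fold sitting at the knee
observed_audit = max_mb * N_FOLDS           # every fold at the max observed size

print(f"max fold: {max_mb:.2f} MB")
print(f"headroom: {headroom_old:.2f} MB (old knee), {headroom_new:.2f} MB (new knee)")
print(f"audit: {observed_audit:.0f} MB observed-max, {worst_case_audit} MB worst case")
```

Using the unrounded 24.37 MB maximum gives about 731 MB rather than the rounded ~720 MB; both sit under the 1 GB budget.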

Verdict: Gate-3 size does not block Lever #1 implementation (within both knees, now with a 5 MB/fold buffer), but §9 storage/compression should still be revisited given the 4–8× gap vs estimate, and the per-fold size re-checked on a wider sample during Gate 4.
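
One way to frame the Caveat A gap for the §9 revisit is bytes per feature cell. This is a rough sketch under loud assumptions: that each parquet file holds train and val rows together, and that all bytes are attributed to the feature matrix (label/weight/split columns, index columns, and file metadata are ignored). `bytes_per_cell` is a hypothetical helper.

```python
# Values copied from the results table.
runs = {
    "LDOUSDC/3": {"parquet_bytes": 21_835_187, "n_train": 9060, "n_val": 1923, "n_features": 319},
    "AAVEUSDC/3": {"parquet_bytes": 24_370_983, "n_train": 9853, "n_val": 2093, "n_features": 320},
}

FLOAT32_BYTES = 4  # uncompressed width of one float32 value

def bytes_per_cell(run: dict) -> float:
    """Stored bytes divided by (train + val rows) x features.

    Assumption: train and val are both in the file and nothing else is counted.
    """
    rows = run["n_train"] + run["n_val"]
    return run["parquet_bytes"] / (rows * run["n_features"])

for name, run in runs.items():
    per_cell = bytes_per_cell(run)
    raw_float32_mb = (run["n_train"] + run["n_val"]) * run["n_features"] * FLOAT32_BYTES / 10**6
    print(f"{name}: {per_cell:.2f} bytes/cell, raw float32 would be {raw_float32_mb:.1f} MB")
```

Under these assumptions both runs land a little above 6 bytes per cell, which is more than an uncompressed float32. That is consistent with the drivers listed in Caveat A but does not single one out; inspecting the actual column dtypes and codec is still needed.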

Caveat — infra noise observed (not a regression, not our DAG)

Two psycopg2.OperationalError tracebacks at 10:32:35 in process DagFileProcessor12001 (Airflow's DAG-file parser losing its connection to the metadata DB at 172.16.16.4:5432 — "server closed the connection unexpectedly"). This is a transient infra fault (scheduler ↔ Postgres reconnect), explicitly exempt from the no-crash rule, and separate from our chain task — AAVEUSDC completed GREEN with its clean verdict. No action required beyond noting it.

Net

  • #1080 fix proven on the real divergence case (AAVEUSDC): clean severity=error verdict, GREEN task, no traceback.
  • Gate-3 measured: ~22–24 MB/fold (21.84 and 24.37 MB); knee raised 25 → 30 MB (5 MB/fold safety buffer) → Lever #1 unblocked on size with headroom; §9 compression still to revisit.
  • Lever #1 entry gates status: Gate 1 (value/reuse) = GO (Phase-0), Gate 3 (size) = PASS-with-caveat (this run). Gate 2 (drift-rate) still pending; Gate 4 is the in-cluster release gate.