Experiment report: CVN-N001-EI-S07 Gate 2, Phase-0 (cheap-first drift investigation)
Story: CVN-N001-EI-S07 · wp#232 · GH #1071 · Epic CVN-N001-EI (#1055)
Date: 2026-05-27 · Author: dococeven · Status: for operator review (gates the decision to build the §0bis/2c probe)
Reviews folded: r1 (wording "no bug indicated" + code-infra/policy-constant split → §6/§8); r2 (the cache confounder: §0bis needs forced regen, §5/§7; the two decisive config-reads → §3b/§4iv; the multi-mode impl constraint → §6). Both confirm Q-A (no bug) + the FTF→now-vs-now→now reframing as the real result.
Companion docs: Gate-2 design (protocol) · validation runs 2026-05-27 (the seed observation) · Lever #1 design (§2bis/§2ter/§4a)
1. Question
Lever #1 (Model B) pins the first capture and reuses it by cell_ref; per §2bis a closed historical window can still drift when re-derived. Gate 2 must set the --check-drift default + max-pin-age + TTL on evidence. Phase-0 answers the two cheap-first questions that gate everything downstream:
- Q-A (bug check): is the drift a bug (raw-OHLCV rewrite / non-determinism) — which would forbid any policy tuning — or an expected, bounded mechanism (a policy knob)?
- Q-B (direction): do the cheap signals already point at a policy regime (opt-in / auto-above-age / default-on), enough to provisionally unblock the §15 implementation?
Out of scope (deferred to §0bis + 2c): the pin-relevant drift rate itself — see the §4 reframing on why Phase-0 cannot measure it.
2. Method
Two read-only readouts, no workload, no replay (mirrors Gate 1 = pure read):
- 2a (config-read): inspect the code paths that cause drift: the ETL re-pull (src/ETL/cvntrade_etl_pipeline.py:412), the feature-store cache key (src/commun/cache/cvntrade_cache_interface.py:191), the WF fold slice (src/backtest/cvntrade_wf_backtest_engine.py:220).
- 2b (retrospective mining): query Loki for the historical event=s18_step0_verdict lines (via /loki-query, line-filter path; 7-day window after a 14-day window timed out; narrowed per the skill's failure-mode rule, NOT concluded "no logs"). Each line carries observed/expected f1 + abs_delta + epsilon per re-audit per cell.
Tolerance: ε_f1 = 0.005 (the Step-0 replay epsilon, read live from epsilon=0.0050).
3. Results
3a. Retrospective drift census (2b): 5 defi_top5 fold-3 cells, 11 observations / 7 days
| Cell | abs_delta (absolute value of observed − expected) | status (ε=0.005) | obs | delta reproducible across runs |
|---|---|---|---|---|
| LDOUSDC/3 | 0.0000 | PASS | ×3 | yes — bit-exact |
| UNIUSDC/3 | 0.0047 | PASS | ×2 | yes — sub-ε but non-zero |
| OPUSDC/3 | 0.0063 | FAIL | ×2 | yes |
| AAVEUSDC/3 | 0.0070 | FAIL | ×3 | yes |
| ARBUSDC/3 | 0.0156 | FAIL | ×1 | n/a (single obs) |
Verbatim anchors (load-bearing lines):
status=PASS crypto=LDOUSDC fold_id=3 observed=0.3092 expected=0.3092 abs_delta=0.0000 epsilon=0.0050 (×3: 05-25 09:14, 05-25 20:39, 05-27 10:10)
status=FAIL crypto=AAVEUSDC fold_id=3 observed=0.3591 expected=0.3520 abs_delta=0.0070 epsilon=0.0050 (×3: 05-25 20:29, 05-27 08:30, 05-27 10:33)
status=FAIL crypto=ARBUSDC fold_id=3 observed=0.3768 expected=0.3613 abs_delta=0.0156 epsilon=0.0050
Observations: 4/5 cells show non-zero output divergence; 3/5 cross ε; only LDOUSDC is bit-exact. Per-cell deltas are reproducible across re-runs (AAVE = 0.0070 on 05-25 and 05-27; OP = 0.0063 ×2).
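A minimal sketch of how the census can be recomputed from raw verdict lines. The three anchor lines are copied verbatim from this report and the repeat counts come from their ×N annotations. The helper names (parse_verdict, census) are hypothetical, not functions from the codebase, and the rule "crosses ε when abs_delta > epsilon" is inferred from the table rather than read from the Step-0 code.

```python
import re
from collections import defaultdict

# (verbatim anchor line, number of times it was observed in the 7-day window)
ANCHORS = [
    ("status=PASS crypto=LDOUSDC fold_id=3 observed=0.3092 expected=0.3092 "
     "abs_delta=0.0000 epsilon=0.0050", 3),
    ("status=FAIL crypto=AAVEUSDC fold_id=3 observed=0.3591 expected=0.3520 "
     "abs_delta=0.0070 epsilon=0.0050", 3),
    ("status=FAIL crypto=ARBUSDC fold_id=3 observed=0.3768 expected=0.3613 "
     "abs_delta=0.0156 epsilon=0.0050", 1),
]

FIELD_RE = re.compile(r"(\w+)=(\S+)")
NUMERIC = ("observed", "expected", "abs_delta", "epsilon")


def parse_verdict(line):
    """Turn one key=value verdict line into a dict, numeric fields as floats."""
    rec = dict(FIELD_RE.findall(line))
    for key in NUMERIC:
        rec[key] = float(rec[key])
    return rec


def census(lines):
    """Group verdict lines per cell and summarise count, delta and reproducibility."""
    by_cell = defaultdict(list)
    for line in lines:
        rec = parse_verdict(line)
        by_cell[f"{rec['crypto']}/{rec['fold_id']}"].append(rec)
    summary = {}
    for cell, recs in by_cell.items():
        deltas = {r["abs_delta"] for r in recs}
        summary[cell] = {
            "obs": len(recs),
            "max_abs_delta": max(deltas),
            "crosses_eps": any(r["abs_delta"] > r["epsilon"] for r in recs),
            # Reproducibility is undefined for a single observation.
            "reproducible": len(deltas) == 1 if len(recs) > 1 else None,
        }
    return summary


lines = [line for line, repeats in ANCHORS for _ in range(repeats)]
for cell, row in census(lines).items():
    print(cell, row)
```

The same grouping applied to the full 11-line Loki result would yield the five-row table; UNIUSDC/3 and OPUSDC/3 are left out here only because their full log lines are not quoted in this report.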
3b. Mechanism (2a: config-read, incl. the two decisive reads)
- ETL re-pulls fresh from Binance each run. Cutoff path → bounded window [cutoff − history_months, cutoff]; legacy path (no cutoff) → "{history_months} months ago UTC" → the pulled window slides daily.
- Binance closed candles are immutable (external invariant) → no raw-OHLCV rewrite of closed candles.
- A feature-store cache exists, keyed on (crypto, timeframe, history_months), not on the date range. Re-derivation hits the cache (stable) unless FORCE_FEATURE_STORE or a cache miss forces regeneration. The WF engine respects the caller's FORCE_FEATURE_STORE (:241).
- [Decisive read 1: active path] CVN_TRAINING_CUTOFF_DATE has NO setter anywhere in code/config/helm/ftf, only readers with a "" default (etl_pipeline.py:391, autonomous_fe.py:198). The s18 capture sets no cutoff. ⇒ the legacy sliding path is the default ⇒ P(regen changes the blob) ≈ 1 (the window slides on every regeneration). (Caveat: a runtime DAG-param/base_env could still inject it; the §0bis forced-regen run confirms empirically.)
- [Decisive read 2: cache TTL] feature-store cache CACHE_TTL_DAYS = 7 (cache_manager.py:108; freshness = age_days ≤ ttl_days, :465). ⇒ within 7 days a re-derivation hits the same cached blob (no regen → no drift); past 7 days the entry is stale → regen → (legacy) slide → drift.
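The interaction between the two ETL paths and the cache can be illustrated with a toy model. Everything here is an assumption made for illustration: the function names are hypothetical, a month is approximated as 30 days, the legacy window is assumed to end at the run date, and the "1h" timeframe and 12-month history are placeholders. Only the cache-key fields, the 7-day TTL and the age_days ≤ ttl_days freshness rule come from the config-reads.

```python
from datetime import date, timedelta

CACHE_TTL_DAYS = 7  # value reported by decisive read 2


def pulled_window(run_date, history_months, cutoff=None):
    """Return (start, end) of the window the ETL would pull.

    Cutoff path: bounded by the cutoff, independent of run_date.
    Legacy path (cutoff is None): anchored on run_date, so it slides daily.
    A month is approximated as 30 days (assumption).
    """
    end = cutoff if cutoff is not None else run_date
    return end - timedelta(days=30 * history_months), end


def cache_key(crypto, timeframe, history_months):
    """Key fields as reported: the date range is not part of the key."""
    return (crypto, timeframe, history_months)


def is_fresh(age_days, ttl_days=CACHE_TTL_DAYS):
    """Freshness rule as reported: age_days <= ttl_days."""
    return age_days <= ttl_days


day1, day2 = date(2026, 5, 25), date(2026, 5, 27)
cutoff = date(2026, 1, 1)

# Legacy path: the two runs pull different windows.
print("legacy windows equal:", pulled_window(day1, 12) == pulled_window(day2, 12))
# Cutoff path: the two runs pull the same window.
print("cutoff windows equal:", pulled_window(day1, 12, cutoff) == pulled_window(day2, 12, cutoff))
# The cache key is identical on both days either way.
print("cache key:", cache_key("AAVEUSDC", "1h", 12))
```

The point of the toy is the mismatch: on the legacy path the data window moves while the cache key does not, so the cache can mask the slide until the entry goes stale.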
4. Analysis
(i) Bug check (Q-A) → no bug-dominated early-exit. No raw-OHLCV rewrite of closed candles (Binance immutability + the cutoff-bounded path). The reproducible per-cell deltas (AAVE constant at 0.0070, OP at 0.0063) are not consistent with training-pass non-determinism (seed/threading would give varying deltas) → the §3c "non-deterministic derivation" bug class is not indicated. Predicted mechanism = edge rolling-feature recompute on feature-store regeneration (the legacy "months ago" window shifts the left-edge warm-up) — the expected, bounded knob class.
(ii) Critical reframing — 2b measures the WRONG drift for the pin policy. observed = re-derived now; expected = the original FTF baseline. So 2b's abs_delta is the FTF→now gap — a one-time offset (the FTF data is unrecoverable, feature_hash="unknown"; this is the §2ter limitation, not a recurring process). The pin freezes "now" and only goes stale if now→now drifts. Therefore "3/5 cross ε" does NOT prove the pin stales — it proves now ≠ FTF. The pin-relevant quantity (now→now) is not measurable by Phase-0; only §0bis (hash ×2) and the 2c panel reach it.
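The distinction can be made concrete with the AAVEUSDC/3 anchor values from §3a. This is a sketch only: the three re-derived values are the rounded observed fields of the three logged audits, so the recomputed gap (about 0.0071) differs in the last digit from the logged abs_delta=0.0070, presumably because the log computes it before rounding.

```python
EPSILON = 0.005

ftf_expected = 0.3520                  # original FTF baseline (expected=)
rederived = [0.3591, 0.3591, 0.3591]   # observed= on the three AAVEUSDC/3 audits

# What 2b reports: distance of each re-derivation from the FTF baseline.
ftf_to_now = [abs(obs - ftf_expected) for obs in rederived]

# What the pin policy cares about: distance between consecutive re-derivations.
now_to_now = [abs(later - earlier) for earlier, later in zip(rederived, rederived[1:])]

print("FTF->now crosses eps:", [gap > EPSILON for gap in ftf_to_now])
print("now->now crosses eps:", [gap > EPSILON for gap in now_to_now])
```

Every audit fails against FTF, yet within the observed window a pin taken at any one of the three audits would not have gone stale against the other two.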
(iii) The "stable now→now" signal is CACHE-HIT stability, not forever-stability. The reproducible deltas and the ~2 days of identical re-derivation have a mundane cause: both fall inside the 7-day feature-store cache TTL (decisive read 2). We were re-reading the same cached blob — that proves the cache hits within its TTL, not that re-derivation is stable across a regeneration. The pin-relevant drift can only appear after the cache evicts (> 7 d) and the legacy window slides on regen. So the earlier "leans opt-in/long-TTL" reading was too optimistic — corrected in (iv).
(iv) The two config-reads give P(stale at age A) almost directly — a step at the cache TTL (likely short-circuiting 2c).
P(pin stale at A) ≈ P(feature-store regenerated within A) × P(regen changes the blob)
- P(regen changes blob) ≈ 1: legacy sliding path is the default (read 1).
- P(regen within A) is governed by the 7-day cache TTL (read 2): ≈ 0 for A < 7 d (cache hits → same blob → no drift), rising once the entry evicts at A ≥ 7 d.
⇒ P(stale at A) is approximately a step function at A ≈ 7 days. Direct policy implication: max_pin_age / TTL ≈ the cache TTL (7 d), and --check-drift = auto-above-pin-age with A ≈ 7 d (check only once a pin outlives the cache window; before that the cache guarantees stability). This is reachable from two free config-reads + one forced-regen confirmation — the 14–30 day calendar panel (2c) is likely unnecessary, needed only if the forced-regen check is ambiguous.
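The factorisation in (iv) can be written as a small function. This is an idealised sketch, not a measurement: it assumes regeneration is certain as soon as the cache entry is stale (the text only says the probability rises), that pin age and cache-entry age coincide, and it takes the boundary from the age_days ≤ ttl_days freshness rule of decisive read 2. The function name and parameters are hypothetical.

```python
def p_stale(age_days, ttl_days=7.0, p_regen_changes_blob=1.0):
    """Idealised P(pin stale at age A) = P(regen within A) * P(regen changes blob)."""
    # Cache hit while the entry is fresh -> no regeneration -> no drift.
    p_regen_within = 0.0 if age_days <= ttl_days else 1.0
    return p_regen_within * p_regen_changes_blob


for age in (1, 6, 7, 8, 14):
    print(f"A = {age:>2} d  ->  P(stale) = {p_stale(age)}")
```

If the §0bis forced-regen run finds that a regeneration does not always change the blob, only p_regen_changes_blob needs to be lowered; the step location stays tied to the cache TTL.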
5. Threats to validity
- n = 2 calendar days of repeated observation — far short of any candidate TTL; "stable now→now" is a hint, not a measurement.
- 2b is a biased opportunity sample (only audits that happened) and the wrong quantity (FTF→now, not now→now) — see §4(ii).
- INPUT_DATA_SHA_V1 not yet exercised: Phase-0 used the f1 proxy, never the canonical hash. AAVE's hash-drift is predicted, not confirmed (the entire point of §0bis).
- K = 5 cells, all fold 3: no stratification by window age yet (edge vs deep-historical).
- abs_delta is an output proxy: a 0.0070 f1 gap is consistent with input drift and (less likely, given reproducibility) with a deterministic non-data effect; only the hash disambiguates.
- ⚠ Cache confounder (invalidates §0bis as originally specified). Because the feature-store cache is keyed on (crypto, tf, history_months) (not date range), a plain "load + hash ×2" hits the same cached blob → identical hash → a false no-drift confirmation (it measures cache-hit determinism, i.e. drift ≈ 0 by construction, not re-derivation drift). §0bis MUST force regeneration between the two reads (FORCE_FEATURE_STORE=1), and 2c likewise must force regen (or rely on natural eviction past the 7-day TTL within W). The Gate-2 design §0bis/2c are being corrected accordingly.
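The confounder is easy to reproduce with a toy cache. The class below is purely illustrative (hypothetical names, no relation to the real cache manager, TTL ignored because both reads are assumed to fall inside it). It encodes only the two facts from §3b: the key omits the date range, and a regenerated blob depends on the sliding window.

```python
import hashlib


class ToyFeatureStore:
    """Toy cache keyed on (crypto, tf, history_months); the window is not in the key."""

    def __init__(self):
        self._cache = {}

    def load(self, crypto, tf, history_months, window_start, force=False):
        key = (crypto, tf, history_months)
        if force or key not in self._cache:
            # Regeneration: the blob content depends on the (sliding) window.
            self._cache[key] = f"{crypto}|{tf}|{history_months}|{window_start}".encode()
        return self._cache[key]


def sha(blob):
    return hashlib.sha256(blob).hexdigest()


store = ToyFeatureStore()
first = sha(store.load("AAVEUSDC", "1h", 12, window_start="day-0"))

# Plain second read after the window slid: cache hit, same blob, same hash.
plain = sha(store.load("AAVEUSDC", "1h", 12, window_start="day-1"))

# Forced second read: regenerated on the slid window, different hash.
forced = sha(store.load("AAVEUSDC", "1h", 12, window_start="day-1", force=True))

print("plain x2 reports drift: ", first != plain)
print("forced x2 reports drift:", first != forced)
```

The plain pair agrees even though the underlying window moved, which is exactly the false no-drift confirmation described in the last bullet.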
6. Verdict
- Q-A: no bug indicated by Phase-0 (not "proven absent"). No raw-rewrite; deltas reproducible (not random non-determinism). No correctness-bug block on implementation.
- Q-B: provisional GO, conservative lean auto-above-pin-age (the safe default per the §0/§5 relevance-not-correctness argument). The cheap config-reads now derive a likely policy (max_pin_age / TTL ≈ the 7-day cache TTL, --check-drift auto-above-pin-age with A ≈ 7 d, §4iv) rather than leaving it fully open. The earlier "leans opt-in/long-TTL" reading is withdrawn (it was cache-hit stability, §4iii).
- Net: §15 implementation is provisionally unblocked (§1bis provisional pass); the calendar-W panel gates only the runtime policy constant (config, ADR-59), not the code, and 2c is likely unnecessary given §4(iv).
- Constraint on the unblock (reviewer-2 rider): the unblock holds only if the implementation is built policy-agnostic: all three --check-drift modes (opt-in / auto-above-pin-age / default-on) switchable by config (ADR-59), no code change. Coding only the leaned mode would re-incur a code change if the gate lands elsewhere, in which case it was never a real unblock.
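One way to keep the implementation policy-agnostic is to resolve the mode from configuration and keep the decision in a single function. This is a sketch under assumptions: the enum, the config key check_drift_mode, the operator_requested flag and the exact semantics given to each mode are illustrative readings of the three mode names, not the ADR-59 contract.

```python
from enum import Enum


class CheckDriftMode(Enum):
    OPT_IN = "opt-in"
    AUTO_ABOVE_PIN_AGE = "auto-above-pin-age"
    DEFAULT_ON = "default-on"


def should_check_drift(mode, pin_age_days, max_pin_age_days=7.0, operator_requested=False):
    """Decide whether a pinned capture gets a drift check on reuse."""
    if operator_requested or mode is CheckDriftMode.DEFAULT_ON:
        return True
    if mode is CheckDriftMode.AUTO_ABOVE_PIN_AGE:
        # Check only once the pin has outlived the configured age.
        return pin_age_days > max_pin_age_days
    return False  # OPT_IN without an explicit request


# Hypothetical config fragment; switching mode means editing this value only.
config = {"check_drift_mode": "auto-above-pin-age", "max_pin_age_days": 7.0}
mode = CheckDriftMode(config["check_drift_mode"])

for age in (2, 7, 9):
    print(age, should_check_drift(mode, age, config["max_pin_age_days"]))
```

With this shape, moving from the provisional lean to opt-in or default-on is a one-value config change, which is what the reviewer-2 rider asks for.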
7. Decision this gates / next steps
- DONE in this report: the two decisive config-reads (active path + cache TTL). They already place the policy near auto-above-pin-age, A ≈ 7 d.
- Reformulate §0bis with regeneration control (cache confounder, §5): operator-triggered load + FORCE_FEATURE_STORE=1 + hash, ×2 on AAVEUSDC/3 and LDOUSDC/3 → confirm a forced regen on the legacy path actually changes INPUT_DATA_SHA_V1 (P(regen changes blob) ≈ 1) and that AAVE is genuine hash-drift (knob), not a fixed FTF-vs-now offset. A plain ×2 (no force) would falsely confirm no-drift.
- Build the probe (PR), policy-agnostic / multi-mode: the INPUT_DATA_SHA_V1 canonical-hash helper (§4a contract, first implementation) + a thin operator-triggered dag_diagnostic__gate2_drift_probe (schedule=None, max_active_runs=1) that forces regeneration between reads.
- 2c forward panel, only if needed. Per §4(iv) the cache-TTL step function likely settles P(stale at A) without it; run 2c (with forced regen / natural eviction over W) only if the §0bis forced-regen result is ambiguous.
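The forced-regen ×2 comparison at the core of the probe could look like the sketch below. The §4a contract for INPUT_DATA_SHA_V1 is not reproduced in this report, so the canonicalisation used here (JSON with sorted keys, rows sorted, SHA-256) is a stand-in assumption, and load_forced is a hypothetical callable standing for "load with FORCE_FEATURE_STORE=1". The loaders are fakes and the DAG wiring is left out.

```python
import hashlib
import json


def canonical_input_hash(rows):
    """Hash a list of row dicts independently of row order and key order (assumed canonical form)."""
    canon = sorted(json.dumps(row, sort_keys=True, separators=(",", ":")) for row in rows)
    digest = hashlib.sha256()
    for line in canon:
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def probe(load_forced, cell):
    """Load the cell twice with forced regeneration and compare the two input hashes."""
    hash_1 = canonical_input_hash(load_forced(cell))
    hash_2 = canonical_input_hash(load_forced(cell))
    return {"cell": cell, "hash_1": hash_1, "hash_2": hash_2, "drifted": hash_1 != hash_2}


def make_sliding_loader():
    """Fake loader whose left edge moves by one row on every forced regeneration."""
    calls = {"n": 0}

    def load(cell):
        start = calls["n"]
        calls["n"] += 1
        return [{"t": t, "close": 100.0 + t} for t in range(start, start + 5)]

    return load


def stable_loader(cell):
    """Fake loader that returns the same rows on every call."""
    return [{"t": t, "close": 100.0 + t} for t in range(5)]


print(probe(make_sliding_loader(), "sliding-cell")["drifted"])
print(probe(stable_loader, "stable-cell")["drifted"])
```

Keeping the hash helper separate from the loader means the same probe function serves §0bis (operator-triggered, two reads) and, if it is ever needed, the 2c panel (many reads over W).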
8. Conclusion
Phase-0 does not identify a bug-dominated drift mechanism. The observed per-cell deltas are reproducible and are most consistent with deterministic re-derivation effects rather than raw-OHLCV rewrite or training non-determinism.
However, Phase-0 mostly measures the FTF→now gap, not the pin-relevant now→now drift rate — and the apparent "stable now→now" is in fact cache-hit stability inside the 7-day feature-store cache TTL. Two free config-reads (no cutoff setter → legacy sliding path; CACHE_TTL_DAYS = 7) nonetheless place P(stale at age A) as an approximate step at the cache TTL, which largely derives the policy without a calendar panel.
Outcome:
- implementation of the Gate-2 hash/probe infrastructure is provisionally unblocked — provided it is built policy-agnostic (all three --check-drift modes switchable by config, ADR-59);
- the provisional runtime posture is auto-above-pin-age with A ≈ 7 d (the cache TTL), conservative per the relevance-not-correctness argument (§0/§5);
- the final --check-drift default, max pin age, and TTL remain gated by the §0bis forced-regeneration confirmation; the 2c forward panel is likely unnecessary (§4iv) and runs only if §0bis is ambiguous.
Phase-0 machine-readable summary
gate: CVN-N001-EI-S07-G2
phase: 0_cheap_first
bug_early_exit: false # Q-A: no raw-rewrite, no non-determinism indicated (NOT "proven absent")
mechanism_predicted: edge_recompute
cells_observed: 5
consequential_drift_cells: 3 # |Δ|>0.005, FTF-vs-now proxy (NOT pin-staleness)
bit_exact_cells: 1 # LDOUSDC/3
measures: ftf_to_now # NOT now_to_now — see §4(ii)
active_path: legacy_sliding # no CVN_TRAINING_CUTOFF_DATE setter found
cache_ttl_days: 7 # CACHE_TTL_DAYS default
p_regen_changes_blob: ~1 # legacy sliding window
p_stale_at_age_A: step_at_cache_ttl # ≈0 for A<7d (cache hit), rising past 7d
now_to_now_rate: unknown # confirmed by §0bis forced-regen (NOT plain ×2)
provisional_direction: auto_above_pin_age # A ≈ 7d; opt_in_long_ttl WITHDRAWN (cache-hit stability)
panel_2c_needed: unlikely # config-reads + §0bis likely sufficient
impl_provisionally_unblocked: true
impl_must_be_multimode: true # all 3 --check-drift modes config-switchable (reviewer-2 rider)