Experiment report — CVN-N001-EI-S07 Gate 2, Phase-0 (cheap-first drift investigation)

Story: CVN-N001-EI-S07 · wp#232 · GH #1071 · Epic CVN-N001-EI (#1055)
Date: 2026-05-27 · Author: dococeven · Status: for operator review (gates the decision to build the §0bis/2c probe)
Reviews folded: r1 (wording "no bug indicated" + code-infra/policy-constant split → §6/§8); r2 (the cache confounder: §0bis needs forced regen, §5/§7; the two decisive config-reads → §3b/§4iv; the multi-mode impl constraint → §6). Both confirm Q-A (no bug) + the FTF→now-vs-now→now reframing as the real result.
Companion docs: Gate-2 design (protocol) · validation runs 2026-05-27 (the seed observation) · Lever #1 design (§2bis/§2ter/§4a)


1. Question

Lever #1 (Model B) pins the first capture and reuses it by cell_ref; per §2bis a closed historical window can still drift when re-derived. Gate 2 must set the --check-drift default + max-pin-age + TTL on evidence. Phase-0 answers the two cheap-first questions that gate everything downstream:

  • Q-A (bug check): is the drift a bug (raw-OHLCV rewrite / non-determinism) — which would forbid any policy tuning — or an expected, bounded mechanism (a policy knob)?
  • Q-B (direction): do the cheap signals already point at a policy regime (opt-in / auto-above-age / default-on), enough to provisionally unblock the §15 implementation?

Out of scope (deferred to §0bis + 2c): the pin-relevant drift rate itself — see the §4 reframing on why Phase-0 cannot measure it.

2. Method

Two read-only readouts, no workload, no replay (mirrors Gate 1 = pure read):

  • 2a — config-read: inspect the code paths that cause drift — the ETL re-pull (src/ETL/cvntrade_etl_pipeline.py:412), the feature-store cache key (src/commun/cache/cvntrade_cache_interface.py:191), the WF fold slice (src/backtest/cvntrade_wf_backtest_engine.py:220).
  • 2b — retrospective mining: query Loki for the historical event=s18_step0_verdict lines (via /loki-query, line-filter path; 7-day window after a 14-day window timed out — narrowed per the skill's failure-mode rule, NOT concluded "no logs"). Each line carries observed/expected f1 + abs_delta + epsilon per re-audit per cell.

Tolerance: ε_f1 = 0.005 (the Step-0 replay epsilon, read live from epsilon=0.0050).

3. Results

3a. Retrospective drift census (2b) — 5 defi_top5 fold-3 cells, 11 observations / 7 days

Cell          abs_delta = |observed − expected|   status (ε=0.005)   obs   delta reproducible across runs
LDOUSDC/3     0.0000                              PASS               ×3    yes (bit-exact)
UNIUSDC/3     0.0047                              PASS               ×2    yes (sub-ε but non-zero)
OPUSDC/3      0.0063                              FAIL               ×2    yes
AAVEUSDC/3    0.0070                              FAIL               ×3    yes
ARBUSDC/3     0.0156                              FAIL               ×1    n/a (single obs)

Verbatim anchors (load-bearing lines):

status=PASS crypto=LDOUSDC fold_id=3 observed=0.3092 expected=0.3092 abs_delta=0.0000 epsilon=0.0050   (×3: 05-25 09:14, 05-25 20:39, 05-27 10:10)
status=FAIL crypto=AAVEUSDC fold_id=3 observed=0.3591 expected=0.3520 abs_delta=0.0070 epsilon=0.0050   (×3: 05-25 20:29, 05-27 08:30, 05-27 10:33)
status=FAIL crypto=ARBUSDC fold_id=3 observed=0.3768 expected=0.3613 abs_delta=0.0156 epsilon=0.0050

Observations: 4/5 cells show non-zero output divergence; 3/5 cross ε; only LDOUSDC is bit-exact. Per-cell deltas are reproducible across re-runs (AAVE = 0.0070 on 05-25 and 05-27; OP = 0.0063 ×2).
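The census above can be rebuilt from the raw verdict lines with a small parser. The following is a minimal sketch, not the audit's actual tooling: the field layout is copied from the verbatim anchors, while the function names (parse_verdict, census) and the strict "abs_delta > epsilon" comparison are assumptions made for illustration.

```python
import re

# Field layout taken from the verbatim anchor lines; the parser itself is illustrative.
LINE_RE = re.compile(
    r"status=(?P<status>\w+) crypto=(?P<crypto>\w+) fold_id=(?P<fold_id>\d+) "
    r"observed=(?P<observed>[\d.]+) expected=(?P<expected>[\d.]+) "
    r"abs_delta=(?P<abs_delta>[\d.]+) epsilon=(?P<epsilon>[\d.]+)"
)


def parse_verdict(line):
    """Return the fields of one s18_step0_verdict line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    d = m.groupdict()
    return {
        "cell": f"{d['crypto']}/{d['fold_id']}",
        "status": d["status"],
        "abs_delta": float(d["abs_delta"]),
        "epsilon": float(d["epsilon"]),
    }


def census(lines):
    """Group observations per cell: count, distinct deltas, and whether any crosses epsilon."""
    out = {}
    for line in lines:
        v = parse_verdict(line)
        if v is None:
            continue
        entry = out.setdefault(v["cell"], {"n": 0, "deltas": set(), "crosses_eps": False})
        entry["n"] += 1
        entry["deltas"].add(v["abs_delta"])
        # Assumption: a cell "crosses" when abs_delta is strictly above epsilon.
        entry["crosses_eps"] = entry["crosses_eps"] or v["abs_delta"] > v["epsilon"]
    return out


ANCHORS = [
    "status=PASS crypto=LDOUSDC fold_id=3 observed=0.3092 expected=0.3092 abs_delta=0.0000 epsilon=0.0050",
    "status=FAIL crypto=AAVEUSDC fold_id=3 observed=0.3591 expected=0.3520 abs_delta=0.0070 epsilon=0.0050",
    "status=FAIL crypto=ARBUSDC fold_id=3 observed=0.3768 expected=0.3613 abs_delta=0.0156 epsilon=0.0050",
]

for cell, entry in census(ANCHORS).items():
    print(cell, entry)
```

A cell whose "deltas" set holds a single value across several observations is what the table calls reproducible; a set with several values would instead point toward run-to-run non-determinism.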

3b. Mechanism (2a — config-read, incl. the two decisive reads)

  • ETL re-pulls fresh from Binance each run. Cutoff path → bounded window [cutoff − history_months, cutoff]; legacy path (no cutoff) → "{history_months} months ago UTC" → the pulled window slides daily.
  • Binance closed candles are immutable (external invariant) → no raw-OHLCV rewrite of closed candles.
  • A feature-store cache exists, keyed on (crypto, timeframe, history_months), not on the date range. Re-derivation hits the cache (stable) unless FORCE_FEATURE_STORE / cache-miss forces regeneration. The WF engine respects the caller's FORCE_FEATURE_STORE (:241).
  • [Decisive read 1: active path] CVN_TRAINING_CUTOFF_DATE has NO setter anywhere in code/config/helm/ftf, only readers with a "" default (etl_pipeline.py:391, autonomous_fe.py:198). The s18 capture sets no cutoff. ⇒ the legacy sliding path is the default, so P(regen changes the blob) ≈ 1 (the window slides on every regeneration). (Caveat: a runtime DAG-param/base_env could still inject it; the §0bis forced-regen run confirms empirically.)
  • [Decisive read 2: cache TTL] feature-store cache CACHE_TTL_DAYS = 7 (cache_manager.py:108; freshness = age_days ≤ ttl_days, :465). ⇒ within 7 days a re-derivation hits the same cached blob (no regen → no drift); past 7 days the entry is stale → regen → (legacy) slide → drift.
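The interaction between the cache key, the TTL and the sliding window can be sketched as follows. This is an illustration of the mechanism as read from the config, not the project's code: the function names, the "1h" timeframe, the 12-month history and the 30-day month approximation are all assumptions.

```python
from datetime import datetime, timedelta, timezone

CACHE_TTL_DAYS = 7  # value reported by decisive read 2


def cache_key(crypto, timeframe, history_months):
    # Per the config-read, the key carries no date range.
    return (crypto, timeframe, history_months)


def is_fresh(age_days, ttl_days=CACHE_TTL_DAYS):
    # Freshness rule as reported: age_days <= ttl_days.
    return age_days <= ttl_days


def legacy_window_start(now, history_months):
    # Assumption: approximate "{history_months} months ago UTC" with 30-day months.
    return now - timedelta(days=30 * history_months)


day1 = datetime(2026, 5, 25, tzinfo=timezone.utc)
day2 = day1 + timedelta(days=1)

# Same key on both days, so a fresh entry is served unchanged...
print(cache_key("AAVEUSDC", "1h", 12) == cache_key("AAVEUSDC", "1h", 12))
# ...even though a regeneration on day 2 would pull a window shifted by one day.
print(legacy_window_start(day2, 12) - legacy_window_start(day1, 12))
# The entry stops being served once its age exceeds the TTL.
print(is_fresh(7), is_fresh(8))
```

The point of the sketch is that staleness comes from the TTL, not from the key: two derivations on different days share one key, so the window shift only becomes visible once the entry expires or regeneration is forced.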

4. Analysis

(i) Bug check (Q-A) → no bug-dominated early-exit. No raw-OHLCV rewrite of closed candles (Binance immutability + the cutoff-bounded path). The reproducible per-cell deltas (AAVE constant at 0.0070, OP at 0.0063) are not consistent with training-pass non-determinism (seed/threading would give varying deltas) → the §3c "non-deterministic derivation" bug class is not indicated. Predicted mechanism = edge rolling-feature recompute on feature-store regeneration (the legacy "months ago" window shifts the left-edge warm-up) — the expected, bounded knob class.

(ii) Critical reframing — 2b measures the WRONG drift for the pin policy. observed = re-derived now; expected = the original FTF baseline. So 2b's abs_delta is the FTF→now gap — a one-time offset (the FTF data is unrecoverable, feature_hash="unknown"; this is the §2ter limitation, not a recurring process). The pin freezes "now" and only goes stale if now→now drifts. Therefore "3/5 cross ε" does NOT prove the pin stales — it proves now ≠ FTF. The pin-relevant quantity (now→now) is not measurable by Phase-0; only §0bis (hash ×2) and the 2c panel reach it.

(iii) The "stable now→now" signal is CACHE-HIT stability, not forever-stability. The reproducible deltas and the ~2 days of identical re-derivation have a mundane cause: both fall inside the 7-day feature-store cache TTL (decisive read 2). We were re-reading the same cached blob — that proves the cache hits within its TTL, not that re-derivation is stable across a regeneration. The pin-relevant drift can only appear after the cache evicts (> 7 d) and the legacy window slides on regen. So the earlier "leans opt-in/long-TTL" reading was too optimistic — corrected in (iv).

(iv) The two config-reads give P(stale at age A) almost directly — a step at the cache TTL (likely short-circuiting 2c).

P(pin stale at A) ≈ P(feature-store regenerated within A) × P(regen changes the blob)

  • P(regen changes the blob) ≈ 1: the legacy sliding path is the default (read 1).
  • P(regen within A) is governed by the 7-day cache TTL (read 2): ≈ 0 for A < 7 d (cache hits → same blob → no drift), rising once the entry goes stale past 7 d (freshness = age_days ≤ ttl_days).

P(stale at A) is approximately a step function at A ≈ 7 days. Direct policy implication: max_pin_age / TTL ≈ the cache TTL (7 d), and --check-drift = auto-above-pin-age with A ≈ 7 d (check only once a pin outlives the cache window; before that the cache guarantees stability). This is reachable from two free config-reads + one forced-regen confirmation; the 14–30 day calendar panel (2c) is likely unnecessary and is needed only if the forced-regen check is ambiguous.
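As a worked illustration of the product above, the step model can be written in a few lines. This is an idealized sketch: the function name p_stale is hypothetical, and it assumes that regeneration happens as soon as the cache entry expires, which is the upper bound of the "rising past 7 d" behaviour rather than a measured rate.

```python
def p_stale(age_days, cache_ttl_days=7.0, p_regen_changes_blob=1.0):
    """Idealized P(pin stale at age A) = P(regen within A) * P(regen changes the blob)."""
    # Assumption: no regeneration while the entry is fresh (age <= TTL),
    # certain regeneration once it has expired.
    p_regen_within_age = 1.0 if age_days > cache_ttl_days else 0.0
    return p_regen_within_age * p_regen_changes_blob


for age in (1, 3, 7, 8, 14, 30):
    print(age, p_stale(age))
```

If the §0bis forced-regen run showed that some regenerations leave the hash unchanged, lowering p_regen_changes_blob below 1 would scale the plateau without moving the step.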

5. Threats to validity

  • n = 2 calendar days of repeated observation — far short of any candidate TTL; "stable now→now" is a hint, not a measurement.
  • 2b is a biased opportunity sample (only audits that happened) and the wrong quantity (FTF→now, not now→now) — see §4(ii).
  • INPUT_DATA_SHA_V1 not yet exercised — Phase-0 used the f1 proxy, never the canonical hash. AAVE's hash-drift is predicted, not confirmed (the entire point of §0bis).
  • K = 5 cells, all fold 3 — no stratification by window age yet (edge vs deep-historical).
  • abs_delta is an output proxy — a 0.0070 f1 gap is consistent with input drift and (less likely, given reproducibility) with a deterministic non-data effect; only the hash disambiguates.
  • ⚠ Cache confounder: invalidates §0bis as originally specified. Because the feature-store cache is keyed on (crypto, tf, history_months) (not date range), a plain "load + hash ×2" hits the same cached blob → identical hash → a false no-drift confirmation (it measures cache-hit determinism, where drift ≈ 0 by construction, not re-derivation drift). §0bis MUST force regeneration between the two reads (FORCE_FEATURE_STORE=1), and 2c likewise must force regen (or rely on natural eviction past the 7-day TTL within W). The Gate-2 design §0bis/2c are being corrected accordingly.
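The cache confounder can be reproduced with a toy model. The sketch below is purely illustrative: ToyFeatureStore, its load signature and the SHA-256 digest are stand-ins invented here, not the real feature store or the INPUT_DATA_SHA_V1 contract, and the "force" argument only mimics what FORCE_FEATURE_STORE=1 is described as doing.

```python
import hashlib


class ToyFeatureStore:
    """Toy cache keyed on (crypto, timeframe, history_months), with no date range."""

    def __init__(self):
        self._cache = {}
        self.regen_count = 0

    def load(self, crypto, timeframe, history_months, today, force=False):
        key = (crypto, timeframe, history_months)
        if force or key not in self._cache:
            self.regen_count += 1
            # Stand-in for regeneration: the blob depends on the sliding window anchor.
            blob = f"{crypto}|{timeframe}|{history_months}|window_end={today}"
            self._cache[key] = blob.encode("utf-8")
        return self._cache[key]


def sha(blob):
    # SHA-256 used as a placeholder for the canonical input hash.
    return hashlib.sha256(blob).hexdigest()


store = ToyFeatureStore()

# Plain "load + hash x2" on two different days: the second read is a cache hit.
plain_1 = sha(store.load("AAVEUSDC", "1h", 12, today="2026-05-25"))
plain_2 = sha(store.load("AAVEUSDC", "1h", 12, today="2026-05-26"))
print("plain x2 identical:", plain_1 == plain_2)

# Forcing regeneration between reads exposes the window slide.
forced = sha(store.load("AAVEUSDC", "1h", 12, today="2026-05-26", force=True))
print("forced regen differs:", forced != plain_1)
```

In this toy the plain double read reports no drift for any input, which is exactly why it cannot serve as a confirmation; only the forced read exercises the regeneration path.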

6. Verdict

  • Q-A: no bug indicated by Phase-0 (not "proven absent"). No raw-rewrite; deltas reproducible (not random non-determinism). No correctness-bug block on implementation.
  • Q-B: provisional GO, conservative lean auto-above-pin-age (the safe default per the §0/§5 relevance-not-correctness argument). The cheap config-reads now derive a likely policy — max_pin_age / TTL ≈ the 7-day cache TTL, --check-drift auto-above-pin-age with A ≈ 7 d (§4iv) — rather than leaving it fully open. The earlier "leans opt-in/long-TTL" reading is withdrawn (it was cache-hit stability, §4iii).
  • Net: §15 implementation is provisionally unblocked (§1bis provisional pass); the calendar-W panel gates only the runtime policy constant (config, ADR-59), not the code — and 2c is likely unnecessary given §4(iv).
  • Constraint on the unblock (reviewer-2 rider): the unblock holds only if the implementation is built policy-agnostic — all three --check-drift modes (opt-in / auto-above-pin-age / default-on) switchable by config (ADR-59), no code change. Coding only the leaned mode would re-incur a code change if the gate lands elsewhere → then it was never a real unblock.
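One way to satisfy the policy-agnostic constraint is to route every drift-check decision through a single function whose mode comes from config. The sketch below assumes this shape; the names CheckDriftMode and should_check_drift, the operator_requested flag and the default of 7 days are hypothetical, and only the three mode names come from this report.

```python
from enum import Enum


class CheckDriftMode(str, Enum):
    OPT_IN = "opt-in"
    AUTO_ABOVE_PIN_AGE = "auto-above-pin-age"
    DEFAULT_ON = "default-on"


def should_check_drift(mode, pin_age_days, max_pin_age_days=7.0, operator_requested=False):
    """Decide whether to run the drift check for one pin, for any of the three modes."""
    mode = CheckDriftMode(mode)  # accepts the config string or the enum member
    if mode is CheckDriftMode.DEFAULT_ON:
        return True
    if mode is CheckDriftMode.AUTO_ABOVE_PIN_AGE:
        # Check automatically once the pin outlives the threshold; an explicit
        # operator request still works below it (assumption).
        return operator_requested or pin_age_days > max_pin_age_days
    # OPT_IN: only when explicitly requested.
    return operator_requested


for mode in ("opt-in", "auto-above-pin-age", "default-on"):
    print(mode, should_check_drift(mode, pin_age_days=3), should_check_drift(mode, pin_age_days=10))
```

With this layout, moving from the leaned mode to another one changes a config value and the threshold, not the code path, which is the property the rider asks for.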

7. Decision this gates / next steps

  1. DONE in this report — the two decisive config-reads (active path + cache TTL). They already place the policy near auto-above-pin-age, A ≈ 7 d.
  2. Reformulate §0bis with regeneration control (cache confounder, §5): operator-triggered load + FORCE_FEATURE_STORE=1 + hash, ×2 on AAVEUSDC/3 and LDOUSDC/3 → confirm a forced regen on the legacy path actually changes INPUT_DATA_SHA_V1 (P(regen changes blob) ≈ 1) and that AAVE is genuine hash-drift (knob), not a fixed FTF-vs-now offset. A plain ×2 (no force) would falsely confirm no-drift.
  3. Build the probe (PR), policy-agnostic / multi-mode: the INPUT_DATA_SHA_V1 canonical-hash helper (§4a contract, first implementation) + a thin operator-triggered dag_diagnostic__gate2_drift_probe (schedule=None, max_active_runs=1) that forces regeneration between reads.
  4. 2c forward panel — only if needed. Per §4(iv) the cache-TTL step function likely settles P(stale at A) without it; run 2c (with forced regen / natural eviction over W) only if the §0bis forced-regen result is ambiguous.

8. Conclusion

Phase-0 does not identify a bug-dominated drift mechanism. The observed per-cell deltas are reproducible and are most consistent with deterministic re-derivation effects rather than raw-OHLCV rewrite or training non-determinism.

However, Phase-0 mostly measures the FTF→now gap, not the pin-relevant now→now drift rate — and the apparent "stable now→now" is in fact cache-hit stability inside the 7-day feature-store cache TTL. Two free config-reads (no cutoff setter → legacy sliding path; CACHE_TTL_DAYS = 7) nonetheless place P(stale at age A) as an approximate step at the cache TTL, which largely derives the policy without a calendar panel.

Outcome:

  • implementation of the Gate-2 hash/probe infrastructure is provisionally unblocked, provided it is built policy-agnostic (all three --check-drift modes switchable by config, ADR-59);
  • the provisional runtime posture is auto-above-pin-age with A ≈ 7 d (the cache TTL), conservative per the relevance-not-correctness argument (§0/§5);
  • the final --check-drift default, max pin age, and TTL remain gated by the §0bis forced-regeneration confirmation; the 2c forward panel is likely unnecessary (§4iv) and runs only if §0bis is ambiguous.

Phase-0 machine-readable summary

gate: CVN-N001-EI-S07-G2
phase: 0_cheap_first
bug_early_exit: false              # Q-A: no raw-rewrite, no non-determinism indicated (NOT "proven absent")
mechanism_predicted: edge_recompute
cells_observed: 5
consequential_drift_cells: 3       # |Δ|>0.005, FTF-vs-now proxy (NOT pin-staleness)
bit_exact_cells: 1                 # LDOUSDC/3
measures: ftf_to_now               # NOT now_to_now — see §4(ii)
active_path: legacy_sliding        # no CVN_TRAINING_CUTOFF_DATE setter found
cache_ttl_days: 7                  # CACHE_TTL_DAYS default
p_regen_changes_blob: ~1           # legacy sliding window
p_stale_at_age_A: step_at_cache_ttl   # ≈0 for A<7d (cache hit), rising past 7d
now_to_now_rate: unknown           # to be confirmed by §0bis forced-regen (NOT plain ×2)
provisional_direction: auto_above_pin_age   # A ≈ 7d; opt_in_long_ttl WITHDRAWN (cache-hit stability)
panel_2c_needed: unlikely          # config-reads + §0bis likely sufficient
impl_provisionally_unblocked: true
impl_must_be_multimode: true       # all 3 --check-drift modes config-switchable (reviewer-2 rider)