# Auditor response - 2026-04-25 - pivot to historical-first; minimal ingest plan drafted

## Rule logged

Historical-first. Live monitoring is now conditional on historical results showing potential. Auditor revises strategy accordingly.

## Disposition of live watcher

Recommend stopping the watcher and cancelling the 15-min cron health pings (they contradict the new rule). Auditor will not act until operator approves.

- `kill -TERM 3170546` (clean shutdown, flushes JSONL)
- `CronDelete e5343ad9` (stops auto health checks)
- ~30 min of live JSONL preserved on disk as side-by-side sanity check during historical analysis.

## Fastest practical historical L1 data sources

| Venue | Free archive | Granularity | Window | Effort |
|---|---|---|---|---|
| Binance | `data.binance.vision` daily bookTicker ZIPs | tick-level | years | trivial (wget + unzip) |
| OKX | `www.okx.com/data-download` BBO archives | tick-level | years | trivial (wget) |
| Kraken | REST `/0/public/Spread` endpoint | tick-level | **~24h max** | small (pagination) |
| Coinbase | none free; Tardis paid (~$10–20 for 7d) | n/a | n/a | skip initially |

Free coverage:

- Binance ↔ OKX: 7 full days
- Kraken ↔ Binance/OKX: 24h overlap
- Coinbase: skipped (highest fees of the four, so least likely to flip the verdict; add later if results warrant).

## Why this is much smaller than the prior replay plan

Not building: replay engine, ReplayClock injection, equivalence tests, engine refactor, multi-day batching, vendor adapter layer.

Building: a small ingestion tool that emits the existing JSONL schema. The existing `analyze.mjs` then runs unchanged.

Wall-clock to viability answer: a few hours.

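
To make "emits the existing JSONL schema" concrete, here is a minimal sketch of the normalization step each ingester would share. The field names are the ones listed in the developer prompt. The helper names (`asDecimalString`, `toJsonlLine`) and the input shape are hypothetical. The prompt does not say whether `ts_ms` is stored as an integer or a string, so the sketch assumes an integer; this should be checked against the watcher's actual output.

```javascript
// Field order of the JSONL schema named in the developer prompt.
const SCHEMA_FIELDS = [
  "ts_ms", "exchange", "symbol_canonical", "quote_ccy",
  "bid", "ask", "bid_size", "ask_size",
];

// Hypothetical helper: validate a plain decimal string such as "84000.10".
// With allowEmpty, a missing value becomes "" (for venues that omit sizes).
function asDecimalString(value, allowEmpty = false) {
  if (value === undefined || value === null || value === "") {
    if (allowEmpty) return "";
    throw new Error("missing required numeric field");
  }
  const text = String(value).trim();
  if (!/^\d+(\.\d+)?$/.test(text)) {
    throw new Error(`not a plain decimal string: ${text}`);
  }
  return text;
}

// Hypothetical helper: turn one venue-specific quote into one JSONL line.
// ASSUMPTION: ts_ms is written as an integer; confirm against the watcher's files.
function toJsonlLine(quote) {
  if (!Number.isSafeInteger(quote.tsMs)) {
    throw new Error("tsMs must be an integer number of milliseconds");
  }
  const record = {
    ts_ms: quote.tsMs,
    exchange: quote.exchange,
    symbol_canonical: quote.symbol,
    quote_ccy: quote.quoteCcy,
    bid: asDecimalString(quote.bid),
    ask: asDecimalString(quote.ask),
    bid_size: asDecimalString(quote.bidSize, true),
    ask_size: asDecimalString(quote.askSize, true),
  };
  return JSON.stringify(record);
}

// Illustrative values only, not real market data.
const sample = toJsonlLine({
  tsMs: 1776470400000,
  exchange: "kraken",
  symbol: "BTC",
  quoteCcy: "USD",
  bid: "84000.10",
  ask: "84000.20",
  // bidSize / askSize omitted: serialized as empty strings
});
console.log(sample);
```

Prices and sizes are passed through as strings and never parsed into floats, so the archive's original precision survives. A value that is not a plain decimal, such as a float that stringifies in exponent notation, is rejected instead of being written silently.
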
## Developer prompt (DRAFTED, NOT SENT)

> **Task: historical L1 ingest for fast cross-venue viability check (reuses existing analyzer)**
>
> Build a tiny data-ingestion tool that pulls up to 7 days of historical bid/ask snapshots from free public sources across 3 venues (Binance, OKX, Kraken; Kraken's free endpoint only reaches back about 24h) and writes them in the **exact same JSONL schema** the existing `tools/spread-watcher` already emits and analyses.
>
> **Path:** new directory `tools/historical-ingest/` at repo root.
>
> **Constraints:**
> - ZERO changes under `engine/`, `dashboard/`, `config/`, `ops/`, root, or **any existing file under `tools/spread-watcher/`**.
> - No DB writes; JSONL only.
> - No external dependencies beyond Node's built-in `fetch`, `node:zlib`, `node:stream`.
> - Output JSONL matches the existing schema exactly: `{ ts_ms, exchange, symbol_canonical, quote_ccy, bid, ask, bid_size, ask_size }`, all numeric fields as decimal strings.
> - Output goes to `tools/spread-watcher/data/` named `historical---.jsonl`. The existing analyzer picks them up.
>
> **Three ingesters:**
>
> 1. **Binance** (`src/ingest-binance.mjs`):
>    - Download `https://data.binance.vision/data/spot/daily/bookTicker/{SYMBOL}/{SYMBOL}-bookTicker-{YYYY-MM-DD}.zip` for each day × `BTCUSDT`, `ETHUSDT`.
>    - Stream-unzip, parse CSV (best_bid_price/qty, best_ask_price/qty, transaction_time).
>    - Emit JSONL with `quote_ccy: "USDT"`, `symbol_canonical: "BTC"|"ETH"`.
>    - Skip days whose archive returns 404 (missing files), with a warning.
>
> 2. **OKX** (`src/ingest-okx.mjs`):
>    - OKX historical BBO archive (dev verifies exact URL pattern). If the BBO archive is unavailable, fall back to REST `/api/v5/market/history-mark-price-candles` for 1-min OHLC and synthesize bid = ask = close (zero spread); flag this clearly in the README/report header.
>    - Symbol mapping `BTC-USDT → BTC`, `ETH-USDT → ETH`. `quote_ccy: "USDT"`.
>
> 3. **Kraken** (`src/ingest-kraken.mjs`):
>    - Paginate `GET https://api.kraken.com/0/public/Spread?pair={XBTUSD|ETHUSD}&since=` from 24h ago.
>    - Emit JSONL with empty `bid_size`/`ask_size` strings (Kraken's free Spread endpoint omits sizes; the analyzer must tolerate this).
>    - `quote_ccy: "USD"`.
>
> **CLI runner:**
> ```
> npm run ingest -- --venue --from 2026-04-18 --to 2026-04-25
> ```
> Idempotent (skip already-downloaded files unless `--force`). Exponential backoff on 429/5xx. End-of-run summary.
>
> **Tests:** 5 lightweight unit tests (3 parser fixtures, 1 schema-validation, 1 empty-size acceptance).
>
> **Definition of done:**
> - `npm run ingest -- --venue all --from 2026-04-18 --to 2026-04-25` completes in ≲30 min wall-clock.
> - Output appears in `tools/spread-watcher/data/historical-*.jsonl`.
> - Running existing `cd ../spread-watcher && npm run analyze -- --from --to` produces a report with `binance`, `okx`, `kraken` directions (Coinbase rows absent).
> - 5 unit tests pass.
> - README explains data sources, Coinbase skip rationale, Kraken size omission, OKX synthesis caveat if any.
>
> **Out of scope:** Coinbase historical, modifying existing analyzer/watcher, L2 depth, replay engine, dashboard surfacing, DB integration.

## Three approvals needed

1. Stop the live watcher and cancel the 15-min cron polling?
2. Approve the 3-venue historical ingest plan (skip Coinbase for now)?
3. Approve the dev prompt as drafted, or request changes?

Nothing sent to dev until operator confirms.