# Auditor response — 2026-04-25 — role confirmed + venue plan revised + Step 1/2 drafted

## 1. Role confirmed

I am the **advisor / system architect**. I read, analyze, design, and write prompts. I do not write or modify production code. The dev CLI executes.

I will:

- Continue read-only inspection (logs, DB, configs, file system) — these are within my mandate.
- Continue advising and drafting dev prompts.
- **Wait for your explicit approval before sending any prompt to the dev CLI.**
- Not perform "convenient" sysadmin tasks unless you explicitly authorize them as scoped exceptions (as you did earlier with nginx + the published plan).

If I'm about to take an action that crosses into execution, I will pause and ask first.

---

## 2. Plan adjustment: broaden venues from day 1

You're right — confining the initial historical test to Kraken + Coinbase risks "confirming what we already know." Two-venue spot arb between two US-regulated retail venues at retail fees is a known-thin market.

### Updated venue set for initial historical replay

| Venue | Why include | Inefficiency expectation |
|---|---|---|
| Kraken | Already in live engine | Baseline |
| Coinbase | Already in live engine | Baseline |
| **Binance** (NEW) | Largest by volume globally; sets the "anchor" price. Useful as a reference even if it's the most efficient venue. | Low — but seeing spreads vs Binance reveals where Kraken/Coinbase mis-quote |
| **OKX** (NEW) | Top-5 by volume, Asia-centric, has known microstructure differences during US off-hours. Most likely place to find genuine inefficiency at retail tier. | Medium-high |

**Recommendation: start with all 4 venues.** This gives `4 × 3 = 12` directed venue-pairs per currency vs `2` today (6× the surface area for finding inefficiency). Marginal data cost is small.

If the budget needs to be tighter, drop to 3 venues: keep **Kraken + Coinbase + OKX**.
OKX is the better single addition for inefficiency-detection because Binance is so efficient it tends to compress the spreads of everyone else.

I am **not** including new currency pairs (SOL, XRP, DOGE, etc.) in this round. That's a separate axis (Step 6 in my prior plan), and mixing the two axes makes it harder to attribute results.

### Revised cost / scope estimate

| Item | Old plan | New plan |
|---|---|---|
| Venues | 2 (kraken, coinbase) | 4 (kraken, coinbase, binance, okx) — or 3 with okx as the cheapest extra |
| Pairs | 2 (BTC/USD, ETH/USD) | 2 (unchanged) |
| Window | 7 days | 7 days (unchanged) |
| Tier B (L1 tick) data cost from Tardis.dev | ~$30–$60 | ~$60–$120 (4 venues) / ~$45–$90 (3 venues) |
| Directed venue-pairs per currency | 2 | 12 (or 6 if 3 venues) |

I will re-publish the updated `/plan.md` only on your authorization.

---

## 3. Step 1 — read-only audit (this is what I will do, no dev involvement)

**Scope:** all reading. No writes, no config changes, no dev prompts.

**Tasks:**

| # | Task | Tool I'll use | Output |
|---|---|---|---|
| 1.1 | Inventory self-captured tick data (`engine/data/ticks/`): file format, daily volume per venue/symbol, gaps, completeness, retention policy | sudo read of `engine/data/ticks/`, file inspection | A coverage table: `(venue, symbol, days, MB/day, gaps, ts integrity)` |
| 1.2 | Audit engine module boundaries (`engine/src/feeds/`, `book/`, `strategy/`, `risk/`, `execution/`) to confirm a feed-layer swap is structurally feasible | sudo read of source | Yes/no on "can replay swap feeds without modifying downstream"; list of any cross-cutting touch points (clock, IDs, persistence) |
| 1.3 | Audit ledger DB schema for replay-compatibility | sudo read of migrations + schema dump | Confirm the same `opportunities` schema can hold replay rows; flag any UNIQUE constraints that would prevent replay re-runs |
| 1.4 | Identify all `nowUtcMs()`/`nowMelb()` call sites — this is the surface area the replay clock must cover | grep through `engine/src/` | A list of files/lines that need replay-clock injection |
| 1.5 | Research historical L1 data vendors (Tardis.dev, Kaiko, CryptoChassis, Crypto-Lake) for a 7-day window, kraken+coinbase+binance+okx, BTC/USD+ETH/USD | WebFetch/WebSearch on vendor pricing pages | Comparison table: cost, format, latency-of-quote granularity, ease of integration. Final venue/vendor recommendation. |
| 1.6 | Identify existing live-capture coverage gaps (e.g., does `data/ticks/` capture both venues for both pairs in unified time? do timestamps line up?) | sudo read + simple JSONL diff | Statement on whether self-captured data is good enough for a parallel "free" replay test before paid data lands |

**Time:** ~20 minutes if I dive in.

**Output:** a single audit report in this conversation, plus an offer to update `/plan.md` if anything changes the strategy materially.

**This step does not need dev approval — it's all read-only.** I'll start it on your "go."

---

## 4. Step 2 developer prompt (DRAFTED, NOT SENT)

This is what I would send to the dev CLI **after** Step 1 confirms feasibility and **after** you approve. Reviewing it now lets you change scope before any code is touched.

---

> **Task: replay/backtest infrastructure (read-only over historical data)**
>
> Add the ability to replay captured or vendor-supplied L1 tick data through the existing engine pipeline (`OrderBook → Strategy → Risk → Simulator`) without modifying any of those modules and without touching live state.
>
> **Constraints (must hold):**
> - `mode: paper` config lock unchanged.
> - No schema changes, no migrations against the live `engine/data/db/ledger.db`.
> - No modification of strategy / risk / execution / fees / slippage logic.
> - No `new Date()` introduced anywhere in business code.
> - All 85 existing tests must still pass.
> - Replay must be **venue-agnostic**: accept any `(exchange, symbol)` tuple present in the input stream — do not hardcode kraken/coinbase.
> - Replay output writes to a separate SQLite file path (e.g. `data/db/replay-{runId}.db`), never the live ledger.
>
> **Components to add:**
>
> 1. **`FeedAdapter` interface** in `engine/src/feeds/feed-adapter.ts` exposing:
>    - `start(handler: (quote: Quote) => void): Promise<void>`
>    - `stop(): Promise<void>`
>    - `health(): { exchange: string; healthy: boolean; lastMessageAt: number }`
>
>    Refactor `feeds/kraken.ts` and `feeds/coinbase.ts` to implement it. **No behavior change.**
>
> 2. **`HistoricalFeedAdapter`** in `engine/src/feeds/historical.ts`:
>    - Constructor accepts an array of file paths (JSONL in the existing `data/ticks/` format) and an optional `[fromMs, toMs]` window.
>    - On `start`, opens all files, k-way-merges quote events by `ts` ascending, and calls the handler for each quote.
>    - Emits in batches with no real-time delay (as-fast-as-possible mode).
>    - Validates each line against the existing `Quote` zod schema; rejects malformed lines with a counted-and-logged error rather than crashing.
>
> 3. **`ReplayClock`** in `engine/src/time/replay-clock.ts`:
>    - Module-level pluggable clock. The existing `nowUtcMs()` / `nowMelb()` helpers must read from a clock source that defaults to wall-clock but can be swapped at startup to a `ReplayClock`.
>    - The replay clock returns the timestamp of the most recently dispatched quote.
>    - Audit and update **only** the existing time-helper module — do not change call sites.
>
> 4. **`replay` CLI runner** in `engine/src/replay.ts` (entrypoint registered in `package.json` as `npm run replay`):
>    - Args: `--tick-dir <dir>` (or `--tick-files <glob>`), `--from <iso>`, `--to <iso>`, `--pairs <list>`, `--venues <list>`, `--out-db <path>`, optional `--fees-override <json>`, `--slippage-override <json>`, `--min-edge-pct-override <pct>`.
>    - Wires `HistoricalFeedAdapter` + `ReplayClock` into the existing pipeline factory.
>    - Writes `opportunities` rows to `--out-db` using the same repo code as live (so ledger code is unchanged).
>    - Prints a summary at the end: total quotes, total opportunities, qualifying count, wall-clock duration, replayed-time duration.
>
> 5. **Tests** (additional, must keep all 85 prior passing):
>    - Unit: `HistoricalFeedAdapter` k-way merges two files in correct order on a synthetic 100-quote fixture.
>    - Unit: `ReplayClock` returns the last-dispatched quote `ts`.
>    - Integration: replay the engine over a fixture JSONL containing 1,000 quotes spanning 5 minutes; assert opportunity count, qualified count, and net_edge percentile match a hand-computed baseline.
>    - **Regression-of-equivalence test:** take a 30-min slice of `data/ticks/` from a known live run, replay it through the new infra, and assert the output `opportunities` rows match the live ledger rows (filtered to the same time window) on `(detected_at, symbol, buy_exchange, sell_exchange, gross_spread_usd, net_edge_usd, qualifies)`. The ID column may differ — exclude it from the comparison. Allow ≤1% mismatch tolerance for races at quote boundaries.
>
> **Out of scope for this task:**
> - Live WS adapters for Binance / OKX (separate task; only needed for live capture, not historical replay).
> - Ingestion of vendor (Tardis/Kaiko) data formats (separate task).
> - Dashboard UI for replay results.
> - Performance optimization beyond "completes 7 days of 4-venue ticks in <10 minutes wall-clock."
>
> **Definition of done:**
> - `npm test` ≥ 90 passing (was 85, plus at least 5 new).
> - `npm run replay -- --tick-dir engine/data/ticks --pairs BTC/USD,ETH/USD --venues kraken,coinbase --out-db /tmp/replay-self.db` completes successfully.
> - `curl 'http://127.0.0.1:3200/opportunities/stats'` continues to return live-ledger stats unchanged. Replay does not contaminate the live DB; verified by checksum of the live `ledger.db` before and after the replay run.
> - One git commit with a clear message; no side commits to unrelated files.

---

## 5. Awaiting your decisions

Three things I need from you before moving:

1. **Approve Step 1** (read-only audit) — yes / no / change scope?
2. **Venue choice for the initial historical run** — 4 venues (kraken, coinbase, binance, okx) or 3 (drop one — which)?
3. **Step 2 prompt review** — any changes? Or hold it as-is until Step 1 finishes, in case the audit findings let me tighten it.

I will not start Step 1 or send the Step 2 prompt until you respond.
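To make item 3 of the drafted prompt (the pluggable clock) concrete for your review, here is a minimal TypeScript sketch of the pattern I have in mind. Names other than `nowUtcMs()` (`ClockSource`, `setClockSource`, `onQuoteDispatched`) are illustrative assumptions, not the engine's actual API; the dev would adapt this to the existing time-helper module.

```typescript
// Pluggable clock source: defaults to wall clock, swapped at replay startup.
type ClockSource = () => number;

let clockSource: ClockSource = () => Date.now();

export function setClockSource(source: ClockSource): void {
  clockSource = source;
}

// The existing helper keeps its signature; only its internals change to
// read from the swappable source instead of calling Date.now() directly.
export function nowUtcMs(): number {
  return clockSource();
}

// ReplayClock: "now" is the ts of the most recently dispatched quote.
export class ReplayClock {
  private lastQuoteTs = 0;

  onQuoteDispatched(ts: number): void {
    this.lastQuoteTs = ts;
  }

  // Arrow function so it can be passed directly to setClockSource.
  now = (): number => this.lastQuoteTs;
}
```

The point of this shape is that no `nowUtcMs()` call site changes, which is exactly the constraint in the drafted prompt ("do not change call sites").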
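Similarly, to make item 2 of the drafted prompt concrete: a sketch of the k-way merge at the heart of `HistoricalFeedAdapter`. The `Quote` shape is my assumption from the `data/ticks/` JSONL description, and a real adapter would stream files line-by-line rather than hold arrays in memory; this just shows the ordering logic.

```typescript
// Minimal Quote shape (assumed; the engine's zod schema is authoritative).
interface Quote {
  exchange: string;
  symbol: string;
  ts: number; // epoch ms
  bid: number;
  ask: number;
}

// Merge k ts-ascending quote streams into one ts-ascending stream.
// A linear scan over the stream heads is fine for small k (4 venues here).
function kWayMerge(streams: Quote[][]): Quote[] {
  const cursors = streams.map(() => 0);
  const out: Quote[] = [];
  for (;;) {
    let best = -1;
    for (let i = 0; i < streams.length; i++) {
      if (
        cursors[i] < streams[i].length &&
        (best === -1 || streams[i][cursors[i]].ts < streams[best][cursors[best]].ts)
      ) {
        best = i;
      }
    }
    if (best === -1) break; // every stream exhausted
    out.push(streams[best][cursors[best]]);
    cursors[best] += 1;
  }
  return out;
}
```

Ties on `ts` resolve to the earlier stream in the input array, so the dev prompt's unit test ("k-way merges two files in correct order") should pin down the intended tie-break explicitly.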