Point-in-Time Data Storage

Why it matters

Many racecard fields (rating, draw, jockey, weight, scratchings) are revised multiple times before a race. Back-testing on the final, cleaned version implicitly assumes the model could have seen those revisions in advance, so every result is optimistically biased.

Storage model

Each mutable table (racecard, ratings, odds_snapshots, scratchings) is stored bi-temporally:

valid_from / valid_to — the period during which the recorded fact was true.
recorded_at — when Horsorion ingested the row (system clock).

Asking "what did the racecard look like 5 minutes before the off?" becomes:

sql

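-- Racecard as it was known 5 minutes before the off.
-- Assumes jump_time (the scheduled off time) is available as a column on racecard.
SELECT *
FROM racecard
WHERE race_id = 'ST-241110-08'
  AND valid_from <= jump_time - INTERVAL '5 minutes'                        -- fact already in effect
  AND (valid_to IS NULL OR valid_to > jump_time - INTERVAL '5 minutes')     -- and not yet superseded
  AND recorded_at <= jump_time - INTERVAL '5 minutes'                       -- and already ingested
ORDER BY horse_no;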
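
To try the same filter locally, the sketch below runs it against an in-memory SQLite table from Python. It is an illustration only: the column subset, the made-up rows, and the helper name `racecard_as_of` are assumptions rather than part of Horsorion, and the cut-off is passed in as a parameter instead of being read from a `jump_time` column. Timestamps are stored as ISO 8601 strings in one fixed format, so plain string comparison orders them correctly.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE racecard (
        race_id TEXT, horse_no INTEGER, jockey TEXT,
        valid_from TEXT, valid_to TEXT, recorded_at TEXT
    )
""")

# Made-up rows for illustration: (race, horse, jockey, valid_from, valid_to, recorded_at)
rows = [
    ("ST-241110-08", 1, "Jockey A", "2024-11-10T09:00:00", "2024-11-10T12:50:00", "2024-11-10T09:00:05"),
    ("ST-241110-08", 1, "Jockey B", "2024-11-10T12:50:00", None, "2024-11-10T12:51:00"),
    ("ST-241110-08", 2, "Jockey C", "2024-11-10T09:00:00", "2024-11-10T12:58:00", "2024-11-10T09:00:05"),
    ("ST-241110-08", 2, "Jockey D", "2024-11-10T12:58:00", None, "2024-11-10T12:58:30"),
]
conn.executemany("INSERT INTO racecard VALUES (?, ?, ?, ?, ?, ?)", rows)


def racecard_as_of(conn, race_id, jump_time, minutes_before):
    """Return (horse_no, jockey) rows valid and already recorded at the cut-off."""
    cutoff = (jump_time - timedelta(minutes=minutes_before)).isoformat()
    return conn.execute(
        """
        SELECT horse_no, jockey
        FROM racecard
        WHERE race_id = ?
          AND valid_from <= ?
          AND (valid_to IS NULL OR valid_to > ?)
          AND recorded_at <= ?
        ORDER BY horse_no
        """,
        (race_id, cutoff, cutoff, cutoff),
    ).fetchall()


jump = datetime(2024, 11, 10, 13, 0, 0)
five_before = racecard_as_of(conn, "ST-241110-08", jump, 5)
one_before = racecard_as_of(conn, "ST-241110-08", jump, 1)
print(five_before)
print(one_before)
```

In this sample, the jockey change on horse 2 takes effect after the 5-minute cut-off, so it appears only in the 1-minute snapshot. A back-test that uses the 5-minute view never sees it.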

Change tracking

Every racecard, rating, or odds revision is written as a new row; Horsorion never overwrites an existing row. Clients can therefore reconstruct the snapshot as of any past moment.
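
The append-only idea can be sketched in a few lines of Python. This is a simplified illustration, not Horsorion's implementation: it tracks only `recorded_at` (not the full valid_from / valid_to period), and the names `Revision`, `AppendOnlyLog`, and `weight_lb` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Revision:
    horse_no: int
    weight_lb: int
    recorded_at: datetime


class AppendOnlyLog:
    def __init__(self):
        self._rows = []

    def record(self, horse_no, weight_lb, recorded_at):
        # A revision is always appended; existing rows are never mutated.
        self._rows.append(Revision(horse_no, weight_lb, recorded_at))

    def snapshot(self, as_of):
        """Latest revision per horse among rows recorded at or before as_of."""
        latest = {}
        for row in sorted(self._rows, key=lambda r: r.recorded_at):
            if row.recorded_at <= as_of:
                latest[row.horse_no] = row
        return {h: r.weight_lb for h, r in sorted(latest.items())}


log = AppendOnlyLog()
log.record(1, 126, datetime(2024, 11, 10, 9, 0))
log.record(2, 120, datetime(2024, 11, 10, 9, 0))
log.record(1, 124, datetime(2024, 11, 10, 12, 30))  # weight revision for horse 1

early = log.snapshot(datetime(2024, 11, 10, 10, 0))
late = log.snapshot(datetime(2024, 11, 10, 12, 45))
print(early)
print(late)
```

Because the earlier row for horse 1 is still in the log, the 10:00 snapshot returns the original weight even after the revision has been written.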

Odds-snapshot cadence

Time-series odds are sampled either at a fixed interval (every 10 seconds) or on event triggers (every market move), depending on the chosen delivery profile.
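
The difference between the two cadences can be seen on a small tick stream. The sketch below is a rough illustration with made-up odds. The function names `sample_fixed` and `sample_on_change` are hypothetical, and treating a "market move" as any change in the quoted price is an assumption.

```python
from datetime import datetime, timedelta


def sample_fixed(ticks, start, end, step=timedelta(seconds=10)):
    """Fixed-interval profile: last known odds at each step boundary."""
    out, i, last = [], 0, None
    t = start
    while t <= end:
        # Consume every tick that arrived at or before this boundary.
        while i < len(ticks) and ticks[i][0] <= t:
            last = ticks[i][1]
            i += 1
        if last is not None:
            out.append((t, last))
        t += step
    return out


def sample_on_change(ticks):
    """Event-triggered profile: one snapshot per change in the quoted odds."""
    out, last = [], None
    for ts, odds in ticks:
        if odds != last:
            out.append((ts, odds))
            last = odds
    return out


t0 = datetime(2024, 11, 10, 12, 55, 0)
# Ticks must be sorted by timestamp: (timestamp, odds)
ticks = [
    (t0 + timedelta(seconds=3), 4.2),
    (t0 + timedelta(seconds=4), 4.0),
    (t0 + timedelta(seconds=17), 4.0),
    (t0 + timedelta(seconds=26), 3.8),
]

fixed = sample_fixed(ticks, t0, t0 + timedelta(seconds=30))
moves = sample_on_change(ticks)
print([odds for _, odds in fixed])
print([odds for _, odds in moves])
```

In this stream the brief 4.2 quote falls between two 10-second boundaries, so only the event-triggered series captures it. The fixed-interval series, in turn, has evenly spaced timestamps.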

Use cases

Historical model back-tests requiring strict leakage prevention.
Market microstructure research on tote pools.
Machine-learning feature engineering with reproducible snapshots.
