Point-in-Time data storage
Every racecard field, rating change, and odds snapshot is stored under point-in-time (PIT) discipline. When researchers replay historical models, they see only what was publicly visible at the replayed timestamp, which eliminates look-ahead bias.
Why it matters
Many racecard fields (rating, draw, jockey, weight, scratchings) are revised multiple times before a race. Back-testing on the final cleaned version implicitly assumes the model could see those revisions in advance, so its results are optimistically biased.
Storage model
Each mutable table (racecard, ratings, odds_snapshots, scratchings) is stored bi-temporally:
- valid_from/valid_to: the period during which the recorded fact was true.
- recorded_at: when Horsorion ingested the row (system clock).
Asking "what did the racecard look like 5 minutes before the off?" becomes:
SELECT *
FROM racecard
WHERE race_id = 'ST-241110-08'
AND valid_from <= jump_time - INTERVAL '5 minutes'
AND (valid_to IS NULL OR valid_to > jump_time - INTERVAL '5 minutes')
AND recorded_at <= jump_time - INTERVAL '5 minutes'
ORDER BY horse_no;
Change tracking
Every racecard, rating, or odds revision is written as a new row — Horsorion never overwrites. Clients can reconstruct any past moment's snapshot.
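The append-only idea can be illustrated with a minimal Python sketch. It is not Horsorion's implementation: the `RacecardRevision` class, the `snapshot_as_of` helper, the jockey names, and the `jump_time` value are all hypothetical, and for brevity the sketch keys only on `recorded_at`, ignoring `valid_from`/`valid_to`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RacecardRevision:
    """One immutable row in a hypothetical append-only revision log."""
    race_id: str
    horse_no: int
    jockey: str
    recorded_at: datetime  # when the row was ingested


def snapshot_as_of(log, race_id, as_of):
    """Return {horse_no: revision} using only rows recorded at or before as_of."""
    latest = {}
    for rev in sorted(log, key=lambda r: r.recorded_at):
        if rev.race_id == race_id and rev.recorded_at <= as_of:
            latest[rev.horse_no] = rev  # a later revision supersedes an earlier one
    return latest


# Hypothetical off time and revisions, for illustration only.
jump_time = datetime(2024, 11, 10, 16, 0)
log = [
    RacecardRevision("ST-241110-08", 1, "Jockey A", jump_time - timedelta(hours=24)),
    RacecardRevision("ST-241110-08", 2, "Jockey B", jump_time - timedelta(hours=24)),
    # A late jockey change is appended as a new row; the old row stays in the log.
    RacecardRevision("ST-241110-08", 1, "Jockey C", jump_time - timedelta(minutes=3)),
]

before = snapshot_as_of(log, "ST-241110-08", jump_time - timedelta(minutes=5))
final = snapshot_as_of(log, "ST-241110-08", jump_time)
print(before[1].jockey, final[1].jockey)  # Jockey A Jockey C
```

Because the earlier row is never modified, a back-test that replays the moment five minutes before the off sees the jockey that was declared then, not the late change.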
Odds-snapshot cadence
Time-series odds are sampled either at a fixed interval (every 10 seconds) or on event triggers (every market move), depending on the chosen delivery profile.
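To make the difference between the two sampling styles concrete, here is a small Python sketch. The function names, the tick format `(timestamp, odds)`, and the sample prices are assumptions made for illustration, and a "market move" is interpreted here simply as a change in the quoted odds; the actual delivery profiles may behave differently.

```python
from datetime import datetime, timedelta


def fixed_interval(ticks, start, end, step=timedelta(seconds=10)):
    """Sample the last known odds at start, start + step, ... up to end (inclusive)."""
    ticks = sorted(ticks)
    out, i, last, t = [], 0, None, start
    while t <= end:
        # Advance through every tick observed at or before the sample time.
        while i < len(ticks) and ticks[i][0] <= t:
            last = ticks[i][1]
            i += 1
        if last is not None:  # skip sample times before the first tick
            out.append((t, last))
        t += step
    return out


def on_change(ticks):
    """Emit a snapshot only when the odds differ from the previous tick."""
    out, last = [], None
    for ts, odds in sorted(ticks):
        if odds != last:
            out.append((ts, odds))
            last = odds
    return out


# Hypothetical raw ticks for one runner.
t0 = datetime(2024, 11, 10, 15, 59, 0)
ticks = [
    (t0 + timedelta(seconds=1), 4.2),
    (t0 + timedelta(seconds=4), 4.0),
    (t0 + timedelta(seconds=6), 4.0),   # repeated price, no move
    (t0 + timedelta(seconds=23), 3.8),
]

fixed = fixed_interval(ticks, t0, t0 + timedelta(seconds=30))
moves = on_change(ticks)
print([odds for _, odds in fixed])  # [4.0, 4.0, 3.8]
print([odds for _, odds in moves])  # [4.2, 4.0, 3.8]
```

In this toy data the fixed-interval series never records the short-lived 4.2 price, while the event-triggered series captures every move but at irregular timestamps. That trade-off is typically what guides the choice of profile for microstructure work versus regular-grid modelling.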
Use cases
- Historical model back-tests requiring strict leakage prevention.
- Market microstructure research on tote pools.
- Machine-learning feature engineering with reproducible snapshots.