Horsorion

Point-in-Time data storage

Every racecard field, rating change, and odds snapshot is stored under point-in-time (PIT) discipline. When researchers replay historical models, they see only what was publicly visible at the replayed timestamp, which eliminates look-ahead bias.

Why it matters

Many racecard fields (rating, draw, jockey, weight, scratchings) are revised multiple times before a race. Back-testing on the final cleaned version implicitly assumes the model could see those revisions in advance, so every result will be optimistically biased.

Storage model

Each mutable table (racecard, ratings, odds_snapshots, scratchings) is stored bi-temporally:

  • valid_from / valid_to — the period during which the recorded fact was true.
  • recorded_at — when Horsorion ingested the row (system clock).

Asking "what did the racecard look like 5 minutes before the off?" becomes:

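```sql
-- Racecard as it stood 5 minutes before the off.
-- jump_time is the race's off time, assumed here to be available as a column or bound parameter.
SELECT *
FROM racecard
WHERE race_id = 'ST-241110-08'
  AND valid_from <= jump_time - INTERVAL '5 minutes'                     -- fact already in effect
  AND (valid_to IS NULL OR valid_to > jump_time - INTERVAL '5 minutes')  -- and not yet superseded
  AND recorded_at <= jump_time - INTERVAL '5 minutes'                    -- and already ingested
ORDER BY horse_no;
```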
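To make the as-of query concrete, here is a minimal runnable sketch using Python's built-in sqlite3 module. It is an illustration under stated assumptions, not Horsorion's actual schema or client API: the `weight` column, the sample rows, the 13:00 off time, and the helper name `racecard_as_of` are all hypothetical, and timestamps are stored as ISO-8601 text so that string comparison matches chronological order. The rows are populated by hand, so the sketch does not show how `valid_to` gets closed in an append-only store.

```python
import sqlite3
from datetime import datetime, timedelta

def racecard_as_of(conn, race_id, as_of):
    """Return (horse_no, weight) rows visible at `as_of`, ordered by horse number."""
    t = as_of.isoformat(sep=" ")  # e.g. "2024-11-10 12:55:00"
    return conn.execute(
        """
        SELECT horse_no, weight
        FROM racecard
        WHERE race_id = ?
          AND valid_from <= ?
          AND (valid_to IS NULL OR valid_to > ?)
          AND recorded_at <= ?
        ORDER BY horse_no
        """,
        (race_id, t, t, t),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE racecard (
           race_id TEXT, horse_no INTEGER, weight INTEGER,
           valid_from TEXT, valid_to TEXT, recorded_at TEXT)"""
)
# Hypothetical sample data: horse 2's weight is revised two minutes before the off.
conn.executemany(
    "INSERT INTO racecard VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("ST-241110-08", 1, 126, "2024-11-10 09:00:00", None, "2024-11-10 09:00:05"),
        ("ST-241110-08", 2, 120, "2024-11-10 09:00:00", "2024-11-10 12:58:00", "2024-11-10 09:00:05"),
        ("ST-241110-08", 2, 122, "2024-11-10 12:58:00", None, "2024-11-10 12:58:02"),
    ],
)

jump_time = datetime(2024, 11, 10, 13, 0)  # assumed off time
five_before = racecard_as_of(conn, "ST-241110-08", jump_time - timedelta(minutes=5))
at_the_off = racecard_as_of(conn, "ST-241110-08", jump_time)
print(five_before)  # horse 2 still carries the earlier weight
print(at_the_off)   # horse 2 shows the late revision
```

A back-test that cuts off five minutes before the off should use the first result; reading the second one would leak the late weight change into the model.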

Change tracking

Every racecard, rating, or odds revision is written as a new row; Horsorion never overwrites an existing one. Clients can therefore reconstruct the snapshot for any past moment.
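The append-only idea can be sketched in a few lines of Python. This is a simplified in-memory model, not Horsorion's storage engine: the `Revision` and `AppendOnlyLog` names, the `(race_id, horse_no, field)` key shape, and the sample jockey names are hypothetical. A snapshot is rebuilt by taking, for each key, the most recently recorded value at or before the requested time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any

@dataclass(frozen=True)
class Revision:
    key: tuple            # hypothetical key, e.g. (race_id, horse_no, field)
    value: Any
    recorded_at: datetime

class AppendOnlyLog:
    """Revisions are only ever appended; existing rows are never modified."""

    def __init__(self):
        self._rows = []

    def record(self, key, value, recorded_at):
        self._rows.append(Revision(key, value, recorded_at))

    def snapshot(self, as_of):
        """Latest value per key among rows recorded at or before `as_of`."""
        latest = {}
        for row in sorted(self._rows, key=lambda r: r.recorded_at):
            if row.recorded_at <= as_of:
                latest[row.key] = row.value
        return latest

log = AppendOnlyLog()
key = ("ST-241110-08", 2, "jockey")
log.record(key, "Rider A", datetime(2024, 11, 9, 10, 0))
log.record(key, "Rider B", datetime(2024, 11, 10, 12, 58))  # late jockey change

before = log.snapshot(datetime(2024, 11, 10, 12, 55))
after = log.snapshot(datetime(2024, 11, 10, 13, 0))
```

Because the earlier row is still present after the change, `before` and `after` can both be reproduced at any later date.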

Odds-snapshot cadence

Time-series odds are sampled at a fixed interval (every 10 seconds) and on event triggers (every market move); the chosen delivery profile determines which cadence a client receives.
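The two cadences can be related with a short sketch. Assuming a client holds event-triggered ticks as `(timestamp, odds)` pairs sorted by time, a fixed 10-second grid can be derived by carrying the last observed price forward, which never looks past the grid time. The function name `resample_fixed` and the sample ticks are hypothetical, and this is not a description of how Horsorion builds its fixed-interval feed.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

def resample_fixed(ticks, start, end, step=timedelta(seconds=10)):
    """Sample sorted (timestamp, odds) ticks onto a fixed grid.

    Each grid point gets the last tick at or before it, or None if no
    tick has been seen yet. No later tick is ever consulted.
    """
    times = [t for t, _ in ticks]
    out = []
    t = start
    while t <= end:
        i = bisect_right(times, t)  # number of ticks with timestamp <= t
        out.append((t, ticks[i - 1][1] if i else None))
        t += step
    return out

# Hypothetical event-triggered ticks for one runner.
ticks = [
    (datetime(2024, 11, 10, 12, 59, 3), 4.2),
    (datetime(2024, 11, 10, 12, 59, 17), 3.9),
    (datetime(2024, 11, 10, 12, 59, 20), 3.8),
]
grid = resample_fixed(
    ticks,
    start=datetime(2024, 11, 10, 12, 59, 0),
    end=datetime(2024, 11, 10, 12, 59, 30),
)
grid_odds = [odds for _, odds in grid]
```

In this sample the 3.9 tick falls between two grid points and never appears in the fixed-interval series, which is the kind of detail microstructure work would need the event-triggered profile to see.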

Use cases

  • Historical model back-tests requiring strict leakage prevention.
  • Market microstructure research on tote pools.
  • Machine-learning feature engineering with reproducible snapshots.