Validation spec schema reference

The validation_spec.json artefact is written to 10_validation/ for every validation-aware pipeline run. It records how that specific run was validated: the holdout strategy, the split geometry, and the date windows.

Fields

Field              Type                                                             Description
mode               "validation" | "final_fit"                                       Whether this is a validation split or the final production fit.
strategy           "none" | "blocked_tail" | "rolling_origin" | "authored_holdout"  Validation strategy that produced this run.
split_label        str                                                              Human-readable identifier for the split (e.g. "blocked_tail", "split_01", "final_fit").
holdout_source     "generated_validation" | "authored_model_spec" | "none"          How the holdout mask was produced.
generated_holdout  bool                                                             Whether the holdout mask was auto-generated by meridian-tools.
run_name_suffix    str                                                              Suffix appended to the run name for this split.
holdout_shape      list[int] | null                                                 Shape of the holdout mask array. null for final-fit runs.
train_indices      list[int]                                                        Integer indices into the time axis used for training.
test_indices       list[int]                                                        Integer indices into the time axis used for testing. Empty for final-fit runs.
train_dates        list[str]                                                        Date values corresponding to train_indices.
test_dates         list[str]                                                        Date values corresponding to test_indices. Empty for final-fit runs.
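
The field list above maps naturally onto a typed dictionary. The following is a minimal sketch, not part of meridian-tools: the names ValidationSpec and load_validation_spec are hypothetical, and the loader assumes the file sits at <run_dir>/10_validation/validation_spec.json.

```python
import json
from pathlib import Path
from typing import List, Literal, Optional, TypedDict


class ValidationSpec(TypedDict):
    """Hypothetical typed view of validation_spec.json (illustration only)."""

    mode: Literal["validation", "final_fit"]
    strategy: Literal["none", "blocked_tail", "rolling_origin", "authored_holdout"]
    split_label: str
    holdout_source: Literal["generated_validation", "authored_model_spec", "none"]
    generated_holdout: bool
    run_name_suffix: str
    holdout_shape: Optional[List[int]]  # null in JSON becomes None
    train_indices: List[int]
    test_indices: List[int]
    train_dates: List[str]
    test_dates: List[str]


def load_validation_spec(run_dir: str) -> ValidationSpec:
    """Read the spec for one run and check that every documented field is present."""
    path = Path(run_dir) / "10_validation" / "validation_spec.json"
    spec = json.loads(path.read_text(encoding="utf-8"))
    # TypedDict does not validate at runtime, so check the keys explicitly.
    missing = sorted(set(ValidationSpec.__annotations__) - set(spec))
    if missing:
        raise KeyError(f"validation_spec.json is missing fields: {missing}")
    return spec
```

The loader only checks that the keys exist; it does not check value types or the invariants described later in this page.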

Mode and strategy combinations

Mode        Strategy          Holdout source        Description
validation  blocked_tail      generated_validation  Auto-generated contiguous tail holdout.
validation  rolling_origin    generated_validation  One split from an expanding-window plan.
validation  authored_holdout  authored_model_spec   User-provided holdout mask from YAML.
final_fit   none              none                  Full-sample production fit after validation.
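
A consumer of the spec could check that a file uses one of the combinations listed above. This is a sketch under the assumption that the table is exhaustive; the names ALLOWED_COMBINATIONS and is_documented_combination are hypothetical and do not come from meridian-tools.

```python
# (mode, strategy, holdout_source) triples taken from the combinations table.
ALLOWED_COMBINATIONS = {
    ("validation", "blocked_tail", "generated_validation"),
    ("validation", "rolling_origin", "generated_validation"),
    ("validation", "authored_holdout", "authored_model_spec"),
    ("final_fit", "none", "none"),
}


def is_documented_combination(spec: dict) -> bool:
    """Return True if the spec's mode, strategy, and holdout source appear together in the table."""
    key = (spec.get("mode"), spec.get("strategy"), spec.get("holdout_source"))
    return key in ALLOWED_COMBINATIONS
```

For example, a spec with mode "final_fit" and strategy "blocked_tail" would be rejected, because that pairing is not listed.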

Invariants

  • Validation-mode specs always have a non-null holdout_shape.
  • Final-fit specs always have holdout_shape: null, empty test_indices, and empty test_dates.
  • train_indices and train_dates always have matching lengths.
  • test_indices and test_dates always have matching lengths.
  • Authored-holdout specs have empty train_indices, test_indices, train_dates, and test_dates.
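
The invariants above translate directly into a small checker. The function below is an illustrative sketch, not a meridian-tools API; check_invariants is a hypothetical name, and it expects a dict that already contains every field of the schema.

```python
from typing import List


def check_invariants(spec: dict) -> List[str]:
    """Return a list of invariant violations; an empty list means the spec is consistent."""
    problems = []

    if spec["mode"] == "validation" and spec["holdout_shape"] is None:
        problems.append("validation spec has a null holdout_shape")

    if spec["mode"] == "final_fit":
        if spec["holdout_shape"] is not None:
            problems.append("final-fit spec has a non-null holdout_shape")
        if spec["test_indices"]:
            problems.append("final-fit spec has non-empty test_indices")
        if spec["test_dates"]:
            problems.append("final-fit spec has non-empty test_dates")

    if len(spec["train_indices"]) != len(spec["train_dates"]):
        problems.append("train_indices and train_dates differ in length")
    if len(spec["test_indices"]) != len(spec["test_dates"]):
        problems.append("test_indices and test_dates differ in length")

    if spec["strategy"] == "authored_holdout":
        for field in ("train_indices", "test_indices", "train_dates", "test_dates"):
            if spec[field]:
                problems.append(f"authored-holdout spec has non-empty {field}")

    return problems
```

Returning a list of messages instead of raising on the first failure lets a caller report every violation in one pass.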

Example: blocked tail validation

{
  "mode": "validation",
  "strategy": "blocked_tail",
  "split_label": "blocked_tail",
  "holdout_source": "generated_validation",
  "generated_holdout": true,
  "run_name_suffix": "blocked_tail",
  "holdout_shape": [20],
  "train_indices": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
  "test_indices": [12, 13, 14, 15, 16, 17, 18, 19],
  "train_dates": ["2024-01-01", "2024-01-08", "..."],
  "test_dates": ["2024-03-25", "2024-04-01", "..."]
}

Example: rolling origin split

{
  "mode": "validation",
  "strategy": "rolling_origin",
  "split_label": "split_01",
  "holdout_source": "generated_validation",
  "generated_holdout": true,
  "run_name_suffix": "split_01",
  "holdout_shape": [60],
  "train_indices": [0, 1, 2, "...", 51],
  "test_indices": [52, 53, 54, 55],
  "train_dates": ["2024-01-01", "..."],
  "test_dates": ["2024-12-30", "2025-01-06", "2025-01-13", "2025-01-20"]
}

Example: final fit

{
  "mode": "final_fit",
  "strategy": "none",
  "split_label": "final_fit",
  "holdout_source": "none",
  "generated_holdout": false,
  "run_name_suffix": "final_fit",
  "holdout_shape": null,
  "train_indices": [0, 1, 2, "...", 59],
  "test_indices": [],
  "train_dates": ["2024-01-01", "...", "2025-02-17"],
  "test_dates": []
}

Note on holdout mask storage

The holdout mask itself (a boolean NumPy array) is not stored in validation_spec.json because it can be large for geo-panel models (n_geos × n_times). Only its shape is recorded, in holdout_shape. The mask is injected into the Meridian model at runtime. For generated holdouts it can be reconstructed from train_indices, test_indices, and the data geometry. Authored-holdout specs leave those index lists empty, so their mask has to come from the user-provided YAML instead.
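
For generated holdouts, the reconstruction mentioned above can be sketched as follows. This is not the meridian-tools implementation: rebuild_holdout_mask is a hypothetical helper, and it assumes that time is the last axis of the mask, that True marks held-out periods, and that every geo shares the same test periods.

```python
from typing import Optional

import numpy as np


def rebuild_holdout_mask(spec: dict) -> Optional[np.ndarray]:
    """Rebuild a boolean holdout mask from holdout_shape and test_indices."""
    shape = spec["holdout_shape"]
    if shape is None:
        # Final-fit runs carry no holdout mask.
        return None
    if spec["strategy"] == "authored_holdout":
        # Authored-holdout specs have empty index lists, so nothing can be rebuilt.
        raise ValueError("authored holdout masks must be read from the authored model spec")

    mask = np.zeros(tuple(shape), dtype=bool)
    test_idx = np.asarray(spec["test_indices"], dtype=int)
    # Assumption: the time axis is last, so this covers (n_times,) and (n_geos, n_times).
    mask[..., test_idx] = True
    return mask
```

With holdout_shape [6] and test_indices [4, 5], the result is a length-6 array whose last two entries are True. With holdout_shape [2, 6], the same two columns are set for both geos.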