meridian_tools.cv
Cross-validation and holdout orchestration utilities.
Module: meridian_tools.cv
Functions
build_last_window_holdout_mask
Build a blocked-tail holdout mask for Meridian’s holdout_id.
Returns a 1-D boolean mask for national data and a 2-D (n_geos, n_times)
mask when geo_index is provided. The last holdout_size time periods are
marked as True (held out).
Parameters:
- time_index: Strictly increasing sequence of time period identifiers.
- holdout_size: Number of tail periods to hold out. Must be positive and less than the length of time_index.
- geo_index: Optional sequence of geo identifiers. If provided, the mask is broadcast across geos.
Returns: Boolean NumPy array.
Raises: ValueError for non-monotonic indices, undersized indices, or
impossible holdout sizes.
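The masking rule described above can be sketched in a few lines. This is a dependency-free illustration, not the meridian_tools implementation: the function name tail_holdout_mask is hypothetical, and plain (nested) lists stand in for the Boolean NumPy array the real function returns.

```python
from typing import Optional, Sequence


def tail_holdout_mask(
    time_index: Sequence,
    holdout_size: int,
    geo_index: Optional[Sequence] = None,
):
    """Illustrative stand-in for a blocked-tail holdout mask (hypothetical)."""
    n_times = len(time_index)
    # Reject indices that are not strictly increasing.
    if any(a >= b for a, b in zip(time_index, time_index[1:])):
        raise ValueError("time_index must be strictly increasing")
    # The holdout must be positive and leave at least one training period.
    if not 0 < holdout_size < n_times:
        raise ValueError("holdout_size must be between 1 and len(time_index) - 1")
    # False = used for training, True = held out; only the tail is held out.
    row = [i >= n_times - holdout_size for i in range(n_times)]
    if geo_index is None:
        return row  # 1-D mask for national data
    # Same row repeated for every geo: shape (n_geos, n_times).
    return [list(row) for _ in geo_index]


weeks = ["2024-01-01", "2024-01-08", "2024-01-15", "2024-01-22", "2024-01-29"]
national_mask = tail_holdout_mask(weeks, holdout_size=2)
geo_mask = tail_holdout_mask(weeks, holdout_size=2, geo_index=["north", "south"])
print(national_mask)  # [False, False, False, True, True]
```

With a geo index, every geo shares the same tail, so geo_mask here has two identical rows.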
build_rolling_origin_splits
Create expanding-window blocked time splits for rolling-origin validation.
Parameters:
- time_index: Strictly increasing sequence of time period identifiers.
- initial_train_size: Size of the first training window.
- test_size: Size of each test window.
- step_size: Step between splits. Must equal test_size. Defaults to test_size.
- max_splits: Maximum number of splits to generate. Must be >= 2 if set.
Returns: List of BlockedTimeSplit instances (at least 2).
Raises: ValueError for invalid parameters or if fewer than 2 splits
can be generated.
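To make the expanding-window behaviour concrete, here is a minimal sketch under stated assumptions. The Split dataclass and rolling_origin_splits function are hypothetical stand-ins that mirror the documented attributes and constraints. The sketch also assumes that splits are numbered from the earliest origin and that max_splits keeps the earliest ones, which the reference above does not specify.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Split:
    """Stand-in mirroring the BlockedTimeSplit attributes (hypothetical)."""
    label: str
    train_indices: Tuple[int, ...]
    test_indices: Tuple[int, ...]
    train_dates: Tuple[str, ...]
    test_dates: Tuple[str, ...]


def rolling_origin_splits(
    time_index: Sequence[str],
    initial_train_size: int,
    test_size: int,
    max_splits: Optional[int] = None,
) -> List[Split]:
    if initial_train_size < 1 or test_size < 1:
        raise ValueError("window sizes must be positive")
    if max_splits is not None and max_splits < 2:
        raise ValueError("max_splits must be >= 2 if set")
    splits: List[Split] = []
    origin = initial_train_size  # first index of the current test window
    while origin + test_size <= len(time_index):
        if max_splits is not None and len(splits) == max_splits:
            break
        train = tuple(range(origin))  # expanding: always starts at 0
        test = tuple(range(origin, origin + test_size))
        splits.append(
            Split(
                label=f"split_{len(splits) + 1:02d}",
                train_indices=train,
                test_indices=test,
                train_dates=tuple(time_index[i] for i in train),
                test_dates=tuple(time_index[i] for i in test),
            )
        )
        origin += test_size  # step_size equals test_size, so windows never overlap
    if len(splits) < 2:
        raise ValueError("fewer than 2 splits can be generated")
    return splits


periods = [f"2024-W{week:02d}" for week in range(1, 11)]  # 10 periods
splits = rolling_origin_splits(periods, initial_train_size=4, test_size=2)
for split in splits:
    print(split.label, split.train_indices[-1], split.test_indices)
```

Ten periods with an initial window of 4 and a test size of 2 give three splits, with test windows (4, 5), (6, 7), and (8, 9).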
build_validation_splits
Build deterministic split definitions from the typed validation config.
Dispatches to the appropriate split builder based on
validation_config.strategy. Returns an empty list for strategy: none.
Parameters:
- validation_config: A validated ValidationConfig instance.
- time_index: Strictly increasing sequence of time period identifiers.
Returns: List of BlockedTimeSplit instances (empty for none).
build_validation_plan
Materialise concrete validation and final-fit run specs from one config.
For strategy: none, returns a plan with no validation runs and no
final-fit run. For blocked_tail or rolling_origin, returns one
ValidationRunSpec per split plus a final_fit_run spec that trains on
the full time axis with no holdout.
Parameters:
- validation_config: A validated ValidationConfig instance.
- time_index: Strictly increasing sequence of time period identifiers.
- geo_index: Optional sequence of geo identifiers for geo-panel models.
Returns: A ValidationPlan instance.
Example:
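A minimal sketch of the plan shape described above, covering only the none and blocked_tail strategies. It uses stand-in dataclasses rather than the real meridian_tools classes. RunSpec, Plan, and make_plan are hypothetical names, the "final_fit" split label is an assumption, and representing "no holdout" as holdout_id=None is inferred from the attribute type rather than confirmed.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RunSpec:
    """Stand-in with a subset of the ValidationRunSpec attributes."""
    mode: str  # "validation" or "final_fit"
    strategy: str
    split_label: str
    holdout_id: Optional[Tuple[bool, ...]]
    train_indices: Tuple[int, ...]
    test_indices: Tuple[int, ...]


@dataclass(frozen=True)
class Plan:
    """Stand-in mirroring the ValidationPlan attributes."""
    validation_runs: Tuple[RunSpec, ...]
    final_fit_run: Optional[RunSpec]


def make_plan(strategy: str, n_times: int, holdout_size: int) -> Plan:
    if strategy == "none":
        # No validation runs and no final-fit run.
        return Plan(validation_runs=(), final_fit_run=None)
    if strategy != "blocked_tail":
        raise ValueError("this sketch only covers 'none' and 'blocked_tail'")
    cut = n_times - holdout_size
    validation = RunSpec(
        mode="validation",
        strategy=strategy,
        split_label="blocked_tail",
        holdout_id=tuple(i >= cut for i in range(n_times)),
        train_indices=tuple(range(cut)),
        test_indices=tuple(range(cut, n_times)),
    )
    final_fit = RunSpec(
        mode="final_fit",
        strategy=strategy,
        split_label="final_fit",  # assumed label
        holdout_id=None,  # trains on the full time axis
        train_indices=tuple(range(n_times)),
        test_indices=(),
    )
    return Plan(validation_runs=(validation,), final_fit_run=final_fit)


plan = make_plan("blocked_tail", n_times=6, holdout_size=2)
for run in plan.validation_runs:
    print(run.split_label, run.holdout_id)
print(plan.final_fit_run.mode, plan.final_fit_run.holdout_id)
```

Keeping the final-fit run as a separate field, rather than as the last element of validation_runs, lets a caller iterate over the validation runs without special-casing the full-sample fit.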
Classes
BlockedTimeSplit
One blocked time split for validation.
| Attribute | Type | Description |
|---|---|---|
| label | str | Human-readable split label (e.g. "blocked_tail", "split_01"). |
| train_indices | tuple[int, ...] | Integer indices into the time axis for training. |
| test_indices | tuple[int, ...] | Integer indices into the time axis for testing. |
| train_dates | tuple[str, ...] | Date values for training periods. |
| test_dates | tuple[str, ...] | Date values for test periods. |
ValidationRunSpec
One concrete validation or final-fit run derived from a split plan. Passed
to PipelineRunConfig.validation_spec to control a single pipeline
execution.
| Attribute | Type | Description |
|---|---|---|
| mode | "validation" \| "final_fit" | Run mode. |
| strategy | str | Validation strategy. |
| split_label | str | Human-readable split identifier. |
| holdout_source | str | How the holdout mask was produced. |
| generated_holdout | bool | Whether the holdout was auto-generated. |
| holdout_id | np.ndarray \| None | Concrete holdout mask (immutable). |
| train_indices | tuple[int, ...] | Training time indices. |
| test_indices | tuple[int, ...] | Test time indices. |
| train_dates | tuple[str, ...] | Training date values. |
| test_dates | tuple[str, ...] | Test date values. |
| run_name_suffix | str | Suffix for the run directory name. |
Methods:
- to_artifact_payload(): Returns the JSON-serialisable dictionary written to validation_spec.json.
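As a rough illustration of what "JSON-serialisable" implies for a spec like this, the sketch below converts tuple and mask fields to lists before dumping. SpecSketch is a hypothetical stand-in with only a few of the documented attributes. The payload keys and the example values are assumptions, and the real method may lay out the dictionary differently.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SpecSketch:
    """Stand-in with a subset of the ValidationRunSpec attributes."""
    mode: str
    split_label: str
    holdout_id: Optional[Tuple[bool, ...]]  # tuple stands in for the NumPy mask
    test_dates: Tuple[str, ...]
    run_name_suffix: str

    def to_artifact_payload(self) -> dict:
        payload = asdict(self)
        # Convert tuples to lists so the payload equals what json.loads returns.
        payload["holdout_id"] = (
            None if self.holdout_id is None else list(self.holdout_id)
        )
        payload["test_dates"] = list(self.test_dates)
        return payload


spec = SpecSketch(
    mode="validation",
    split_label="blocked_tail",
    holdout_id=(False, False, True),
    test_dates=("2024-01-15",),
    run_name_suffix="blocked_tail",  # assumed value, for illustration only
)
text = json.dumps(spec.to_artifact_payload(), indent=2)
print(text)
```

Because the payload holds only strings, booleans, lists, and None, loading the written text back with json.loads reproduces the same dictionary.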
ValidationPlan
Concrete validation runs and the separate final-fit run for one config.
| Attribute | Type | Description |
|---|---|---|
| validation_runs | tuple[ValidationRunSpec, ...] | One spec per validation split. |
| final_fit_run | ValidationRunSpec \| None | Full-sample final-fit spec. None for strategy: none. |