Python API

Public Python APIs exposed by meridian-tools.

meridian_tools.config

Configuration models and YAML loading for meridian-tools.

Module: meridian_tools.config

Functions

`load_yaml_config`

def load_yaml_config(path: str | Path) -> MeridianToolsConfig

Load and validate a meridian-tools YAML file.

Parameters:

path — Path to the YAML configuration file.

Returns: A validated MeridianToolsConfig instance.

Raises: pydantic.ValidationError if the YAML content does not match the schema.

Example:

from meridian_tools.config import load_yaml_config

config = load_yaml_config("project.yml")
print(config.project.name)
print(config.data.path)
print(config.validation.strategy)

Classes

`MeridianToolsConfig`

class MeridianToolsConfig(BaseModel)

Full YAML configuration for one meridian-tools run. This is the top-level model returned by load_yaml_config.

Attribute	Type	Default
`project`	`ProjectConfig`	`ProjectConfig()`
`data`	`CsvDataConfig`	required
`model_spec`	`ModelSpecConfig`	`ModelSpecConfig()`
`fit`	`FitConfig`	`FitConfig()`
`validation`	`ValidationConfig`	`ValidationConfig()`
`exports`	`ExportsConfig`	`ExportsConfig()`
`response_curves`	`ResponseCurvesConfig \| None`	`None`
`optimisation`	`OptimisationConfig \| None`	`None`

`PipelineRunConfig`

@dataclass(frozen=True)
class PipelineRunConfig

Runtime options that sit outside the YAML file. Passed to run_pipeline.

Attribute	Type	Default	Description
`config_path`	`Path`	required	Path to the YAML config file.
`output_dir`	`Path`	`Path("runs")`	Directory for run output.
`run_name`	`str \| None`	`None`
`validation_spec`	`ValidationRunSpec \| None`	`None`
`apply_run_name_suffix`	`bool`	`True`	Whether to append validation-aware suffixes to the run name.
`source_config_path`	`Path \| None`	`None`

`ProjectConfig`

class ProjectConfig(BaseModel)

Attribute	Type	Default
`name`	`str`	`"meridian-project"`

`CsvDataConfig`

class CsvDataConfig(BaseModel)

CSV loader configuration compatible with Meridian’s CsvDataLoader.

Attribute	Type	Default
`path`	`Path`	required
`kpi_type`	`Literal["revenue", "non-revenue"]`	`"revenue"`
`coord_to_columns`	`dict[str, Any]`	required
`media_to_channel`	`dict[str, str] \| None`	`None`
`media_spend_to_channel`	`dict[str, str] \| None`	`None`
`reach_to_channel`	`dict[str, str] \| None`	`None`
`frequency_to_channel`	`dict[str, str] \| None`	`None`
`rf_spend_to_channel`	`dict[str, str] \| None`	`None`
`organic_reach_to_channel`	`dict[str, str] \| None`	`None`
`organic_frequency_to_channel`	`dict[str, str] \| None`	`None`

`ModelSpecConfig`

class ModelSpecConfig(BaseModel)

Attribute	Type	Default
`kwargs`	`dict[str, Any]`	`{}`
`priors`	`PriorsConfig \| None`	`None`

`DistributionSpec`

class DistributionSpec(BaseModel)

One YAML-authored TensorFlow Probability distribution.

Supported distribution names are Normal, LogNormal, TruncatedNormal, and Beta. Required parameters depend on the distribution, and any extra parameters are rejected.
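The "extra parameters are rejected" rule can be illustrated with a small plain-Python check. This is only a sketch: the parameter names are assumed from TensorFlow Probability's constructor arguments rather than taken from the meridian-tools schema, and `ALLOWED_PARAMS` and `check_distribution` are hypothetical names that do not exist in the package.

```python
# Hypothetical sketch. Parameter names are assumed from TensorFlow Probability;
# the real DistributionSpec may require or allow a different set.
ALLOWED_PARAMS = {
    "Normal": {"loc", "scale"},
    "LogNormal": {"loc", "scale"},
    "TruncatedNormal": {"loc", "scale", "low", "high"},
    "Beta": {"concentration1", "concentration0"},
}


def check_distribution(name: str, params: dict) -> None:
    """Raise ValueError for an unsupported name, or missing or extra parameters."""
    if name not in ALLOWED_PARAMS:
        raise ValueError(f"unsupported distribution: {name!r}")
    expected = ALLOWED_PARAMS[name]
    given = set(params)
    missing = expected - given
    extra = given - expected
    if missing:
        raise ValueError(f"{name}: missing parameters {sorted(missing)}")
    if extra:
        raise ValueError(f"{name}: unexpected parameters {sorted(extra)}")


# A well-formed spec passes silently.
check_distribution("LogNormal", {"loc": 0.2, "scale": 0.9})
```

In this sketch a `Normal` spec that also carries `low` would be rejected, which mirrors the documented behaviour for extra parameters.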

`ChannelPriorSpec`

class ChannelPriorSpec(BaseModel)

A media prior with a default distribution and optional per-channel overrides.

Attribute	Type	Default
`default`	`DistributionSpec`	required
`channels`	`dict[str, DistributionSpec]`	`{}`

`PriorsConfig`

class PriorsConfig(BaseModel)

YAML-friendly subset of Meridian media prior distributions.

Attribute	Type	Default
`roi_m`	`DistributionSpec \| ChannelPriorSpec \| None`	`None`
`mroi_m`	`DistributionSpec \| ChannelPriorSpec \| None`	`None`
`alpha_m`	`DistributionSpec \| ChannelPriorSpec \| None`	`None`

`FitConfig`

class FitConfig(BaseModel)

Sampling configuration for Meridian posterior fitting.

Attribute	Type	Default
`sample_prior_draws`	`PositiveInt \| None`	`None`
`n_chains`	`PositiveInt \| list[PositiveInt]`
`n_adapt`	`PositiveInt`	`500`
`n_burnin`	`PositiveInt`	`500`
`n_keep`	`PositiveInt`	`1000`
`seed`	`int \| list[int] \| None`	`None`
`max_tree_depth`	`PositiveInt`	`10`
`max_energy_diff`	`float`	`500.0`
`unrolled_leapfrog_steps`	`PositiveInt`	`1`
`parallel_iterations`	`PositiveInt`	`10`

`ValidationConfig`

class ValidationConfig(BaseModel)

Validation and holdout orchestration settings.

Attribute	Type	Default
`strategy`	`Literal["none", "blocked_tail", "rolling_origin"]`	`"none"`
`holdout_size`	`PositiveInt \| None`	`None`
`initial_train_size`	`PositiveInt \| None`	`None`
`test_size`	`PositiveInt \| None`	`None`
`step_size`	`PositiveInt \| None`	`None`
`max_splits`	`PositiveInt \| None`	`None`

See the validation guide for cross-field validation rules.

`ExportsConfig`

class ExportsConfig(BaseModel)

Attribute	Type	Default
`use_kpi`	`bool`	`False`
`batch_size`	`PositiveInt`	`1000`
`export_predictive_accuracy`	`bool`	`True`
`export_review_summary`	`bool`	`True`
`export_model_selection`	`bool`	`True`
`export_plots`	`bool`	`True`

`ResponseCurvesConfig`

class ResponseCurvesConfig(BaseModel)

Attribute	Type	Default	Constraint
`spend_multipliers`	`list[float]`	required	Non-empty, all `>= 0`
`use_posterior`	`bool`	`True`
`by_reach`	`bool`	`True`
`use_optimal_frequency`	`bool`	`False`
`confidence_level`	`float`	`0.9`	`0 < x < 1`

`OptimisationConfig`

class OptimisationConfig(BaseModel)

Attribute	Type	Default	Constraint
`start_date`	`str`	required	ISO `YYYY-MM-DD`
`end_date`	`str`	required	ISO `YYYY-MM-DD`, `>= start_date`
`budget`	`OptimisationBudgetConfig`	required
`use_posterior`	`bool`	`True`
`use_optimal_frequency`	`bool`	`True`
`confidence_level`	`float`	`0.9`	`0 < x < 1`
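The constraints in the table above can be sketched with the standard library. This is a minimal illustration of the documented rules (strict `YYYY-MM-DD` strings, `end_date >= start_date`, `0 < confidence_level < 1`), not the model's actual validators; `parse_iso_day` and `check_optimisation_window` are hypothetical helper names.

```python
import re
from datetime import date

_ISO_DAY = re.compile(r"\d{4}-\d{2}-\d{2}")


def parse_iso_day(value: str) -> date:
    """Parse a strict YYYY-MM-DD string, raising ValueError otherwise."""
    if not _ISO_DAY.fullmatch(value):
        raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    return date.fromisoformat(value)  # also rejects impossible dates


def check_optimisation_window(
    start_date: str, end_date: str, confidence_level: float = 0.9
) -> None:
    """Apply the three documented constraints; raise ValueError on violation."""
    start = parse_iso_day(start_date)
    end = parse_iso_day(end_date)
    if end < start:
        raise ValueError("end_date must be on or after start_date")
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must lie strictly between 0 and 1")


check_optimisation_window("2024-01-01", "2024-03-31")  # passes silently
```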

`OptimisationBudgetConfig`

class OptimisationBudgetConfig(BaseModel)

Attribute	Type	Default
`mode`	`Literal["fixed_total", "relative_reference_window_total"]`	required
`value`	`PositiveFloat`	required

meridian_tools.runner

Pipeline orchestration for meridian-tools.

Module: meridian_tools.runner

Functions

`run_pipeline`

def run_pipeline(
    run_config: PipelineRunConfig,
    *,
    progress_callback: Callable | None = None,
) -> PipelineRunResult

Execute the full meridian-tools staged pipeline.

The pipeline proceeds through the following stages in order:

00_run_metadata — Archive source and resolved configs and write input_data_provenance.json.
10_validation — Write validation spec (if validation-aware).
20_model_fit — Build input data, construct the Meridian model, sample prior and posterior.
30_model_assessment — Export diagnostics, model summary, and model selection outputs.
40_decomposition — Export summary metrics.
60_response_curves — Export response curves (if configured).
70_optimisation — Export optimisation results (if configured).

The manifest is written to disk after each stage, so a failure mid-pipeline leaves a readable partial manifest.

Before creating the dated run directory, the runner enforces three separate pre-run checks:

dependency preflight (google-meridian[schema], optional plot support)
validation-execution contract checks for incompatible single-run validation combinations
a narrow wrapper-owned config/data preflight over the resolved input file and authored column mapping

The wrapper-owned preflight checks exactly:

resolved data.path exists and is a regular file
the CSV header row can be read
the parsed header is non-empty
no parsed header cell is blank after trimming whitespace
every authored scalar entry in data.coord_to_columns exists in the header
every authored list member in data.coord_to_columns exists in the header
every authored key in media_to_channel, media_spend_to_channel, reach_to_channel, frequency_to_channel, rf_spend_to_channel, organic_reach_to_channel, and organic_frequency_to_channel exists in the header
authored list-valued coord families are non-empty
authored mapping fields above are non-empty
supported media/RF family groups are complete when authored

Header matching is exact and case-sensitive. Anything outside this closed list of checks remains Meridian-owned validation.
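The header-related checks can be sketched with Python's `csv` module. This is an illustration of the rules listed above, not the runner's implementation: `check_header` is a hypothetical name, it raises a plain `ValueError` rather than `ConfigPreflightError`, and it reads from an in-memory string instead of the resolved data.path so the example stays self-contained.

```python
import csv
import io


def check_header(csv_text: str, required_columns: list) -> list:
    """Return the parsed header, or raise ValueError if a header check fails."""
    header = next(csv.reader(io.StringIO(csv_text)), None)
    if not header:
        raise ValueError("CSV header is empty or unreadable")
    if any(cell.strip() == "" for cell in header):
        raise ValueError("CSV header contains a blank cell")
    # Exact, case-sensitive membership test against the parsed header.
    missing = [name for name in required_columns if name not in header]
    if missing:
        raise ValueError(f"columns not found in header: {missing}")
    return header


sample = "time,geo,tv_spend\n2024-01-01,US-CA,100.0\n"
print(check_header(sample, ["time", "tv_spend"]))
```

Because matching is case-sensitive, a file whose header reads `Time` would fail a mapping that names `time`.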

Parameters:

run_config — A PipelineRunConfig specifying the execution config path, output directory, run name, optional validation spec, and optional source_config_path for metadata archival.
progress_callback — Optional callable invoked on stage lifecycle events. The callback receives keyword arguments:
- stage_name (str) — stage identifier.
- event (str) — one of "started", "completed", "skipped", or "failed".
- stage_index (int) — 1-based position in the pipeline.
- stage_count (int) — total number of stages.
- elapsed_seconds (float) — wall-clock time (present for "completed" and "failed" events).
- message (str) — human-readable detail (present for "skipped" and "failed" events).
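A callback that accepts the keyword arguments listed above might look like the following minimal sketch. `log_progress` is a hypothetical name, and the `**_ignored` catch-all is a defensive assumption in case further keywords are ever passed; the documented contract does not require it.

```python
from typing import Optional


def log_progress(
    *,
    stage_name: str,
    event: str,
    stage_index: int,
    stage_count: int,
    elapsed_seconds: Optional[float] = None,
    message: Optional[str] = None,
    **_ignored: object,
) -> str:
    """Format one stage lifecycle event as a single log line and print it."""
    line = f"[{stage_index}/{stage_count}] {stage_name}: {event}"
    if elapsed_seconds is not None:
        line += f" ({elapsed_seconds:.1f}s)"
    if message:
        line += f" - {message}"
    print(line)
    return line


# Simulated events, as the runner might emit them.
log_progress(stage_name="00_run_metadata", event="started", stage_index=1, stage_count=7)
log_progress(
    stage_name="60_response_curves",
    event="skipped",
    stage_index=6,
    stage_count=7,
    message="not configured",
)
```

Such a function would be passed as `run_pipeline(run_config, progress_callback=log_progress)`.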

Returns: A PipelineRunResult with the run directory and manifest path.

Raises:

RuntimeError if Meridian schema support is unavailable (checked at preflight before the run directory is created).
RuntimeError if exports.export_plots is true but vl-convert-python is not installed (also checked at preflight).
ValidationExecutionContractError if the requested single-run validation execution path is incompatible with the authored config.
ConfigPreflightError if wrapper-owned config/data preflight fails before run-directory creation.
PipelineRunFailure if any exception occurs after the dated run directory already exists.

Example:

from pathlib import Path
from meridian_tools.config import PipelineRunConfig
from meridian_tools.runner import run_pipeline

result = run_pipeline(
    PipelineRunConfig(
        config_path=Path("project.yml"),
        output_dir=Path("runs"),
    )
)

print(result.run_dir)
print(result.manifest_path)

Classes

`PipelineRunResult`

@dataclass(frozen=True)
class PipelineRunResult

Disk locations for one completed meridian-tools run.

Attribute	Type	Description
`run_dir`	`Path`	Absolute path to the run directory.
`manifest_path`	`Path`	Absolute path to `run_manifest.json`.

`ValidationExecutionContractError`

class ValidationExecutionContractError(ValueError)

Raised when the requested single-run validation execution path is incompatible with the authored config. Current examples include direct rolling_origin execution through run_pipeline(...) and combining PipelineRunConfig.validation_spec with authored model_spec.kwargs.holdout_id.

`ConfigPreflightError`

class ConfigPreflightError(ValueError)

Raised when the wrapper-owned Phase 10 preflight fails before run-directory creation. This covers only the closed wrapper preflight boundary, not full Meridian model validation.

`PipelineRunFailure`

class PipelineRunFailure(RuntimeError)

Raised when a run fails after the dated run directory already exists. The original underlying exception is preserved via __cause__.

Attribute	Type	Description
`run_dir`	`Path`	Absolute failed run directory.
`manifest_path`	`Path`	Absolute path to the failed run manifest.
`stage_name`	`str \| None`	Failing stage name when one is available.

Constants

Stage names

Constant	Value
`STAGE_RUN_METADATA`	`"00_run_metadata"`
`STAGE_VALIDATION`	`"10_validation"`
`STAGE_MODEL_FIT`	`"20_model_fit"`
`STAGE_MODEL_ASSESSMENT`	`"30_model_assessment"`
`STAGE_DECOMPOSITION`	`"40_decomposition"`
`STAGE_RESPONSE_CURVES`	`"60_response_curves"`
`STAGE_OPTIMISATION`	`"70_optimisation"`

`PIPELINE_STAGE_ORDER`

PIPELINE_STAGE_ORDER: tuple[str, ...] = (
    "00_run_metadata",
    "10_validation",
    "20_model_fit",
    "30_model_assessment",
    "40_decomposition",
    "60_response_curves",
    "70_optimisation",
)

The numbering gap at 50 is intentional, reserving space for future stages.

meridian_tools.cv

Cross-validation and holdout orchestration utilities.

Module: meridian_tools.cv

Functions

`build_last_window_holdout_mask`

def build_last_window_holdout_mask(
    time_index: Sequence[Any],
    holdout_size: int,
    geo_index: Sequence[Any] | None = None,
) -> np.ndarray

Build a blocked-tail holdout mask for Meridian’s holdout_id.

Returns a 1-D boolean mask for national data and a 2-D (n_geos, n_times) mask when geo_index is provided. The last holdout_size time periods are marked as True (held out).

Parameters:

time_index — Strictly increasing sequence of time period identifiers.
holdout_size — Number of tail periods to hold out. Must be positive and less than the length of time_index.
geo_index — Optional sequence of geo identifiers. If provided, the mask is broadcast across geos.

Returns: Boolean NumPy array.

Raises: ValueError for non-monotonic indices, undersized indices, or impossible holdout sizes.
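The shape and contents of the mask can be illustrated without NumPy. The sketch below uses nested lists as a stand-in for the returned array and omits the monotonicity check; `tail_holdout_mask` is a hypothetical name, not the library function.

```python
from typing import Any, Optional, Sequence


def tail_holdout_mask(
    time_index: Sequence[Any],
    holdout_size: int,
    geo_index: Optional[Sequence[Any]] = None,
):
    """Return a list-based boolean mask with the last holdout_size periods True."""
    n_times = len(time_index)
    if not 0 < holdout_size < n_times:
        raise ValueError("holdout_size must be positive and smaller than time_index")
    row = [i >= n_times - holdout_size for i in range(n_times)]
    if geo_index is None:
        return row  # national: one entry per time period
    # Geo panel: the same row repeated once per geo, i.e. (n_geos, n_times).
    return [list(row) for _ in geo_index]


weeks = ["2024-01-01", "2024-01-08", "2024-01-15", "2024-01-22"]
print(tail_holdout_mask(weeks, 1))
# [False, False, False, True]
```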

`build_rolling_origin_splits`

def build_rolling_origin_splits(
    time_index: Sequence[Any],
    *,
    initial_train_size: int,
    test_size: int,
    step_size: int | None = None,
    max_splits: int | None = None,
) -> list[BlockedTimeSplit]

Create expanding-window blocked time splits for rolling-origin validation.

Parameters:

time_index — Strictly increasing sequence of time period identifiers.
initial_train_size — Size of the first training window.
test_size — Size of each test window.
step_size — Step between splits. Must equal test_size. Defaults to test_size.
max_splits — Maximum number of splits to generate. Must be >= 2 if set.

Returns: List of BlockedTimeSplit instances (at least 2).

Raises: ValueError for invalid parameters or if fewer than 2 splits can be generated.
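To make the expanding-window scheme concrete, the sketch below generates index pairs only. It is an approximation under stated assumptions: `rolling_origin_indices` is a hypothetical name, the step is fixed to `test_size` as the parameter notes require, parameter validation is omitted, and when `max_splits` truncates the list the sketch keeps the earliest splits, which the documentation above does not specify.

```python
from typing import List, Optional, Tuple

Split = Tuple[Tuple[int, ...], Tuple[int, ...]]


def rolling_origin_indices(
    n_times: int,
    *,
    initial_train_size: int,
    test_size: int,
    max_splits: Optional[int] = None,
) -> List[Split]:
    """Return (train_indices, test_indices) pairs with an expanding train window."""
    splits: List[Split] = []
    train_end = initial_train_size
    while train_end + test_size <= n_times:
        if max_splits is not None and len(splits) >= max_splits:
            break
        train = tuple(range(train_end))  # always starts at index 0
        test = tuple(range(train_end, train_end + test_size))
        splits.append((train, test))
        train_end += test_size  # step_size equals test_size
    if len(splits) < 2:
        raise ValueError("fewer than 2 splits can be generated")
    return splits


for train, test in rolling_origin_indices(10, initial_train_size=4, test_size=2):
    print(len(train), test)
```

With 10 periods this yields three splits whose training windows cover 4, 6, and 8 periods, each followed by a 2-period test window.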

`build_validation_splits`

def build_validation_splits(
    validation_config: ValidationConfig,
    time_index: Sequence[Any],
) -> list[BlockedTimeSplit]

Build deterministic split definitions from the typed validation config.

Dispatches to the appropriate split builder based on validation_config.strategy. Returns an empty list for strategy: none.

Parameters:

validation_config — A validated ValidationConfig instance.
time_index — Strictly increasing sequence of time period identifiers.

Returns: List of BlockedTimeSplit instances (empty for none).

`build_validation_plan`

def build_validation_plan(
    validation_config: ValidationConfig,
    time_index: Sequence[Any],
    geo_index: Sequence[Any] | None = None,
) -> ValidationPlan

Materialise concrete validation and final-fit run specs from one config.

For strategy: none, returns a plan with no validation runs and no final-fit run. For blocked_tail or rolling_origin, returns one ValidationRunSpec per split plus a final_fit_run spec that trains on the full time axis with no holdout.

Parameters:

validation_config — A validated ValidationConfig instance.
time_index — Strictly increasing sequence of time period identifiers.
geo_index — Optional sequence of geo identifiers for geo-panel models.

Returns: A ValidationPlan instance.

Example:

from meridian_tools.config import load_yaml_config
from meridian_tools.cv import build_validation_plan

config = load_yaml_config("project.yml")
plan = build_validation_plan(
    config.validation,
    time_index=["2024-01-01", "2024-01-08", "..."],
    geo_index=["US-CA", "US-NY"],
)

for run_spec in plan.validation_runs:
    print(run_spec.split_label, len(run_spec.train_indices), len(run_spec.test_indices))

if plan.final_fit_run:
    print("Final fit:", plan.final_fit_run.split_label)

Classes

`BlockedTimeSplit`

@dataclass(frozen=True)
class BlockedTimeSplit

One blocked time split for validation.

Attribute	Type	Description
`label`	`str`	Human-readable split label (e.g. `"blocked_tail"`, `"split_01"`).
`train_indices`	`tuple[int, ...]`	Integer indices into the time axis for training.
`test_indices`	`tuple[int, ...]`	Integer indices into the time axis for testing.
`train_dates`	`tuple[str, ...]`	Date values for training periods.
`test_dates`	`tuple[str, ...]`	Date values for test periods.

`ValidationRunSpec`

@dataclass(frozen=True)
class ValidationRunSpec

One concrete validation or final-fit run derived from a split plan. Passed to PipelineRunConfig.validation_spec to control a single pipeline execution.

Attribute	Type	Description
`mode`	`"validation"` \| `"final_fit"`	Run mode.
`strategy`	`str`	Validation strategy.
`split_label`	`str`	Human-readable split identifier.
`holdout_source`	`str`	How the holdout mask was produced.
`generated_holdout`	`bool`	Whether the holdout was auto-generated.
`holdout_id`	`np.ndarray \| None`	Concrete holdout mask (immutable).
`train_indices`	`tuple[int, ...]`	Training time indices.
`test_indices`	`tuple[int, ...]`	Test time indices.
`train_dates`	`tuple[str, ...]`	Training date values.
`test_dates`	`tuple[str, ...]`	Test date values.
`run_name_suffix`	`str`	Suffix for the run directory name.

Methods:

to_artifact_payload() — Returns the JSON-serialisable dictionary written to validation_spec.json.

`ValidationPlan`

@dataclass(frozen=True)
class ValidationPlan

Concrete validation runs and the separate final-fit run for one config.

Attribute	Type	Description
`validation_runs`	`tuple[ValidationRunSpec, ...]`	One spec per validation split.
`final_fit_run`	`ValidationRunSpec \| None`	Full-sample final-fit spec. `None` for `strategy: none`.

meridian_tools.exports

Helpers for manifest-backed Meridian export families.

Module: meridian_tools.exports

Functions

`export_model_fit_artifacts`

def export_model_fit_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    fit_config: FitConfig,
    meridian_version: str | None,
) -> dict[str, Path]

Write the stable model-fit artefact set.

Produces:

meridian_model.binpb — Serialised Meridian model (Protocol Buffers).
fit_metadata.json — Records FitConfig values and Meridian version.
prior_distributions.json — Records applied Meridian prior distributions.

Parameters:

model — Fitted Meridian model instance.
output_dir — Directory to write artefacts to.
fit_config — The FitConfig used for this run.
meridian_version — Meridian version string (or None).

Returns: Dictionary mapping artefact names to file paths.

`extract_prior_summary`

def extract_prior_summary(model: Any) -> dict[str, Any]

Return the applied prior distributions from a Meridian model in JSON-serialisable form.

The summary is based on the constructed model’s broadcast prior distribution, so it records what Meridian will use rather than only echoing YAML input.

Parameters:

model — Constructed Meridian model instance.

Returns: Dictionary keyed by Meridian prior parameter name.

`export_model_assessment_artifacts`

def export_model_assessment_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    exports_config: ExportsConfig,
    diagnostics_exporter: Callable,
    model_selection_exporter: Callable,
) -> dict[str, Path]

Write the stable assessment artefact set.

Produces diagnostics bundle, model results summary HTML, and optionally model selection outputs (LOO/WAIC) and diagnostic plots.

Parameters:

model — Fitted Meridian model instance.
output_dir — Directory to write artefacts to.
exports_config — Export switches.
diagnostics_exporter — Callable for diagnostics bundle export (typically export_diagnostics_bundle).
model_selection_exporter — Callable for model selection export.

Returns: Dictionary mapping artefact names to file paths.

`export_decomposition_artifacts`

def export_decomposition_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    exports_config: ExportsConfig,
) -> dict[str, Path]

Write the stable decomposition artefact set.

Produces:

summary_metrics.nc — NetCDF decomposition dataset.
summary_metrics.csv — Flattened tabular decomposition.
plots/ — Channel contribution, waterfall, spend vs. contribution, and ROI charts (when export_plots: true).

Parameters:

model — Fitted Meridian model instance.
output_dir — Directory to write artefacts to.
exports_config — Export switches.

Returns: Dictionary mapping artefact names to file paths.

`export_response_curve_artifacts`

def export_response_curve_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    response_curves_config: ResponseCurvesConfig,
    exports_config: ExportsConfig,
) -> dict[str, Path]

Write the stable response-curve artefact set.

Produces:

response_curves.nc — NetCDF response curve dataset.
response_curves.csv — Flattened tabular response curves.
plots/response_curves_plot.png — Response curve visualisation (when export_plots: true).

Parameters:

model — Fitted Meridian model instance.
output_dir — Directory to write artefacts to.
response_curves_config — Response curves settings from YAML.
exports_config — Export switches.

Returns: Dictionary mapping artefact names to file paths.

`export_optimisation_artifacts`

def export_optimisation_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    optimisation_config: OptimisationConfig,
    exports_config: ExportsConfig,
) -> dict[str, Path]

Write the stable optimisation artefact set.

Produces:

optimisation_summary.html — Meridian optimisation summary report.
optimised_data.nc / .csv — Optimised budget allocation.
nonoptimised_data.nc / .csv — Baseline allocation.
optimisation_grid.csv — Full optimisation grid.
plots/ — Delta, allocation, spend, and response curve charts (when export_plots: true).

For budget.mode: relative_reference_window_total, the effective budget is computed as value × total_spend_in_reference_window using the model’s media and RF spend data within the start_date–end_date window.
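The budget arithmetic can be shown as a short worked example. `effective_budget` is a hypothetical helper and the spend figure is invented for illustration; treating `fixed_total` as "use value unchanged" is an assumption drawn from the mode name rather than from the text above.

```python
def effective_budget(mode: str, value: float, reference_window_spend: float) -> float:
    """Resolve the budget to optimise under the two documented modes."""
    if mode == "relative_reference_window_total":
        # Documented rule: value x total spend inside the reference window.
        return value * reference_window_spend
    if mode == "fixed_total":
        # Assumption: value is already the absolute budget.
        return value
    raise ValueError(f"unknown budget mode: {mode!r}")


# Hypothetical numbers: 200,000 spent in the window, budget set to 125% of it.
print(effective_budget("relative_reference_window_total", 1.25, 200_000.0))
# 250000.0
```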

Parameters:

model — Fitted Meridian model instance.
output_dir — Directory to write artefacts to.
optimisation_config — Optimisation settings from YAML.
exports_config — Export switches.

Returns: Dictionary mapping artefact names to file paths.

`ensure_meridian_schema_support`

def ensure_meridian_schema_support() -> Callable

Return Meridian’s schema serialiser or raise a stable runtime error.

Checks for meridian.schema.serde.meridian_serde.save_meridian. If the import fails, raises RuntimeError with guidance to install google-meridian[schema].

Returns: The save_meridian callable.

`ensure_altair_png_support`

def ensure_altair_png_support() -> Any

Return the Altair PNG backend or raise a stable runtime error.

Checks for vl_convert. If the import fails, raises RuntimeError with guidance to install vl-convert-python.

Returns: The vl_convert module.

meridian_tools.diagnostics

Diagnostics extraction and export helpers for Meridian runs.

Module: meridian_tools.diagnostics

Functions

`predictive_accuracy_frame`

def predictive_accuracy_frame(
    meridian_model: Any,
    *,
    use_kpi: bool = False,
    selected_geos: Sequence[str] | None = None,
    selected_times: Sequence[str] | None = None,
    batch_size: int = 1000,
) -> pd.DataFrame

Return Meridian predictive accuracy as a flat DataFrame.

Uses Meridian’s Analyzer.predictive_accuracy internally and flattens the resulting xarray dataset into a pandas DataFrame.

Parameters:

meridian_model — Fitted Meridian model instance.
use_kpi — Use KPI-based metrics.
selected_geos — Optional subset of geos to evaluate.
selected_times — Optional subset of time periods to evaluate.
batch_size — Batch size for Meridian analysis.

Returns: A pandas DataFrame with one row per observation.

`review_summary_dict`

def review_summary_dict(
    meridian_model: Any,
    *,
    selected_geos: Sequence[str] | None = None,
    selected_times: Sequence[str] | None = None,
) -> dict[str, Any]

Run Meridian’s review battery and return a JSON-ready dictionary.

Uses Meridian’s ModelReviewer internally. All non-primitive values (dataclasses, enums, NumPy arrays) are recursively converted to JSON-serialisable types.

Parameters:

meridian_model — Fitted Meridian model instance.
selected_geos — Optional subset of geos.
selected_times — Optional subset of time periods.

Returns: A JSON-serialisable dictionary.

`export_diagnostics_bundle`

def export_diagnostics_bundle(
    meridian_model: Any,
    output_dir: str | Path,
    *,
    use_kpi: bool = False,
    export_predictive_accuracy: bool = True,
    export_review_summary: bool = True,
    selected_geos: Sequence[str] | None = None,
    selected_times: Sequence[str] | None = None,
    batch_size: int = 1000,
) -> dict[str, Path]

Write predictive accuracy, review summary, and bundle manifest to disk.

The bundle manifest (diagnostics_bundle.json) records the status of each sub-export ("exported" or "disabled") along with the file name and format. This provides a stable machine-readable contract for downstream consumers.

When an export is disabled, any pre-existing file from a previous run at the same path is removed to prevent stale data.

Parameters:

meridian_model — Fitted Meridian model instance.
output_dir — Directory to write artefacts to.
use_kpi — Use KPI-based metrics.
export_predictive_accuracy — Write predictive_accuracy.csv.
export_review_summary — Write review_summary.json.
selected_geos — Not supported in current scope (raises ValueError).
selected_times — Not supported in current scope (raises ValueError).
batch_size — Batch size for Meridian analysis.

Returns: Dictionary mapping artefact names to file paths. Always includes "diagnostics_bundle". Conditionally includes "predictive_accuracy" and "review_summary".

Example:

from meridian_tools.diagnostics import export_diagnostics_bundle

artifacts = export_diagnostics_bundle(
    fitted_model,
    "output/30_model_assessment",
    export_predictive_accuracy=True,
    export_review_summary=True,
)

print(artifacts["diagnostics_bundle"])
# Path("output/30_model_assessment/diagnostics_bundle.json")

meridian_tools.model_selection

Model-selection helpers layered on top of ArviZ and Meridian.

Module: meridian_tools.model_selection

Functions

`has_log_likelihood`

def has_log_likelihood(candidate: Any) -> bool

Return whether the candidate exposes a non-empty log_likelihood group.

Accepts either an ArviZ InferenceData or any object with an .inference_data attribute (e.g. a fitted Meridian model).

Parameters:

candidate — ArviZ InferenceData or fitted Meridian model.

Returns: True if a non-empty log_likelihood group exists.

`compute_loo`

def compute_loo(
    candidate: Any,
    *,
    pointwise: bool = False,
    scale: str = "log",
) -> InformationCriterionResult

Compute PSIS-LOO for a Meridian model or InferenceData.

If the candidate is a fitted Meridian model without a log_likelihood group, the function automatically reconstructs it through attach_log_likelihood.

Parameters:

candidate — Fitted Meridian model or ArviZ InferenceData with log_likelihood.
pointwise — Include per-observation LOO values and Pareto k diagnostics.
scale — Scale for ELPD computation ("log", "negative_log", or "deviance").

Returns: An InformationCriterionResult with kind="loo".

Raises: ModelSelectionError if log-likelihood cannot be obtained.

`compute_waic`

def compute_waic(
    candidate: Any,
    *,
    pointwise: bool = False,
    scale: str = "log",
) -> InformationCriterionResult

Compute WAIC for a Meridian model or InferenceData.

Same automatic log-likelihood reconstruction as compute_loo.

Parameters:

candidate — Fitted Meridian model or ArviZ InferenceData with log_likelihood.
pointwise — Include per-observation WAIC values.
scale — Scale for ELPD computation.

Returns: An InformationCriterionResult with kind="waic".

Raises: ModelSelectionError if log-likelihood cannot be obtained.

`compare_models`

def compare_models(
    candidates: Mapping[str, Any],
    *,
    ic: str = "loo",
    scale: str = "log",
) -> pd.DataFrame

Compare multiple models with ArviZ compare.

Parameters:

candidates — Dictionary mapping model names to fitted Meridian models or InferenceData objects.
ic — Information criterion to use: "loo" or "waic".
scale — Scale for ELPD computation.

Returns: A pandas DataFrame with columns: model, rank, elpd_{ic}, p_{ic}, elpd_diff, weight, se, dse, warning, scale. Ranked by ELPD (rank 0 is best).

For a single candidate, returns a one-row DataFrame with rank=0, elpd_diff=0.0, and weight=1.0.

Raises:

ValueError if ic is not "loo" or "waic", or if candidates is empty.
ModelSelectionError if any candidate lacks log-likelihood data.

Classes

`ModelSelectionError`

class ModelSelectionError(RuntimeError)

Raised when information criteria cannot be computed.

Property	Type	Description
`reason_code`	`str \| None`	Structured code identifying the failure reason.

Known reason codes:

Code	Meaning
`missing_log_likelihood_group`	`InferenceData` has no `log_likelihood` group and cannot be reconstructed.
`holdout_fit_unsupported`	Model was fitted with a holdout mask.
`requires_fitted_meridian_model`	Missing posterior samples or ArviZ `InferenceData`.
`meridian_internal_seam_incompatible`	Meridian version lacks required reconstruction methods.

`InformationCriterionResult`

@dataclass(frozen=True)
class InformationCriterionResult

Summary of one information-criterion computation.

Attribute	Type	Description
`kind`	`str`	`"loo"` or `"waic"`.
`summary`	`dict[str, Any]`	Summary statistics (ELPD, p, SE, etc.).
`pointwise`	`pd.DataFrame \| None`	Per-observation values (if `pointwise=True`).

meridian_tools.log_likelihood

Log-likelihood computation and attachment for Meridian models.

Module: meridian_tools.log_likelihood

Functions

`compute_log_likelihood_dataset`

def compute_log_likelihood_dataset(
    meridian_model: Any,
) -> xr.Dataset

Compute the pointwise log-likelihood dataset for a fitted Meridian model.

This function reconstructs the joint distribution from the posterior samples and computes observation-level log-likelihood values. It handles both geo-panel and national models.

The reconstruction recovers unsaved posterior parameters (e.g. geo deviations, tau_g_excl_baseline) that Meridian does not persist to InferenceData by default.

Parameters:

meridian_model — A fitted Meridian model with posterior samples and a compatible posterior_sampler_callable.

Returns: An xarray Dataset with a log_likelihood variable.

Raises: ModelSelectionError if the model does not expose the required internal reconstruction seams or lacks posterior samples.

`attach_log_likelihood`

def attach_log_likelihood(
    meridian_model: Any,
    *,
    in_place: bool = False,
) -> az.InferenceData

Attach a log_likelihood group to a Meridian model’s InferenceData.

If the model’s InferenceData already has a non-empty log_likelihood group, the existing InferenceData is returned as-is, whether in_place is True or False.

Parameters:

meridian_model — A fitted Meridian model.
in_place — If True, mutates meridian_model.inference_data directly. If False (default), returns a deep copy with the log_likelihood group attached and leaves the original model unmodified.

Returns: An ArviZ InferenceData with a log_likelihood group.

Raises:

ModelSelectionError with reason_code="meridian_internal_seam_incompatible" if the Meridian version lacks the required private reconstruction methods.
ModelSelectionError with reason_code="requires_fitted_meridian_model" if the model has no posterior samples.
ModelSelectionError with reason_code="holdout_fit_unsupported" if the model was fitted with a holdout mask.

Example:

from meridian_tools.log_likelihood import attach_log_likelihood

# Non-mutating (default)
idata = attach_log_likelihood(fitted_model, in_place=False)
assert hasattr(idata, "log_likelihood")

# Mutating
attach_log_likelihood(fitted_model, in_place=True)
assert hasattr(fitted_model.inference_data, "log_likelihood")

Implementation notes

The reconstruction accesses three private methods on Meridian’s posterior_sampler_callable:

_get_joint_dist_unpinned
_prepare_latents_for_reconstruction
_reconstruct_posteriors

These are Meridian-internal and may change without notice. If any method is missing, a ModelSelectionError with reason_code="meridian_internal_seam_incompatible" is raised instead of crashing. See the Meridian integration notes for details on this coupling boundary.

meridian_tools.lifecycle

Post-run record management: loading, listing, comparing, and refreshing runs.

Module: meridian_tools.lifecycle

Functions

`resolve_run_directory`

def resolve_run_directory(path: str | Path) -> Path

Return the absolute resolved run directory for a run path or manifest path.

If path points to a file, it must be named run_manifest.json; the function returns its parent directory. If path is a directory, it must contain run_manifest.json.

Parameters:

path — Path to a run directory or to run_manifest.json directly.

Returns: Absolute Path to the run directory.

Raises: LifecycleError if the path does not exist, is an unexpected file, or the directory does not contain run_manifest.json.
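The resolution rules above can be sketched with `pathlib`. This is an approximation, not the library code: `resolve_run_dir` is a hypothetical name and `LifecycleErrorSketch` is a local stand-in, since the base class of the real LifecycleError is not documented here.

```python
import tempfile
from pathlib import Path

MANIFEST_NAME = "run_manifest.json"


class LifecycleErrorSketch(ValueError):
    """Local stand-in for LifecycleError."""


def resolve_run_dir(path) -> Path:
    """Map a run directory or manifest path to the absolute run directory."""
    candidate = Path(path).resolve()
    if candidate.is_file():
        if candidate.name != MANIFEST_NAME:
            raise LifecycleErrorSketch(f"unexpected file: {candidate}")
        return candidate.parent
    if candidate.is_dir():
        if not (candidate / MANIFEST_NAME).is_file():
            raise LifecycleErrorSketch(f"no {MANIFEST_NAME} in {candidate}")
        return candidate
    raise LifecycleErrorSketch(f"path does not exist: {candidate}")


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "demo_run"
    run_dir.mkdir()
    (run_dir / MANIFEST_NAME).write_text("{}")
    # The directory and its manifest file resolve to the same run directory.
    same = resolve_run_dir(run_dir) == resolve_run_dir(run_dir / MANIFEST_NAME)
    print(same)
```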

`load_run_record`

def load_run_record(path: str | Path) -> RunRecord

Load one run directory through the versioned lifecycle contract.

Resolves the run directory, parses the manifest, and resolves artefact paths. Required artefacts (config_source, config_resolved) must be present in the manifest and exist on disk. Manifest version 3 and 4 runs must also include input_data_provenance. Optional artefacts (validation_spec, diagnostics_bundle, model_selection_status) are resolved when present and set to None when absent.

Parameters:

path — Path to a run directory or to run_manifest.json directly.

Returns: A validated RunRecord instance.

Raises: LifecycleError for missing required artefacts, malformed manifests, artefact path traversal, or claimed-but-missing artefacts.

`list_run_records`

def list_run_records(root: str | Path) -> list[RunRecord]

Discover direct child run directories under one output root.

Scans direct child directories of root for run_manifest.json files. Returns records sorted by started_at (most recent first), with directory name as a secondary sort key.

Parameters:

root — Directory to scan. Must be a directory, not a file.

Returns: List of RunRecord instances.

Raises: LifecycleError if root is not a directory or if any discovered run has an invalid manifest.
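The documented ordering (started_at descending, directory name as tie-breaker) can be reproduced with two stable sorts. The sketch below uses plain tuples in place of RunRecord and assumes that the secondary key sorts ascending and that all timestamps share one UTC ISO-8601 format, so that string order matches chronological order. Neither assumption is confirmed by the text above.

```python
from __future__ import annotations


def order_runs(records: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Order (directory_name, started_at) pairs, most recent first."""
    # Python's sort is stable, so sorting by the secondary key first and
    # the primary key second yields the combined ordering.
    by_name = sorted(records, key=lambda record: record[0])
    return sorted(by_name, key=lambda record: record[1], reverse=True)


runs = [
    ("run_b", "2026-04-02T07:35:00+00:00"),
    ("run_a", "2026-04-02T07:35:00+00:00"),
    ("run_c", "2026-04-03T09:00:00+00:00"),
]
ordered_names = [name for name, _ in order_runs(runs)]
```

Here ordered_names is ["run_c", "run_a", "run_b"]: the newest run comes first, and the two runs with identical timestamps fall back to name order.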

`build_refresh_run_config`

def build_refresh_run_config(
    path: str | Path,
    *,
    output_dir: str | Path | None = None,
    run_name: str | None = None,
) -> PipelineRunConfig

Build a runtime refresh config from one stored run directory.

The execution config path points to the source run’s config.resolved.yaml. The returned PipelineRunConfig.source_config_path preserves the source run’s archived config.source.yaml so the refresh can re-copy the original YAML into the new run metadata. The output directory defaults to the source run’s parent directory (creating a sibling run). For validation runs, the validation spec is reconstructed from the stored validation_spec.json.

Parameters:

path — Path to the run directory or manifest to refresh.
output_dir — Override the output directory (default: source parent).
run_name — Override the run name.

Returns: A PipelineRunConfig ready for run_pipeline.

Raises: LifecycleError if the source run cannot be loaded or if authored-holdout refresh requirements are not met.

`refresh_run`

def refresh_run(
    path: str | Path,
    *,
    output_dir: str | Path | None = None,
    run_name: str | None = None,
) -> PipelineRunResult

Execute a non-destructive refresh run from one stored lifecycle record.

This is a convenience function that calls build_refresh_run_config followed by run_pipeline. The original run directory is never modified.

Parameters:

path — Path to the run directory or manifest to refresh.
output_dir — Override the output directory (default: source parent).
run_name — Override the run name.

Returns: A PipelineRunResult for the new run.

`compare_run_records`

def compare_run_records(
    left: str | Path,
    right: str | Path,
) -> pd.DataFrame

Compare two run records at the pinned metadata layer.

Loads both run records and compares run name, status, versions, validation spec presence, diagnostics statuses, model selection availability, and input-data provenance.

Parameters:

left — Path to the first run directory or manifest.
right — Path to the second run directory or manifest.

Returns: A pandas DataFrame with columns field, left, right, status, and changed. Rows follow a fixed order:

Row (`field`)	Description
`run_name`	Human-readable run name.
`status`	Overall run status.
`meridian_tools_version`	`meridian-tools` version.
`meridian_version`	Google Meridian version.
`has_validation_spec`	Whether a validation spec is present.
`has_diagnostics_bundle`	Whether a diagnostics bundle is present.
`predictive_accuracy_status`	Status from the diagnostics bundle.
`review_summary_status`	Status from the diagnostics bundle.
`has_model_selection_outputs`	Whether LOO/WAIC outputs are present.
`model_selection_reason_code`	Reason code if model selection is unavailable.
`input_authored_path`	YAML-owned `data.path` string.
`input_resolved_path`	Absolute runtime input path.
`input_mtime_utc`	Input file mtime.
`input_sha256`	Input file SHA-256 digest.
`input_size_bytes`	Input file size in bytes.
`input_row_count`	Input row count.
`input_column_count`	Input column count.
`input_ordered_columns`	Input CSV column order.

For provenance rows, status is "legacy_unknown" and changed is None when either run predates manifest version 3 and therefore has no stored provenance payload.

Raises: LifecycleError if either run cannot be loaded or if diagnostics or model selection artefacts are malformed.

Classes

`RunRecord`

@dataclass(frozen=True)
class RunRecord

Resolved lifecycle view over one on-disk run directory.

Attribute	Type	Description
`run_dir`	`Path`	Absolute path to the run directory.
`manifest_path`	`Path`	Absolute path to `run_manifest.json`.
`manifest`	`RunManifest`	Parsed manifest with stages, timestamps, and versions.
`config_source_path`	`Path`	Absolute path to `config.source.yaml`. Always present.
`config_resolved_path`	`Path`	Absolute path to `config.resolved.yaml`. Always present.
`input_data_provenance_path`	`Path | None`	Path to `input_data_provenance.json`. Required for manifest version 3 and 4 runs, otherwise `None`.
`validation_spec_path`	`Path | None`	Path to `validation_spec.json`, or `None` if absent.
`diagnostics_bundle_path`	`Path | None`	Path to `diagnostics_bundle.json`, or `None` if absent.
`model_selection_status_path`	`Path | None`	Path to `model_selection_status.json`, or `None` if absent.
`model_selection_warnings_path`	`Path | None`	Path to `model_selection_warnings.json`, or `None` if absent.

Required attributes (config_source_path, config_resolved_path) are always present. input_data_provenance_path is present for manifest version 3 and 4 runs. Other optional attributes are None when the corresponding artefact was not produced by the run or is absent from the manifest.

Example:

from meridian_tools.lifecycle import load_run_record

record = load_run_record("runs/my-project_blocked_tail_20260402_073500")

# Required — always available
print(record.config_source_path)
print(record.config_resolved_path)

# Optional — may be None
if record.diagnostics_bundle_path:
    print(f"Diagnostics: {record.diagnostics_bundle_path}")
if record.validation_spec_path:
    print(f"Validation spec: {record.validation_spec_path}")

`LifecycleError`

class LifecycleError(RuntimeError)

Raised when a run directory cannot be loaded through the lifecycle contract. All lifecycle functions raise this exception type rather than a bare ValueError or RuntimeError. Because it subclasses RuntimeError, an existing except RuntimeError handler still catches it.

meridian_tools.artifacts

Manifest and JSON helpers for run artefact management.

Module: meridian_tools.artifacts

Functions

`write_json`

def write_json(path: str | Path, payload: Any) -> None

Write a JSON-serialisable payload to disk with UTF-8 encoding and 2-space indentation. Creates parent directories if they do not exist. Writes through a private same-directory temporary file and atomically replaces the destination. Existing destination permission bits are preserved on overwrite; new files use the normal mode implied by the process umask.
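The write-then-replace pattern described above is a common way to avoid leaving a half-written file behind. The following is a minimal sketch using only the standard library. The function name write_json_sketch, the temporary-file naming, and the umask lookup are assumptions for illustration and may differ from what the library does.

```python
from __future__ import annotations

import contextlib
import json
import os
import stat
import tempfile
from pathlib import Path
from typing import Any


def write_json_sketch(path: str | Path, payload: Any) -> None:
    """Atomically write payload as UTF-8 JSON with 2-space indentation."""
    destination = Path(path)
    destination.parent.mkdir(parents=True, exist_ok=True)

    if destination.exists():
        # Overwrite: keep the permission bits the destination already has.
        mode = stat.S_IMODE(destination.stat().st_mode)
    else:
        # New file: derive the usual mode from the process umask.
        # Reading the umask this way is not thread-safe.
        umask = os.umask(0)
        os.umask(umask)
        mode = 0o666 & ~umask

    # mkstemp creates a private (0o600) file in the same directory, so the
    # final os.replace stays on one filesystem and is atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=destination.parent, prefix=f".{destination.name}.", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())
        os.chmod(tmp_name, mode)
        os.replace(tmp_name, destination)
    except BaseException:
        # Do not leave the temporary file behind on failure.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
```

Creating the temporary file next to the destination, rather than in the system temp directory, is what allows the final rename to be atomic.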

`write_manifest`

def write_manifest(path: str | Path, manifest: RunManifest) -> None

Serialise and write a RunManifest to disk as JSON using write_json.

`normalize_artifact_paths`

def normalize_artifact_paths(
    run_dir: str | Path,
    artifacts: Mapping[str, str | Path],
) -> dict[str, str]

Convert artefact paths to relative paths against run_dir so the manifest stores portable references.

Parameters:

run_dir — The run directory root.
artifacts — Mapping of artefact names to file paths.

Returns: Dictionary mapping artefact names to relative path strings.
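A rough sketch of this kind of normalisation is shown below. It assumes that relative inputs are interpreted against run_dir and that results use POSIX separators. The function name normalize_artifact_paths_sketch is hypothetical, and the library's handling of paths outside run_dir is not documented here.

```python
from __future__ import annotations

from pathlib import Path
from typing import Mapping


def normalize_artifact_paths_sketch(
    run_dir: str | Path,
    artifacts: Mapping[str, str | Path],
) -> dict[str, str]:
    """Express each artefact path relative to run_dir."""
    root = Path(run_dir).resolve()
    normalised: dict[str, str] = {}
    for name, raw in artifacts.items():
        candidate = Path(raw)
        absolute = candidate if candidate.is_absolute() else root / candidate
        # relative_to raises ValueError for paths outside the run directory.
        normalised[name] = absolute.resolve().relative_to(root).as_posix()
    return normalised
```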

`validate_artifact_paths`

def validate_artifact_paths(
    run_dir: str | Path,
    artifacts: Mapping[str, str | Path],
) -> dict[str, str]

Validate that manifest artefact paths are relative regular files beneath run_dir, then return normalised POSIX relative paths. The validator rejects absolute paths, lexical .. components, paths that resolve outside the run directory, missing paths, directories, and special files. Internal symlinks are accepted only when they resolve to regular files inside the run directory.

Parameters:

run_dir - The run directory root.
artifacts - Mapping of artefact names to file paths.

Returns: Dictionary mapping artefact names to validated relative path strings.
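The rejection rules listed above can be illustrated with a short pathlib sketch. This is an approximation under stated assumptions: the function name is hypothetical, and raising ValueError is a placeholder because the exception type used by the library is not stated in this section.

```python
from __future__ import annotations

from pathlib import Path
from typing import Mapping


def validate_artifact_paths_sketch(
    run_dir: str | Path,
    artifacts: Mapping[str, str | Path],
) -> dict[str, str]:
    """Accept only relative paths to regular files inside run_dir."""
    root = Path(run_dir).resolve()
    validated: dict[str, str] = {}
    for name, raw in artifacts.items():
        relative = Path(raw)
        if relative.is_absolute():
            raise ValueError(f"{name}: absolute path not allowed")
        if ".." in relative.parts:
            raise ValueError(f"{name}: '..' component not allowed")
        # resolve() follows symlinks, so a link that points outside the
        # run directory is caught by the containment check below.
        resolved = (root / relative).resolve()
        if not resolved.is_relative_to(root):
            raise ValueError(f"{name}: resolves outside the run directory")
        # is_file() is False for missing paths, directories, and special files.
        if not resolved.is_file():
            raise ValueError(f"{name}: not a regular file")
        validated[name] = relative.as_posix()
    return validated
```

The lexical check for .. runs before resolution, so a path such as a/../b is rejected even though it would resolve to a location inside the run directory.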

`timestamp_utc`

def timestamp_utc() -> str

Return the current time as a UTC ISO-8601 string with second precision.
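One way to produce such a timestamp with the standard library is sketched below. The exact string format the library emits (for example a +00:00 offset versus a Z suffix) is not stated here, so the +00:00 form in this sketch is an assumption.

```python
from datetime import datetime, timezone


def timestamp_utc_sketch() -> str:
    """Return the current UTC time as ISO-8601, truncated to whole seconds."""
    return datetime.now(timezone.utc).replace(microsecond=0).isoformat()
```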

Classes

`RunManifest`

@dataclass
class RunManifest

Machine-readable summary of one meridian-tools run.

Attribute	Type	Default	Description
`run_name`	`str`	required	Human-readable run name.
`config_path`	`Path`	required	Path to the authored YAML config file.
`output_dir`	`Path`	required	Path to the run directory.
`started_at`	`str`	required	UTC ISO-8601 start timestamp.
`manifest_version`	`int`	`CURRENT_MANIFEST_VERSION`	Schema version (0, 1, 2, 3, or 4).
`status`	`str`	`"running"`	Overall run status: `"running"`, `"completed"`, or `"failed"`.
`finished_at`	`str | None`	`None`	UTC ISO-8601 finish timestamp. `None` while the run is in progress.
`meridian_tools_version`	`str`	`__version__`	Version of `meridian-tools`.
`meridian_version`	`str | None`	`None`	Version of Google Meridian.
`artifacts`	`dict[str, str]`	`{}`	Top-level artefact index. Key artefacts from stages are promoted here.
`stages`	`list[StageRecord]`	`[]`	Ordered list of stage records (completed, skipped, and failed).

Class methods:

from_dict(payload: Mapping[str, Any]) -> RunManifest — Deserialise from a JSON-parsed dictionary. Supports manifest versions 0, 1, 2, 3, and 4 with default values for missing fields in older versions. Raises ValueError for unsupported versions or missing required fields.

Instance methods:

to_dict() -> dict[str, Any] — Serialise to a JSON-compatible dictionary.

`StageRecord`

@dataclass
class StageRecord

One pipeline stage entry in the run manifest.

Attribute	Type	Default	Description
`name`	`str`	required	Stage identifier (for example, `"00_run_metadata"`).
`status`	`str`	`"pending"`	Stage status: `"pending"`, `"running"`, `"completed"`, `"skipped"`, or `"failed"`.
`started_at`	`str | None`	`None`	UTC ISO-8601 start timestamp.
`finished_at`	`str | None`	`None`	UTC ISO-8601 finish timestamp.
`elapsed_seconds`	`float | None`	`None`	Wall-clock seconds for stage execution.
`message`	`str | None`	`None`	Human-readable message (skip reason or error detail).
`artifacts`	`dict[str, str]`	`{}`	Map of artefact names to relative paths. Empty for skipped stages.

Class methods:

from_dict(payload: Mapping[str, Any]) -> StageRecord — Deserialise from a JSON-parsed dictionary. Raises ValueError if name is missing.

`InputDataProvenance`

@dataclass(frozen=True)
class InputDataProvenance

Pinned input-data provenance payload used by manifest version 3 and 4 runs.

Attribute	Type	Default	Description
`authored_path`	`str`	required	Exact `data.path` string from the source YAML.
`resolved_path`	`str`	required	Absolute runtime path used for input loading.
`sha256`	`str`	required	SHA-256 digest of the resolved input file.
`size_bytes`	`int`	required	Input file size in bytes.
`mtime_utc`	`str`	required	Input file modification time in UTC ISO-8601 format.
`row_count`	`int`	required	Number of CSV data rows.
`column_count`	`int`	required	Number of CSV columns.
`ordered_columns`	`tuple[str, ...]`	required	CSV header order.
`provenance_version`	`int`	`INPUT_DATA_PROVENANCE_VERSION`	Payload schema version.

Class methods:

from_dict(payload: Mapping[str, Any]) -> InputDataProvenance — Deserialise from a JSON-parsed dictionary, validating the exact pinned key set and value types.

Instance methods:

to_dict() -> dict[str, Any] — Serialise to the exact JSON payload written into input_data_provenance.json.
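To make the attribute table concrete, the sketch below gathers the same fields for a CSV file using only the standard library. It illustrates what each field measures and is not the library's capture routine. The function name, the base_dir parameter, the UTF-8 encoding, and the timestamp format are assumptions.

```python
from __future__ import annotations

import csv
import hashlib
from datetime import datetime, timezone
from pathlib import Path
from typing import Any


def capture_provenance_sketch(authored_path: str, base_dir: Path) -> dict[str, Any]:
    """Collect provenance-style facts about one CSV input file."""
    resolved = (base_dir / authored_path).resolve()

    # Hash the raw bytes in chunks so large inputs are not loaded whole.
    digest = hashlib.sha256()
    with resolved.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)

    stat_result = resolved.stat()
    mtime = datetime.fromtimestamp(stat_result.st_mtime, tz=timezone.utc)

    # The first CSV record is the header; the rest are data rows.
    with resolved.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        row_count = sum(1 for _ in reader)

    return {
        "authored_path": authored_path,
        "resolved_path": str(resolved),
        "sha256": digest.hexdigest(),
        "size_bytes": stat_result.st_size,
        "mtime_utc": mtime.replace(microsecond=0).isoformat(),
        "row_count": row_count,
        "column_count": len(header),
        "ordered_columns": list(header),
        "provenance_version": 1,
    }
```

Keeping authored_path exactly as written in the YAML, alongside the resolved absolute path, lets a comparison distinguish a changed config from a config that merely ran from a different working directory.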

Constants

`CURRENT_MANIFEST_VERSION`

CURRENT_MANIFEST_VERSION: int = 4

`SUPPORTED_MANIFEST_VERSIONS`

SUPPORTED_MANIFEST_VERSIONS: tuple[int, ...] = (0, 1, 2, 3, 4)

`INPUT_DATA_PROVENANCE_VERSION`

INPUT_DATA_PROVENANCE_VERSION: int = 1

`REQUIRED_MANIFEST_ARTIFACTS`

REQUIRED_MANIFEST_ARTIFACTS: tuple[str, ...] = (
    "config_resolved",
    "config_source",
    "input_data_provenance",
    "diagnostics_bundle",
)

These artefact entries are validated at run completion time by the runner. New runs must produce all four to complete successfully.

The lifecycle loader enforces config_source and config_resolved as required for all supported manifests. It also enforces input_data_provenance for manifest version 3 and 4 runs. diagnostics_bundle remains optional, so older or partial runs can still be loaded without it.
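The difference between the two enforcement points can be summarised in a few lines. The sketch below is illustrative only: the helper names are hypothetical, and the artefact index is modelled as a plain dictionary.

```python
from __future__ import annotations

from typing import Mapping

# Mirrors the constant documented above.
REQUIRED_MANIFEST_ARTIFACTS_SKETCH = (
    "config_resolved",
    "config_source",
    "input_data_provenance",
    "diagnostics_bundle",
)


def missing_at_completion(artifacts: Mapping[str, str]) -> list[str]:
    """Runner-side check: a new run must index all four artefacts."""
    return [n for n in REQUIRED_MANIFEST_ARTIFACTS_SKETCH if n not in artifacts]


def missing_at_load(artifacts: Mapping[str, str], manifest_version: int) -> list[str]:
    """Loader-side check: a narrower set, so older runs still load."""
    required = ["config_source", "config_resolved"]
    if manifest_version in (3, 4):
        required.append("input_data_provenance")
    return [n for n in required if n not in artifacts]
```

Under these rules, a version 2 run that indexes only the two config artefacts loads without error, while the same artefact set would fail the completion check for a new run.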
