Python API

Public Python APIs exposed by meridian-tools.

meridian_tools.config

Configuration models and YAML loading for meridian-tools.

Module: meridian_tools.config

Functions

load_yaml_config

def load_yaml_config(path: str | Path) -> MeridianToolsConfig

Load and validate a meridian-tools YAML file.

Parameters:

  • path — Path to the YAML configuration file.

Returns: A validated MeridianToolsConfig instance.

Raises: pydantic.ValidationError if the YAML content does not match the schema.

Example:

from meridian_tools.config import load_yaml_config

config = load_yaml_config("project.yml")
print(config.project.name)
print(config.data.path)
print(config.validation.strategy)

Classes

MeridianToolsConfig

class MeridianToolsConfig(BaseModel)

Full YAML configuration for one meridian-tools run. This is the top-level model returned by load_yaml_config.

Attribute Type Default
project ProjectConfig ProjectConfig()
data CsvDataConfig required
model_spec ModelSpecConfig ModelSpecConfig()
fit FitConfig FitConfig()
validation ValidationConfig ValidationConfig()
exports ExportsConfig ExportsConfig()
response_curves `ResponseCurvesConfig | None`
optimisation `OptimisationConfig | None`

PipelineRunConfig

@dataclass(frozen=True)
class PipelineRunConfig

Runtime options that sit outside the YAML file. Passed to run_pipeline.

Attribute Type Default Description
config_path Path required Path to the YAML config file.
output_dir Path Path("runs") Directory for run output.
run_name `str | None` None
validation_spec `ValidationRunSpec | None` None
apply_run_name_suffix bool True Whether to append validation-aware suffixes to the run name.
source_config_path `Path | None` None

ProjectConfig

class ProjectConfig(BaseModel)

Attribute Type Default
name str "meridian-project"

CsvDataConfig

class CsvDataConfig(BaseModel)

CSV loader configuration compatible with Meridian’s CsvDataLoader.

Attribute Type Default
path Path required
kpi_type Literal["revenue", "non-revenue"] "revenue"
coord_to_columns dict[str, Any] required
media_to_channel `dict[str, str] | None`
media_spend_to_channel `dict[str, str] | None`
reach_to_channel `dict[str, str] | None`
frequency_to_channel `dict[str, str] | None`
rf_spend_to_channel `dict[str, str] | None`
organic_reach_to_channel `dict[str, str] | None`
organic_frequency_to_channel `dict[str, str] | None`

ModelSpecConfig

class ModelSpecConfig(BaseModel)

Attribute Type Default
kwargs dict[str, Any] {}

FitConfig

class FitConfig(BaseModel)

Sampling configuration for Meridian posterior fitting.

Attribute Type Default
sample_prior_draws `PositiveInt | None`
n_chains `PositiveInt | list[PositiveInt]`
n_adapt PositiveInt 500
n_burnin PositiveInt 500
n_keep PositiveInt 1000
seed `int | list[int] | None`
max_tree_depth PositiveInt 10
max_energy_diff float 500.0
unrolled_leapfrog_steps PositiveInt 1
parallel_iterations PositiveInt 10

ValidationConfig

class ValidationConfig(BaseModel)

Validation and holdout orchestration settings.

Attribute Type Default
strategy Literal["none", "blocked_tail", "rolling_origin"] "none"
holdout_size `PositiveInt | None`
initial_train_size `PositiveInt | None`
test_size `PositiveInt | None`
step_size `PositiveInt | None`
max_splits `PositiveInt | None`

See the validation guide for cross-field validation rules.


ExportsConfig

class ExportsConfig(BaseModel)

Attribute Type Default
use_kpi bool False
batch_size PositiveInt 1000
export_predictive_accuracy bool True
export_review_summary bool True
export_model_selection bool True
export_plots bool True

ResponseCurvesConfig

class ResponseCurvesConfig(BaseModel)

Attribute Type Default Constraint
spend_multipliers list[float] required Non-empty, all >= 0
use_posterior bool True
by_reach bool True
use_optimal_frequency bool False
confidence_level float 0.9 0 < x < 1

OptimisationConfig

class OptimisationConfig(BaseModel)

Attribute Type Default Constraint
start_date str required ISO YYYY-MM-DD
end_date str required ISO YYYY-MM-DD, >= start_date
budget OptimisationBudgetConfig required
use_posterior bool True
use_optimal_frequency bool True
confidence_level float 0.9 0 < x < 1

OptimisationBudgetConfig

class OptimisationBudgetConfig(BaseModel)

Attribute Type Default
mode Literal["fixed_total", "relative_reference_window_total"] required
value PositiveFloat required

meridian_tools.runner

Pipeline orchestration for meridian-tools.

Module: meridian_tools.runner

Functions

run_pipeline

def run_pipeline(
    run_config: PipelineRunConfig,
    *,
    progress_callback: Callable | None = None,
) -> PipelineRunResult

Execute the full meridian-tools staged pipeline.

The pipeline proceeds through the following stages in order:

  1. 00_run_metadata — Archive source and resolved configs and write input_data_provenance.json.
  2. 10_validation — Write validation spec (if validation-aware).
  3. 20_model_fit — Build input data, construct the Meridian model, sample prior and posterior.
  4. 30_model_assessment — Export diagnostics, model summary, and model selection outputs.
  5. 40_decomposition — Export summary metrics.
  6. 60_response_curves — Export response curves (if configured).
  7. 70_optimisation — Export optimisation results (if configured).

The manifest is written to disk after each stage, so a failure mid-pipeline leaves a readable partial manifest.

Before creating the dated run directory, the runner enforces three separate pre-run checks:

  1. dependency preflight (google-meridian[schema], optional plot support)
  2. validation-execution contract checks for incompatible single-run validation combinations
  3. a narrow wrapper-owned config/data preflight over the resolved input file and authored column mapping

The wrapper-owned preflight checks exactly:

  • resolved data.path exists and is a regular file
  • the CSV header row can be read
  • the parsed header is non-empty
  • no parsed header cell is blank after trimming whitespace
  • every authored scalar entry in data.coord_to_columns exists in the header
  • every authored list member in data.coord_to_columns exists in the header
  • every authored key in media_to_channel, media_spend_to_channel, reach_to_channel, frequency_to_channel, rf_spend_to_channel, organic_reach_to_channel, and organic_frequency_to_channel exists in the header
  • authored list-valued coord families are non-empty
  • authored mapping fields above are non-empty
  • supported media/RF family groups are complete when authored

Header matching is exact and case-sensitive. Anything outside this closed matrix remains Meridian-owned validation.
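The header-related checks in the list above can be sketched with the standard library alone. This is an illustrative approximation, not the runner's actual code: check_csv_header and PreflightError are hypothetical names (PreflightError stands in for ConfigPreflightError), and required_columns is assumed to be the flattened set of column names collected from data.coord_to_columns and the mapping fields.

```python
import csv
from pathlib import Path


class PreflightError(ValueError):
    """Hypothetical stand-in for ConfigPreflightError, for illustration only."""


def check_csv_header(path, required_columns):
    """Return the parsed header, or raise PreflightError on the first failed check."""
    csv_path = Path(path)
    if not csv_path.is_file():
        raise PreflightError(f"{csv_path} does not exist or is not a regular file")

    # Read only the header row; data rows are not inspected here.
    with csv_path.open(newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), None)

    if not header:
        raise PreflightError("header row is missing or empty")
    if any(not cell.strip() for cell in header):
        raise PreflightError("header contains a blank cell")

    # Exact, case-sensitive membership: "Revenue" does not match "revenue".
    missing = [name for name in required_columns if name not in header]
    if missing:
        raise PreflightError(f"columns not found in header: {missing}")
    return header
```

The sketch stops at the header on purpose, mirroring the narrow boundary described above: row-level problems are left to Meridian's own validation.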

Parameters:

  • run_config — A PipelineRunConfig specifying the execution config path, output directory, run name, optional validation spec, and optional source_config_path for metadata archival.
  • progress_callback — Optional callable invoked on stage lifecycle events. The callback receives keyword arguments:
    • stage_name (str) — stage identifier.
    • event (str) — one of "started", "completed", "skipped", or "failed".
    • stage_index (int) — 1-based position in the pipeline.
    • stage_count (int) — total number of stages.
    • elapsed_seconds (float) — wall-clock time (present for "completed" and "failed" events).
    • message (str) — human-readable detail (present for "skipped" and "failed" events).
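A callback that consumes the keyword arguments listed above might look like the following sketch. The names format_stage_event and print_progress are hypothetical. Because elapsed_seconds and message are only present for some events, they are given None defaults, and the trailing **_extra is a defensive assumption in case additional keywords are passed.

```python
def format_stage_event(*, stage_name, event, stage_index, stage_count,
                       elapsed_seconds=None, message=None, **_extra):
    """Render one stage lifecycle event as a single log line."""
    line = f"[{stage_index}/{stage_count}] {stage_name}: {event}"
    if elapsed_seconds is not None:
        # Present for "completed" and "failed" events.
        line += f" ({elapsed_seconds:.1f}s)"
    if message:
        # Present for "skipped" and "failed" events.
        line += f" - {message}"
    return line


def print_progress(**event_kwargs):
    """Callback suitable for passing as progress_callback."""
    print(format_stage_event(**event_kwargs))
```

It would then be passed as run_pipeline(run_config, progress_callback=print_progress).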

Returns: A PipelineRunResult with the run directory and manifest path.

Raises:

  • RuntimeError if Meridian schema support is unavailable (checked at preflight before the run directory is created).
  • RuntimeError if exports.export_plots is true but vl-convert-python is not installed (also checked at preflight).
  • ValidationExecutionContractError if the requested single-run validation execution path is incompatible with the authored config.
  • ConfigPreflightError if wrapper-owned config/data preflight fails before run-directory creation.
  • PipelineRunFailure if any exception occurs after the dated run directory already exists.

Example:

from pathlib import Path
from meridian_tools.config import PipelineRunConfig
from meridian_tools.runner import run_pipeline

result = run_pipeline(
    PipelineRunConfig(
        config_path=Path("project.yml"),
        output_dir=Path("runs"),
    )
)

print(result.run_dir)
print(result.manifest_path)

Classes

PipelineRunResult

@dataclass(frozen=True)
class PipelineRunResult

Disk locations for one completed meridian-tools run.

Attribute Type Description
run_dir Path Absolute path to the run directory.
manifest_path Path Absolute path to run_manifest.json.

ValidationExecutionContractError

class ValidationExecutionContractError(ValueError)

Raised when the requested single-run validation execution path is incompatible with the authored config. Current examples include direct rolling_origin execution through run_pipeline(...) and combining PipelineRunConfig.validation_spec with authored model_spec.kwargs.holdout_id.


ConfigPreflightError

class ConfigPreflightError(ValueError)

Raised when the wrapper-owned Phase 10 preflight fails before run-directory creation. This covers only the closed wrapper preflight boundary, not full Meridian model validation.


PipelineRunFailure

class PipelineRunFailure(RuntimeError)

Raised when a run fails after the dated run directory already exists. The original underlying exception is preserved via __cause__.

Attribute Type Description
run_dir Path Absolute failed run directory.
manifest_path Path Absolute path to the failed run manifest.
stage_name str | None Failing stage name when one is available.

Constants

Stage names

Constant Value
STAGE_RUN_METADATA "00_run_metadata"
STAGE_VALIDATION "10_validation"
STAGE_MODEL_FIT "20_model_fit"
STAGE_MODEL_ASSESSMENT "30_model_assessment"
STAGE_DECOMPOSITION "40_decomposition"
STAGE_RESPONSE_CURVES "60_response_curves"
STAGE_OPTIMISATION "70_optimisation"

PIPELINE_STAGE_ORDER

PIPELINE_STAGE_ORDER: tuple[str, ...] = (
    "00_run_metadata",
    "10_validation",
    "20_model_fit",
    "30_model_assessment",
    "40_decomposition",
    "60_response_curves",
    "70_optimisation",
)

The numbering gap at 50 is intentional, reserving space for future stages.
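As a small illustration, the stage order can be used to derive a 1-based position and to inspect the numeric prefixes. The tuple below is copied from the listing above so the snippet runs on its own. The helpers stage_position and stage_prefix are hypothetical, and the assumption that the progress callback's stage_index follows this tuple order is not confirmed by this page.

```python
PIPELINE_STAGE_ORDER = (
    "00_run_metadata",
    "10_validation",
    "20_model_fit",
    "30_model_assessment",
    "40_decomposition",
    "60_response_curves",
    "70_optimisation",
)


def stage_position(stage_name):
    """Return (stage_index, stage_count), with a 1-based index."""
    return PIPELINE_STAGE_ORDER.index(stage_name) + 1, len(PIPELINE_STAGE_ORDER)


def stage_prefix(stage_name):
    """Return the numeric prefix of a stage name, e.g. 60 for "60_response_curves"."""
    return int(stage_name.split("_", 1)[0])
```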

meridian_tools.cv

Cross-validation and holdout orchestration utilities.

Module: meridian_tools.cv

Functions

build_last_window_holdout_mask

def build_last_window_holdout_mask(
    time_index: Sequence[Any],
    holdout_size: int,
    geo_index: Sequence[Any] | None = None,
) -> np.ndarray

Build a blocked-tail holdout mask for Meridian’s holdout_id.

Returns a 1-D boolean mask for national data and a 2-D (n_geos, n_times) mask when geo_index is provided. The last holdout_size time periods are marked as True (held out).

Parameters:

  • time_index — Strictly increasing sequence of time period identifiers.
  • holdout_size — Number of tail periods to hold out. Must be positive and less than the length of time_index.
  • geo_index — Optional sequence of geo identifiers. If provided, the mask is broadcast across geos.

Returns: Boolean NumPy array.

Raises: ValueError for non-monotonic indices, undersized indices, or impossible holdout sizes.
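The masking rule can be sketched in a few lines. To stay dependency-free, this hypothetical last_window_holdout_mask returns nested Python lists rather than the NumPy array the real function returns, and its error messages are invented for the example.

```python
def last_window_holdout_mask(time_index, holdout_size, geo_index=None):
    """Mark the last `holdout_size` periods as held out (True)."""
    times = list(time_index)
    # Strictly increasing: every element must be smaller than its successor.
    if any(a >= b for a, b in zip(times, times[1:])):
        raise ValueError("time_index must be strictly increasing")
    if not 0 < holdout_size < len(times):
        raise ValueError("holdout_size must be positive and smaller than len(time_index)")

    first_held_out = len(times) - holdout_size
    row = [i >= first_held_out for i in range(len(times))]
    if geo_index is None:
        return row  # national data: shape (n_times,)
    # Geo data: the same row repeated per geo, shape (n_geos, n_times).
    return [list(row) for _ in geo_index]
```

For example, four weekly periods with holdout_size=1 give [False, False, False, True], and supplying two geos repeats that row twice.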


build_rolling_origin_splits

def build_rolling_origin_splits(
    time_index: Sequence[Any],
    *,
    initial_train_size: int,
    test_size: int,
    step_size: int | None = None,
    max_splits: int | None = None,
) -> list[BlockedTimeSplit]

Create expanding-window blocked time splits for rolling-origin validation.

Parameters:

  • time_index — Strictly increasing sequence of time period identifiers.
  • initial_train_size — Size of the first training window.
  • test_size — Size of each test window.
  • step_size — Step between splits. Must equal test_size. Defaults to test_size.
  • max_splits — Maximum number of splits to generate. Must be >= 2 if set.

Returns: List of BlockedTimeSplit instances (at least 2).

Raises: ValueError for invalid parameters or if fewer than 2 splits can be generated.
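An expanding-window scheme with step_size equal to test_size can be sketched as below. This is an approximation under stated assumptions: rolling_origin_indices is a hypothetical helper that works on integer positions only, and it assumes that trailing periods which do not fill a whole test window are simply left unused. The real function's boundary handling may differ.

```python
def rolling_origin_indices(n_times, *, initial_train_size, test_size, max_splits=None):
    """Return a list of (train_indices, test_indices) tuples."""
    if initial_train_size < 1 or test_size < 1:
        raise ValueError("initial_train_size and test_size must be positive")
    if max_splits is not None and max_splits < 2:
        raise ValueError("max_splits must be >= 2 when set")

    splits = []
    train_end = initial_train_size
    # Each split trains on everything before train_end and tests on the next block.
    while train_end + test_size <= n_times:
        if max_splits is not None and len(splits) >= max_splits:
            break
        train = tuple(range(train_end))
        test = tuple(range(train_end, train_end + test_size))
        splits.append((train, test))
        train_end += test_size  # step_size == test_size, so test blocks never overlap

    if len(splits) < 2:
        raise ValueError("fewer than 2 splits can be generated")
    return splits
```

With 10 periods, initial_train_size=4 and test_size=2, the sketch yields three splits whose training windows end at positions 4, 6, and 8.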


build_validation_splits

def build_validation_splits(
    validation_config: ValidationConfig,
    time_index: Sequence[Any],
) -> list[BlockedTimeSplit]

Build deterministic split definitions from the typed validation config.

Dispatches to the appropriate split builder based on validation_config.strategy. Returns an empty list for strategy: none.

Parameters:

  • validation_config — A validated ValidationConfig instance.
  • time_index — Strictly increasing sequence of time period identifiers.

Returns: List of BlockedTimeSplit instances (empty for none).


build_validation_plan

def build_validation_plan(
    validation_config: ValidationConfig,
    time_index: Sequence[Any],
    geo_index: Sequence[Any] | None = None,
) -> ValidationPlan

Materialise concrete validation and final-fit run specs from one config.

For strategy: none, returns a plan with no validation runs and no final-fit run. For blocked_tail or rolling_origin, returns one ValidationRunSpec per split plus a final_fit_run spec that trains on the full time axis with no holdout.

Parameters:

  • validation_config — A validated ValidationConfig instance.
  • time_index — Strictly increasing sequence of time period identifiers.
  • geo_index — Optional sequence of geo identifiers for geo-panel models.

Returns: A ValidationPlan instance.

Example:

from meridian_tools.config import load_yaml_config
from meridian_tools.cv import build_validation_plan

config = load_yaml_config("project.yml")
plan = build_validation_plan(
    config.validation,
    time_index=["2024-01-01", "2024-01-08", "..."],
    geo_index=["US-CA", "US-NY"],
)

for run_spec in plan.validation_runs:
    print(run_spec.split_label, len(run_spec.train_indices), len(run_spec.test_indices))

if plan.final_fit_run:
    print("Final fit:", plan.final_fit_run.split_label)

Classes

BlockedTimeSplit

@dataclass(frozen=True)
class BlockedTimeSplit

One blocked time split for validation.

Attribute Type Description
label str Human-readable split label (e.g. "blocked_tail", "split_01").
train_indices tuple[int, ...] Integer indices into the time axis for training.
test_indices tuple[int, ...] Integer indices into the time axis for testing.
train_dates tuple[str, ...] Date values for training periods.
test_dates tuple[str, ...] Date values for test periods.

ValidationRunSpec

@dataclass(frozen=True)
class ValidationRunSpec

One concrete validation or final-fit run derived from a split plan. Passed to PipelineRunConfig.validation_spec to control a single pipeline execution.

Attribute Type Description
mode "validation" | "final_fit" Run mode.
strategy str Validation strategy.
split_label str Human-readable split identifier.
holdout_source str How the holdout mask was produced.
generated_holdout bool Whether the holdout was auto-generated.
holdout_id np.ndarray | None Concrete holdout mask (immutable).
train_indices tuple[int, ...] Training time indices.
test_indices tuple[int, ...] Test time indices.
train_dates tuple[str, ...] Training date values.
test_dates tuple[str, ...] Test date values.
run_name_suffix str Suffix for the run directory name.

Methods:

  • to_artifact_payload() — Returns the JSON-serialisable dictionary written to validation_spec.json.

ValidationPlan

@dataclass(frozen=True)
class ValidationPlan

Concrete validation runs and the separate final-fit run for one config.

Attribute Type Description
validation_runs tuple[ValidationRunSpec, ...] One spec per validation split.
final_fit_run ValidationRunSpec | None Full-sample final-fit spec. None for strategy: none.

meridian_tools.exports

Helpers for manifest-backed Meridian export families.

Module: meridian_tools.exports

Functions

export_model_fit_artifacts

def export_model_fit_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    fit_config: FitConfig,
    meridian_version: str | None,
) -> dict[str, Path]

Write the stable model-fit artefact set.

Produces:

  • meridian_model.binpb — Serialised Meridian model (Protocol Buffers).
  • fit_metadata.json — Records FitConfig values and Meridian version.

Parameters:

  • model — Fitted Meridian model instance.
  • output_dir — Directory to write artefacts to.
  • fit_config — The FitConfig used for this run.
  • meridian_version — Meridian version string (or None).

Returns: Dictionary mapping artefact names to file paths.


export_model_assessment_artifacts

def export_model_assessment_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    exports_config: ExportsConfig,
    diagnostics_exporter: Callable,
    model_selection_exporter: Callable,
) -> dict[str, Path]

Write the stable assessment artefact set.

Produces diagnostics bundle, model results summary HTML, and optionally model selection outputs (LOO/WAIC) and diagnostic plots.

Parameters:

  • model — Fitted Meridian model instance.
  • output_dir — Directory to write artefacts to.
  • exports_config — Export switches.
  • diagnostics_exporter — Callable for diagnostics bundle export (typically export_diagnostics_bundle).
  • model_selection_exporter — Callable for model selection export.

Returns: Dictionary mapping artefact names to file paths.


export_decomposition_artifacts

def export_decomposition_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    exports_config: ExportsConfig,
) -> dict[str, Path]

Write the stable decomposition artefact set.

Produces:

  • summary_metrics.nc — NetCDF decomposition dataset.
  • summary_metrics.csv — Flattened tabular decomposition.
  • plots/ — Channel contribution, waterfall, spend vs. contribution, and ROI charts (when export_plots: true).

Parameters:

  • model — Fitted Meridian model instance.
  • output_dir — Directory to write artefacts to.
  • exports_config — Export switches.

Returns: Dictionary mapping artefact names to file paths.


export_response_curve_artifacts

def export_response_curve_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    response_curves_config: ResponseCurvesConfig,
    exports_config: ExportsConfig,
) -> dict[str, Path]

Write the stable response-curve artefact set.

Produces:

  • response_curves.nc — NetCDF response curve dataset.
  • response_curves.csv — Flattened tabular response curves.
  • plots/response_curves_plot.png — Response curve visualisation (when export_plots: true).

Parameters:

  • model — Fitted Meridian model instance.
  • output_dir — Directory to write artefacts to.
  • response_curves_config — Response curves settings from YAML.
  • exports_config — Export switches.

Returns: Dictionary mapping artefact names to file paths.


export_optimisation_artifacts

def export_optimisation_artifacts(
    model: Any,
    output_dir: str | Path,
    *,
    optimisation_config: OptimisationConfig,
    exports_config: ExportsConfig,
) -> dict[str, Path]

Write the stable optimisation artefact set.

Produces:

  • optimisation_summary.html — Meridian optimisation summary report.
  • optimised_data.nc / .csv — Optimised budget allocation.
  • nonoptimised_data.nc / .csv — Baseline allocation.
  • optimisation_grid.csv — Full optimisation grid.
  • plots/ — Delta, allocation, spend, and response curve charts (when export_plots: true).

For budget.mode: relative_reference_window_total, the effective budget is computed as value × total_spend_in_reference_window, where the total is taken over the model’s media and RF spend data within the window from start_date to end_date.
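The budget arithmetic can be illustrated with a short worked sketch. The helper effective_budget and the spend_by_date dictionary are hypothetical. The sketch also assumes that the window is inclusive at both ends and that fixed_total uses value directly as the budget, neither of which is spelled out here.

```python
def effective_budget(mode, value, spend_by_date, start_date, end_date):
    """Compute the budget passed to optimisation for one budget mode."""
    if mode == "fixed_total":
        # Assumption: the value is already the absolute budget.
        return value
    if mode == "relative_reference_window_total":
        # ISO YYYY-MM-DD strings sort chronologically, so string comparison works.
        window_total = sum(
            spend for date, spend in spend_by_date.items()
            if start_date <= date <= end_date
        )
        return value * window_total
    raise ValueError(f"unknown budget mode: {mode}")


# Combined media and RF spend per period (made-up numbers).
spend = {"2024-01-01": 100.0, "2024-01-08": 150.0, "2024-01-15": 250.0}
budget = effective_budget(
    "relative_reference_window_total", 1.5, spend, "2024-01-01", "2024-01-08"
)
```

Here the window covers the first two periods (100.0 + 150.0 = 250.0), so a value of 1.5 gives an effective budget of 375.0.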

Parameters:

  • model — Fitted Meridian model instance.
  • output_dir — Directory to write artefacts to.
  • optimisation_config — Optimisation settings from YAML.
  • exports_config — Export switches.

Returns: Dictionary mapping artefact names to file paths.


ensure_meridian_schema_support

def ensure_meridian_schema_support() -> Callable

Return Meridian’s schema serialiser or raise a stable runtime error.

Checks for meridian.schema.serde.meridian_serde.save_meridian. If the import fails, raises RuntimeError with guidance to install google-meridian[schema].

Returns: The save_meridian callable.


ensure_altair_png_support

def ensure_altair_png_support() -> Any

Return the Altair PNG backend or raise a stable runtime error.

Checks for vl_convert. If the import fails, raises RuntimeError with guidance to install vl-convert-python.

Returns: The vl_convert module.

meridian_tools.diagnostics

Diagnostics extraction and export helpers for Meridian runs.

Module: meridian_tools.diagnostics

Functions

predictive_accuracy_frame

def predictive_accuracy_frame(
    meridian_model: Any,
    *,
    use_kpi: bool = False,
    selected_geos: Sequence[str] | None = None,
    selected_times: Sequence[str] | None = None,
    batch_size: int = 1000,
) -> pd.DataFrame

Return Meridian predictive accuracy as a flat DataFrame.

Uses Meridian’s Analyzer.predictive_accuracy internally and flattens the resulting xarray dataset into a pandas DataFrame.

Parameters:

  • meridian_model — Fitted Meridian model instance.
  • use_kpi — Use KPI-based metrics.
  • selected_geos — Optional subset of geos to evaluate.
  • selected_times — Optional subset of time periods to evaluate.
  • batch_size — Batch size for Meridian analysis.

Returns: A pandas DataFrame with one row per observation.


review_summary_dict

def review_summary_dict(
    meridian_model: Any,
    *,
    selected_geos: Sequence[str] | None = None,
    selected_times: Sequence[str] | None = None,
) -> dict[str, Any]

Run Meridian’s review battery and return a JSON-ready dictionary.

Uses Meridian’s ModelReviewer internally. All non-primitive values (dataclasses, enums, NumPy arrays) are recursively converted to JSON-serialisable types.

Parameters:

  • meridian_model — Fitted Meridian model instance.
  • selected_geos — Optional subset of geos.
  • selected_times — Optional subset of time periods.

Returns: A JSON-serialisable dictionary.


export_diagnostics_bundle

def export_diagnostics_bundle(
    meridian_model: Any,
    output_dir: str | Path,
    *,
    use_kpi: bool = False,
    export_predictive_accuracy: bool = True,
    export_review_summary: bool = True,
    selected_geos: Sequence[str] | None = None,
    selected_times: Sequence[str] | None = None,
    batch_size: int = 1000,
) -> dict[str, Path]

Write predictive accuracy, review summary, and bundle manifest to disk.

The bundle manifest (diagnostics_bundle.json) records the status of each sub-export ("exported" or "disabled") along with the file name and format. This provides a stable machine-readable contract for downstream consumers.

When an export is disabled, any pre-existing file from a previous run at the same path is removed to prevent stale data.
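The export-or-clean-up behaviour described above can be sketched as follows. The write_bundle helper, its tuple layout, and the manifest keys ("status", "file", "format") are assumptions made for illustration; the actual diagnostics_bundle.json schema is not reproduced here.

```python
import json
from pathlib import Path


def write_bundle(output_dir, exports):
    """Write enabled sub-exports, remove stale disabled ones, then write the manifest.

    `exports` is a list of (name, file_name, fmt, enabled, render) tuples,
    where render() returns the file content as text.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for name, file_name, fmt, enabled, render in exports:
        target = out / file_name
        if enabled:
            target.write_text(render(), encoding="utf-8")
            status = "exported"
        else:
            # Remove leftovers from an earlier run so the directory never holds
            # a file that the manifest reports as disabled.
            target.unlink(missing_ok=True)
            status = "disabled"
        manifest[name] = {"status": status, "file": file_name, "format": fmt}

    manifest_path = out / "diagnostics_bundle.json"
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest_path
```

Writing the manifest last means a consumer that finds it can treat the listed statuses as final for that directory.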

Parameters:

  • meridian_model — Fitted Meridian model instance.
  • output_dir — Directory to write artefacts to.
  • use_kpi — Use KPI-based metrics.
  • export_predictive_accuracy — Write predictive_accuracy.csv.
  • export_review_summary — Write review_summary.json.
  • selected_geos — Not supported in current scope (raises ValueError).
  • selected_times — Not supported in current scope (raises ValueError).
  • batch_size — Batch size for Meridian analysis.

Returns: Dictionary mapping artefact names to file paths. Always includes "diagnostics_bundle". Conditionally includes "predictive_accuracy" and "review_summary".

Example:

from meridian_tools.diagnostics import export_diagnostics_bundle

artifacts = export_diagnostics_bundle(
    fitted_model,
    "output/30_model_assessment",
    export_predictive_accuracy=True,
    export_review_summary=True,
)

print(artifacts["diagnostics_bundle"])
# Path("output/30_model_assessment/diagnostics_bundle.json")

meridian_tools.model_selection

Model-selection helpers layered on top of ArviZ and Meridian.

Module: meridian_tools.model_selection

Functions

has_log_likelihood

def has_log_likelihood(candidate: Any) -> bool

Return whether the candidate exposes a non-empty log_likelihood group.

Accepts either an ArviZ InferenceData or any object with an .inference_data attribute (e.g. a fitted Meridian model).

Parameters:

  • candidate — ArviZ InferenceData or fitted Meridian model.

Returns: True if a non-empty log_likelihood group exists.


compute_loo

def compute_loo(
    candidate: Any,
    *,
    pointwise: bool = False,
    scale: str = "log",
) -> InformationCriterionResult

Compute PSIS-LOO for a Meridian model or InferenceData.

If the candidate is a fitted Meridian model without a log_likelihood group, the function automatically reconstructs it through attach_log_likelihood.

Parameters:

  • candidate — Fitted Meridian model or ArviZ InferenceData with log_likelihood.
  • pointwise — Include per-observation LOO values and Pareto k diagnostics.
  • scale — Scale for ELPD computation ("log", "negative_log", or "deviance").

Returns: An InformationCriterionResult with kind="loo".

Raises: ModelSelectionError if log-likelihood cannot be obtained.


compute_waic

def compute_waic(
    candidate: Any,
    *,
    pointwise: bool = False,
    scale: str = "log",
) -> InformationCriterionResult

Compute WAIC for a Meridian model or InferenceData.

Same automatic log-likelihood reconstruction as compute_loo.

Parameters:

  • candidate — Fitted Meridian model or ArviZ InferenceData with log_likelihood.
  • pointwise — Include per-observation WAIC values.
  • scale — Scale for ELPD computation.

Returns: An InformationCriterionResult with kind="waic".

Raises: ModelSelectionError if log-likelihood cannot be obtained.


compare_models

def compare_models(
    candidates: Mapping[str, Any],
    *,
    ic: str = "loo",
    scale: str = "log",
) -> pd.DataFrame

Compare multiple models with ArviZ compare.

Parameters:

  • candidates — Dictionary mapping model names to fitted Meridian models or InferenceData objects.
  • ic — Information criterion to use: "loo" or "waic".
  • scale — Scale for ELPD computation.

Returns: A pandas DataFrame with columns: model, rank, elpd_{ic}, p_{ic}, elpd_diff, weight, se, dse, warning, scale. Ranked by ELPD (rank 0 is best).

For a single candidate, returns a one-row DataFrame with rank=0, elpd_diff=0.0, and weight=1.0.

Raises:

  • ValueError if ic is not "loo" or "waic", or if candidates is empty.
  • ModelSelectionError if any candidate lacks log-likelihood data.

Classes

ModelSelectionError

class ModelSelectionError(RuntimeError)

Raised when information criteria cannot be computed.

Property Type Description
reason_code str | None Structured code identifying the failure reason.

Known reason codes:

Code Meaning
missing_log_likelihood_group InferenceData has no log_likelihood group and cannot be reconstructed.
holdout_fit_unsupported Model was fitted with a holdout mask.
requires_fitted_meridian_model Missing posterior samples or ArviZ InferenceData.
meridian_internal_seam_incompatible Meridian version lacks required reconstruction methods.

InformationCriterionResult

@dataclass(frozen=True)
class InformationCriterionResult

Summary of one information-criterion computation.

Attribute Type Description
kind str "loo" or "waic".
summary dict[str, Any] Summary statistics (ELPD, p, SE, etc.).
pointwise pd.DataFrame | None Per-observation values (if pointwise=True).

meridian_tools.log_likelihood

Log-likelihood computation and attachment for Meridian models.

Module: meridian_tools.log_likelihood

Functions

compute_log_likelihood_dataset

def compute_log_likelihood_dataset(
    meridian_model: Any,
) -> xr.Dataset

Compute the pointwise log-likelihood dataset for a fitted Meridian model.

This function reconstructs the joint distribution from the posterior samples and computes observation-level log-likelihood values. It handles both geo-panel and national models.

The reconstruction recovers unsaved posterior parameters (e.g. geo deviations, tau_g_excl_baseline) that Meridian does not persist to InferenceData by default.

Parameters:

  • meridian_model — A fitted Meridian model with posterior samples and a compatible posterior_sampler_callable.

Returns: An xarray Dataset with a log_likelihood variable.

Raises: ModelSelectionError if the model does not expose the required internal reconstruction seams or lacks posterior samples.


attach_log_likelihood

def attach_log_likelihood(
    meridian_model: Any,
    *,
    in_place: bool = False,
) -> az.InferenceData

Attach a log_likelihood group to a Meridian model’s InferenceData.

If the model’s InferenceData already has a non-empty log_likelihood group, it is returned as-is (or the existing InferenceData is returned for in_place=True).

Parameters:

  • meridian_model — A fitted Meridian model.
  • in_place — If True, mutates meridian_model.inference_data directly. If False (default), returns a deep copy with the log_likelihood group attached and leaves the original model unmodified.

Returns: An ArviZ InferenceData with a log_likelihood group.

Raises:

  • ModelSelectionError with reason_code="meridian_internal_seam_incompatible" if the Meridian version lacks the required private reconstruction methods.
  • ModelSelectionError with reason_code="requires_fitted_meridian_model" if the model has no posterior samples.
  • ModelSelectionError with reason_code="holdout_fit_unsupported" if the model was fitted with a holdout mask.

Example:

from meridian_tools.log_likelihood import attach_log_likelihood

# Non-mutating (default)
idata = attach_log_likelihood(fitted_model, in_place=False)
assert hasattr(idata, "log_likelihood")

# Mutating
attach_log_likelihood(fitted_model, in_place=True)
assert hasattr(fitted_model.inference_data, "log_likelihood")

Implementation notes

The reconstruction accesses three private methods on Meridian’s posterior_sampler_callable:

  • _get_joint_dist_unpinned
  • _prepare_latents_for_reconstruction
  • _reconstruct_posteriors

These are Meridian-internal and may change without notice. If any method is missing, a ModelSelectionError with reason_code="meridian_internal_seam_incompatible" is raised instead of crashing. See the Meridian integration notes for details on this coupling boundary.
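A guard of this kind can be sketched with getattr. The names SeamError and require_seams are hypothetical (SeamError stands in for ModelSelectionError), and the real check may be structured differently.

```python
REQUIRED_SEAMS = (
    "_get_joint_dist_unpinned",
    "_prepare_latents_for_reconstruction",
    "_reconstruct_posteriors",
)


class SeamError(RuntimeError):
    """Hypothetical stand-in for ModelSelectionError, carrying a reason_code."""

    def __init__(self, message, reason_code=None):
        super().__init__(message)
        self.reason_code = reason_code


def require_seams(sampler):
    """Raise a structured error if any required private method is absent."""
    missing = [
        name for name in REQUIRED_SEAMS
        if not callable(getattr(sampler, name, None))
    ]
    if missing:
        raise SeamError(
            f"sampler lacks required methods: {', '.join(missing)}",
            reason_code="meridian_internal_seam_incompatible",
        )
```

Checking up front turns what would otherwise be an AttributeError deep inside the reconstruction into an error that callers can branch on through reason_code.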

meridian_tools.lifecycle

Post-run record management: loading, listing, comparing, and refreshing runs.

Module: meridian_tools.lifecycle

Functions

resolve_run_directory

def resolve_run_directory(path: str | Path) -> Path

Return the absolute resolved run directory for a run path or manifest path.

If path points to a file, it must be named run_manifest.json; the function returns its parent directory. If path is a directory, it must contain run_manifest.json.

Parameters:

  • path — Path to a run directory or to run_manifest.json directly.

Returns: Absolute Path to the run directory.

Raises: LifecycleError if the path does not exist, is an unexpected file, or the directory does not contain run_manifest.json.
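The resolution rules can be sketched with pathlib. Both resolve_run_dir and RunPathError are hypothetical names (RunPathError stands in for LifecycleError, whose base class is not documented on this page).

```python
from pathlib import Path

MANIFEST_NAME = "run_manifest.json"


class RunPathError(ValueError):
    """Hypothetical stand-in for LifecycleError, for illustration only."""


def resolve_run_dir(path):
    """Accept a run directory or its manifest file; return the absolute run directory."""
    candidate = Path(path).resolve()
    if not candidate.exists():
        raise RunPathError(f"{candidate} does not exist")
    if candidate.is_file():
        # A file is only accepted when it is the manifest itself.
        if candidate.name != MANIFEST_NAME:
            raise RunPathError(f"expected {MANIFEST_NAME}, got {candidate.name}")
        return candidate.parent
    if not (candidate / MANIFEST_NAME).is_file():
        raise RunPathError(f"{candidate} does not contain {MANIFEST_NAME}")
    return candidate
```

Accepting either form lets callers pass whichever path they already hold, for example PipelineRunResult.run_dir or PipelineRunResult.manifest_path.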


load_run_record

def load_run_record(path: str | Path) -> RunRecord

Load one run directory through the versioned lifecycle contract.

Resolves the run directory, parses the manifest, and resolves artefact paths. Required artefacts (config_source, config_resolved) must be present in the manifest and exist on disk. Manifest version 3 runs must also include input_data_provenance. Optional artefacts (validation_spec, diagnostics_bundle, model_selection_status) are resolved when present and set to None when absent.

Parameters:

  • path — Path to a run directory or to run_manifest.json directly.

Returns: A validated RunRecord instance.

Raises: LifecycleError for missing required artefacts, malformed manifests, artefact path traversal, or claimed-but-missing artefacts.


list_run_records

def list_run_records(root: str | Path) -> list[RunRecord]

Discover direct child run directories under one output root.

Scans direct child directories of root for run_manifest.json files. Returns records sorted by started_at (most recent first), with directory name as a secondary sort key.

Parameters:

  • root — Directory to scan. Must be a directory, not a file.

Returns: List of RunRecord instances.

Raises: LifecycleError if root is not a directory or if any discovered run has an invalid manifest.


build_refresh_run_config

def build_refresh_run_config(
    path: str | Path,
    *,
    output_dir: str | Path | None = None,
    run_name: str | None = None,
) -> PipelineRunConfig

Build a runtime refresh config from one stored run directory.

The execution config path points to the source run’s config.resolved.yaml. The returned PipelineRunConfig.source_config_path preserves the source run’s archived config.source.yaml so the refresh can re-copy the original YAML into the new run metadata. The output directory defaults to the source run’s parent directory, so the refreshed run is written as a sibling of the source run. For validation runs, the validation spec is reconstructed from the stored validation_spec.json.

Parameters:

  • path — Path to the run directory or manifest to refresh.
  • output_dir — Override the output directory (default: source parent).
  • run_name — Override the run name.

Returns: A PipelineRunConfig ready for run_pipeline.

Raises: LifecycleError if the source run cannot be loaded or if authored-holdout refresh requirements are not met.
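The path handling described for a refresh can be summarised in a small sketch. RefreshPlanSketch and plan_refresh_sketch are hypothetical names, and placing both config files directly under the run directory is an assumption made only for the example. The sketch also assumes the "execution config path" corresponds to the config_path field.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class RefreshPlanSketch:
    """Hypothetical stand-in for the PipelineRunConfig fields a refresh sets."""

    config_path: Path
    source_config_path: Path
    output_dir: Path
    run_name: Optional[str] = None


def plan_refresh_sketch(
    run_dir: Path,
    config_resolved: Path,
    config_source: Path,
    output_dir: Optional[Path] = None,
    run_name: Optional[str] = None,
) -> RefreshPlanSketch:
    return RefreshPlanSketch(
        # Assumption: the "execution config path" maps to config_path.
        config_path=config_resolved,
        # The archived original YAML travels with the refresh.
        source_config_path=config_source,
        # No override means the new run lands next to the source run.
        output_dir=output_dir if output_dir is not None else run_dir.parent,
        run_name=run_name,
    )


source_run = Path("runs/my-project_blocked_tail_20260402_073500")
plan = plan_refresh_sketch(
    source_run,
    config_resolved=source_run / "config.resolved.yaml",
    config_source=source_run / "config.source.yaml",
)
```

With no override, plan.output_dir is the runs directory, so the refreshed run would sit beside the run it was built from.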


refresh_run

def refresh_run(
    path: str | Path,
    *,
    output_dir: str | Path | None = None,
    run_name: str | None = None,
) -> PipelineRunResult

Execute a non-destructive refresh run from one stored lifecycle record.

This is a convenience function that calls build_refresh_run_config followed by run_pipeline. The original run directory is never modified.

Parameters:

  • path — Path to the run directory or manifest to refresh.
  • output_dir — Override the output directory (default: source parent).
  • run_name — Override the run name.

Returns: A PipelineRunResult for the new run.


compare_run_records

def compare_run_records(
    left: str | Path,
    right: str | Path,
) -> pd.DataFrame

Compare two run records at the pinned metadata layer.

Loads both run records and compares run name, status, versions, validation spec presence, diagnostics statuses, model selection availability, and input-data provenance.

Parameters:

  • left — Path to the first run directory or manifest.
  • right — Path to the second run directory or manifest.

Returns: A pandas DataFrame with columns field, left, right, status, and changed. Rows follow a fixed order:

Row (field) Description
run_name Human-readable run name.
status Overall run status.
meridian_tools_version meridian-tools version.
meridian_version Google Meridian version.
has_validation_spec Whether a validation spec is present.
has_diagnostics_bundle Whether a diagnostics bundle is present.
predictive_accuracy_status Status from the diagnostics bundle.
review_summary_status Status from the diagnostics bundle.
has_model_selection_outputs Whether LOO/WAIC outputs are present.
model_selection_reason_code Reason code if model selection is unavailable.
input_authored_path YAML-owned data.path string.
input_resolved_path Absolute runtime input path.
input_mtime_utc Input file mtime.
input_sha256 Input file SHA-256 digest.
input_size_bytes Input file size in bytes.
input_row_count Input row count.
input_column_count Input column count.
input_ordered_columns Input CSV column order.

For provenance rows, status is "legacy_unknown" and changed is None when either run predates manifest version 3 and therefore has no stored provenance payload.

Raises: LifecycleError if either run cannot be loaded or if diagnostics or model selection artefacts are malformed.
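Because changed can be True, False, or None, callers need to tell "unchanged" apart from "unknown". The sketch below models rows as plain dictionaries rather than a DataFrame. provenance_row_sketch is a hypothetical helper, the values are invented, and "compared" is a placeholder label rather than a documented status value.

```python
from typing import Any, Optional


def provenance_row_sketch(field: str, left: Optional[Any], right: Optional[Any]) -> dict:
    """Build one comparison row; None means the run has no stored provenance."""
    if left is None or right is None:
        # Documented behaviour for runs that predate manifest version 3.
        return {"field": field, "left": left, "right": right,
                "status": "legacy_unknown", "changed": None}
    # "compared" is a placeholder label, not a documented status value.
    return {"field": field, "left": left, "right": right,
            "status": "compared", "changed": left != right}


rows = [
    provenance_row_sketch("input_sha256", "ab12", "cd34"),
    provenance_row_sketch("input_row_count", 156, 156),
    provenance_row_sketch("input_size_bytes", None, 20480),
]

# Identity checks keep None (unknown) separate from False (unchanged).
changed_fields = [r["field"] for r in rows if r["changed"] is True]
unknown_fields = [r["field"] for r in rows if r["changed"] is None]
```

A truthiness test such as `if not row["changed"]` would silently treat legacy rows as unchanged, which is why the sketch filters with `is True` and `is None`.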


Classes

RunRecord

@dataclass(frozen=True)
class RunRecord

Resolved lifecycle view over one on-disk run directory.

Attribute Type Description
run_dir Path Absolute path to the run directory.
manifest_path Path Absolute path to run_manifest.json.
manifest RunManifest Parsed manifest with stages, timestamps, and versions.
config_source_path Path Absolute path to config.source.yaml. Always present.
config_resolved_path Path Absolute path to config.resolved.yaml. Always present.
input_data_provenance_path Path | None Path to input_data_provenance.json. Required for manifest version 3 runs, otherwise None.
validation_spec_path Path | None Path to validation_spec.json, or None if absent.
diagnostics_bundle_path Path | None Path to diagnostics_bundle.json, or None if absent.
model_selection_status_path Path | None Path to model_selection_status.json, or None if absent.

config_source_path and config_resolved_path are always set. input_data_provenance_path is set for manifest version 3 runs and is None otherwise. The remaining optional attributes are None when the corresponding artefact was not produced by the run or is absent from the manifest.

Example:

from meridian_tools.lifecycle import load_run_record

record = load_run_record("runs/my-project_blocked_tail_20260402_073500")

# Required — always available
print(record.config_source_path)
print(record.config_resolved_path)

# Optional — may be None
if record.diagnostics_bundle_path:
    print(f"Diagnostics: {record.diagnostics_bundle_path}")
if record.validation_spec_path:
    print(f"Validation spec: {record.validation_spec_path}")

LifecycleError

class LifecycleError(RuntimeError)

Raised when a run directory cannot be loaded through the lifecycle contract. All lifecycle functions raise this exception type rather than a bare ValueError or RuntimeError. Because LifecycleError subclasses RuntimeError, an except RuntimeError handler also catches it.

meridian_tools.artifacts

Manifest and JSON helpers for run artefact management.

Module: meridian_tools.artifacts

Functions

write_json

def write_json(path: str | Path, payload: Any) -> None

Write a JSON-serialisable payload to disk with UTF-8 encoding and 2-space indentation. Creates parent directories if they do not exist.
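The described behaviour corresponds closely to a few standard-library calls. The following is a minimal sketch rather than the library's implementation: write_json_sketch is a stand-in name, and details such as trailing newlines or non-ASCII escaping are not specified above and are left at the json module defaults.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def write_json_sketch(path: Path, payload: Any) -> None:
    """Write payload as UTF-8 JSON with 2-space indentation, creating parents."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)


# The nested parent directories do not exist before the call.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "nested" / "dir" / "payload.json"
    write_json_sketch(target, {"status": "completed"})
    written_text = target.read_text(encoding="utf-8")
```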


write_manifest

def write_manifest(path: str | Path, manifest: RunManifest) -> None

Serialise and write a RunManifest to disk as JSON using write_json.


normalize_artifact_paths

def normalize_artifact_paths(
    run_dir: str | Path,
    artifacts: Mapping[str, str | Path],
) -> dict[str, str]

Convert artefact paths to paths relative to run_dir so that the manifest stores portable references.

Parameters:

  • run_dir — The run directory root.
  • artifacts — Mapping of artefact names to file paths.

Returns: Dictionary mapping artefact names to relative path strings.
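One way to produce such relative references is pathlib's relative_to. This sketch assumes every artefact path lies under run_dir and that forward slashes are used in the stored strings; normalize_artifact_paths_sketch and the diagnostics subdirectory are hypothetical.

```python
from pathlib import Path
from typing import Dict, Mapping, Union


def normalize_artifact_paths_sketch(
    run_dir: Union[str, Path],
    artifacts: Mapping[str, Union[str, Path]],
) -> Dict[str, str]:
    """Express each artefact path relative to run_dir, using forward slashes."""
    root = Path(run_dir)
    # relative_to raises ValueError for a path that is not under run_dir.
    return {
        name: Path(location).relative_to(root).as_posix()
        for name, location in artifacts.items()
    }


normalized = normalize_artifact_paths_sketch(
    "runs/demo_run",
    {"diagnostics_bundle": "runs/demo_run/diagnostics/diagnostics_bundle.json"},
)
```

Storing relative strings means a run directory can be moved or copied to another machine without invalidating its manifest.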


timestamp_utc

def timestamp_utc() -> str

Return the current time as a UTC ISO-8601 string with second precision.
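A timestamp of this shape can be produced as follows. The exact rendering of the UTC offset (for example +00:00 versus Z) is not stated above, so the +00:00 form in this sketch is an assumption, and timestamp_utc_sketch is a stand-in name.

```python
from datetime import datetime, timezone


def timestamp_utc_sketch() -> str:
    """Current UTC time as ISO-8601, truncated to whole seconds."""
    return datetime.now(timezone.utc).replace(microsecond=0).isoformat()


stamp = timestamp_utc_sketch()
# Round-trip to confirm the string is timezone-aware and second-precise.
parsed = datetime.fromisoformat(stamp)
```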


Classes

RunManifest

@dataclass
class RunManifest

Machine-readable summary of one meridian-tools run.

Attribute Type Default Description
run_name str required Human-readable run name.
config_path Path required Path to the authored YAML config file.
output_dir Path required Path to the run directory.
started_at str required UTC ISO-8601 start timestamp.
manifest_version int CURRENT_MANIFEST_VERSION Schema version (0, 1, 2, or 3).
status str "running" Overall run status: "running", "completed", or "failed".
finished_at str | None None UTC ISO-8601 finish timestamp. None while the run is in progress.
meridian_tools_version str __version__ Version of meridian-tools.
meridian_version str | None None Version of Google Meridian.
artifacts dict[str, str] {} Top-level artefact index. Key artefacts from stages are promoted here.
stages list[StageRecord] [] Ordered list of stage records (completed, skipped, and failed).

Class methods:

  • from_dict(payload: Mapping[str, Any]) -> RunManifest — Deserialise from a JSON-parsed dictionary. Supports manifest versions 0, 1, 2, and 3 with default values for missing fields in older versions. Raises ValueError for unsupported versions or missing required fields.

Instance methods:

  • to_dict() -> dict[str, Any] — Serialise to a JSON-compatible dictionary.
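Version-tolerant deserialisation of the kind from_dict describes might be sketched as below. ManifestSketch is a cut-down, hypothetical stand-in with three of the documented fields, and treating a missing manifest_version as version 0 is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

SUPPORTED_VERSIONS = (0, 1, 2, 3)  # mirrors SUPPORTED_MANIFEST_VERSIONS


@dataclass
class ManifestSketch:
    """Cut-down stand-in for RunManifest with three of its fields."""

    run_name: str
    manifest_version: int
    finished_at: Optional[str] = None

    @classmethod
    def from_dict(cls, payload: Mapping[str, Any]) -> "ManifestSketch":
        # Assumption: a manifest with no version field is treated as version 0.
        version = payload.get("manifest_version", 0)
        if version not in SUPPORTED_VERSIONS:
            raise ValueError(f"Unsupported manifest version: {version}")
        if "run_name" not in payload:
            raise ValueError("Manifest is missing required field: run_name")
        return cls(
            run_name=payload["run_name"],
            manifest_version=version,
            # Older manifests may omit this field; fall back to the default.
            finished_at=payload.get("finished_at"),
        )


legacy = ManifestSketch.from_dict({"run_name": "demo"})
```

Rejecting unknown versions up front keeps a newer manifest from being half-parsed by older code.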

StageRecord

@dataclass
class StageRecord

One pipeline stage entry in the run manifest.

Attribute Type Default Description
name str required Stage identifier (for example, "00_run_metadata").
status str "pending" Stage status: "pending", "running", "completed", "skipped", or "failed".
started_at str | None None UTC ISO-8601 start timestamp.
finished_at str | None None UTC ISO-8601 finish timestamp.
elapsed_seconds float | None None Wall-clock seconds for stage execution.
message str | None None Human-readable message (skip reason or error detail).
artifacts dict[str, str] {} Map of artefact names to relative paths. Empty for skipped stages.

Class methods:

  • from_dict(payload: Mapping[str, Any]) -> StageRecord — Deserialise from a JSON-parsed dictionary. Raises ValueError if name is missing.

InputDataProvenance

@dataclass(frozen=True)
class InputDataProvenance

Pinned input-data provenance payload used by manifest version 3 runs.

Attribute Type Default Description
authored_path str required Exact data.path string from the source YAML.
resolved_path str required Absolute runtime path used for input loading.
sha256 str required SHA-256 digest of the resolved input file.
size_bytes int required Input file size in bytes.
mtime_utc str required Input file modification time in UTC ISO-8601 format.
row_count int required Number of CSV data rows.
column_count int required Number of CSV columns.
ordered_columns tuple[str, ...] required CSV header order.
provenance_version int INPUT_DATA_PROVENANCE_VERSION Payload schema version.

Class methods:

  • from_dict(payload: Mapping[str, Any]) -> InputDataProvenance — Deserialise from a JSON-parsed dictionary, validating the exact pinned (Phase 09) key set and value types.

Instance methods:

  • to_dict() -> dict[str, Any] — Serialise to the exact JSON payload written into input_data_provenance.json.
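To make the fields concrete, the sketch below gathers the same kinds of facts from a small CSV file using only the standard library. It is not how meridian-tools computes provenance: describe_csv_sketch is hypothetical, the CSV content is invented, and the row-counting rule (header excluded) is an assumption based on the "CSV data rows" description.

```python
import csv
import hashlib
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def describe_csv_sketch(path: Path) -> dict:
    """Collect provenance-style facts about one CSV file."""
    raw = path.read_bytes()
    stat = path.stat()
    with path.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))
    header = rows[0] if rows else []
    return {
        "resolved_path": str(path.resolve()),
        "sha256": hashlib.sha256(raw).hexdigest(),
        "size_bytes": stat.st_size,
        "mtime_utc": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
        "row_count": max(len(rows) - 1, 0),  # data rows, header excluded
        "column_count": len(header),
        "ordered_columns": tuple(header),
    }


with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "input.csv"
    csv_path.write_text(
        "time,geo,revenue\n2026-01-05,UK,10\n2026-01-12,UK,12\n",
        encoding="utf-8",
    )
    facts = describe_csv_sketch(csv_path)
```

The digest and the column order together allow two runs to be compared for input drift even when the file path and modification time are identical.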

Constants

CURRENT_MANIFEST_VERSION

CURRENT_MANIFEST_VERSION: int = 3

SUPPORTED_MANIFEST_VERSIONS

SUPPORTED_MANIFEST_VERSIONS: tuple[int, ...] = (0, 1, 2, 3)

INPUT_DATA_PROVENANCE_VERSION

INPUT_DATA_PROVENANCE_VERSION: int = 1

REQUIRED_MANIFEST_ARTIFACTS

REQUIRED_MANIFEST_ARTIFACTS: tuple[str, ...] = (
    "config_resolved",
    "config_source",
    "input_data_provenance",
    "diagnostics_bundle",
)

The runner validates these artefact entries when a run completes. New runs must produce all four to complete successfully.

The lifecycle loader enforces config_source and config_resolved as required for all supported manifests. It also enforces input_data_provenance for manifest version 3 runs. diagnostics_bundle remains optional, so older or partial runs can still be loaded without it.
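The difference between the runner's check and the loader's check can be expressed as a small lookup. This sketch only restates the rules described above; loader_required_sketch and missing_for_loader are hypothetical helpers, and the artefact file names are placeholders.

```python
from typing import Mapping, Tuple


def loader_required_sketch(manifest_version: int) -> Tuple[str, ...]:
    """Artefacts the lifecycle loader insists on for a given manifest version."""
    required = ("config_source", "config_resolved")
    if manifest_version == 3:
        required += ("input_data_provenance",)
    # diagnostics_bundle is deliberately absent: the loader treats it as optional.
    return required


def missing_for_loader(manifest_version: int, artifacts: Mapping[str, str]) -> Tuple[str, ...]:
    """Names the loader would report as missing from the artefact index."""
    return tuple(
        name for name in loader_required_sketch(manifest_version) if name not in artifacts
    )


legacy_artifacts = {
    "config_source": "config.source.yaml",
    "config_resolved": "config.resolved.yaml",
}
missing_v2 = missing_for_loader(2, legacy_artifacts)
missing_v3 = missing_for_loader(3, legacy_artifacts)
```

The same artefact index loads cleanly as a version 2 manifest but would be rejected as a version 3 manifest because input_data_provenance is absent.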