meridian_tools.lifecycle

Post-run record management: loading, listing, comparing, and refreshing runs.

Module: meridian_tools.lifecycle

Functions

`resolve_run_directory`

def resolve_run_directory(path: str | Path) -> Path

Return the absolute resolved run directory for a run path or manifest path.

If path points to a file, it must be named run_manifest.json; the function returns its parent directory. If path is a directory, it must contain run_manifest.json.

Parameters:

path — Path to a run directory or to run_manifest.json directly.

Returns: Absolute Path to the run directory.

Raises: LifecycleError if the path does not exist, is a file not named run_manifest.json, or is a directory that does not contain run_manifest.json.
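The resolution rules above can be sketched with pathlib alone. This is an illustration, not the library's implementation: the function name resolve_run_directory_sketch, the local LifecycleError stand-in, and the error messages are all hypothetical.

```python
from pathlib import Path

MANIFEST_NAME = "run_manifest.json"


class LifecycleError(RuntimeError):
    """Local stand-in for meridian_tools.lifecycle.LifecycleError."""


def resolve_run_directory_sketch(path):
    """Mirror the documented rules (hypothetical, not the real implementation)."""
    candidate = Path(path)
    if not candidate.exists():
        raise LifecycleError(f"path does not exist: {candidate}")
    if candidate.is_file():
        # A file is accepted only if it is the manifest itself.
        if candidate.name != MANIFEST_NAME:
            raise LifecycleError(f"unexpected file: {candidate}")
        return candidate.parent.resolve()
    # A directory must contain the manifest.
    if not (candidate / MANIFEST_NAME).is_file():
        raise LifecycleError(f"no {MANIFEST_NAME} in {candidate}")
    return candidate.resolve()
```

Both accepted input forms, the directory and the manifest file inside it, resolve to the same absolute directory.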

`load_run_record`

def load_run_record(path: str | Path) -> RunRecord

Load one run directory through the versioned lifecycle contract.

Resolves the run directory, parses the manifest, and resolves artefact paths. Required artefacts (config_source, config_resolved) must be present in the manifest and exist on disk. Manifest version 3 and 4 runs must also include input_data_provenance. Optional artefacts (validation_spec, diagnostics_bundle, model_selection_status) are resolved when present and set to None when absent.

Parameters:

path — Path to a run directory or to run_manifest.json directly.

Returns: A validated RunRecord instance.

Raises: LifecycleError for missing required artefacts, malformed manifests, artefact path traversal, or claimed-but-missing artefacts.
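The traversal and claimed-but-missing checks could look roughly like the sketch below. It assumes the manifest stores artefact paths relative to the run directory, which the text above does not state. The helper name resolve_artefact, its required flag, and the local LifecycleError class are hypothetical.

```python
from pathlib import Path


class LifecycleError(RuntimeError):
    """Local stand-in for meridian_tools.lifecycle.LifecycleError."""


def resolve_artefact(run_dir, relative_path, *, required):
    """Resolve one manifest-declared artefact path inside run_dir.

    Hypothetical helper: returns None for an undeclared optional artefact.
    """
    if relative_path is None:
        if required:
            raise LifecycleError("required artefact missing from manifest")
        return None
    root = Path(run_dir).resolve()
    resolved = (root / relative_path).resolve()
    # Reject paths such as "../other_run/config.yaml" that escape the run directory.
    if not resolved.is_relative_to(root):
        raise LifecycleError(f"artefact path escapes run directory: {relative_path}")
    # An artefact claimed by the manifest must exist on disk.
    if not resolved.is_file():
        raise LifecycleError(f"artefact declared but missing: {relative_path}")
    return resolved
```

Resolving before comparing means a path that wanders out through ".." segments or a symlink is judged by where it actually lands, not by how it is spelled.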

`list_run_records`

def list_run_records(root: str | Path) -> list[RunRecord]

Discover direct child run directories under one output root.

Scans direct child directories of root for run_manifest.json files. Returns records sorted by started_at (most recent first), with directory name as a secondary sort key.

Parameters:

root — Directory to scan. Must be a directory, not a file.

Returns: List of RunRecord instances.

Raises: LifecycleError if root is not a directory or if any discovered run has an invalid manifest.
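The documented ordering can be illustrated with plain tuples standing in for RunRecord instances. The sketch assumes that ties on started_at are broken by ascending directory name; the text above names the secondary key but not its direction. All run names and timestamps are invented.

```python
from datetime import datetime, timezone

# (directory name, started_at) pairs standing in for RunRecord instances.
runs = [
    ("run_b", datetime(2026, 4, 2, 7, 35, tzinfo=timezone.utc)),
    ("run_a", datetime(2026, 4, 2, 7, 35, tzinfo=timezone.utc)),
    ("run_c", datetime(2026, 4, 3, 9, 0, tzinfo=timezone.utc)),
]

# Python's sort is stable, so sort by the secondary key first,
# then by the primary key (most recent first).
by_name = sorted(runs, key=lambda run: run[0])
ordered = sorted(by_name, key=lambda run: run[1], reverse=True)

print([name for name, _ in ordered])
```

Two passes are used because the two keys sort in opposite directions; a single tuple key would require negating the timestamp.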

`build_refresh_run_config`

def build_refresh_run_config(
    path: str | Path,
    *,
    output_dir: str | Path | None = None,
    run_name: str | None = None,
) -> PipelineRunConfig

Build a runtime refresh config from one stored run directory.

The execution config path points to the source run’s config.resolved.yaml. The returned PipelineRunConfig.source_config_path preserves the source run’s archived config.source.yaml so the refresh can re-copy the original YAML into the new run metadata. The output directory defaults to the source run’s parent directory (creating a sibling run). For validation runs, the validation spec is reconstructed from the stored validation_spec.json.

Parameters:

path — Path to the run directory or manifest to refresh.
output_dir — Override the output directory (default: source parent).
run_name — Override the run name.

Returns: A PipelineRunConfig ready for run_pipeline.

Raises: LifecycleError if the source run cannot be loaded or if authored-holdout refresh requirements are not met.
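The output directory default described above amounts to the small rule sketched here. The helper default_refresh_output_dir is hypothetical and takes an already resolved run directory; it is not part of the documented API.

```python
from pathlib import Path


def default_refresh_output_dir(run_dir, output_dir=None):
    """Hypothetical helper: pick the output root for a refresh run."""
    if output_dir is not None:
        return Path(output_dir)
    # Default: the source run's parent, so the refresh lands next to the source run.
    return Path(run_dir).parent
```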

`refresh_run`

def refresh_run(
    path: str | Path,
    *,
    output_dir: str | Path | None = None,
    run_name: str | None = None,
) -> PipelineRunResult

Execute a non-destructive refresh run from one stored lifecycle record.

This is a convenience function that calls build_refresh_run_config followed by run_pipeline. The original run directory is never modified.

Parameters:

path — Path to the run directory or manifest to refresh.
output_dir — Override the output directory (default: source parent).
run_name — Override the run name.

Returns: A PipelineRunResult for the new run.

`compare_run_records`

def compare_run_records(
    left: str | Path,
    right: str | Path,
) -> pd.DataFrame

Compare two run records at the pinned metadata layer.

Loads both run records and compares run name, status, versions, validation spec presence, diagnostics statuses, model selection availability, and input-data provenance.

Parameters:

left — Path to the first run directory or manifest.
right — Path to the second run directory or manifest.

Returns: A pandas DataFrame with columns field, left, right, status, and changed. Rows follow a fixed order:

| Row (`field`) | Description |
| --- | --- |
| `run_name` | Human-readable run name. |
| `status` | Overall run status. |
| `meridian_tools_version` | `meridian-tools` version. |
| `meridian_version` | Google Meridian version. |
| `has_validation_spec` | Whether a validation spec is present. |
| `has_diagnostics_bundle` | Whether a diagnostics bundle is present. |
| `predictive_accuracy_status` | Status from the diagnostics bundle. |
| `review_summary_status` | Status from the diagnostics bundle. |
| `has_model_selection_outputs` | Whether LOO/WAIC outputs are present. |
| `model_selection_reason_code` | Reason code if model selection is unavailable. |
| `input_authored_path` | YAML-owned `data.path` string. |
| `input_resolved_path` | Absolute runtime input path. |
| `input_mtime_utc` | Input file mtime. |
| `input_sha256` | Input file SHA-256 digest. |
| `input_size_bytes` | Input file size in bytes. |
| `input_row_count` | Input row count. |
| `input_column_count` | Input column count. |
| `input_ordered_columns` | Input CSV column order. |

For provenance rows, status is "legacy_unknown" and changed is None when either run predates manifest version 3 and therefore has no stored provenance payload.

Raises: LifecycleError if either run cannot be loaded or if diagnostics or model selection artefacts are malformed.
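Because changed can be True, False, or None, callers likely want identity checks rather than truthiness. The sketch below uses plain dicts shaped like the rows one would get from DataFrame.to_dict("records"). All values are invented, and the "compared" status string is a hypothetical placeholder: only "legacy_unknown" is documented above.

```python
PLACEHOLDER_STATUS = "compared"  # hypothetical; only "legacy_unknown" is documented

# Invented rows in the documented column layout.
rows = [
    {"field": "run_name", "left": "baseline", "right": "refresh",
     "status": PLACEHOLDER_STATUS, "changed": True},
    {"field": "meridian_version", "left": "1.0.0", "right": "1.0.0",
     "status": PLACEHOLDER_STATUS, "changed": False},
    {"field": "input_sha256", "left": None, "right": "ab12",
     "status": "legacy_unknown", "changed": None},
]

# `changed` is tri-state, so test identity rather than truthiness:
# a plain `if not row["changed"]` would lump None in with False.
changed_fields = [row["field"] for row in rows if row["changed"] is True]
unknown_fields = [row["field"] for row in rows if row["changed"] is None]

print(changed_fields, unknown_fields)
```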

Classes

`RunRecord`

@dataclass(frozen=True)
class RunRecord

Resolved lifecycle view over one on-disk run directory.

| Attribute | Type | Description |
| --- | --- | --- |
| `run_dir` | `Path` | Absolute path to the run directory. |
| `manifest_path` | `Path` | Absolute path to `run_manifest.json`. |
| `manifest` | `RunManifest` | Parsed manifest with stages, timestamps, and versions. |
| `config_source_path` | `Path` | Absolute path to `config.source.yaml`. Always present. |
| `config_resolved_path` | `Path` | Absolute path to `config.resolved.yaml`. Always present. |
| `input_data_provenance_path` | `Path \| None` | Path to `input_data_provenance.json`. Required for manifest version 3 and 4 runs, otherwise `None`. |
| `validation_spec_path` | `Path \| None` | Path to `validation_spec.json`, or `None` if absent. |
| `diagnostics_bundle_path` | `Path \| None` | Path to `diagnostics_bundle.json`, or `None` if absent. |
| `model_selection_status_path` | `Path \| None` | Path to `model_selection_status.json`, or `None` if absent. |
| `model_selection_warnings_path` | `Path \| None` | Path to `model_selection_warnings.json`, or `None` if absent. |

Required attributes (config_source_path, config_resolved_path) are always present. input_data_provenance_path is present for manifest version 3 and 4 runs. Other optional attributes are None when the corresponding artefact was not produced by the run or is absent from the manifest.

Example:

from meridian_tools.lifecycle import load_run_record

record = load_run_record("runs/my-project_blocked_tail_20260402_073500")

# Required — always available
print(record.config_source_path)
print(record.config_resolved_path)

# Optional — may be None
if record.diagnostics_bundle_path is not None:
    print(f"Diagnostics: {record.diagnostics_bundle_path}")
if record.validation_spec_path is not None:
    print(f"Validation spec: {record.validation_spec_path}")

`LifecycleError`

class LifecycleError(RuntimeError)

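Raised when a run directory cannot be loaded through the lifecycle contract. All lifecycle functions raise this exception type rather than a bare ValueError or RuntimeError. Because LifecycleError subclasses RuntimeError, an existing except RuntimeError handler still catches it.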
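One way to use the single exception type is to wrap a loader so that an unloadable run becomes None rather than an exception. This is a sketch: load_or_none is a hypothetical wrapper, and the LifecycleError class here is a local stand-in that only mirrors the documented signature. In real use the loader argument would be load_run_record.

```python
class LifecycleError(RuntimeError):
    """Local stand-in mirroring the documented class signature."""


def load_or_none(loader, path):
    """Hypothetical wrapper: return None instead of raising for an unloadable run."""
    try:
        return loader(path)
    except LifecycleError:
        # Other exception types are deliberately left to propagate.
        return None
```

Catching only LifecycleError keeps unrelated failures, such as a programming error inside the loader, visible to the caller.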