meridian_tools.lifecycle

Post-run record management: loading, listing, comparing, and refreshing runs.

Module: meridian_tools.lifecycle

Functions

resolve_run_directory

def resolve_run_directory(path: str | Path) -> Path

Return the absolute resolved run directory for a run path or manifest path.

If path points to a file, it must be named run_manifest.json; the function returns its parent directory. If path is a directory, it must contain run_manifest.json.

Parameters:

  • path — Path to a run directory or to run_manifest.json directly.

Returns: Absolute Path to the run directory.

Raises: LifecycleError if the path does not exist, points to a file other than run_manifest.json, or is a directory that does not contain run_manifest.json.
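The resolution rules above can be restated as a small standalone sketch. It is not the library's implementation: sketch_resolve_run_directory and the local LifecycleError class are hypothetical stand-ins that only mirror the documented behaviour.

```python
from pathlib import Path
import tempfile

MANIFEST_NAME = "run_manifest.json"


class LifecycleError(RuntimeError):
    """Local stand-in for meridian_tools.lifecycle.LifecycleError."""


def sketch_resolve_run_directory(path):
    """Illustrative restatement of the documented rules; not the library code."""
    candidate = Path(path)
    if not candidate.exists():
        raise LifecycleError(f"path does not exist: {candidate}")
    if candidate.is_file():
        # A file is accepted only when it is the manifest itself.
        if candidate.name != MANIFEST_NAME:
            raise LifecycleError(f"unexpected file: {candidate}")
        return candidate.resolve().parent
    # A directory is accepted only when it contains the manifest.
    if not (candidate / MANIFEST_NAME).is_file():
        raise LifecycleError(f"no {MANIFEST_NAME} in {candidate}")
    return candidate.resolve()


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "example_run"
    run_dir.mkdir()
    manifest = run_dir / MANIFEST_NAME
    manifest.write_text("{}")
    # Passing the directory or the manifest yields the same run directory.
    same_directory = (
        sketch_resolve_run_directory(run_dir)
        == sketch_resolve_run_directory(manifest)
    )
```

Under these assumptions, callers can pass either spelling of a run location and compare the results directly.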


load_run_record

def load_run_record(path: str | Path) -> RunRecord

Load one run directory through the versioned lifecycle contract.

Resolves the run directory, parses the manifest, and resolves artefact paths. Required artefacts (config_source, config_resolved) must be present in the manifest and exist on disk. Manifest version 3 runs must also include input_data_provenance. Optional artefacts (validation_spec, diagnostics_bundle, model_selection_status) are resolved when present and set to None when absent.

Parameters:

  • path — Path to a run directory or to run_manifest.json directly.

Returns: A validated RunRecord instance.

Raises: LifecycleError for missing required artefacts, malformed manifests, artefact path traversal, or claimed-but-missing artefacts.
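The artefact checks described above (required versus optional, path traversal, claimed-but-missing) might look roughly like the sketch below. It assumes the manifest stores artefact paths relative to the run directory, which this reference does not state; sketch_resolve_artefact and the local LifecycleError class are hypothetical.

```python
from pathlib import Path
import tempfile


class LifecycleError(RuntimeError):
    """Local stand-in for meridian_tools.lifecycle.LifecycleError."""


def sketch_resolve_artefact(run_dir, relative_path, *, required):
    """Resolve one manifest-declared artefact inside run_dir (illustrative)."""
    if relative_path is None:
        if required:
            raise LifecycleError("required artefact missing from manifest")
        return None  # optional artefact that the run did not produce
    root = Path(run_dir).resolve()
    resolved = (root / relative_path).resolve()
    # Reject entries such as "../outside.yaml" that escape the run directory.
    if not resolved.is_relative_to(root):
        raise LifecycleError(f"artefact path escapes run directory: {relative_path}")
    # An artefact that the manifest claims must actually exist on disk.
    if not resolved.is_file():
        raise LifecycleError(f"claimed artefact not found: {relative_path}")
    return resolved


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    (run_dir / "config.source.yaml").write_text("model: {}\n")
    found_name = sketch_resolve_artefact(
        run_dir, "config.source.yaml", required=True
    ).name
    absent_optional = sketch_resolve_artefact(run_dir, None, required=False)
    try:
        sketch_resolve_artefact(run_dir, "../outside.yaml", required=True)
        traversal_rejected = False
    except LifecycleError:
        traversal_rejected = True
```

Path.is_relative_to requires Python 3.9 or later.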


list_run_records

def list_run_records(root: str | Path) -> list[RunRecord]

Discover direct child run directories under one output root.

Scans direct child directories of root for run_manifest.json files. Returns records sorted by started_at (most recent first), with directory name as a secondary sort key.

Parameters:

  • root — Directory to scan. Must be a directory, not a file.

Returns: List of RunRecord instances.

Raises: LifecycleError if root is not a directory or if any discovered run has an invalid manifest.
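The documented ordering (started_at descending, directory name as tie-breaker) can be reproduced with two stable sorts. The sketch below uses plain tuples instead of RunRecord objects and assumes the directory-name tie-breaker is ascending, which this reference does not specify.

```python
from datetime import datetime, timezone

# Hypothetical (directory name, started_at) pairs standing in for RunRecord objects.
runs = [
    ("run_b", datetime(2026, 4, 2, 7, 35, tzinfo=timezone.utc)),
    ("run_a", datetime(2026, 4, 2, 7, 35, tzinfo=timezone.utc)),
    ("run_c", datetime(2026, 4, 3, 9, 0, tzinfo=timezone.utc)),
    ("run_d", datetime(2026, 4, 1, 12, 0, tzinfo=timezone.utc)),
]

# Python's sort is stable, so sorting by the secondary key first and the
# primary key second yields the combined order.
by_name = sorted(runs, key=lambda run: run[0])
ordered = sorted(by_name, key=lambda run: run[1], reverse=True)
ordered_names = [name for name, _ in ordered]
```

Here run_a and run_b share a start time, so they keep their name order behind the more recent run_c.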


build_refresh_run_config

def build_refresh_run_config(
    path: str | Path,
    *,
    output_dir: str | Path | None = None,
    run_name: str | None = None,
) -> PipelineRunConfig

Build a runtime refresh config from one stored run directory.

The returned config executes from the source run’s config.resolved.yaml. PipelineRunConfig.source_config_path still points to the source run’s archived config.source.yaml, so the refresh can copy the original YAML into the new run’s metadata. The output directory defaults to the source run’s parent directory, which makes the refreshed run a sibling of the source run. For validation runs, the validation spec is reconstructed from the stored validation_spec.json.

Parameters:

  • path — Path to the run directory or manifest to refresh.
  • output_dir — Override the output directory (default: source parent).
  • run_name — Override the run name.

Returns: A PipelineRunConfig ready for run_pipeline.

Raises: LifecycleError if the source run cannot be loaded or if authored-holdout refresh requirements are not met.
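The output_dir default can be pictured with a short sketch. sketch_refresh_output_dir is a hypothetical helper that only illustrates the documented default; it is not part of meridian_tools.

```python
from pathlib import Path


def sketch_refresh_output_dir(source_run_dir, output_dir=None):
    """Pick the directory that will receive the refreshed run (illustrative)."""
    if output_dir is not None:
        return Path(output_dir)
    # Default: the source run's parent, so the new run lands next to the old one.
    return Path(source_run_dir).parent


source = Path("runs/my-project_blocked_tail_20260402_073500")
default_dir = sketch_refresh_output_dir(source)
override_dir = sketch_refresh_output_dir(source, output_dir="refreshed_runs")
```

With no override, the refreshed run would be written under runs/, alongside the source run.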


refresh_run

def refresh_run(
    path: str | Path,
    *,
    output_dir: str | Path | None = None,
    run_name: str | None = None,
) -> PipelineRunResult

Execute a non-destructive refresh run from one stored lifecycle record.

This is a convenience function that calls build_refresh_run_config followed by run_pipeline. The original run directory is never modified.

Parameters:

  • path — Path to the run directory or manifest to refresh.
  • output_dir — Override the output directory (default: source parent).
  • run_name — Override the run name.

Returns: A PipelineRunResult for the new run.


compare_run_records

def compare_run_records(
    left: str | Path,
    right: str | Path,
) -> pd.DataFrame

Compare two run records at the pinned metadata layer.

Loads both run records and compares run name, status, versions, validation spec presence, diagnostics statuses, model selection availability, and input-data provenance.

Parameters:

  • left — Path to the first run directory or manifest.
  • right — Path to the second run directory or manifest.

Returns: A pandas DataFrame with columns field, left, right, status, and changed. Rows follow a fixed order:

Row (field)                  Description
run_name                     Human-readable run name.
status                       Overall run status.
meridian_tools_version       meridian-tools version.
meridian_version             Google Meridian version.
has_validation_spec          Whether a validation spec is present.
has_diagnostics_bundle       Whether a diagnostics bundle is present.
predictive_accuracy_status   Status from the diagnostics bundle.
review_summary_status        Status from the diagnostics bundle.
has_model_selection_outputs  Whether LOO/WAIC outputs are present.
model_selection_reason_code  Reason code if model selection is unavailable.
input_authored_path          YAML-owned data.path string.
input_resolved_path          Absolute runtime input path.
input_mtime_utc              Input file mtime.
input_sha256                 Input file SHA-256 digest.
input_size_bytes             Input file size in bytes.
input_row_count              Input row count.
input_column_count           Input column count.
input_ordered_columns        Input CSV column order.

For provenance rows, status is "legacy_unknown" and changed is None when either run predates manifest version 3 and therefore has no stored provenance payload.

Raises: LifecycleError if either run cannot be loaded or if diagnostics or model selection artefacts are malformed.
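A typical follow-up is to filter the comparison frame. The sketch below builds a small frame by hand with the documented columns rather than calling compare_run_records. All cell values are invented for illustration, and "example_status" is a placeholder, since "legacy_unknown" is the only status value this reference names.

```python
import pandas as pd

# Hand-built frame with the documented columns; every value is illustrative.
comparison = pd.DataFrame(
    [
        {"field": "run_name", "left": "A", "right": "B",
         "status": "example_status", "changed": True},
        {"field": "meridian_version", "left": "X", "right": "X",
         "status": "example_status", "changed": False},
        {"field": "input_sha256", "left": None, "right": "abc123",
         "status": "legacy_unknown", "changed": None},
    ]
)

# Keep rows where changed is exactly True; None (unknown) is not treated as a change.
is_changed = comparison["changed"].map(lambda value: value is True)
changed_fields = comparison.loc[is_changed, "field"].tolist()

# Provenance rows that could not be compared because one run has no stored payload.
is_unknown = comparison["status"] == "legacy_unknown"
unknown_fields = comparison.loc[is_unknown, "field"].tolist()
```

Testing for `value is True` rather than truthiness keeps the None rows out of the changed set, so unknown provenance is reported separately instead of being read as "unchanged".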


Classes

RunRecord

@dataclass(frozen=True)
class RunRecord

Resolved lifecycle view over one on-disk run directory.

Attribute                    Type         Description
run_dir                      Path         Absolute path to the run directory.
manifest_path                Path         Absolute path to run_manifest.json.
manifest                     RunManifest  Parsed manifest with stages, timestamps, and versions.
config_source_path           Path         Absolute path to config.source.yaml. Always present.
config_resolved_path         Path         Absolute path to config.resolved.yaml. Always present.
input_data_provenance_path   Path | None  Path to input_data_provenance.json. Required for manifest version 3 runs, otherwise None.
validation_spec_path         Path | None  Path to validation_spec.json, or None if absent.
diagnostics_bundle_path      Path | None  Path to diagnostics_bundle.json, or None if absent.
model_selection_status_path  Path | None  Path to model_selection_status.json, or None if absent.

Required attributes (config_source_path, config_resolved_path) are always present. input_data_provenance_path is present for manifest version 3 runs. Other optional attributes are None when the corresponding artefact was not produced by the run or is absent from the manifest.

Example:

from meridian_tools.lifecycle import load_run_record

record = load_run_record("runs/my-project_blocked_tail_20260402_073500")

# Required — always available
print(record.config_source_path)
print(record.config_resolved_path)

# Optional — may be None
if record.diagnostics_bundle_path:
    print(f"Diagnostics: {record.diagnostics_bundle_path}")
if record.validation_spec_path:
    print(f"Validation spec: {record.validation_spec_path}")

LifecycleError

class LifecycleError(RuntimeError)

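Raised when a run directory cannot be loaded through the lifecycle contract. All lifecycle functions raise this exception type rather than a bare ValueError or RuntimeError. Because LifecycleError subclasses RuntimeError, an existing except RuntimeError handler also catches it.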
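Because of the RuntimeError base class, handler order matters when a caller wants to treat lifecycle failures separately. The sketch below uses a local stand-in class with the same definition as the one documented above; classify is a hypothetical helper.

```python
class LifecycleError(RuntimeError):
    """Local stand-in mirroring the documented class definition."""


def classify(exc):
    """Report which handler catches exc when the specific clause comes first."""
    try:
        raise exc
    except LifecycleError:
        return "lifecycle"
    except RuntimeError:
        return "runtime"


lifecycle_result = classify(LifecycleError("manifest missing"))
runtime_result = classify(RuntimeError("unrelated failure"))
# A broad RuntimeError handler would also match a LifecycleError.
caught_by_broad_handler = isinstance(LifecycleError("x"), RuntimeError)
```

Placing the except LifecycleError clause first keeps lifecycle failures distinguishable from other runtime errors.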