Lifecycle management guide

meridian-tools treats completed runs as immutable artefacts. The lifecycle module provides tools to load, compare, and refresh past runs without mutating them. This guide explains each lifecycle operation and when to use it.

Core concepts

Run records

A RunRecord encapsulates a run’s metadata and artefact paths. It is loaded from a run directory by reading run_manifest.json and resolving all artefact paths against the directory.

from meridian_tools.lifecycle import load_run_record

record = load_run_record("runs/my-project_blocked_tail_20260402_073500")

print(record.run_dir)                    # Path to the run directory
print(record.manifest)                   # RunManifest with stages, timestamps, versions
print(record.config_source_path)         # Path to config.source.yaml
print(record.config_resolved_path)       # Path to config.resolved.yaml
print(record.input_data_provenance_path) # Path to input_data_provenance.json (or None for older runs)
print(record.diagnostics_bundle_path)    # Path to diagnostics_bundle.json (or None)
print(record.validation_spec_path)       # Path to validation_spec.json (or None)
print(record.model_selection_status_path)  # Path to model_selection_status.json (or None)

All paths in the record are absolute. Required artefacts (config_source, config_resolved) are validated at load time, so their paths are always present. input_data_provenance is also required for manifest version 3 runs. Paths for optional artefacts (diagnostics_bundle, validation_spec, model_selection_status) are None when the manifest does not list them.

Immutability

Lifecycle operations never modify a source run directory. When you refresh a run, the output goes to a new sibling directory. When you compare runs, both source directories remain untouched.

All lifecycle functions raise LifecycleError (a RuntimeError subclass) when they encounter invalid state.
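Because LifecycleError subclasses RuntimeError, a handler written for RuntimeError also catches lifecycle failures. The sketch below illustrates the pattern with a local stand-in class so that it runs without meridian-tools installed; fake_loader and load_or_none are hypothetical helpers, not part of the library.

```python
class LifecycleError(RuntimeError):
    """Local stand-in for the library's exception, so this sketch runs alone."""


def fake_loader(path):
    # Hypothetical loader that always fails, standing in for a lifecycle call.
    raise LifecycleError(f"run_manifest.json not found under {path}")


def load_or_none(loader, path):
    """Return the loader's result, or None if the run is in an invalid state."""
    try:
        return loader(path)
    except LifecycleError as exc:
        print(f"Skipping {path}: {exc}")
        return None


result = load_or_none(fake_loader, "runs/broken-run")
```

Catching LifecycleError specifically, rather than a bare Exception, keeps unrelated bugs visible.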

Loading a run record

From a run directory

from meridian_tools.lifecycle import load_run_record

record = load_run_record("runs/my-project_blocked_tail_20260402_073500")

From a manifest path

record = load_run_record("runs/my-project_blocked_tail_20260402_073500/run_manifest.json")

Both forms are accepted. The function detects whether the argument is a directory or a manifest file.
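One way to picture the two accepted forms is a small normalisation step that maps either argument to the run directory. This is a sketch under assumptions: resolve_run_dir is a hypothetical helper that decides by file name alone, whereas the library may inspect the filesystem instead.

```python
from pathlib import Path

MANIFEST_NAME = "run_manifest.json"


def resolve_run_dir(path):
    """Map either accepted argument form to the run directory."""
    p = Path(path)
    # A path whose final component is the manifest file name refers to the
    # run directory that contains it; anything else is treated as the directory.
    return p.parent if p.name == MANIFEST_NAME else p


run_dir_a = resolve_run_dir("runs/my-project_blocked_tail_20260402_073500")
run_dir_b = resolve_run_dir(
    "runs/my-project_blocked_tail_20260402_073500/run_manifest.json"
)
```

Both calls yield the same directory, which is why the two forms are interchangeable.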

Validation at load time

load_run_record validates:

  • The manifest JSON is well-formed and has a supported version (0, 1, 2, or 3).
  • Required config artefact entries (config_source, config_resolved) exist in the manifest.
  • Manifest version 3 runs also include input_data_provenance.
  • Required artefact files actually exist on disk.
  • No artefact path escapes the run directory (path traversal protection).
  • Claimed optional artefacts exist on disk (a manifest that references a missing file is rejected).

If any check fails, a LifecycleError is raised with a descriptive message.
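The path traversal check in the list above can be sketched with pathlib. This is a minimal illustration under assumptions, not the library's implementation: is_inside is a hypothetical helper, and the example paths are made up.

```python
from pathlib import Path


def is_inside(run_dir, artefact_path):
    """Return True if artefact_path, resolved against run_dir, stays inside it."""
    root = Path(run_dir).resolve()
    # Joining with an absolute artefact_path discards root; the check below
    # then rejects it because the result is not under root.
    candidate = (root / artefact_path).resolve()
    try:
        candidate.relative_to(root)
    except ValueError:
        return False
    return True


ok = is_inside("runs/example", "config.resolved.yaml")
escaped = is_inside("runs/example", "../other-run/config.resolved.yaml")
absolute = is_inside("runs/example", "/etc/passwd")
```

Resolving both paths before comparing them matters: a purely textual prefix check would miss ".." segments.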

Listing run records

from meridian_tools.lifecycle import list_run_records

records = list_run_records("runs/")
for record in records:
    print(record.manifest.started_at, record.run_dir.name)

list_run_records discovers all direct child directories that contain a run_manifest.json and returns them sorted by started_at timestamp (most recent first), with run directory name as a secondary sort key.

The function requires a directory path, not a file. If any discovered run directory contains an invalid manifest, it raises an error rather than silently skipping the broken run.
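The ordering described above (most recent first, with the directory name as a tie-breaker) can be reproduced with two stable sorts. The data below is hypothetical, and the ascending direction of the name tie-break is an assumption, since this guide does not state it.

```python
from datetime import datetime

# Hypothetical (started_at, run directory name) pairs.
runs = [
    (datetime(2026, 4, 2, 7, 35), "my-project_b"),
    (datetime(2026, 4, 15, 9, 0), "my-project_c"),
    (datetime(2026, 4, 2, 7, 35), "my-project_a"),
]

# Python's sort is stable, so sort by the secondary key first,
# then by the primary key in descending order.
by_name = sorted(runs, key=lambda r: r[1])
ordered = sorted(by_name, key=lambda r: r[0], reverse=True)
names = [name for _, name in ordered]
```

A secondary key makes the order deterministic when two runs share a started_at timestamp.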

Refreshing a run

Refreshing re-executes a run using its stored configuration but writes the output to a new directory. The source run is never modified.

When to refresh

  • After a Meridian upgrade — to check whether the new version produces comparable results with the same specification.
  • After a code change — to verify that refactoring did not change model outputs.
  • After extending the dataset — to refit the model with additional observations using the same validated specification.

How to refresh

from meridian_tools.lifecycle import build_refresh_run_config
from meridian_tools.runner import run_pipeline

refresh_config = build_refresh_run_config("runs/my-project_blocked_tail_20260402_073500")
result = run_pipeline(refresh_config)

build_refresh_run_config reconstructs a PipelineRunConfig from the source run’s stored configuration:

  • The execution config path points to the source run’s config.resolved.yaml.
  • The source config path points to the source run’s config.source.yaml, so the refreshed run preserves the original authored YAML in its own metadata.
  • The output directory is set to the source run’s parent directory (creating a sibling).
  • The run name suffix is stripped to produce a clean refresh name.
  • For validation runs, the validation spec is reconstructed from the stored validation_spec.json.
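The three path rules in the list above can be sketched as follows. RefreshPaths and derive_refresh_paths are hypothetical names used only for illustration; the real PipelineRunConfig carries more fields, and run-name and validation-spec handling are not shown.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class RefreshPaths:
    """Hypothetical stand-in for the path fields of a refresh config."""

    execution_config: Path
    source_config: Path
    output_dir: Path


def derive_refresh_paths(source_run_dir):
    """Derive refresh paths from a source run directory without touching it."""
    run_dir = Path(source_run_dir)
    return RefreshPaths(
        execution_config=run_dir / "config.resolved.yaml",  # what gets executed
        source_config=run_dir / "config.source.yaml",  # authored YAML, kept for metadata
        output_dir=run_dir.parent,  # the refreshed run becomes a sibling
    )


paths = derive_refresh_paths("runs/my-project_blocked_tail_20260402_073500")
```

Nothing here writes to the source run directory; it is only read from, which is consistent with the immutability guarantee.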

Refresh with overrides

You can override specific settings:

from pathlib import Path

refresh_config = build_refresh_run_config(
    "runs/my-project_blocked_tail_20260402_073500",
    output_dir=Path("runs/refreshed"),
    run_name="my-project-refresh",
)

Validation-aware refresh

If the source run was a validation run (blocked tail or rolling origin), build_refresh_run_config reconstructs the validation spec from the stored artefact, including the holdout mask geometry. For authored-holdout runs, it reuses the YAML-owned holdout from the copied config.

For final-fit runs, the refresh produces another final-fit run with the same full-sample training specification.

Comparing runs

from meridian_tools.lifecycle import compare_run_records

comparison = compare_run_records(
    "runs/my-project_blocked_tail_20260402_073500",
    "runs/my-project_blocked_tail_20260415_090000",
)
print(comparison)

compare_run_records accepts run directory paths (not RunRecord objects) and returns a pandas DataFrame with columns field, left, right, status, and changed. The compared fields include:

  • run_name and status — basic identity.
  • meridian_tools_version and meridian_version — version drift.
  • has_validation_spec and has_diagnostics_bundle — artefact presence.
  • predictive_accuracy_status and review_summary_status — diagnostics.
  • has_model_selection_outputs and model_selection_reason_code — model selection.
  • input_authored_path, input_resolved_path, input_sha256, input_size_bytes, input_mtime_utc, input_row_count, input_column_count, and input_ordered_columns — dataset identity and shape.

This is useful for auditing whether a refresh or a specification change produced materially different results.

If either run predates manifest version 3, provenance rows are reported with status set to "legacy_unknown" and changed set to None. This distinguishes "no stored provenance exists" from "the dataset definitely changed".
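When reading the comparison, it helps to treat a changed value of None as a third state rather than as "unchanged". The sketch below works on plain dictionaries shaped like the output of pandas' DataFrame.to_dict("records"); the rows are invented, triage is a hypothetical helper, and the status strings other than "legacy_unknown" are placeholders.

```python
# Hypothetical rows; only "legacy_unknown" is a status value named in this guide.
rows = [
    {"field": "meridian_version", "left": "1.0", "right": "1.1",
     "status": "different", "changed": True},
    {"field": "has_validation_spec", "left": True, "right": True,
     "status": "same", "changed": False},
    {"field": "input_sha256", "left": None, "right": "ab12",
     "status": "legacy_unknown", "changed": None},
]


def triage(rows):
    """Split comparison rows into changed, unchanged, and unknown field names."""
    buckets = {"changed": [], "unchanged": [], "unknown": []}
    for row in rows:
        if row["changed"] is None:  # for example, legacy_unknown provenance rows
            buckets["unknown"].append(row["field"])
        elif row["changed"]:
            buckets["changed"].append(row["field"])
        else:
            buckets["unchanged"].append(row["field"])
    return buckets


buckets = triage(rows)
```

Testing for None first is the important step: a plain truthiness check would file unknown provenance under "unchanged".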

Lifecycle workflow example

A typical lifecycle workflow for a quarterly model refresh:

from pathlib import Path
from meridian_tools.lifecycle import (
    build_refresh_run_config,
    compare_run_records,  # used in step 3
    list_run_records,
)
from meridian_tools.runner import run_pipeline

# 1. Find the most recent production run
records = list_run_records("runs/")
production_run = records[0]  # Most recent by started_at

# 2. Refresh with the updated dataset
refresh_config = build_refresh_run_config(
    production_run.run_dir,
    output_dir=Path("runs/quarterly-refresh"),
)
refresh_result = run_pipeline(refresh_config)

# 3. Compare the results
comparison = compare_run_records(production_run.run_dir, refresh_result.run_dir)
print(comparison)

Manifest versioning

The lifecycle layer supports manifest versions 0, 1, 2, and 3. Older manifests are handled gracefully with default values for fields that were added in later versions. The current version is 3.

This means you can load run directories created by earlier versions of meridian-tools. The loaded RunRecord keeps the same shape, but input_data_provenance_path is None for pre-v3 runs because those manifests predate provenance capture.
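The version rule for provenance can be pictured with a small function over a manifest dictionary. This is a sketch under loud assumptions: the "version" and "artefacts" key names are hypothetical, provenance_entry is not a library function, and ValueError stands in for LifecycleError.

```python
SUPPORTED_VERSIONS = (0, 1, 2, 3)


def provenance_entry(manifest):
    """Return the provenance artefact entry, or None for pre-v3 manifests."""
    version = manifest["version"]  # hypothetical key name
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported manifest version: {version}")
    if version < 3:
        return None  # predates provenance capture
    artefacts = manifest.get("artefacts", {})  # hypothetical key name
    if "input_data_provenance" not in artefacts:
        raise ValueError("version 3 manifests must list input_data_provenance")
    return artefacts["input_data_provenance"]


legacy = provenance_entry({"version": 2, "artefacts": {}})
current = provenance_entry(
    {"version": 3, "artefacts": {"input_data_provenance": "input_data_provenance.json"}}
)
```

Returning None for older manifests, instead of raising, is what lets callers handle every supported version through the same code path.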