meridian_tools.artifacts

Manifest and JSON helpers for run artefact management.

Module: meridian_tools.artifacts

Functions

write_json

def write_json(path: str | Path, payload: Any) -> None

Write a JSON-serialisable payload to disk with UTF-8 encoding and 2-space indentation. Creates parent directories if they do not exist.


write_manifest

def write_manifest(path: str | Path, manifest: RunManifest) -> None

Serialise and write a RunManifest to disk as JSON using write_json.


normalize_artifact_paths

def normalize_artifact_paths(
    run_dir: str | Path,
    artifacts: Mapping[str, str | Path],
) -> dict[str, str]

Convert artefact paths to paths relative to run_dir so that the manifest stores portable references.

Parameters:

  • run_dir — The run directory root.
  • artifacts — Mapping of artefact names to file paths.

Returns: Dictionary mapping artefact names to relative path strings.


timestamp_utc

def timestamp_utc() -> str

Return the current time as a UTC ISO-8601 string with second precision.
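
A timestamp of this shape can be produced with the datetime module. The sketch below is an assumption about the format: it emits an explicit "+00:00" offset, whereas the real timestamp_utc might use a "Z" suffix or another ISO-8601 variant. timestamp_utc_sketch is a hypothetical name.

```python
from datetime import datetime, timezone


def timestamp_utc_sketch() -> str:
    """Hypothetical stand-in for timestamp_utc."""
    # Drop microseconds so the string has second precision.
    now = datetime.now(timezone.utc).replace(microsecond=0)
    return now.isoformat()
```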


Classes

RunManifest

@dataclass
class RunManifest

Machine-readable summary of one meridian-tools run.

Attribute               Type               Default                   Description
run_name                str                required                  Human-readable run name.
config_path             Path               required                  Path to the authored YAML config file.
output_dir              Path               required                  Path to the run directory.
started_at              str                required                  UTC ISO-8601 start timestamp.
manifest_version        int                CURRENT_MANIFEST_VERSION  Schema version (0, 1, 2, or 3).
status                  str                "running"                 Overall run status: "running", "completed", or "failed".
finished_at             str | None         None                      UTC ISO-8601 finish timestamp. None while the run is in progress.
meridian_tools_version  str                __version__               Version of meridian-tools.
meridian_version        str | None         None                      Version of Google Meridian.
artifacts               dict[str, str]     {}                        Top-level artefact index. Key artefacts from stages are promoted here.
stages                  list[StageRecord]  []                        Ordered list of stage records (completed, skipped, and failed).

Class methods:

  • from_dict(payload: Mapping[str, Any]) -> RunManifest — Deserialise from a JSON-parsed dictionary. Supports manifest versions 0, 1, 2, and 3 with default values for missing fields in older versions. Raises ValueError for unsupported versions or missing required fields.

Instance methods:

  • to_dict() -> dict[str, Any] — Serialise to a JSON-compatible dictionary.

StageRecord

@dataclass
class StageRecord

One pipeline stage entry in the run manifest.

Attribute        Type            Default    Description
name             str             required   Stage identifier (for example, "00_run_metadata").
status           str             "pending"  Stage status: "pending", "running", "completed", "skipped", or "failed".
started_at       str | None      None       UTC ISO-8601 start timestamp.
finished_at      str | None      None       UTC ISO-8601 finish timestamp.
elapsed_seconds  float | None    None       Wall-clock seconds for stage execution.
message          str | None      None       Human-readable message (skip reason or error detail).
artifacts        dict[str, str]  {}         Map of artefact names to relative paths. Empty for skipped stages.

Class methods:

  • from_dict(payload: Mapping[str, Any]) -> StageRecord — Deserialise from a JSON-parsed dictionary. Raises ValueError if name is missing.

InputDataProvenance

@dataclass(frozen=True)
class InputDataProvenance

Pinned input-data provenance payload used by manifest version 3 runs.

Attribute           Type             Default                        Description
authored_path       str              required                       Exact data.path string from the source YAML.
resolved_path       str              required                       Absolute runtime path used for input loading.
sha256              str              required                       SHA-256 digest of the resolved input file.
size_bytes          int              required                       Input file size in bytes.
mtime_utc           str              required                       Input file modification time in UTC ISO-8601 format.
row_count           int              required                       Number of CSV data rows.
column_count        int              required                       Number of CSV columns.
ordered_columns     tuple[str, ...]  required                       CSV header order.
provenance_version  int              INPUT_DATA_PROVENANCE_VERSION  Payload schema version.

Class methods:

  • from_dict(payload: Mapping[str, Any]) -> InputDataProvenance — Deserialise from a JSON-parsed dictionary, validating the exact pinned Phase 09 key set and value types.

Instance methods:

  • to_dict() -> dict[str, Any] — Serialise to the exact JSON payload written into input_data_provenance.json.

Constants

CURRENT_MANIFEST_VERSION

CURRENT_MANIFEST_VERSION: int = 3

SUPPORTED_MANIFEST_VERSIONS

SUPPORTED_MANIFEST_VERSIONS: tuple[int, ...] = (0, 1, 2, 3)

INPUT_DATA_PROVENANCE_VERSION

INPUT_DATA_PROVENANCE_VERSION: int = 1

REQUIRED_MANIFEST_ARTIFACTS

REQUIRED_MANIFEST_ARTIFACTS: tuple[str, ...] = (
    "config_resolved",
    "config_source",
    "input_data_provenance",
    "diagnostics_bundle",
)

The runner validates these artefact entries when a run completes. New runs must produce all four to complete successfully.

The lifecycle loader is more lenient. It requires config_source and config_resolved for all supported manifests, and additionally requires input_data_provenance for manifest version 3 runs. diagnostics_bundle remains optional at load time, so older or partial runs can still be loaded without it.
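
The difference between the two checks can be sketched as follows. The function names are hypothetical and the real runner and loader may report problems differently; only the required keys per check are taken from the description above.

```python
from typing import Mapping

REQUIRED_AT_COMPLETION = (
    "config_resolved",
    "config_source",
    "input_data_provenance",
    "diagnostics_bundle",
)


def missing_at_completion(artifacts: Mapping[str, str]) -> list[str]:
    """Completion-time check: all four entries must be present."""
    return [name for name in REQUIRED_AT_COMPLETION if name not in artifacts]


def missing_at_load(manifest_version: int, artifacts: Mapping[str, str]) -> list[str]:
    """Load-time check: a smaller, version-dependent set of entries."""
    required = ["config_source", "config_resolved"]
    if manifest_version == 3:
        # Provenance is only enforced for manifest version 3 runs.
        required.append("input_data_provenance")
    # diagnostics_bundle is deliberately not checked here.
    return [name for name in required if name not in artifacts]
```

Under these rules, a version 2 manifest with only the two config entries loads cleanly, while the same artefact set would not satisfy the completion check for a new run.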