Project

Contributor-facing project documentation, release baselines, and changelog material.

Pages

  • Contributing — This guide covers the development setup, conventions, and workflow for contributing to meridian-tools.
  • Acceptance checklist — Use this page as the canonical local acceptance checklist for the current repository state. Run the commands in this order. The acceptance gate is local and command-driven. It does not depend on CI, GitHub Actions, or unpublished helper scripts.
  • Release baseline — This page records the current milestone release baseline for the repository. Treat it as a validated project state, not as an automated release system. The baseline uses the same local command sequence as the acceptance checklist and records the observed warning profile, the direct runtime dependency bounds, and the accepted trade-offs that still shape the package.
  • Changelog — All notable changes to meridian-tools are documented in this file.

Contributing

This guide covers the development setup, conventions, and workflow for contributing to meridian-tools.

Development setup

Clone and install

git clone <repo-url> meridian-tools
cd meridian-tools
pip install -e ".[dev]"

The [dev] extra installs pytest, ruff, and mypy.

Verify the install

meridian-tools --help
python -m compileall src tests
ruff check src tests
mypy src
pytest tests/ -v

Acceptance gate

Before submitting any change, run the full acceptance sequence from the repository root:

python -m compileall src tests
ruff check src tests
ruff format --check src tests
mypy src
python -m pip install -e . --no-deps
meridian-tools --help
pytest tests/ -v

See acceptance.md for the expected results and how to interpret failures.

Code style

Formatting and linting

The project uses Ruff for both linting and formatting:

# Check
ruff check src tests
ruff format --check src tests

# Auto-fix
ruff check --fix src tests
ruff format src tests

Configuration is in pyproject.toml:

[tool.ruff]
line-length = 120
target-version = "py311"

[tool.ruff.lint]
select = ["E", "F", "I", "UP", "B", "C90", "SIM", "RUF"]

Type annotations

All public functions and classes use type annotations. The codebase uses from __future__ import annotations for forward-reference support.

Import conventions

  • Standard library imports first, then third-party, then local.
  • Heavy dependencies (Meridian, TensorFlow, ArviZ) are imported lazily inside functions, not at module level, in the config/CLI/validation layers.
  • Ruff rule I enforces import sorting.
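
As a minimal sketch of the lazy-import convention above, the import can sit inside the function that needs it, so importing the surrounding module stays cheap. The function name summarise_draws is hypothetical, and the standard-library statistics module is only a stand-in for a heavy dependency so that the example runs anywhere.

```python
from __future__ import annotations


def summarise_draws(draws: list[float]) -> dict[str, float]:
    """Return the mean and sample standard deviation of a list of draws."""
    # Deferred import: the dependency is loaded on the first call, not when
    # this module is imported. `statistics` is a stand-in here; the same
    # pattern applies to Meridian, TensorFlow, or ArviZ.
    import statistics

    return {
        "mean": statistics.fmean(draws),
        "sd": statistics.stdev(draws),
    }
```

Python caches imported modules, so repeated calls pay the import cost only once.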

Configuration models

All Pydantic models use ConfigDict(extra="forbid"). New config fields must be added with appropriate types, defaults, and validators.

Testing

Running tests

# Full suite
pytest tests/ -v

# Specific file
pytest tests/test_runner.py -v

# Specific test
pytest tests/test_runner.py::test_run_pipeline_writes_manifest -v

Test conventions

  • Tests use pytest with tmp_path for temporary directories.
  • monkeypatch is used extensively to mock Meridian internals and isolate unit tests from real MCMC sampling.
  • Module-scoped fixtures (scope="module") are used for expensive model construction in test_log_likelihood.py and test_model_selection.py.
  • Shared test infrastructure is defined inline in individual test modules. There is no top-level conftest.py.

Live Meridian verification

One opt-in command exercises the bounded real Meridian seam:

MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v

This is not part of the default suite. It proves one reduced real pipeline run over bundled demo data, one stored-run refresh after the original YAML is removed, and the lower-level live log-likelihood seam. Run it after Meridian version upgrades and before release-candidate handoff when you want extra confidence beyond the fast suite.
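
This page does not show how the repository implements the opt-in gate. As a sketch of one way such a check could look, the helper below treats the flag as enabled only when it equals "1". The function name real_fit_enabled and the exact-match rule are assumptions.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

ENV_FLAG = "MERIDIAN_TOOLS_ENABLE_REAL_FIT"


def real_fit_enabled(environ: Mapping[str, str] | None = None) -> bool:
    """Return True only when the opt-in flag is set to exactly "1"."""
    # Accept an explicit mapping so the check can be tested without
    # touching the real process environment.
    env = os.environ if environ is None else environ
    return env.get(ENV_FLAG, "") == "1"
```

In a pytest module, a check like this could feed pytest.mark.skipif so that live tests are skipped unless the flag is set.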

Writing new tests

  • Place tests in the appropriate tests/test_<module>.py file.
  • Use monkeypatch to avoid real MCMC sampling in unit tests.
  • Test both success paths and error conditions.
  • Verify artefact file contents, not just their existence.
  • Use tmp_path for all filesystem operations.
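
The sketch below illustrates the difference between checking that an artefact exists and checking what it contains. write_manifest is a hypothetical stand-in for the code under test, not a function from meridian-tools.

```python
from __future__ import annotations

import json
from pathlib import Path


def write_manifest(run_dir: Path, stages: list[str]) -> Path:
    """Hypothetical stand-in for a function that writes a run artefact."""
    run_dir.mkdir(parents=True, exist_ok=True)
    path = run_dir / "manifest.json"
    path.write_text(json.dumps({"stages": stages}), encoding="utf-8")
    return path


def test_write_manifest_records_stages(tmp_path: Path) -> None:
    # tmp_path is the pytest fixture that provides a per-test directory.
    path = write_manifest(tmp_path / "run", ["00_run_metadata", "20_model_fit"])

    # Existence alone would pass even if the file were empty or malformed.
    assert path.exists()

    # Parse the artefact and assert on what it actually contains.
    payload = json.loads(path.read_text(encoding="utf-8"))
    assert payload == {"stages": ["00_run_metadata", "20_model_fit"]}
```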

Project structure

meridian-tools/
├── src/meridian_tools/       # Package source
│   ├── __init__.py           # Lazy-loading exports
│   ├── artifacts.py          # Manifest helpers
│   ├── cli.py                # CLI entry point
│   ├── config.py             # Pydantic models
│   ├── cv.py                 # Validation splits
│   ├── demo.py               # Demo discovery
│   ├── diagnostics.py        # Diagnostics export
│   ├── exports.py            # Meridian export wrappers
│   ├── launcher.py           # Run execution wrapper
│   ├── lifecycle.py          # Post-run management
│   ├── log_likelihood.py     # Log-likelihood adapter
│   ├── model_selection.py    # LOO/WAIC wrappers
│   ├── terminal.py           # CLI presentation
│   └── version.py            # Static version
├── tests/                    # Test suite
│   └── _demo_data/           # Bundled demo data (packaged)
├── docs/                     # Documentation
├── runme.py                  # Source-tree demo launcher
└── pyproject.toml            # Build and dependency config

Versioning

The version is defined in src/meridian_tools/version.py:

__version__ = "0.3.0"

Version bumps are manual edits. Update this file when preparing a release.
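
Because the bump is manual, a small consistency check can help when preparing a release. The sketch below is not a repository tool; it assumes changelog release headings start with a bracketed version such as [0.3.0], and both function names are hypothetical.

```python
from __future__ import annotations

import re

# Matches release headings such as "[0.3.0]"; "[Unreleased]" is skipped
# because it contains no digits.
RELEASE_HEADING = re.compile(r"^\[(\d+\.\d+\.\d+)\]", re.MULTILINE)


def latest_changelog_version(changelog_text: str) -> str | None:
    """Return the first released version listed in the changelog, if any."""
    match = RELEASE_HEADING.search(changelog_text)
    return match.group(1) if match else None


def version_matches_changelog(version: str, changelog_text: str) -> bool:
    """Check that __version__ equals the newest released changelog entry."""
    return latest_changelog_version(changelog_text) == version
```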

Documentation

Documentation lives in docs/. When adding new features:

  1. Update relevant guide or reference pages.
  2. Add API documentation for new public functions or classes.
  3. Update the YAML schema reference if config fields changed.
  4. Update the output schema if new artefacts are produced.

Common pitfalls

  • Do not import Meridian at module level in config, CLI, or validation modules. This breaks CLI responsiveness.
  • Do not add extra="allow" to Pydantic models. The extra="forbid" policy prevents silent misconfiguration.
  • Do not modify source run directories in lifecycle operations. Always create new sibling directories.
  • Do not weaken or delete existing tests without explicit direction.
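
The sibling-directory rule above can be sketched as follows. The helper name new_sibling_run_dir and the naming scheme are assumptions for illustration; the lifecycle module may name directories differently.

```python
from __future__ import annotations

from pathlib import Path


def new_sibling_run_dir(source_run_dir: Path, suffix: str = "refresh") -> Path:
    """Create and return a fresh directory next to source_run_dir."""
    parent = source_run_dir.parent
    counter = 1
    while True:
        candidate = parent / f"{source_run_dir.name}-{suffix}-{counter:02d}"
        try:
            # exist_ok=False makes creation fail rather than reuse a directory.
            candidate.mkdir(exist_ok=False)
        except FileExistsError:
            counter += 1
            continue
        return candidate
```

The source directory is only read to derive a name; nothing is written inside it.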

Acceptance checklist

Use this page as the canonical local acceptance checklist for the current repository state. Run the commands in this order. The acceptance gate is local and command-driven. It does not depend on CI, GitHub Actions, or unpublished helper scripts.

Acceptance gate

Run the following commands from the repository root:

python -m compileall src tests
ruff check src tests
ruff format --check src tests
mypy src
python -m pip install -e . --no-deps
meridian-tools --help
pytest tests/ -v

The canonical acceptance-gate result for the last command is:

244 passed, 2 skipped

That result is the pass or fail line for the default local acceptance gate. The recorded warning profile belongs to the release baseline, not to the acceptance-gate definition itself.

What each command proves

python -m compileall src tests proves that the checked-in Python files parse cleanly. compileall byte-compiles each file without executing it, so a failure here means a syntax error, not a runtime import problem. Stop and fix it before running the later steps.

ruff check src tests proves that the repository still satisfies the pinned lint rules. If this step fails, fix the reported lint violations before moving on.

ruff format --check src tests proves that the checked-in files still match the agreed formatting contract. If this step fails, run the formatter and then rerun the verification sequence.

mypy src proves that the configured static typing baseline still runs cleanly. If this step fails, either fix the reported type issue or update the documented ratchet intentionally.

python -m pip install -e . --no-deps proves that the package still builds and installs in editable mode from the local source tree. If this fails, treat it as a packaging or build-metadata break rather than a test-only problem.

meridian-tools --help proves that the published CLI entrypoint still resolves and that the lightweight command surface still imports cleanly. If this step fails, check the package entrypoint and import boundary before continuing.

pytest tests/ -v proves the behavioural contract of the repository. This is the broadest local validation step. If it fails, use the failing test names to identify which package contract regressed.

How to interpret failure

If the compile step fails, fix syntax or parse problems first. The later steps will not give you useful signal until that is resolved.

If lint, format, or type checks fail, treat that as a source-tree quality issue, not as an optional clean-up item. Bring the tree back to the pinned Ruff and mypy state before trusting the rest of the loop.

If editable install fails, treat the repository as not ready for contributor handoff. The package must install cleanly before the test result matters.

If CLI help fails, assume the published command surface is broken even if the Python modules still import manually.

If pytest tests/ -v fails, the acceptance gate is not met. A partial pass is not enough. Fix the failing behavioural contract and rerun the full command sequence.
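
The gate is defined by the commands themselves, not by a helper script. Purely as an illustration of the stop-at-first-failure reading described above, the sketch below runs a command list in order and reports the first command that exits non-zero. It is not part of the repository, and first_failure is a hypothetical name.

```python
from __future__ import annotations

import subprocess
from collections.abc import Callable, Sequence

# The acceptance commands, in the documented order.
GATE_COMMANDS: tuple[tuple[str, ...], ...] = (
    ("python", "-m", "compileall", "src", "tests"),
    ("ruff", "check", "src", "tests"),
    ("ruff", "format", "--check", "src", "tests"),
    ("mypy", "src"),
    ("python", "-m", "pip", "install", "-e", ".", "--no-deps"),
    ("meridian-tools", "--help"),
    ("pytest", "tests/", "-v"),
)


def _run(command: Sequence[str]) -> int:
    """Run one command from the current directory and return its exit code."""
    return subprocess.run(list(command), check=False).returncode


def first_failure(
    commands: Sequence[Sequence[str]] = GATE_COMMANDS,
    run: Callable[[Sequence[str]], int] = _run,
) -> Sequence[str] | None:
    """Run commands in order; return the first failing one, or None."""
    for command in commands:
        if run(command) != 0:
            # Later steps give no useful signal once an earlier one fails.
            return command
    return None
```

Calling first_failure() from the repository root returns None only when every step exits with code 0. Comparing the pytest summary line against the recorded result is still a separate, manual check.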

Optional extra confidence

The repository also carries one opt-in live Meridian verification command for extra technical confidence:

MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v

This command is not part of the default blocking acceptance gate. It exists to provide one bounded live Meridian route that proves:

  • real pipeline execution over bundled demo data
  • manifest-backed stored-run refresh after the original YAML is removed
  • the lower-level live log-likelihood reconstruction seam

On the reference development environment, the recorded run finished in 185.42 seconds (0:03:05); keep a budget of roughly six minutes or less for this extra-confidence command.

Release baseline

This page records the current milestone release baseline for the repository. Treat it as a validated project state, not as an automated release system. The baseline uses the same local command sequence as the acceptance checklist and records the observed warning profile, the direct runtime dependency bounds, and the accepted trade-offs that still shape the package.

Release-ready definition in this repository

The repository is release-ready only when all of the following hold:

  • The documented local acceptance command set passes.
  • pytest tests/ -v returns the recorded pass/skip count below.
  • The same validated run is recorded with the observed warning count.
  • The warning categories match the accepted ones below.
  • The accepted trade-offs remain explicit rather than hidden.

Validated baseline record

The current verified local baseline is:

python -m compileall src tests
ruff check src tests
ruff format --check src tests
mypy src
python -m pip install -e . --no-deps
meridian-tools --help
pytest tests/ -v
-> 244 passed, 2 skipped, 60 warnings
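
Comparing a pytest summary line against the recorded baseline can be sketched as below. This is an illustrative helper, not a repository tool; it assumes the usual "N passed, N skipped, N warnings" wording of the pytest summary, and the function names are hypothetical.

```python
from __future__ import annotations

import re

# Counts recorded in the baseline above.
BASELINE = {"passed": 244, "skipped": 2, "warnings": 60}

_COUNT = re.compile(r"(\d+) (passed|failed|skipped|errors?|warnings?)")


def parse_summary(line: str) -> dict[str, int]:
    """Extract outcome counts from a pytest summary line."""
    counts: dict[str, int] = {}
    for number, label in _COUNT.findall(line):
        # Normalise singular forms such as "1 warning".
        key = {"warning": "warnings", "error": "errors"}.get(label, label)
        counts[key] = int(number)
    return counts


def matches_baseline(line: str, baseline: dict[str, int] = BASELINE) -> bool:
    """Return True when the line reports exactly the baseline counts."""
    counts = parse_summary(line)
    if counts.get("failed", 0) or counts.get("errors", 0):
        return False
    return all(counts.get(key, 0) == value for key, value in baseline.items())
```

A count match says nothing about warning categories, so those still need to be reviewed against the accepted profile.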

The optional extra-confidence live path remains separate:

MERIDIAN_TOOLS_ENABLE_REAL_FIT=1 pytest tests/test_demo_integration.py::test_real_pipeline_refresh_smoke tests/test_log_likelihood.py::test_compute_log_likelihood_dataset_real_posterior_smoke -m real_fit -v
-> 2 passed, 47 warnings in 185.42s (0:03:05)

That command remains opt-in local confidence, not the default developer loop or silent CI policy. On the reference development environment, the recorded run finished in 185.42 seconds (0:03:05); keep a budget of roughly six minutes or less for ordinary local execution.

Runtime dependency boundary

The current runtime boundary recorded from pyproject.toml is:

  • requires-python >=3.11
  • google-meridian==1.5.3
  • arviz>=0.18.0,<0.20.0
  • pandas>=2.2.0,<3
  • pydantic>=2.8.0,<3
  • PyYAML>=6.0.0,<7

These are the direct runtime dependency bounds for the milestone baseline. This page does not imply broader environment reproducibility than the repository currently implements.
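
To make the bounds concrete, the sketch below checks a plain dotted version string against the ranges listed above. It is a simplification: it handles only numeric releases such as "2.2.3" and ignores pre-releases and other PEP 440 forms. The helper names are hypothetical.

```python
from __future__ import annotations

# Bounds copied from the list above: (inclusive lower, exclusive upper).
BOUNDS: dict[str, tuple[str, str]] = {
    "arviz": ("0.18.0", "0.20.0"),
    "pandas": ("2.2.0", "3"),
    "pydantic": ("2.8.0", "3"),
    "PyYAML": ("6.0.0", "7"),
}
# Exact pins.
PINNED: dict[str, str] = {"google-meridian": "1.5.3"}


def _release(version: str) -> tuple[int, ...]:
    """Parse a plain dotted release; pre-release suffixes are not supported."""
    parts = tuple(int(part) for part in version.split("."))
    # Pad so that "3" compares equal to "3.0.0".
    return parts + (0,) * (3 - len(parts))


def within_bounds(name: str, version: str) -> bool:
    """Return True when version satisfies the recorded bound for name."""
    if name in PINNED:
        return _release(version) == _release(PINNED[name])
    lower, upper = BOUNDS[name]
    return _release(lower) <= _release(version) < _release(upper)
```

Installed versions can be read with importlib.metadata.version, and the packaging library's SpecifierSet is the more robust choice when full PEP 440 matching is needed.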

Accepted warning profile

The recorded 60 warnings are accepted in the current milestone baseline. They fall into two pinned categories:

  • Meridian model / prior warnings
  • ArviZ model-selection warnings

This baseline does not pretend the repository is warning-free. It records the current observed warning profile honestly and treats those warning categories as accepted for the present milestone.

Accepted trade-offs

The current release baseline also depends on several explicit trade-offs.

The package takes a no-fork Meridian approach. We keep Meridian as the modelling engine and add workflow and compatibility tooling around it rather than modifying Meridian source.

Bayesian model selection remains intentionally limited to fitted Meridian models where holdout_id is None. Validation-fit and authored-holdout runs are not treated as compatible LOO or WAIC candidates.
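
The compatibility rule above can be sketched as a small guard. The function name and the reason code string are hypothetical, and the repository's actual check and its ModelSelectionError reason codes may differ.

```python
from __future__ import annotations

from typing import Any


def model_selection_skip_reason(model_spec_kwargs: dict[str, Any]) -> str | None:
    """Return None if LOO/WAIC may run, else a short reason code."""
    # A missing key and an explicit None both mean a full-sample fit.
    if model_spec_kwargs.get("holdout_id") is not None:
        return "holdout_id_present"  # hypothetical reason code
    return None
```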

Lifecycle tooling remains Python-first. The repository does not currently ship a broader lifecycle CLI.

Version bumping remains a manual edit rather than a fully automated release pipeline.

Boundary of this record

This page records one validated milestone state. It does not introduce CI as the source of truth. It does not define publish automation. It does not promise zero warnings. It does not claim a broader release process than the repository actually supports today.

Changelog

All notable changes to meridian-tools are documented in this file.

The format is based on Keep a Changelog.

[Unreleased]

[0.3.0] — 2026-04-24

Changed

  • CLI single source of truth — runme.py now delegates directly to meridian_tools.cli, removing duplicate root-level argument parsing.
  • Typed runner state — Pipeline orchestration now uses PipelineContext for shared stage state.
  • Shared posterior sampling — Runner posterior sampling keyword mapping is centralized in one helper.
  • Lifecycle comparison schema — Run comparison rows are generated from declarative comparison field descriptors.
  • Meridian compatibility pin — The package pins google-meridian[schema]==1.5.3, and log-likelihood reconstruction refuses unvalidated Meridian versions.
  • Static analysis tooling — Development extras now include mypy, and Ruff enables additional complexity, simplification, and Ruff-specific rule families.

Fixed

  • Optimized Python safety — Validation helpers now use explicit exceptions instead of assert for runtime invariants.
  • Shared confidence validation — Response curve and optimisation configs share one confidence_level validator.
  • Export coercion documentation — NetCDF attribute coercion now documents its input-to-output type mapping.

[0.2.0] — 2026-04-07

Added

  • Docs site build — Hugo-based website documentation under docs-site/, generated from the repository Markdown set by docs-site/build_content.py.
  • Manifest v3 provenance — Explicit input_data_provenance capture for stored runs and lifecycle refresh or compare workflows.
  • Typed failure boundaries — ConfigPreflightError, ValidationExecutionContractError, and PipelineRunFailure distinguish wrapper-owned preflight, validation contract misuse, and post-directory runtime failures.
  • Bounded live verification — An opt-in Meridian real-fit smoke route gated behind MERIDIAN_TOOLS_ENABLE_REAL_FIT=1.
  • Module-path CLI contract — Explicit support and regression coverage for python -m meridian_tools.cli ....

Changed

  • Shared launch flow — meridian-tools and the repo-root runme.py launcher now share one launch flow for config loading, preflight checks, progress reporting, and terminal success or failure output.
  • Packaged demo assets — Bundled demo configs and datasets are resolved from packaged _demo_data, so demo runs work from installed wheels as well as source checkouts.
  • Default demo fit mode — Bundled demos now default to full-sample fits (validation.strategy: none), so loo_summary.json and waic_summary.json are generated by default and 10_validation is recorded as skipped.
  • Refresh contract — Stored-run refresh now reloads from the saved resolved config while preserving the original source config copy in run metadata.
  • Lifecycle compare semantics — Compare now distinguishes legacy runs without dataset provenance from real dataset changes.
  • Documentation layout — Public documentation is reorganised under docs/ into getting-started, guides, reference, concepts, and project sections.

Fixed

  • Structured public entrypoint failures — Missing or invalid config paths in public entrypoints now produce structured failure output instead of raw Python tracebacks unless --traceback is used.
  • Relative-path refresh — Refreshing a stored run with relative data.path input no longer depends on the original source config location remaining present on disk.
  • Partial-run failure reporting — Failed runs that already created an output directory now report the concrete run directory, manifest path, and failing stage through the CLI and runme.py.
  • Docs-site theme resolution — Hugo builds resolve the Relearn theme through a pinned module dependency instead of requiring a local theme checkout.

[0.1.0] — 2026-04-02

Added

  • Typed YAML configuration — Pydantic-validated config with extra="forbid" strictness for all sections: project, data, model_spec, fit, validation, exports, response_curves, optimisation.
  • Staged pipeline runner — Sequential execution through 00_run_metadata, 10_validation, 20_model_fit, 30_model_assessment, 40_decomposition, 60_response_curves, 70_optimisation with manifest persistence after each stage.
  • Validation orchestration — blocked_tail and rolling_origin time-series validation strategies with auto-generated holdout masks. Authored holdouts pass through model_spec.kwargs.holdout_id.
  • Diagnostics bundling — diagnostics_bundle.json manifest with optional predictive_accuracy.csv and review_summary.json exports.
  • Bayesian model selection — Compatibility-aware LOO and WAIC computation through ArviZ, with automatic log-likelihood reconstruction for fitted Meridian models. Graceful degradation for incompatible runs through structured ModelSelectionError with reason codes.
  • Response curves export — Configurable spend multiplier grid with NetCDF and CSV outputs.
  • Optimisation export — Fixed-budget and relative-budget optimisation with full artefact set including allocation charts.
  • Plot exports — PNG plot artefacts through Altair/vl-convert for model fit, diagnostics, decomposition, response curves, and optimisation stages.
  • Lifecycle management — load_run_record, list_run_records, build_refresh_run_config, compare_run_records for post-run analysis and reproducible refresh workflows.
  • CLI — meridian-tools run and meridian-tools demo subcommands with lightweight imports for fast startup.
  • Bundled demos — timeseries and geo_panel reference workflows with packaged data and configs.
  • Manifest versioning — Support for manifest versions 0, 1, and 2 with backward-compatible deserialisation.
  • Comprehensive test suite — 218 tests across 15 test files covering configuration, validation, pipeline execution, exports, diagnostics, model selection, lifecycle, and demos.