Contributor-facing project documentation, release baselines, and changelog material.
Pages
Contributing — This guide covers the development setup, conventions, and workflow for contributing to meridian-tools.
Acceptance checklist — Use this page as the canonical local acceptance checklist for the current repository state. Run the commands in this order. The acceptance gate is local and command-driven. It does not depend on CI, GitHub Actions, or unpublished helper scripts.
Release baseline — This page records the current milestone release baseline for the repository. Treat it as a validated project state, not as an automated release system. The baseline uses the same local command sequence as the acceptance checklist and records the observed warning profile, the direct runtime dependency bounds, and the accepted trade-offs that still shape the package.
Changelog — All notable changes to meridian-tools are documented in this file.
Contributing
This guide covers the development setup, conventions, and workflow for
contributing to meridian-tools.
Development setup
Clone and install
git clone <repo-url> meridian-tools
cd meridian-tools
pip install -e ".[dev]"
All public functions and classes use type annotations. The codebase uses
from __future__ import annotations for forward-reference support.
Import conventions
Standard library imports first, then third-party, then local.
Heavy dependencies (Meridian, TensorFlow, ArviZ) are imported lazily inside
functions, not at module level, in the config/CLI/validation layers.
Ruff rule I enforces import sorting.
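The lazy-import convention above can be sketched as follows. This is a minimal illustration, not code from the repository: `build_parser` and `run` are hypothetical names, and the standard-library `statistics` module stands in for a heavy dependency such as Meridian.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Cheap path: only the standard library is needed to answer --help.
    parser = argparse.ArgumentParser(prog="example-tool")
    parser.add_argument("values", nargs="+", type=float)
    return parser


def run(values: list[float]) -> float:
    # Stand-in for a heavy dependency (Meridian, TensorFlow, ArviZ):
    # imported here, on first use, rather than at module level.
    import statistics

    return statistics.fmean(values)
```

With this layout, importing the module or building the parser never pays the import cost; only a call to `run` does.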
Configuration models
All Pydantic models use ConfigDict(extra="forbid"). New config fields must
be added with appropriate types, defaults, and validators.
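As a rough sketch of what this policy looks like in practice, assuming Pydantic v2: `ExampleFitConfig` and its fields are hypothetical and do not correspond to a real meridian-tools config section.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class ExampleFitConfig(BaseModel):  # hypothetical section, for illustration only
    model_config = ConfigDict(extra="forbid")

    n_chains: int = Field(default=4, ge=1)  # typed, defaulted, validated
    seed: Optional[int] = None


def fails_validation(raw: dict) -> bool:
    """Return True if the raw mapping is rejected by the model."""
    try:
        ExampleFitConfig.model_validate(raw)
    except ValidationError:
        return True
    return False
```

A misspelt key such as `n_chain` is rejected instead of being silently ignored, which is the point of `extra="forbid"`.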
Testing
Running tests
# Full suite
pytest tests/ -v
# Specific file
pytest tests/test_runner.py -v
# Specific test
pytest tests/test_runner.py::test_run_pipeline_writes_manifest -v
Test conventions
Tests use pytest with tmp_path for temporary directories.
monkeypatch is used extensively to mock Meridian internals and isolate
unit tests from real MCMC sampling.
Module-scoped fixtures (scope="module") are used for expensive model
construction in test_log_likelihood.py and test_model_selection.py.
Shared test infrastructure is defined inline in individual test modules.
There is no top-level conftest.py.
Live Meridian verification
One opt-in command exercises the bounded real Meridian seam:
This is not part of the default suite. It proves one reduced real pipeline run
over bundled demo data, one stored-run refresh after the original YAML is
removed, and the lower-level live log-likelihood seam. Run it after Meridian
version upgrades and before release-candidate handoff when you want extra
confidence beyond the fast suite.
Writing new tests
Place tests in the appropriate tests/test_<module>.py file.
Use monkeypatch to avoid real MCMC sampling in unit tests.
Test both success paths and error conditions.
Verify artefact file contents, not just their existence.
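The guidelines above might translate into a test module shaped like the sketch below. Everything in it is hypothetical: `fake_engine` stands in for a module that would normally call Meridian, and `write_manifest` is an invented unit under test. Only the use of the `tmp_path` and `monkeypatch` fixtures follows the conventions described here.

```python
import json
from pathlib import Path
from types import SimpleNamespace

# Hypothetical stand-in for a module that would normally run MCMC sampling.
fake_engine = SimpleNamespace(sample=lambda: {"draws": 1000})


def write_manifest(out_dir: Path) -> Path:
    """Hypothetical unit under test: runs the engine, then writes a manifest."""
    result = fake_engine.sample()
    path = out_dir / "manifest.json"
    path.write_text(json.dumps({"status": "ok", "draws": result["draws"]}))
    return path


def test_write_manifest_records_draws(tmp_path, monkeypatch):
    # Replace the expensive call so no real sampling happens.
    monkeypatch.setattr(fake_engine, "sample", lambda: {"draws": 3})
    path = write_manifest(tmp_path)
    # Check the artefact contents, not just its existence.
    assert json.loads(path.read_text()) == {"status": "ok", "draws": 3}


def test_write_manifest_propagates_engine_failure(tmp_path, monkeypatch):
    def boom():
        raise RuntimeError("sampling failed")

    monkeypatch.setattr(fake_engine, "sample", boom)
    try:
        write_manifest(tmp_path)
    except RuntimeError:
        pass
    else:
        raise AssertionError("expected RuntimeError")
    # The error path must not leave a manifest behind.
    assert not (tmp_path / "manifest.json").exists()
```

In a real test module the error case would more commonly be written with `pytest.raises`; the plain `try`/`except` form is used here only to keep the sketch free of extra imports.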
The version is defined in src/meridian_tools/version.py:
__version__ = "0.3.0"
Version bumps are manual edits. Update this file when preparing a release.
Documentation
Documentation lives in docs/. When adding new features:
Update relevant guide or reference pages.
Add API documentation for new public functions or classes.
Update the YAML schema reference if config fields changed.
Update the output schema if new artefacts are produced.
Common pitfalls
Do not import Meridian at module level in config, CLI, or validation
modules. This breaks CLI responsiveness.
Do not add extra="allow" to Pydantic models. The extra="forbid"
policy prevents silent misconfiguration.
Do not modify source run directories in lifecycle operations. Always
create new sibling directories.
Do not weaken or delete existing tests without explicit direction.
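To illustrate the sibling-directory rule among the pitfalls above, here is a minimal sketch. `new_sibling_dir` is a hypothetical helper and its naming scheme is an assumption; the repository's lifecycle code may name and create directories differently.

```python
from pathlib import Path


def new_sibling_dir(source_run_dir: Path, suffix: str = "refresh") -> Path:
    """Create and return a fresh directory next to source_run_dir.

    Hypothetical helper: the <name>_<suffix>_<NN> scheme is an assumption.
    The source directory itself is never written to.
    """
    parent = source_run_dir.parent
    counter = 1
    while True:
        candidate = parent / f"{source_run_dir.name}_{suffix}_{counter:02d}"
        try:
            candidate.mkdir()  # raises if the directory already exists
            return candidate
        except FileExistsError:
            counter += 1
```

Because `mkdir()` is called without `exist_ok`, an existing directory is never reused, so two refreshes of the same run cannot overwrite each other.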
Acceptance checklist
Use this page as the canonical local acceptance checklist for the current
repository state. Run the commands in this order. The acceptance gate is local
and command-driven. It does not depend on CI, GitHub Actions, or unpublished
helper scripts.
Acceptance gate
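Run the following commands from the repository root:
python -m compileall src tests
ruff check src tests
ruff format --check src tests
mypy src
python -m pip install -e . --no-deps
meridian-tools --help
pytest tests/ -v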
The canonical acceptance-gate result for the last command is:
244 passed, 2 skipped
That result is the pass or fail line for the default local acceptance gate.
The recorded warning profile belongs to the release baseline, not to the
acceptance-gate definition itself.
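If you want to compare a pytest summary line against the recorded result mechanically, a small sketch could look like this. The helper names are hypothetical, and the parsing assumes the usual `<count> <outcome>` layout of pytest's final summary line.

```python
import re

EXPECTED = {"passed": 244, "skipped": 2}  # recorded gate result from this page


def parse_summary(line: str) -> dict[str, int]:
    """Extract '<count> <outcome>' pairs from a pytest summary line."""
    return {name: int(count) for count, name in re.findall(r"(\d+) ([a-z]+)", line)}


def gate_met(line: str) -> bool:
    counts = parse_summary(line)
    # Any failed or errored test means the gate is not met.
    if any(counts.get(key, 0) for key in ("failed", "error", "errors")):
        return False
    # Warning counts are ignored here: they belong to the release baseline.
    return all(counts.get(key) == value for key, value in EXPECTED.items())
```

The warning count is deliberately left out of the comparison, matching the statement that the warning profile is not part of the gate definition.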
What each command proves
python -m compileall src tests proves that the checked-in Python files compile
to bytecode cleanly. If this step fails, you are dealing with a syntax error or
another compile-time problem and you should stop there.
ruff check src tests proves that the repository still satisfies the pinned
lint rules. If this step fails, fix the reported lint violations before moving
on.
ruff format --check src tests proves that the checked-in files still match
the agreed formatting contract. If this step fails, run the formatter and then
rerun the verification sequence.
mypy src proves that the configured static typing baseline still runs
cleanly. If this step fails, either fix the reported type issue or update the
documented ratchet intentionally.
python -m pip install -e . --no-deps proves that the package still builds
and installs in editable mode from the local source tree. If this fails, treat
it as a packaging or build-metadata break rather than a test-only problem.
meridian-tools --help proves that the published CLI entrypoint still resolves
and that the lightweight command surface still imports cleanly. If this step
fails, check the package entrypoint and import boundary before continuing.
pytest tests/ -v proves the behavioural contract of the repository. This is
the broadest local validation step. If it fails, use the failing test names to
identify which package contract regressed.
How to interpret failure
If the compile step fails, fix syntax or parse problems first. The later steps
will not give you useful signal until that is resolved.
If lint, format, or type checks fail, treat that as a source-tree quality
issue, not as an optional clean-up item. Bring the tree back to the pinned
Ruff and mypy state before trusting the rest of the loop.
If editable install fails, treat the repository as not ready for contributor
handoff. The package must install cleanly before the test result matters.
If CLI help fails, assume the published command surface is broken even if the
Python modules still import manually.
If pytest tests/ -v fails, the acceptance gate is not met. A partial pass is
not enough. Fix the failing behavioural contract and rerun the full command
sequence.
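The stop-at-first-failure reading of the sequence can be sketched as a small driver. This is not a script shipped by the repository (the gate explicitly does not depend on helper scripts); `first_failure` and `GATE` are hypothetical names, and `sys.executable` is used in place of a bare `python` so the sketch runs with the current interpreter.

```python
import subprocess
import sys
from typing import Optional, Sequence

# The documented gate commands, in order, as argument lists.
GATE = [
    [sys.executable, "-m", "compileall", "src", "tests"],
    ["ruff", "check", "src", "tests"],
    ["ruff", "format", "--check", "src", "tests"],
    ["mypy", "src"],
    [sys.executable, "-m", "pip", "install", "-e", ".", "--no-deps"],
    ["meridian-tools", "--help"],
    ["pytest", "tests/", "-v"],
]


def first_failure(commands: Sequence[Sequence[str]]) -> Optional[Sequence[str]]:
    """Run commands in order; return the first that exits non-zero, else None."""
    for command in commands:
        if subprocess.run(command).returncode != 0:
            return command  # stop here: later steps give no useful signal
    return None
```

Calling `first_failure(GATE)` from the repository root returns `None` only when every step exits with status 0. A command that is not installed raises `FileNotFoundError` rather than returning a failure, which is worth handling if you adapt this.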
Optional extra confidence
The repository also carries one opt-in live Meridian verification command for
extra technical confidence:
This command is not part of the default blocking acceptance gate. It exists to
provide one bounded live Meridian route that proves:
real pipeline execution over bundled demo data
manifest-backed stored-run refresh after the original YAML is removed
the lower-level live log-likelihood reconstruction seam
On the reference development environment, the recorded run finished in 185.42
seconds (0:03:05); keep a budget of roughly six minutes or less for this
extra-confidence command.
Release baseline
This page records the current milestone release baseline for the repository.
Treat it as a validated project state, not as an automated release system. The
baseline uses the same local command sequence as the acceptance checklist and
records the observed warning profile, the direct runtime dependency bounds, and
the accepted trade-offs that still shape the package.
Release-ready definition in this repository
The repository is release-ready only when all of the following hold:
The documented local acceptance command set passes.
pytest tests/ -v returns the recorded pass/skip count below.
The same validated run is recorded with the observed warning count.
The warning categories match the accepted ones below.
The accepted trade-offs remain explicit rather than hidden.
The opt-in live Meridian verification command remains local extra confidence, not the default developer loop or
silent CI policy. On the reference development environment, the recorded run
finished in 185.42 seconds (0:03:05); keep a budget of roughly six minutes or
less for ordinary local execution.
Runtime dependency boundary
The current runtime boundary recorded from pyproject.toml is:
requires-python >=3.11
google-meridian==1.5.3
arviz>=0.18.0,<0.20.0
pandas>=2.2.0,<3
pydantic>=2.8.0,<3
PyYAML>=6.0.0,<7
These are the direct runtime dependency bounds for the milestone baseline. This
page does not imply broader environment reproducibility than the repository
currently implements.
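To make the half-open bounds concrete, the sketch below checks a version string against a lower and upper bound. It is a deliberately naive illustration with hypothetical helper names: it only understands plain dotted release numbers, and real tooling would typically rely on a proper version parser instead.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Naive parser for plain dotted release versions such as '2.2.3'."""
    return tuple(int(part) for part in text.split("."))


def within_bounds(installed: str, lower: str, upper: str) -> bool:
    """True if lower <= installed < upper, comparing numeric components."""
    return parse_version(lower) <= parse_version(installed) < parse_version(upper)
```

For example, `within_bounds("0.19.0", "0.18.0", "0.20.0")` holds for the ArviZ range above, while `"0.20.0"` falls outside it because the upper bound is exclusive. The installed version string could be obtained with `importlib.metadata.version("arviz")`.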
Accepted warning profile
The recorded 60 warnings are accepted in the current milestone baseline.
They fall into two pinned categories:
Meridian model / prior warnings
ArviZ model-selection warnings
This baseline does not pretend the repository is warning-free. It records the
current observed warning profile honestly and treats those warning categories as
accepted for the present milestone.
Accepted trade-offs
The current release baseline also depends on several explicit trade-offs.
The package takes a no-fork Meridian approach. We keep Meridian as the
modelling engine and add workflow and compatibility tooling around it rather
than modifying Meridian source.
Bayesian model selection remains intentionally limited to fitted Meridian
models where holdout_id is None. Validation-fit and authored-holdout runs are
not treated as compatible LOO or WAIC candidates.
Lifecycle tooling remains Python-first. The repository does not currently ship
a broader lifecycle CLI.
Version bumping remains a manual edit rather than a fully automated release
pipeline.
Boundary of this record
This page records one validated milestone state. It does not introduce CI as
the source of truth. It does not define publish automation. It does not promise
zero warnings. It does not claim a broader release process than the repository
actually supports today.
Changelog
All notable changes to meridian-tools are documented in this file.
CLI single source of truth — runme.py now delegates directly to
meridian_tools.cli, removing duplicate root-level argument parsing.
Typed runner state — Pipeline orchestration now uses PipelineContext
for shared stage state.
Shared posterior sampling — Runner posterior sampling keyword mapping is
centralized in one helper.
Lifecycle comparison schema — Run comparison rows are generated from
declarative comparison field descriptors.
Meridian compatibility pin — The package pins
google-meridian[schema]==1.5.3, and log-likelihood reconstruction refuses
unvalidated Meridian versions.
Static analysis tooling — Development extras now include mypy, and
Ruff enables additional complexity, simplification, and Ruff-specific rule
families.
Fixed
Optimized Python safety — Validation helpers now use explicit exceptions
instead of assert for runtime invariants.
Shared confidence validation — Response curve and optimisation configs
share one confidence_level validator.
Export coercion documentation — NetCDF attribute coercion now documents
its input-to-output type mapping.
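The reasoning behind the assert-related fix above is that Python strips `assert` statements when run with `-O`, so an invariant guarded only by `assert` silently disappears. A minimal sketch of the explicit-exception style follows; `require_fraction` is a hypothetical helper, not one of the repository's validators.

```python
def require_fraction(value: float, name: str = "value") -> float:
    """Validate 0 < value < 1 with a check that survives `python -O`."""
    # `assert 0 < value < 1` would be removed under -O and let bad input through.
    if not 0 < value < 1:
        raise ValueError(f"{name} must be strictly between 0 and 1, got {value!r}")
    return value
```

Raising `ValueError` also gives callers a specific exception type to catch, which `AssertionError` does not communicate well.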
[0.2.0] — 2026-04-07
Added
Docs site build — Hugo-based website documentation under docs-site/,
generated from the repository Markdown set by
docs-site/build_content.py.
Manifest v3 provenance — Explicit input_data_provenance capture for
stored runs and lifecycle refresh or compare workflows.
Typed failure boundaries — ConfigPreflightError,
ValidationExecutionContractError, and PipelineRunFailure distinguish
wrapper-owned preflight, validation contract misuse, and post-directory
runtime failures.
Bounded live verification — An opt-in Meridian real-fit smoke route
gated behind MERIDIAN_TOOLS_ENABLE_REAL_FIT=1.
Module-path CLI contract — Explicit support and regression coverage for
python -m meridian_tools.cli ....
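The environment gate named in the bounded live verification entry could be checked with a helper along these lines. The variable name comes from the entry itself; `real_fit_enabled` is hypothetical, and treating only the exact value "1" as enabled is an assumption about how strict the real gate is.

```python
import os
from typing import Mapping, Optional

ENV_FLAG = "MERIDIAN_TOOLS_ENABLE_REAL_FIT"  # name taken from the changelog entry


def real_fit_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Hypothetical helper: True only when the flag is set to exactly "1"."""
    env = os.environ if environ is None else environ
    return env.get(ENV_FLAG) == "1"
```

In a pytest module, a helper like this could feed `pytest.mark.skipif(not real_fit_enabled(), reason="real fit disabled")` so the live route stays out of the default suite.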
Changed
Shared launch flow — meridian-tools and the repo-root runme.py
launcher now share one launch flow for config loading, preflight checks,
progress reporting, and terminal success or failure output.
Packaged demo assets — Bundled demo configs and datasets are resolved
from packaged _demo_data, so demo runs work from installed wheels as well
as source checkouts.
Default demo fit mode — Bundled demos now default to full-sample fits
(validation.strategy: none), so loo_summary.json and waic_summary.json
are generated by default and 10_validation is recorded as skipped.
Refresh contract — Stored-run refresh now reloads from the saved
resolved config while preserving the original source config copy in run
metadata.
Lifecycle compare semantics — Compare now distinguishes legacy runs
without dataset provenance from real dataset changes.
Documentation layout — Public documentation is reorganised under docs/
into getting-started, guides, reference, concepts, and project sections.
Fixed
Structured public entrypoint failures — Missing or invalid config paths
in public entrypoints now produce structured failure output instead of raw
Python tracebacks unless --traceback is used.
Relative-path refresh — Refreshing a stored run with relative
data.path input no longer depends on the original source config location
remaining present on disk.
Partial-run failure reporting — Failed runs that already created an
output directory now report the concrete run directory, manifest path, and
failing stage through the CLI and runme.py.
Docs-site theme resolution — Hugo builds resolve the Relearn theme
through a pinned module dependency instead of requiring a local theme
checkout.
[0.1.0] — 2026-04-02
Added
Typed YAML configuration — Pydantic-validated config with extra="forbid"
strictness for all sections: project, data, model_spec, fit,
validation, exports, response_curves, optimisation.
Staged pipeline runner — Sequential execution through 00_run_metadata,
10_validation, 20_model_fit, 30_model_assessment, 40_decomposition,
60_response_curves, 70_optimisation with manifest persistence after each
stage.
Validation orchestration — blocked_tail and rolling_origin time-series
validation strategies with auto-generated holdout masks. Authored holdout
passthrough through model_spec.kwargs.holdout_id.
Diagnostics bundling — diagnostics_bundle.json manifest with optional
predictive_accuracy.csv and review_summary.json exports.
Bayesian model selection — Compatibility-aware LOO and WAIC computation
through ArviZ, with automatic log-likelihood reconstruction for fitted Meridian
models. Graceful degradation for incompatible runs through structured
ModelSelectionError with reason codes.
Response curves export — Configurable spend multiplier grid with NetCDF
and CSV outputs.
Optimisation export — Fixed-budget and relative-budget optimisation with
full artefact set including allocation charts.
Plot exports — PNG plot artefacts through Altair/vl-convert for model fit,
diagnostics, decomposition, response curves, and optimisation stages.
Lifecycle management — load_run_record, list_run_records,
build_refresh_run_config, compare_run_records for post-run analysis and
reproducible refresh workflows.
CLI — meridian-tools run and meridian-tools demo subcommands with
lightweight imports for fast startup.
Bundled demos — timeseries and geo_panel reference workflows with
packaged data and configs.
Manifest versioning — Support for manifest versions 0, 1, and 2 with
backward-compatible deserialisation.
Comprehensive test suite — 218 tests across 15 test files covering
configuration, validation, pipeline execution, exports, diagnostics, model
selection, lifecycle, and demos.
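The staged pipeline runner entry for 0.1.0 describes sequential stages with manifest persistence after each one. A rough sketch of that idea is shown below. The stage names are taken from the entry, but `run_stages`, the handler mapping, and the manifest layout are all hypothetical and much simpler than a real runner would be.

```python
import json
from pathlib import Path
from typing import Callable, Mapping

STAGES = [
    "00_run_metadata", "10_validation", "20_model_fit", "30_model_assessment",
    "40_decomposition", "60_response_curves", "70_optimisation",
]


def run_stages(run_dir: Path, handlers: Mapping[str, Callable[[], None]]) -> dict:
    """Run stages in order, rewriting manifest.json after every stage."""
    manifest = {"stages": {}}  # hypothetical manifest layout
    for stage in STAGES:
        try:
            handlers[stage]()
            manifest["stages"][stage] = "completed"
        except Exception as exc:
            manifest["stages"][stage] = f"failed: {exc}"
            raise
        finally:
            # Persist even on failure so a partial run stays inspectable.
            (run_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest
```

Writing the manifest in the `finally` branch means a failing stage still leaves a record of which stages completed and where the run stopped.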