DSAMbayes Documentation

Documentation for DSAMbayes v1.2.2 — a Bayesian marketing mix modelling toolkit for R, built on Stan.

DSAMbayes provides a unified interface for building, fitting, and interpreting MMM models. It supports single-market regression (BLM), multi-market hierarchical models with partial pooling, and pooled models with structured media coefficients. All model types share the same post-fit interface for posterior extraction, diagnostics, decomposition, and budget optimisation.

Where to start

You want to…                                    Start here
Install and run your first model                Install and Setup → Your First BLM Model
Understand the modelling framework              Concepts → Model Classes
Run a reproducible YAML-driven pipeline         Quickstart → CLI Usage
Interpret run outputs and plots                 Plot Catalogue → Interpret Diagnostics
Configure priors, boundaries, or optimisation   Config Schema
Compare models and select a candidate           Compare Runs

What changed in v1.2.2

Key changes since v1.2.0 (see CHANGELOG.md for full details):

  • KPI back-transform correction — log-response models now default to the conditional-mean estimator exp(mu + sigma²/2) rather than the median exp(mu). Use log_response = "median" to retain the old behaviour. See Response Scale Semantics.
  • Composite hierarchy keys — hierarchical models now support composite grouping keys (e.g. market:brand).
  • Pooled lognormal_ms support — pooled models accept noise_sd ~ lognormal_ms(...) priors.
  • Strict pre-flight validation — runner-driven fits now abort on structural data-quality failures instead of warning silently.
  • Stan cache hardening — stale compiled models are recompiled automatically.

Subsections of DSAMbayes Documentation

Getting Started

Purpose

Onboard a new user from install to first successful DSAMbayes run.

Audience

  • New DSAMbayes users.
  • Analysts running DSAMbayes through R scripts or CLI.

Pages

Page                            Topic
Install and Setup               Prerequisites, installation commands, and verification
Concepts                        What DSAMbayes does and how Bayesian MMM works
Your First BLM Model            Build, fit, and interpret a single-market model using the R API
Your First Hierarchical Model   Multi-market model with partial pooling and CRE
Quickstart (YAML Runner)        Minimal end-to-end CLI run from config to output inspection
FAQ                             Answers to common questions

Subsections of Getting Started

Install and Setup

Audience

Engineers and analysts setting up DSAMbayes for local development or modelling runs.

Prerequisites

  • R >= 4.1 — check with R --version.
  • A C++ toolchain for Stan compilation. This is the most common source of setup issues:
    • macOS: install Xcode Command Line Tools (xcode-select --install).
    • Windows: install Rtools matching your R version. Ensure make is on your PATH.
    • Linux (Ubuntu/Debian): sudo apt install build-essential.
    • See the RStan Getting Started Guide for detailed platform instructions.
  • A local checkout of this repository.

Open a terminal in the repository root and run:

# 1. Create repo-local library and cache directories
mkdir -p .Rlib .cache

# 2. Set environment variables (add to .bashrc/.zshrc for persistence)
export R_LIBS_USER="$PWD/.Rlib"
export XDG_CACHE_HOME="$PWD/.cache"

# 3. Install DSAMbayes from the local checkout
R_LIBS_USER="$PWD/.Rlib" R -q -e 'install.packages(".", repos = NULL, type = "source")'

This keeps all package libraries and Stan compilation caches inside the repo, avoiding permission issues with system library paths.

Verify the installation

1. Confirm DSAMbayes loads

R_LIBS_USER="$PWD/.Rlib" R -q -e 'library(DSAMbayes); cat("Version:", as.character(utils::packageVersion("DSAMbayes")), "\n")'

Expected: prints Version: 1.2.2 (or current version).

2. Confirm the runner works

R_LIBS_USER="$PWD/.Rlib" Rscript scripts/dsambayes.R validate --config config/blm_synthetic_mcmc.yaml

Expected: validation completes without errors.

3. (Optional) Run the test suite

R_LIBS_USER="$PWD/.Rlib" R -q -e 'testthat::test_dir("tests/testthat")'

Expected: all tests pass.

Alternative: install from GitHub

If you do not have a local checkout, install from the GitHub remote:

R -q -e 'remotes::install_github("groupm-global/DSAMbayes")'

This installs the latest version on main. For the development fork with v1.2.2 features, use the local-checkout path above.

Using renv (optional)

The repository includes a renv.lock file for fully reproducible dependency management. To use it:

R -q -e 'install.packages("renv"); renv::restore()'

This installs the exact dependency versions used during development. It is optional but recommended for production runs where reproducibility matters.

Runner setup and first execution

1. Validate the example config

R_LIBS_USER="$PWD/.Rlib" \
  Rscript scripts/dsambayes.R validate --config config/blm_synthetic_mcmc.yaml

2. Execute a full run

R_LIBS_USER="$PWD/.Rlib" \
  Rscript scripts/dsambayes.R run --config config/blm_synthetic_mcmc.yaml

Expected: a timestamped run directory is created under results/ with model outputs and diagnostics.

Troubleshooting

Stan compilation fails

Symptom: errors during the Compiling model... step that reference C++ or compiler issues.

Actions:

  1. Confirm your C++ toolchain is working: R -q -e 'pkgbuild::has_build_tools(debug = TRUE)'.
  2. On Windows, ensure Rtools is installed and make is on your PATH.
  3. Clear the Stan cache and retry: rm -rf .cache/dsambayes.
  4. Follow the RStan Getting Started Guide for your platform.

Package installation fails

Symptom: install.packages(".", repos = NULL, type = "source") errors.

Actions:

  1. Confirm you are in the repository root directory.
  2. Confirm .Rlib exists and is writable: ls -la .Rlib.
  3. Check for missing system dependencies in the error output.

Stale Stan cache

Symptom: unexpected model behaviour after updating the package.

Actions:

  1. Clear the cache: rm -rf .cache/dsambayes.
  2. Re-run with model.force_recompile: true in your config (or leave default false — v1.2.2 auto-detects stale caches).

Permission issues

Symptom: write failures for library, cache, or run outputs.

Actions:

  1. Ensure .Rlib, .cache, and results/ are writable.
  2. Keep R_LIBS_USER and XDG_CACHE_HOME set in your shell session.
  3. Run all commands from the repository root.

Concepts

What is DSAMbayes?

DSAMbayes is an R package that fits Bayesian marketing mix models (MMM) using Stan. It provides a familiar lm()-style interface for specifying models, adds prior and boundary controls, and delegates estimation to Stan’s Hamiltonian Monte Carlo (HMC) sampler. The result is a full posterior distribution over model parameters — not just point estimates — enabling rigorous uncertainty quantification for media contribution and budget allocation decisions.

Why Bayesian MMM?

Classical OLS regression gives point estimates and confidence intervals that assume the model is correctly specified. In MMM, where media variables are collinear, sample sizes are small (often 100–200 weeks), and the functional form is uncertain, these assumptions are routinely violated.

Bayesian estimation addresses this by:

  • Regularisation through priors — weakly informative priors stabilise estimates when the data alone cannot separate correlated effects. This is particularly valuable for media channels with overlapping campaign timing.
  • Hard parameter constraints — boundary constraints (e.g. media coefficients must be non-negative) are enforced directly in the posterior, rather than post-hoc.
  • Full uncertainty propagation — every downstream output (fitted values, decomposition, budget allocation) carries the full posterior uncertainty, not just a point estimate ± standard error.
  • Principled model comparison — leave-one-out cross-validation (LOO-CV) via Pareto-smoothed importance sampling provides predictive model comparison without refitting.
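The collinearity point can be made concrete with a small simulation in plain base R (a toy illustration, not DSAMbayes code): two media series that move in lockstep leave OLS unable to separate their effects.

```r
# Toy sketch (base R): two nearly collinear media channels.
set.seed(1)
n      <- 120                             # ~120 weeks of data
tv     <- rnorm(n)
search <- 0.95 * tv + 0.05 * rnorm(n)     # campaigns run in lockstep
y      <- 1.0 * tv + 1.0 * search + rnorm(n)

ols <- lm(y ~ tv + search)
# Standard errors on the media terms are large relative to the true
# effect size of 1 -- the data alone cannot separate the two channels.
summary(ols)$coefficients[c("tv", "search"), "Std. Error"]
```

A weakly informative prior such as the normal(0, 5) default acts as a soft constraint here, shrinking both coefficients toward plausible values instead of letting them trade off against each other arbitrarily.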

Model classes

DSAMbayes supports three model classes, all sharing the same post-fit interface:

BLM (Bayesian Linear Model)

The simplest class. One market, one response variable, one set of predictors. Equivalent to lm() but with Bayesian estimation.

kpi ~ trend + seasonality + holidays + media_channels

Use when: single-market modelling with sufficient data (typically 100+ weeks).

Hierarchical

Panel data with multiple groups (e.g. markets, regions, brands). Random effects allow each group to deviate from the population average while sharing information across groups (partial pooling).

kpi ~ population_terms + (varying_terms | group)

Use when: multi-market data where you want to borrow strength across markets while allowing market-specific effects.

Pooled

Single-market data where media variables have a nested structure (e.g. campaign > channel > platform). Coefficients are pooled across labelled dimensions.

blm(formula, data) %>% pool(grouping_vars, variable_map)

Use when: single-market data with structured media hierarchies.

The estimation pipeline

Every DSAMbayes model follows the same lifecycle:

  1. Construct: blm() creates an unfitted model object with default priors and boundaries.
  2. Configure: set_prior(), set_boundary(), set_cre() adjust the specification.
  3. Fit: fit() (MCMC) or optimise() (MAP) estimates the posterior.
  4. Extract: get_posterior(), fitted(), decomp() retrieve results.
  5. Decide: optimise_budget() translates estimates into budget allocation recommendations.

Key concepts for practitioners

Priors

A prior distribution encodes what you believe about a parameter before seeing the data. In DSAMbayes:

  • Default priors are normal(0, 5) — weakly informative, centred at zero.
  • Informative priors can be set when domain knowledge justifies it (e.g. price_index ~ normal(-0.2, 0.1) if you know price has a small negative effect).
  • Priors are specified using formula notation: set_prior(model, m_tv ~ normal(0, 10)).

Boundaries

Hard constraints on parameter values. The posterior density is zero outside the boundary:

  • set_boundary(model, m_tv > 0) forces the TV coefficient to be non-negative.
  • Default boundaries are unconstrained (-Inf, Inf).

MCMC vs MAP

  • MCMC (fit()) draws samples from the full posterior distribution. Slower but gives complete uncertainty quantification. Use for final reporting.
  • MAP (optimise()) finds the single most probable parameter vector. Much faster but gives only a point estimate. Use for rapid iteration during model development.

Response scale

Models can operate on the original KPI scale (identity response) or the log scale:

  • Identity: kpi ~ ... — coefficients represent unit changes in KPI.
  • Log: log(kpi) ~ ... — coefficients represent approximate percentage changes.

Log-response models require careful back-transformation to the KPI scale. DSAMbayes handles this automatically via fitted_kpi(). See Response Scale Semantics.
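The back-transform issue can be seen with plain arithmetic (a base-R sketch of the estimators named above, not DSAMbayes internals): for lognormal noise, exp(mu) is the conditional median, while the conditional mean is exp(mu + sigma²/2), the v1.2.2 default.

```r
# Sketch (base R): why back-transforming a log-response model needs care.
mu    <- log(1000)   # linear predictor on the log scale
sigma <- 0.4         # residual SD on the log scale

median_kpi <- exp(mu)                  # 1000
mean_kpi   <- exp(mu + sigma^2 / 2)    # ~1083, about 8.3% higher
```

The gap grows with the residual SD, so using the median estimator on a noisy log-scale model systematically understates total KPI.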

Diagnostics

After fitting, DSAMbayes runs a battery of diagnostic checks:

  • Sampler quality — Rhat, effective sample size, divergences.
  • Residual behaviour — autocorrelation, normality, Ljung-Box test.
  • Identifiability — baseline-media correlation.
  • Boundary monitoring — share of draws hitting constraints.

Each check produces a pass, warn, or fail status. See Diagnostics Gates.

The YAML runner

For reproducible, configuration-driven runs, DSAMbayes provides a CLI runner:

Rscript scripts/dsambayes.R run --config config/my_model.yaml

The runner reads a YAML file specifying the data, formula, priors, boundaries, fit settings, diagnostics policy, and optional budget optimisation. It writes structured artefacts (CSVs, plots, model objects) to a timestamped directory under results/. See CLI Usage and Config Schema.

Further reading

Your First BLM Model

Goal

Build, fit, and interpret a single-market Bayesian linear model (BLM) using the DSAMbayes R API.

Prerequisites

  • DSAMbayes installed locally (see Install and Setup).
  • Familiarity with R and lm()-style formulas.

Dataset

This walkthrough uses the synthetic dataset shipped at data/synthetic_dsam_example_wide_data.csv. It contains weekly observations for a single market with columns for:

  • Response: kpi_os_hfb01_value — weekly KPI (e.g. revenue).
  • Media: m_tv, m_search, m_social, m_display, m_ooh, m_email, m_affiliate.
  • Controls: t_scaled (trend), sin52_1/cos52_1/sin52_2/cos52_2 (Fourier seasonality), price_index, distribution, bm_ikea_trust_12r (brand metric).
  • Holidays: h_black_friday, h_christmas, h_new_year, h_easter, h_summer_sale.

library(DSAMbayes)
df <- read.csv("data/synthetic_dsam_example_wide_data.csv")
str(df)

Step 1: Construct the model

blm() creates an unfitted model object. No fitting happens yet.

model <- blm(
  kpi_os_hfb01_value ~
    t_scaled + sin52_1 + cos52_1 + sin52_2 + cos52_2 +
    h_black_friday + h_christmas + h_new_year + h_easter + h_summer_sale +
    bm_ikea_trust_12r + price_index + distribution +
    m_tv + m_search + m_social + m_display + m_ooh + m_email + m_affiliate,
  data = df
)

Inspect defaults:

peek_prior(model)      # normal(0, 5) for all terms
peek_boundary(model)   # (-Inf, Inf) — unconstrained

Step 2: Set boundaries

Media channels should have non-negative effects. Use inequality notation:

model <- model %>%
  set_boundary(
    m_tv > 0, m_search > 0, m_social > 0,
    m_display > 0, m_ooh > 0, m_email > 0, m_affiliate > 0
  )

Step 3: (Optional) Override priors

Default priors are weakly informative. Override only with domain knowledge:

model <- model %>%
  set_prior(price_index ~ normal(-0.2, 0.1))

See Minimal-Prior Policy for guidance.

Step 4: Fit with MCMC

fitted_model <- model %>%
  fit(cores = 2, iter = 2000, warmup = 1000, seed = 123)

First-time Stan compilation takes 1–3 minutes. Subsequent runs use a cached binary. With 2 chains on synthetic data, sampling typically completes in under 2 minutes.

Step 5: Sampler diagnostics

chain_diagnostics(fitted_model)

Metric           Good     Concern
Max Rhat         < 1.01   > 1.05 means chains have not converged
Min ESS (bulk)   > 400    < 200 means too few effective samples
Divergences      0        Any non-zero count warrants investigation

Step 6: Extract the posterior

post <- get_posterior(fitted_model)

post is a tibble with one row per draw containing coef (named coefficient list), yhat (fitted values), noise_sd, r2, rmse, and smape.

Summarise coefficients:

library(dplyr); library(tidyr)

coef_summary <- post %>%
  select(coef) %>%
  unnest_wider(coef) %>%
  pivot_longer(everything(), names_to = "term") %>%
  group_by(term) %>%
  summarise(
    mean = mean(value), median = median(value), sd = sd(value),
    ci_low = quantile(value, 0.025), ci_high = quantile(value, 0.975),
    .groups = "drop"
  )
print(coef_summary, n = 30)

What to look for:

  • Media coefficients should be positive (boundaries enforce this).
  • Wide credible intervals mean the prior dominates — the data cannot identify the effect precisely.

Step 7: Assess model fit

fit_tbl <- fitted(fitted_model)
cat("Median R²:", median(r2(fitted_model)), "\n")
cat("Median RMSE:", median(rmse(fitted_model)), "\n")

For a well-specified MMM on weekly data, in-sample R² above 0.85 is typical.

Step 8: Response decomposition

decomp_tbl <- decomp(fitted_model)
head(decomp_tbl)

Shows each term’s contribution (coefficient × design-matrix column) to the predicted KPI at each time point — the foundation for media contribution and ROI reporting.
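The arithmetic is easy to sketch in base R (illustrative only; decomp() handles scaling and grouping internally):

```r
# Sketch: term contributions for one posterior draw.
X    <- cbind(intercept = 1, trend = 1:4, m_tv = c(0, 2, 5, 1))
beta <- c(intercept = 10, trend = 0.5, m_tv = 2)

contrib <- sweep(X, 2, beta, `*`)   # coefficient x design-matrix column
contrib
#      intercept trend m_tv
# [1,]        10   0.5    0
# [2,]        10   1.0    4
# [3,]        10   1.5   10
# [4,]        10   2.0    2

# Rows sum to the predicted KPI at each time point:
rowSums(contrib)                    # 10.5 15.0 21.5 14.0
```

Summing a term's column over time gives its total contribution, which is the quantity ROI reporting divides by spend.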

Step 9: MAP for rapid iteration

During development, use MAP for fast point estimates:

map_model <- model %>% optimise(n_runs = 10)
get_posterior(map_model)

Use MCMC for final reporting; MAP for formula iteration.

Common pitfalls

Pitfall                        Symptom                          Fix
Forgetting to set boundaries   Media coefficients go negative   Add set_boundary(m_x > 0)
Too few iterations             High Rhat, low ESS               Increase iter and warmup
Missing controls               High residual autocorrelation    Add trend, seasonality, or holiday terms
Scaling confusion              Coefficients look wrong          model.scale: true is default; get_posterior() back-transforms automatically

Next steps

Your First Hierarchical Model

Goal

Build, fit, and interpret a multi-market hierarchical model with partial pooling and optional CRE (Mundlak) correction using the DSAMbayes R API.

Prerequisites

  • DSAMbayes installed locally (see Install and Setup).
  • Familiarity with Your First BLM Model.

Dataset

This walkthrough uses data/synthetic_dsam_example_hierarchical_data.csv — a panel dataset with weekly observations across multiple markets. Key columns:

  • Response: kpi_value — weekly KPI per market.
  • Group: market — market identifier.
  • Media: m_tv, m_search, m_social — media exposure variables.
  • Controls: trend, seasonality, brand_metric.
  • Date: date — weekly date index.

library(DSAMbayes)
panel_df <- read.csv("data/synthetic_dsam_example_hierarchical_data.csv")
table(panel_df$market)  # Check group counts

Step 1: Construct the hierarchical model

The (term | group) syntax tells DSAMbayes to fit random effects. Terms inside the parentheses get group-specific deviations from the population mean:

model <- blm(
  kpi_value ~
    trend + seasonality + brand_metric +
    m_tv + m_search + m_social +
    (1 + m_tv + m_search + m_social | market),
  data = panel_df
)

This specifies:

  • Population (fixed) effects for all terms — the average effect across markets.
  • Random intercepts and slopes for media terms by market — each market can deviate from the population average.

Step 2: Set boundaries

model <- model %>%
  set_boundary(m_tv > 0, m_search > 0, m_social > 0)

Boundaries apply to the population-level coefficients.

Step 3: (Optional) Add CRE / Mundlak correction

If you suspect that group-level spending patterns are correlated with unobserved market characteristics (e.g. high-spend markets also have higher baseline demand), CRE controls for this:

model <- model %>%
  set_cre(vars = c("m_tv", "m_search", "m_social"))

This adds cre_mean_m_tv, cre_mean_m_search, cre_mean_m_social as fixed effects — the group-level means of each media variable. The within-group coefficients then represent purely temporal variation, controlling for between-group confounding.

See CRE / Mundlak for when and why to use this.
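The construction can be pictured with base R (illustrative; DSAMbayes builds these columns itself): each cre_mean_* term is the per-group average of a media variable, repeated for every row of that group.

```r
# Sketch: the group-level means behind the Mundlak correction.
panel <- data.frame(
  market = c("DE", "DE", "SE", "SE"),
  m_tv   = c(10, 20, 1, 3)
)

# One constant value per group (ave() defaults to the group mean):
panel$cre_mean_m_tv <- ave(panel$m_tv, panel$market)
panel$cre_mean_m_tv                  # 15 15 2 2

# The within-group signal the random slope then explains:
panel$m_tv - panel$cre_mean_m_tv     # -5 5 -1 1
```

Because the mean column absorbs the cross-sectional level differences, the remaining within-group coefficient is identified from temporal variation only.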

Step 4: Fit with MCMC

fitted_model <- model %>%
  fit(cores = 4, iter = 2000, warmup = 1000, seed = 42)

Hierarchical models are slower than BLM — expect 10–30 minutes depending on group count and data size. First-time Stan compilation of the hierarchical template adds 2–3 minutes.

Step 5: Check diagnostics

chain_diagnostics(fitted_model)

Pay special attention to Rhat and ESS for sd_* parameters (group-level standard deviations), which are often harder to estimate than population coefficients.

Step 6: Extract the posterior

post <- get_posterior(fitted_model)

For hierarchical models, coefficient draws from get_posterior() are vectors (one value per group) rather than scalars. The population-level (fixed-effect) estimates are averaged across groups.

Step 7: Group-level results

Fitted values and decomposition are returned per group:

# Fitted values — one row per observation, grouped by market
fit_tbl <- fitted(fitted_model)
head(fit_tbl)

# Decomposition — per-group predictor contributions
decomp_tbl <- decomp(fitted_model)

Step 8: Budget optimisation (population level)

Budget optimisation uses population-level (fixed-effect) beta draws, not group-specific totals:

# See Budget Optimisation docs for full scenario specification
result <- optimise_budget(fitted_model, scenario = my_scenario)

Key differences from BLM

Aspect                         BLM                             Hierarchical
Data structure                 Single market                   Panel (multiple groups)
Coefficient draws              Scalars                         Vectors (one per group)
Fit time                       2–5 min                         10–30 min
Decomposition                  Direct                          Per-group; may fail gracefully in some configurations
Forest/prior-posterior plots   Direct                          Group-averaged population estimates
Stan template                  bayes_lm_updater_revised.stan   general_hierarchical.stan (templated per group count)

Common pitfalls

Pitfall                      Symptom                                            Fix
Too few groups               Weak partial pooling; group SDs poorly estimated   Need 4+ groups for meaningful hierarchical structure
Too few obs per group        High Rhat on sd_* parameters                       Increase iterations; simplify random-effect structure
CRE with too many vars       More CRE variables than groups                     Reduce CRE variable set; see identification warnings
CRE mean has zero variance   scale=TRUE aborts with constant column error       Use model.type: re (without CRE) or model.scale: false

Next steps

Quickstart (YAML Runner)

Goal

Complete one reproducible DSAMbayes runner execution from validation to artefact inspection, then load the fitted model in R to explore the results interactively.

Before you start

Complete the setup in Install and Setup. If you want to build a model interactively from R code instead of YAML, see Your First BLM Model.

1. Set up the environment

Open a terminal in the repository root:

mkdir -p .Rlib .cache
export R_LIBS_USER="$PWD/.Rlib"
export XDG_CACHE_HOME="$PWD/.cache"

2. Validate the configuration (dry run)

R_LIBS_USER="$PWD/.Rlib" \
  Rscript scripts/dsambayes.R validate --config config/blm_synthetic_mcmc.yaml

Expected: exits with code 0. No Stan compilation or sampling occurs.

3. Execute the full run

R_LIBS_USER="$PWD/.Rlib" \
  Rscript scripts/dsambayes.R run --config config/blm_synthetic_mcmc.yaml

Expected: a timestamped run directory under results/ with staged outputs.

4. Locate and inspect the run directory

latest_run="$(ls -td results/* | head -n 1)"
echo "$latest_run"
find "$latest_run" -maxdepth 2 -type d | sort

Expected stage folders:

Folder               Content
00_run_metadata/     Resolved config, session info
10_pre_run/          VIF report, data dictionary, media spend plots
20_model_fit/        Fitted model object, fit plots
30_post_run/         Posterior summary, fitted/observed CSVs
40_diagnostics/      Diagnostics report, residual plots
50_model_selection/  LOO summary, Pareto-k diagnostics
60_optimisation/     Budget allocation, response curves (when enabled)

5. Verify key artefacts

test -f "$latest_run/00_run_metadata/config.resolved.yaml" && echo "ok: config.resolved.yaml"
test -f "$latest_run/20_model_fit/model.rds" && echo "ok: model.rds"
test -f "$latest_run/30_post_run/posterior_summary.csv" && echo "ok: posterior_summary.csv"
test -f "$latest_run/40_diagnostics/diagnostics_report.csv" && echo "ok: diagnostics_report.csv"

6. Load the model in R

The fitted model is saved as an RDS object. Load it interactively to explore:

library(DSAMbayes)

model <- readRDS("results/<run_dir>/20_model_fit/model.rds")

# Posterior coefficient summary
post <- get_posterior(model)

# Fit quality
cat("Median R²:", median(r2(model)), "\n")

# Sampler diagnostics
chain_diagnostics(model)

# Fitted values
head(fitted(model))

7. Review diagnostics

Open 40_diagnostics/diagnostics_report.csv — each row is one diagnostic check:

cat "$latest_run/40_diagnostics/diagnostics_summary.txt"

  • pass — no action needed.
  • warn — review recommended; see Interpret Diagnostics.
  • fail — remediate before using results for decisions.

8. Bootstrap a new config

Generate a template for your own data:

R_LIBS_USER="$PWD/.Rlib" \
  Rscript scripts/dsambayes.R init --template blm --out config/my_model.yaml

Edit config/my_model.yaml to point to your data and formula, then validate and run.

If the quickstart fails

  • Re-run validate before run to catch config errors early.
  • Check the error message — DSAMbayes uses cli::cli_abort() with descriptive context.
  • Inspect 00_run_metadata/config.resolved.yaml to see what defaults were applied.
  • See Debug Run Failures for common failure modes.

Next steps

FAQ

Installation and setup

How long does the first Stan compilation take?

1–3 minutes on most machines. Subsequent runs use a cached binary and start sampling immediately. If compilation seems stuck, check your C++ toolchain — see Install and Setup.

Do I need to set R_LIBS_USER every time?

Yes, unless you add it to your shell profile (.bashrc, .zshrc, or equivalent). The repo-local .Rlib path keeps DSAMbayes and its dependencies isolated from your system R library.

Can I use renv instead of .Rlib?

Yes. The repository includes a renv.lock file. Run renv::restore() to install exact dependency versions. See the renv section in Install and Setup.

Modelling

How many weeks of data do I need?

There is no hard minimum, but as a rough guide:

  • BLM: 100+ weeks for a model with 10–15 predictors. Below 80 weeks, most media effects will be prior-driven.
  • Hierarchical: 80+ weeks per group, ideally with 4+ groups for meaningful partial pooling.

More data is always better. Short series with many predictors will lean heavily on priors.

Should I use identity or log response?

  • Identity (kpi ~ ...) when the KPI is naturally additive and variance is roughly constant across levels. Coefficients represent absolute unit changes.
  • Log (log(kpi) ~ ...) when the KPI is strictly positive, variance scales with level, or you want multiplicative (percentage) effects. Common for revenue and sales.

If unsure, fit both and compare diagnostics.

How many MCMC iterations do I need?

The defaults (iter = 2000, warmup = 1000, chains = 4) are a reasonable starting point. Check diagnostics after fitting:

  • Rhat < 1.01 and ESS > 400 → iterations are sufficient.
  • Rhat > 1.05 or ESS < 200 → increase iter and warmup (e.g. double both).
  • Divergences > 0 → increase adapt_delta (e.g. 0.95 → 0.99) before increasing iterations.

For rapid iteration during development, use MAP estimation (optimise()) instead of MCMC.

When should I set boundaries on media coefficients?

Set m_channel > 0 when you are confident that additional media exposure cannot decrease the KPI. This is the most common boundary specification in MMM. Do not set boundaries on control variables (trend, seasonality, price) unless you have a clear structural reason — see the Minimal-Prior Policy.

When should I use CRE (Mundlak)?

Use CRE when fitting a hierarchical model where:

  1. Time-varying regressors (e.g. media spend) have group-level means correlated with unobserved group heterogeneity.
  2. You want to separate within-group (temporal) effects from between-group (cross-sectional) effects.

CRE adds group-mean terms to the population formula. See CRE / Mundlak.

What does scale = TRUE do?

It standardises the response and predictors (centre and divide by SD) before passing them to Stan. This improves sampler efficiency by putting all coefficients on a comparable scale. Post-fit, get_posterior() back-transforms coefficients to the original data scale automatically. Leave it on (the default) unless you have a specific reason to disable it.
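The back-transform itself is simple arithmetic. A base-R sketch (not DSAMbayes internals) for a single predictor: the standardised-scale slope times sd(y)/sd(x) recovers the original-scale slope exactly.

```r
# Sketch: fit on the z-scale, then recover the original-scale slope.
# For a linear model, beta_orig = beta_std * sd(y) / sd(x).
set.seed(42)
x <- rnorm(100, mean = 50, sd = 10)
y <- 3 * x + rnorm(100)

z_x <- (x - mean(x)) / sd(x)
z_y <- (y - mean(y)) / sd(y)

beta_std  <- coef(lm(z_y ~ z_x))["z_x"]
beta_orig <- beta_std * sd(y) / sd(x)
beta_orig   # ~3, matching coef(lm(y ~ x))["x"]
```

The same identity is what lets the sampler work with well-scaled parameters while reported coefficients stay interpretable in data units.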

Runner and outputs

How long does a typical run take?

Model type     Fit method                   Data size              Typical time
BLM            MCMC (2 chains, 2000 iter)   150 weeks              2–5 minutes
BLM            MAP (10 starts)              150 weeks              10–30 seconds
Hierarchical   MCMC (4 chains, 2000 iter)   150 weeks × 5 groups   10–30 minutes
Pooled         MCMC (4 chains, 2000 iter)   150 weeks              5–15 minutes

First-time Stan compilation adds 1–3 minutes.

What is the difference between validate and run?

  • validate checks config structure, data paths, formula validity, and cross-field constraints — without compiling or fitting Stan models. Use it as a pre-run gate.
  • run does everything validate does, then compiles, fits, runs diagnostics, and writes artefacts.

Always validate before run when you change config or data.

Where do outputs go?

By default, under results/<timestamp>_<model_name>/ with numbered stage folders. See Output Artefacts for the full contract.

How do I compare two model runs?

Use compare_runs() in R or compare 50_model_selection/loo_summary.csv files manually. See Compare Runs.

Diagnostics

What does “Pareto-k > 0.7” mean?

It means the PSIS-LOO approximation is unreliable for that observation — the observation is highly influential. A few amber (0.5–0.7) points are normal. Red (> 0.7) points warrant investigation. See Model Selection Plots.

My diagnostics say “warn” — should I worry?

It depends on the policy mode:

  • explore — warnings are expected during development. Continue iterating.
  • publish — review warnings before sharing results. Most warns are acceptable if you understand the cause.
  • strict — warnings require documented justification.

See Diagnostics Gates for threshold details.

Budget optimisation

How does the allocator work?

It generates feasible spend allocations within channel bounds, evaluates each against the posterior, and selects the allocation that maximises the chosen objective (KPI uplift or profit). It is a Monte Carlo search, not an analytical optimiser. See Budget Optimisation.
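The search idea can be sketched in a few lines of base R (a toy stand-in with a made-up response function, not the DSAMbayes allocator): draw candidate allocations within bounds, score each, keep the best.

```r
# Toy sketch: Monte Carlo budget search over two channels, fixed total.
set.seed(7)
total  <- 100
uplift <- function(a) 5 * sqrt(a[1]) + 3 * sqrt(a[2])  # toy response curves

best <- NULL
for (i in 1:5000) {
  tv        <- runif(1, min = 10, max = 90)   # channel bounds
  candidate <- c(tv = tv, search = total - tv)
  if (is.null(best) || uplift(candidate) > uplift(best)) best <- candidate
}
round(best)   # spend concentrates where marginal return is higher
```

In the real allocator each candidate is scored against posterior draws rather than a single curve, which is how uncertainty intervals on the recommended allocation arise.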

Can I use budget optimisation with MAP-fitted models?

Yes, but the results will be based on a single point estimate rather than the full posterior distribution. Uncertainty intervals will not be meaningful.

Runner

Purpose

Document CLI and YAML runner contracts for reproducible DSAMbayes runs.

Audience

  • Users operating DSAMbayes through scripts/dsambayes.R.
  • Engineers maintaining runner config and artefact contracts.

Pages

Page               Topic
CLI Usage          Commands, flags, exit codes, and error modes
Config Schema      YAML keys, defaults, and validation rules
Output Artefacts   Staged folder layout, file semantics, and precedence rules

Subsections of Runner

CLI Usage

Purpose

Define the supported command-line interface for scripts/dsambayes.R, including required flags, optional flags, and execution semantics.

Prerequisites

Before using the CLI:

  • Complete Install and Setup.
  • Run commands from repository root.
  • Ensure DSAMbayes is installed and available in R_LIBS_USER.

Entry point

Rscript scripts/dsambayes.R <command> [flags]

The script supports these commands:

  • init
  • validate
  • run
  • help (or -h / --help)

Command summary

Command    Required flags   Optional flags            Behaviour
init       --out            --template, --overwrite   Writes a config template file.
validate   --config         --run-dir                 Runs config and data checks only (dry_run = TRUE).
run        --config         --run-dir                 Executes the full pipeline (dry_run = FALSE) and writes run artefacts.
help       none             none                      Prints usage text and exits.

Flag reference

init

  • --out <path> (required): output path for the generated YAML file.
  • --template <name> (optional): template name. Default is blm.
    • Supported values in script: master, blm, re, cre, pooled, hierarchical.
    • hierarchical maps to the same template file as re.
  • --overwrite (optional flag): allow overwrite of an existing --out file.

validate

  • --config <path> (required): YAML config path.
  • --run-dir <path> (optional): explicit run directory path.

run

  • --config <path> (required): YAML config path.
  • --run-dir <path> (optional): explicit run directory path.

Usage examples

Show help

Rscript scripts/dsambayes.R --help

Expected outcome: usage panel is printed with command syntax and notes.

Create a new config from template

Rscript scripts/dsambayes.R init --template blm --out config/local_quickstart.yaml

Expected outcome: config/local_quickstart.yaml is created.

Validate only (dry-run behaviour)

R_LIBS_USER="$PWD/.Rlib" \
  Rscript scripts/dsambayes.R validate --config config/blm_synthetic_mcmc.yaml

Expected outcome: validation completes without fitting Stan models.

Validate with explicit run directory

R_LIBS_USER="$PWD/.Rlib" \
  Rscript scripts/dsambayes.R validate \
    --config config/blm_synthetic_mcmc.yaml \
    --run-dir results/quickstart_validate

Expected outcome: validation uses the provided run directory path when writing run metadata.

Execute full run

R_LIBS_USER="$PWD/.Rlib" \
  Rscript scripts/dsambayes.R run --config config/blm_synthetic_mcmc.yaml

Expected outcome: full modelling pipeline executes and artefacts are written under results/.

Execute full run with explicit run directory

R_LIBS_USER="$PWD/.Rlib" \
  Rscript scripts/dsambayes.R run \
    --config config/blm_synthetic_mcmc.yaml \
    --run-dir results/quickstart_run

Expected outcome: artefacts are written to results/quickstart_run (subject to overwrite rules in config).

Exit and error behaviour

  • Success exits with status 0.
  • CLI argument or runtime errors exit with status 2.
  • Typical hard failures include:
    • DSAMbayes not installed.
    • Missing required flags (--out or --config).
    • Unknown command.
    • Unknown argument format.

Operational notes

  • validate is the recommended pre-run gate. Use it before run whenever you change config or data.
  • run prints a run summary and suggested next-step artefacts at completion.
  • The CLI itself does not define model semantics. It delegates execution to DSAMbayes::run_from_yaml().

Config Schema

Purpose

This page defines the YAML contract used by:

  • scripts/dsambayes.R
  • DSAMbayes::run_from_yaml()

It documents defaults, allowed values, and cross-field constraints enforced by the runner.

Schema processing order

The runner processes config in this order:

  1. Parse YAML.
  2. Coerce YAML infinity tokens (.Inf, -.Inf).
  3. Apply defaults (resolve_config_defaults()).
  4. Resolve relative paths against the config file directory (resolve_config_paths()).
  5. Validate values and cross-field constraints (validate_config()).
  6. Validate formula safety unless security.allow_unsafe_formula: true.

Root sections

Key Default Notes
schema_version 1 Must be 1.
data object Input data path, format, date handling, dictionary metadata.
model object Formula, model type, scaling, compile behaviour.
cre object Correlated random effects settings (Mundlak).
pooling object Pooled model map and grouping settings.
transforms object Transform mode and sensitivity scenarios.
priors object Default priors plus sparse overrides.
boundaries object Parameter boundary overrides.
time_components object Managed holiday feature generation.
fit object MCMC or optimisation fit arguments.
allocation object Post-fit budget optimisation settings.
outputs object Run directory and artefact save flags.
forecast object Reserved forecast stage toggle.
diagnostics object Policy mode, identifiability, model selection, time-series selection.
security object Formula safety bypass flag.

Minimal valid config

schema_version: 1

data:
  path: data/your_data.csv
  format: csv

model:
  formula: y ~ x1 + x2

Expected outcome: this resolves with defaults for all omitted sections and passes schema validation, provided the data file exists and all formula variables are present in the data.

Section reference

schema_version

Key Type Default Rules
schema_version integer 1 Only 1 is supported.

data

Key Type Default Rules
data.path string none Required. File must exist. Relative path resolves from config directory.
data.format string inferred from file extension Must be csv, rds, or long.
data.date_var string or null null Required for data.format: long. Required when holidays are enabled. Required at runtime for time-series selection.
data.date_format string or null null Optional date parser format for date columns.
data.na_action string omit Must be omit or error during formula/data checks.
data.long_id_col string or null null Required when data.format: long.
data.long_variable_col string or null null Required when data.format: long.
data.long_value_col string or null null Required when data.format: long.
data.dictionary_path string or null null Optional CSV. Must exist if provided. Relative path resolves from config directory.
data.dictionary mapping {} Optional inline metadata keyed by term name. Allowed fields: unit, cadence, source, transform, rationale.

Long-format-specific rules:

  • data.long_id_col, data.long_variable_col, data.long_value_col, and data.date_var must all be set.
  • These four column names must be distinct.
  • Long data is reshaped wide before modelling and duplicate key rows are rejected.
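The long-format rules above can be illustrated with a minimal config sketch (the file path and column names are hypothetical):

```yaml
# Hypothetical long-format input: one row per (market, week, metric).
data:
  path: data/media_long.csv    # hypothetical path
  format: long
  date_var: week               # required for long format
  long_id_col: market          # panel identifier column
  long_variable_col: metric    # column holding variable names
  long_value_col: value        # column holding numeric values
```

All four column names are distinct, as required; the runner reshapes this to wide form before modelling and rejects duplicate (id, date, variable) keys.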

model

Key Type Default Rules
model.name string config filename stem Used in run folder slug.
model.formula string none Required. Must parse to y ~ ....
model.type string auto auto, blm, re, cre, pooled.
model.kpi_type string revenue revenue or subscriptions.
model.scale boolean true Controls internal scaling before fit.
model.force_recompile boolean false Forces Stan recompile when true.

Model type resolution rules:

  • auto resolves to pooled if pooling.enabled: true.
  • auto resolves to cre if cre.enabled: true.
  • auto resolves to re if formula contains bar terms (for example (1 | group)).
  • auto resolves to blm otherwise.
  • re or cre requires bar terms in formula.
  • blm or pooled cannot be used with bar terms.
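As a sketch of the resolution rules (formula terms are hypothetical), the following config resolves model.type: auto to re because the formula contains a bar term:

```yaml
model:
  type: auto
  formula: kpi ~ m_tv + m_search + (1 + m_tv | market)
```

Setting type: blm with this formula would fail validation, since blm cannot be used with bar terms.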

cre

Key Type Default Rules
cre.enabled boolean false (or true when model.type: cre) Cannot be true for model.type: blm or pooled.
cre.vars list of strings [] Required and non-empty when cre.enabled: true.
cre.group string or null null Grouping column used in CRE construction.
cre.prefix string cre_mean_ Prefix for generated CRE mean terms.

pooling

Key Type Default Rules
pooling.enabled boolean false (or true when model.type: pooled) Cannot be true for model.type: re or cre.
pooling.grouping_vars list of strings [] Required and non-empty when pooling is enabled.
pooling.map_path string or null null Required when pooling is enabled. File must exist. Relative path resolves from config directory.
pooling.map_format string or null inferred from map_path extension Must be csv or rds.
pooling.min_waves integer or null null If set, must be positive integer.

Pooling map requirements at model-build time:

  • Must include a variable column.
  • Must include every column named in pooling.grouping_vars.
  • variable values must be unique.
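A minimal sketch of a pooling configuration and its map, assuming hypothetical term and file names:

```yaml
pooling:
  enabled: true
  grouping_vars: [channel_group]    # must appear as a column in the map
  map_path: data/pooling_map.csv    # hypothetical path

# Illustrative contents of data/pooling_map.csv:
#   variable,channel_group
#   m_tv,offline
#   m_radio,offline
#   m_search,online
```

Every value in the variable column must be unique, and every column named in grouping_vars must exist in the map.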

transforms

Key Type Default Rules
transforms.mode string fixed_formula Currently only fixed_formula is supported.
transforms.sensitivity.enabled boolean false If true, requires fit.method: optimise.
transforms.sensitivity.scenarios list [] Required and non-empty when sensitivity is enabled.

Each sensitivity scenario requires:

  • name (unique, non-empty, not base)
  • formula (safe formula string unless unsafe mode is enabled)
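A sketch of a sensitivity block satisfying the scenario contract (scenario names and formulas are hypothetical):

```yaml
transforms:
  mode: fixed_formula
  sensitivity:
    enabled: true          # requires fit.method: optimise
    scenarios:
      - name: log_tv       # unique, non-empty, not "base"
        formula: kpi ~ log(m_tv) + m_search
      - name: sqrt_tv
        formula: kpi ~ sqrt(m_tv) + m_search
```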

priors

Key Type Default Rules
priors.use_defaults boolean true Must be true in current runner version.
priors.overrides list [] Optional sparse overrides.

Prior override row contract:

  • parameter (string, must exist in model prior table)
  • family (normal or lognormal_ms, default normal)
  • mean (numeric)
  • sd (numeric, > 0)

lognormal_ms extra constraints:

  • mean > 0
  • allowed only for noise_sd and parameters matching sd_<index>[<term>]
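A hedged sketch of sparse prior overrides (parameter names and values are hypothetical and must exist in the model's prior table):

```yaml
priors:
  use_defaults: true
  overrides:
    - parameter: m_tv        # hypothetical coefficient name
      family: normal
      mean: 0.05
      sd: 0.02
    - parameter: noise_sd    # lognormal_ms is permitted here
      family: lognormal_ms
      mean: 0.5              # must be > 0 for lognormal_ms
      sd: 0.25
```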

boundaries

Key Type Default Rules
boundaries.overrides list [] Optional parameter boundary overrides.

Boundary override row contract:

  • parameter (string, must exist in model boundary table)
  • lower (numeric, default -Inf)
  • upper (numeric, default Inf)
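For example, a boundary override constraining a media coefficient to be non-negative might look like this (the parameter name is hypothetical):

```yaml
boundaries:
  overrides:
    - parameter: m_tv
      lower: 0
      upper: .Inf   # YAML infinity token, coerced by the runner
```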

time_components

Key Type Default Rules
time_components.enabled boolean false Master toggle.
time_components.holidays.enabled boolean false Enables holiday feature generation.
time_components.holidays.calendar_path string or null null Required when holidays are enabled. CSV or RDS. Relative path resolves from config directory.
time_components.holidays.date_col string or null null Optional calendar date column override.
time_components.holidays.label_col string holiday Holiday label column.
time_components.holidays.date_format string or null null Optional parser format for character dates.
time_components.holidays.week_start string monday One of monday or sunday.
time_components.holidays.timezone string UTC Timezone used in week alignment.
time_components.holidays.prefix string holiday_ Prefix for generated terms.
time_components.holidays.window_before integer 0 Must be non-negative.
time_components.holidays.window_after integer 0 Must be non-negative.
time_components.holidays.aggregation_rule string count count or any.
time_components.holidays.overlap_policy string count_all count_all or dedupe_label_date.
time_components.holidays.add_to_formula boolean true Auto-add generated terms to formula.
time_components.holidays.overwrite_existing boolean false If false, generated name collisions abort.
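A sketch of a managed-holiday configuration, assuming a hypothetical calendar file:

```yaml
time_components:
  enabled: true
  holidays:
    enabled: true
    calendar_path: data/holidays.csv   # hypothetical path
    label_col: holiday
    week_start: monday
    window_before: 1       # also flag the week before each holiday
    window_after: 0
    aggregation_rule: count
    add_to_formula: true   # generated holiday_* terms join the formula
```

Note that data.date_var must be set whenever holidays are enabled.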

fit

Key Type Default Rules
fit.method string mcmc mcmc or optimise.
fit.seed numeric or null null Optional scalar seed.
fit.optimise.n_runs integer 10 Multi-start retries for fit_map().
fit.mcmc.chains integer 4 MCMC chains.
fit.mcmc.iter integer 2000 Total iterations per chain.
fit.mcmc.warmup integer 1000 Warmup iterations per chain.
fit.mcmc.cores integer 1 Parallel chains.
fit.mcmc.refresh integer 0 Stan progress refresh interval.
fit.mcmc.parameterization.positive_priors string centered centered or noncentered.

Allowed keys under fit.optimise:

  • n_runs, iter, seed, init, algorithm, hessian, as_vector

Allowed keys under fit.mcmc:

  • chains, iter, warmup, thin, cores, refresh, seed, init, control, parameterization

Allowed keys under fit.mcmc.parameterization:

  • positive_priors
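Putting the fit keys together, a typical MCMC block (values are illustrative) looks like:

```yaml
fit:
  method: mcmc
  seed: 123
  mcmc:
    chains: 4
    iter: 2000       # total iterations per chain
    warmup: 1000     # warmup iterations per chain
    cores: 4         # run chains in parallel
    parameterization:
      positive_priors: noncentered
```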

allocation

Key Type Default Rules
allocation.enabled boolean false Enables budget optimisation stage.
allocation.scenario string max_response max_response or target_efficiency.
allocation.target_value numeric or null null Required and > 0 for target_efficiency.
allocation.n_candidates integer 2000 Must be integer >= 10.
allocation.seed numeric or null null Optional allocator seed.
allocation.budget.total numeric or null null Required and > 0 for max_response. Optional > 0 for target_efficiency.
allocation.channels list [] Required and non-empty when allocation is enabled.
allocation.reference_spend numeric/list or null null Optional baseline spend vector.
allocation.currency_scale numeric or null null Optional positive scaling factor.
allocation.posterior.draws integer 500 Must be positive integer when provided.
allocation.objective.target string kpi_uplift kpi_uplift or profit.
allocation.objective.value_per_kpi numeric or null null Required at optimisation runtime for objective.target: profit.
allocation.objective.kpi_baseline numeric or null null Must be > 0 when provided.
allocation.objective.allow_relative_log_uplift boolean false Allows relative uplift output for log-response runs without baseline.
allocation.objective.risk.type string mean mean, mean_minus_sd, or quantile.
allocation.objective.risk.lambda numeric 0 Must be >= 0 when risk.type: mean_minus_sd.
allocation.objective.risk.quantile numeric 0.1 Must be in (0, 1).

Channel row contract (allocation.channels[]):

  • term required, scalar string, unique across channels.
  • name optional, defaults to term, must be unique.
  • spend_col optional, defaults to name.
  • bounds.min optional, defaults 0, must be finite and >= 0.
  • bounds.max optional, defaults Inf, but operationally must be finite and >= bounds.min.
  • currency_col optional.
  • response optional mapping:
    • type: identity (default)
    • type: atan requires positive scale
    • type: log1p requires positive scale
    • type: hill requires positive k and positive n

Log-response runtime rules:

  • scenario: target_efficiency requires allocation.objective.kpi_baseline.
  • Other scenarios require kpi_baseline unless allow_relative_log_uplift: true.
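A sketch of an allocation block satisfying the channel row contract (terms, names, and numbers are hypothetical):

```yaml
allocation:
  enabled: true
  scenario: max_response
  budget:
    total: 1000000            # required and > 0 for max_response
  channels:
    - term: m_tv              # must match a model term
      name: TV
      bounds: {min: 0, max: 600000}   # max must be finite operationally
      response: {type: hill, k: 250000, n: 1.2}
    - term: m_search
      name: Search
      bounds: {min: 0, max: 500000}
      response: {type: log1p, scale: 0.00001}
```

Both response transforms here satisfy their constraints: hill has positive k and n, and log1p has a positive scale.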

outputs

Path keys:

Key Type Default Rules
outputs.root_dir string results Relative path resolves from config directory.
outputs.run_dir string or null null If relative, resolves under outputs.root_dir.
outputs.overwrite boolean false Existing run dir can be reused only when true and contents are recognised runner artefacts.
outputs.layout string staged staged or flat.
outputs.decomp_top_n integer 8 Must be positive integer.

Save toggles (all booleans):

Key Default
outputs.save_model_rds true
outputs.save_posterior_rds false
outputs.save_posterior_summary_csv true
outputs.save_fitted_csv true
outputs.save_observed_csv true
outputs.save_chain_diagnostics_txt true
outputs.save_diagnostics_report_csv true
outputs.save_diagnostics_summary_txt true
outputs.save_session_info_txt true
outputs.save_transform_sensitivity_summary_csv true
outputs.save_transform_sensitivity_parameters_csv true
outputs.save_transform_assumptions_txt true
outputs.save_data_dictionary_csv true
outputs.save_allocator_csv true
outputs.save_allocator_png true
outputs.save_allocator_json false
outputs.save_decomp_csv true
outputs.save_decomp_png true
outputs.save_spec_summary_csv true
outputs.save_design_matrix_manifest_csv true
outputs.save_vif_report_csv true
outputs.save_predictor_risk_register_csv true
outputs.save_fit_png true
outputs.save_residuals_csv true
outputs.save_diagnostics_png true
outputs.save_model_selection_csv true
outputs.save_model_selection_pointwise_csv true

Run directory precedence:

  1. CLI --run-dir / run_from_yaml(..., run_dir=...)
  2. outputs.run_dir
  3. Timestamped directory under outputs.root_dir
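A sketch of an outputs block illustrating path resolution (the directory names are hypothetical):

```yaml
outputs:
  root_dir: results
  run_dir: quickstart_run    # relative, so resolves to results/quickstart_run
  overwrite: false
  layout: staged
  save_posterior_rds: true   # opt in to the raw posterior object
```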

forecast

Key Type Default Rules
forecast.enabled boolean false Reserved stage toggle (70_forecast).

diagnostics

Key Type Default Rules
diagnostics.enabled boolean true Enables diagnostics pipeline.
diagnostics.policy_mode string publish explore, publish, or strict.
diagnostics.enforce_publish_gate boolean false If true, run aborts only when overall status is fail.

diagnostics.model_selection:

Key Default Rules
enabled true Toggle PSIS-LOO check pipeline.
method psis_loo Currently only psis_loo is supported.
max_draws null If set, must be > 0.
pareto_k_warn 0.7 Finite value must be in [0, 1], or Inf.
pareto_k_fail Inf Finite value must be in [0, 1] and strictly greater than pareto_k_warn.
moment_match false Boolean.
reloo false Boolean.
top_n 10 Must be >= 0.

diagnostics.time_series_selection:

Key Default Rules
enabled false When true, runs refit-and-score time-series selection.
method blocked_cv blocked_cv or leave_future_out.
horizon_weeks 13 Positive integer.
n_folds 4 Positive integer.
stride_weeks horizon_weeks Positive integer.
min_train_weeks 52 Positive integer.
save_pointwise false Boolean.
save_png true Boolean.

Time-series selection constraints:

  • Requires fit.method: mcmc.
  • Not supported for pooled runs.
  • Requires data.date_var set and present in data.
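A sketch of a diagnostics block enabling both model selection and time-series selection (values are illustrative; the time-series block additionally requires fit.method: mcmc and data.date_var):

```yaml
diagnostics:
  enabled: true
  policy_mode: publish
  model_selection:
    enabled: true
    pareto_k_warn: 0.7
  time_series_selection:
    enabled: true
    method: blocked_cv
    horizon_weeks: 13
    n_folds: 4
    min_train_weeks: 52
```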

diagnostics.identifiability:

Key Default Rules
enabled true Boolean.
media_terms [] Character list.
baseline_terms [] Character list.
baseline_regex ["^t(_|$)", "^sin[0-9]", "^cos[0-9]", "^holiday_"] Character list of regex patterns used to classify baseline terms.
abs_corr_warn 0.80 Must be in [0, 1).
abs_corr_fail 0.95 Finite value must be > abs_corr_warn and <= 1, or Inf.

security

Key Type Default Rules
security.allow_unsafe_formula boolean false If false, formula safety checks are enforced.

Safe formula calls when unsafe mode is disabled:

  • Operators: ~, +, -, *, /, ^, :, |, (
  • Functions: log, exp, sqrt, atan, sin, cos, tan, I, offset, pmax, pmin, abs
  • Namespaced calls: dplyr::lag, dplyr::lead
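As a sketch (term names are hypothetical), a formula built only from whitelisted calls passes validation, while anything outside the whitelist aborts unless the bypass flag is set:

```yaml
model:
  formula: kpi ~ log(m_tv) + dplyr::lag(m_search) + t   # whitelisted calls only

security:
  allow_unsafe_formula: true   # use with care; disables formula safety checks
```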
Recommended workflow

  1. Generate a template:

Rscript scripts/dsambayes.R init --template master --out config/my_run.yaml

  2. Validate before running:

R_LIBS_USER="$PWD/.Rlib" \
  Rscript scripts/dsambayes.R validate --config config/my_run.yaml

  3. Execute:

R_LIBS_USER="$PWD/.Rlib" \
  Rscript scripts/dsambayes.R run --config config/my_run.yaml

Output Artefacts

Purpose

This page defines what the YAML runner writes, where files are written, and which config flags control each artefact.


Run directory and layout semantics

Run directory precedence:

  1. CLI --run-dir
  2. outputs.run_dir
  3. Timestamped folder under outputs.root_dir

Layout behaviour:

  • outputs.layout: staged (default) writes files under numbered stage folders.
  • outputs.layout: flat writes all files directly under the run directory.

Stage folders used by the runner:

  • 00_run_metadata
  • 10_pre_run
  • 20_model_fit
  • 30_post_run
  • 40_diagnostics
  • 50_model_selection
  • 60_optimisation
  • 70_forecast (directory only, when forecast.enabled: true)

Command behaviour

validate

  • validate uses dry_run = TRUE.
  • If no run directory is resolved, no artefacts are written.
  • If a run directory is resolved (--run-dir or outputs.run_dir), config.original.yaml and config.resolved.yaml are written.
  • If a run directory is resolved and outputs.save_session_info_txt: true, session_info.txt is written.
  • If forecast is enabled and a run directory is materialised, the 70_forecast/ directory is created.

run

  • run writes the full artefact set subject to config toggles and runtime conditions.

Artefact contract by stage

00_run_metadata

File Controlled by Written when Notes
config.original.yaml always run dir materialised Raw YAML text from the input config.
config.resolved.yaml always run dir materialised Defaults applied, paths resolved, schema validated.
session_info.txt outputs.save_session_info_txt flag is true Includes DSAMbayes version, schema version, model/fit metadata, and sessionInfo().

10_pre_run

File Controlled by Written when Notes
transform_assumptions.txt outputs.save_transform_assumptions_txt flag is true Written even if transform sensitivity scenarios are disabled.
transform_sensitivity_summary.csv outputs.save_transform_sensitivity_summary_csv sensitivity object exists with rows Requires transforms.sensitivity.enabled: true and successful scenario execution.
transform_sensitivity_parameters.csv outputs.save_transform_sensitivity_parameters_csv sensitivity object exists with rows Parameter means/SD by scenario.
dropped_groups.csv none groups dropped by pooling.min_waves filter Written only when sparse groups are excluded.
holiday_feature_manifest.csv none managed holidays enabled and features generated Documents generated holiday terms and active-week counts.
design_matrix_manifest.csv outputs.save_design_matrix_manifest_csv flag is true and manifest non-empty Per-term design metadata.
data_dictionary.csv outputs.save_data_dictionary_csv flag is true and dictionary table non-empty Merges inline YAML metadata and optional CSV dictionary metadata.
spec_summary.csv outputs.save_spec_summary_csv flag is true and table available Single-row model/spec summary.
vif_report.csv outputs.save_vif_report_csv flag is true and predictors available VIF diagnostics for non-intercept predictors.

20_model_fit

File Controlled by Written when Notes
model.rds outputs.save_model_rds flag is true Fitted model object.
posterior.rds outputs.save_posterior_rds flag is true and MCMC fit Raw posterior object for MCMC runs only.
fit_metrics_by_group.csv implicit fitted summary is computed Written when any of save_fitted_csv, save_fit_png, save_residuals_csv, save_diagnostics_png is true.
fit_timeseries.png outputs.save_fit_png flag is true and ggplot2 installed Observed vs fitted over time.
fit_scatter.png outputs.save_fit_png flag is true and ggplot2 installed Observed vs fitted scatter.

30_post_run

File Controlled by Written when Notes
observed.csv outputs.save_observed_csv flag is true Observed response on model response scale.
observed_kpi.csv outputs.save_observed_csv flag is true and response scale is log KPI-scale observed values (exp) with conversion_method = point_exp.
fitted.csv outputs.save_fitted_csv flag is true Fitted summaries on model response scale.
fitted_kpi.csv outputs.save_fitted_csv flag is true and response scale is log KPI-scale fitted summaries (exp).
posterior_summary.csv outputs.save_posterior_summary_csv flag is true and MCMC fit Posterior summaries for coefficients and scalar diagnostics.
optimisation_runs.csv none fit.method: optimise All optimisation starts.
optimisation_best.csv none fit.method: optimise Best run by RMSE.

Implementation note:

  • decomp_predictor_impact.csv, decomp_predictor_impact.png, decomp_timeseries.csv, and decomp_timeseries.png are present in stage mapping and config flags, but are not currently invoked by write_run_artifacts() in the active pipeline.

40_diagnostics

File Controlled by Written when Notes
chain_diagnostics.txt outputs.save_chain_diagnostics_txt flag is true and MCMC fit Chain diagnostics text output.
diagnostics_report.csv outputs.save_diagnostics_report_csv flag is true and diagnostics object exists One row per diagnostic check.
diagnostics_summary.txt outputs.save_diagnostics_summary_txt flag is true and diagnostics object exists Counts by status and overall status.
residuals.csv outputs.save_residuals_csv flag is true and fitted summary is computed Residual table on response scale.
residuals_timeseries.png outputs.save_diagnostics_png flag is true and ggplot2 installed Residuals over time.
residuals_vs_fitted.png outputs.save_diagnostics_png flag is true and ggplot2 installed Residuals vs fitted.
residuals_hist.png outputs.save_diagnostics_png flag is true and ggplot2 installed Residual histogram.
residuals_acf.png outputs.save_diagnostics_png flag is true and ggplot2 installed Residual autocorrelation plot.
residual_diagnostics.csv none diagnostics residual checks available Ljung-Box / ACF check outputs.
residuals_latent.csv none diagnostics latent residuals available Latent residual series from diagnostics object.
residuals_latent_acf.png outputs.save_diagnostics_png latent residuals available and ggplot2 installed Latent residual ACF plot.
boundary_hits.csv none boundary-hit table available Boundary-hit rates per parameter.
boundary_hits.png outputs.save_diagnostics_png boundary-hit table available and ggplot2 installed Boundary-hit visualisation.
within_variation.csv none within-variation table available Within-variation diagnostics for hierarchical terms.
within_variation.png outputs.save_diagnostics_png within-variation table available and ggplot2 installed Within-variation visualisation.
predictor_risk_register.csv outputs.save_predictor_risk_register_csv flag is true and table non-empty Ranked risk register combining VIF, within-variation, boundary hits, and slow-moving flags.

50_model_selection

File Controlled by Written when Notes
loo_summary.csv outputs.save_model_selection_csv flag is true, diagnostics.model_selection.enabled: true, and diagnostics report exists May be full PSIS-LOO summary or a stub row with skip reason.
loo_pointwise.csv outputs.save_model_selection_pointwise_csv flag is true, diagnostics report exists, and pointwise PSIS-LOO is available Optional pointwise LOO diagnostics.
tscv_folds.csv diagnostics.time_series_selection.enabled time-series selection enabled and folds produced Fold windows plus fold-level runtime/status metadata.
tscv_summary.csv diagnostics.time_series_selection.enabled time-series selection enabled Written for success, skipped, or error outcomes.
tscv_pointwise.csv diagnostics.time_series_selection.enabled + diagnostics.time_series_selection.save_pointwise enabled and pointwise rows available Optional pointwise holdout log predictive densities.
tscv_elpd_by_fold.png diagnostics.time_series_selection.save_png + outputs.save_diagnostics_png enabled and ggplot2 installed ELPD-by-fold chart.

60_optimisation

File Controlled by Written when Notes
budget_summary.csv outputs.save_allocator_csv allocation enabled and flag is true Scenario-level optimisation summary.
budget_allocation.csv outputs.save_allocator_csv allocation enabled and flag is true Recommended allocation by channel.
budget_diagnostics.csv outputs.save_allocator_csv allocation enabled and flag is true Candidate and objective diagnostics.
budget_response_curves.csv outputs.save_allocator_csv allocation enabled and flag is true Response-curve payload.
budget_response_points.csv outputs.save_allocator_csv allocation enabled and flag is true Key plotted points for response curves.
budget_roi_cpa.csv outputs.save_allocator_csv allocation enabled and flag is true ROI/CPA panel payload (depends on KPI type).
budget_impact.csv outputs.save_allocator_csv allocation enabled and flag is true Allocation impact payload.
budget_response_curves.png outputs.save_allocator_png allocation enabled, flag is true, and ggplot2 installed Response curves plot.
budget_roi_cpa.png outputs.save_allocator_png allocation enabled, flag is true, and ggplot2 installed ROI/CPA panel plot.
budget_impact.png outputs.save_allocator_png allocation enabled, flag is true, and ggplot2 installed Allocation impact plot.
budget_optimisation.json outputs.save_allocator_json allocation enabled, flag is true, and jsonlite installed Combined JSON payload (summary, allocation, diagnostics, plot_data).

70_forecast

Item Controlled by Written when Notes
70_forecast/ directory forecast.enabled flag is true Directory is created, but no forecast files are currently emitted by runner writers.

Response scale semantics (*_kpi.csv vs base files)

Base files (observed.csv, fitted.csv) are always on the model response scale:

  • identity response: KPI units
  • log response: log(KPI)

KPI-scale files are written only for log-response models:

  • observed_kpi.csv
  • fitted_kpi.csv

Conversion metadata:

  • observed_kpi.csv uses conversion_method = point_exp.
  • fitted_kpi.csv uses conversion_method = point_exp for fit.method: optimise.
  • fitted_kpi.csv uses conversion_method = drawwise_exp for fit.method: mcmc.

Diagnostics status semantics

diagnostics_report.csv status values:

  • pass: check passed configured thresholds
  • warn: check breached warning threshold
  • fail: check breached fail threshold
  • skipped: check not applicable or intentionally skipped

Overall status logic:

  • fail if any check is fail
  • warn if no fails and at least one warn
  • pass otherwise

diagnostics_summary.txt reports:

  • overall_status
  • counts for pass, warn, fail, skipped

Quick verification commands

List produced files for a run:

latest_run="$(ls -td results/* | head -n 1)"
find "$latest_run" -type f | sort

Inspect key diagnostics files:

latest_run="$(ls -td results/* | head -n 1)"
head -n 20 "$latest_run/40_diagnostics/diagnostics_report.csv"
head -n 20 "$latest_run/40_diagnostics/diagnostics_summary.txt"

Modelling

Purpose

Describe model classes, inference contracts, diagnostics, and decision-layer semantics for DSAMbayes.

Audience

  • Practitioners building and interpreting DSAMbayes models.
  • Reviewers validating modelling assumptions and outputs.

Pages

Page Topic
Model Classes BLM, hierarchical, and pooled class constructors, fit support, and limitations
Model Object Lifecycle State transitions from construction through fitting to post-fit extraction
Priors and Boundaries Prior schema, defaults, overrides, boundary controls, and scale semantics
Minimal-Prior Policy Governance guidance for prior specification in MMM
Response Scale Semantics Identity vs log response, KPI-scale conversion, Jensen-safe reporting
Diagnostics Gates Policy modes, threshold tables, identifiability gate, and remediation actions
CRE / Mundlak Correlated random effects for hierarchical models
Time Components Managed holiday feature generation and weekly anchoring
Budget Optimisation Decision-layer budget allocation, objectives, risk scoring, and response transforms

Subsections of Modelling

Model Classes

Purpose

DSAMbayes provides three model classes for Bayesian marketing mix modelling. Each class targets a different data structure and pooling strategy. This page describes the constructor pathways, fit support, and practical limitations of each class so that an operator can select the appropriate model for a given dataset.

Class summary

Class S3 class chain Constructor Data structure Grouping Typical use case
BLM blm blm(formula, data) Single market/brand None One-market regression with full prior and boundary control
Hierarchical hierarchical, blm blm(formula, data) with (term | group) syntax Panel (long format) Random effects by group Multi-market models sharing strength across groups
Pooled pooled, blm pool(blm_obj, grouping_vars, map) Single market Structured coefficient pooling via dimension map Single-market models with media coefficients pooled across labelled dimensions

BLM (blm)

Construction

model <- blm(kpi ~ m_tv + m_search + trend + seasonality, data = df)

blm() dispatches on the first argument. When passed a formula, it creates a blm object with default priors and boundaries. When passed an lm object, it creates a bayes_lm_updater whose priors are initialised from the OLS coefficient estimates and standard errors.

Fit support

Method Function Backend
MCMC fit(model, ...) rstan::sampling()
MAP fit_map(model, n_runs, ...) rstan::optimizing() (repeated starts)

Post-fit accessors

  • get_posterior() — coefficient draws, fitted values, metrics
  • fitted() — predicted response on the original scale
  • decomp() — predictor-level decomposition via DSAMdecomp
  • optimise_budget() — decision-layer budget allocation

Limitations

  • No group structure. For multi-market data, use the hierarchical class.
  • optimise_budget() aborts if scale=TRUE and an offset is present (unsupported combination for the bayes_lm_updater Stan template).

Hierarchical (hierarchical)

Construction

The hierarchical class is created automatically when blm() detects random-effects syntax (|) in the formula:

model <- blm(
  kpi ~ m_tv + m_search + trend + (1 + m_tv + m_search | market),
  data = panel_df
)

Terms to the left of | become random slopes; the variable to the right defines the grouping factor. Multiple grouping terms are supported.

CRE / Mundlak extension

For correlated random effects, call set_cre() after construction:

model <- set_cre(model, vars = c("m_tv", "m_search"))

This augments the population formula with group-mean terms (cre_mean_*) and updates priors and boundaries accordingly. See CRE / Mundlak for details.

Fit support

Method Function Backend
MCMC fit(model, ...) rstan::sampling()
MAP fit_map(model, n_runs, ...) rstan::optimizing() (repeated starts)

Post-fit accessors

Same as BLM. Coefficient draws from get_posterior() return vectors (one value per group) rather than scalars. Budget optimisation uses the population-level (fixed-effect) coefficient draws from the beta parameter.

Limitations

  • Stan template compilation uses a templated source (general_hierarchical.stan) rendered per number of groups and parameterisation mode. First compilation is slow; subsequent runs use a cached binary.
  • Response decomposition via model.matrix() may fail for formulas containing | syntax. The runner wraps this in tryCatch and skips gracefully.
  • Posterior forest and prior-vs-posterior plots average group-specific draws to produce a single population-level estimate.
  • Offset support in the hierarchical Stan template is handled via stats::model.offset() within build_hierarchical_frame_data().

Pooled (pooled)

Construction

The pooled class is created by converting an existing BLM object with pool():

base <- blm(kpi ~ m_tv + m_search + trend + seasonality, data = df)
model <- pool(base, grouping_vars = c("channel"), map = pooling_map)

The map is a data frame with a variable column mapping formula terms to pooling dimension labels. Priors and boundaries are reset to defaults when pool() is called.

Fit support

Method Function Backend
MCMC fit(model, ...) rstan::sampling()

MAP fitting (fit_map) is not currently implemented for pooled models.

Post-fit accessors

Same as BLM. The design matrix is split into base terms (intercept + non-pooled) and media terms (pooled). The Stan template uses a per-dimension coefficient structure.

Limitations

  • MAP fitting is not available.
  • extract_stan_design_matrix() may return a zero-row matrix, which causes VIF computation to be skipped.
  • The pooled Stan cache key includes sorted grouping variable names to avoid collisions between different pooling configurations.
  • Time-series cross-validation is not supported for pooled models (rejected by config validation).

Class selection guide

Scenario Recommended class Rationale
Single market, sufficient data BLM Simplest pathway; full accessor and optimisation support
Single market, OLS baseline available BLM via blm(lm_obj, data) Priors initialised from OLS; Bayesian updating
Multi-market panel Hierarchical Partial pooling shares strength across markets
Multi-market panel with confounding concerns Hierarchical + CRE Mundlak terms control for between-group confounding
Single market with structured media dimensions Pooled Coefficient pooling across labelled media categories

Fit method selection

Criterion MCMC (fit) MAP (fit_map)
Full posterior Yes No (point estimate only)
Credible intervals Yes Approximate via repeated starts
Diagnostics (Rhat, ESS, divergences) Yes Not applicable
LOO-CV / model selection Yes Not supported
Speed Minutes to hours Seconds to minutes
Budget optimisation Full posterior-based Point-estimate-based

For production runs where diagnostics and uncertainty quantification matter, MCMC is the recommended fit method. MAP is useful for rapid iteration during model development.
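In code, the two pathways look like this (a sketch; passing chains through fit() to rstan::sampling() is assumed, not confirmed by this page):

```r
# Development loop: fast point estimates
dev_model <- fit_map(model)

# Production run: full posterior with diagnostics
prod_model <- fit(model, chains = 4)   # chains passthrough is an assumption
chain_diagnostics(prod_model)          # MCMC-only diagnostic summary
```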

Model Object Lifecycle

DSAMbayes model objects (blm, hierarchical, pooled) are mutable S3 lists that progress through a well-defined sequence of states. Understanding these states helps avoid calling post-fit accessors on an unfitted object or forgetting to compile before fitting.

State-machine diagram

                        ┌─────────────────────────────────────────┐
                        │           CREATED                       │
                        │  blm(), blm.formula(), blm.lm(),        │
                        │  as_bayes_lm_updater()                  │
                        │  Fields set: .formula, .original_data,  │
                        │    .prior, .boundaries                  │
                        └──────────────┬──────────────────────────┘
                                       │
            ┌──────────────────────────┼──────────────────────────┐
            ▼                          ▼                          ▼
    set_prior(obj, …)         set_boundary(obj, …)        set_date(obj, …)
    Mutates .prior            Mutates .boundaries         Sets .date_var
            │                          │                          │
            └──────────────────────────┼──────────────────────────┘
                                       │
                    (optional: pool() transitions blm → pooled,
                     resets .prior/.boundaries, adds .pooling_vars/.pooling_map)
                                       │
                                       ▼
                        ┌─────────────────────────────────────────┐
                        │           CONFIGURED                    │
                        │  Priors, boundaries, date variable are  │
                        │  set (may still use defaults).          │
                        └──────────────┬──────────────────────────┘
                                       │
                                       ▼
                        ┌─────────────────────────────────────────┐
                        │           COMPILED                      │
                        │  compile_model(obj)                     │
                        │  Sets .stan_model                       │
                        │  (pre_flight_checks auto-compiles if    │
                        │   .stan_model is NULL)                  │
                        └──────────────┬──────────────────────────┘
                                       │
                                       ▼
                        ┌─────────────────────────────────────────┐
                        │           PRE-FLIGHTED                  │
                        │  pre_flight_checks(obj, data)           │
                        │  Validates formula/data compatibility,  │
                        │  auto-compiles and auto-sets date_var   │
                        │  if missing. Sets .response_transform,  │
                        │  .response_scale.                       │
                        └──────────────┬──────────────────────────┘
                                       │
                                       ▼
                        ┌─────────────────────────────────────────┐
                        │           FITTED                        │
                        │  fit(obj) / fit_map(obj)                │
                        │  Calls pre_flight_checks internally,    │
                        │  then prep_data_for_fit → rstan.        │
                        │  Sets .stan_data, .date_val, .posterior │
                        └──────────────┬──────────────────────────┘
                                       │
                   ┌───────────────────┼───────────────────┐
                   ▼                   ▼                   ▼
           get_posterior(obj)    fitted(obj)         decomp(obj)
           Returns tibble of    Predicted values    Decomposition
           posterior draws      (yhat)              via DSAMdecomp
                   │                                       │
                   ▼                                       ▼
           optimise_budget(obj, …)                         Further analysis
           Budget allocation
           (requires fitted model)

States and key fields

State Entry point Fields populated
Created blm(), blm.formula(), blm.lm(), as_bayes_lm_updater() .formula, .original_data, .prior, .boundaries, .response_transform, .response_scale
Configured set_prior(), set_boundary(), set_date() Mutates .prior, .boundaries, .date_var
Pooled pool(obj, grouping_vars, map) Adds .pooling_vars, .pooling_map; resets .prior, .boundaries; class becomes pooled
Compiled compile_model(obj) .stan_model
Pre-flighted pre_flight_checks(obj, data) .response_transform, .response_scale; auto-sets .stan_model, .date_var if missing
Fitted fit(obj) / fit_map(obj) .stan_data, .date_val, .posterior

Post-fit accessors

These functions require a fitted model (.posterior is not NULL):

Accessor Returns Notes
get_posterior(obj) Tibble of posterior draws (coefficients, metrics, yhat) Back-transforms to original scale when scale=TRUE
fitted(obj) Predicted values (yhat) on original scale
get_optimisation(obj) Optimisation results tibble Only for MAP-fitted models (.posterior holds the optimisation result)
decomp(obj) Predictor-level decomposition via DSAMdecomp
optimise_budget(obj, …) Budget allocation results Requires fitted model with media terms
chain_diagnostics(obj) MCMC chain diagnostic summary Only for MCMC-fitted models

Guards and auto-transitions

  • pre_flight_checks() auto-compiles via compile_model() if .stan_model is NULL, and auto-sets .date_var to "date" if not already set.
  • fit() and fit_map() call pre_flight_checks() internally, so explicit compilation is optional.
  • get_posterior() aborts with a clear error if .posterior is NULL.
  • optimise_budget() aborts if the model has scale=TRUE and an offset is present (unsupported combination for the bayes_lm_updater class).
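A minimal end-to-end sequence that relies on those auto-transitions (a sketch, assuming DSAMbayes is attached and df is a suitable data frame):

```r
model <- blm(kpi ~ m_tv + m_search + trend, data = df)    # CREATED
model <- set_prior(model, m_tv ~ normal(0.5, 0.2))        # CONFIGURED
# compile_model() is optional here: fit() runs pre_flight_checks(),
# which auto-compiles and auto-sets .date_var if missing.
model <- fit(model)                                       # FITTED
draws <- get_posterior(model)  # aborts with a clear error if called pre-fit
```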

Object field reference

All fields are initialised by model_object_schema_defaults() in R/model_schema.R. The canonical field list:

Field Type Set by
.original lm object or NULL Constructor
.formula formula Constructor
.original_data data.frame Constructor
.response_transform character(1) Constructor / pre_flight_checks
.response_scale character(1) Constructor / pre_flight_checks
.prior tibble Constructor / set_prior
.boundaries tibble Constructor / set_boundary
.stan_model stanmodel compile_model
.stan_data list prep_data_for_fit (via fit)
.posterior stanfit or optimisation fit / fit_map
.fitted logical(1) Internal
.offset matrix or NULL prep_offset (via fit)
.date_var character(1) set_date / pre_flight_checks
.date_val vector fit / fit_map
.cre list or NULL apply_cre_data (hierarchical)
.pooling_vars character pool()
.pooling_map data.frame pool()
.positive_prior_parameterization character(1) Runner config

Runner-injected fields

These are set by the YAML/CLI runner (run_from_yaml()) for artifact writing and are not part of the core modelling API:

  • .runner_config, .runner_kpi_type, .runner_identifiability
  • .runner_time_components, .runner_budget_optimisation
  • .runner_model_selection, .runner_model_type

Priors and Boundaries

Purpose

This page defines how DSAMbayes specifies, defaults, overrides, and scales coefficient priors and parameter boundaries for all model classes. It covers the prior schema, supported families, default-generation logic, YAML override contract, and the interaction between priors, boundaries, and the scale=TRUE pathway.

Prior schema

Each model object carries a .prior tibble with one row per parameter. The columns are:

Column Type Meaning
parameter character Parameter name (matches design-matrix column or special name)
description character Human-readable label
distribution call R distribution call, e.g. normal(0, 5)
is_default logical Whether the row was generated by default_prior()

Supported prior families

Family Stan encoding Use case
normal(mean, sd) Default (prior_family_noise_sd = 0) Coefficient priors (location–scale)
lognormal_ms(mean, sd) Encoded as prior_family_noise_sd = 1 with log-transformed parameters noise_sd prior when positive support is desired

All coefficient priors use normal(). The lognormal_ms family is available only for the noise_sd parameter and is parameterised by the mean and standard deviation on the original (non-log) scale; DSAMbayes converts these internally to log-space parameters.
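One standard way to derive log-space parameters from an original-scale mean and standard deviation is moment matching. A sketch of that calculation (illustrative only: the package's internal conversion is not documented on this page and may differ):

```r
# Moment-matched lognormal parameters from an original-scale mean m and sd s.
# (Assumed formula, not taken from the DSAMbayes source.)
lognormal_ms_params <- function(m, s) {
  sigma2 <- log(1 + (s / m)^2)
  list(meanlog = log(m) - sigma2 / 2, sdlog = sqrt(sigma2))
}

p <- lognormal_ms_params(2, 0.5)
# Sanity check: the implied lognormal recovers the original-scale mean
exp(p$meanlog + p$sdlog^2 / 2)  # 2
```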

Default prior generation

BLM and hierarchical (population terms)

default_prior.blm() calls standard_prior_terms(), which produces normal(0, 5) for each population-formula term (intercept and slope terms) plus a noise_sd entry.

Hierarchical (group-level standard deviations)

default_prior.hierarchical() additionally generates sd_<idx>[<term>] rows for each group factor. The prior standard deviation is set to the between-group standard deviation of the response, rounded to two decimal places.

BLM from lm (Bayesian updating)

default_prior.bayes_lm_updater() initialises coefficient priors from the OLS point estimates (mean) and standard errors (sd), enabling informative Bayesian updating.
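For example, via the blm(lm_obj, data) pathway from the class selection guide (a sketch; assumes DSAMbayes is attached):

```r
# Initialise priors from an OLS baseline fit (Bayesian updating pathway)
ols   <- lm(kpi ~ m_tv + m_search + trend, data = df)
model <- blm(ols, data = df)
peek_prior(model)  # coefficient priors centred at the OLS estimates,
                   # with sds taken from the OLS standard errors
```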

Pooled

default_prior.pooled() uses the BLM defaults for non-pooled terms (intercept, base regressors, noise_sd) and normal(0, 5) for each dimension-level pooled coefficient.

Boundary schema

Each model object carries a .boundaries tibble with one row per parameter:

Column Type Meaning
parameter character Parameter name
description character Human-readable label
boundary list-column List with $lower and $upper (numeric scalars)
is_default logical Whether the row was generated by default_boundary()

Default boundaries are lower = -Inf, upper = Inf for all terms. No sign constraints are imposed by default.

YAML override contract

Prior overrides

priors:
  use_defaults: true
  overrides:
    - { parameter: m_tv, mean: 0.5, sd: 0.2 }
    - { parameter: price_index, mean: -0.2, sd: 0.1 }

Each override replaces the distribution call for the named parameter with normal(mean, sd). Overrides are sparse: only the listed parameters are changed; all other parameters keep their defaults.

When use_defaults: false, the default prior table is not generated. This is not recommended for typical use.

Boundary overrides

boundaries:
  overrides:
    - { parameter: m_tv, lower: 0.0, upper: .Inf }
    - { parameter: competitor_discount, lower: -.Inf, upper: 0.0 }

Each override replaces the boundary entry for the named parameter. YAML infinity tokens (.Inf, -.Inf) are coerced during config resolution.

Scale semantics (scale = TRUE)

When model.scale: true (the default), the response and predictors are standardised before Stan fitting. This affects both priors and boundaries.

Coefficient prior scaling

Prior standard deviations are scaled by the ratio sx / sy for slope terms and by 1 / sy for the intercept. The noise_sd prior standard deviation is multiplied by sy (the response standard deviation) to remain interpretable in the scaled space.

Boundary scaling

  • Zero boundaries (0) are invariant under scaling.
  • Infinite boundaries (±Inf) are invariant under scaling.
  • Finite non-zero boundaries for slope terms are scaled using scale_boundary_for_parameter(), which applies the same sx / sy ratio used for slope priors.
  • If a finite non-zero boundary is specified for a parameter without a matching scale factor in the design matrix, DSAMbayes aborts with a validation error.

Practical implication

Users specify priors and boundaries on the original (unscaled) data scale. DSAMbayes converts them internally before passing data to Stan. Post-fit, coefficient draws are back-transformed to the original scale by get_posterior().
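A worked example of the slope-prior rule above, using the stated sx / sy formula:

```r
# User-specified prior on the original data scale: m_tv ~ normal(0.5, 0.2)
sy <- 1000   # sd of the response
sx <- 50     # sd of the m_tv column
prior_sd_scaled <- 0.2 * sx / sy  # 0.01: the sd used in the scaled space
# Post-fit, get_posterior() reverses this so reported draws are
# back on the original scale.
```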

Interaction with model classes

Behaviour BLM Hierarchical Pooled
Default priors normal(0, 5) per term Population: same as BLM; group SD: data-derived Non-pooled: BLM defaults; pooled: normal(0, 5) per dimension
Boundary defaults (-Inf, Inf) per term Same as BLM for population terms Per-dimension boundaries for pooled terms
Prior scaling sx / sy ratio Same, computed on pooled model frame Same, computed on full model frame
Boundary scaling Same ratio Same Same

Programmatic API

Inspect priors and boundaries

peek_prior(model)
peek_boundary(model)

Override priors

model <- model %>%
  set_prior(
    m_tv ~ normal(0.5, 0.2),
    price_index ~ normal(-0.2, 0.1)
  )

Override boundaries

model <- model %>%
  set_boundary(
    m_tv > 0,
    competitor_discount < 0
  )

Minimal-prior policy

The recommended operating profile for MMM is documented in Minimal-Prior Policy. The policy keeps priors weak by default and uses hard constraints only when there is structural business knowledge.

Minimal-Prior Policy

Purpose

Use a principled but low-friction prior setup that avoids specification-hunting while preserving identifiability in short, collinear MMM datasets.

Policy

  1. Default-first: keep priors.use_defaults: true.
  2. Sparse overrides: only add priors.overrides for high-conviction terms.
  3. Selective bounds: add boundaries.overrides only for structural signs.
  4. No blanket constraints: do not force all controls/media to one sign by default.
  5. Diagnose before tightening: use pre-flight and diagnostics gates first, then add priors/bounds if uncertainty is still unstable.

YAML mapping

priors:
  use_defaults: true
  overrides:
    # Optional high-conviction override examples:
    # - { parameter: price_index, mean: -0.2, sd: 0.1 }
    # - { parameter: distribution, mean: 0.15, sd: 0.1 }

boundaries:
  overrides:
    # Optional structural sign constraints:
    # - { parameter: m_tv, lower: 0.0, upper: .Inf }
    # - { parameter: competitor_discount, lower: -.Inf, upper: 0.0 }

When to override defaults

  • Do override when the domain mechanism is stable and defensible.
  • Do not override merely to improve one run’s fit metrics.
  • Do not add bounds if the sign can plausibly flip under promotion, pricing, or substitution effects.

Review checklist

  • Are overrides fewer than the number of major business assumptions?
  • Is each bound tied to a concrete causal rationale?
  • Did diagnostics indicate a real identifiability problem before tightening?

Response Scale Semantics

Purpose

DSAMbayes models can operate on an identity (level) or log response scale. This page defines how response scale is detected, stored, and used for post-fit reporting, so that operators understand which scale their outputs are on and how KPI-scale conversions work.

Response scale detection

Response scale is determined at construction time by detect_response_scale(), which inspects the left-hand side of the formula:

Formula LHS Detected transform Response scale label
kpi ~ ... identity response_level
log(kpi) ~ ... log response_log

The detected value is stored in two model-object fields:

  • .response_transform"identity" or "log". Describes the mathematical transform applied to the response before modelling.
  • .response_scale"identity" or "log". Used as a label when reporting whether outputs are on the model scale or the KPI scale.

Both fields are set by the constructor and confirmed by pre_flight_checks().

Model scale vs KPI scale

Concept Identity response Log response
Model scale Raw KPI units Log of KPI units
KPI scale Same as model scale exp() of model scale
Coefficient interpretation Unit change in KPI per unit change in predictor Approximate percentage change in KPI per unit change in predictor

For identity-response models, model scale and KPI scale are identical. For log-response models, fitted values and residuals on the model scale are in log units and must be exponentiated to obtain KPI-scale values.

Post-fit accessors and scale behaviour

fitted() — model scale

fitted() returns predicted values on the model scale. For identity-response models this is the KPI scale. For log-response models this is the log scale.

fit_tbl <- fitted(model)
# fit_tbl$fitted is on model scale

fitted_kpi() — KPI scale

fitted_kpi() applies the inverse transform draw-wise before summarising. For log-response models the default conversion (since v1.2.2) uses the conditional-mean estimator:

$$E[Y] = \exp\!\bigl(\mu + \tfrac{\sigma^2}{2}\bigr)$$

This is the bias-corrected back-transform that accounts for the log-normal variance term. The previous behaviour (v1.2.0) used the simpler exp(mu) estimator, which corresponds to the conditional median on the KPI scale. To retain that behaviour, pass log_response = "median":

# Default (v1.2.2): conditional mean — bias-corrected
kpi_tbl <- fitted_kpi(model)

# Explicit median — equivalent to pre-v1.2.2 behaviour
kpi_tbl <- fitted_kpi(model, log_response = "median")

The output includes source_response_scale (the model’s response scale), response_scale = "kpi", and conversion_method ("conditional_mean" or "point_exp") to label the result.

observed() — model scale

observed() returns the observed response on the model scale after unscaling (if scale=TRUE).

observed_kpi() — KPI scale

observed_kpi() returns the observed response on the KPI scale. For log-response models, this applies exp() to the model-scale observed values.

to_kpi_scale() helper

The internal function to_kpi_scale(x, response_scale) implements the conversion:

  • If response_scale == "log": returns exp(x).
  • Otherwise: returns x unchanged.

This function is used consistently by fitted_kpi(), observed_kpi(), and runner artefact writers.
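A sketch matching the behaviour described above (the internal implementation may differ in its argument checking):

```r
# Conversion rule: exponentiate only when the model's response scale is log
to_kpi_scale <- function(x, response_scale) {
  if (identical(response_scale, "log")) exp(x) else x
}

to_kpi_scale(log(c(100, 250)), "log")  # 100, 250
to_kpi_scale(c(100, 250), "identity")  # unchanged
```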

Runner artefact scale conventions

Runner artefact writers use the response scale metadata to determine which scale to report:

Artefact Scale Notes
fitted.csv Model scale Direct output from fitted()
observed.csv Model scale Direct output from observed()
posterior_summary.csv Model scale Coefficient summaries on model scale
Fit time series plot KPI scale Uses fitted_kpi() and observed_kpi() for visual comparison
Fit scatter plot KPI scale Same as fit time series
Diagnostics (residuals) Model scale Residuals computed on model scale
Budget optimisation outputs KPI scale Response curves and allocations reported on KPI scale

Interaction with scale = TRUE

The scale flag and response scale are orthogonal:

  • scale = TRUE standardises predictors and response by centring and dividing by standard deviation before Stan fitting. Coefficients and fitted values are back-transformed to the original scale by get_posterior().
  • Response scale determines whether the original scale is levels (identity) or logs (log).

Both transformations compose: a log-response model with scale=TRUE first takes the log of the response (via the formula), then standardises the logged values. Post-fit, draws are first unscaled, then (for KPI-scale outputs) exponentiated.

Jensen’s inequality and draw-wise conversion

When converting log-scale posterior draws to KPI scale, DSAMbayes applies exp() to each draw individually before computing summaries (mean, median, credible intervals). This is the correct Bayesian approach because:

  • E[exp(X)] ≠ exp(E[X]) when X has non-zero variance (Jensen’s inequality).
  • Draw-wise conversion preserves the full posterior distribution on the KPI scale.
  • Summary statistics (mean, quantiles) computed after conversion correctly reflect KPI-scale uncertainty.
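The difference is easy to demonstrate with simulated log-scale draws (illustrative; the size of the gap depends on sigma):

```r
set.seed(1)
mu <- 0; sigma <- 0.5
draws <- rnorm(1e5, mu, sigma)  # stand-in for log-scale posterior draws

exp(mean(draws))  # naive back-transform: approx exp(mu) = 1
mean(exp(draws))  # draw-wise conversion: approx exp(mu + sigma^2 / 2)
```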

Practical guidance

  • Use identity-response models when the KPI is naturally additive and coefficients should represent unit changes.
  • Use log-response models when the KPI is naturally multiplicative, when variance scales with level, or when the response must remain positive.
  • Always check response_scale_label(model) before interpreting coefficient magnitudes.
  • Use fitted_kpi() for business reporting; use fitted() for diagnostics.
  • Do not manually exponentiate posterior means from log-response models. Use fitted_kpi() or to_kpi_scale() on individual draws.

Cross-references

CRE / Mundlak

Purpose

The correlated random effects (CRE) pathway, implemented as a Mundlak device, augments hierarchical DSAMbayes models with group-mean terms. This separates within-group variation from between-group variation for selected regressors, reducing confounding bias when group-level means are correlated with the random effects.

When to use CRE

Use CRE when:

  • The model is hierarchical (panel data with (term | group) syntax).
  • Time-varying regressors (e.g. media spend) have group-level means that may be correlated with the group intercept or slope.
  • You want to decompose effects into within-group (temporal) and between-group (cross-sectional) components.

Do not use CRE when:

  • The model is BLM or pooled (CRE requires hierarchical class).
  • The panel has only one group (no between-group variation exists).
  • All regressors of interest are time-invariant (CRE mean terms would be constant).

Construction

CRE is applied after model construction via set_cre():

model <- blm(
  kpi ~ m_tv + m_search + trend + (1 + m_tv + m_search | market),
  data = panel_df
)
model <- set_cre(model, vars = c("m_tv", "m_search"))

What set_cre() does

  1. Resolves the grouping variable. If the formula has one group factor, it is used automatically. If multiple group factors exist, the group argument must be specified explicitly.

  2. Generates group-mean column names. For each variable in vars, a mean-term column is named cre_mean_<variable> (configurable via prefix).

  3. Augments the data. apply_cre_data() computes group-level means of each CRE variable and joins them back to the panel data as new columns.

  4. Updates the formula. The generated mean terms are appended to the population formula as fixed effects.

  5. Extends priors and boundaries. Default prior and boundary entries are added for each new mean term, matching the existing prior schema.
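Step 3 is equivalent to the following dplyr sketch (toy data; column names follow the cre_mean_ prefix convention described above):

```r
library(dplyr)

panel_df <- data.frame(
  market   = rep(c("uk", "de"), each = 2),
  m_tv     = c(10, 14, 5, 7),
  m_search = c(3, 5, 2, 2)
)

# Group-mean augmentation equivalent to apply_cre_data() (sketch)
panel_df <- panel_df %>%
  group_by(market) %>%
  mutate(
    cre_mean_m_tv     = mean(m_tv, na.rm = TRUE),
    cre_mean_m_search = mean(m_search, na.rm = TRUE)
  ) %>%
  ungroup()
```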

YAML runner configuration

When using the runner, CRE is configured via:

cre:
  enabled: true
  vars: [m_tv, m_search, m_social]
  group: market
  prefix: cre_mean_

The runner calls set_cre() during model construction if cre.enabled: true.

Mundlak decomposition

For a regressor $x_{gt}$ (group $g$, time $t$), the Mundlak device decomposes the effect into:

  • Within-group effect: the coefficient on $x_{gt}$ in the population formula captures temporal variation after conditioning on the group mean.
  • Between-group effect: the coefficient on $\bar{x}_g$ (the CRE mean term) captures cross-sectional variation in group-level averages.

The original coefficient on $x_{gt}$ in a standard random-effects model conflates both sources. Adding $\bar{x}_g$ as a fixed effect separates them.
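For a single regressor with a random intercept, the augmented population model can be written as:

$$y_{gt} = \alpha_g + \beta_w\, x_{gt} + \beta_b\, \bar{x}_g + \varepsilon_{gt}, \qquad \bar{x}_g = \frac{1}{T_g} \sum_{t=1}^{T_g} x_{gt}$$

where $\beta_w$ is the within-group effect and $\beta_b$ is the coefficient on the CRE mean term.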

Validation and identification warnings

Input validation

set_cre() validates:

  • The model is hierarchical (aborts for BLM or pooled).
  • All vars are present in the data and are numeric.
  • The group variable exists in the formula’s group factors.
  • No CRE mean terms appear in random-slope blocks (would cause double-counting).

Identification warnings

warn_cre_identification() checks two conditions after CRE setup:

  1. More CRE variables than groups. If length(vars) > n_groups, between-effect estimates may be weakly identified. The function emits a warning.

  2. Near-zero within-group variation. For each CRE variable, the within-group residual ($x_{gt} - \bar{x}_g$) standard deviation is checked. If it is effectively zero, within-effect identification is weak. The function emits a per-variable warning.
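The second check is equivalent to the following sketch (toy data in which m_tv is constant within each group; the tolerance used for "effectively zero" is hypothetical):

```r
panel_df <- data.frame(
  market = rep(c("uk", "de"), each = 3),
  m_tv   = c(10, 10, 10, 5, 5, 5)   # no within-group variation
)

# Within-group demeaned residuals x_gt - mean(x_g)
within_resid <- ave(panel_df$m_tv, panel_df$market,
                    FUN = function(x) x - mean(x, na.rm = TRUE))

# Hypothetical tolerance; the package's actual cutoff is not documented here
sd(within_resid, na.rm = TRUE) < 1e-8  # TRUE: weak within-effect identification
```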

Zero-variance CRE mean terms

If a CRE mean term has zero variance across all observations (possible when the underlying variable has identical group means), calculate_scaling_terms() in R/scale.R will abort when scale=TRUE. The error message identifies the constant CRE columns and suggests using model.type: re (without CRE) or model.scale: false as workarounds.

Panel assumptions

  • Balanced panels are not required. apply_cre_data() computes group means using dplyr::group_by() and mean(), which handles unequal group sizes.
  • Missing values in CRE variables are excluded from the group-mean calculation (na.rm = TRUE).
  • Group-mean recomputation. CRE mean columns are recomputed each time apply_cre_data() is called, including during prep_data_for_fit.hierarchical(). Existing CRE mean columns are dropped and regenerated to prevent stale values.

Decomposition and reporting

CRE mean terms appear as ordinary fixed-effect terms in the population formula. This means:

  • Posterior summary includes CRE mean-term coefficients alongside other population coefficients.
  • Response decomposition via decomp() attributes fitted-value contributions to CRE mean terms separately from their within-group counterparts.
  • Plots (posterior forest, prior-vs-posterior) include CRE mean terms.

Interpretation note: the CRE mean-term coefficient represents the between-group effect conditional on the within-group variation. It does not represent the total effect of the underlying variable.

Time Components

Purpose

DSAMbayes provides managed time-component generation through the time_components config section. When enabled, the runner deterministically generates holiday feature columns from a calendar file and optionally appends them to the model formula. This page defines the configuration contract, generation logic, naming conventions, and audit properties.

Overview

Time components in DSAMbayes cover:

  • Holidays — deterministic weekly indicator features derived from an external calendar file.
  • Trend and seasonality — specified directly in the model formula (e.g. t_scaled, sin52_1, cos52_1). These are not generated by the time-components system; they are user-supplied columns in the data.

The time_components system is responsible only for holiday feature generation.

YAML configuration

time_components:
  enabled: true
  holidays:
    enabled: true
    calendar_path: data/holidays.csv
    date_col: null          # auto-detected: date, ds, or event_date
    label_col: holiday
    date_format: null       # null = ISO 8601; or e.g. "%d/%m/%Y"
    week_start: monday
    timezone: UTC
    prefix: holiday_
    window_before: 0
    window_after: 0
    aggregation_rule: count # count | any
    overlap_policy: count_all # count_all | dedupe_label_date
    add_to_formula: true
    overwrite_existing: false

Key definitions

Key Default Description
enabled false Master toggle for the time-components system
holidays.enabled false Toggle for holiday feature generation
holidays.calendar_path null Path to the holiday calendar CSV (resolved relative to the config file)
holidays.date_col null Date column in the calendar; auto-detected from date, ds, or event_date
holidays.label_col holiday Column containing holiday event labels
holidays.date_format null Date parse format; null assumes ISO 8601
holidays.week_start monday Day-of-week anchor for weekly aggregation
holidays.timezone UTC Timezone used when parsing POSIX date-time inputs
holidays.prefix holiday_ Prefix prepended to generated feature column names
holidays.window_before 0 Days before each event date to include in the holiday window
holidays.window_after 0 Days after each event date to include in the holiday window
holidays.aggregation_rule count Weekly aggregation: count sums event-days per week; any produces a binary indicator
holidays.overlap_policy count_all Overlap handling: count_all counts every event-day; dedupe_label_date deduplicates per label and date
holidays.add_to_formula true Whether generated holiday terms are appended to the model formula automatically
holidays.overwrite_existing false Whether existing columns with matching names are overwritten

Calendar file contract

The holiday calendar is a CSV (or data frame) with at minimum:

Column Required Content
Date column Yes Daily event dates (one row per event occurrence)
Label column Yes Human-readable event name (e.g. Christmas, Black Friday)

Date column detection

If date_col is null, the system tries column names in order: date, ds, event_date. If none is found, validation aborts.

Label normalisation

Holiday labels are normalised to lowercase, alphanumeric-plus-underscore form via normalise_holiday_label(). For example:

  • Black Fridayblack_friday
  • New Year's Daynew_year_s_day
  • Empty labels → unnamed

The generated feature column name is {prefix}{normalised_label}, e.g. holiday_black_friday.
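A sketch consistent with the examples above (the package's exact normalisation rules may differ in edge cases):

```r
# Lowercase, replace non-alphanumeric runs with "_", trim, fall back to
# "unnamed" for empty labels (sketch of normalise_holiday_label()).
normalise_holiday_label <- function(x) {
  out <- tolower(x)
  out <- gsub("[^a-z0-9]+", "_", out)
  out <- gsub("^_+|_+$", "", out)
  ifelse(nchar(out) == 0, "unnamed", out)
}

normalise_holiday_label("Black Friday")    # "black_friday"
normalise_holiday_label("New Year's Day")  # "new_year_s_day"
```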

Generation pipeline

The runner calls build_weekly_holiday_features() with the following steps:

  1. Parse and validate the calendar. validate_holiday_calendar() checks column presence, date parsing, and label completeness.

  2. Expand holiday windows. expand_holiday_windows() replicates each event row across the [event_date - window_before, event_date + window_after] range.

  3. Align to weekly index. Each expanded event-day is mapped to its containing week using week_floor_date() with the configured week_start.

  4. Aggregate per week. Events are counted per week per feature. Under aggregation_rule: any, counts are collapsed to binary (0/1). Under overlap_policy: dedupe_label_date, duplicate label-date pairs within a week are removed before counting.

  5. Join to model data. The generated feature matrix is left-joined to the model data by the date column. Weeks with no events receive zero.

  6. Append to formula. If add_to_formula: true, generated feature columns are appended as additive terms to the population formula.

Weekly anchoring

All weekly alignment uses week_floor_date(), which computes the most recent occurrence of week_start on or before each date. The model data’s date column must contain week-start-aligned dates; normalise_weekly_index() validates this and aborts if dates are not aligned.
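In base R, the anchoring rule can be sketched as follows (illustrative; the package's implementation is not shown on this page):

```r
# Most recent occurrence of week_start on or before each date
week_floor_date <- function(dates, week_start = "monday") {
  anchor <- c(monday = 1, tuesday = 2, wednesday = 3, thursday = 4,
              friday = 5, saturday = 6, sunday = 7)[[week_start]]
  wd <- as.integer(format(dates, "%u"))  # ISO weekday: Monday = 1 ... Sunday = 7
  dates - ((wd - anchor) %% 7)
}

week_floor_date(as.Date("2024-01-06"))  # Saturday -> 2024-01-01 (a Monday)
```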

Supported week-start values

monday, tuesday, wednesday, thursday, friday, saturday, sunday.

Timezone handling

  • Calendar dates are parsed using the configured timezone (default UTC).
  • If the calendar contains POSIXt values, they are coerced to Date in the configured timezone.
  • Character dates are parsed as ISO 8601 by default, or using date_format if specified.

Generated-term audit contract

Generated holiday terms are tracked for downstream diagnostics and reporting:

  • The list of generated term names is stored in model$.runner_time_components$generated_terms.
  • The identifiability gate in R/diagnostics_report.R uses this list to auto-detect baseline terms (via detect_baseline_terms()), so generated holiday terms are included in baseline-media correlation checks without requiring explicit configuration.

Feature naming collision

If two different holiday labels normalise to the same feature name, build_weekly_holiday_features() aborts with a collision error. Ensure calendar labels are distinct after normalisation.

Interaction with existing data columns

  • If overwrite_existing: false (default), the runner aborts if any generated column name already exists in the data.
  • If overwrite_existing: true, existing columns with matching names are replaced by the generated features.

Practical guidance

  • Start with aggregation_rule: count to capture multi-day holiday effects (e.g. a holiday spanning two days in one week produces a count of 2).
  • Use window_before and window_after for events with known anticipation or lingering effects (e.g. window_before: 7 for pre-Christmas shopping).
  • Use aggregation_rule: any when you want binary holiday indicators regardless of how many event-days fall in a week.
  • Check generated terms in the resolved config (config.resolved.yaml) and posterior summary to confirm which holidays entered the model.

Cross-references

Diagnostics Gates

Purpose

DSAMbayes runs a deterministic diagnostics framework after model fitting. Each diagnostic check produces a pass, warn, or fail status. The policy mode controls how lenient or strict the thresholds are. This page defines the check taxonomy, threshold tables, policy modes, identifiability gate, and the overall status aggregation rule.

Policy modes

The diagnostics framework supports three policy modes, configured via diagnostics.policy_mode in YAML:

Mode Intent Threshold behaviour
explore Rapid iteration during model development Relaxed fail thresholds; many checks can only warn, not fail
publish Default production mode for shareable outputs Balanced thresholds; condition-number fail is downgraded to warn
strict Audit-grade gating for release candidates Tightest thresholds; rank deficit fails rather than warns

The mode is resolved by diagnostics_policy_thresholds(mode) in R/diagnostics_report.R.

Check taxonomy

Checks are organised into phases:

Phase Scope When evaluated
P0 Data integrity and design matrix validity Pre-fit (design matrix available)
P1 Sampler quality, residual behaviour, identifiability Post-fit (posterior available)

Each check row includes:

Field Meaning
check_id Unique identifier
phase P0 or P1
severity Priority rating (P0 = critical, P1 = important)
status pass, warn, fail, or skipped
metric Metric name
value Observed value
threshold Applied threshold description
message Human-readable explanation

P0 design checks

Check ID Metric Pass Warn Fail
pre_response_finite non_finite_response_count == 0 (no warn level) > 0
pre_design_constants_duplicates constant_plus_duplicate_columns == 0 (no warn level) > 0
pre_design_rank_deficit rank_deficit == 0 > 0 (publish) > 0 (strict)
pre_design_condition_number kappa_X ≤ warn threshold > warn threshold > fail threshold

Condition number thresholds by mode

Mode Warn Fail
explore 10,000 ∞ (cannot fail)
publish 10,000 1,000,000 (downgraded to warn)
strict 10,000 1,000,000
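The underlying metric, kappa_X, is the ratio of the largest to the smallest singular value of the design matrix. A sketch of the computation (illustrative Python; the thresholds above apply to the R implementation):

```python
import numpy as np

def condition_number(X):
    """kappa(X): ratio of the largest to the smallest singular value."""
    s = np.linalg.svd(X, compute_uv=False)
    return s.max() / s.min()

# Two nearly collinear columns push kappa far past the 10,000 warn threshold:
X_ortho = np.array([[1.0, 0.0], [0.0, 1.0]])
X_collinear = np.array([[1.0, 1.0], [1.0, 1.0001]])
condition_number(X_ortho)      # -> 1.0
condition_number(X_collinear)  # ~40,000
```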

P1 sampler checks (MCMC only)

Check ID Metric Direction Warn Fail
sampler_rhat_max max_rhat Lower is better 1.01 1.05
sampler_ess_bulk_min min_ess_bulk Higher is better 400 200
sampler_ess_tail_min min_ess_tail Higher is better 200 100
sampler_ebfmi_min min_ebfmi Higher is better 0.30 0.20
sampler_treedepth_frac treedepth_hit_fraction Lower is better 0.00 0.01
sampler_divergences divergent_fraction Lower is better 0.00 0.00

Mode adjustments for sampler checks

In explore mode, fail thresholds are substantially relaxed (e.g. rhat_fail = 1.10, ess_bulk_fail = 50). In strict mode, warn thresholds match publish fail thresholds.

P1 residual checks

Check ID Metric Direction Warn Fail
resid_ljung_box_p resid_lb_p Higher is better 0.05 0.01
resid_acf_max resid_acf_max Lower is better 0.20 0.40

Mode adjustments for residual checks

Mode resid_lb_p warn resid_lb_p fail resid_acf warn resid_acf fail
explore 0.05 0.00 (cannot fail) 0.20 ∞ (cannot fail)
publish 0.05 0.01 0.20 0.40
strict 0.10 0.05 0.15 0.30

P1 boundary hit check

Check ID Metric Direction Warn Fail
boundary_hit_fraction boundary_hit_frac Lower is better 0.05 0.20

In explore mode, boundary hits cannot fail. In strict mode, thresholds tighten to warn > 0.02, fail > 0.10.

P1 within-group variation check

Check ID Metric Direction Warn Fail
within_var_ratio within_var_min_ratio Higher is better 0.10 0.05

This check applies to hierarchical models and flags groups where within-group variation is extremely low relative to between-group variation. In explore mode, the fail threshold is zero (cannot fail).

Identifiability gate

The identifiability gate measures the maximum absolute correlation between baseline terms and media terms in the design matrix. It is configured via diagnostics.identifiability in YAML:

diagnostics:
  identifiability:
    enabled: true
    media_terms: [m_tv, m_search, m_social]
    baseline_terms: [trend, seasonality]
    baseline_regex: ["^h_", "^sin", "^cos"]
    abs_corr_warn: 0.80
    abs_corr_fail: 0.95

Term detection

  • Media terms: explicitly listed in media_terms.
  • Baseline terms: union of baseline_terms, generated time-component terms, and matches from baseline_regex patterns.
  • Both sets are intersected with actual design-matrix columns and filtered to remove constant columns.
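The gate metric itself is the maximum absolute Pearson correlation over all baseline-media column pairs. A sketch of the computation on a resolved design matrix (illustrative Python; the package implements this in R):

```python
import numpy as np

def max_baseline_media_corr(X, baseline_cols, media_cols):
    """Maximum |correlation| between any baseline and any media column of X."""
    return max(
        abs(np.corrcoef(X[:, b], X[:, m])[0, 1])
        for b in baseline_cols for m in media_cols
    )

# Media spend that drifts with the trend term is hard to identify:
rng = np.random.default_rng(0)
trend = np.arange(52.0)                  # weekly trend baseline
m_tv = trend + rng.normal(0.0, 3.0, 52)  # spend tracking the trend
X = np.column_stack([trend, m_tv])
max_baseline_media_corr(X, baseline_cols=[0], media_cols=[1])  # close to 1
```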

Thresholds by mode

Mode Warn Fail
explore 0.80 ∞ (cannot fail)
publish 0.80 0.95
strict 0.70 0.85

Skip conditions

The identifiability gate reports skipped when:

  • identifiability.enabled: false
  • No configured media terms found in the design matrix
  • No baseline terms detected from configured terms/regex
  • All resolved baseline or media terms are constant

Overall status aggregation

The overall diagnostics status is determined by diagnostics_overall_status():

  1. If any check has status == "fail" → overall status is fail.
  2. If any check has status == "warn" (and none fail) → overall status is warn.
  3. Otherwise → overall status is pass.

Checks with status == "skipped" do not affect the overall status.
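This reduction can be sketched in a few lines (illustrative Python; diagnostics_overall_status() is the R implementation):

```python
def overall_status(statuses):
    """Reduce per-check statuses to the overall diagnostics status.
    Skipped checks are ignored entirely."""
    active = [s for s in statuses if s != "skipped"]
    if "fail" in active:
        return "fail"
    if "warn" in active:
        return "warn"
    return "pass"

overall_status(["pass", "skipped", "warn"])          # -> "warn"
overall_status(["pass", "warn", "fail", "skipped"])  # -> "fail"
```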

Runner artefact output

The diagnostics framework produces:

Artefact Location Content
diagnostics_report.csv 40_diagnostics/ Full check table with all fields
diagnostics_summary.txt 40_diagnostics/ Human-readable summary of overall status and failing checks

Interpretation guidance

  • pass — no remediation needed; model is suitable for the configured policy mode.
  • warn — review recommended; the model may have quality concerns but does not block the configured policy.
  • fail — remediation required before the model can be considered production-ready under the configured policy.

Common remediation actions

Diagnostic area Warning signs Actions
High Rhat > 1.01 Increase MCMC iterations or warmup; simplify model
Low ESS < 400 bulk or < 200 tail Increase iterations; check for multimodality
Divergences Any non-zero fraction Increase adapt_delta; reparameterise model
High condition number kappa > 10,000 Reduce collinearity; remove redundant terms
Residual autocorrelation High ACF or low Ljung-Box p Add time controls (trend, seasonality, holidays)
Boundary hits > 5% of draws Review boundary specification; widen or remove constraints
High baseline-media correlation > 0.80 Add controls to separate baseline from media; consider alternative model specifications

Cross-references

Budget Optimisation

Purpose

DSAMbayes provides a decision-layer budget optimisation engine that operates on fitted model posteriors. Given a channel scenario with spend bounds, response-transform specifications, and an objective function, the engine searches for the allocation that maximises the chosen objective while respecting channel-level constraints. This page defines the inputs, objectives, risk scoring, response-scale handling, and output structure.

Overview

Budget optimisation is separate from parameter estimation. It takes a fitted model and a scenario specification, then:

  1. Extracts posterior coefficient draws for the scenario’s channel terms.
  2. Generates feasible candidate allocations within channel bounds that sum to the total budget.
  3. Evaluates each candidate across all posterior draws to obtain a distribution of KPI outcomes.
  4. Ranks candidates by the configured objective and risk scoring function.
  5. Returns the best allocation, channel-level summaries, response curves, and impact breakdowns.

Entry point

result <- optimise_budget(model, scenario, n_candidates = 2000L, seed = 123L)

The optimize_budget() alias is also available for users who prefer American English spelling.

Scenario specification

The scenario is a structured list with the following top-level keys:

channels

A list of channel definitions, each containing:

Key Required Default Description
term Yes (none) Model formula term name for this channel
name No Same as term Human-readable channel label
spend_col No Same as name Data column used for reference spend lookup
bounds.min No 0 Minimum allowed spend for this channel
bounds.max No Inf Maximum allowed spend for this channel
response No {type: "identity"} Response transform specification
currency_col No null Data column for currency-unit conversion

Channel names and terms must be unique across the scenario.
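An illustrative channels block in YAML form, using the keys from the table above (all column names and values here are hypothetical):

```yaml
channels:
  - term: m_tv
    name: TV
    spend_col: tv_spend
    bounds: {min: 0, max: 500000}
    response: {type: hill, k: 120000, n: 1.2}
  - term: m_search            # name and spend_col default to "m_search"
    bounds: {min: 10000, max: 300000}
    response: {type: log1p, scale: 50000}
```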

budget_total

Total budget to allocate across all channels. All feasible allocations sum to this value.

reference_spend

Optional named list of per-channel reference spend values. If not provided, reference spend is estimated from the mean of the spend_col in the model’s original data.

objective

Defines the optimisation target and risk scoring:

Key Values Description
target kpi_uplift, profit What to maximise
value_per_kpi numeric (required for profit) Currency value of one KPI unit
risk.type mean, mean_minus_sd, quantile Risk scoring function
risk.lambda numeric ≥ 0 (for mean_minus_sd) Penalty weight on posterior standard deviation
risk.quantile (0, 1) (for quantile) Quantile level for pessimistic scoring

Response transforms

Each channel can specify a response transform that maps raw spend to the transformed value used in the linear predictor. Supported types:

Type Formula Parameters
identity spend None
atan atan(spend / scale) scale (positive scalar)
log1p log(1 + spend / scale) scale (positive scalar)
hill spend^n / (spend^n + k^n) k (half-saturation), n (shape)

The response transform is applied within response_transform_value() and determines the shape of the channel’s response curve.
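The table's formulas, written out (illustrative Python; the package applies them in R via response_transform_value()):

```python
import math

def response_value(spend, spec):
    """Map raw spend to its transformed value, per the table above."""
    t = spec["type"]
    if t == "identity":
        return spend
    if t == "atan":
        return math.atan(spend / spec["scale"])
    if t == "log1p":
        return math.log1p(spend / spec["scale"])
    if t == "hill":
        k, n = spec["k"], spec["n"]
        return spend ** n / (spend ** n + k ** n)
    raise ValueError(f"unknown response type: {t}")

# Hill saturates at 1 and passes through 0.5 at spend == k:
response_value(100.0, {"type": "hill", "k": 100.0, "n": 2.0})  # -> 0.5
```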

Objective functions

kpi_uplift

Maximises the expected change in KPI relative to the reference allocation. The metric for each candidate is:

$$\Delta\text{KPI}_d = f_d(\text{candidate}) - f_d(\text{reference})$$

where $f_d$ is the model-predicted KPI under posterior draw $d$.

profit

Maximises expected profit, defined as:

$$\text{profit}_d = \text{value\_per\_kpi} \times \Delta\text{KPI}_d - \Delta\text{spend}$$

where $\Delta\text{spend} = \text{candidate total} - \text{reference total}$.

Risk-aware scoring

The risk scoring function determines how the distribution of objective draws is summarised into a single score for ranking candidates:

Risk type Score formula Use case
mean $\bar{m}$ Risk-neutral; maximises expected value
mean_minus_sd $\bar{m} - \lambda \cdot \sigma$ Penalises uncertainty; higher $\lambda$ is more conservative
quantile $Q_\alpha(m)$ Optimises the $\alpha$-quantile; directly targets worst-case outcomes
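A sketch of the three scoring rules over a vector of per-draw objective values (illustrative Python; lam and q stand in for risk.lambda and risk.quantile):

```python
import statistics

def risk_score(draws, risk_type="mean", lam=0.0, q=0.1):
    """Summarise per-draw objective values into a single ranking score."""
    if risk_type == "mean":
        return statistics.fmean(draws)
    if risk_type == "mean_minus_sd":
        return statistics.fmean(draws) - lam * statistics.stdev(draws)
    if risk_type == "quantile":
        # percentile cut points; index q*100 picks the alpha-quantile
        return statistics.quantiles(draws, n=100)[int(q * 100) - 1]
    raise ValueError(f"unknown risk type: {risk_type}")

draws = [10.0, 12.0, 8.0, 11.0, 9.0]
risk_score(draws, "mean")                    # -> 10.0
risk_score(draws, "mean_minus_sd", lam=1.0)  # mean minus one posterior sd
risk_score(draws, "quantile", q=0.5)         # median of the draws
```

Larger lam (or smaller q) makes the ranking more conservative: an allocation with a slightly lower mean but a tighter posterior can outrank a higher-mean, higher-variance one.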

Coefficient extraction

BLM and pooled models

Coefficient draws are extracted via get_posterior() and indexed by the scenario’s channel terms.

Hierarchical models

For hierarchical MCMC models, the population-level (fixed-effect) beta draws are extracted directly from the Stan posterior. If the model was fitted with scale=TRUE, draws are back-transformed to the original scale before optimisation. This ensures that optimisation operates on the population effect rather than group-specific random-effect totals.

Draw thinning

If max_draws is specified, a random subsample of posterior draws is used for computational efficiency. The subsampling uses the configured seed for reproducibility.

Response-scale handling

Budget optimisation handles both identity and log response scales:

  • Identity response: $\Delta\text{KPI}$ is the difference in linear-predictor draws between candidate and reference allocations.
  • Log response: $\Delta\text{KPI}$ is computed via kpi_delta_from_link_levels(), which correctly accounts for the exponential back-transformation. If kpi_baseline is available, the delta is expressed in absolute KPI units; otherwise, it is expressed as a relative change.

The delta_kpi_from_link() and kpi_delta_from_link_levels() functions ensure Jensen-safe conversions by operating draw-wise.
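The draw-wise requirement follows from Jensen's inequality: exponentiating an averaged linear predictor is not the same as averaging exponentiated draws. A sketch of the log-scale delta (illustrative Python; the exact internals of kpi_delta_from_link_levels() may differ):

```python
import math

def kpi_delta_log_scale(link_cand, link_ref, kpi_baseline=None):
    """Per-draw KPI delta for a log-response model.

    link_cand / link_ref: linear-predictor draws (log scale) for the
    candidate and reference allocations. With a baseline the delta is in
    absolute KPI units; without one it is a relative change."""
    rel = [math.exp(c - r) - 1.0 for c, r in zip(link_cand, link_ref)]
    if kpi_baseline is None:
        return rel
    return [kpi_baseline * x for x in rel]

# A +0.1 shift on the log scale: each per-draw delta is exp(0.1) - 1, about +10.5%
kpi_delta_log_scale([0.6, 0.7], [0.5, 0.6])
```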

Feasible allocation generation

sample_feasible_allocation() generates random allocations that:

  1. Respect per-channel lower bounds.
  2. Respect per-channel upper bounds.
  3. Sum exactly to budget_total.

Allocation is performed by distributing remaining budget (after lower bounds) using exponential random weights, iteratively filling channels until the budget is exhausted. project_to_budget() ensures exact budget equality via proportional adjustment.
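The generation scheme can be sketched as follows (illustrative Python; a simplified stand-in for sample_feasible_allocation() and project_to_budget()):

```python
import random

def sample_allocation(lower, upper, total, seed=123):
    """One random allocation respecting per-channel bounds, summing to `total`."""
    assert sum(lower) <= total <= sum(upper), "budget infeasible"
    rng = random.Random(seed)
    alloc = list(lower)                        # start every channel at its floor
    w = [rng.expovariate(1.0) for _ in lower]  # exponential random weights
    remaining = total - sum(lower)
    while remaining > 1e-9:
        headroom = [u - a for u, a in zip(upper, alloc)]
        open_idx = [i for i, h in enumerate(headroom) if h > 1e-9]
        if not open_idx:
            break
        wsum = sum(w[i] for i in open_idx)
        spent = 0.0
        for i in open_idx:                     # proportional share, capped at the ceiling
            share = min(remaining * w[i] / wsum, headroom[i])
            alloc[i] += share
            spent += share
        remaining -= spent
    return alloc

a = sample_allocation([0.0, 10.0], [80.0, 60.0], 100.0)
round(sum(a), 6)  # -> 100.0
```

Before any bounds bite, normalised exponential weights are a uniform draw on the simplex (Dirichlet(1, …, 1)), so candidates cover the feasible region evenly; capping at upper bounds and redistributing the excess keeps every candidate feasible.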

Output structure

optimise_budget() returns a budget_optimisation object containing:

Field Content
best_spend Named numeric vector of optimal per-channel spend
best_score Objective score of the best allocation
channel_summary Tibble with per-channel reference vs optimised spend, response, ROI, CPA, and deltas
curves List of per-channel response curve tibbles (spend grid × mean/lower/p50/upper)
points Tibble of reference and optimised points per channel with confidence intervals
impact Waterfall-style tibble of per-channel KPI contribution and interaction residual
objective_cfg Echo of the objective configuration
scenario Echo of the input scenario
model_metadata Model class, response scale, and scale flag

Runner integration

When allocation.enabled: true in YAML, the runner calls optimise_budget() after fitting and writes artefacts under 60_optimisation/:

Artefact Content
allocation_summary.csv Channel summary table
response_curves.csv Response curve data for all channels
allocation_impact.csv Waterfall impact breakdown
Plot PNGs Response curves, ROI/CPA panel, allocation waterfall, and other visual outputs

Constraints and guardrails

  • Budget feasibility: if channel lower bounds sum to more than budget_total, the engine aborts.
  • Upper bound capacity: if channel upper bounds cannot accommodate the full budget, the engine aborts.
  • Missing terms: if a scenario term is not found in the posterior coefficients, the engine aborts with a descriptive error.
  • Offset + scale combination: for bayes_lm_updater models, optimise_budget() aborts if scale=TRUE and an offset is present.

Cross-references

Plots

Purpose

This section documents every plot the DSAMbayes runner produces. Each page covers one pipeline stage, describes what the plot shows, explains when and why the runner generates it, and gives practical interpretation guidance. The target reader is a modelling operator or analyst who needs to assess run quality without reading source code.

Pipeline stages

The runner writes artefacts into timestamped directories under results/. Plots are organised into six stages, each with its own subdirectory:

Stage Directory Role Page
Pre-run 10_pre_run/ Data quality and input sanity checks before fitting Pre-run plots
Model fit 20_model_fit/ Posterior summaries and fitted-vs-observed visualisations Model fit plots
Post-run 30_post_run/ Response decomposition and predictor contributions Post-run plots
Diagnostics 40_diagnostics/ Residual analysis, boundary monitoring, posterior predictive checks Diagnostics plots
Model selection 50_model_selection/ LOO-CV diagnostics for model comparison and calibration Model selection plots
Optimisation 60_optimisation/ Budget allocation, response curves, efficiency comparisons Optimisation plots

Plot catalogue

All plots listed below map to files in docs/images/ and to actual runner output filenames.

Pre-run (10_pre_run/)

Filename Description Type
media_spend_timeseries.png Stacked area chart of channel spend over time Descriptive
kpi_media_overlay.png KPI and total media spend on dual axes Descriptive
vif_bar.png Variance inflation factor per predictor Diagnostic

Model fit (20_model_fit/)

Filename Description Type
fit_timeseries.png Observed vs fitted response over time with credible band Descriptive
fit_scatter.png Observed vs fitted scatter with 45-degree reference Descriptive
posterior_forest.png Coefficient estimates with 90% credible intervals Descriptive
prior_posterior.png Prior-to-posterior shift for media coefficients Descriptive

Post-run (30_post_run/)

Filename Description Type
decomp_predictor_impact.png Total contribution per model term (bar chart) Descriptive
decomp_timeseries.png Stacked media contribution over time Descriptive

Diagnostics (40_diagnostics/)

Filename Description Type
ppc.png Posterior predictive check fan chart Diagnostic
residuals_timeseries.png Residuals over time Diagnostic
residuals_vs_fitted.png Residuals vs fitted values Diagnostic
residuals_hist.png Residual distribution histogram Diagnostic
residuals_acf.png Residual autocorrelation function Diagnostic
residuals_latent_acf.png Latent-scale residual ACF (log-response models) Diagnostic
boundary_hits.png Share of posterior draws near coefficient boundaries Diagnostic/Gating

Model selection (50_model_selection/)

Filename Description Type
pareto_k.png Pareto-k diagnostic scatter from PSIS-LOO Diagnostic/Gating
loo_pit.png LOO probability integral transform histogram Diagnostic
elpd_influence.png Pointwise ELPD contributions over time Diagnostic

Optimisation (60_optimisation/)

Filename Description Type
budget_response_curves.png Channel response curves with current/optimised points Decision
budget_roi_cpa.png ROI or CPA comparison (current vs optimised) Decision
budget_impact.png Spend reallocation and response impact diverging bars Decision
budget_contribution.png Absolute response comparison by channel Decision
budget_confidence_comparison.png Posterior credible intervals for allocations Decision
budget_sensitivity.png Sensitivity of optimised allocation to budget changes Decision
budget_efficient_frontier.png Efficient frontier across budget levels Decision
budget_kpi_waterfall.png KPI waterfall from reference to optimised allocation Decision
budget_marginal_roi.png Marginal ROI curves at the optimised point Decision
budget_spend_share.png Spend share comparison (current vs optimised) Decision

Source code references

Plot generation is implemented across five files:

  • R/runner_fit_plots.R — pre-run, fit, diagnostics, and model selection plots
  • R/optimise_budget_plots.R — budget optimisation plots
  • R/run_artifacts.R — stage map and orchestration
  • R/run_artifacts_enrichment.R — wiring for fit-stage and pre-run plots
  • R/run_artifacts_diagnostics.R — wiring for diagnostics and model selection plots

Subsections of Plots

Pre-run Plots

Purpose

Pre-run plots are generated before the model is fitted. They visualise the input data and flag structural problems — multicollinearity, missing spend periods, implausible KPI–media relationships — that could compromise inference. Treat these as a data quality gate: review them before interpreting any downstream output.

All pre-run plots are written to 10_pre_run/ within the run directory. They require ggplot2 and are generated by write_pre_run_plots() in R/run_artifacts_enrichment.R. The runner produces them whenever an allocation.channels block is present in the configuration and the data contains the referenced spend columns.

Plot catalogue

Filename What it shows Conditions
media_spend_timeseries.png Stacked channel spend over time Allocation channels defined with valid spend_col
kpi_media_overlay.png KPI and total spend on dual axes Allocation channels defined; response variable present
vif_bar.png VIF per predictor with severity thresholds Design matrix extractable with >1 predictor and >1 row

Media spend time series

Filename: media_spend_timeseries.png

Media spend time series

What it shows

A stacked area chart of weekly media spend by channel, drawn from the raw spend_col columns declared in the allocation configuration. The x-axis is the date variable; the y-axis is spend in model units.

When it is generated

The runner generates this plot when:

  • The configuration includes an allocation.channels block.
  • At least one declared spend_col exists in the input data.

If no valid spend columns are found, the plot is silently skipped.

How to interpret it

Look for three things. First, check that each channel has plausible seasonal patterns and no unexpected gaps — zero-spend weeks in the middle of a campaign period suggest data ingestion problems. Second, verify that the relative magnitudes make sense: if TV dominates the stack but the brand has historically been digital-first, the data may be mislabelled or aggregated incorrectly. Third, confirm that the date range matches the modelling window declared in the configuration.

Warning signs

  • Flat channels: A channel with constant spend across all weeks contributes no variation and cannot be identified by the model. The coefficient will be driven entirely by the prior.
  • Sudden jumps or drops: Step changes in spend that do not correspond to known campaign events may indicate data joins across sources with different reporting conventions.
  • Missing periods: Gaps where spend drops to zero mid-series can distort adstock calculations if the model applies geometric decay.

Action

If a channel shows no variation, consider removing it from the formula or fixing the upstream data. If gaps are genuine (e.g. a seasonal channel), confirm the adstock specification handles zero-spend periods correctly.

  • data_dictionary.csv in 10_pre_run/ provides summary statistics for every input column.

KPI–media overlay

Filename: kpi_media_overlay.png

KPI–media overlay

What it shows

A dual-axis time series with the KPI response variable on the left axis (blue) and total media spend (sum of all declared spend_col values) on the right axis (red, rescaled to share the vertical space). This is a visual correlation check, not a causal claim.

When it is generated

The runner generates this plot when:

  • The configuration includes an allocation.channels block with at least one valid spend_col.
  • The response variable exists in the data.

If the total spend has zero variance, the plot is skipped.

How to interpret it

The overlay reveals whether KPI and aggregate spend move together over time. A rough co-movement is expected in MMM data — media drives response — but the relationship need not be tight. Seasonal KPI peaks that precede or lag media bursts suggest confounding (e.g. demand-driven spend timing). Divergences where spend rises but KPI falls (or vice versa) are worth investigating: they may reflect diminishing returns, competitor activity, or a structural break in the data.

Warning signs

  • Perfect alignment: If the two series track each other almost exactly, the model may be fitting spend timing rather than incremental media effects.
  • Opposite trends: A persistent negative relationship between total spend and KPI suggests reverse causality or omitted-variable bias.
  • Scale artefacts: The dual-axis rescaling can exaggerate or suppress visual correlation. Do not draw quantitative conclusions from this plot.

Action

Use this plot as a sanity check only. If the relationship looks implausible, investigate the data and consider whether the formula includes adequate controls for seasonality, trend, and external factors.


Variance inflation factor (VIF) bar chart

Filename: vif_bar.png

VIF bar chart

What it shows

A horizontal bar chart of variance inflation factors for each predictor in the model’s design matrix. Bars are colour-coded by severity: green (VIF < 5), amber (5 ≤ VIF < 10), and red (VIF ≥ 10). Dashed vertical lines mark the 5 and 10 thresholds.

When it is generated

The runner generates this plot when:

  • The design matrix has more than one predictor column and more than one row.
  • The VIF computation does not encounter a singular or degenerate correlation matrix.

For pooled models, the design matrix extraction may return zero rows, in which case the plot is skipped.

How to interpret it

VIF measures how much the variance of a coefficient estimate inflates due to correlation with other predictors. A VIF of 1 means no multicollinearity; a VIF of 10 means the standard error is roughly three times larger than it would be with orthogonal predictors. In Bayesian MMM, high VIF does not break inference the way it does in OLS — priors regularise the estimates — but it does reduce the data’s ability to inform the posterior, making results more prior-dependent.
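Formally, VIF_j = 1 / (1 − R²_j), where R²_j comes from regressing predictor j on the remaining predictors; equivalently, the j-th diagonal entry of the inverse correlation matrix. A sketch (illustrative Python; the runner computes this in R):

```python
import numpy as np

def vif(X):
    """VIF per column: the diagonal of the inverse correlation matrix."""
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))

# Column 2 nearly duplicates column 0, so both show inflated VIFs:
rng = np.random.default_rng(1)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, b, a + 0.1 * rng.normal(size=200)])
vif(X)  # roughly [large, ~1, large]
```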

Warning signs

  • VIF > 10 on media channels: The model cannot reliably separate the effects of those channels. Posterior estimates will lean heavily on the prior. Consider whether the channels can be combined or whether one should be dropped.
  • VIF > 10 on seasonality terms: Common and usually harmless if the terms are included as controls rather than as interpretive outputs.
  • All terms moderate or high: The overall collinearity structure may be too severe for the data length. Consider increasing the sample size or simplifying the formula.

Action

Review the top-VIF terms. If two media channels are highly collinear (e.g. search and affiliate), consider whether they can be meaningfully separated given the available data. If not, combine them or use informative priors to anchor the split.

  • design_matrix_manifest.csv in 10_pre_run/ lists all design matrix columns with variance and uniqueness statistics.
  • spec_summary.csv in 10_pre_run/ summarises the model specification.

Cross-references

Model Fit Plots

Purpose

Model fit plots summarise the posterior and compare fitted values against observed data. They answer two questions: does the model track the response variable adequately, and are the estimated coefficients plausible? These plots are written to 20_model_fit/ within the run directory.

The runner generates them via write_model_fit_plots() in R/run_artifacts_enrichment.R. All four plots require ggplot2 and the fitted model object. Each is wrapped in tryCatch so that a failure in one does not prevent the others from being written.

Plot catalogue

Filename What it shows Conditions
fit_timeseries.png Observed vs fitted over time with 95% credible band Always generated after a successful fit
fit_scatter.png Observed vs fitted scatter Always generated after a successful fit
posterior_forest.png Coefficient point estimates with 90% CIs Posterior draws available via get_posterior()
prior_posterior.png Prior-to-posterior density shift for media terms Model has a .prior table with media (m_*) parameters

Fit time series

Filename: fit_timeseries.png

Fit time series

What it shows

The observed KPI (orange) and posterior mean fitted values (blue) plotted over time, with a shaded 95% credible interval band. The subtitle reports in-sample fit metrics: R², RMSE, MAE, mean error (bias), sMAPE, 95% prediction interval coverage, lag-1 ACF of residuals, and sample size. For hierarchical models the plot facets by group.

When it is generated

Always, provided the model has been fitted successfully and the fit table (observed, mean, percentiles) can be computed.

How to interpret it

The fitted line should track the general level and seasonal pattern of the observed series. The 95% credible band should contain most observed points — the subtitle reports the actual coverage, which should be close to 95%. Systematic departures reveal model misspecification: if the fitted line consistently overshoots during holidays or undershoots during quiet periods, the formula may lack appropriate seasonal or event terms.

Warning signs

  • Coverage well below 95%: The model underestimates uncertainty. Common when the noise prior is too tight or the model is overfit to a subset of the data.
  • Coverage well above 95%: The credible interval is too wide. The model is underfit or the noise prior is too diffuse.
  • Persistent bias (ME far from zero): The model systematically over- or under-predicts. Check for missing structural terms (trend, level shifts, intercept misspecification).
  • High lag-1 ACF (> 0.3): Residuals are autocorrelated. The model is missing temporal structure — consider adding lagged terms or checking adstock specifications.

Action

If coverage or bias is unacceptable, revisit the formula (missing controls, wrong functional form) or the prior specification (overly tight noise SD). Cross-reference with the residuals diagnostics for a more detailed picture.

  • fit_metrics_by_group.csv in 20_model_fit/ provides the same metrics in tabular form, broken down by group for hierarchical models.

Fit scatter

Filename: fit_scatter.png

Fit scatter

What it shows

A scatter plot of observed values (y-axis) against posterior mean fitted values (x-axis), with a 45-degree reference line. Points on the line indicate perfect fit. For hierarchical models the plot facets by group.

When it is generated

Always, provided the fit table is available.

How to interpret it

Points should cluster tightly around the diagonal. Curvature away from the line suggests a systematic misfit — for instance, if the model underpredicts at high KPI values, the response may need a nonlinear term or a log transformation. Outliers far from the line warrant investigation: they may correspond to anomalous weeks (data errors, one-off events) that the model cannot capture.

Warning signs

  • Fan shape (wider scatter at higher values): Heteroscedasticity. A log-scale model or a variance-stabilising transform may be more appropriate.
  • Systematic curvature: The mean function is misspecified. Consider adding polynomial or interaction terms.
  • Isolated outliers: Check the dates of extreme residuals against the residuals time series and the input data for data quality issues.

Action

If the scatter reveals non-constant variance, consider fitting on the log scale (model.scale or a log-transformed formula). If curvature is evident, review the functional form of media transforms and control variables.


Posterior forest plot

Filename: posterior_forest.png

Posterior forest

What it shows

A horizontal forest plot of posterior coefficient estimates. Each row is a model term (excluding the intercept). The point marks the posterior median; the horizontal bar spans the 5th to 95th percentile (90% credible interval). Terms whose interval excludes zero are drawn in colour; those consistent with zero are grey.

For hierarchical models, the plot displays population-level (group-averaged) estimates.

When it is generated

The runner generates this plot when posterior draws are available via get_posterior(). It is skipped if the posterior extraction fails.

How to interpret it

Focus on the media coefficients. Positive values indicate that higher media exposure is associated with higher KPI, which is the expected direction for most channels. The width of the interval reflects estimation precision: a narrow interval means the data informed the estimate strongly; a wide interval means the prior dominates.

Terms ordered by absolute magnitude (bottom to top) give a quick ranking of effect sizes, but note that these are on the model’s internal scale. For models fitted on the log scale, coefficients represent approximate percentage effects; for levels models, they represent absolute KPI units per unit of the transformed media input.

Warning signs

  • Media coefficient crosses zero: The model cannot confidently distinguish the channel’s effect from noise. This is not necessarily wrong — some channels may genuinely have weak effects — but it warrants scrutiny, especially if the prior was informative.
  • Implausibly large coefficients: Check for scaling issues. If model.scale: true, coefficients are on the standardised scale and must be interpreted accordingly.
  • All intervals very wide: The data may not have enough variation to identify individual effects. Review the VIF bar chart for multicollinearity.

Action

If a media coefficient is unexpectedly negative, investigate whether the data supports it (e.g. counter-cyclical spend) or whether multicollinearity is pulling the estimate. Cross-reference with the prior vs posterior plot to see how far the data moved the estimate from its prior.


Prior vs posterior

Filename: prior_posterior.png

Prior vs posterior

What it shows

Faceted density plots for each media coefficient (m_* parameters). The grey distribution is the prior (Normal, as specified in the model’s .prior table); the blue distribution is the posterior (estimated from MCMC draws). Overlap indicates that the data did not strongly inform the estimate; separation indicates data-driven updating.

For hierarchical models, posterior draws are averaged across groups to show the population-level density.

When it is generated

The runner generates this plot when:

  • The model has a .prior table (i.e. it is a requires_prior model).
  • The prior table contains at least one m_* parameter.
  • Posterior draws are available.

If the model has no prior table (e.g. a pure OLS updater), the plot is skipped.

How to interpret it

A well-identified coefficient shifts noticeably from prior to posterior. If the two densities sit on top of each other, the data provided little information for that channel — the estimate is prior-driven. This is not inherently wrong (the prior may be well-calibrated from previous studies), but it does mean the current dataset alone cannot validate the estimate.
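One way to quantify what the plot shows visually is the overlap coefficient — the integral of the pointwise minimum of the two densities (1 means identical distributions, 0 means disjoint). This metric is not part of DSAMbayes; it is a supplementary check one can run on the draws. A minimal sketch with base R density estimates (the draws are simulated for illustration):

```r
set.seed(1)
# Hypothetical prior and posterior draws for one m_* coefficient
prior_draws     <- rnorm(4000, mean = 0, sd = 1)
posterior_draws <- rnorm(4000, mean = 0.6, sd = 0.2)

# Evaluate both densities on a common grid and integrate the
# pointwise minimum (Riemann sum).
grid  <- seq(-4, 4, length.out = 512)
d_pri <- approx(density(prior_draws), xout = grid)$y
d_pos <- approx(density(posterior_draws), xout = grid)$y
d_pri[is.na(d_pri)] <- 0
d_pos[is.na(d_pos)] <- 0
overlap <- sum(pmin(d_pri, d_pos)) * diff(grid)[1]
```

A low overlap indicates strong data-driven updating; an overlap near 1 means the estimate is essentially prior-driven.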

Warning signs

  • No shift at all: The channel has insufficient variation or is too collinear with other terms for the data to update the prior. The resulting coefficient is essentially assumed, not estimated.
  • Posterior much narrower than prior: Expected and healthy. The data concentrated the estimate.
  • Posterior shifted to the boundary: If a boundary constraint is active (e.g. non-negativity), the posterior may pile up at zero. Cross-reference with the boundary hits plot to confirm.

Action

If key media channels show no prior-to-posterior shift, consider whether the prior is appropriate, whether the data period is long enough, or whether multicollinearity prevents identification. For channels where the prior dominates, document this clearly when reporting ROAS or contribution estimates — the output reflects an assumption, not a data-driven finding.


Cross-references

Post-run Plots

Purpose

Post-run plots decompose the fitted response into its constituent parts. They answer the question: how much does each predictor contribute to the modelled KPI, and how do those contributions evolve over time? These plots are written to 30_post_run/ within the run directory.

The runner generates them via write_response_decomposition_artifacts() in R/run_artifacts_enrichment.R, which calls runner_response_decomposition_tables() to compute per-term contributions from the design matrix and posterior coefficient estimates. For hierarchical models with random-effects formula syntax (|), the decomposition may fail gracefully — the runner logs a warning and continues to downstream stages.

Plot catalogue

Filename What it shows Conditions
decomp_predictor_impact.png Total contribution per model term (bar chart) Decomposition tables computed successfully
decomp_timeseries.png Stacked media channel contribution over time Decomposition tables computed successfully; media terms present

Predictor impact

Filename: decomp_predictor_impact.png

[Image: Predictor impact]

What it shows

A horizontal bar chart of the total contribution of each model term to the response, computed as the sum of coefficient × design-matrix column across all observations. Terms are sorted by absolute contribution magnitude. The intercept and total rows are excluded.
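The computation reduces to multiplying each design-matrix column by its coefficient and summing over observations. A minimal sketch with a toy lm() fit — the actual runner delegates this to runner_response_decomposition_tables(), and the data here are simulated:

```r
# Illustrative data: a KPI driven by one media and one control term
set.seed(42)
dat <- data.frame(
  tv    = runif(52, 0, 100),
  price = rnorm(52, 10, 1)
)
dat$kpi <- 50 + 0.8 * dat$tv - 2 * dat$price + rnorm(52, 0, 3)

fit <- lm(kpi ~ tv + price, data = dat)

# Per-term contribution: coefficient x design-matrix column,
# summed over observations; intercept excluded as in the plot.
X       <- stats::model.matrix(fit)
contrib <- colSums(sweep(X, 2, coef(fit), `*`))
contrib <- contrib[setdiff(names(contrib), "(Intercept)")]
contrib[order(-abs(contrib))]  # ranked by absolute magnitude
```
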

When it is generated

The runner generates this plot when runner_response_decomposition_tables() returns a valid predictor-level summary table. This requires that stats::model.matrix() can parse the model formula against the original input data — a condition that holds for BLM and pooled models but may fail for hierarchical models whose formulas contain random-effects syntax.

How to interpret it

The bar lengths represent total modelled impact over the data period. Media channels with large positive bars drove the most KPI in the model’s account of the data. Control variables (trend, seasonality, holidays) often dominate in absolute terms because they capture baseline demand — this is expected and does not diminish the media findings.

Negative contributions can arise for terms with negative coefficients (e.g. price sensitivity) or for seasonality harmonics where the net effect over the year partially cancels.

Warning signs

  • A media channel with negative total contribution: Unless the coefficient is intentionally unconstrained (no lower boundary at zero), a negative contribution suggests the model is absorbing noise or confounding through that channel. Review the posterior forest plot and check whether the coefficient’s credible interval excludes zero.
  • Intercept-dominated decomposition (not shown here, but visible in the CSV): If the intercept accounts for >90% of the total, media effects are negligible relative to baseline demand. This may be correct, but it limits the utility of the model for budget allocation.
  • Missing plot: If the decomposition failed (logged as a warning), the model type likely does not support direct model.matrix() decomposition. The CSV companions will also be absent.

Action

Use this plot to prioritise which channels to scrutinise. Cross-reference large contributors with the prior vs posterior plot to confirm they are data-driven rather than prior-driven.

  • decomp_predictor_impact.csv in 30_post_run/ contains the same data in tabular form.
  • posterior_summary.csv in 30_post_run/ provides the coefficient summary underlying the decomposition.

Decomposition time series

Filename: decomp_timeseries.png

[Image: Decomposition time series]

What it shows

A stacked area chart of media channel contributions over time. Each layer represents one media term’s weekly contribution (coefficient × transformed media input). Non-media terms (intercept, controls, seasonality) are excluded to focus the view on the media mix.

When it is generated

The runner generates this plot alongside the predictor impact chart, provided the decomposition tables include at least one media term.

How to interpret it

The height of each band at a given week represents how much that channel contributed to the modelled response. Seasonal patterns in the stack reflect campaign timing and adstock carry-over. The total height of the stack is the aggregate media contribution — the gap between this and the observed KPI is accounted for by non-media terms and noise.

Warning signs

  • A channel with near-zero contribution throughout: The model assigns negligible effect to that channel. This could be correct (low spend, weak signal) or a sign that multicollinearity is suppressing the estimate.
  • Implausibly large single-channel dominance: If one channel accounts for the vast majority of the media stack, verify the coefficient is plausible and not inflated by collinearity with a correlated channel.
  • Abrupt jumps unrelated to spend changes: Check whether the design matrix term (adstock/saturation output) is well-behaved. Sudden spikes in contribution without corresponding spend changes suggest a data or transform issue.

Action

Compare the relative channel contributions here with the business’s spend allocation. Channels that receive large spend but show small contributions may have diminishing returns or weak effects. This comparison motivates the budget optimisation stage.

  • decomp_timeseries.csv in 30_post_run/ contains the weekly decomposition in long format.

Cross-references

Diagnostics Plots

Purpose

Diagnostics plots assess whether the fitted model’s assumptions hold and whether any structural problems warrant remedial action. They cover residual behaviour, posterior predictive adequacy, and boundary constraint monitoring. These plots are written to 40_diagnostics/ within the run directory.

The runner generates residual plots via write_residual_diagnostics() in R/run_artifacts_diagnostics.R, the PPC plot via write_model_fit_plots() in R/run_artifacts_enrichment.R, and the boundary hits plot via write_boundary_diagnostics() in R/run_artifacts_diagnostics.R. Each plot is wrapped in tryCatch so that individual failures do not block the remaining outputs.

Plot catalogue

Filename What it shows Conditions
ppc.png Posterior predictive check fan chart Posterior draws (yhat) extractable from fitted model
residuals_timeseries.png Residuals over time Fit table available
residuals_vs_fitted.png Residuals vs fitted values Fit table available
residuals_hist.png Residual distribution histogram Fit table available
residuals_acf.png Residual autocorrelation function Fit table available
residuals_latent_acf.png Latent-scale residual ACF Model uses log-scale response (response_scale != "identity")
boundary_hits.png Posterior draw proximity to coefficient bounds Boundary hit rates computable from posterior and bound specifications

Posterior predictive check (PPC)

Filename: ppc.png

[Image: Posterior predictive check]

What it shows

A fan chart of posterior predictive draws overlaid with observed data. The blue line is the posterior mean of the predicted response; the dark band spans the 25th–75th percentile (50% CI) and the light band spans the 5th–95th percentile (90% CI). Red dots mark observed values.

When it is generated

The runner generates this plot whenever posterior predictive draws (yhat) can be extracted from the fitted model via runner_yhat_draws(). This works for BLM, hierarchical, and pooled models fitted with MCMC.

How to interpret it

Well-calibrated models produce bands that contain roughly 50% and 90% of observed points in the respective intervals. The key diagnostic is whether observed values fall systematically outside the bands during specific periods — this reveals time-localised misfit that aggregate metrics like RMSE can mask.
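The bands can be reproduced from the draw matrix with per-column quantiles. A minimal sketch assuming a draws × observations matrix of the shape runner_yhat_draws() would return (the data here are simulated):

```r
set.seed(7)
# Hypothetical posterior predictive draws: 1000 draws x 52 weeks
yhat <- matrix(rnorm(1000 * 52, mean = 100, sd = 10), nrow = 1000)

# Per-week quantiles give the fan: 50% band (25th-75th) and
# 90% band (5th-95th) around the posterior mean.
bands    <- apply(yhat, 2, quantile, probs = c(0.05, 0.25, 0.75, 0.95))
ppc_mean <- colMeans(yhat)

# Empirical coverage check against observed values
y_obs <- rnorm(52, 100, 10)
in_90 <- mean(y_obs >= bands["5%", ] & y_obs <= bands["95%", ])
```

Under good calibration, in_90 should be close to 0.9 (with sampling noise for short series).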

Warning signs

  • Observed points consistently outside the 90% band: The model underestimates uncertainty or misses a structural feature (holiday, promotion, regime change).
  • Bands that widen dramatically in specific periods: The model is uncertain about those periods, possibly because the training data lacks similar observations.
  • Bands that are uniformly very wide: The noise prior may be too diffuse, or the model has too many weakly identified parameters.

Action

If the PPC reveals localised misfit, check whether the affected periods correspond to missing control variables (holidays, events). If the bands are too wide overall, consider tightening the noise prior or simplifying the formula. Cross-reference with the LOO-PIT histogram for an aggregate calibration assessment.


Residuals over time

Filename: residuals_timeseries.png

[Image: Residuals time series]

What it shows

A line chart of residuals (observed minus posterior mean) over time. A horizontal reference line at zero marks perfect fit. For hierarchical models, the plot facets by group.

When it is generated

Always, provided the fit table is available.

How to interpret it

Residuals should scatter randomly around zero with no discernible trend or seasonal pattern. Any structure in the residuals indicates that the model has failed to capture a systematic component of the data.

Warning signs

  • Trend in residuals: The model’s trend specification is inadequate. Consider adding a higher-order polynomial or a structural-break term.
  • Seasonal oscillation: The Fourier harmonics or holiday dummies are insufficient. Add more harmonics or specific event indicators.
  • Clusters of large residuals: Localised misfit — check corresponding dates for data anomalies.

Action

Residual structure that persists across multiple weeks warrants a formula revision. Short isolated spikes are often data outliers and may not require model changes.


Residuals vs fitted

Filename: residuals_vs_fitted.png

[Image: Residuals vs fitted]

What it shows

A scatter plot of residuals (y-axis) against posterior mean fitted values (x-axis), with a horizontal reference at zero. For hierarchical models, the plot facets by group.

When it is generated

Always, provided the fit table is available.

How to interpret it

The scatter should form a horizontal band centred on zero with roughly constant vertical spread across the fitted-value range. Patterns in this plot diagnose specific model violations.

Warning signs

  • Funnel shape (wider spread at higher fitted values): Heteroscedasticity. A log-scale model would be more appropriate.
  • Curvature: The mean function is misspecified. The model under- or over-predicts at the extremes.
  • Discrete clusters: May indicate grouping structure that the model does not account for.

Action

Heteroscedasticity in a levels model is the most common finding. If the funnel pattern is pronounced, re-fit on the log scale and compare diagnostics. Cross-reference with the fit scatter plot which shows the same information from a different angle.


Residual distribution

Filename: residuals_hist.png

[Image: Residual histogram]

What it shows

A histogram of residuals across all observations (40 bins). For hierarchical models with six or fewer groups, the histogram facets by group.

When it is generated

Always, provided the fit table is available.

How to interpret it

The distribution should be approximately symmetric and unimodal if the Normal noise assumption holds. Heavy tails or skewness indicate departures from normality.

Warning signs

  • Strong right skew: Common in levels models when the response is strictly positive and has occasional large values. A log transform may help.
  • Bimodality: Suggests a mixture or an omitted grouping variable. Check whether the data contains distinct regimes.
  • Extreme outliers: Individual residuals several standard deviations from the mean warrant data inspection.

Action

Moderate departures from normality in the residuals are tolerable in Bayesian inference — the posterior is still valid if the model is otherwise well-specified. Severe skewness or heavy tails, however, can distort credible intervals and predictive coverage. Consider robust likelihood specifications or transformations.


Residual autocorrelation (ACF)

Filename: residuals_acf.png

[Image: Residual ACF]

What it shows

A bar chart of the sample autocorrelation function of residuals, computed up to lag 26 (roughly half a year of weekly data). Red dashed lines mark the 95% significance bounds (±1.96/√n). For hierarchical models, the plot facets by group.

When it is generated

Always, provided the fit table is available.

How to interpret it

Bars within the significance bounds indicate no serial correlation at that lag. Significant autocorrelation — especially at low lags (1–4 weeks) — means the model misses short-run temporal dependence. Significant spikes at lag 52 (if the series is long enough) suggest residual annual seasonality.
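The same quantities are straightforward to recompute from a residual series with stats::acf(). A minimal sketch on simulated residuals with mild lag-1 autocorrelation:

```r
set.seed(3)
# Hypothetical residual series (3 years of weekly data) with AR(1) structure
res <- as.numeric(arima.sim(list(ar = 0.4), n = 156))

a     <- stats::acf(res, lag.max = 26, plot = FALSE)
bound <- 1.96 / sqrt(length(res))  # 95% significance bound, ~0.157 for n = 156
lags  <- drop(a$acf)[-1]           # drop lag 0 (always 1)
sig   <- which(abs(lags) > bound)  # lags exceeding the bound
```

With a true AR coefficient of 0.4, lag 1 lands well above the bound — the pattern the warning signs above describe.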

Warning signs

  • Lag-1 ACF > 0.3: Strong short-run autocorrelation. The model’s uncertainty estimates are anti-conservative (credible intervals too narrow), and coefficient estimates may be biased if lagged effects are present.
  • Decaying positive ACF: Suggests an omitted AR component or insufficient adstock decay modelling.
  • Spike at lag 52: Residual annual seasonality not captured by the Fourier terms.

Action

If lag-1 ACF is material, consider adding lagged response terms or increasing the number of Fourier harmonics. For adstock-driven channels, verify that the decay rate is not too fast (underfitting carry-over) or too slow (overfitting noise).


Latent-scale residual ACF

Filename: residuals_latent_acf.png

[Image: Latent ACF]

What it shows

The same ACF plot as above, but computed on the latent (log) scale when the model’s response scale is not identity. This is relevant for models fitted with model.scale: true or log-transformed response variables.

When it is generated

The runner generates this plot when response_scale != "identity". It is skipped for levels-scale models.

How to interpret it

Interpretation is identical to the standard ACF plot. The latent-scale version is preferred for log models because autocorrelation in the log residuals is more directly interpretable as a model adequacy check on the scale where inference is performed.

Warning signs

Same as the standard ACF. Compare both plots if both are generated — discrepancies may indicate that the log transformation introduces or removes autocorrelation artefacts.


Boundary hits

Filename: boundary_hits.png

[Image: Boundary hits]

What it shows

A horizontal chart showing, for each constrained coefficient, the share of posterior draws that fall within a tolerance of the finite lower or upper bound. Bars are colour-coded: green (0% hit rate), amber (1–10%), red (≥10%). When all hit rates are zero, the plot displays green dots with explicit “0.0%” labels.

When it is generated

The runner generates this plot when the model has finite boundary constraints set via set_boundary() and boundary hit rates can be computed from the posterior draws. It is written by write_boundary_diagnostics() in R/run_artifacts_diagnostics.R.

How to interpret it

A zero hit rate for all parameters means no posterior draws approached any boundary — the constraints are not binding and the posterior is effectively unconstrained. This is the ideal outcome.

A non-zero hit rate means the boundary is influencing the posterior shape. Moderate rates (1–10%) suggest the data mildly conflicts with the constraint; high rates (≥10%) mean the data wants the coefficient outside the allowed range and the boundary is actively truncating the posterior.
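A hit rate is simply the share of posterior draws within a tolerance of the bound. A minimal sketch for a lower bound at zero (the draws and the tolerance value are illustrative; the runner's internal tolerance may differ):

```r
set.seed(11)
# Hypothetical posterior draws for a non-negative media coefficient
# whose unconstrained estimate sits close to zero
draws <- pmax(rnorm(4000, mean = 0.05, sd = 0.05), 0)

lower <- 0
tol   <- 1e-3  # illustrative tolerance

hit_rate <- mean(draws - lower < tol)
# Classification mirroring the plot's colour coding
status <- if (hit_rate == 0) "green" else if (hit_rate < 0.10) "amber" else "red"
```

Here a sizeable fraction of draws pile up at zero, so the constraint is binding and the channel would show red.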

Warning signs

  • Hit rate ≥10% on a media coefficient: The non-negativity constraint is binding. The true effect may be zero or negative, but the boundary forces a positive estimate. This inflates the channel’s apparent contribution.
  • Hit rate ≥10% on many parameters simultaneously: The overall constraint specification may be too tight for the data. Consider widening bounds or reviewing the formula.
  • Lower-bound hits on a coefficient with strong prior mass at zero: The prior and boundary together may create a “pile-up” at the bound. The posterior is not reflecting the data faithfully.

Action

For channels with high boundary hit rates, critically assess whether the non-negativity constraint is justified by domain knowledge. If the constraint is essential (e.g. media cannot destroy demand), document that the estimate is boundary-driven. If it is not essential, consider relaxing the bound and re-fitting to see whether the unconstrained estimate is materially different.

  • boundary_hits.csv in 40_diagnostics/ provides the per-parameter hit rates in tabular form.
  • diagnostics_report.csv in 40_diagnostics/ includes a summary check for boundary binding.

Hierarchical-specific: within variation

Filename: within_variation.png (generated only for hierarchical models)

This plot shows the within-group variation ratio for each non-CRE (correlated random effects) term: Var(x − mean_g(x)) / Var(x). Low ratios indicate that most variation in a predictor is between groups rather than within groups, making it difficult to identify the coefficient from within-group variation alone. Dashed lines at 5% and 10% mark conventional concern thresholds.

This plot is generated only for hierarchical models and is not included in the standard BLM image set.
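The ratio can be computed directly in base R. A minimal sketch on a simulated three-market panel where almost all variation in the predictor is between markets rather than within them:

```r
set.seed(9)
# Hypothetical panel: a predictor observed across three markets
dat <- data.frame(
  market = rep(c("DE", "FR", "UK"), each = 52),
  x      = rep(c(10, 50, 90), each = 52) + rnorm(156, 0, 2)
)

# Within-group variation ratio: Var(x - group mean of x) / Var(x)
x_demeaned <- dat$x - ave(dat$x, dat$market)
ratio <- var(x_demeaned) / var(dat$x)
ratio < 0.05  # far below the 5% concern threshold here
```

A ratio this low means the coefficient is identified almost entirely by between-group differences, which partial pooling cannot fully separate from group intercepts.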


Cross-references

Model Selection Plots

Purpose

Model selection plots provide leave-one-out cross-validation (LOO-CV) diagnostics that assess predictive adequacy and calibration. They help answer: does the model generalise to unseen observations, and are any individual data points unduly influencing the fit? These plots are written to 50_model_selection/ within the run directory.

The runner generates them via write_model_selection_artifacts() in R/run_artifacts_diagnostics.R. LOO-CV is computed using Pareto-smoothed importance sampling (PSIS-LOO) from the loo package, which approximates exact leave-one-out predictive densities from a single MCMC fit. All three plots depend on the pointwise LOO table (loo_pointwise.csv), which contains per-observation ELPD contributions, Pareto-k diagnostics, and influence flags.

Plot catalogue

Filename What it shows Conditions
pareto_k.png Pareto-k diagnostic scatter over time Pointwise LOO table available with pareto_k column
loo_pit.png LOO-PIT calibration histogram Posterior draws (yhat) extractable from fitted model
elpd_influence.png Pointwise ELPD contributions over time Pointwise LOO table available with elpd_loo and pareto_k columns

Pareto-k diagnostic

Filename: pareto_k.png

[Image: Pareto-k diagnostic]

What it shows

A scatter plot of Pareto-k values over time, one point per observation. Points are colour-coded by severity:

  • Green (k < 0.5): PSIS approximation is reliable.
  • Amber (0.5 ≤ k < 0.7): Approximation is acceptable but warrants monitoring.
  • Red (0.7 ≤ k < 1.0): Approximation is unreliable. The observation is influential.
  • Purple (k ≥ 1.0): PSIS fails entirely. The observation dominates the posterior.

Dashed horizontal lines mark the 0.5, 0.7, and 1.0 thresholds. The legend always displays all four severity levels regardless of whether points exist in each category.
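The severity classification can be reproduced from a vector of Pareto-k values with a single cut(). A minimal sketch (the k values are illustrative):

```r
# Hypothetical Pareto-k values, one per observation
k <- c(0.12, 0.45, 0.55, 0.68, 0.74, 1.10)

# right = FALSE gives half-open intervals [a, b), matching the
# thresholds above: [<0.5), [0.5, 0.7), [0.7, 1.0), [1.0, Inf)
severity <- cut(
  k,
  breaks = c(-Inf, 0.5, 0.7, 1.0, Inf),
  labels = c("good", "ok", "bad", "very bad"),
  right  = FALSE
)
table(severity)
```
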

When it is generated

The runner generates this plot whenever the pointwise LOO table contains a pareto_k column. This requires a successful PSIS-LOO computation, which in turn requires the fitted model to produce log-likelihood values.

How to interpret it

Most points should be green. A small number of amber points is typical and does not invalidate the LOO estimate. Red and purple points identify observations where the posterior changes substantially when that observation is excluded — these are influential data points.

Influential observations concentrated in a specific time period (e.g. a cluster of red points around a holiday) suggest that the model struggles with those conditions. Isolated influential points may correspond to data anomalies or outliers.

Warning signs

  • More than 10% of points above 0.7: The overall PSIS-LOO estimate is unreliable. The loo package will issue a warning. Consider moment-matching or exact refitting for affected observations.
  • Purple points (k > 1): These observations are so influential that removing them would substantially change the posterior. Investigate whether they represent data errors, one-off events, or genuine but rare conditions.
  • Influential points at the start or end of the series: Edge effects in adstock transforms can create artificial influence at series boundaries.

Action

For isolated red/purple points, inspect the corresponding dates and data values. If they are data errors, correct the data. If they are genuine but extreme, consider whether the model’s likelihood (Normal) is appropriate — heavy-tailed alternatives (Student-t) are more robust to outliers. If influential points are numerous, the model may be misspecified more broadly: revisit the formula, priors, and functional form.

  • loo_pointwise.csv in 50_model_selection/ contains the per-observation Pareto-k, ELPD, and influence flags.
  • loo_summary.csv in 50_model_selection/ reports the aggregate ELPD with standard error.

LOO-PIT calibration histogram

Filename: loo_pit.png

[Image: LOO-PIT histogram]

What it shows

A histogram of leave-one-out probability integral transform (LOO-PIT) values across all observations. The PIT value for observation t is the proportion of posterior predictive draws that fall below the observed value: PIT_t = Pr(ŷ_t ≤ y_t | y_{-t}). The histogram uses 20 equal-width bins from 0 to 1. A dashed red horizontal line marks the expected count under a perfectly calibrated model (n/20).
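PIT values can be recomputed directly from a draws × observations matrix. A minimal sketch on simulated draws (in practice the matrix would come from runner_yhat_draws()):

```r
set.seed(5)
n_draws <- 2000
n_obs   <- 104
# Hypothetical posterior predictive draws and observations
yhat  <- matrix(rnorm(n_draws * n_obs, 100, 10), nrow = n_draws)
y_obs <- rnorm(n_obs, 100, 10)

# PIT value per observation: share of predictive draws at or below
# the observed value. Uniform PIT values indicate good calibration.
pit <- colMeans(sweep(yhat, 2, y_obs, `<=`))

hist(pit, breaks = 20)           # roughly flat under calibration
abline(h = n_obs / 20, lty = 2)  # expected count per bin
```
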

When it is generated

The runner generates this plot whenever posterior predictive draws can be extracted via runner_yhat_draws(). It does not require the pointwise LOO table — it computes PIT values directly from the posterior predictive distribution. The plot is written by write_model_fit_plots() in R/run_artifacts_enrichment.R and filed under 50_model_selection/.

How to interpret it

A well-calibrated model produces a uniform PIT distribution — all bins should be roughly equal in height, close to the dashed reference line. Departures from uniformity reveal specific calibration failures:

  • U-shape (excess mass at 0 and 1): The model is underdispersed — its predictive intervals are too narrow. Observed values fall in the tails of the predictive distribution more often than expected.
  • Inverse U-shape (excess mass in the centre): The model is overdispersed — its predictive intervals are too wide. The model is more uncertain than it needs to be.
  • Left-skewed (excess mass near 0): The model systematically overpredicts. Observed values tend to fall below the predictive distribution.
  • Right-skewed (excess mass near 1): The model systematically underpredicts.

Warning signs

  • Strong U-shape: The noise variance is underestimated or the model is missing a source of variation. This is the most concerning pattern because it means the credible intervals are anti-conservative — reported uncertainty is too low.
  • One bin dramatically taller than others: A single bin containing many more observations than expected suggests a discrete cluster of misfits. Check the dates of those observations.
  • Monotone slope: A systematic bias that the model has not captured. Check the residuals time series for trend.

Action

U-shaped PIT histograms call for wider predictive intervals: increase the noise prior, add missing covariates, or allow for heavier tails. Inverse-U patterns suggest the noise prior is too diffuse — tighten it. Skewed patterns indicate systematic bias that should be addressed through formula changes (missing controls, trend, level shifts). Cross-reference with the PPC fan chart for a visual complement.


ELPD influence plot

Filename: elpd_influence.png

[Image: ELPD influence]

What it shows

A lollipop chart of pointwise expected log predictive density (ELPD) contributions over time. Each vertical stem connects the observation’s ELPD value to zero; the dot marks the ELPD value. Blue points and stems indicate non-influential observations (Pareto-k ≤ 0.7); red indicates influential ones (Pareto-k > 0.7). Larger red dots draw attention to the problematic observations.

When it is generated

The runner generates this plot whenever the pointwise LOO table contains both elpd_loo and pareto_k columns. It is written by write_model_selection_artifacts() in R/run_artifacts_diagnostics.R, immediately after the Pareto-k scatter.

How to interpret it

ELPD values quantify each observation’s contribution to the model’s out-of-sample predictive performance. Values near zero indicate observations that the model predicts well. Large negative values indicate observations where the model assigns low predictive probability — these are the worst-predicted points.

The combination of ELPD magnitude and Pareto-k severity is informative:

  • Large negative ELPD + low k: The model predicts this observation poorly, but the PSIS estimate is reliable. The model genuinely struggles with this data point.
  • Large negative ELPD + high k: Both the prediction and the LOO approximation are unreliable. This observation is highly influential and poorly fit — it warrants the closest scrutiny.
  • Near-zero ELPD + high k: The observation is influential but well-predicted. It may be a leverage point (extreme in predictor space) that happens to lie on the fitted surface.

Warning signs

  • Cluster of large negative values in a specific period: The model systematically fails during that period. Check for missing events, structural breaks, or data quality problems.
  • Many red (influential) points with large negative ELPD: The model’s aggregate LOO estimate is unreliable, and the worst-fit observations are also the most influential. This combination makes model comparison results untrustworthy.
  • Monotone trend in ELPD values: Suggests time-varying model adequacy — the model may fit the training period well but degrade towards the edges.

Action

Investigate the dates of the worst ELPD observations. If they correspond to known anomalies (data errors, one-off events), consider excluding or down-weighting them. If they correspond to regular conditions that the model should handle, the model needs revision. Use the Pareto-k plot to confirm which observations are both poorly predicted and influential, and prioritise those for investigation.

  • loo_pointwise.csv in 50_model_selection/ contains the full pointwise table with ELPD, Pareto-k, and influence flags.
  • loo_summary.csv in 50_model_selection/ reports the aggregate ELPD estimate and standard error for model comparison.

Cross-references

Optimisation Plots

Purpose

Optimisation plots visualise the outputs of the budget allocator. They translate model estimates into actionable budget decisions by showing response curves, efficiency comparisons, and the sensitivity of recommendations to budget changes. These are decision-layer artefacts: they sit downstream of all modelling and diagnostics, and their quality depends entirely on the credibility of the upstream fit.

All optimisation plots are written to 60_optimisation/ within the run directory. The runner generates them via write_budget_optimisation_artifacts() in R/run_artifacts_enrichment.R, which calls the public plotting APIs in R/optimise_budget_plots.R. They require a successful call to optimise_budget() that produces a budget_optimisation object with a plot_data payload.

Plot catalogue

Filename What it shows Conditions
budget_response_curves.png Channel response curves with current/optimised points Optimisation completed with response curve data
budget_roi_cpa.png ROI or CPA comparison by channel Optimisation completed with ROI/CPA summary
budget_impact.png Spend reallocation and response impact (diverging bars) Optimisation completed with ROI/CPA summary
budget_contribution.png Absolute response comparison by channel Optimisation completed with ROI/CPA summary
budget_confidence_comparison.png Posterior credible intervals for current vs optimised Optimisation completed with response points
budget_sensitivity.png Total response change when each channel varies ±20% Optimisation completed with response curve data
budget_efficient_frontier.png Optimised response across budget levels Efficient frontier computed via budget_efficient_frontier()
budget_kpi_waterfall.png KPI decomposition waterfall (base + channels + controls) Waterfall data computable from model coefficients and data means
budget_marginal_roi.png Marginal ROI (or marginal response) curves by channel Optimisation completed with response curve data
budget_spend_share.png Current vs optimised spend allocation as percentage Optimisation completed with ROI/CPA summary

Response curves

Filename: budget_response_curves.png

[Image: Response curves]

What it shows

Faceted line charts of the estimated response curve for each media channel. The x-axis is raw spend (model units); the y-axis is expected response. A shaded band shows the posterior credible interval around the mean curve. Two marked points per channel indicate the current (reference) and optimised spend allocations.

The subtitle notes which media transforms were applied (e.g. Hill saturation, adstock). A caption reports the marginal response at the optimised point for each channel.

When it is generated

The runner generates this plot whenever optimise_budget() returns response curve data in the plot_data payload. This requires at least one media channel in the allocation configuration with a computable response function.

How to interpret it

The curve shape encodes diminishing returns. Steep initial slopes indicate high marginal response at low spend; flattening curves indicate saturation. The gap between the current and optimised points shows the direction of the recommended reallocation: if the optimised point sits to the right (higher spend) of the current point, the allocator recommends increasing that channel’s budget.

The credible band width reflects posterior uncertainty about the response function. Wide bands mean the shape is poorly identified — the recommendation is sensitive to modelling assumptions. Narrow bands indicate data-informed estimates.
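The diminishing-returns logic can be illustrated with a Hill saturation curve of the kind the subtitle may report. The parameter values below are illustrative, not DSAMbayes defaults:

```r
# Hypothetical Hill saturation response: r(s) = V * s^a / (K^a + s^a)
hill <- function(s, V = 1000, K = 50, a = 1.5) V * s^a / (K^a + s^a)

spend <- seq(0, 200, by = 1)
resp  <- hill(spend)

# Marginal response: numeric derivative of the curve. It shrinks
# as spend grows past the half-saturation point K.
marginal <- diff(resp) / diff(spend)
marginal[10] > marginal[150]  # much steeper at low spend than near saturation
```

An allocator that equalises marginal response across channels will move spend away from channels operating on the flat part of their curve.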

Warning signs

  • Very wide credible bands: The response curve shape is uncertain. Budget recommendations based on it carry substantial risk.
  • Optimised point near the flat part of the curve: The channel is saturated at the recommended spend. Further increases yield negligible marginal returns.
  • Current and optimised points nearly identical: The allocator found little room for improvement on that channel. The current allocation is already near-optimal (or the response function is too uncertain to justify a change).

Action

Compare the marginal response values across channels. The allocator equalises marginal response at the optimum — if marginal values differ substantially, the optimisation may have hit a constraint (spend floor/ceiling). Cross-reference with the budget sensitivity plot to assess how robust the recommendation is.

  • budget_response_curves.csv in 60_optimisation/ contains the curve data.
  • budget_response_points.csv in 60_optimisation/ contains the current and optimised point coordinates.
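The diminishing-returns shape these curves encode can be sketched with a Hill saturation function. This is an illustrative Python sketch with hypothetical parameters, not DSAMbayes internals (DSAMbayes applies its transforms in R/Stan):

```python
# Illustrative sketch (not DSAMbayes code): a Hill-type saturation curve
# showing why marginal response falls as spend grows.
def hill_response(spend, vmax=100.0, half_sat=50.0, shape=2.0):
    """Expected response under a Hill saturation transform."""
    return vmax * spend**shape / (half_sat**shape + spend**shape)

low = hill_response(10.0)    # steep region: high marginal response
mid = hill_response(50.0)    # half-saturation point: response = vmax / 2
high = hill_response(200.0)  # flat region: near vmax, little headroom left

print(round(low, 2), round(mid, 2), round(high, 2))
```

The same spend increment buys much less response in the flat region than in the steep region, which is exactly what the gap between the current and optimised points visualises.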

ROI/CPA comparison

Filename: budget_roi_cpa.png


What it shows

A grouped bar chart comparing ROI (or CPA, for subscription KPIs) by channel under the current and optimised allocations. If currency_col is defined per channel, bars show financial ROI; otherwise they show response-per-unit-spend in model units. A TOTAL bar summarises the portfolio-level metric.

The metric choice is automatic: the allocator uses ROI for revenue-type KPIs and CPA for subscription-type KPIs.
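The two metrics differ in direction (higher ROI is better; lower CPA is better). A minimal sketch of the generic definitions, not DSAMbayes's exact computation:

```python
# Illustrative definitions only; the allocator's exact computation may differ.
def roi(revenue, spend):
    """Return on investment: revenue generated per unit of spend."""
    return revenue / spend

def cpa(spend, acquisitions):
    """Cost per acquisition: spend required per subscription/conversion."""
    return spend / acquisitions

print(roi(500.0, 200.0), cpa(200.0, 40.0))  # 2.5 5.0
```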

When it is generated

The runner generates this plot whenever the optimisation result includes a roi_cpa summary table.

How to interpret it

Channels where the optimised bar exceeds the current bar gain efficiency from the reallocation. Channels where the optimised bar is lower have had spend reduced — their marginal efficiency was below the portfolio average. The TOTAL bar shows the net portfolio improvement.

Warning signs

  • Optimised ROI lower than current for most channels: The allocator redistributed spend towards higher-response channels, which may have lower per-unit efficiency but larger absolute contribution. This is not necessarily wrong — the allocator maximises total response, not per-channel ROI.
  • TOTAL bar shows negligible improvement: The current allocation is already near-optimal, or the model’s response functions are too flat to support meaningful reallocation.
  • Very large ROI values on low-spend channels: Small denominators inflate ROI. These channels may have high marginal returns at low spend but limited capacity to absorb budget.

Action

Do not interpret this plot in isolation. Cross-reference with the contribution comparison and the response curves to distinguish efficiency improvements from scale effects.

  • budget_roi_cpa.csv in 60_optimisation/ contains the per-channel ROI/CPA values.
  • budget_summary.csv in 60_optimisation/ provides the top-level allocation summary.

Allocation impact

Filename: budget_impact.png


What it shows

A horizontal diverging bar chart in two facets. The left facet shows spend reallocation (positive = increase, negative = decrease) per channel. The right facet shows the corresponding response impact. Bars are coloured green for increases and red for decreases. A TOTAL row at the bottom summarises the net change with muted styling.

Channels are sorted by response impact magnitude — the channels most affected by the reallocation appear at the top.

When it is generated

The runner generates this plot whenever the optimisation result includes a roi_cpa summary with delta_spend and delta_response columns.

How to interpret it

The spend facet shows where the allocator moves budget. The response facet shows the expected consequence. A useful pattern is a channel that receives a spend decrease (red bar, left) but shows a small response decrease (small red bar, right) — that channel was inefficient and the freed budget drives larger gains elsewhere.

Warning signs

  • Large spend increase on a channel with modest response gain: Diminishing returns may be steep. Verify against the response curve.
  • Response decreases that exceed response gains: The allocator expects a net negative outcome. This should not happen with a correctly specified max_response objective, and suggests a configuration or constraint issue.

Action

Use this chart to brief stakeholders on the “where and why” of reallocation. Pair it with the confidence comparison to communicate whether the expected gains are statistically distinguishable from zero.


Response contribution

Filename: budget_contribution.png


What it shows

A grouped bar chart comparing absolute expected response (contribution) by channel under the current and optimised allocations. Delta annotations above each pair show the change. A TOTAL bar with muted styling shows the portfolio-level gain. The subtitle reports the percentage total response gain from optimisation.

When it is generated

The runner generates this plot whenever the optimisation result includes mean_reference and mean_optimised columns in the roi_cpa summary.

How to interpret it

This chart answers the question: in absolute terms, how much more (or less) response does each channel deliver under the optimised allocation? Unlike the ROI chart, this view is not distorted by small denominators — it shows the quantity the allocator actually maximises.

Warning signs

  • Negative delta on a channel with high current contribution: The allocator is pulling spend from a channel that currently contributes a great deal. This is rational if the marginal return on that channel is below the portfolio average, but it requires careful communication to stakeholders accustomed to interpreting total contribution as “importance”.
  • TOTAL gain is small: The reallocation may not justify the operational cost of implementing it. Consider whether the confidence intervals overlap (see confidence comparison).

Action

Report the TOTAL percentage gain as the headline number. Caveat it with the credible interval width from the confidence comparison. If the gain is within posterior uncertainty, the recommendation is suggestive rather than conclusive.

  • budget_allocation.csv in 60_optimisation/ contains the per-channel spend and response values.

Confidence comparison

Filename: budget_confidence_comparison.png


What it shows

A horizontal forest plot (dodge-positioned point-and-errorbar) showing the posterior mean response and 90% credible interval for each channel under the current (grey) and optimised (red) allocations. Where a channel’s two intervals overlap, the reallocation gain for that channel may not be statistically meaningful.

When it is generated

The runner generates this plot whenever the optimisation result includes response point data with mean, lower, and upper columns for both reference and optimised allocations.

How to interpret it

Focus on channels where the optimised interval (red) does not overlap with the current interval (grey). These are the channels where the reallocation produces a distinguishable change in expected response. Overlapping intervals mean the posterior cannot confidently distinguish the two allocations — the gain exists in expectation but falls within sampling uncertainty.
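The overlap check described above is a simple interval comparison. An illustrative sketch with hypothetical interval endpoints (not DSAMbayes output):

```python
# Illustrative sketch: flag channels whose current and optimised 90% credible
# intervals overlap (hypothetical numbers, not DSAMbayes output).
def intervals_overlap(lo_a, hi_a, lo_b, hi_b):
    """True if [lo_a, hi_a] and [lo_b, hi_b] share any points."""
    return lo_a <= hi_b and lo_b <= hi_a

channels = {
    # channel: (current_lo, current_hi, optimised_lo, optimised_hi)
    "tv":     (80.0, 120.0, 95.0, 140.0),  # overlap: gain not distinguishable
    "search": (40.0,  55.0, 70.0,  90.0),  # no overlap: clear separation
}
for name, (clo, chi, olo, ohi) in channels.items():
    print(name, "overlap" if intervals_overlap(clo, chi, olo, ohi) else "separated")
```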

Warning signs

  • All intervals overlap: The data is too uncertain to support a confident reallocation recommendation. The allocator’s point estimate suggests improvement, but the posterior cannot distinguish it from noise.
  • One channel shows a clear gain while others overlap: The headline portfolio gain may be driven by a single channel. Verify that channel’s response curve and prior-posterior shift.

Action

Use this plot to calibrate the confidence of the recommendation. If intervals overlap for most channels, present the allocation as “directionally suggestive” rather than “statistically supported”. If key channels show clear separation, the recommendation is stronger.


Budget sensitivity

Filename: budget_sensitivity.png


What it shows

A spider chart (line plot) showing how total expected response changes when each channel’s spend is varied ±20% from its optimised level, while all other channels are held fixed. Steeper lines indicate channels whose budgets have the most influence on total response. A horizontal dashed line at zero marks the optimised baseline.

When it is generated

The runner generates this plot whenever the optimisation result includes response curve data. The ±20% range and 11 evaluation points per channel are defaults set in plot_budget_sensitivity().
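The perturbation grid these defaults imply can be sketched as follows. This is an illustrative Python reconstruction of the ±20% / 11-point grid, not the actual plot_budget_sensitivity() code:

```python
# Illustrative sketch: evenly spaced spend multipliers around the optimised level.
def sensitivity_grid(optimised_spend, rel_range=0.20, n_points=11):
    """Spend values from -rel_range to +rel_range around the optimum."""
    step = 2 * rel_range / (n_points - 1)
    return [optimised_spend * (1 - rel_range + i * step) for i in range(n_points)]

grid = [round(x, 1) for x in sensitivity_grid(100.0)]
print(grid[0], grid[5], grid[-1])  # 80.0 100.0 120.0
```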

How to interpret it

Channels with steep lines are the most sensitive: small deviations from their optimised spend produce large response changes. Flat lines indicate channels where modest budget deviations have little impact — the response function is either saturated (on the flat part of the curve) or nearly linear (constant marginal return).

Warning signs

  • A channel with an asymmetric slope (steep when spend is reduced, flat when increased): Cutting this channel’s spend is costly, but increasing it yields little. It is at or near its saturation point.
  • All lines nearly flat: The optimisation surface is plateau-like. The allocator’s recommendation is robust to implementation imprecision, but also implies limited upside from optimisation.
  • Lines that cross: Channels swap in relative importance at different budget perturbations. This complicates simple priority rankings.

Action

Use this chart to communicate implementation risk. If the recommended allocation is operationally difficult to achieve exactly, the sensitivity chart shows which channels require precise execution and which have margin for error.


Efficient frontier

Filename: budget_efficient_frontier.png


What it shows

A line-and-point chart of total optimised response as a function of total budget. Each point represents the optimal allocation at that budget level (expressed as a percentage of the current total budget). A red diamond marks the current budget level. The curve shows how much additional response is achievable by increasing the total budget — and the diminishing returns of doing so.

When it is generated

The runner generates this plot when budget_efficient_frontier() produces a budget_frontier object with at least two feasible points. This requires a valid optimisation result and a set of budget multipliers (configured in allocation.efficient_frontier).

How to interpret it

The frontier’s shape reveals the budget’s overall productivity. A concave curve (steep at first, then flattening) is the classic diminishing-returns shape: each additional unit of budget buys less incremental response. The gap between the current point and the curve above it shows the unrealised potential at the same budget — the difference between the current allocation and the optimal one.
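Diminishing returns along a concave frontier show up as shrinking response increments per budget step. An illustrative sketch with hypothetical frontier points (not DSAMbayes output):

```python
# Illustrative sketch: response increments along a concave frontier
# (hypothetical points, not real budget_efficient_frontier() output).
budget_multiplier = [0.6, 0.8, 1.0, 1.2, 1.4]          # share of current budget
optimised_response = [70.0, 85.0, 95.0, 101.0, 104.0]  # total response at each level

increments = [b - a for a, b in zip(optimised_response, optimised_response[1:])]
print(increments)  # each extra 20% of budget buys less: [15.0, 10.0, 6.0, 3.0]
```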

Warning signs

  • Frontier is nearly linear: Returns are approximately constant across the budget range. The model may not have enough data to identify saturation, or the budget range is too narrow to reveal it.
  • Frontier flattens early: The portfolio saturates at a budget well below the current level. The current spend may be wastefully high.
  • Only 2–3 feasible points: The optimiser could not find feasible allocations at most budget levels. Constraints may be too tight.

Action

Use the frontier to frame budget conversations. The curve shows what is achievable at each budget level. If a stakeholder proposes a budget cut, the frontier quantifies the response cost. If they propose an increase, it quantifies the expected gain. Present the frontier alongside the spend share comparison to show how the allocation shifts at each level.

  • budget_efficient_frontier.csv in 60_optimisation/ contains the frontier data.

KPI waterfall

Filename: budget_kpi_waterfall.png


What it shows

A horizontal waterfall bar chart decomposing the predicted KPI into its constituent components: base (intercept), trend, seasonality, holidays, controls, and individual media channels. Each bar shows the mean posterior coefficient multiplied by the mean predictor value — the average contribution of that component to the predicted KPI. A red TOTAL bar anchors the sum.

When it is generated

The runner generates this plot when build_kpi_waterfall_data() can extract posterior coefficients and match them to predictor means in the original data. This requires that the model’s .formula and .original_data are both accessible. For hierarchical models with random-effects syntax, the waterfall may fail gracefully and be skipped.

How to interpret it

The waterfall answers: “of the total predicted KPI, how much comes from each source?” The base (intercept) typically dominates, representing baseline demand independent of media and controls. Media channels sit at the bottom, showing their individual incremental contributions. The relative sizes of the media bars correspond to the decomposition impact chart (decomp_predictor_impact.png), but computed slightly differently (mean × mean vs sum over time).
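The mean-coefficient × mean-predictor computation can be sketched as follows. Coefficient and predictor values are hypothetical, and this is not the build_kpi_waterfall_data() implementation:

```python
# Illustrative sketch of the waterfall decomposition: each bar is the mean
# posterior coefficient times the mean predictor value (hypothetical numbers).
posterior_mean_coef = {"base": 120.0, "trend": 0.5, "tv": 0.75, "search": 1.5}
predictor_mean = {"base": 1.0, "trend": 26.0, "tv": 40.0, "search": 12.0}

contribution = {k: posterior_mean_coef[k] * predictor_mean[k] for k in posterior_mean_coef}
total = sum(contribution.values())  # the red TOTAL bar anchors this sum
print(contribution["base"], contribution["tv"], total)
```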

Warning signs

  • Negative media contributions: A channel with a negative bar reduces predicted KPI. Unless the coefficient is intentionally unconstrained, this suggests a fitting or identification problem.
  • Intercept dwarfs all other terms: The model attributes nearly all KPI to baseline demand. Media effects are marginal. This may be realistic for low-spend brands but limits the value of budget optimisation.
  • Missing plot (skipped with warning): The model type does not support direct waterfall decomposition.

Action

Use the waterfall to contextualise media contributions within the total predicted KPI. For stakeholder reporting, it provides a clear answer to “what drives our KPI?” — while emphasising that media is one factor among several.

  • budget_kpi_waterfall.csv in 60_optimisation/ contains the waterfall data.

Marginal ROI curves

Filename: budget_marginal_roi.png


What it shows

Faceted line charts of marginal ROI (or marginal response, if no currency conversion is configured) as a function of spend for each channel. The marginal value is computed as the first difference of the response curve: the additional response per additional unit of spend. Current and optimised points are marked.

When it is generated

The runner generates this plot whenever the optimisation result includes response curve data with at least two points per channel.

How to interpret it

The marginal ROI curve is the derivative of the response curve. At the optimised allocation, the allocator equalises marginal ROI across channels (subject to constraints). If one channel’s marginal ROI at the optimised point is substantially higher than another’s, a constraint (spend floor or ceiling) is preventing further reallocation.

Diminishing returns appear as a downward-sloping marginal curve: each additional unit of spend yields less incremental response than the last. Channels with steeper slopes saturate faster.
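The first-difference computation described above can be sketched directly. Curve points are hypothetical, not DSAMbayes output:

```python
# Illustrative sketch: marginal response as the first difference of a
# response curve (hypothetical curve points).
spend = [0.0, 25.0, 50.0, 75.0, 100.0]
response = [0.0, 30.0, 50.0, 62.0, 68.0]

marginal = [
    (response[i + 1] - response[i]) / (spend[i + 1] - spend[i])
    for i in range(len(spend) - 1)
]
print(marginal)  # downward-sloping: diminishing returns
```

A healthy saturating channel produces a strictly decreasing marginal series; an increasing series would be the "increasing returns" warning sign below.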

Warning signs

  • Marginal ROI near zero at the optimised point: The channel is at or near saturation. Additional spend yields negligible incremental response.
  • Marginal ROI that increases with spend: This implies increasing returns, which is unusual for media. It may indicate a response curve misspecification or insufficient data in the high-spend region.
  • Large differences in marginal ROI at the optimised points across channels: Constraints are binding. The allocator cannot equalise marginal returns because spend bounds prevent it.

Action

Use marginal ROI to identify which channels have headroom (high marginal ROI at the optimised point) and which are saturated (marginal ROI near zero). This informs not just the current allocation but also the value of relaxing spend constraints.


Spend share comparison

Filename: budget_spend_share.png


What it shows

Two horizontal stacked bars showing the percentage allocation of total budget across channels: one for the current allocation and one for the optimised allocation. Percentage labels appear within each segment (for segments ≥ 4% of total). The subtitle reports the total budget in currency or model units for both allocations.

When it is generated

The runner generates this plot whenever the optimisation result includes a roi_cpa summary with spend_reference and spend_optimised columns.

How to interpret it

This is the most intuitive optimisation output for non-technical stakeholders. It answers: “how should we split the budget?” Segments that grow from current to optimised represent channels the allocator recommends investing more in; segments that shrink represent channels to reduce.
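The percentage shares and the ≥ 4% labelling rule mentioned above can be sketched as follows (hypothetical spends; not the plot code):

```python
# Illustrative sketch: percentage spend shares with the >= 4% label rule.
spend = {"tv": 500.0, "search": 300.0, "radio": 180.0, "print": 20.0}
total = sum(spend.values())

shares = {ch: 100.0 * s / total for ch, s in spend.items()}
labels = {ch: f"{pct:.0f}%" for ch, pct in shares.items() if pct >= 4.0}
print(labels)  # 'print' at 2% falls below the labelling threshold
```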

Warning signs

  • A channel disappears (0% share) in the optimised allocation: The allocator has hit the channel’s spend floor (which may be zero). If this is unintended, raise the minimum spend constraint.
  • Allocations are nearly identical: The current mix is already near-optimal, or the model cannot distinguish channel effects well enough to justify reallocation.
  • Very small segments in both allocations: Channels with negligible spend share contribute little to the optimisation. Consider whether they should be included or grouped.

Action

Present this chart as the primary recommendation visual. Accompany it with the confidence comparison to communicate the certainty of the recommendation and the allocation impact chart to show the expected consequence.


Cross-references

How-To Guides

Purpose

Provide task-oriented recipes for common DSAMbayes operational workflows. Each guide starts from a user objective, gives minimal reproducible steps, and includes expected output artefacts and quick verification checks.

Audience

  • Users who know the concepts but need execution steps.
  • Engineers debugging run and artefact issues.

Pages

Guide Objective
Run from YAML Execute a complete runner workflow and verify staged outputs
Interpret Diagnostics Read and act on diagnostics gate results
Compare Runs Compare multiple runs and select a candidate model
Debug Run Failures Diagnose and resolve common runner failure modes

Subsections of How-To Guides

Run from YAML

Objective

Execute a complete DSAMbayes model run from a YAML configuration file and verify the staged output artefacts.

Prerequisites

  • DSAMbayes installed locally (see Install and Setup).
  • A YAML config file (see Config Schema for structure).
  • Data file(s) referenced by the config are accessible.

Steps

1. Set up the environment

mkdir -p .Rlib .cache
export R_LIBS_USER="$PWD/.Rlib"
export XDG_CACHE_HOME="$PWD/.cache"

2. Validate the configuration (dry run)

Rscript scripts/dsambayes.R validate --config config/my_config.yaml

Expected outcome:

  • Exit code 0.
  • A timestamped run directory under results/ containing 00_run_metadata/config.resolved.yaml and 00_run_metadata/session_info.txt.
  • No Stan compilation or sampling occurs.

If validation fails:

  • Check the error message for missing data paths, invalid YAML keys, or formula errors.
  • Fix the config and re-run validate before proceeding.

3. Run the model

Rscript scripts/dsambayes.R run --config config/my_config.yaml

Expected outcome:

  • Exit code 0.
  • Full staged artefact tree under the run directory.

4. Locate the run directory

The runner prints the run directory path during execution. It follows the pattern:

results/YYYYMMDD_HHMMSS_<run_label>/

5. Verify artefacts

Check that the following stage folders are populated:

Stage Folder Key files
Metadata 00_run_metadata/ config.resolved.yaml, session_info.txt
Pre-run 10_pre_run/ Media spend plots, VIF bar chart
Model fit 20_model_fit/ model.rds, fit plots
Post-run 30_post_run/ posterior_summary.csv, decomposition plots
Diagnostics 40_diagnostics/ diagnostics_report.csv, diagnostic plots
Model selection 50_model_selection/ LOO summary, Pareto-k plot (if MCMC)
Optimisation 60_optimisation/ Allocation summary, response curves (if enabled)

6. Quick verification commands

# Check diagnostics overall status
head -1 results/<run_dir>/40_diagnostics/diagnostics_report.csv

# View posterior summary
head results/<run_dir>/30_post_run/posterior_summary.csv

# Count artefact files
find results/<run_dir> -type f | wc -l

Failure handling

Symptom Likely cause Action
Exit code 1 during validate Config or data error Read error message; fix config
Exit code 1 during run Stan compilation or sampling failure Check Stan cache; increase iterations or warmup
Missing 20_model_fit/model.rds Fit did not complete Review runner log for Stan errors
Missing 40_diagnostics/ Diagnostics writer failed Check for upstream fit failures; review tryCatch messages

Interpret Diagnostics

Objective

Read and act on the diagnostics report produced by a DSAMbayes runner execution, understanding which checks matter most and what remediation steps to take.

Prerequisites

  • A completed runner execution with artefacts under 40_diagnostics/.
  • Familiarity with Diagnostics Gates definitions.

Steps

1. Open the diagnostics report

cat results/<run_dir>/40_diagnostics/diagnostics_report.csv

Each row is one diagnostic check. The key columns are:

Column What to look at
check_id Identifies the specific diagnostic
status pass, warn, fail, or skipped
value The observed metric value
threshold The threshold that was applied
message Human-readable explanation

2. Check the overall status

The overall status follows a simple rule:

  • Any fail → overall fail.
  • Any warn (no fails) → overall warn.
  • All pass → overall pass.

If the overall status is pass, no further action is required for the configured policy mode.
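The worst-status aggregation rule can be sketched as follows. This is an illustration of the rule stated above, not the DSAMbayes implementation; treating skipped checks as neutral is an assumption:

```python
# Illustrative sketch of the overall-status rule (not DSAMbayes code).
# Assumption: skipped checks do not affect the overall status.
def overall_status(check_statuses):
    actionable = [s for s in check_statuses if s != "skipped"]
    if "fail" in actionable:
        return "fail"
    if "warn" in actionable:
        return "warn"
    return "pass"

print(overall_status(["pass", "warn", "skipped"]))  # warn
print(overall_status(["pass", "warn", "fail"]))     # fail
```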

3. Triage failing checks

Focus on fail rows first, then warn rows. Use the check phase to prioritise:

Phase Priority Meaning
P0 Highest Data integrity issues — fix before interpreting model results
P1 High Sampler quality or residual issues — may affect inference reliability

4. Common diagnostics and actions

Design matrix issues (P0)

Check Symptom Action
pre_response_finite fails Non-finite values in response Clean data; remove or impute NA/Inf rows
pre_design_constants_duplicates fails Constant or duplicate columns Remove redundant terms from formula
pre_design_condition_number warns/fails High collinearity Reduce correlated predictors; simplify formula

Sampler quality (P1, MCMC only)

Check Symptom Action
sampler_rhat_max warns/fails Poor convergence Increase fit.mcmc.iter and fit.mcmc.warmup; simplify model
sampler_ess_bulk_min or sampler_ess_tail_min warns/fails Insufficient effective samples Increase iterations; check for multimodality
sampler_divergences fails Divergent transitions Increase fit.mcmc.adapt_delta (e.g. 0.95 → 0.99); consider reparameterisation
sampler_treedepth_frac warns/fails Max treedepth saturation Increase fit.mcmc.max_treedepth
sampler_ebfmi_min warns/fails Low energy diagnostic Indicates difficult posterior geometry; simplify model or increase warmup

Residual behaviour (P1)

Check Symptom Action
resid_ljung_box_p warns/fails Significant residual autocorrelation Add time controls (trend, seasonality, holidays)
resid_acf_max warns/fails High residual ACF at early lags Same as above; check for missing structural components

Boundary and variation checks (P1)

Check Symptom Action
boundary_hit_fraction warns/fails Posterior draws hitting parameter bounds Review boundary specification; widen constraints or remove unnecessary bounds
within_var_ratio warns/fails Low within-group variation (hierarchical) Check group structure; some groups may have insufficient temporal variation

Identifiability gate (P1)

Check Symptom Action
pre_identifiability_baseline_media_corr warns/fails High baseline-media correlation Add controls to separate baseline from media effects; review formula specification

5. Review diagnostic plots

Cross-reference the numeric report with visual diagnostics in 40_diagnostics/:

  • Residual diagnostics plot — check for patterns in residuals over time.
  • Boundary hits plot — identify which parameters are constrained.
  • Latent residual ACF plot — confirm autocorrelation structure.

See Diagnostics Plots for interpretation guidance.

6. Decide on next steps

Overall status Action
pass Proceed to post-run analysis and reporting
warn Review warnings; proceed if acceptable for the use case
fail Remediate failing checks before using model results for decisions

7. Change policy mode if appropriate

If you are in early model development, consider switching to explore mode to relax thresholds:

diagnostics:
  policy_mode: explore

For production or audit runs, use publish (default) or strict.

Compare Runs

Objective

Compare multiple DSAMbayes runner executions and select a candidate model for reporting or decision-making, using predictive scoring and diagnostic summaries.

Prerequisites

  • Two or more completed runner executions (MCMC fit method).
  • Artefacts under 50_model_selection/ for each run (LOO summary, ELPD outputs).
  • Familiarity with Diagnostics Gates and Model Selection Plots.

Steps

1. Collect run directories

Identify the run directories to compare:

results/20260228_083808_blm_synth_kpi_os_hfb01/
results/20260228_084410_blm_synth_kpi_os_hfb01/
results/20260228_084602_blm_synth_kpi_os_hfb01/

2. Compare ELPD scores

The compare_runs() helper ranks runs by expected log predictive density (ELPD):

library(DSAMbayes)
comparison <- compare_runs(
  run_dirs = c(
    "results/20260228_083808_blm_synth_kpi_os_hfb01",
    "results/20260228_084410_blm_synth_kpi_os_hfb01"
  )
)
print(comparison)

The output ranks runs by ELPD (higher is better) and reports Pareto-k diagnostics.

3. Check Pareto-k reliability

Examine the loo_summary.csv in each run’s 50_model_selection/ folder:

cat results/<run_dir>/50_model_selection/loo_summary.csv

Key metrics:

Metric Interpretation
elpd_loo Expected log predictive density; higher is better
p_loo Effective number of parameters
looic LOO information criterion; lower is better
Pareto-k counts Observations with k > 0.7 indicate unreliable LOO estimates

If many observations have high Pareto-k values, the LOO approximation is unreliable for that run. Consider time-series cross-validation as an alternative.
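Counting the problematic observations is a one-liner once the k values are extracted. An illustrative sketch with hypothetical k values, not a real loo_summary.csv:

```python
# Illustrative sketch: count observations with unreliable Pareto-k values
# (hypothetical k values; the 0.7 threshold follows the table above).
pareto_k = [0.12, 0.45, 0.68, 0.73, 0.91, 0.30]

high_k = [k for k in pareto_k if k > 0.7]
print(len(high_k), "observations with k > 0.7")
```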

4. Review time-series CV (if available)

If diagnostics.time_series_selection.enabled: true was configured, check:

cat results/<run_dir>/50_model_selection/tscv_summary.csv

This provides expanding-window blocked CV scores (holdout ELPD, RMSE, SMAPE) that are more appropriate for time-series data than standard LOO.

5. Cross-reference diagnostics

For each candidate run, check the diagnostics overall status:

head -1 results/<run_dir>/40_diagnostics/diagnostics_report.csv

A model with better ELPD but failing diagnostics should not be preferred over a model with slightly lower ELPD and passing diagnostics.

6. Compare fit quality visually

Review the fit time series and scatter plots in 20_model_fit/ for each run:

  • Fit time series — does the model track the observed KPI?
  • Fit scatter — is the predicted-vs-observed relationship close to the diagonal?
  • Posterior forest — are coefficient estimates reasonable and well-identified?

7. Selection decision matrix

Criterion Weight Run A Run B
ELPD (higher is better) High value value
Pareto-k reliability (fewer high-k) High value value
Diagnostics overall status High pass/warn/fail pass/warn/fail
TSCV holdout RMSE (if available) Medium value value
Coefficient plausibility Medium judgement judgement
Fit visual quality Low judgement judgement

8. Record the selection

Document the selected run directory and rationale. If using the runner for release evidence, the selected run’s artefacts form part of the evidence pack.

Caveats

  • ELPD is not causal validation. Predictive scoring estimates out-of-sample predictive accuracy, not whether the model identifies causal media effects correctly.
  • Pooled models do not support time-series CV (rejected by config validation).
  • MAP-fitted models do not produce LOO diagnostics. Use MCMC for model comparison.

Debug Run Failures

Objective

Diagnose and resolve the most common failure modes encountered when running DSAMbayes via the YAML/CLI runner.

Prerequisites

  • A failed runner execution (non-zero exit code or missing artefacts).
  • Access to the terminal output or log from the failed run.
  • Familiarity with CLI Usage and Config Schema.

Triage by failure stage

Stage 0: Config resolution failures

Symptoms: runner exits immediately after validate or at the start of run; no run directory created or only 00_run_metadata/ is present.

Error pattern Cause Fix
data_path not found Data file path is wrong or missing Check data.path in YAML; use absolute path or path relative to config file
Unknown YAML key Typo or unsupported config key Compare against Config Schema; fix spelling
Formula parse error Invalid R formula syntax Check model.formula for unmatched parentheses, missing ~, or invalid operators
holidays.calendar_path not found Holiday calendar file missing Check time_components.holidays.calendar_path; ensure file exists

Quick check:

Rscript scripts/dsambayes.R validate --config config/my_config.yaml

If validate passes, the config is structurally valid.

Stage 1: Stan compilation failures

Symptoms: runner reports compilation errors after “Compiling model”; may reference C++ or Stan syntax errors.

Error pattern Cause Fix
Stan syntax error in generated template Template rendering issue Clear the Stan cache (rm -rf .cache/dsambayes/) and retry
C++ compiler not found Toolchain not installed Install a C++ toolchain (see Install and Setup)
Permission denied on cache directory Cache path not writable Set XDG_CACHE_HOME to a writable directory

Quick check:

mkdir -p .cache
export XDG_CACHE_HOME="$PWD/.cache"

Stage 2: Data preparation failures

Symptoms: runner fails after compilation but before sampling; error messages reference prep_data_for_fit, model.frame, or scaling.

Error pattern Cause Fix
“Cannot scale model data with zero variance” A column in the model frame is constant Remove constant terms from formula, or set model.scale: false
“Constant CRE mean terms” CRE variable has identical group means Use model.type: re (without CRE) or add variation
“non-finite values” in model frame NA or Inf values in data Clean data before running; remove rows with missing values
“Offset vector length does not match” NA handling created length mismatch Ensure offset column has no NA values, or report as a bug
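The zero-variance failure in the table above can be anticipated with a simple pre-check on the data. An illustrative sketch (DSAMbayes performs its own validation in prep_data_for_fit(); this is not that code):

```python
# Illustrative sketch: spot constant columns that would break scaling
# (hypothetical data, not a DSAMbayes model frame).
columns = {
    "tv_spend": [10.0, 25.0, 40.0, 5.0],
    "region_flag": [1.0, 1.0, 1.0, 1.0],  # constant: cannot be scaled
}

constant = [name for name, vals in columns.items() if len(set(vals)) <= 1]
print(constant)  # columns to drop from the formula before fitting
```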

Stage 3: Sampling failures

Symptoms: runner fails during rstan::sampling() or rstan::optimizing(); may report Stan runtime errors.

Error pattern Cause Fix
“Exception: validate transformed params” Parameter hits boundary during sampling Widen boundaries; check for overly tight constraints
“Initialization failed” Poor initial values Increase fit.mcmc.init range or simplify model
Timeout or very slow sampling Model too complex for data size Reduce iterations for initial testing; simplify formula
All chains fail Severe model misspecification Review formula, priors, and data for fundamental issues

Stage 4: Post-fit artefact failures

Symptoms: runner completes sampling but some artefact folders are empty or missing files.

Error pattern Cause Fix
Missing 30_post_run/ files Decomposition failed Check formula compatibility with model.matrix(); hierarchical formulas with random-effects bar syntax may not decompose directly
Missing 40_diagnostics/ files Diagnostics writer error Check for upstream issues in model object; review tryCatch messages in log
Missing 50_model_selection/ files LOO computation failed Ensure MCMC fit (not MAP); check for valid posterior
Missing 60_optimisation/ files Allocation not enabled or failed Check allocation.enabled: true in config; review scenario specification

Quick check:

find results/<run_dir> -type f | sort

Compare against the expected artefact list in Output Artefacts.

Stage 5: Plot generation failures

Symptoms: CSV artefacts are present but PNG plot files are missing.

Error pattern Cause Fix
“cannot open connection” for PNG Graphics device issue Check that grDevices is available; ensure sufficient disk space
Plot function error for hierarchical model Group-level coefficient draws are vectors, not scalars This was fixed in v1.2.0; ensure you are running the latest version

General debugging steps

  1. Read the full error message. DSAMbayes uses cli::cli_abort() with descriptive messages that identify the failing function and parameter.

  2. Check the resolved config. If a run directory was created, inspect 00_run_metadata/config.resolved.yaml to see what defaults were applied.

  3. Check session info. Inspect 00_run_metadata/session_info.txt for package version mismatches.

  4. Clear the Stan cache. Stale compiled models can cause unexpected failures:

    rm -rf .cache/dsambayes/

  5. Run validate before run. Always validate first to catch config errors before committing to a full MCMC run.

  6. Reduce iterations for debugging. Use a minimal config with fit.mcmc.iter: 200 and fit.mcmc.warmup: 100 to iterate quickly on formula and data issues.
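
For example, a debugging override might look like this (only the fit.mcmc.* keys are shown; the rest of the config stays unchanged):

```yaml
# Debug-speed sampling settings - not for production estimates.
fit:
  method: mcmc
  mcmc:
    iter: 200     # total iterations per chain
    warmup: 100   # adaptation iterations, excluded from posterior draws
```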

Appendices

Purpose

Provide reference material that supports core user guidance without duplicating operational instructions.

Audience

  • Readers needing precise terminology definitions
  • Engineers orienting themselves in R/ source modules
  • Reviewers validating implementation traceability

Pages

Page Topic
Glossary Canonical definitions of terms used across DSAMbayes documentation
R Module Index Orientation map for the flat R/ source layout
Traceability Map Issue and recommendation status and evidence mapping
Docs Build and Deploy How the documentation site is built, previewed, and deployed

Usage rules

  1. Use appendices as reference pages, not primary process documentation.
  2. Keep operational runbooks in docs/getting-started/, docs/runner/, and docs/internal/.
  3. Prefer links to authoritative sources instead of duplicating constraints or commands.

Subsections of Appendices

Glossary

Purpose

Define canonical terms used across DSAMbayes modelling, runner, diagnostics, and release documentation.

How to use this glossary

  1. Use these definitions when writing or reviewing DSAMbayes documentation.
  2. Keep term usage consistent across docs/modelling/, docs/runner/, and docs/internal/.
  3. If a term changes behaviour in code, update this page in the same change.

Terms

Term Definition Primary location
adstock Media carry-over transform that spreads a spend effect over subsequent periods via geometric decay. Stan media-transform templates and modelling docs
allocation Post-fit budget optimisation stage (allocation.* in YAML). docs/runner/config-schema.md
artefact File written by runner validate/run workflows. docs/runner/output-artifacts.md
baseline term Non-media explanatory term (for example trend, seasonality, holiday controls). docs/modelling/diagnostics-gates.md
blm Base DSAMbayes model class for non-pooled regression workflows. R/blm.R
blocked CV Expanding-window time-series cross-validation used for model selection. R/time_series_cv.R
boundary Lower and upper constraints on model parameters. docs/modelling/priors-and-boundaries.md
chain diagnostics MCMC quality diagnostics such as Rhat, ESS, and divergence indicators. R/diagnostics.R
config resolution Process of applying defaults, coercions, path normalisation, and validation to YAML. R/run_config_*.R
CRE Correlated random effects approach using Mundlak-style within and between variation terms. R/cre_mundlak.R
decomp Post-fit decomposition of fitted response into predictor-level contributions. R/decomp.R
diagnostics gate Thresholded pass/warn/fail policy checks over model diagnostics. R/diagnostics_report.R
divergence Stan sampler warning indicating problematic Hamiltonian trajectories. MCMC diagnostics outputs
dry run Runner mode that validates config and data without Stan fitting (validate). scripts/dsambayes.R
ELPD Expected log predictive density, used for predictive model comparison. R/compare_runs.R
ESS Effective sample size for MCMC draws. Higher is generally better. Chain diagnostics outputs
fit MCMC fitting path (fit.method: mcmc). R/run_from_yaml.R
fit_map / optimise MAP optimisation path (fit.method: optimise). R/blm.R, R/hierarchy.R, R/pooled.R
hierarchical Model class with grouped random effects (`(term | group)` syntax). R/hierarchy.R
identifiability check Diagnostic check for baseline and media term correlation risk. R/diagnostics_report.R
kpi scale Business-outcome scale used for reporting. For log-response models this is back-transformed from model scale. docs/modelling/response-scale-semantics.md
lognormal_ms Positive-support prior family parameterised by mean and standard deviation on the original scale. R/prior_schema.R
MAP Maximum a posteriori point estimate from optimisation. Not a posterior mean. fit_map paths
MCMC Markov chain Monte Carlo posterior sampling. rstan::sampling paths
Pareto-k PSIS-LOO reliability diagnostic for influence of observations. loo_summary.csv outputs
pooled Model class with structured pooling over configured grouping variables. R/pooled.R
posterior draw One sampled value from the posterior distribution. get_posterior() outputs
pre-flight checks Guardrails and model/data compatibility checks run before fitting. R/pre_flight.R
prior_only Fit mode sampling only from priors, excluding likelihood learning. prep_data_for_fit.* interfaces
PSIS-LOO Pareto-smoothed importance-sampling leave-one-out approximation. R/diagnostics_report.R
QG-1 to QG-7 Canonical release quality gates covering lint, style, tests, package check, runner smoke (validate and run), and docs build. docs/internal/quality-gates.md
response scale Scale used inside the fitted model (identity or log). docs/modelling/response-scale-semantics.md
Rhat Convergence diagnostic comparing within- and between-chain variance. Chain diagnostics outputs
runner YAML/CLI execution layer around core DSAMbayes APIs. scripts/dsambayes.R, R/run_from_yaml.R
run_dir Output directory used by a runner validate/run execution. docs/runner/output-artifacts.md
staged layout Structured artefact layout with numbered folders (00_ to 70_). docs/runner/output-artifacts.md
Stan cache Compiled model cache location, typically under XDG_CACHE_HOME. install/setup docs
SMAPE Symmetric mean absolute percentage error metric used in fit summaries. R/stats.R
time-components Managed time control features, including holiday-derived regressors. R/holiday_calendar.R
tscv Time-series selection artefact prefix for blocked CV outputs. 50_model_selection/tscv_*.csv
warmup Initial MCMC iterations used for adaptation and excluded from posterior draws. fit.mcmc.warmup
Hill transform Saturation function spend^n / (spend^n + k^n) used in budget optimisation response curves. k is the half-saturation point, n is the shape parameter. R/optimise_budget.R
atan transform Saturation function atan(spend / scale) mapping spend to a bounded response. R/optimise_budget.R
log1p transform Saturation function log(1 + spend / scale) providing diminishing-returns concavity. R/optimise_budget.R
conditional mean Bias-corrected back-transform for log-response models: exp(mu + sigma^2/2). Default in v1.2.2 for fitted_kpi(). R/fitted.R
Jensen's inequality Mathematical property that E[exp(X)] != exp(E[X]) when X has non-zero variance. DSAMbayes avoids this bias by applying exp() draw-wise before summarising. docs/modelling/response-scale-semantics.md
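
The conditional-mean and Jensen's-inequality entries above rest on the lognormal mean identity: for a normally distributed model-scale response X ~ N(mu, sigma^2),

```latex
\mathbb{E}\!\left[e^{X}\right] \;=\; e^{\mu + \sigma^{2}/2} \;>\; e^{\mu} \;=\; e^{\mathbb{E}[X]} \qquad (\sigma > 0)
```

so summarising exp(mu) alone recovers the median, not the mean, of the KPI-scale response.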

R Module Index

Purpose

Provide a maintainable orientation map for R/ while preserving the flat R package layout.

Layout rule

DSAMbayes keeps a flat R/ directory. Modules are logical groupings, not folder boundaries.

Module map

Module Responsibility Primary files
Model objects and fit engines Class constructors, fit and MAP pathways, class-specific data preparation R/blm.R, R/hierarchy.R, R/pooled.R, R/model_schema.R, R/blm_compiled.R
Stan media transform support Media-transform config parsing and Stan data wiring R/media_transform_config.R
Priors, boundaries, scaling, transforms Prior parsing, boundary handling, scaling contracts, transform helpers, offsets R/prior.R, R/prior_schema.R, R/scale.R, R/scale_prior_sd.R, R/transformations.R, R/transform_sensitivity.R, R/offset.R
Formula and pre-flight validation Formula parsing/safety, data/date checks, pre-fit guardrails R/formula.R, R/formula_safety.R, R/pre_flight.R, R/date.R, R/variable.R
Diagnostics and model selection Fit diagnostics, gate logic, cross-validation, posterior predictive and metrics R/diagnostics.R, R/diagnostics_report.R, R/crossval.R, R/time_series_cv.R, R/post_pred.R, R/stats.R, R/compare_runs.R
Decomposition and extraction Decomposition APIs and extraction helpers R/decomp.R, R/decomp_prep.R, R/extract.R, R/fitted.R
Runner config and orchestration YAML defaults/coercion/validation, model orchestration, run execution R/run_config.R, R/run_config_helpers.R, R/run_config_defaults.R, R/run_config_validation.R, R/run_orchestrator.R, R/run_from_yaml.R
Runner artefacts and reporting Stage mapping and artefact writers for metadata, diagnostics, enrichment R/run_artifacts.R, R/run_artifacts_diagnostics.R, R/run_artifacts_enrichment.R, R/runner_fit_plots.R
Time components and CRE support Holiday feature engineering and CRE/Mundlak model support R/holiday_calendar.R, R/cre_mundlak.R
Budget optimisation and visual outputs Decision-layer optimisation engine and plotting APIs R/optimise_budget.R, R/optimise_budget_plots.R, R/plot_theme_wpp.R
Package infrastructure and utilities Package lifecycle hooks, utility helpers, package metadata helpers, bundled ASCII art R/zzz.R, R/utils.R, R/utils-pipe.R, R/sitrep_.R, R/data.R, R/ascii.txt

High-coupling files to review carefully

File Why it is high-coupling
R/run_from_yaml.R Central runner execution path connecting config, model build, fit, diagnostics, and artefact writing.
R/run_config_validation.R Cross-field validation rules with direct impact on runner safety and allowed contracts.
R/run_artifacts.R Artefact pathing and stage contract used by docs, CI, and release evidence workflows.
R/hierarchy.R Large class-specific fit, posterior, and scaling logic with multiple behavioural branches.
R/blm.R Base class implementation used across many tests and runner pathways.

Placement guide for new code

  1. Put new logic in the closest logical module listed above.
  2. If touching legacy compatibility files (R/run_config.R, R/run_artifacts.R), prefer adding new logic to their split companion files where possible.
  3. Keep exported function names stable unless there is an explicit API migration plan.
  4. Add or update test coverage in tests/testthat/ for any behavioural change.

Traceability Map

Purpose

Provide a traceability reference that maps DSAMbayes issues and recommendations to implementation status and evidence.

Authoritative data source

The single source of truth for all issue and recommendation status is:

  • code_review/audit_report/issue_register.csv

This register contains every ENG, INF, and GOV issue and recommendation with columns for status, severity, owner, linked IDs, notes, and a long-form explanation field.

Two stakeholder-facing summary CSVs are published alongside this page under docs/appendices/traceability-data/.

These files are derived snapshots. Where they conflict with issue_register.csv, the register takes precedence.

Snapshot metrics (as of 2026-02-28)

Non-GOV issues and recommendations

  • All ENG and INF issues: closed (including backlog items accepted and closed 2026-02-28)
  • All ENG and INF recommendations: closed / implemented

GOV issues

  • GOV-ISSUE-001 through GOV-ISSUE-004: open (governance items; resolution deferred to management review cycle)

How to use this map

  1. Start with an issue ID or recommendation ID.
  2. Open code_review/audit_report/issue_register.csv.
  3. Confirm status in the status column and review linked IDs in linked_ids.
  4. Follow evidence paths referenced in the notes column to code files, tests, and review records.
  5. Use internal engineering documentation for release decision criteria.
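
As a sketch, a single-ID status lookup over the register can be scripted. The demo below fabricates a two-row register using the column names described above (the real file is code_review/audit_report/issue_register.csv; its exact column order is whatever the CSV header declares):

```shell
# Build a tiny illustrative register with the documented columns,
# then extract the status for one ID.
cat > /tmp/issue_register_demo.csv <<'EOF'
id,type,status,severity,owner,linked_ids,notes
ENG-ISSUE-014,issue,closed,high,eng,,R/scale.R
EOF

# Print "id: status" for the requested ID.
awk -F',' -v id="ENG-ISSUE-014" '$1 == id { print $1 ": " $3 }' /tmp/issue_register_demo.csv
```

Against the real register, substitute the real path and confirm the column positions from the header row first.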

Representative mappings

ID Type Status Evidence anchors
ENG-ISSUE-014 Issue closed R/scale.R, tests/testthat/test-scale-guardrails.R
INF-ISSUE-006 Issue closed R/diagnostics_report.R, tests/testthat/test-diagnostics-report.R
INF-REC-004 Recommendation closed Holiday/time-components implementation and review references
ENG-REC-003 Recommendation closed Pooled RMSE and scale wiring closure references

Traceability and release approval

Traceability status alone does not authorise a release. Release sign-off still requires:

  1. Mandatory quality gates passed.
  2. Required release evidence bundle complete.
  3. Final decision recorded in sign-off template.

Docs Build and Deploy

Purpose

Describe how the DSAMbayes documentation site is built, previewed, and deployed.

Documentation layers

DSAMbayes has two documentation systems:

1. pkgdown site (R package documentation)

Generated from roxygen comments in R/ source files and vignettes in vignettes/.

Build locally:

R_LIBS_USER="$PWD/.Rlib" \
  Rscript -e 'pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE)'

CI deployment: .github/workflows/pkgdown.yaml builds and deploys to gh-pages on push to main or release events.

Output: generated site under docs/ (pkgdown output, not the Markdown docs).

2. Markdown documentation site (this site)

The docs/ directory contains hand-authored Markdown files organised into sections (Getting Started, Runner, Modelling, Plots, How-To, Appendices). These are configured via docs/docs-config.json.

Site generator: the configuration structure (docs-config.json with navigation, branding, and search settings) is designed for a static site generator. The specific generator and hosting are configured in the deployment environment.

Preview locally: open Markdown files directly, or use any Markdown preview tool. The inter-page links use root-relative paths (e.g. /getting-started/install-and-setup).

Deploy target: https://dsambayes.docs.wppma.space/ (as referenced in README.md).

Configuration

docs/docs-config.json defines:

  • metadata — site name, description, version.
  • branding — logo, favicon, primary colour.
  • navigation — navbar links and sidebar structure.
  • features — math rendering (enabled), search (local).

Adding a new page

  1. Create the Markdown file in the appropriate section directory (e.g. docs/modelling/new-page.md).
  2. Add a sidebar entry in docs/docs-config.json under the appropriate section.
  3. Add a row to the section’s index.md page table.
  4. Update docs/_plan/content-map.md if tracking authoring status.
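
As a sketch only — the exact docs-config.json schema is not documented here — a sidebar entry for the new page might look like:

```json
{
  "navigation": {
    "sidebar": [
      {
        "section": "Modelling",
        "pages": [
          { "title": "New Page", "path": "/modelling/new-page" }
        ]
      }
    ]
  }
}
```

The field names (section, pages, title, path) are assumptions; mirror the entries already present in docs/docs-config.json.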

Internal (Engineering)

Purpose

Define quality gates and release-readiness checks for DSAMbayes v1.2.2.

Audience

  • Maintainers preparing and validating releases.
  • Reviewers checking evidence before sign-off.

Pages

Page Topic
Testing and Validation Quality-gate execution commands, expected outcomes, and evidence capture
Quality Gates Gate definitions and pass/fail criteria
Runner Smoke Tests Minimal runner validation runs
CI Workflows Automated check and docs-build workflows
Release Readiness Checklist Gate checklist and sign-off fields
Release Evidence Pack Artefact collection for stakeholder review
Release Playbook Step-by-step release process
Sign-off Template Release sign-off record template

Subsections of Internal (Engineering)

Quality Gates

Purpose

Define the canonical release-quality gates for DSAMbayes v1.2.2, including commands, pass/fail criteria, and evidence requirements.

Audience

  • Maintainers preparing a release candidate
  • Reviewers signing off release readiness
  • Engineers running local pre-merge quality checks

Gate Matrix

Gate ID Gate Command Pass Criteria Evidence
QG-1 Lint R_LIBS_USER="$PWD/.Rlib" Rscript scripts/check.R --lint Exit code 0, no lint failures, no SKIP: output Terminal log and exit code
QG-2 Style R_LIBS_USER="$PWD/.Rlib" Rscript scripts/check.R --style Exit code 0, no style failures, no SKIP: output Terminal log and exit code
QG-3 Unit tests R_LIBS_USER="$PWD/.Rlib" R -q -e 'testthat::test_dir("tests/testthat")' Exit code 0, no test failures Terminal log and exit code
QG-4 Package check R_LIBS_USER="$PWD/.Rlib" R -q -e 'rcmdcheck::rcmdcheck(args = c("--no-manual","--compact-vignettes=gs+qpdf"))' No ERROR. No WARNING for release sign-off. Any NOTE requires explicit reviewer acceptance rcmdcheck summary and reviewer decision on NOTEs
QG-5 Runner smoke: validate R_LIBS_USER="$PWD/.Rlib" Rscript scripts/dsambayes.R validate --config config/blm_synthetic_mcmc.yaml --run-dir results/quality_gate_validate Exit code 0 CLI log and results/quality_gate_validate/00_run_metadata/config.resolved.yaml
QG-6 Runner smoke: run R_LIBS_USER="$PWD/.Rlib" Rscript scripts/dsambayes.R run --config config/blm_synthetic_mcmc.yaml --run-dir results/quality_gate_run Exit code 0 and core artefacts exist CLI log and selected artefacts under results/quality_gate_run/
QG-7 Docs build check R_LIBS_USER="$PWD/.Rlib" Rscript -e 'pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE)' Exit code 0 Build log and generated site output under docs/

Gate Definitions

QG-1 Lint

Command:

R_LIBS_USER="$PWD/.Rlib" Rscript scripts/check.R --lint

Fail conditions:

  • Non-zero exit code
  • Any lint issue reported
  • Any SKIP: output

QG-2 Style

Command:

R_LIBS_USER="$PWD/.Rlib" Rscript scripts/check.R --style

Fail conditions:

  • Non-zero exit code
  • Any file reported as requiring reformat
  • Any SKIP: output

QG-3 Unit Tests

Command:

R_LIBS_USER="$PWD/.Rlib" R -q -e 'testthat::test_dir("tests/testthat")'

Fail conditions:

  • Non-zero exit code
  • Any test failure or error

QG-4 Package Check

Command:

R_LIBS_USER="$PWD/.Rlib" R -q -e 'rcmdcheck::rcmdcheck(args = c("--no-manual","--compact-vignettes=gs+qpdf"))'

Fail conditions:

  • Any ERROR
  • Any WARNING for release sign-off

Escalation condition:

  • Any NOTE must be reviewed and explicitly accepted with rationale.

QG-5 Runner Smoke: Validate

Command:

R_LIBS_USER="$PWD/.Rlib" \
  Rscript scripts/dsambayes.R validate \
    --config config/blm_synthetic_mcmc.yaml \
    --run-dir results/quality_gate_validate

Fail conditions:

  • Non-zero exit code
  • Missing results/quality_gate_validate/00_run_metadata/config.resolved.yaml

QG-6 Runner Smoke: Run

Command:

R_LIBS_USER="$PWD/.Rlib" \
  Rscript scripts/dsambayes.R run \
    --config config/blm_synthetic_mcmc.yaml \
    --run-dir results/quality_gate_run

Fail conditions:

  • Non-zero exit code
  • Missing results/quality_gate_run/00_run_metadata/config.resolved.yaml
  • Missing results/quality_gate_run/20_model_fit/model.rds
  • Missing results/quality_gate_run/30_post_run/posterior_summary.csv
  • Missing results/quality_gate_run/40_diagnostics/diagnostics_report.csv

QG-7 Docs Build Check

Command:

R_LIBS_USER="$PWD/.Rlib" \
  Rscript -e 'pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE)'

Fail conditions:

  • Non-zero exit code
  • pkgdown build aborts before site generation

Command Reference

Recommended environment setup before running gates:

# Navigate to your local DSAMbayes checkout
cd /path/to/DSAMbayes
mkdir -p .Rlib .cache
export R_LIBS_USER="$PWD/.Rlib"
export XDG_CACHE_HOME="$PWD/.cache"

Optional consolidated local gate (does not replace all release gates):

R_LIBS_USER="$PWD/.Rlib" Rscript scripts/check.R --all

scripts/check.R --all covers lint, style, tests, and coverage. It does not run rcmdcheck, runner smoke checks, or pkgdown build.

Evidence Requirements

Minimum evidence bundle per release candidate:

  1. Full terminal output and exit code for each gate QG-1 to QG-7.
  2. Runner smoke artefacts under results/quality_gate_validate/.
  3. Runner smoke artefacts under results/quality_gate_run/.
  4. rcmdcheck summary with explicit handling of NOTEs.
  5. Confirmation that no gate result is SKIP.

Evidence review should reference the Release Evidence Pack and the Sign-off Template.

Failure and Escalation Rules

  1. Any gate failure blocks release tagging.
  2. Do not proceed to sign-off with unresolved ERROR or WARNING.
  3. NOTEs require written rationale and reviewer acceptance.
  4. If a gate fails due to environment setup, fix the environment and re-run the full affected gate.
  5. If a gate fails due to product code, raise a remediation change and re-run from QG-1.

Sign-off Criteria

Release sign-off requires all of the following:

  1. QG-1 to QG-7 passed.
  2. No SKIP outcomes across mandatory gates.
  3. Evidence bundle completed and reviewed.
  4. Final decision recorded in sign-off-template.md.

Testing and Validation

Purpose

Define the canonical testing and validation workflow for DSAMbayes v1.2.2, from local pre-merge checks through release-quality gates.

Audience

  • Engineers running local checks before merge
  • Maintainers preparing release candidates
  • Reviewers validating release evidence

Validation layers

Layer Objective Primary command(s) Output proof
Lint Catch style and static issues early Rscript scripts/check.R --lint Exit code 0, no lint failures
Style Enforce formatting compliance on changed files Rscript scripts/check.R --style Exit code 0, no reformat-required files
Unit tests Catch behavioural regressions in package logic R -q -e 'testthat::test_dir("tests/testthat")' Exit code 0, no test failures
Package check Validate package-level install and check behaviour R -q -e 'rcmdcheck::rcmdcheck(...)' No ERROR; no unresolved WARNING
Runner validate Validate config and data contracts without fitting Rscript scripts/dsambayes.R validate ... Exit code 0, metadata artefacts
Runner run Validate end-to-end runner execution and artefacts Rscript scripts/dsambayes.R run ... Exit code 0, core run artefacts
Docs build Validate pkgdown documentation buildability Rscript -e 'pkgdown::build_site_github_pages(...)' Exit code 0, successful site build
CI workflows Cross-platform and publish-path verification GitHub Actions workflow runs Green workflow status on candidate commit

Environment setup

Run all commands from repository root:

# Navigate to your local DSAMbayes checkout
cd /path/to/DSAMbayes
mkdir -p .Rlib .cache
export R_LIBS_USER="$PWD/.Rlib"
export XDG_CACHE_HOME="$PWD/.cache"

Expected outcome: checks run in a repo-scoped environment with reproducible library and cache paths.

Local validation workflows

Developer fast path (pre-merge)

Run the consolidated local gate:

R_LIBS_USER="$PWD/.Rlib" Rscript scripts/check.R --all

Expected outcome: lint, style, tests, and coverage complete successfully.

Implementation note: scripts/check.R --all is a convenience gate. It does not replace rcmdcheck, runner smoke tests, or docs build.

Release-candidate full path

Run mandatory gates in this exact order:

R_LIBS_USER="$PWD/.Rlib" Rscript scripts/check.R --lint
R_LIBS_USER="$PWD/.Rlib" Rscript scripts/check.R --style
R_LIBS_USER="$PWD/.Rlib" R -q -e 'testthat::test_dir("tests/testthat")'
R_LIBS_USER="$PWD/.Rlib" R -q -e 'rcmdcheck::rcmdcheck(args = c("--no-manual","--compact-vignettes=gs+qpdf"))'
R_LIBS_USER="$PWD/.Rlib" Rscript scripts/dsambayes.R validate --config config/blm_synthetic_mcmc.yaml --run-dir results/quality_gate_validate
R_LIBS_USER="$PWD/.Rlib" Rscript scripts/dsambayes.R run --config config/blm_synthetic_mcmc.yaml --run-dir results/quality_gate_run
R_LIBS_USER="$PWD/.Rlib" Rscript -e 'pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE)'

Expected outcome: all gates complete with exit code 0, with no unresolved release blockers.
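
Because every gate signals failure through a non-zero exit code, the sequence above can be wrapped in a strict-mode script that aborts at the first failing gate. The echo commands below are placeholders for the real gate commands:

```shell
#!/bin/sh
# Abort at the first failing gate (set -e stops on any non-zero exit).
set -eu

run_gate() {
  label="$1"; shift
  echo "== $label =="
  "$@"
}

# Placeholders shown; substitute the QG-1..QG-7 commands listed above.
run_gate "QG-1 lint"       echo "lint passed"
run_gate "QG-3 unit tests" echo "tests passed"
echo "All gates passed"
```

With the real commands in place, the script exits non-zero at the first failed gate, which matches the rule that any gate failure blocks release tagging.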

Runner smoke-test expectations

Minimum release smoke expectations:

  1. validate command succeeds and writes metadata artefacts.
  2. run command succeeds and writes model, posterior summary, and diagnostics artefacts.
  3. Required runner artefact paths exist under results/quality_gate_validate/ and results/quality_gate_run/.

For the full test matrix and exact artefact paths, see Runner Smoke Tests.

CI validation expectations

Candidate commit should have:

  1. Passing .github/workflows/R-CMD-check.yaml.
  2. Passing .github/workflows/pkgdown.yaml build.
  3. Deploy-step eligibility confirmed for non-PR release flow.

For workflow semantics and expected proof, see CI Workflows.

Evidence capture requirements

Before sign-off, capture:

  1. Full command logs and exit codes for all mandatory gates.
  2. Runner smoke artefacts from validate and run directories.
  3. CI run URLs and statuses for both workflows.
  4. Candidate commit hash and top changelog section.

Use Release Evidence Pack as the authoritative bundle contract.

Failure handling

  1. Any gate failure is a release blocker until resolved.
  2. Re-run the full failed gate after remediation.
  3. If rcmdcheck emits NOTE, record reviewer rationale explicitly.
  4. If runner artefacts are missing, inspect resolved config and outputs.* flags.

Runner Smoke Tests

Purpose

Define the minimal reproducible smoke-test matrix for the YAML runner validate and run commands, including exact commands and expected artefacts.

Audience

  • Maintainers preparing release evidence
  • Engineers triaging runner regressions
  • Reviewers confirming gate QG-5 and QG-6

Test scope

This smoke suite is intentionally small. It proves:

  • CLI argument handling for validate and run
  • Config resolution and runner pre-flight path
  • End-to-end artefact writing for one full run

This smoke suite does not replace unit tests or full package checks.

Preconditions

Run from repository root:

# Navigate to your local DSAMbayes checkout
cd /path/to/DSAMbayes
mkdir -p .Rlib .cache
export R_LIBS_USER="$PWD/.Rlib"
export XDG_CACHE_HOME="$PWD/.cache"

Expected outcome: commands resolve local package/library paths and use repo-scoped Stan cache.

Install DSAMbayes locally if needed:

R_LIBS_USER="$PWD/.Rlib" R -q -e 'install.packages(".", repos = NULL, type = "source")'

Expected outcome: library(DSAMbayes) succeeds in the same shell session.

Smoke-test matrix

Test ID Command Config Run directory Expected result
SMK-VAL-01 validate config/blm_synthetic_mcmc.yaml results/smoke_validate_blm Exit code 0; metadata artefacts written.
SMK-VAL-02 validate config/hierarchical_re_synthetic_mcmc.yaml results/smoke_validate_hier_re Exit code 0; metadata artefacts written.
SMK-VAL-03 validate config/pooled_synthetic_mcmc.yaml results/smoke_validate_pooled Exit code 0; metadata artefacts written.
SMK-RUN-01 run config/blm_synthetic_mcmc.yaml results/smoke_run_blm Exit code 0; core fit, post-run, and diagnostics artefacts written.

Canonical commands

SMK-VAL-01

R_LIBS_USER="$PWD/.Rlib" \
  Rscript scripts/dsambayes.R validate \
    --config config/blm_synthetic_mcmc.yaml \
    --run-dir results/smoke_validate_blm

Expected outcome: validation completes without Stan fitting and prints Status: ok.

SMK-VAL-02

R_LIBS_USER="$PWD/.Rlib" \
  Rscript scripts/dsambayes.R validate \
    --config config/hierarchical_re_synthetic_mcmc.yaml \
    --run-dir results/smoke_validate_hier_re

Expected outcome: validation completes for hierarchical RE config and prints Status: ok.

SMK-VAL-03

R_LIBS_USER="$PWD/.Rlib" \
  Rscript scripts/dsambayes.R validate \
    --config config/pooled_synthetic_mcmc.yaml \
    --run-dir results/smoke_validate_pooled

Expected outcome: validation completes for pooled config and prints Status: ok.

SMK-RUN-01

R_LIBS_USER="$PWD/.Rlib" \
  Rscript scripts/dsambayes.R run \
    --config config/blm_synthetic_mcmc.yaml \
    --run-dir results/smoke_run_blm

Expected outcome: full pipeline completes and prints Run complete with a resolved run directory.

Expected artefacts

Validate artefacts (SMK-VAL-01 to SMK-VAL-03)

For each validate run directory, these files are required:

  • 00_run_metadata/config.original.yaml
  • 00_run_metadata/config.resolved.yaml
  • 00_run_metadata/session_info.txt

Failure rule: missing any required file is a smoke-test failure.

Run artefacts (SMK-RUN-01)

Required core artefacts:

  • 00_run_metadata/config.resolved.yaml
  • 20_model_fit/model.rds
  • 30_post_run/posterior_summary.csv
  • 40_diagnostics/diagnostics_report.csv

Recommended additional checks for stronger confidence:

  • 30_post_run/fitted.csv
  • 30_post_run/observed.csv
  • 40_diagnostics/diagnostics_summary.txt

Failure rule: missing any required core artefact is a smoke-test failure.

Verification helper commands

Check validate artefacts quickly:

for d in results/smoke_validate_blm results/smoke_validate_hier_re results/smoke_validate_pooled; do
  test -f "$d/00_run_metadata/config.original.yaml" || echo "MISSING: $d config.original.yaml"
  test -f "$d/00_run_metadata/config.resolved.yaml" || echo "MISSING: $d config.resolved.yaml"
  test -f "$d/00_run_metadata/session_info.txt" || echo "MISSING: $d session_info.txt"
done

Expected outcome: no MISSING: lines.

Check core run artefacts quickly:

d="results/smoke_run_blm"
test -f "$d/00_run_metadata/config.resolved.yaml" || echo "MISSING: config.resolved.yaml"
test -f "$d/20_model_fit/model.rds" || echo "MISSING: model.rds"
test -f "$d/30_post_run/posterior_summary.csv" || echo "MISSING: posterior_summary.csv"
test -f "$d/40_diagnostics/diagnostics_report.csv" || echo "MISSING: diagnostics_report.csv"

Expected outcome: no MISSING: lines.

Failure triage

  1. If validate fails, run the same command again with a clean run directory path and inspect CLI error output.
  2. If run fails before fitting, inspect 00_run_metadata/config.resolved.yaml to confirm resolved values.
  3. If run fails during fitting, verify local Stan toolchain and cache path from Install and Setup.
  4. If artefacts are missing after success exit code, inspect outputs.* flags in the resolved config.

Evidence capture

For release evidence, capture:

  1. Full terminal logs and exit codes for SMK-VAL-01 to SMK-RUN-01.
  2. Directory listings for each smoke run directory.
  3. The required artefacts listed above.
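
A small helper can capture the log and exit code together for each smoke test; everything below (the evidence/ layout and the helper name) is illustrative rather than part of DSAMbayes:

```shell
# Run a command, save its combined output and its exit code under evidence/.
run_and_log() {
  name="$1"; shift
  mkdir -p evidence
  "$@" > "evidence/${name}.log" 2>&1
  status=$?
  echo "$status" > "evidence/${name}.exitcode"
  return "$status"
}

# Placeholder command; in practice this would be one of the SMK-* commands above.
run_and_log smk_val_01 echo "Status: ok"
cat evidence/smk_val_01.log        # -> Status: ok
cat evidence/smk_val_01.exitcode   # -> 0
```

The same pattern works for the QG-5 and QG-6 gate commands, giving a per-gate log and exit-code pair for the evidence bundle.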

Store and review evidence following the Release Evidence Pack bundle contract.

CI Workflows

Purpose

Define what the repository CI workflows execute, what each workflow is expected to prove, and where each workflow fits in release evidence.

Audience

  • Maintainers who triage CI failures and approve merges
  • Reviewers validating release evidence
  • Engineers who need to map local checks to CI outcomes

Workflow inventory

Workflow file Job Triggers Primary proof
.github/workflows/R-CMD-check.yaml R-CMD-check push and pull_request to main or master DSAMbayes passes R CMD check across the configured OS and R-version matrix.
.github/workflows/pkgdown.yaml pkgdown push and pull_request to main or master, release (published), and manual workflow_dispatch Documentation site builds successfully, and non-PR events can deploy to gh-pages.

R-CMD-check.yaml

Trigger conditions

R-CMD-check.yaml runs on:

  • push to main or master
  • pull_request targeting main or master

Job contract

The workflow defines one matrix job named R-CMD-check with fail-fast: false across:

  • macos-latest with R release
  • windows-latest with R release
  • ubuntu-latest with R devel
  • ubuntu-latest with R release
  • ubuntu-latest with R oldrel-1

Each matrix cell executes:

  1. Repository checkout (actions/checkout@v4)
  2. Pandoc setup (r-lib/actions/setup-pandoc@v2)
  3. R setup (r-lib/actions/setup-r@v2)
  4. Dependency resolution for checks (r-lib/actions/setup-r-dependencies@v2, needs: check, extra-packages: any::rcmdcheck)
  5. Package check (r-lib/actions/check-r-package@v2) with:
    • build_args: c("--no-manual","--compact-vignettes=gs+qpdf")
    • upload-snapshots: true
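
Reconstructed from the description above (the workflow file itself is authoritative), the final step corresponds to a fragment like:

```yaml
# Final step of each matrix cell (fragment; see .github/workflows/R-CMD-check.yaml).
- uses: r-lib/actions/check-r-package@v2
  with:
    build_args: 'c("--no-manual", "--compact-vignettes=gs+qpdf")'
    upload-snapshots: true
```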

What this job is expected to prove

  • The package can be installed and checked on the supported CI matrix.
  • R CMD check passes without check-level failures on each matrix cell.
  • The package dependency graph in DESCRIPTION resolves in CI for each matrix cell.

What this job does not prove

  • It does not run scripts/check.R --lint or scripts/check.R --style.
  • It does not run runner smoke commands (scripts/dsambayes.R validate or run).
  • It does not build or deploy pkgdown documentation.

pkgdown.yaml

Trigger conditions

pkgdown.yaml runs on:

  • push to main or master
  • pull_request targeting main or master
  • release events where type is published
  • Manual dispatch (workflow_dispatch)

Job contract

The workflow defines a single pkgdown job on ubuntu-latest with:

  • Concurrency group pkgdown-${{ github.event_name != 'pull_request' || github.run_id }}
  • Workflow-level permissions: read-all
  • Job-level permissions: contents: write

The job executes:

  1. Repository checkout (actions/checkout@v4)
  2. Pandoc setup (r-lib/actions/setup-pandoc@v2)
  3. R setup (r-lib/actions/setup-r@v2, use-public-rspm: true)
  4. Website dependencies (r-lib/actions/setup-r-dependencies@v2, needs: website, extra-packages: any::pkgdown, local::.)
  5. Site build (pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE))
  6. Deploy step (JamesIves/github-pages-deploy-action@v4.5.0) only when event is not pull_request

What this job is expected to prove

  • The pkgdown site can be generated from the current repository state.
  • Package documentation inputs required by pkgdown are valid enough to build the site.
  • On non-PR events, generated site output under docs/ can be pushed to gh-pages.

What this job does not prove

  • It does not replace R CMD check as a package correctness gate.
  • It does not execute full runner smoke tests.
  • It does not provide release sign-off on its own.

Required secrets and permissions

  • Both workflows set GITHUB_PAT from secrets.GH_PAT.
  • R-CMD-check.yaml runs with read-only repository permissions (read-all).
  • pkgdown.yaml requires contents: write in the pkgdown job to push to gh-pages.

If GH_PAT is absent, steps that require authenticated GitHub API access can hit stricter rate limits.

Expected logs and artefacts

R-CMD-check.yaml

Expected evidence in the GitHub Actions run:

  • One job result per matrix cell
  • check-r-package logs for each cell
  • Clear pass or fail state per OS/R pair

pkgdown.yaml

Expected evidence in the GitHub Actions run:

  • Build site log showing pkgdown generation status
  • Deploy to GitHub pages log for non-PR events
  • No deploy step execution for PR events

Failure handling

  1. Treat any red CI job as a merge blocker until resolved or explicitly waived.
  2. Reproduce failures locally with the nearest equivalent command:
    • R CMD check failure: run the rcmdcheck command from Quality Gates.
    • pkgdown build failure: run the pkgdown command from Quality Gates.
  3. If failure is environment-specific (for example one matrix OS), capture that scope in the PR and link the failing job URL.
  4. Re-run failed jobs only after a code or environment change that addresses the root cause.

Relationship to release gates

  • R-CMD-check.yaml provides CI evidence for gate QG-4 in Quality Gates.
  • pkgdown.yaml provides CI evidence for gate QG-7 in Quality Gates.
  • Release sign-off still requires the full gate set and evidence bundle.

Release Evidence Pack

Purpose

Define the exact evidence bundle required before release sign-off for DSAMbayes v1.2.2.

Audience

  • Release owner preparing sign-off materials
  • Reviewers validating release readiness
  • Maintainers reproducing release gate outcomes

Evidence root and naming

Use one evidence root per candidate release.

Recommended path:

release_evidence/v1.2.2/<YYYYMMDD>_<short_sha>/

Example:

release_evidence/v1.2.2/20260228_0d68378/

Expected outcome: all sign-off evidence is stored in one deterministic location.

Mandatory evidence bundle

All items below are mandatory.

| ID | Evidence item | Required content | Source | Required path in evidence root |
| --- | --- | --- | --- | --- |
| EVD-01 | Release identity | Candidate commit hash, branch, intended tag, package version | git, DESCRIPTION | 00_release_identity/release_identity.txt |
| EVD-02 | Changelog proof | Top changelog section for release candidate | CHANGELOG.md | 00_release_identity/changelog_top.md |
| EVD-03 | QG-1 log | Lint command output and exit code | local command | 10_quality_gates/qg1_lint.log, 10_quality_gates/qg1_lint.exit |
| EVD-04 | QG-2 log | Style command output and exit code | local command | 10_quality_gates/qg2_style.log, 10_quality_gates/qg2_style.exit |
| EVD-05 | QG-3 log | Unit-test output and exit code | local command | 10_quality_gates/qg3_tests.log, 10_quality_gates/qg3_tests.exit |
| EVD-06 | QG-4 log | rcmdcheck output, status summary, NOTE rationale if present | local command | 10_quality_gates/qg4_rcmdcheck.log, 10_quality_gates/qg4_rcmdcheck.exit, 10_quality_gates/qg4_notes_rationale.md |
| EVD-07 | QG-5 validate log | Runner validate output and exit code | local command | 10_quality_gates/qg5_validate.log, 10_quality_gates/qg5_validate.exit |
| EVD-08 | QG-6 run log | Runner run output and exit code | local command | 10_quality_gates/qg6_run.log, 10_quality_gates/qg6_run.exit |
| EVD-09 | QG-7 docs log | pkgdown build output and exit code | local command | 10_quality_gates/qg7_pkgdown.log, 10_quality_gates/qg7_pkgdown.exit |
| EVD-10 | Validate artefacts | Required QG-5 artefacts | results/quality_gate_validate | 20_runner_artifacts/quality_gate_validate/ |
| EVD-11 | Run artefacts | Required QG-6 artefacts | results/quality_gate_run | 20_runner_artifacts/quality_gate_run/ |
| EVD-12 | CI proof | CI run URLs and final status for R-CMD-check.yaml and pkgdown.yaml on candidate commit | GitHub Actions | 30_ci/ci_run_summary.md |
| EVD-13 | Sign-off record | Completed final decision record | sign-off template | 40_signoff/sign_off_record.md |

Exact required artefact paths

EVD-10 validate artefacts (QG-5)

Copy these paths from the run directory:

  • results/quality_gate_validate/00_run_metadata/config.original.yaml
  • results/quality_gate_validate/00_run_metadata/config.resolved.yaml
  • results/quality_gate_validate/00_run_metadata/session_info.txt

EVD-11 run artefacts (QG-6)

Copy these paths from the run directory:

  • results/quality_gate_run/00_run_metadata/config.resolved.yaml
  • results/quality_gate_run/20_model_fit/model.rds
  • results/quality_gate_run/30_post_run/posterior_summary.csv
  • results/quality_gate_run/40_diagnostics/diagnostics_report.csv

Collection commands

Create evidence structure:

mkdir -p release_evidence/v1.2.2/$(date +%Y%m%d)_$(git rev-parse --short HEAD)/{00_release_identity,10_quality_gates,20_runner_artifacts,30_ci,40_signoff}

Expected outcome: canonical evidence folders exist.

Capture release identity and changelog proof:

EROOT="release_evidence/v1.2.2/$(date +%Y%m%d)_$(git rev-parse --short HEAD)"
{
  echo "candidate_commit=$(git rev-parse HEAD)"
  echo "candidate_branch=$(git rev-parse --abbrev-ref HEAD)"
  echo "target_tag=v1.2.2"
  echo "package_version=$(awk -F': ' '/^Version:/{print $2}' DESCRIPTION)"
} > "$EROOT/00_release_identity/release_identity.txt"

sed -n '1,120p' CHANGELOG.md > "$EROOT/00_release_identity/changelog_top.md"

Expected outcome: release_identity.txt and changelog_top.md are populated.

Capture gate logs and exit codes:

EROOT="release_evidence/v1.2.2/$(date +%Y%m%d)_$(git rev-parse --short HEAD)"

R_LIBS_USER="$PWD/.Rlib" Rscript scripts/check.R --lint > "$EROOT/10_quality_gates/qg1_lint.log" 2>&1; echo $? > "$EROOT/10_quality_gates/qg1_lint.exit"
R_LIBS_USER="$PWD/.Rlib" Rscript scripts/check.R --style > "$EROOT/10_quality_gates/qg2_style.log" 2>&1; echo $? > "$EROOT/10_quality_gates/qg2_style.exit"
R_LIBS_USER="$PWD/.Rlib" R -q -e 'testthat::test_dir("tests/testthat")' > "$EROOT/10_quality_gates/qg3_tests.log" 2>&1; echo $? > "$EROOT/10_quality_gates/qg3_tests.exit"
R_LIBS_USER="$PWD/.Rlib" R -q -e 'rcmdcheck::rcmdcheck(args = c("--no-manual","--compact-vignettes=gs+qpdf"))' > "$EROOT/10_quality_gates/qg4_rcmdcheck.log" 2>&1; echo $? > "$EROOT/10_quality_gates/qg4_rcmdcheck.exit"
R_LIBS_USER="$PWD/.Rlib" Rscript scripts/dsambayes.R validate --config config/blm_synthetic_mcmc.yaml --run-dir results/quality_gate_validate > "$EROOT/10_quality_gates/qg5_validate.log" 2>&1; echo $? > "$EROOT/10_quality_gates/qg5_validate.exit"
R_LIBS_USER="$PWD/.Rlib" Rscript scripts/dsambayes.R run --config config/blm_synthetic_mcmc.yaml --run-dir results/quality_gate_run > "$EROOT/10_quality_gates/qg6_run.log" 2>&1; echo $? > "$EROOT/10_quality_gates/qg6_run.exit"
R_LIBS_USER="$PWD/.Rlib" Rscript -e 'pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE)' > "$EROOT/10_quality_gates/qg7_pkgdown.log" 2>&1; echo $? > "$EROOT/10_quality_gates/qg7_pkgdown.exit"

Expected outcome: seven gate logs and seven exit-code files are present.

Copy runner artefacts:

EROOT="release_evidence/v1.2.2/$(date +%Y%m%d)_$(git rev-parse --short HEAD)"
mkdir -p "$EROOT/20_runner_artifacts"
cp -R results/quality_gate_validate "$EROOT/20_runner_artifacts/"
cp -R results/quality_gate_run "$EROOT/20_runner_artifacts/"

Expected outcome: runner artefacts are captured under evidence storage.
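
After copying, it is worth confirming that the required artefact paths actually landed under the evidence root. A minimal illustrative sketch (the path list mirrors the EVD-10 and EVD-11 requirements; the example EROOT is from the naming section above and should be replaced with your actual root):

```shell
# Verify required runner artefacts exist under the evidence root.
EROOT="release_evidence/v1.2.2/20260228_0d68378"   # substitute your evidence root
required="
20_runner_artifacts/quality_gate_validate/00_run_metadata/config.original.yaml
20_runner_artifacts/quality_gate_validate/00_run_metadata/config.resolved.yaml
20_runner_artifacts/quality_gate_validate/00_run_metadata/session_info.txt
20_runner_artifacts/quality_gate_run/00_run_metadata/config.resolved.yaml
20_runner_artifacts/quality_gate_run/20_model_fit/model.rds
20_runner_artifacts/quality_gate_run/30_post_run/posterior_summary.csv
20_runner_artifacts/quality_gate_run/40_diagnostics/diagnostics_report.csv
"
missing=0
for p in $required; do
  [ -f "$EROOT/$p" ] || { echo "MISSING: $p"; missing=1; }
done
if [ "$missing" -eq 0 ]; then
  echo "All required runner artefacts present."
else
  echo "Artefact check failed."
fi
```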

Evidence review checklist

Before sign-off, reviewers must confirm all items below:

  1. release_identity.txt commit hash matches the commit being tagged.
  2. package_version in release_identity.txt matches the intended release (1.2.2).
  3. changelog_top.md includes a DSAMbayes 1.2.2 section aligned with candidate changes.
  4. Every qg*.exit file contains 0.
  5. Required QG-5 and QG-6 artefact paths exist.
  6. ci_run_summary.md records CI run links and final statuses for both workflows.
  7. Completed sign-off record exists at 40_signoff/sign_off_record.md.
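
Check 4 lends itself to a quick scripted pass. An illustrative check that every qg*.exit file records 0 (the example EROOT should be replaced with the candidate evidence root):

```shell
# Illustrative check for item 4: every qg*.exit file must contain 0.
EROOT="release_evidence/v1.2.2/20260228_0d68378"   # substitute your evidence root
files=$(ls "$EROOT"/10_quality_gates/qg*.exit 2>/dev/null)
if [ -z "$files" ]; then
  echo "No gate exit files found under $EROOT"
else
  # grep -L lists files whose contents do NOT match "0" on its own line
  bad=$(grep -L '^0$' $files || true)
  if [ -z "$bad" ]; then
    echo "PASS: all gate exit files record 0"
  else
    echo "FAIL: non-zero exit recorded in: $bad"
  fi
fi
```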

CI summary file contract (EVD-12)

30_ci/ci_run_summary.md must contain:

  • Candidate commit hash
  • URL and status for .github/workflows/R-CMD-check.yaml
  • URL and status for .github/workflows/pkgdown.yaml
  • Reviewer name and review timestamp in UTC
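
A skeleton containing the required fields can be scaffolded like this (an illustrative sketch: the commit hash is taken from the current checkout when available, and the URL, status, and reviewer placeholders must be filled in manually):

```shell
# Scaffold 30_ci/ci_run_summary.md with the required EVD-12 fields.
EROOT="release_evidence/v1.2.2/20260228_0d68378"   # substitute your evidence root
mkdir -p "$EROOT/30_ci"
cat > "$EROOT/30_ci/ci_run_summary.md" <<EOF
# CI run summary

- Candidate commit hash: $(git rev-parse HEAD 2>/dev/null || echo "<fill>")
- .github/workflows/R-CMD-check.yaml: status <pass/fail>, run URL <fill>
- .github/workflows/pkgdown.yaml: status <pass/fail>, run URL <fill>
- Reviewer: <name>
- Review timestamp (UTC): $(date -u +%Y-%m-%dT%H:%M:%SZ)
EOF
echo "Wrote $EROOT/30_ci/ci_run_summary.md"
```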

Submission and retention

  1. Submit the evidence root path in the release PR and in the sign-off record.
  2. Do not delete evidence for approved releases.
  3. For rejected releases, retain the evidence and record the decision as NO-GO in the sign-off record.

Release Readiness Checklist

Purpose

Provide the mandatory go or no-go checklist before creating a DSAMbayes release tag.

How to use this checklist

  1. Complete this checklist after running all release-quality gates.
  2. Record evidence paths for each item.
  3. If any mandatory item fails, decision is NO-GO.
  4. Copy final decision details into Release Sign-off Template.

Mandatory checklist

| ID | Check | Pass criteria | Evidence required |
| --- | --- | --- | --- |
| RL-01 | Candidate commit fixed | Single candidate commit hash selected and recorded | 00_release_identity/release_identity.txt |
| RL-02 | Version metadata aligned | DESCRIPTION version equals intended release version | 00_release_identity/release_identity.txt |
| RL-03 | Changelog aligned | Top CHANGELOG.md section matches candidate scope | 00_release_identity/changelog_top.md |
| RL-04 | QG-1 lint passed | Exit code 0, no lint failures | 10_quality_gates/qg1_lint.log, 10_quality_gates/qg1_lint.exit |
| RL-05 | QG-2 style passed | Exit code 0, no style failures | 10_quality_gates/qg2_style.log, 10_quality_gates/qg2_style.exit |
| RL-06 | QG-3 tests passed | Exit code 0, no test failures | 10_quality_gates/qg3_tests.log, 10_quality_gates/qg3_tests.exit |
| RL-07 | QG-4 package check passed | No ERROR; no unresolved WARNING | 10_quality_gates/qg4_rcmdcheck.log, 10_quality_gates/qg4_rcmdcheck.exit |
| RL-08 | QG-5 validate passed | Exit code 0; required validate artefacts present | 10_quality_gates/qg5_validate.*, 20_runner_artifacts/quality_gate_validate/ |
| RL-09 | QG-6 run passed | Exit code 0; required run artefacts present | 10_quality_gates/qg6_run.*, 20_runner_artifacts/quality_gate_run/ |
| RL-10 | QG-7 docs build passed | Exit code 0; pkgdown build completed | 10_quality_gates/qg7_pkgdown.* |
| RL-11 | CI package workflow green | .github/workflows/R-CMD-check.yaml passed on candidate commit | 30_ci/ci_run_summary.md |
| RL-12 | CI docs workflow green | .github/workflows/pkgdown.yaml build passed on candidate commit | 30_ci/ci_run_summary.md |
| RL-13 | Evidence bundle complete | All mandatory evidence items EVD-01 to EVD-13 present | Evidence root directory |
| RL-14 | Exceptions resolved or accepted | All exceptions documented with owner and approval | 40_signoff/sign_off_record.md |
| RL-15 | Final sign-off recorded | Decision, approver, and timestamp completed | 40_signoff/sign_off_record.md |

Required runner artefact checks

Validate artefacts that must exist:

  1. 20_runner_artifacts/quality_gate_validate/00_run_metadata/config.original.yaml
  2. 20_runner_artifacts/quality_gate_validate/00_run_metadata/config.resolved.yaml
  3. 20_runner_artifacts/quality_gate_validate/00_run_metadata/session_info.txt

Run artefacts that must exist:

  1. 20_runner_artifacts/quality_gate_run/00_run_metadata/config.resolved.yaml
  2. 20_runner_artifacts/quality_gate_run/20_model_fit/model.rds
  3. 20_runner_artifacts/quality_gate_run/30_post_run/posterior_summary.csv
  4. 20_runner_artifacts/quality_gate_run/40_diagnostics/diagnostics_report.csv

Decision rules

  1. GO only if every checklist item RL-01 to RL-15 passes.
  2. NO-GO if any mandatory item fails or evidence is incomplete.
  3. HOLD if no hard failure exists but final approval is pending.
  4. Tag creation is allowed only after GO decision is recorded.

Completion record template

Use this section when running the checklist.

| Field | Value |
| --- | --- |
| Release version | <fill> |
| Candidate commit hash | <fill> |
| Checklist executor | <fill> |
| Checklist completion date (UTC) | <YYYY-MM-DD> |
| Checklist result (GO/NO-GO/HOLD) | <fill> |
| Evidence root path | <fill> |
| Sign-off record path | <fill> |

Audit continuity reference

For programme-level historical traceability, also review:

  • refactoring_plan/release_readiness_checklist_v1.2.md

This page is the operational checklist for current release execution.

Release Playbook

Purpose

Define the operational release flow from freeze to tag for DSAMbayes v1.2.2, including rollback and hotfix handling.

Audience and roles

  • Release owner (cshaw): drives the checklist, evidence pack, and final go or no-go call.
  • Maintainers: execute gates, review failures, and approve release PRs.
  • Reviewers: verify evidence and sign off risk acceptance.

Release inputs

Release work starts only when these inputs are ready:

  1. CHANGELOG.md reflects the release candidate contents.
  2. DESCRIPTION has the intended release version.
  3. Required docs pages for quality and runner workflows are in place.
  4. Candidate commit hash is identified on main or master.

Step-by-step flow

1. Freeze the release candidate

Actions:

  1. Announce code freeze window and candidate commit hash.
  2. Stop merging non-release changes until gate outcome is known.
  3. Confirm that the release scope matches the documentation and refactor changes planned for v1.2.2.

Expected outcome: one stable candidate commit is selected for gate execution.

2. Prepare local release environment

Actions:

  1. Create repo-local paths and environment variables.
  2. Install local package into .Rlib.

Commands:

# Navigate to your local DSAMbayes checkout
cd /path/to/DSAMbayes
mkdir -p .Rlib .cache
export R_LIBS_USER="$PWD/.Rlib"
export XDG_CACHE_HOME="$PWD/.cache"
R_LIBS_USER="$PWD/.Rlib" R -q -e 'install.packages(".", repos = NULL, type = "source")'

Expected outcome: release checks run in a reproducible local environment.

3. Run mandatory quality gates

Execute gates QG-1 to QG-7 from Quality Gates in order.

Commands:

R_LIBS_USER="$PWD/.Rlib" Rscript scripts/check.R --lint
R_LIBS_USER="$PWD/.Rlib" Rscript scripts/check.R --style
R_LIBS_USER="$PWD/.Rlib" R -q -e 'testthat::test_dir("tests/testthat")'
R_LIBS_USER="$PWD/.Rlib" R -q -e 'rcmdcheck::rcmdcheck(args = c("--no-manual","--compact-vignettes=gs+qpdf"))'
R_LIBS_USER="$PWD/.Rlib" Rscript scripts/dsambayes.R validate --config config/blm_synthetic_mcmc.yaml --run-dir results/quality_gate_validate
R_LIBS_USER="$PWD/.Rlib" Rscript scripts/dsambayes.R run --config config/blm_synthetic_mcmc.yaml --run-dir results/quality_gate_run
R_LIBS_USER="$PWD/.Rlib" Rscript -e 'pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE)'

Expected outcome: all mandatory gates pass with exit code 0, with no unresolved WARNING or ERROR.
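
If you want to capture logs and exit codes while running the gates, the sequence can be wrapped in a small loop. An illustrative pattern (the `true` commands are placeholders; substitute the gate commands listed above):

```shell
# Illustrative gate runner: execute each gate, keep its log and exit
# code, and stop at the first failure.
LOGDIR=$(mktemp -d)
run_gate() {
  name=$1; shift
  "$@" > "$LOGDIR/$name.log" 2>&1
  status=$?
  echo "$status" > "$LOGDIR/$name.exit"
  if [ "$status" -ne 0 ]; then
    echo "Gate $name failed (see $LOGDIR/$name.log)"
    return 1
  fi
}
run_gate qg1_lint  true || exit 1   # placeholder for: Rscript scripts/check.R --lint
run_gate qg2_style true || exit 1   # placeholder for: Rscript scripts/check.R --style
echo "All gates passed; logs in $LOGDIR"
```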

4. Confirm CI workflow status

Actions:

  1. Confirm .github/workflows/R-CMD-check.yaml is green for the candidate commit.
  2. Confirm .github/workflows/pkgdown.yaml build step is green for the candidate commit.
  3. For non-PR release flow, confirm pkgdown deploy step eligibility and permissions.

Expected outcome: CI evidence aligns with local gate outcomes.

5. Finalise release metadata

Actions:

  1. Re-check DESCRIPTION version and CHANGELOG.md top section.
  2. Ensure release notes correspond to the candidate commit only.
  3. Commit final metadata edits if needed.

Expected outcome: version metadata and release notes are internally consistent.

6. Build release evidence and sign-off record

Actions:

  1. Assemble logs and artefacts defined in Release Evidence Pack.
  2. Record gate outcomes and decision in Release Sign-off Template.
  3. Capture any accepted NOTE rationale from rcmdcheck.

Expected outcome: auditable evidence bundle is complete and signed.

7. Create and push release tag

Create an annotated tag for the approved version and push it.

Commands:

git tag -a v1.2.2 -m "DSAMbayes v1.2.2"
git push origin v1.2.2

Expected outcome: release tag v1.2.2 exists on remote and points to the signed-off commit.

8. Publish release and verify post-release state

Actions:

  1. Publish the GitHub release from tag v1.2.2.
  2. Confirm pkgdown.yaml runs on release published event.
  3. Confirm deployed docs are up to date.

Expected outcome: release record and documentation deployment are complete.

Go or no-go rules

  1. NO-GO if any mandatory quality gate fails.
  2. NO-GO if release evidence or sign-off template is incomplete.
  3. NO-GO if version metadata and changelog are inconsistent.
  4. GO only when all gates pass and sign-off is recorded.

Rollback procedure

Use this procedure if a release must be withdrawn.

Case A: tag created, release not published

Actions:

  1. Delete the local tag.
  2. Delete the remote tag.
  3. Open a corrective PR and restart gate execution.

Commands:

git tag -d v1.2.2
git push origin :refs/tags/v1.2.2

Expected outcome: withdrawn tag is removed and cannot trigger downstream release steps.

Case B: release published

Actions:

  1. Mark the published release as superseded with a clear incident note.
  2. Do not rewrite history on main or master.
  3. Cut a hotfix release using the hotfix path below.

Expected outcome: users are directed to a corrected patch release.

Hotfix path

Use this path for a defect found after publication.

  1. Branch from the released tag.

git checkout -b hotfix/v1.2.3 v1.2.2

  2. Apply the minimal fix and update release metadata:
    • bump DESCRIPTION version to 1.2.3
    • add a DSAMbayes 1.2.3 section at the top of CHANGELOG.md
  3. Re-run mandatory gates (QG-1 to QG-7).
  4. Obtain sign-off with explicit incident reference.
  5. Tag and publish v1.2.3.

Commands:

git tag -a v1.2.3 -m "DSAMbayes v1.2.3 hotfix"
git push origin v1.2.3

Expected outcome: defect is remediated in a new immutable patch release.

Release Sign-off Template

Purpose

Provide the final approval record for a DSAMbayes release candidate after all mandatory evidence has been reviewed.

Instructions

  1. Copy this template into the candidate evidence bundle as 40_signoff/sign_off_record.md.
  2. Complete every field.
  3. Use GO, NO-GO, or HOLD for the decision.
  4. If any exception is accepted, record explicit rationale and owner.

Release identification

| Field | Value |
| --- | --- |
| Release version | v1.2.2 |
| Package version (DESCRIPTION) | <fill> |
| Candidate commit hash | <fill> |
| Candidate branch | <fill> |
| Intended tag | <fill> |
| Changelog section verified | <yes/no> |
| Evidence root path | <fill> |

Decision summary

| Field | Value |
| --- | --- |
| Decision | <GO/NO-GO/HOLD> |
| Decision date (UTC) | <YYYY-MM-DD> |
| Decision timestamp (UTC) | <YYYY-MM-DDTHH:MM:SSZ> |
| Release owner | <name> |
| Primary approver | <name> |
| Secondary reviewer (if used) | <name or n/a> |

Decision rationale:

<fill>

Quality gate outcomes

| Gate | Outcome (pass/fail) | Evidence file(s) | Reviewer notes |
| --- | --- | --- | --- |
| QG-1 Lint | <fill> | 10_quality_gates/qg1_lint.log, 10_quality_gates/qg1_lint.exit | <fill> |
| QG-2 Style | <fill> | 10_quality_gates/qg2_style.log, 10_quality_gates/qg2_style.exit | <fill> |
| QG-3 Unit tests | <fill> | 10_quality_gates/qg3_tests.log, 10_quality_gates/qg3_tests.exit | <fill> |
| QG-4 Package check | <fill> | 10_quality_gates/qg4_rcmdcheck.log, 10_quality_gates/qg4_rcmdcheck.exit | <fill> |
| QG-5 Runner validate | <fill> | 10_quality_gates/qg5_validate.log, 10_quality_gates/qg5_validate.exit | <fill> |
| QG-6 Runner run | <fill> | 10_quality_gates/qg6_run.log, 10_quality_gates/qg6_run.exit | <fill> |
| QG-7 Docs build | <fill> | 10_quality_gates/qg7_pkgdown.log, 10_quality_gates/qg7_pkgdown.exit | <fill> |

Mandatory artefact checks

| Artefact | Present (yes/no) | Path | Notes |
| --- | --- | --- | --- |
| Resolved config (validate) | <fill> | 20_runner_artifacts/quality_gate_validate/00_run_metadata/config.resolved.yaml | <fill> |
| Session info (validate) | <fill> | 20_runner_artifacts/quality_gate_validate/00_run_metadata/session_info.txt | <fill> |
| Model object (run) | <fill> | 20_runner_artifacts/quality_gate_run/20_model_fit/model.rds | <fill> |
| Posterior summary (run) | <fill> | 20_runner_artifacts/quality_gate_run/30_post_run/posterior_summary.csv | <fill> |
| Diagnostics report (run) | <fill> | 20_runner_artifacts/quality_gate_run/40_diagnostics/diagnostics_report.csv | <fill> |

CI confirmation

| Workflow | Status (pass/fail) | Run URL | Notes |
| --- | --- | --- | --- |
| .github/workflows/R-CMD-check.yaml | <fill> | <fill> | <fill> |
| .github/workflows/pkgdown.yaml | <fill> | <fill> | <fill> |

Exceptions and risk acceptance

Record every exception. If there are none, write none.

| ID | Exception | Reason | Risk owner | Expiry date | Approved (yes/no) |
| --- | --- | --- | --- | --- | --- |
| EX-01 | <fill or none> | <fill> | <fill> | <YYYY-MM-DD or n/a> | <fill> |

Required follow-up actions

Record actions that must happen after release decision.

| ID | Action | Owner | Due date | Tracking link |
| --- | --- | --- | --- | --- |
| ACT-01 | <fill or none> | <fill> | <YYYY-MM-DD or n/a> | <fill or n/a> |

Final approval signatures

| Role | Name | Signature mode | Date (UTC) |
| --- | --- | --- | --- |
| Release owner | <fill> | <typed/e-sign> | <YYYY-MM-DD> |
| Approver | <fill> | <typed/e-sign> | <YYYY-MM-DD> |
| Additional approver (optional) | <fill or n/a> | <typed/e-sign or n/a> | <YYYY-MM-DD or n/a> |

Final decision statement

<Release vX.Y.Z is approved for tagging and publication.>

or

<Release vX.Y.Z is not approved. See exceptions and actions.>