This section documents every plot the DSAMbayes runner produces. Each page covers one pipeline stage, describes what the plot shows, explains when and why the runner generates it, and gives practical interpretation guidance. The target reader is a modelling operator or analyst who needs to assess run quality without reading source code.
Pipeline stages
The runner writes artefacts into timestamped directories under results/. Plots are organised into six stages, each with its own subdirectory:

| Stage | Directory | Role |
| --- | --- | --- |
| Pre-run | 10_pre_run/ | Data quality and input sanity checks before fitting |
| Model fit | 20_model_fit/ | Posterior summaries and observed-vs-fitted comparisons |
| Post-run | 30_post_run/ | Decomposition of the fitted response into per-term contributions |
| Diagnostics | 40_diagnostics/ | Residual behaviour, predictive calibration, and boundary monitoring |
| Model selection | 50_model_selection/ | LOO-CV predictive adequacy and influence diagnostics |
| Optimisation | | Budget allocation informed by the contribution estimates |

The plot wiring lives in two source files:

R/run_artifacts_enrichment.R — wiring for fit-stage and pre-run plots
R/run_artifacts_diagnostics.R — wiring for diagnostics and model selection plots
Pre-run Plots
Purpose
Pre-run plots are generated before the model is fitted. They visualise the input data and flag structural problems — multicollinearity, missing spend periods, implausible KPI–media relationships — that could compromise inference. Treat these as a data quality gate: review them before interpreting any downstream output.
All pre-run plots are written to 10_pre_run/ within the run directory. They require ggplot2 and are generated by write_pre_run_plots() in R/run_artifacts_enrichment.R. The runner produces them whenever an allocation.channels block is present in the configuration and the data contains the referenced spend columns.
Plot catalogue

| Filename | What it shows | Conditions |
| --- | --- | --- |
| media_spend_timeseries.png | Stacked area of weekly spend by channel | allocation.channels block with at least one valid spend_col |
| kpi_media_overlay.png | KPI vs total media spend dual-axis overlay | Valid spend_col and response present; non-zero total spend variance |
| vif_bar.png | Variance inflation factor per predictor | Design matrix extractable with >1 predictor and >1 row |
Media spend time series
Filename: media_spend_timeseries.png
What it shows
A stacked area chart of weekly media spend by channel, drawn from the raw spend_col columns declared in the allocation configuration. The x-axis is the date variable; the y-axis is spend in model units.
When it is generated
The runner generates this plot when:
The configuration includes an allocation.channels block.
At least one declared spend_col exists in the input data.
If no valid spend columns are found, the plot is silently skipped.
How to interpret it
Look for three things. First, check that each channel has plausible seasonal patterns and no unexpected gaps — zero-spend weeks in the middle of a campaign period suggest data ingestion problems. Second, verify that the relative magnitudes make sense: if TV dominates the stack but the brand has historically been digital-first, the data may be mislabelled or aggregated incorrectly. Third, confirm that the date range matches the modelling window declared in the configuration.
Warning signs
Flat channels: A channel with constant spend across all weeks contributes no variation and cannot be identified by the model. The coefficient will be driven entirely by the prior.
Sudden jumps or drops: Step changes in spend that do not correspond to known campaign events may indicate data joins across sources with different reporting conventions.
Missing periods: Gaps where spend drops to zero mid-series can distort adstock calculations if the model applies geometric decay.
Action
If a channel shows no variation, consider removing it from the formula or fixing the upstream data. If gaps are genuine (e.g. a seasonal channel), confirm the adstock specification handles zero-spend periods correctly.
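The carry-over concern above is easy to check directly: under geometric decay, a zero-spend gap produces a decaying tail rather than a hard drop to zero. A minimal sketch of the transform (the rate here is illustrative, not the runner's default):

```r
# Sketch: geometric adstock via a recursive filter.
# `rate` is the assumed share of carry-over retained per week.
adstock <- function(x, rate = 0.5) {
  as.numeric(stats::filter(x, rate, method = "recursive"))
}

adstock(c(100, 0, 0, 0))   # 100 50 25 12.5: a spend gap decays smoothly
```

If genuine zero-spend periods produce long non-zero tails in the transformed series, the decay rate may be too slow for that channel.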
Related artefacts
data_dictionary.csv in 10_pre_run/ provides summary statistics for every input column.
KPI–media overlay
Filename: kpi_media_overlay.png
What it shows
A dual-axis time series with the KPI response variable on the left axis (blue) and total media spend (sum of all declared spend_col values) on the right axis (red, rescaled to share the vertical space). This is a visual correlation check, not a causal claim.
When it is generated
The runner generates this plot when:
The configuration includes an allocation.channels block with at least one valid spend_col.
The response variable exists in the data.
If the total spend has zero variance, the plot is skipped.
How to interpret it
The overlay reveals whether KPI and aggregate spend move together over time. A rough co-movement is expected in MMM data — media drives response — but the relationship need not be tight. Seasonal KPI peaks that precede or lag media bursts suggest confounding (e.g. demand-driven spend timing). Divergences where spend rises but KPI falls (or vice versa) are worth investigating: they may reflect diminishing returns, competitor activity, or a structural break in the data.
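The dual-axis construction can be reproduced with ggplot2's secondary-axis mechanism. A sketch under assumed column names (date, kpi, total_spend) and an illustrative rescale rule; the runner's actual scaling may differ:

```r
# Sketch: KPI on the left axis, rescaled spend on the right axis.
library(ggplot2)

# Hypothetical weekly data standing in for the run's input columns.
df <- data.frame(
  date        = seq(as.Date("2024-01-01"), by = "week", length.out = 52),
  kpi         = 100 + 10 * sin(seq_len(52) / 4),
  total_spend = 50 + 20 * cos(seq_len(52) / 4)
)

k <- max(df$kpi) / max(df$total_spend)   # map spend into KPI units
p <- ggplot(df, aes(date)) +
  geom_line(aes(y = kpi), colour = "blue") +
  geom_line(aes(y = total_spend * k), colour = "red") +
  scale_y_continuous("KPI", sec.axis = sec_axis(~ . / k, name = "Total spend"))
```

The choice of k is arbitrary, which is exactly why the plot supports only qualitative reading.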
Warning signs
Perfect alignment: If the two series track each other almost exactly, the model may be fitting spend timing rather than incremental media effects.
Opposite trends: A persistent negative relationship between total spend and KPI suggests reverse causality or omitted-variable bias.
Scale artefacts: The dual-axis rescaling can exaggerate or suppress visual correlation. Do not draw quantitative conclusions from this plot.
Action
Use this plot as a sanity check only. If the relationship looks implausible, investigate the data and consider whether the formula includes adequate controls for seasonality, trend, and external factors.
Variance inflation factor (VIF) bar chart
Filename: vif_bar.png
What it shows
A horizontal bar chart of variance inflation factors for each predictor in the model’s design matrix. Bars are colour-coded by severity: green (VIF < 5), amber (5 ≤ VIF < 10), and red (VIF ≥ 10). Dashed vertical lines mark the 5 and 10 thresholds.
When it is generated
The runner generates this plot when:
The design matrix has more than one predictor column and more than one row.
The VIF computation does not encounter a singular or degenerate correlation matrix.
For pooled models, the design matrix extraction may return zero rows, in which case the plot is skipped.
How to interpret it
VIF measures how much the variance of a coefficient estimate inflates due to correlation with other predictors. A VIF of 1 means no multicollinearity; a VIF of 10 means the standard error is roughly three times larger than it would be with orthogonal predictors. In Bayesian MMM, high VIF does not break inference the way it does in OLS — priors regularise the estimates — but it does reduce the data’s ability to inform the posterior, making results more prior-dependent.
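The VIF values on the chart can be reproduced from the design matrix alone; one standard route is the diagonal of the inverse correlation matrix. A sketch (compute_vif is illustrative, not the runner's function):

```r
# Sketch: VIF_j = 1 / (1 - R^2_j), equivalently the j-th diagonal of
# the inverse correlation matrix of the predictors.
compute_vif <- function(X) {
  diag(solve(stats::cor(X)))
}

set.seed(1)
x1 <- rnorm(100)
x2 <- x1 + rnorm(100, sd = 0.1)   # nearly collinear with x1
x3 <- rnorm(100)
compute_vif(cbind(x1, x2, x3))    # x1 and x2 large; x3 near 1
```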
Warning signs
VIF > 10 on media channels: The model cannot reliably separate the effects of those channels. Posterior estimates will lean heavily on the prior. Consider whether the channels can be combined or whether one should be dropped.
VIF > 10 on seasonality terms: Common and usually harmless if the terms are included as controls rather than as interpretive outputs.
All terms moderate or high: The overall collinearity structure may be too severe for the data length. Consider increasing the sample size or simplifying the formula.
Action
Review the top-VIF terms. If two media channels are highly collinear (e.g. search and affiliate), consider whether they can be meaningfully separated given the available data. If not, combine them or use informative priors to anchor the split.
Related artefacts
design_matrix_manifest.csv in 10_pre_run/ lists all design matrix columns with variance and uniqueness statistics.
spec_summary.csv in 10_pre_run/ summarises the model specification.
Model Fit Plots
Purpose
Model fit plots summarise the posterior and compare fitted values against observed data. They answer two questions: does the model track the response variable adequately, and are the estimated coefficients plausible? These plots are written to 20_model_fit/ within the run directory.
The runner generates them via write_model_fit_plots() in R/run_artifacts_enrichment.R. All four plots require ggplot2 and the fitted model object. Each is wrapped in tryCatch so that a failure in one does not prevent the others from being written.
Plot catalogue

| Filename | What it shows | Conditions |
| --- | --- | --- |
| fit_timeseries.png | Observed vs fitted over time with 95% credible band | Always generated after a successful fit |
| fit_scatter.png | Observed vs fitted scatter | Always generated after a successful fit |
| posterior_forest.png | Coefficient point estimates with 90% CIs | Posterior draws available via get_posterior() |
| prior_posterior.png | Prior-to-posterior density shift for media terms | Model has a .prior table with media (m_*) parameters |
Fit time series
Filename: fit_timeseries.png
What it shows
The observed KPI (orange) and posterior mean fitted values (blue) plotted over time, with a shaded 95% credible interval band. The subtitle reports in-sample fit metrics: R², RMSE, MAE, mean error (bias), sMAPE, 95% prediction interval coverage, lag-1 ACF of residuals, and sample size. For hierarchical models the plot facets by group.
When it is generated
Always, provided the model has been fitted successfully and the fit table (observed, mean, percentiles) can be computed.
How to interpret it
The fitted line should track the general level and seasonal pattern of the observed series. The 95% credible band should contain most observed points — the subtitle reports the actual coverage, which should be close to 95%. Systematic departures reveal model misspecification: if the fitted line consistently overshoots during holidays or undershoots during quiet periods, the formula may lack appropriate seasonal or event terms.
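Several of the subtitle metrics are simple functions of the observed series and the posterior predictive summaries. A sketch covering a few of them, assuming lo95/hi95 are the 2.5% and 97.5% predictive quantiles (argument names illustrative):

```r
# Sketch: headline fit metrics from observed values, posterior mean
# fitted values, and the 95% predictive interval endpoints.
fit_metrics <- function(obs, fitted, lo95, hi95) {
  resid <- obs - fitted
  c(
    rmse     = sqrt(mean(resid^2)),
    mae      = mean(abs(resid)),
    me       = mean(resid),                       # bias
    coverage = mean(obs >= lo95 & obs <= hi95),   # target: near 0.95
    acf_lag1 = stats::acf(resid, plot = FALSE)$acf[2]
  )
}
```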
Warning signs
Coverage well below 95%: The model underestimates uncertainty. Common when the noise prior is too tight or the model is overfit to a subset of the data.
Coverage well above 95%: The credible interval is too wide. The model is underfit or the noise prior is too diffuse.
Persistent bias (ME far from zero): The model systematically over- or under-predicts. Check for missing structural terms (trend, level shifts, intercept misspecification).
High lag-1 ACF (> 0.3): Residuals are autocorrelated. The model is missing temporal structure — consider adding lagged terms or checking adstock specifications.
Action
If coverage or bias is unacceptable, revisit the formula (missing controls, wrong functional form) or the prior specification (overly tight noise SD). Cross-reference with the residuals diagnostics for a more detailed picture.
Related artefacts
fit_metrics_by_group.csv in 20_model_fit/ provides the same metrics in tabular form, broken down by group for hierarchical models.
Fit scatter
Filename: fit_scatter.png
What it shows
A scatter plot of observed values (y-axis) against posterior mean fitted values (x-axis), with a 45-degree reference line. Points on the line indicate perfect fit. For hierarchical models the plot facets by group.
When it is generated
Always, provided the fit table is available.
How to interpret it
Points should cluster tightly around the diagonal. Curvature away from the line suggests a systematic misfit — for instance, if the model underpredicts at high KPI values, the response may need a nonlinear term or a log transformation. Outliers far from the line warrant investigation: they may correspond to anomalous weeks (data errors, one-off events) that the model cannot capture.
Warning signs
Fan shape (wider scatter at higher values): Heteroscedasticity. A log-scale model or a variance-stabilising transform may be more appropriate.
Systematic curvature: The mean function is misspecified. Consider adding polynomial or interaction terms.
Isolated outliers: Check the dates of extreme residuals against the residuals time series and the input data for data quality issues.
Action
If the scatter reveals non-constant variance, consider fitting on the log scale (model.scale or a log-transformed formula). If curvature is evident, review the functional form of media transforms and control variables.
Posterior forest plot
Filename: posterior_forest.png
What it shows
A horizontal forest plot of posterior coefficient estimates. Each row is a model term (excluding the intercept). The point marks the posterior median; the horizontal bar spans the 5th to 95th percentile (90% credible interval). Terms whose interval excludes zero are drawn in colour; those consistent with zero are grey.
For hierarchical models, the plot displays population-level (group-averaged) estimates.
When it is generated
The runner generates this plot when posterior draws are available via get_posterior(). It is skipped if the posterior extraction fails.
How to interpret it
Focus on the media coefficients. Positive values indicate that higher media exposure is associated with higher KPI, which is the expected direction for most channels. The width of the interval reflects estimation precision: a narrow interval means the data informed the estimate strongly; a wide interval means the prior dominates.
Terms ordered by absolute magnitude (bottom to top) give a quick ranking of effect sizes, but note that these are on the model’s internal scale. For models fitted on the log scale, coefficients represent approximate percentage effects; for levels models, they represent absolute KPI units per unit of the transformed media input.
Warning signs
Media coefficient crosses zero: The model cannot confidently distinguish the channel’s effect from noise. This is not necessarily wrong — some channels may genuinely have weak effects — but it warrants scrutiny, especially if the prior was informative.
Implausibly large coefficients: Check for scaling issues. If model.scale: true, coefficients are on the standardised scale and must be interpreted accordingly.
All intervals very wide: The data may not have enough variation to identify individual effects. Review the VIF bar chart for multicollinearity.
Action
If a media coefficient is unexpectedly negative, investigate whether the data supports it (e.g. counter-cyclical spend) or whether multicollinearity is pulling the estimate. Cross-reference with the prior vs posterior plot to see how far the data moved the estimate from its prior.
Prior vs posterior
Filename: prior_posterior.png
What it shows
Faceted density plots for each media coefficient (m_* parameters). The grey distribution is the prior (Normal, as specified in the model’s .prior table); the blue distribution is the posterior (estimated from MCMC draws). Overlap indicates that the data did not strongly inform the estimate; separation indicates data-driven updating.
For hierarchical models, posterior draws are averaged across groups to show the population-level density.
When it is generated
The runner generates this plot when:
The model has a .prior table (i.e. it is a requires_prior model).
The prior table contains at least one m_* parameter.
Posterior draws are available.
If the model has no prior table (e.g. a pure OLS updater), the plot is skipped.
How to interpret it
A well-identified coefficient shifts noticeably from prior to posterior. If the two densities sit on top of each other, the data provided little information for that channel — the estimate is prior-driven. This is not inherently wrong (the prior may be well-calibrated from previous studies), but it does mean the current dataset alone cannot validate the estimate.
Warning signs
No shift at all: The channel has insufficient variation or is too collinear with other terms for the data to update the prior. The resulting coefficient is essentially assumed, not estimated.
Posterior much narrower than prior: Expected and healthy. The data concentrated the estimate.
Posterior shifted to the boundary: If a boundary constraint is active (e.g. non-negativity), the posterior may pile up at zero. Cross-reference with the boundary hits plot to confirm.
Action
If key media channels show no prior-to-posterior shift, consider whether the prior is appropriate, whether the data period is long enough, or whether multicollinearity prevents identification. For channels where the prior dominates, document this clearly when reporting ROAS or contribution estimates — the output reflects an assumption, not a data-driven finding.
Cross-references
Pre-run plots — VIF and data quality checks that contextualise fit results
Diagnostics plots — residual analysis that complements the fit overview
Post-run Plots
Purpose
Post-run plots decompose the fitted response into its constituent parts. They answer the question: how much does each predictor contribute to the modelled KPI, and how do those contributions evolve over time? These plots are written to 30_post_run/ within the run directory.
The runner generates them via write_response_decomposition_artifacts() in R/run_artifacts_enrichment.R, which calls runner_response_decomposition_tables() to compute per-term contributions from the design matrix and posterior coefficient estimates. For hierarchical models with random-effects formula syntax (|), the decomposition may fail gracefully — the runner logs a warning and continues to downstream stages.
Plot catalogue

| Filename | What it shows | Conditions |
| --- | --- | --- |
| decomp_predictor_impact.png | Total contribution per model term (bar chart) | Decomposition tables computed successfully |
| decomp_timeseries.png | Stacked media channel contribution over time | Decomposition tables computed successfully; media terms present |
Predictor impact
Filename: decomp_predictor_impact.png
What it shows
A horizontal bar chart of the total contribution of each model term to the response, computed as the sum of coefficient × design-matrix column across all observations. Terms are sorted by absolute contribution magnitude. The intercept and total rows are excluded.
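The contribution arithmetic is a column-wise product of the design matrix and the posterior means. A sketch, assuming the formula parses with stats::model.matrix() and beta holds posterior means named to match the design-matrix columns (intercept omitted from beta, matching its exclusion from the chart):

```r
# Sketch: total contribution per term = colSums(coefficient * column).
decompose <- function(form, data, beta) {
  X <- stats::model.matrix(form, data)
  contrib <- sweep(X[, names(beta), drop = FALSE], 2, beta, `*`)
  colSums(contrib)   # total contribution per term over all observations
}

# Hypothetical two-channel example.
d <- data.frame(y = 1:4, tv = c(1, 2, 3, 4), search = c(0, 1, 0, 1))
decompose(y ~ tv + search, d, c(tv = 2, search = 3))
```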
When it is generated
The runner generates this plot when runner_response_decomposition_tables() returns a valid predictor-level summary table. This requires that stats::model.matrix() can parse the model formula against the original input data — a condition that holds for BLM and pooled models but may fail for hierarchical models whose formulas contain random-effects syntax.
How to interpret it
The bar lengths represent total modelled impact over the data period. Media channels with large positive bars drove the most KPI in the model’s account of the data. Control variables (trend, seasonality, holidays) often dominate in absolute terms because they capture baseline demand — this is expected and does not diminish the media findings.
Negative contributions can arise for terms with negative coefficients (e.g. price sensitivity) or for seasonality harmonics where the net effect over the year partially cancels.
Warning signs
A media channel with negative total contribution: Unless the coefficient is intentionally unconstrained (no lower boundary at zero), a negative contribution suggests the model is absorbing noise or confounding through that channel. Review the posterior forest plot and check whether the coefficient’s credible interval excludes zero.
Intercept-dominated decomposition (not shown here, but visible in the CSV): If the intercept accounts for >90% of the total, media effects are negligible relative to baseline demand. This may be correct, but it limits the utility of the model for budget allocation.
Missing plot: If the decomposition failed (logged as a warning), the model type likely does not support direct model.matrix() decomposition. The CSV companions will also be absent.
Action
Use this plot to prioritise which channels to scrutinise. Cross-reference large contributors with the prior vs posterior plot to confirm they are data-driven rather than prior-driven.
Related artefacts
decomp_predictor_impact.csv in 30_post_run/ contains the same data in tabular form.
posterior_summary.csv in 30_post_run/ provides the coefficient summary underlying the decomposition.
Decomposition time series
Filename: decomp_timeseries.png
What it shows
A stacked area chart of media channel contributions over time. Each layer represents one media term’s weekly contribution (coefficient × transformed media input). Non-media terms (intercept, controls, seasonality) are excluded to focus the view on the media mix.
When it is generated
The runner generates this plot alongside the predictor impact chart, provided the decomposition tables include at least one media term.
How to interpret it
The height of each band at a given week represents how much that channel contributed to the modelled response. Seasonal patterns in the stack reflect campaign timing and adstock carry-over. The total height of the stack is the aggregate media contribution — the gap between this and the observed KPI is accounted for by non-media terms and noise.
Warning signs
A channel with near-zero contribution throughout: The model assigns negligible effect to that channel. This could be correct (low spend, weak signal) or a sign that multicollinearity is suppressing the estimate.
Implausibly large single-channel dominance: If one channel accounts for the vast majority of the media stack, verify the coefficient is plausible and not inflated by collinearity with a correlated channel.
Abrupt jumps unrelated to spend changes: Check whether the design matrix term (adstock/saturation output) is well-behaved. Sudden spikes in contribution without corresponding spend changes suggest a data or transform issue.
Action
Compare the relative channel contributions here with the business’s spend allocation. Channels that receive large spend but show small contributions may have diminishing returns or weak effects. This comparison motivates the budget optimisation stage.
Related artefacts
decomp_timeseries.csv in 30_post_run/ contains the weekly decomposition in long format.
Cross-references
Model fit plots — posterior estimates that drive the decomposition
Optimisation plots — budget allocation informed by these contribution estimates
Diagnostics Plots
Purpose
Diagnostics plots assess whether the fitted model's assumptions hold and whether any structural problems warrant remedial action. They cover residual behaviour, posterior predictive adequacy, and boundary constraint monitoring. These plots are written to 40_diagnostics/ within the run directory.
The runner generates residual plots via write_residual_diagnostics() in R/run_artifacts_diagnostics.R, the PPC plot via write_model_fit_plots() in R/run_artifacts_enrichment.R, and the boundary hits plot via write_boundary_diagnostics() in R/run_artifacts_diagnostics.R. Each plot is wrapped in tryCatch so that individual failures do not block the remaining outputs.
Plot catalogue

| Filename | What it shows | Conditions |
| --- | --- | --- |
| ppc.png | Posterior predictive check fan chart | Posterior draws (yhat) extractable from fitted model |
| residuals_timeseries.png | Residuals over time | Fit table available |
| residuals_vs_fitted.png | Residuals vs fitted values | Fit table available |
| residuals_hist.png | Residual distribution histogram | Fit table available |
| residuals_acf.png | Residual autocorrelation function | Fit table available |
| residuals_latent_acf.png | Latent-scale residual ACF | Model uses log-scale response (response_scale != "identity") |
| boundary_hits.png | Posterior draw proximity to coefficient bounds | Boundary hit rates computable from posterior and bound specifications |
Posterior predictive check (PPC)
Filename: ppc.png
What it shows
A fan chart of posterior predictive draws overlaid with observed data. The blue line is the posterior mean of the predicted response; the dark band spans the 25th–75th percentile (50% CI) and the light band spans the 5th–95th percentile (90% CI). Red dots mark observed values.
When it is generated
The runner generates this plot whenever posterior predictive draws (yhat) can be extracted from the fitted model via runner_yhat_draws(). This works for BLM, hierarchical, and pooled models fitted with MCMC.
How to interpret it
Well-calibrated models produce bands that contain roughly 50% and 90% of observed points in the respective intervals. The key diagnostic is whether observed values fall systematically outside the bands during specific periods — this reveals time-localised misfit that aggregate metrics like RMSE can mask.
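The calibration claim can be quantified by counting how many observed points fall inside an empirical band computed from the yhat draws. A sketch, assuming a draws-by-observations matrix (the runner's internal representation may differ):

```r
# Sketch: empirical band coverage. Rows of `yhat` are posterior draws,
# columns are time points; `probs` defines the band (default: 90%).
band_coverage <- function(yhat, obs, probs = c(0.05, 0.95)) {
  q <- apply(yhat, 2, stats::quantile, probs = probs)
  mean(obs >= q[1, ] & obs <= q[2, ])   # share of points inside the band
}
```

A well-calibrated 90% band should return a value near 0.9 on in-sample data.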
Warning signs
Observed points consistently outside the 90% band: The model underestimates uncertainty or misses a structural feature (holiday, promotion, regime change).
Bands that widen dramatically in specific periods: The model is uncertain about those periods, possibly because the training data lacks similar observations.
Bands that are uniformly very wide: The noise prior may be too diffuse, or the model has too many weakly identified parameters.
Action
If the PPC reveals localised misfit, check whether the affected periods correspond to missing control variables (holidays, events). If the bands are too wide overall, consider tightening the noise prior or simplifying the formula. Cross-reference with the LOO-PIT histogram for an aggregate calibration assessment.
Residuals over time
Filename: residuals_timeseries.png
What it shows
A line chart of residuals (observed minus posterior mean) over time. A horizontal reference line at zero marks perfect fit. For hierarchical models, the plot facets by group.
When it is generated
Always, provided the fit table is available.
How to interpret it
Residuals should scatter randomly around zero with no discernible trend or seasonal pattern. Any structure in the residuals indicates that the model has failed to capture a systematic component of the data.
Warning signs
Trend in residuals: The model’s trend specification is inadequate. Consider adding a higher-order polynomial or a structural-break term.
Seasonal oscillation: The Fourier harmonics or holiday dummies are insufficient. Add more harmonics or specific event indicators.
Clusters of large residuals: Localised misfit — check corresponding dates for data anomalies.
Action
Residual structure that persists across multiple weeks warrants a formula revision. Short isolated spikes are often data outliers and may not require model changes.
Residuals vs fitted
Filename: residuals_vs_fitted.png
What it shows
A scatter plot of residuals (y-axis) against posterior mean fitted values (x-axis), with a horizontal reference at zero. For hierarchical models, the plot facets by group.
When it is generated
Always, provided the fit table is available.
How to interpret it
The scatter should form a horizontal band centred on zero with roughly constant vertical spread across the fitted-value range. Patterns in this plot diagnose specific model violations.
Warning signs
Funnel shape (wider spread at higher fitted values): Heteroscedasticity. A log-scale model would be more appropriate.
Curvature: The mean function is misspecified. The model under- or over-predicts at the extremes.
Discrete clusters: May indicate grouping structure that the model does not account for.
Action
Heteroscedasticity in a levels model is the most common finding. If the funnel pattern is pronounced, re-fit on the log scale and compare diagnostics. Cross-reference with the fit scatter plot which shows the same information from a different angle.
Residual distribution
Filename: residuals_hist.png
What it shows
A histogram of residuals across all observations (40 bins). For hierarchical models with six or fewer groups, the histogram facets by group.
When it is generated
Always, provided the fit table is available.
How to interpret it
The distribution should be approximately symmetric and unimodal if the Normal noise assumption holds. Heavy tails or skewness indicate departures from normality.
Warning signs
Strong right skew: Common in levels models when the response is strictly positive and has occasional large values. A log transform may help.
Bimodality: Suggests a mixture or an omitted grouping variable. Check whether the data contains distinct regimes.
Extreme outliers: Individual residuals several standard deviations from the mean warrant data inspection.
Action
Moderate departures from normality in the residuals are tolerable in Bayesian inference — the posterior is still valid if the model is otherwise well-specified. Severe skewness or heavy tails, however, can distort credible intervals and predictive coverage. Consider robust likelihood specifications or transformations.
Residual autocorrelation (ACF)
Filename: residuals_acf.png
What it shows
A bar chart of the sample autocorrelation function of residuals, computed up to lag 26 (roughly half a year of weekly data). Red dashed lines mark the 95% significance bounds (±1.96/√n). For hierarchical models, the plot facets by group.
When it is generated
Always, provided the fit table is available.
How to interpret it
Bars within the significance bounds indicate no serial correlation at that lag. Significant autocorrelation — especially at low lags (1–4 weeks) — means the model misses short-run temporal dependence. Significant spikes at lag 52 (if the series is long enough) suggest residual annual seasonality.
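The bars and the dashed bounds come from standard sample-ACF arithmetic. A sketch of the two quantities the plot draws:

```r
# Sketch: residual ACF to lag 26 plus the 95% significance bound the
# plot marks with dashed lines.
resid_acf <- function(resid, lag_max = 26) {
  a <- stats::acf(resid, lag.max = lag_max, plot = FALSE)
  list(
    acf   = drop(a$acf)[-1],           # lags 1..lag_max (lag 0 dropped)
    bound = 1.96 / sqrt(length(resid)) # approximate 95% bound
  )
}
```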
Warning signs
Lag-1 ACF > 0.3: Strong short-run autocorrelation. The model’s uncertainty estimates are anti-conservative (credible intervals too narrow), and coefficient estimates may be biased if lagged effects are present.
Decaying positive ACF: Suggests an omitted AR component or insufficient adstock decay modelling.
Spike at lag 52: Residual annual seasonality not captured by the Fourier terms.
Action
If lag-1 ACF is material, consider adding lagged response terms or increasing the number of Fourier harmonics. For adstock-driven channels, verify that the decay rate is not too fast (underfitting carry-over) or too slow (overfitting noise).
Latent-scale residual ACF
Filename: residuals_latent_acf.png
What it shows
The same ACF plot as above, but computed on the latent (log) scale when the model’s response scale is not identity. This is relevant for models fitted with model.scale: true or log-transformed response variables.
When it is generated
The runner generates this plot when response_scale != "identity". It is skipped for levels-scale models.
How to interpret it
Interpretation is identical to the standard ACF plot. The latent-scale version is preferred for log models because autocorrelation in the log residuals is more directly interpretable as a model adequacy check on the scale where inference is performed.
Warning signs
Same as the standard ACF. Compare both plots if both are generated — discrepancies may indicate that the log transformation introduces or removes autocorrelation artefacts.
Boundary hits
Filename: boundary_hits.png
What it shows
A horizontal chart showing, for each constrained coefficient, the share of posterior draws that fall within a tolerance of the finite lower or upper bound. Bars are colour-coded: green (0% hit rate), amber (1–10%), red (≥10%). When all hit rates are zero, the plot displays green dots with explicit “0.0%” labels.
When it is generated
The runner generates this plot when the model has finite boundary constraints set via set_boundary() and boundary hit rates can be computed from the posterior draws. It is written by write_boundary_diagnostics() in R/run_artifacts_diagnostics.R.
How to interpret it
A zero hit rate for all parameters means no posterior draws approached any boundary — the constraints are not binding and the posterior is effectively unconstrained. This is the ideal outcome.
A non-zero hit rate means the boundary is influencing the posterior shape. Moderate rates (1–10%) suggest the data mildly conflicts with the constraint; high rates (≥10%) mean the data wants the coefficient outside the allowed range and the boundary is actively truncating the posterior.
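The hit rate itself is a simple posterior summary: the share of draws within a tolerance of a finite bound. A sketch (the tolerance rule here, 0.1% of the draw range, is illustrative; the runner's rule may differ):

```r
# Sketch: share of posterior draws near a finite lower or upper bound.
boundary_hit_rate <- function(draws, lower = -Inf, upper = Inf,
                              tol = 1e-3 * diff(range(draws))) {
  near_lower <- is.finite(lower) & (draws - lower) <= tol
  near_upper <- is.finite(upper) & (upper - draws) <= tol
  mean(near_lower | near_upper)
}

# Two of four draws sit at the non-negativity bound: hit rate 0.5.
boundary_hit_rate(c(0, 1e-4, 0.5, 1), lower = 0)
```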
Warning signs
Hit rate ≥10% on a media coefficient: The non-negativity constraint is binding. The true effect may be zero or negative, but the boundary forces a positive estimate. This inflates the channel’s apparent contribution.
Hit rate ≥10% on many parameters simultaneously: The overall constraint specification may be too tight for the data. Consider widening bounds or reviewing the formula.
Lower-bound hits on a coefficient with strong prior mass at zero: The prior and boundary together may create a “pile-up” at the bound. The posterior is not reflecting the data faithfully.
Action
For channels with high boundary hit rates, critically assess whether the non-negativity constraint is justified by domain knowledge. If the constraint is essential (e.g. media cannot destroy demand), document that the estimate is boundary-driven. If it is not essential, consider relaxing the bound and re-fitting to see whether the unconstrained estimate is materially different.
Related artefacts
boundary_hits.csv in 40_diagnostics/ provides the per-parameter hit rates in tabular form.
diagnostics_report.csv in 40_diagnostics/ includes a summary check for boundary binding.
Hierarchical-specific: within variation
Filename: within_variation.png (generated only for hierarchical models)
This plot shows the within-group variation ratio for each non-CRE (correlated random effects) term: Var(x − mean_g(x)) / Var(x). Low ratios indicate that most variation in a predictor is between groups rather than within groups, making it difficult to identify the coefficient from within-group variation alone. Dashed lines at 5% and 10% mark conventional concern thresholds.
This plot is generated only for hierarchical models and is not included in the standard BLM image set.
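The ratio on the y-axis can be reproduced directly from the data. A sketch in Python for illustration (the runner computes this in R):

```python
import numpy as np

def within_variation_ratio(x, group):
    """Var(x - group mean of x) / Var(x): the share of a predictor's
    variance that survives after removing between-group differences."""
    x = np.asarray(x, dtype=float)
    group = np.asarray(group)
    demeaned = x.copy()
    for g in np.unique(group):
        mask = group == g
        demeaned[mask] -= x[mask].mean()
    return float(demeaned.var() / x.var())

# A predictor that varies almost entirely between groups:
x = np.array([1.0, 1.1, 0.9, 5.0, 5.1, 4.9])
group = np.array([0, 0, 0, 1, 1, 1])
ratio = within_variation_ratio(x, group)  # well below the 5% line
```

A ratio this low means almost all identifying variation is between groups, so a within-group (CRE-style) estimator has little to work with for this predictor.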
Cross-references
Model fit plots — fit overview that the residual diagnostics refine
Model Selection Plots
Purpose
Model selection plots provide leave-one-out cross-validation (LOO-CV) diagnostics that assess predictive adequacy and calibration. They help answer two questions: does the model generalise to unseen observations, and are any individual data points unduly influencing the fit? These plots are written to 50_model_selection/ within the run directory.
The runner generates them via write_model_selection_artifacts() in R/run_artifacts_diagnostics.R. LOO-CV is computed using Pareto-smoothed importance sampling (PSIS-LOO) from the loo package, which approximates exact leave-one-out predictive densities from a single MCMC fit. All three plots depend on the pointwise LOO table (loo_pointwise.csv), which contains per-observation ELPD contributions, Pareto-k diagnostics, and influence flags.
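For intuition about what the pointwise table contains, plain importance-sampling LOO (the precursor of PSIS-LOO, without the Pareto smoothing the loo package adds) reduces to one log-mean-exp per observation. A sketch in Python; the (draws × observations) log-likelihood layout is an assumption for illustration:

```python
import numpy as np

def is_loo_elpd(loglik):
    """Plain IS-LOO: per-observation expected log predictive density
    from a pointwise log-likelihood matrix of shape (n_draws, n_obs).

    elpd_loo[t] = -log mean_s exp(-loglik[s, t]), evaluated stably.
    PSIS-LOO improves on this by Pareto-smoothing the largest weights.
    """
    loglik = np.asarray(loglik, dtype=float)
    neg = -loglik
    m = neg.max(axis=0)
    return -(m + np.log(np.exp(neg - m).mean(axis=0)))

# With identical draws the LOO density equals the likelihood itself:
flat = np.full((1000, 4), -1.3)
elpd = is_loo_elpd(flat)  # each entry is -1.3
```

The Pareto-k diagnostic reported alongside these values measures how heavy-tailed the importance weights are, i.e. how trustworthy this approximation is per observation.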
Plot catalogue
Filename
What it shows
Conditions
pareto_k.png
Pareto-k diagnostic scatter over time
Pointwise LOO table available with pareto_k column
loo_pit.png
LOO-PIT calibration histogram
Posterior draws (yhat) extractable from fitted model
elpd_influence.png
Pointwise ELPD contributions over time
Pointwise LOO table available with elpd_loo and pareto_k columns
Pareto-k diagnostic
Filename: pareto_k.png
What it shows
A scatter plot of Pareto-k values over time, one point per observation. Points are colour-coded by severity:
Green (k < 0.5): PSIS approximation is reliable.
Amber (0.5 ≤ k < 0.7): Approximation is acceptable but warrants monitoring.
Red (0.7 ≤ k < 1.0): Approximation is unreliable. The observation is influential.
Purple (k ≥ 1.0): PSIS fails entirely. The observation dominates the posterior.
Dashed horizontal lines mark the 0.5, 0.7, and 1.0 thresholds. The legend always displays all four severity levels regardless of whether points exist in each category.
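The severity bands follow the loo package's conventional thresholds. A sketch of the classification in Python:

```python
def pareto_k_severity(k):
    """Severity band for a single Pareto-k value, matching the
    thresholds drawn as dashed lines in the plot."""
    if k < 0.5:
        return "green"   # PSIS approximation reliable
    if k < 0.7:
        return "amber"   # acceptable, monitor
    if k < 1.0:
        return "red"     # unreliable; observation is influential
    return "purple"      # PSIS fails; observation dominates
```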
When it is generated
The runner generates this plot whenever the pointwise LOO table contains a pareto_k column. This requires a successful PSIS-LOO computation, which in turn requires the fitted model to produce log-likelihood values.
How to interpret it
Most points should be green. A small number of amber points is typical and does not invalidate the LOO estimate. Red and purple points identify observations where the posterior changes substantially when that observation is excluded — these are influential data points.
Influential observations concentrated in a specific time period (e.g. a cluster of red points around a holiday) suggest that the model struggles with those conditions. Isolated influential points may correspond to data anomalies or outliers.
Warning signs
More than 10% of points above 0.7: The overall PSIS-LOO estimate is unreliable. The loo package will issue a warning. Consider moment-matching or exact refitting for affected observations.
Purple points (k ≥ 1): These observations are so influential that removing them would substantially change the posterior. Investigate whether they represent data errors, one-off events, or genuine but rare conditions.
Influential points at the start or end of the series: Edge effects in adstock transforms can create artificial influence at series boundaries.
Action
For isolated red/purple points, inspect the corresponding dates and data values. If they are data errors, correct the data. If they are genuine but extreme, consider whether the model’s likelihood (Normal) is appropriate — heavy-tailed alternatives (Student-t) are more robust to outliers. If influential points are numerous, the model may be misspecified more broadly: revisit the formula, priors, and functional form.
Related artefacts
loo_pointwise.csv in 50_model_selection/ contains the per-observation Pareto-k, ELPD, and influence flags.
loo_summary.csv in 50_model_selection/ reports the aggregate ELPD with standard error.
LOO-PIT calibration histogram
Filename: loo_pit.png
What it shows
A histogram of leave-one-out probability integral transform (LOO-PIT) values across all observations. The PIT value for observation t is the proportion of posterior predictive draws that fall below the observed value: PIT_t = Pr(ŷ_t ≤ y_t | y_{-t}). The histogram uses 20 equal-width bins from 0 to 1. A dashed red horizontal line marks the expected count under a perfectly calibrated model (n/20).
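The PIT computation reduces to one comparison per draw. A sketch in Python, using posterior predictive draws directly, as the runner does when it computes PIT values without the pointwise LOO table:

```python
import numpy as np

def pit_values(yhat_draws, y):
    """PIT per observation: the share of predictive draws at or below
    the observed value. yhat_draws has shape (n_draws, n_obs)."""
    return (np.asarray(yhat_draws) <= np.asarray(y)).mean(axis=0)

# Calibrated case: observations come from the same distribution as
# the predictive draws, so PIT values are approximately uniform.
rng = np.random.default_rng(7)
draws = rng.normal(0.0, 1.0, size=(2000, 500))
y = rng.normal(0.0, 1.0, size=500)
pit = pit_values(draws, y)
counts, _ = np.histogram(pit, bins=20, range=(0.0, 1.0))
# each of the 20 bins should hold roughly 500 / 20 = 25 observations
```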
When it is generated
The runner generates this plot whenever posterior predictive draws can be extracted via runner_yhat_draws(). It does not require the pointwise LOO table — it computes PIT values directly from the posterior predictive distribution. The plot is written by write_model_fit_plots() in R/run_artifacts_enrichment.R and filed under 50_model_selection/.
How to interpret it
A well-calibrated model produces a uniform PIT distribution — all bins should be roughly equal in height, close to the dashed reference line. Departures from uniformity reveal specific calibration failures:
U-shape (excess mass at 0 and 1): The predictive distribution is underdispersed relative to the data: its intervals are too narrow, so observed values fall in the tails of the predictive distribution more often than expected.
Inverse U-shape (excess mass in the centre): The predictive distribution is overdispersed: its intervals are too wide, and the model is more uncertain than it needs to be.
Left-skewed (excess mass near 0): The model systematically overpredicts. Observed values tend to fall below the predictive distribution.
Right-skewed (excess mass near 1): The model systematically underpredicts.
Warning signs
Strong U-shape: The noise variance is underestimated or the model is missing a source of variation. This is the most concerning pattern because it means the credible intervals are anti-conservative — reported uncertainty is too low.
One bin dramatically taller than others: A single bin containing many more observations than expected suggests a discrete cluster of misfits. Check the dates of those observations.
Monotone slope: A systematic bias that the model has not captured. Check the residuals time series for trend.
Action
U-shaped PIT histograms call for wider predictive intervals: increase the noise prior, add missing covariates, or allow for heavier tails. Inverse-U patterns suggest the noise prior is too diffuse — tighten it. Skewed patterns indicate systematic bias that should be addressed through formula changes (missing controls, trend, level shifts). Cross-reference with the PPC fan chart for a visual complement.
ELPD influence plot
Filename: elpd_influence.png
What it shows
A lollipop chart of pointwise expected log predictive density (ELPD) contributions over time. Each vertical stem connects the observation’s ELPD value to zero; the dot marks the ELPD value. Blue points and stems indicate non-influential observations (Pareto-k ≤ 0.7); red indicates influential ones (Pareto-k > 0.7). Larger red dots draw attention to the problematic observations.
When it is generated
The runner generates this plot whenever the pointwise LOO table contains both elpd_loo and pareto_k columns. It is written by write_model_selection_artifacts() in R/run_artifacts_diagnostics.R, immediately after the Pareto-k scatter.
How to interpret it
ELPD values quantify each observation’s contribution to the model’s out-of-sample predictive performance. Values near zero indicate observations that the model predicts well. Large negative values indicate observations where the model assigns low predictive probability — these are the worst-predicted points.
The combination of ELPD magnitude and Pareto-k severity is informative:
Large negative ELPD + low k: The model predicts this observation poorly, but the PSIS estimate is reliable. The model genuinely struggles with this data point.
Large negative ELPD + high k: Both the prediction and the LOO approximation are unreliable. This observation is highly influential and poorly fit — it warrants the closest scrutiny.
Near-zero ELPD + high k: The observation is influential but well-predicted. It may be a leverage point (extreme in predictor space) that happens to lie on the fitted surface.
Warning signs
Cluster of large negative values in a specific period: The model systematically fails during that period. Check for missing events, structural breaks, or data quality problems.
Many red (influential) points with large negative ELPD: The model’s aggregate LOO estimate is unreliable, and the worst-fit observations are also the most influential. This combination makes model comparison results untrustworthy.
Monotone trend in ELPD values: Suggests time-varying model adequacy — the model may fit the training period well but degrade towards the edges.
Action
Investigate the dates of the worst ELPD observations. If they correspond to known anomalies (data errors, one-off events), consider excluding or down-weighting them. If they correspond to regular conditions that the model should handle, the model needs revision. Use the Pareto-k plot to confirm which observations are both poorly predicted and influential, and prioritise those for investigation.
Related artefacts
loo_pointwise.csv in 50_model_selection/ contains the full pointwise table with ELPD, Pareto-k, and influence flags.
loo_summary.csv in 50_model_selection/ reports the aggregate ELPD estimate and standard error for model comparison.
Cross-references
Diagnostics plots — residual-level checks that complement LOO diagnostics
Model fit plots — posterior summaries and fitted-vs-observed views
Optimisation Plots
Purpose
Optimisation plots visualise the outputs of the budget allocator. They translate model estimates into actionable budget decisions by showing response curves, efficiency comparisons, and the sensitivity of recommendations to budget changes. These are decision-layer artefacts: they sit downstream of all modelling and diagnostics, and their quality depends entirely on the credibility of the upstream fit.
All optimisation plots are written to 60_optimisation/ within the run directory. The runner generates them via write_budget_optimisation_artifacts() in R/run_artifacts_enrichment.R, which calls the public plotting APIs in R/optimise_budget_plots.R. They require a successful call to optimise_budget() that produces a budget_optimisation object with a plot_data payload.
Plot catalogue
Filename
What it shows
Conditions
budget_response_curves.png
Channel response curves with current/optimised points
Optimisation completed with response curve data
budget_roi_cpa.png
ROI or CPA comparison by channel
Optimisation completed with ROI/CPA summary
budget_impact.png
Spend reallocation and response impact (diverging bars)
Optimisation completed with ROI/CPA summary
budget_contribution.png
Absolute response comparison by channel
Optimisation completed with ROI/CPA summary
budget_confidence_comparison.png
Posterior credible intervals for current vs optimised
Optimisation completed with response points
budget_sensitivity.png
Total response change when each channel varies ±20%
Optimisation completed with response curve data
budget_efficient_frontier.png
Optimised response across budget levels
Efficient frontier computed via budget_efficient_frontier()
budget_kpi_waterfall.png
KPI decomposition waterfall by component
Waterfall data computable from model coefficients and data means
budget_marginal_roi.png
Marginal ROI (or marginal response) curves by channel
Optimisation completed with response curve data
budget_spend_share.png
Current vs optimised spend allocation as percentage
Optimisation completed with ROI/CPA summary
Response curves
Filename: budget_response_curves.png
What it shows
Faceted line charts of the estimated response curve for each media channel. The x-axis is raw spend (model units); the y-axis is expected response. A shaded band shows the posterior credible interval around the mean curve. Two marked points per channel indicate the current (reference) and optimised spend allocations.
The subtitle notes which media transforms were applied (e.g. Hill saturation, adstock). A caption reports the marginal response at the optimised point for each channel.
When it is generated
The runner generates this plot whenever optimise_budget() returns response curve data in the plot_data payload. This requires at least one media channel in the allocation configuration with a computable response function.
How to interpret it
The curve shape encodes diminishing returns. Steep initial slopes indicate high marginal response at low spend; flattening curves indicate saturation. The gap between the current and optimised points shows the direction of the recommended reallocation: if the optimised point sits to the right (higher spend) of the current point, the allocator recommends increasing that channel’s budget.
The credible band width reflects posterior uncertainty about the response function. Wide bands mean the shape is poorly identified — the recommendation is sensitive to modelling assumptions. Narrow bands indicate data-informed estimates.
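The curve shapes can be illustrated with a generic Hill saturation function. A sketch in Python; all parameter values are illustrative, not DSAMbayes defaults:

```python
import numpy as np

def hill_response(spend, beta=1.0, K=50.0, n=1.2):
    """Hill-type saturation curve: steep at low spend, flattening
    towards beta as spend grows. Parameter values are illustrative."""
    spend = np.asarray(spend, dtype=float)
    return beta * spend**n / (K**n + spend**n)

spend = np.linspace(0.0, 200.0, 201)
resp = hill_response(spend)
# Diminishing returns: the first 50 units of spend buy more response
# than the next 50.
gain_first = hill_response(50.0) - hill_response(0.0)
gain_second = hill_response(100.0) - hill_response(50.0)
```

K is the half-saturation spend: at spend = K the channel delivers exactly half its maximum incremental response, which is why the curve's flattening point is so sensitive to this parameter.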
Warning signs
Very wide credible bands: The response curve shape is uncertain. Budget recommendations based on it carry substantial risk.
Optimised point near the flat part of the curve: The channel is saturated at the recommended spend. Further increases yield negligible marginal returns.
Current and optimised points nearly identical: The allocator found little room for improvement on that channel. The current allocation is already near-optimal (or the response function is too uncertain to justify a change).
Action
Compare the marginal response values across channels. The allocator equalises marginal response at the optimum — if marginal values differ substantially, the optimisation may have hit a constraint (spend floor/ceiling). Cross-reference with the budget sensitivity plot to assess how robust the recommendation is.
Related artefacts
budget_response_curves.csv in 60_optimisation/ contains the curve data.
budget_response_points.csv in 60_optimisation/ contains the current and optimised point coordinates.
ROI/CPA comparison
Filename: budget_roi_cpa.png
What it shows
A grouped bar chart comparing ROI (or CPA, for subscription KPIs) by channel under the current and optimised allocations. If currency_col is defined per channel, bars show financial ROI; otherwise they show response-per-unit-spend in model units. A TOTAL bar summarises the portfolio-level metric.
The metric choice is automatic: the allocator uses ROI for revenue-type KPIs and CPA for subscription-type KPIs.
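As a reference for the two metrics, a sketch in Python; these definitions follow standard MMM practice and are assumed rather than read from the runner's source:

```python
def roi(response, spend, value_per_unit=1.0):
    """ROI for revenue-type KPIs: value generated per unit of spend."""
    return response * value_per_unit / spend

def cpa(spend, response):
    """CPA for subscription-type KPIs: spend per acquired response unit."""
    return spend / response

# A channel producing 500 response units from 2,000 spend, each unit
# worth 10 in currency:
channel_roi = roi(500, 2000, value_per_unit=10.0)  # 2.5
channel_cpa = cpa(2000, 500)                       # 4.0
```

Note the inverse reading: higher ROI is better, lower CPA is better, which is why the two metrics are never mixed on one chart.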
When it is generated
The runner generates this plot whenever the optimisation result includes a roi_cpa summary table.
How to interpret it
Channels where the optimised bar exceeds the current bar gain efficiency from the reallocation. Channels where the optimised bar is lower have had spend reduced — their marginal efficiency was below the portfolio average. The TOTAL bar shows the net portfolio improvement.
Warning signs
Optimised ROI lower than current for most channels: The allocator redistributed spend towards higher-response channels, which may have lower per-unit efficiency but larger absolute contribution. This is not necessarily wrong — the allocator maximises total response, not per-channel ROI.
TOTAL bar shows negligible improvement: The current allocation is already near-optimal, or the model’s response functions are too flat to support meaningful reallocation.
Very large ROI values on low-spend channels: Small denominators inflate ROI. These channels may have high marginal returns at low spend but limited capacity to absorb budget.
Action
Do not interpret this plot in isolation. Cross-reference with the contribution comparison and the response curves to distinguish efficiency improvements from scale effects.
Related artefacts
budget_roi_cpa.csv in 60_optimisation/ contains the per-channel ROI/CPA values.
budget_summary.csv in 60_optimisation/ provides the top-level allocation summary.
Allocation impact
Filename: budget_impact.png
What it shows
A horizontal diverging bar chart in two facets. The left facet shows spend reallocation (positive = increase, negative = decrease) per channel. The right facet shows the corresponding response impact. Bars are coloured green for increases and red for decreases. A TOTAL row at the bottom summarises the net change with muted styling.
Channels are sorted by response impact magnitude — the channels most affected by the reallocation appear at the top.
When it is generated
The runner generates this plot whenever the optimisation result includes a roi_cpa summary with delta_spend and delta_response columns.
How to interpret it
The spend facet shows where the allocator moves budget. The response facet shows the expected consequence. A useful pattern is a channel that receives a spend decrease (red bar, left) but shows a small response decrease (small red bar, right) — that channel was inefficient and the freed budget drives larger gains elsewhere.
Warning signs
Large spend increase on a channel with modest response gain: Diminishing returns may be steep. Verify against the response curve.
Response decreases that exceed response gains: The allocator expects a net negative outcome. This should not happen with a correctly specified max_response objective, and suggests a configuration or constraint issue.
Action
Use this chart to brief stakeholders on the “where and why” of reallocation. Pair it with the confidence comparison to communicate whether the expected gains are statistically distinguishable from zero.
Response contribution
Filename: budget_contribution.png
What it shows
A grouped bar chart comparing absolute expected response (contribution) by channel under the current and optimised allocations. Delta annotations above each pair show the change. A TOTAL bar with muted styling shows the portfolio-level gain. The subtitle reports the percentage total response gain from optimisation.
When it is generated
The runner generates this plot whenever the optimisation result includes mean_reference and mean_optimised columns in the roi_cpa summary.
How to interpret it
This chart answers the question: in absolute terms, how much more (or less) response does each channel deliver under the optimised allocation? Unlike the ROI chart, this view is not distorted by small denominators — it shows the quantity the allocator actually maximises.
Warning signs
Negative delta on a channel with high current contribution: The allocator is pulling spend from a channel that currently contributes a great deal. This is rational if the marginal return on that channel is below the portfolio average, but it requires careful communication to stakeholders accustomed to interpreting total contribution as “importance”.
TOTAL gain is small: The reallocation may not justify the operational cost of implementing it. Consider whether the confidence intervals overlap (see confidence comparison).
Action
Report the TOTAL percentage gain as the headline number. Caveat it with the credible interval width from the confidence comparison. If the gain is within posterior uncertainty, the recommendation is suggestive rather than conclusive.
Related artefacts
budget_allocation.csv in 60_optimisation/ contains the per-channel spend and response values.
Confidence comparison
Filename: budget_confidence_comparison.png
What it shows
A horizontal forest plot (dodge-positioned points with error bars) showing the posterior mean response and 90% credible interval for each channel under the current (grey) and optimised (red) allocations. Where the two intervals for a channel overlap, the reallocation gain may not be statistically meaningful.
When it is generated
The runner generates this plot whenever the optimisation result includes response point data with mean, lower, and upper columns for both reference and optimised allocations.
How to interpret it
Focus on channels where the optimised interval (red) does not overlap with the current interval (grey). These are the channels where the reallocation produces a distinguishable change in expected response. Overlapping intervals mean the posterior cannot confidently distinguish the two allocations — the gain exists in expectation but falls within sampling uncertainty.
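The overlap check the eye performs on this plot can be made explicit. A sketch in Python with hypothetical interval values:

```python
def intervals_overlap(lo_a, hi_a, lo_b, hi_b):
    """True when two credible intervals share any common range."""
    return max(lo_a, lo_b) <= min(hi_a, hi_b)

# Current vs optimised 90% intervals for two hypothetical channels:
clear_gain = not intervals_overlap(10.0, 14.0, 15.0, 19.0)  # separated
unclear = intervals_overlap(10.0, 14.0, 12.0, 18.0)         # overlapping
```

Comparing marginal intervals like this is conservative: the posterior of the per-channel response difference would be a sharper test, since the two allocations share the same posterior draws.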
Warning signs
All intervals overlap: The data is too uncertain to support a confident reallocation recommendation. The allocator’s point estimate suggests improvement, but the posterior cannot distinguish it from noise.
One channel shows a clear gain while others overlap: The headline portfolio gain may be driven by a single channel. Verify that channel’s response curve and prior-posterior shift.
Action
Use this plot to calibrate the confidence of the recommendation. If intervals overlap for most channels, present the allocation as “directionally suggestive” rather than “statistically supported”. If key channels show clear separation, the recommendation is stronger.
Budget sensitivity
Filename: budget_sensitivity.png
What it shows
A spider chart (line plot) showing how total expected response changes when each channel’s spend is varied ±20% from its optimised level, while all other channels are held fixed. Steeper lines indicate channels whose budgets have the most influence on total response. A horizontal dashed line at zero marks the optimised baseline.
When it is generated
The runner generates this plot whenever the optimisation result includes response curve data. The ±20% range and 11 evaluation points per channel are defaults set in plot_budget_sensitivity().
How to interpret it
Channels with steep lines are the most sensitive: small deviations from their optimised spend produce large response changes. Flat lines indicate channels where modest budget deviations have little impact — the response function is either saturated (on the flat part of the curve) or nearly linear (constant marginal return).
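The perturbation the plot performs can be sketched on a toy portfolio. A Python illustration, assuming an additive square-root response per channel (not the runner's actual response model):

```python
import numpy as np

def sensitivity_curve(total_response, optimised, channel, multipliers):
    """Change in total response when one channel's spend is scaled,
    with every other channel held at its optimised level."""
    base = total_response(optimised)
    out = []
    for m in multipliers:
        perturbed = dict(optimised)
        perturbed[channel] = optimised[channel] * m
        out.append(total_response(perturbed) - base)
    return out

# Toy additive portfolio with square-root saturation per channel:
def total_response(alloc):
    return sum(np.sqrt(s) for s in alloc.values())

optimised = {"tv": 100.0, "search": 25.0}
grid = np.linspace(0.8, 1.2, 11)  # +/-20%, 11 points, as in the defaults
curve = sensitivity_curve(total_response, optimised, "tv", grid)
```

Even this toy example shows the asymmetry described under warning signs: under a concave response, cutting spend costs more than the equivalent increase gains.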
Warning signs
A channel with an asymmetric slope (steep downward, flat upward): Cutting this channel’s spend is costly, but increasing it yields little. It is at or near its saturation point.
All lines nearly flat: The optimisation surface is plateau-like. The allocator’s recommendation is robust to implementation imprecision, but also implies limited upside from optimisation.
Lines that cross: Channels swap in relative importance at different budget perturbations. This complicates simple priority rankings.
Action
Use this chart to communicate implementation risk. If the recommended allocation is operationally difficult to achieve exactly, the sensitivity chart shows which channels require precise execution and which have margin for error.
Efficient frontier
Filename: budget_efficient_frontier.png
What it shows
A line-and-point chart of total optimised response as a function of total budget. Each point represents the optimal allocation at that budget level (expressed as a percentage of the current total budget). A red diamond marks the current budget level. The curve shows how much additional response is achievable by increasing the total budget — and the diminishing returns of doing so.
When it is generated
The runner generates this plot when budget_efficient_frontier() produces a budget_frontier object with at least two feasible points. This requires a valid optimisation result and a set of budget multipliers (configured in allocation.efficient_frontier).
How to interpret it
The frontier’s shape reveals the budget’s overall productivity. A concave curve (steep at first, then flattening) is the classic diminishing-returns shape: each additional unit of budget buys less incremental response. The gap between the current point and the curve above it shows the unrealised potential at the same budget, i.e. the difference between the current allocation and the optimal one.
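The concave shape falls out of any saturating response model. A toy Python illustration, assuming a two-channel portfolio with square-root response per channel (for which the optimum splits the budget equally, since equal marginals imply equal spend):

```python
import numpy as np

def frontier(total_budgets):
    """Optimal total response at each budget for a toy two-channel
    portfolio with sqrt response per channel. Equalising marginal
    response implies splitting the budget equally, so the optimum
    is 2 * sqrt(budget / 2)."""
    return [2.0 * np.sqrt(b / 2.0) for b in total_budgets]

budgets = [50.0, 75.0, 100.0, 125.0, 150.0]
resp = frontier(budgets)
gains = np.diff(resp)  # concave: each step buys less than the last
```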
Warning signs
Frontier is nearly linear: Returns are approximately constant across the budget range. The model may not have enough data to identify saturation, or the budget range is too narrow to reveal it.
Frontier flattens early: The portfolio saturates at a budget well below the current level. The current spend may be wastefully high.
Only 2–3 feasible points: The optimiser could not find feasible allocations at most budget levels. Constraints may be too tight.
Action
Use the frontier to frame budget conversations. The curve shows what is achievable at each budget level. If a stakeholder proposes a budget cut, the frontier quantifies the response cost. If they propose an increase, it quantifies the expected gain. Present the frontier alongside the spend share comparison to show how the allocation shifts at each level.
Related artefacts
budget_efficient_frontier.csv in 60_optimisation/ contains the frontier data.
KPI waterfall
Filename: budget_kpi_waterfall.png
What it shows
A horizontal waterfall bar chart decomposing the predicted KPI into its constituent components: base (intercept), trend, seasonality, holidays, controls, and individual media channels. Each bar shows the mean posterior coefficient multiplied by the mean predictor value — the average contribution of that component to the predicted KPI. A red TOTAL bar anchors the sum.
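The mean-times-mean decomposition behind each bar is straightforward. A Python sketch with hypothetical posterior and predictor means (not runner output; the intercept multiplies 1.0 by construction):

```python
def waterfall_contributions(coef_means, predictor_means):
    """Average contribution per component: posterior-mean coefficient
    times the predictor's sample mean (1.0 for the intercept)."""
    return {name: coef_means[name] * predictor_means.get(name, 1.0)
            for name in coef_means}

# Hypothetical posterior means and predictor means (not runner output):
coef = {"(Intercept)": 120.0, "trend": 0.4,
        "tv_spend": 0.02, "search_spend": 0.05}
means = {"trend": 52.0, "tv_spend": 800.0, "search_spend": 300.0}
contrib = waterfall_contributions(coef, means)
total = sum(contrib.values())  # anchors the TOTAL bar
```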
When it is generated
The runner generates this plot when build_kpi_waterfall_data() can extract posterior coefficients and match them to predictor means in the original data. This requires that the model’s .formula and .original_data are both accessible. For hierarchical models with random-effects syntax, the waterfall may fail gracefully and be skipped.
How to interpret it
The waterfall answers: “of the total predicted KPI, how much comes from each source?” The base (intercept) typically dominates, representing baseline demand independent of media and controls. Media channels sit at the bottom, showing their individual incremental contributions. The relative sizes of the media bars correspond to the decomposition impact chart (decomp_predictor_impact.png), but computed slightly differently (mean × mean vs sum over time).
Warning signs
Negative media contributions: A channel with a negative bar reduces predicted KPI. Unless the coefficient is intentionally unconstrained, this suggests a fitting or identification problem.
Intercept dwarfs all other terms: The model attributes nearly all KPI to baseline demand. Media effects are marginal. This may be realistic for low-spend brands but limits the value of budget optimisation.
Missing plot (skipped with warning): The model type does not support direct waterfall decomposition.
Action
Use the waterfall to contextualise media contributions within the total predicted KPI. For stakeholder reporting, it provides a clear answer to “what drives our KPI?” — while emphasising that media is one factor among several.
Related artefacts
budget_kpi_waterfall.csv in 60_optimisation/ contains the waterfall data.
Marginal ROI curves
Filename: budget_marginal_roi.png
What it shows
Faceted line charts of marginal ROI (or marginal response, if no currency conversion is configured) as a function of spend for each channel. The marginal value is computed as the first difference of the response curve: the additional response per additional unit of spend. Current and optimised points are marked.
When it is generated
The runner generates this plot whenever the optimisation result includes response curve data with at least two points per channel.
How to interpret it
The marginal ROI curve is the derivative of the response curve. At the optimised allocation, the allocator equalises marginal ROI across channels (subject to constraints). If one channel’s marginal ROI at the optimised point is substantially higher than another’s, a constraint (spend floor or ceiling) is preventing further reallocation.
Diminishing returns appear as a downward-sloping marginal curve: each additional unit of spend yields less incremental response than the last. Channels with steeper slopes saturate faster.
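The first-difference computation is simple to reproduce. A Python sketch with a stand-in response curve (the runner differences its fitted curves in R):

```python
import numpy as np

def marginal_response(spend_grid, response_curve):
    """First difference of the response curve: incremental response
    per incremental unit of spend between grid points."""
    spend_grid = np.asarray(spend_grid, dtype=float)
    response_curve = np.asarray(response_curve, dtype=float)
    return np.diff(response_curve) / np.diff(spend_grid)

# A saturating (concave) curve has a strictly decreasing marginal:
spend = np.linspace(1.0, 100.0, 100)
response = np.sqrt(spend)  # stand-in for a fitted saturation curve
mroi = marginal_response(spend, response)
```

An upward-sloping stretch in this output is exactly the "increasing returns" warning sign described below: the differenced curve makes it easy to spot numerically as well as visually.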
Warning signs
Marginal ROI near zero at the optimised point: The channel is at or near saturation. Additional spend yields negligible incremental response.
Marginal ROI that increases with spend: This implies increasing returns, which is unusual for media. It may indicate a response curve misspecification or insufficient data in the high-spend region.
Large differences in marginal ROI at the optimised points across channels: Constraints are binding. The allocator cannot equalise marginal returns because spend bounds prevent it.
Action
Use marginal ROI to identify which channels have headroom (high marginal ROI at the optimised point) and which are saturated (marginal ROI near zero). This informs not just the current allocation but also the value of relaxing spend constraints.
Spend share comparison
Filename: budget_spend_share.png
What it shows
Two horizontal stacked bars showing the percentage allocation of total budget across channels: one for the current allocation and one for the optimised allocation. Percentage labels appear within each segment (for segments ≥ 4% of total). The subtitle reports the total budget in currency or model units for both allocations.
When it is generated
The runner generates this plot whenever the optimisation result includes a roi_cpa summary with spend_reference and spend_optimised columns.
How to interpret it
This is the most intuitive optimisation output for non-technical stakeholders. It answers: “how should we split the budget?” Segments that grow from current to optimised represent channels the allocator recommends investing more in; segments that shrink represent channels to reduce.
Warning signs
A channel disappears (0% share) in the optimised allocation: The allocator has hit the channel’s spend floor (which may be zero). If this is unintended, raise the minimum spend constraint.
Allocations are nearly identical: The current mix is already near-optimal, or the model cannot distinguish channel effects well enough to justify reallocation.
Very small segments in both allocations: Channels with negligible spend share contribute little to the optimisation. Consider whether they should be included or grouped.
Action
Present this chart as the primary recommendation visual. Accompany it with the confidence comparison to communicate the certainty of the recommendation and the allocation impact chart to show the expected consequence.
Cross-references
Post-run plots — decomposition that informs the optimisation inputs
Model selection plots — LOO diagnostics that validate the model underlying these recommendations