Diagnostics Plots

Purpose

Diagnostics plots assess whether the fitted model’s assumptions hold and whether any structural problems warrant remedial action. They cover residual behaviour, posterior predictive adequacy, and boundary constraint monitoring. These plots are written to 40_diagnostics/ within the run directory.

The runner generates residual plots via write_residual_diagnostics() in R/run_artifacts_diagnostics.R, the PPC plot via write_model_fit_plots() in R/run_artifacts_enrichment.R, and the boundary hits plot via write_boundary_diagnostics() in R/run_artifacts_diagnostics.R. Each plot is wrapped in tryCatch so that individual failures do not block the remaining outputs.

Plot catalogue

  • ppc.png: posterior predictive check fan chart. Requires posterior draws (yhat) extractable from the fitted model.
  • residuals_timeseries.png: residuals over time. Requires the fit table.
  • residuals_vs_fitted.png: residuals vs fitted values. Requires the fit table.
  • residuals_hist.png: residual distribution histogram. Requires the fit table.
  • residuals_acf.png: residual autocorrelation function. Requires the fit table.
  • residuals_latent_acf.png: latent-scale residual ACF. Requires a log-scale response (response_scale != "identity").
  • boundary_hits.png: posterior draw proximity to coefficient bounds. Requires boundary hit rates computable from the posterior and bound specifications.

Posterior predictive check (PPC)

Filename: ppc.png


What it shows

A fan chart of posterior predictive draws overlaid with observed data. The blue line is the posterior mean of the predicted response; the dark band spans the 25th–75th percentile (50% CI) and the light band spans the 5th–95th percentile (90% CI). Red dots mark observed values.

When it is generated

The runner generates this plot whenever posterior predictive draws (yhat) can be extracted from the fitted model via runner_yhat_draws(). This works for BLM, hierarchical, and pooled models fitted with MCMC.

How to interpret it

Well-calibrated models produce bands that contain roughly 50% and 90% of observed points in the respective intervals. The key diagnostic is whether observed values fall systematically outside the bands during specific periods — this reveals time-localised misfit that aggregate metrics like RMSE can mask.
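The calibration check described above can be sketched numerically. The following is an illustration in Python (the runner itself is R); `ppc_coverage` and the simulated draws are hypothetical names, not part of the codebase. It computes the share of observed points falling inside the 50% and 90% posterior predictive intervals, which should land near 0.5 and 0.9 for a well-calibrated model.

```python
import numpy as np

def ppc_coverage(draws, observed, lower_q, upper_q):
    """Share of observations inside the [lower_q, upper_q] posterior
    predictive interval. `draws` has shape (n_draws, n_obs)."""
    lo = np.quantile(draws, lower_q, axis=0)
    hi = np.quantile(draws, upper_q, axis=0)
    inside = (observed >= lo) & (observed <= hi)
    return float(inside.mean())

rng = np.random.default_rng(0)
n_obs = 200
mu = rng.normal(0.0, 1.0, size=n_obs)                   # latent mean per observation
observed = mu + rng.normal(0.0, 1.0, size=n_obs)        # observed data
draws = mu + rng.normal(0.0, 1.0, size=(4000, n_obs))   # posterior predictive draws

cov50 = ppc_coverage(draws, observed, 0.25, 0.75)  # near 0.5 when calibrated
cov90 = ppc_coverage(draws, observed, 0.05, 0.95)  # near 0.9 when calibrated
```

Coverage far below nominal flags overconfident bands; coverage far above nominal flags bands that are too wide.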

Warning signs

  • Observed points consistently outside the 90% band: The model underestimates uncertainty or misses a structural feature (holiday, promotion, regime change).
  • Bands that widen dramatically in specific periods: The model is uncertain about those periods, possibly because the training data lacks similar observations.
  • Bands that are uniformly very wide: The noise prior may be too diffuse, or the model has too many weakly identified parameters.

Action

If the PPC reveals localised misfit, check whether the affected periods correspond to missing control variables (holidays, events). If the bands are too wide overall, consider tightening the noise prior or simplifying the formula. Cross-reference with the LOO-PIT histogram for an aggregate calibration assessment.


Residuals over time

Filename: residuals_timeseries.png


What it shows

A line chart of residuals (observed minus posterior mean) over time. A horizontal reference line at zero marks perfect fit. For hierarchical models, the plot facets by group.

When it is generated

Always, provided the fit table is available.

How to interpret it

Residuals should scatter randomly around zero with no discernible trend or seasonal pattern. Any structure in the residuals indicates that the model has failed to capture a systematic component of the data.

Warning signs

  • Trend in residuals: The model’s trend specification is inadequate. Consider adding a higher-order polynomial or a structural-break term.
  • Seasonal oscillation: The Fourier harmonics or holiday dummies are insufficient. Add more harmonics or specific event indicators.
  • Clusters of large residuals: Localised misfit — check corresponding dates for data anomalies.

Action

Residual structure that persists across multiple weeks warrants a formula revision. Short isolated spikes are often data outliers and may not require model changes.
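One simple screen for the trend warning sign is the least-squares slope of residuals against the time index. The sketch below is illustrative Python, not the runner's R code; `residual_trend_slope` is a hypothetical helper.

```python
import numpy as np

def residual_trend_slope(residuals):
    """Least-squares slope of residuals on the time index: a crude
    screen for a linear trend the model failed to absorb."""
    residuals = np.asarray(residuals, dtype=float)
    t = np.arange(len(residuals), dtype=float)
    return float(np.polyfit(t, residuals, 1)[0])

rng = np.random.default_rng(2)
noise = rng.normal(0.0, 1.0, size=104)        # two years of weekly residuals, no structure
drifting = noise + 0.05 * np.arange(104)      # residuals with a leftover upward trend
```

A slope several standard errors from zero is the numeric counterpart of the visible drift in the plot.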


Residuals vs fitted

Filename: residuals_vs_fitted.png


What it shows

A scatter plot of residuals (y-axis) against posterior mean fitted values (x-axis), with a horizontal reference at zero. For hierarchical models, the plot facets by group.

When it is generated

Always, provided the fit table is available.

How to interpret it

The scatter should form a horizontal band centred on zero with roughly constant vertical spread across the fitted-value range. Patterns in this plot diagnose specific model violations.

Warning signs

  • Funnel shape (wider spread at higher fitted values): Heteroscedasticity. A log-scale model would be more appropriate.
  • Curvature: The mean function is misspecified. The model under- or over-predicts at the extremes.
  • Discrete clusters: May indicate grouping structure that the model does not account for.

Action

Heteroscedasticity in a levels model is the most common finding. If the funnel pattern is pronounced, re-fit on the log scale and compare diagnostics. Cross-reference with the fit scatter plot, which shows the same information from a different angle.
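The funnel shape can be quantified by comparing residual spread across the fitted-value range. This Python sketch (illustrative only; `spread_ratio` is a hypothetical helper, not in the codebase) splits observations at the median fitted value and takes the ratio of residual standard deviations.

```python
import numpy as np

def spread_ratio(fitted, residuals):
    """Residual standard deviation in the top half of fitted values
    divided by that in the bottom half. Values well above 1 are the
    numeric signature of the funnel shape."""
    fitted = np.asarray(fitted, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    order = np.argsort(fitted)
    half = len(order) // 2
    return float(residuals[order[half:]].std() / residuals[order[:half]].std())

rng = np.random.default_rng(3)
fitted = np.linspace(1.0, 10.0, 500)
hetero = rng.normal(0.0, 0.1 * fitted)    # noise grows with the fitted value
homo = rng.normal(0.0, 1.0, size=500)     # constant-variance noise
```

Ratios near 1 are consistent with constant variance; ratios well above 1 support re-fitting on the log scale.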


Residual distribution

Filename: residuals_hist.png


What it shows

A histogram of residuals across all observations (40 bins). For hierarchical models with six or fewer groups, the histogram facets by group.

When it is generated

Always, provided the fit table is available.

How to interpret it

The distribution should be approximately symmetric and unimodal if the Normal noise assumption holds. Heavy tails or skewness indicate departures from normality.

Warning signs

  • Strong right skew: Common in levels models when the response is strictly positive and has occasional large values. A log transform may help.
  • Bimodality: Suggests a mixture or an omitted grouping variable. Check whether the data contains distinct regimes.
  • Extreme outliers: Individual residuals several standard deviations from the mean warrant data inspection.

Action

Moderate departures from normality in the residuals are tolerable in Bayesian inference — the posterior is still valid if the model is otherwise well-specified. Severe skewness or heavy tails, however, can distort credible intervals and predictive coverage. Consider robust likelihood specifications or transformations.


Residual autocorrelation (ACF)

Filename: residuals_acf.png


What it shows

A bar chart of the sample autocorrelation function of residuals, computed up to lag 26 (roughly half a year of weekly data). Red dashed lines mark the 95% significance bounds (±1.96/√n). For hierarchical models, the plot facets by group.

When it is generated

Always, provided the fit table is available.

How to interpret it

Bars within the significance bounds indicate no serial correlation at that lag. Significant autocorrelation — especially at low lags (1–4 weeks) — means the model misses short-run temporal dependence. Significant spikes at lag 52 (if the series is long enough) suggest residual annual seasonality.
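The quantity plotted, and the ±1.96/√n bounds, can be reproduced directly. This is a minimal Python sketch of the standard sample ACF estimator, for illustration; the runner computes the same thing in R.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation function for lags 1..max_lag, using the
    standard biased estimator (lag-0 variance in the denominator)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    denom = float(np.dot(x, x))
    return np.array([np.dot(x[: n - k], x[k:]) / denom
                     for k in range(1, max_lag + 1)])

rng = np.random.default_rng(4)
white = rng.normal(size=300)          # no serial correlation
ar1 = np.zeros(300)                   # strong lag-1 dependence
for t in range(1, 300):
    ar1[t] = 0.7 * ar1[t - 1] + rng.normal()

bound = 1.96 / np.sqrt(300)                      # the red dashed lines on the plot
flagged = np.abs(sample_acf(ar1, 26)) > bound    # lags exceeding the 95% bounds
```

For white-noise residuals roughly 5% of lags will exceed the bounds by chance, so isolated small exceedances at high lags are not alarming on their own.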

Warning signs

  • Lag-1 ACF > 0.3: Strong short-run autocorrelation. The model’s uncertainty estimates are anti-conservative (credible intervals too narrow), and coefficient estimates may be biased if lagged effects are present.
  • Decaying positive ACF: Suggests an omitted AR component or insufficient adstock decay modelling.
  • Spike at lag 52: Residual annual seasonality not captured by the Fourier terms.

Action

If lag-1 ACF is material, consider adding lagged response terms or increasing the number of Fourier harmonics. For adstock-driven channels, verify that the decay rate is not too fast (underfitting carry-over) or too slow (overfitting noise).


Latent-scale residual ACF

Filename: residuals_latent_acf.png


What it shows

The same ACF plot as above, but computed on the latent (log) scale when the model's response scale is not identity. This applies to models fitted with model.scale: true or with a log-transformed response variable.

When it is generated

The runner generates this plot when response_scale != "identity". It is skipped for levels-scale models.

How to interpret it

Interpretation is identical to the standard ACF plot. The latent-scale version is preferred for log models because autocorrelation in the log residuals is more directly interpretable as a model adequacy check on the scale where inference is performed.

Warning signs

Same as the standard ACF. Compare both plots if both are generated — discrepancies may indicate that the log transformation introduces or removes autocorrelation artefacts.


Boundary hits

Filename: boundary_hits.png


What it shows

A horizontal chart showing, for each constrained coefficient, the share of posterior draws that fall within a tolerance of the finite lower or upper bound. Bars are colour-coded: green (0% hit rate), amber (1–10%), red (≥10%). When all hit rates are zero, the plot displays green dots with explicit “0.0%” labels.

When it is generated

The runner generates this plot when the model has finite boundary constraints set via set_boundary() and boundary hit rates can be computed from the posterior draws. It is written by write_boundary_diagnostics() in R/run_artifacts_diagnostics.R.

How to interpret it

A zero hit rate for all parameters means no posterior draws approached any boundary — the constraints are not binding and the posterior is effectively unconstrained. This is the ideal outcome.

A non-zero hit rate means the boundary is influencing the posterior shape. Moderate rates (1–10%) suggest the data mildly conflicts with the constraint; high rates (≥10%) mean the data wants the coefficient outside the allowed range and the boundary is actively truncating the posterior.
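The hit rate itself is a simple posterior summary. The sketch below illustrates it in Python under an assumed tolerance semantics ("within tol of a finite bound"); the runner's actual definition lives in write_boundary_diagnostics(), and `boundary_hit_rate` here is a hypothetical name.

```python
import numpy as np

def boundary_hit_rate(draws, lower=-np.inf, upper=np.inf, tol=0.01):
    """Share of posterior draws within `tol` of a finite bound.
    The tolerance semantics are an assumption for illustration."""
    draws = np.asarray(draws, dtype=float)
    hits = np.zeros(draws.shape, dtype=bool)
    if np.isfinite(lower):
        hits |= draws <= lower + tol
    if np.isfinite(upper):
        hits |= draws >= upper - tol
    return float(hits.mean())

rng = np.random.default_rng(5)
pinned = np.abs(rng.normal(0.0, 1.0, size=10_000))   # posterior piled up at lower bound 0
clear = rng.normal(5.0, 1.0, size=10_000)            # posterior well away from the bound

rate_pinned = boundary_hit_rate(pinned, lower=0.0, tol=0.1)  # material hit rate
rate_clear = boundary_hit_rate(clear, lower=0.0, tol=0.1)    # effectively zero
```

The `pinned` case mirrors the "pile-up" warning sign below: substantial posterior mass sits against the bound, so the constraint is actively shaping the estimate.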

Warning signs

  • Hit rate ≥10% on a media coefficient: The non-negativity constraint is binding. The true effect may be zero or negative, but the boundary forces a positive estimate. This inflates the channel’s apparent contribution.
  • Hit rate ≥10% on many parameters simultaneously: The overall constraint specification may be too tight for the data. Consider widening bounds or reviewing the formula.
  • Lower-bound hits on a coefficient with strong prior mass at zero: The prior and boundary together may create a “pile-up” at the bound. The posterior is not reflecting the data faithfully.

Action

For channels with high boundary hit rates, critically assess whether the non-negativity constraint is justified by domain knowledge. If the constraint is essential (e.g. media cannot destroy demand), document that the estimate is boundary-driven. If it is not essential, consider relaxing the bound and re-fitting to see whether the unconstrained estimate is materially different.

  • boundary_hits.csv in 40_diagnostics/ provides the per-parameter hit rates in tabular form.
  • diagnostics_report.csv in 40_diagnostics/ includes a summary check for boundary binding.

Hierarchical-specific: within variation

Filename: within_variation.png (generated only for hierarchical models)

This plot shows the within-group variation ratio for each non-CRE (correlated random effects) term: Var(x − mean_g(x)) / Var(x). Low ratios indicate that most variation in a predictor is between groups rather than within groups, making it difficult to identify the coefficient from within-group variation alone. Dashed lines at 5% and 10% mark conventional concern thresholds.
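The ratio has a direct numeric form. This Python sketch (illustrative; `within_variation_ratio` is a hypothetical helper, and the runner computes the ratio in R) demeans each predictor within its group and compares the surviving variance to the total.

```python
import numpy as np

def within_variation_ratio(x, groups):
    """Var(x - group mean of x) / Var(x): the share of a predictor's
    variance that survives group demeaning."""
    x = np.asarray(x, dtype=float)
    groups = np.asarray(groups)
    demeaned = x.copy()
    for g in np.unique(groups):
        mask = groups == g
        demeaned[mask] -= x[mask].mean()
    return float(demeaned.var() / x.var())

rng = np.random.default_rng(6)
groups = np.repeat(["a", "b"], 100)
mostly_between = np.where(groups == "a", 0.0, 10.0) + rng.normal(0.0, 1.0, size=200)
mostly_within = rng.normal(0.0, 1.0, size=200)
```

The `mostly_between` predictor lands far below the 5% and 10% thresholds: almost all of its variance is a group offset, so its coefficient is poorly identified from within-group movement.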

This plot is generated only for hierarchical models and is not included in the standard BLM image set.


Cross-references