Debug Run Failures

Objective

Diagnose and resolve the most common failure modes encountered when running DSAMbayes via the YAML/CLI runner.

Prerequisites

  • A failed runner execution (non-zero exit code or missing artefacts).
  • Access to the terminal output or log from the failed run.
  • Familiarity with CLI Usage and Config Schema.
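If the output of the failed run was not saved, a small wrapper can capture it on the next attempt. The sketch below is not part of DSAMbayes: run_logged is a hypothetical helper that writes the stdout and stderr of any command to a log file and preserves the command's exit code.

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper (not part of DSAMbayes): runs COMMAND, writes its
# stdout and stderr to LOGFILE, appends the exit code, and returns it.
run_logged() {
  log="$1"
  shift
  "$@" > "$log" 2>&1
  status=$?
  printf 'exit code: %s\n' "$status" >> "$log"
  return "$status"
}

# Example, using the validate command shown under Stage 0:
# run_logged run.log Rscript scripts/dsambayes.R validate --config config/my_config.yaml
```

The log then contains both the error text and the exit code, which is enough to start the triage below.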

Triage by failure stage

Stage 0: Config resolution failures

Symptoms: runner exits immediately after validate or at the start of run; no run directory created or only 00_run_metadata/ is present.

Error pattern | Cause | Fix
data_path not found | Data file path is wrong or missing | Check data.path in YAML; use absolute path or path relative to config file
Unknown YAML key | Typo or unsupported config key | Compare against Config Schema; fix spelling
Formula parse error | Invalid R formula syntax | Check model.formula for unmatched parentheses, missing ~, or invalid operators
holidays.calendar_path not found | Holiday calendar file missing | Check time_components.holidays.calendar_path; ensure file exists

Quick check:

Rscript scripts/dsambayes.R validate --config config/my_config.yaml

If validate passes, the config is structurally valid.
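Validation can also gate the full run so that a broken config never reaches sampling. This is a minimal sketch: validate_then_run is a hypothetical helper, and the exact form of the run subcommand (run --config ...) is assumed by analogy with the validate command above, so check it against CLI Usage.

```shell
# validate_then_run CONFIG
# Hypothetical helper: starts the full run only if validation succeeds.
# The "run --config" invocation is an assumption modelled on the
# documented "validate --config" command.
validate_then_run() {
  config="$1"
  runner="scripts/dsambayes.R"
  if Rscript "$runner" validate --config "$config"; then
    Rscript "$runner" run --config "$config"
  else
    echo "validate failed for $config; fix the config before running" >&2
    return 1
  fi
}

# Example:
# validate_then_run config/my_config.yaml
```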

Stage 1: Stan compilation failures

Symptoms: runner reports compilation errors after “Compiling model”; may reference C++ or Stan syntax errors.

Error pattern | Cause | Fix
Stan syntax error in generated template | Template rendering issue | Clear the Stan cache (rm -rf .cache/dsambayes/) and retry
C++ compiler not found | Toolchain not installed | Install a C++ toolchain (see Install and Setup)
Permission denied on cache directory | Cache path not writable | Set XDG_CACHE_HOME to a writable directory
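A quick way to rule out the "C++ compiler not found" case is to look for a compiler on PATH. The sketch below is an assumption-laden check, not a DSAMbayes feature: first_available is a hypothetical helper, the compiler names are common defaults, and R may be configured to use a different compiler than the one found here.

```shell
# first_available NAME...
# Hypothetical helper: prints the path of the first listed command that
# exists on PATH and returns 0; returns 1 if none is found.
first_available() {
  for name in "$@"; do
    if found=$(command -v "$name" 2>/dev/null); then
      printf '%s\n' "$found"
      return 0
    fi
  done
  return 1
}

# Example:
# first_available g++ clang++ c++ || echo "no C++ compiler on PATH" >&2
```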

Quick fix (point the cache at a writable, project-local directory):

mkdir -p .cache
export XDG_CACHE_HOME="$PWD/.cache"
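The two commands above can be wrapped so that the variable is only exported when the directory really is writable. This is a sketch with a hypothetical helper name, ensure_cache_dir; it does nothing DSAMbayes-specific beyond setting XDG_CACHE_HOME.

```shell
# ensure_cache_dir [DIR]
# Hypothetical helper: creates DIR (default: ./.cache), verifies it is
# writable, and exports XDG_CACHE_HOME only if the check passes.
ensure_cache_dir() {
  dir="${1:-$PWD/.cache}"
  mkdir -p "$dir" || return 1
  if [ -w "$dir" ]; then
    export XDG_CACHE_HOME="$dir"
  else
    echo "cache directory is not writable: $dir" >&2
    return 1
  fi
}

# Example:
# ensure_cache_dir && echo "cache: $XDG_CACHE_HOME"
```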

Stage 2: Data preparation failures

Symptoms: runner fails after compilation but before sampling; error messages reference prep_data_for_fit, model.frame, or scaling.

Error pattern | Cause | Fix
“Cannot scale model data with zero variance” | A column in the model frame is constant | Remove constant terms from formula, or set model.scale: false
“Constant CRE mean terms” | CRE variable has identical group means | Use model.type: re (without CRE) or add variation
“non-finite values” in model frame | NA or Inf values in data | Clean data before running; remove rows with missing values
“Offset vector length does not match” | NA handling created length mismatch | Ensure offset column has no NA values, or report as a bug
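Constant columns and non-finite values can often be spotted in the input file before the runner is started. The sketch below assumes the data file is a simple comma-separated CSV with a header row and no quoted commas, which this page does not state; check_csv_columns is a hypothetical helper, and it inspects the raw file rather than the model frame that DSAMbayes builds.

```shell
# check_csv_columns FILE
# Hypothetical helper. Assumes a plain CSV: header row, comma-separated,
# no quoted fields. Prints one tab-separated line per finding:
#   non-finite <column> <count>   for empty, NA, NaN, Inf or -Inf cells
#   constant   <column>           for columns whose values never change
check_csv_columns() {
  awk -F',' '
    NR == 1 { ncol = NF; for (i = 1; i <= NF; i++) name[i] = $i; next }
    {
      for (i = 1; i <= ncol; i++) {
        v = $i
        if (v == "" || v == "NA" || v == "NaN" || v == "Inf" || v == "-Inf") bad[i]++
        if (NR == 2) first[i] = v
        else if (v != first[i]) varies[i] = 1
      }
    }
    END {
      for (i = 1; i <= ncol; i++) {
        if (bad[i] > 0) printf "non-finite\t%s\t%d\n", name[i], bad[i]
        if (NR > 2 && !varies[i]) printf "constant\t%s\n", name[i]
      }
    }
  ' "$1"
}

# Example:
# check_csv_columns data/my_data.csv   # hypothetical path
```

A column flagged here is only a candidate: whether it triggers a failure depends on whether the formula actually uses it.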

Stage 3: Sampling failures

Symptoms: runner fails during rstan::sampling() or rstan::optimizing(); may report Stan runtime errors.

Error pattern | Cause | Fix
“Exception: validate transformed params” | Parameter hits boundary during sampling | Widen boundaries; check for overly tight constraints
“Initialization failed” | Poor initial values | Increase fit.mcmc.init range or simplify model
Timeout or very slow sampling | Model too complex for data size | Reduce iterations for initial testing; simplify formula
All chains fail | Severe model misspecification | Review formula, priors, and data for fundamental issues
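The quoted messages in the Stage 2 and Stage 3 tables can be searched for in a saved log to narrow down the failing stage. This is a sketch: classify_log is a hypothetical helper, and it assumes the log contains those messages verbatim, which may not hold for every DSAMbayes version.

```shell
# classify_log LOGFILE
# Hypothetical helper: prints "<stage><TAB><pattern>" for every quoted
# error pattern from the Stage 2 and Stage 3 tables found in LOGFILE.
# Matching is literal (grep -F), not regular-expression based.
classify_log() {
  log="$1"
  while IFS='|' read -r stage pattern; do
    if grep -qF -- "$pattern" "$log"; then
      printf '%s\t%s\n' "$stage" "$pattern"
    fi
  done <<'EOF'
Stage 2|Cannot scale model data with zero variance
Stage 2|Constant CRE mean terms
Stage 2|non-finite values
Stage 2|Offset vector length does not match
Stage 3|validate transformed params
Stage 3|Initialization failed
EOF
}

# Example:
# classify_log run.log
```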

Stage 4: Post-fit artefact failures

Symptoms: runner completes sampling but some artefact folders are empty or missing files.

Error pattern | Cause | Fix
Missing 30_post_run/ files | Decomposition failed | Check formula compatibility with model.matrix(); hierarchical formulas with `
Missing 40_diagnostics/ files | Diagnostics writer error | Check for upstream issues in model object; review tryCatch messages in log
Missing 50_model_selection/ files | LOO computation failed | Ensure MCMC fit (not MAP); check for valid posterior
Missing 60_optimisation/ files | Allocation not enabled or failed | Check allocation.enabled: true in config; review scenario specification

Quick check:

find results/<run_dir> -type f | sort

Compare against the expected artefact list in Output Artefacts.
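A per-folder summary makes empty or missing stages easier to spot than a flat file listing. The sketch below uses a hypothetical helper, report_stage_dirs, and checks only the folder names mentioned on this page; the full set of expected folders is defined in Output Artefacts and may include others.

```shell
# report_stage_dirs RUN_DIR
# Hypothetical helper: for each artefact folder named on this page,
# prints "<folder><TAB>missing", "<folder><TAB>empty", or
# "<folder><TAB><n> file(s)".
report_stage_dirs() {
  run_dir="$1"
  for stage in 00_run_metadata 30_post_run 40_diagnostics \
               50_model_selection 60_optimisation; do
    dir="$run_dir/$stage"
    if [ ! -d "$dir" ]; then
      printf '%s\tmissing\n' "$stage"
      continue
    fi
    n=$(find "$dir" -type f | wc -l | tr -d ' ')
    if [ "$n" -eq 0 ]; then
      printf '%s\tempty\n' "$stage"
    else
      printf '%s\t%s file(s)\n' "$stage" "$n"
    fi
  done
}

# Example (substitute the actual run directory):
# report_stage_dirs "results/my_run"
```

Note that a missing 60_optimisation/ folder is expected when allocation is not enabled, so "missing" is not always an error.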

Stage 5: Plot generation failures

Symptoms: CSV artefacts are present but PNG plot files are missing.

Error pattern | Cause | Fix
“cannot open connection” for PNG | Graphics device issue | Check that grDevices is available; ensure sufficient disk space
Plot function error for hierarchical model | Group-level coefficient draws are vectors, not scalars | Fixed in v1.2.0; upgrade to v1.2.0 or later
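To confirm the symptom, compare the number of CSV and PNG files in the run directory and check free disk space. This is a sketch with a hypothetical helper, count_files_with_ext; it assumes artefacts use the .csv and .png extensions as described above.

```shell
# count_files_with_ext DIR EXT
# Hypothetical helper: prints how many regular files under DIR have the
# given extension (searched recursively).
count_files_with_ext() {
  find "$1" -type f -name "*.$2" | wc -l | tr -d ' '
}

# Example (substitute the actual run directory):
# count_files_with_ext "results/my_run" csv
# count_files_with_ext "results/my_run" png
# df -h results    # free space on the filesystem holding the results
```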

General debugging steps

  1. Read the full error message. DSAMbayes uses cli::cli_abort() with descriptive messages that identify the failing function and parameter.

  2. Check the resolved config. If a run directory was created, inspect 00_run_metadata/config.resolved.yaml to see what defaults were applied.

  3. Check session info. Inspect 00_run_metadata/session_info.txt for package version mismatches.

  4. Clear the Stan cache. Stale compiled models can cause unexpected failures:

    rm -rf .cache/dsambayes/

  5. Run validate before run. Always validate first to catch config errors before committing to a full MCMC run.

  6. Reduce iterations for debugging. Use a minimal config with fit.mcmc.iter: 200 and fit.mcmc.warmup: 100 to iterate quickly on formula and data issues.
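The reduced-iteration settings from step 6 can be kept in a small fragment for reuse. The sketch below assumes that the dotted keys fit.mcmc.iter and fit.mcmc.warmup map to nested YAML keys, as the other dotted paths on this page suggest; write_debug_fit_fragment is a hypothetical helper, and the fragment must be merged by hand into a copy of a complete config.

```shell
# write_debug_fit_fragment FILE
# Hypothetical helper: writes the reduced-iteration settings from step 6
# to FILE as a YAML fragment. The nesting is an assumption derived from
# the dotted key names; this is not a complete DSAMbayes config.
write_debug_fit_fragment() {
  cat > "$1" <<'EOF'
fit:
  mcmc:
    iter: 200
    warmup: 100
EOF
}

# Example:
# write_debug_fit_fragment config/debug_fit_fragment.yaml
```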