Why Stress Testing Matters
MMM outputs look like decisions even when they are guesses. The model can produce confident-sounding numbers that fall apart under perturbation. Stress testing is the discipline of pushing the model in defined ways and checking whether the conclusions survive. The conclusions that survive are operationally trustworthy; the ones that do not are placeholders.
Step 1: Sensitivity Analysis on Priors
Refit the model with priors set at the upper and lower bounds of their plausible range. Compare key conclusions (channel contributions, budget allocation recommendation) across the refits. Conclusions that change materially are driven by the priors, not by the data; conclusions that stay stable are data-driven and more trustworthy.
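The comparison across refits can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `prior_sensitivity`, the 20 percent tolerance, and the contribution figures are all hypothetical, and the refits themselves are assumed to have been produced by whatever MMM tooling is in use.

```python
def prior_sensitivity(contributions_by_prior, tolerance=0.20):
    """Flag channels whose contribution moves materially across prior settings.

    contributions_by_prior maps a prior setting ("low", "base", "high") to a
    dict of channel -> contribution share. The spread is measured relative to
    the base fit; the 20 percent tolerance is an illustrative assumption.
    """
    baseline = contributions_by_prior["base"]
    flagged = {}
    for channel, base_value in baseline.items():
        values = [fit[channel] for fit in contributions_by_prior.values()]
        spread = (max(values) - min(values)) / abs(base_value)
        if spread > tolerance:
            flagged[channel] = round(spread, 3)
    return flagged


# Hypothetical contribution shares from three refits.
refits = {
    "low":  {"tv": 0.18, "search": 0.12, "social": 0.03},
    "base": {"tv": 0.20, "search": 0.12, "social": 0.06},
    "high": {"tv": 0.21, "search": 0.13, "social": 0.10},
}
flagged = prior_sensitivity(refits)
print(flagged)  # only "social" exceeds the tolerance in this example
```

In this made-up example the social contribution more than triples between the low and high prior settings, so any conclusion about social would be reported as prior-driven.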
Step 2: Sensitivity Analysis on Spec
Refit under alternative specifications: different adstock functions, different saturation forms, and with minor channels added or removed. Conclusions that hold across spec variations are robust; conclusions that depend on a specific spec choice are spec-dependent and should be reported with that caveat.
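To make "alternative adstock functions" and "alternative saturation forms" concrete, here is a sketch of one common adstock form (geometric decay) and two common saturation forms (Hill and exponential). These are standard textbook transformations chosen for illustration; the original text does not say which forms the model uses, and the parameter values below are arbitrary.

```python
import math


def geometric_adstock(spend, decay):
    """Carry over a fixed fraction (decay) of the previous period's effect."""
    carried, out = 0.0, []
    for x in spend:
        carried = x + decay * carried
        out.append(carried)
    return out


def hill_saturation(x, half_saturation, shape=1.0):
    """Hill curve: response reaches 0.5 when x equals half_saturation."""
    return x**shape / (x**shape + half_saturation**shape)


def exponential_saturation(x, scale):
    """Exponential curve: diminishing returns approaching 1 as x grows."""
    return 1.0 - math.exp(-x / scale)


spend = [100.0, 0.0, 0.0, 50.0]
adstocked = geometric_adstock(spend, decay=0.5)

# Two spec variants applied to the same adstocked series.
spec_variants = {
    "hill": [hill_saturation(a, half_saturation=100.0) for a in adstocked],
    "exponential": [exponential_saturation(a, scale=100.0) for a in adstocked],
}
for name, transformed in spec_variants.items():
    print(name, [round(v, 3) for v in transformed])
```

Each variant would feed a separate refit; the stress test is whether the headline conclusions agree across those refits.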
Step 3: Holdout Validation
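Hold out the most recent eight weeks. Refit on the remainder. Predict the holdout. Compare predicted to actual values using mean absolute percentage error (MAPE) or another error metric. As a rule of thumb, holdout MAPE below 10 percent is healthy for most consumer categories and above 15 percent is a problem; values in between warrant a closer look before the model is relied on.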
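A minimal sketch of the holdout split and the MAPE calculation follows. The revenue figures and predictions are invented for illustration, the helper names are hypothetical, and the "borderline" label for the 10 to 15 percent range is an assumption added here, since the thresholds above only name the two ends.

```python
def split_holdout(series, holdout_weeks=8):
    """Split a weekly series into a training part and the most recent weeks."""
    return series[:-holdout_weeks], series[-holdout_weeks:]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def grade(mape_pct):
    """Apply the 10 / 15 percent thresholds; 'borderline' is an assumed label."""
    if mape_pct < 10.0:
        return "healthy"
    if mape_pct > 15.0:
        return "problem"
    return "borderline"


# Hypothetical weekly revenue; the last eight weeks form the holdout.
weekly_revenue = [90.0, 95.0, 98.0, 102.0,
                  100.0, 110.0, 120.0, 130.0, 125.0, 115.0, 105.0, 100.0]
train, holdout = split_holdout(weekly_revenue)

# Stand-in for predictions from a model refit on `train` only.
predicted = [95.0, 115.0, 114.0, 136.5, 120.0, 120.0, 100.0, 105.0]

score = mape(holdout, predicted)
print(f"holdout MAPE: {score:.1f}% ({grade(score)})")
```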
Step 4: Calibration Consistency Check
For channels where lift tests have been run, compare the MMM-implied lift for the same intervention to the test result. The MMM estimate should fall within the test's confidence interval. Persistent disagreement across channels indicates a systematic spec issue; isolated disagreement indicates a channel-specific issue.
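The check can be expressed as a simple interval test per channel. The sketch below is illustrative only: the lift values and intervals are made up, and treating "more than one channel outside its interval" as systematic is an assumed cutoff rather than a rule from the text.

```python
def calibration_check(mmm_lift, test_ci):
    """Return channel -> True if the MMM-implied lift is inside the test CI.

    Only channels that have a lift test (keys of test_ci) are checked.
    """
    return {
        channel: low <= mmm_lift[channel] <= high
        for channel, (low, high) in test_ci.items()
    }


def diagnose(results):
    """Classify disagreement; the 'more than one miss' cutoff is an assumption."""
    misses = [channel for channel, ok in results.items() if not ok]
    if not misses:
        return "consistent"
    if len(misses) == 1:
        return f"channel-specific issue: {misses[0]}"
    return "possible systematic spec issue: " + ", ".join(misses)


# Hypothetical MMM-implied lifts and lift-test confidence intervals.
mmm_lift = {"tv": 0.08, "search": 0.05, "social": 0.02}
test_ci = {"tv": (0.05, 0.11), "search": (0.06, 0.10)}

results = calibration_check(mmm_lift, test_ci)
print(results)
print(diagnose(results))
```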
Step 5: Decomposition Plausibility
Inspect the contribution decomposition. Base demand should be 30 to 60 percent of revenue for a mature brand. No single channel should contribute more than 30 to 40 percent in a diversified marketing portfolio. Seasonality's contribution should be in proportion to how seasonal the category actually is. Implausible decompositions (negative channel effects, channels with 60+ percent share) indicate spec issues.
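These plausibility ranges lend themselves to an automated check. The following is a rough sketch under stated assumptions: the function name is hypothetical, the decomposition is assumed to arrive as a dict with a "base" key and an optional "seasonality" key, and the channel cap is set at the upper end (40 percent) of the range given above.

```python
def decomposition_flags(contributions, base_range=(0.30, 0.60), channel_cap=0.40):
    """Return a list of plausibility warnings for a revenue decomposition.

    contributions maps component name -> revenue contribution. "base" is
    required; "base" and "seasonality" are not treated as channels.
    """
    total = sum(contributions.values())
    shares = {name: value / total for name, value in contributions.items()}
    flags = []

    low, high = base_range
    if not low <= shares["base"] <= high:
        flags.append(f"base: share {shares['base']:.2f} outside {low:.2f}-{high:.2f}")

    for name, share in shares.items():
        if name in ("base", "seasonality"):
            continue
        if share < 0:
            flags.append(f"{name}: negative contribution")
        elif share > channel_cap:
            flags.append(f"{name}: share {share:.2f} above cap {channel_cap:.2f}")
    return flags


# Hypothetical decomposition (revenue units are arbitrary).
decomposition = {"base": 450, "seasonality": 100, "tv": 200, "search": 300, "social": -50}
flags = decomposition_flags(decomposition)
print(flags)
```

A flag is a prompt to inspect the spec, not proof of an error; a brand with a genuinely concentrated channel mix may legitimately exceed the cap.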
Step 6: Time-Stability Check
Refit on expanding windows (years one and two, years one through three, years one through four) and compare coefficients across refits. Coefficients should be stable across windows except where genuine structural change has occurred. Wild swings without structural justification indicate spec issues or data quality problems.
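One way to compare coefficients across windows is to measure the relative change between consecutive refits. The sketch below assumes the per-window coefficients have already been estimated; the window labels, coefficient values, and the 25 percent threshold are all illustrative assumptions.

```python
def unstable_coefficients(coefs_by_window, max_relative_change=0.25):
    """Flag channels whose coefficient jumps between consecutive windows.

    coefs_by_window maps window label -> {channel: coefficient}, in the order
    the windows were fit. Coefficients are assumed to be non-zero.
    """
    windows = list(coefs_by_window)
    flagged = {}
    for prev, curr in zip(windows, windows[1:]):
        for channel, prev_value in coefs_by_window[prev].items():
            curr_value = coefs_by_window[curr][channel]
            change = abs(curr_value - prev_value) / abs(prev_value)
            if change > max_relative_change:
                flagged.setdefault(channel, []).append((prev, curr, round(change, 2)))
    return flagged


# Hypothetical coefficients from three refits that all start in year one.
coefs = {
    "y1-2": {"tv": 0.50, "search": 0.30},
    "y1-3": {"tv": 0.52, "search": 0.31},
    "y1-4": {"tv": 0.51, "search": 0.60},
}
unstable = unstable_coefficients(coefs)
print(unstable)  # search nearly doubles between the last two windows
```

A flagged channel then needs a judgment call: either a documented structural change explains the jump, or the swing points to a spec or data quality problem.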
Step 7: Document the Audit
The stress testing pack should be documented alongside the model methodology pack. Each test result (sensitivity findings, holdout MAPE, calibration agreement, decomposition plausibility, time stability) is a piece of evidence for or against the model's trustworthiness. Boards and finance teams that see the audit pack respond more confidently to the model's conclusions than to the conclusions alone.
How Presenc AI Helps
Presenc AI provides the AI visibility data that supports the stress testing process. Stable methodology means the AI variable is comparable across stress tests; the data layer does not introduce artifactual variation that the model has to absorb. For the calibration consistency check on the AI channel specifically, Presenc supports the geographic lift testing that produces the calibration anchor.