How-To Guide

How to Run a Conversion Lift Test

A practical playbook for running platform-side conversion lift studies on Meta, Google, TikTok, and Amazon, covering test design, sample sizing, holdout share, and interpretation.

By Ramanath, CTO & Co-Founder at Presenc AI · Last updated: May 17, 2026

Step 1: Pick the Campaign to Test

Conversion lift studies are most useful on campaigns with meaningful spend, clear conversion events, and stable creative. The test estimates the incremental effect of the campaign during the test window; campaigns that change materially mid-test produce unreliable estimates. Pick campaigns at least 8 weeks into stable operation, not new launches.

The highest-leverage candidates are channels and campaigns where platform-reported ROAS is high and you suspect the platform is taking credit for organic conversions. These are the cases where lift testing most often surfaces large discrepancies between attribution and reality.

Step 2: Define the Conversion Event

Pick the conversion event the test will measure: purchase, signup, qualified lead, app install. The event needs to be tracked reliably and accumulate enough volume during the test window to satisfy the power calculation. High-volume events (page view, click) produce statistically clean tests but are weaker proxies for business value; low-volume events (purchase) are better proxies but require longer test windows.

Step 3: Set the Holdout Share

Platform conversion lift studies typically default to a 5 to 15 percent holdout. Larger holdouts produce tighter confidence intervals but cost more in foregone revenue from the holdout audience. The right size is dictated by the power calculation: choose the smallest holdout that gives enough statistical power to detect the expected effect size at the target significance level.

For most production tests, 10 percent holdout is a reasonable default. Smaller holdouts (5 percent) work for very high-volume campaigns; larger holdouts (20 percent) work when the expected effect is small and detection requires more statistical power.
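To make the trade-off concrete, here is a minimal Python sketch that compares the uncertainty on the measured difference at several holdout shares for a fixed eligible audience. The function name `difference_standard_error`, the 1,000,000-user audience, and the 2 percent baseline rate are illustrative assumptions, and the formula is the textbook two-proportion standard error, not any platform's internal method.

```python
import math


def difference_standard_error(total_users, holdout_share, baseline_rate):
    """Standard error of (exposed rate - holdout rate) when both groups
    convert at roughly the baseline rate (textbook two-proportion formula)."""
    n_holdout = total_users * holdout_share
    n_exposed = total_users - n_holdout
    variance = baseline_rate * (1 - baseline_rate)
    return math.sqrt(variance / n_holdout + variance / n_exposed)


if __name__ == "__main__":
    TOTAL_USERS = 1_000_000   # assumed eligible audience
    BASELINE_RATE = 0.02      # assumed 2 percent conversion rate
    for share in (0.05, 0.10, 0.20):
        se = difference_standard_error(TOTAL_USERS, share, BASELINE_RATE)
        # Approximate 95 percent CI half-width, as a fraction of the baseline rate
        half_width = 1.96 * se / BASELINE_RATE
        print(f"holdout {share:.0%}: CI half-width ~ {half_width:.1%} of baseline")
```

Under these assumptions the interval narrows as the holdout grows toward 50 percent, which is the statistical side of the trade-off; the foregone revenue from the larger holdout is the cost side and is not modeled here.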

Step 4: Run the Power Calculation

Inputs: expected lift (typically 5 to 20 percent relative to the baseline conversion rate for a meaningful campaign), baseline conversion rate, holdout share, target significance level (usually 5 percent), and target power (usually 80 percent). The output is the required exposed audience size and test duration. Most platforms run this calculation automatically; the analyst should verify that the recommendation matches an independent calculation.

Common error: running the test "as long as we can" instead of for the duration the power calculation requires. Underpowered tests produce null results that get misread as "the campaign does not work."
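An independent check of the platform's recommendation can be sketched in a few lines of Python. This is one standard approximation (a two-sided two-proportion z-test with unequal group sizes), not necessarily the method any given platform uses. The function names, the example inputs, and the idea of dividing by a constant daily count of newly eligible users to get a duration are all assumptions for illustration.

```python
import math
from statistics import NormalDist


def required_audience(baseline_rate, relative_lift, holdout_share,
                      alpha=0.05, power=0.80):
    """Approximate total audience needed to detect `relative_lift`
    with a two-sided two-proportion z-test and an unequal split."""
    p_holdout = baseline_rate
    p_exposed = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    # Variance of the rate difference, scaled per user in the total audience
    variance = (p_exposed * (1 - p_exposed) / (1 - holdout_share)
                + p_holdout * (1 - p_holdout) / holdout_share)
    total = math.ceil((z_alpha + z_power) ** 2 * variance
                      / (p_exposed - p_holdout) ** 2)
    exposed = math.ceil(total * (1 - holdout_share))
    return {"total": total, "exposed": exposed, "holdout": total - exposed}


def estimated_days(total_users, daily_new_eligible_users):
    """Rough duration, assuming a constant flow of new eligible users per day."""
    return math.ceil(total_users / daily_new_eligible_users)


if __name__ == "__main__":
    # Assumed inputs: 2% baseline, 10% relative lift, 10% holdout
    sizes = required_audience(0.02, 0.10, 0.10)
    print(sizes)
    print("days:", estimated_days(sizes["total"], 50_000))
```

With the assumed inputs (2 percent baseline, 10 percent relative lift, 10 percent holdout) this approximation asks for a total audience of roughly 430,000 users. If the platform's recommendation differs by a large factor, check whether it is using a different baseline rate, a one-sided test, or a different definition of lift before trusting either number.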

Step 5: Configure the Test

In the platform's lift study product, configure: campaign(s) included, audience eligibility, holdout share, test duration, conversion event. Most platforms (Meta, Google, TikTok, Snap) have point-and-click lift study products. Some larger advertisers integrate via API for programmatic test management.

Critical: lock the campaign once the test is running. Changing creative, audience, or bidding mid-test contaminates the result. Plan campaign refresh cycles around the test window.

Step 6: Run the Test

Let the platform deliver the campaign to the exposed group and suppress it for the holdout. Avoid checking partial results mid-test: interim looks invalidate the inference unless they are explicitly planned in the design. Let the test run to its planned duration.

Step 7: Read the Result

The platform reports the lift estimate, confidence interval, and statistical significance. A point estimate of 12 percent lift with 95 percent CI of 4 to 20 percent is a clean positive result. A point estimate of 8 percent with 95 percent CI of -2 to 18 percent is inconclusive: directionally positive but not statistically distinguishable from zero, usually because the test was underpowered.
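When the platform exposes raw counts, the reported numbers can be sanity-checked with a short Python sketch. It computes relative lift and a confidence interval using the standard log risk-ratio approximation; platforms may use different estimators, so small differences from the reported interval are expected. The function names and the example counts are illustrative assumptions.

```python
import math
from statistics import NormalDist


def relative_lift_ci(exposed_conv, exposed_users,
                     holdout_conv, holdout_users, confidence=0.95):
    """Relative lift and CI via the log risk-ratio (delta method) approximation."""
    rate_exposed = exposed_conv / exposed_users
    rate_holdout = holdout_conv / holdout_users
    ratio = rate_exposed / rate_holdout
    se_log = math.sqrt(1 / exposed_conv - 1 / exposed_users
                       + 1 / holdout_conv - 1 / holdout_users)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    low = math.exp(math.log(ratio) - z * se_log) - 1
    high = math.exp(math.log(ratio) + z * se_log) - 1
    return ratio - 1, low, high


def classify(ci_low, ci_high):
    """Label a result by whether the interval excludes zero lift."""
    if ci_low > 0:
        return "positive"
    if ci_high < 0:
        return "negative"
    return "inconclusive"


if __name__ == "__main__":
    # The two intervals described in the text above
    print(classify(0.04, 0.20))    # interval excludes zero
    print(classify(-0.02, 0.18))   # interval spans zero
    # Assumed raw counts: 900k exposed users, 100k holdout users
    lift, low, high = relative_lift_ci(20_160, 900_000, 2_000, 100_000)
    print(f"lift {lift:.1%}, CI {low:.1%} to {high:.1%}, {classify(low, high)}")
```

An "inconclusive" label here means only that the interval includes zero. It is not evidence that the campaign has no effect.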

Step 8: Translate to Decisions

Convert the lift estimate into incremental conversions and incremental revenue, then divide by the campaign's spend to compute incremental ROAS. Compare this to the platform-reported attributed ROAS. The difference is the over- or under-attribution from the platform's default model. If incremental ROAS is materially lower than attributed ROAS, the channel is taking credit for organic conversions; if it is higher, the channel is producing demand that other channels are capturing credit for.
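The arithmetic can be sketched in Python as follows. Because lift is measured relative to the holdout baseline, the incremental share of the exposed group's conversions is lift / (1 + lift), not the lift itself. The function names and example figures are hypothetical, and the sketch assumes incremental conversions carry the same average revenue as baseline conversions.

```python
def incremental_roas(exposed_conversions, relative_lift,
                     revenue_per_conversion, spend):
    """Incremental revenue per unit of spend implied by a lift estimate."""
    # Share of exposed-group conversions that would not have happened without ads
    incremental_share = relative_lift / (1 + relative_lift)
    incremental_conversions = exposed_conversions * incremental_share
    # Assumption: incremental and baseline conversions have the same average value
    return incremental_conversions * revenue_per_conversion / spend


def attribution_gap(incremental, attributed):
    """Fraction of attributed ROAS not supported by the lift test.
    Positive means over-attribution; negative means under-attribution."""
    return 1 - incremental / attributed


if __name__ == "__main__":
    # Hypothetical campaign: 11,200 exposed-group conversions, 12% lift,
    # 50 revenue per conversion, 20,000 spend, platform-reported ROAS of 6.0
    iroas = incremental_roas(11_200, 0.12, 50.0, 20_000.0)
    gap = attribution_gap(iroas, 6.0)
    print(f"incremental ROAS: {iroas:.2f}, attribution gap: {gap:.0%}")
```

Running the same calculation with the lower and upper bounds of the confidence interval gives a range for incremental ROAS, which is usually a safer basis for budget decisions than the point estimate alone.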

How Presenc AI Helps

Presenc AI provides the AI visibility data that contextualizes lift test results across the broader measurement stack. When a conversion lift test shows that Meta's incremental ROAS is materially lower than its attributed ROAS, the test exposes the misattribution but does not say where the demand actually originated. AI visibility data, fed into MMM alongside the lift-tested channels, surfaces the upstream demand-creation channels that the platform-side lift test cannot see.

Frequently Asked Questions

What size of lift counts as meaningful?

For most campaigns, a 5 to 15 percent lift in conversion rate is a meaningful effect. Above 15 percent is rare for mature campaigns. Below 5 percent is often statistically indistinguishable from zero in tests of reasonable size. The right threshold depends on the campaign's spend, audience size, and conversion volume; the power calculation specifies what is detectable.

How often should lift tests be run?

Once per quarter on each major platform, rotating across campaigns. Continuous lift testing produces alert fatigue and is expensive; quarterly testing balances measurement freshness against operational cost. Major platform changes (algorithm updates, new ad products) justify off-cycle tests to recalibrate.

Can a conversion lift study measure AI search?

Not directly. Platform-side lift studies measure campaigns the platform runs. AI search has no advertising platform in the traditional sense, so no platform-side lift study exists. The equivalent for AI search is geographic lift testing on AI visibility inputs, which produces a similar causal estimate via a different randomization unit.

Why is incremental ROAS often lower than attributed ROAS?

Because attribution credits the campaign for conversions that would have happened anyway. The lift test isolates the truly incremental conversions; attribution counts everyone who saw the ad and converted. The gap is the "would have anyway" share, which is typically 30 to 70 percent for branded search and retargeting and lower for upper-funnel campaigns.
