A holdout test withholds a marketing stimulus from a randomly selected control group to establish a counterfactual baseline — enabling the measurement of incremental lift by comparing outcomes between exposed and withheld populations.
Key Takeaways
Holdout group size must be calculated via power analysis before testing — the most common mistake is under-powering the control group
PSA holdouts (control sees neutral ads) are more rigorous than pure holdouts because they control for the mere-exposure effect
Run B2B holdout tests for a minimum of 6–8 weeks to capture full sales-cycle effects; shorter tests only measure initial intent signals
How a Holdout Test Works
A holdout test creates a clean control group by randomly excluding a portion of your eligible audience from a marketing campaign. The holdout group either sees nothing (a pure holdout) or sees a neutral public service announcement (a PSA holdout), which gives the control group the same number of ad impressions but with irrelevant content and so controls for the mere-exposure effect. Ghost-ad tests are a related design in which the ad platform records when a holdout user would have been served the ad without actually showing it. After the test period, you compare conversion rates between the exposed group and the holdout group. The difference between the two rates is the absolute incremental lift; dividing it by the holdout rate gives the relative lift. Treat either figure as real only if the difference is statistically significant for the sample sizes involved.
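As a minimal sketch of the comparison described above, the following Python computes the two conversion rates and the lift figures derived from them. The function name `incremental_lift` and the example counts are illustrative assumptions, not figures from a real campaign, and the sketch gives point estimates only.

```python
def incremental_lift(exposed_users, exposed_conversions,
                     holdout_users, holdout_conversions):
    """Point estimates of lift from raw counts (no significance test)."""
    if exposed_users <= 0 or holdout_users <= 0:
        raise ValueError("both groups need at least one user")
    exposed_rate = exposed_conversions / exposed_users
    holdout_rate = holdout_conversions / holdout_users
    absolute_lift = exposed_rate - holdout_rate
    # Relative lift is undefined when the holdout group has no conversions.
    relative_lift = absolute_lift / holdout_rate if holdout_rate > 0 else None
    return {
        "exposed_rate": exposed_rate,
        "holdout_rate": holdout_rate,
        "absolute_lift": absolute_lift,
        "relative_lift": relative_lift,
        # Conversions in the exposed group credited to the campaign.
        "incremental_conversions": absolute_lift * exposed_users,
    }


# Hypothetical counts: 90,000 exposed users, 10,000 held out.
result = incremental_lift(90_000, 2_250, 10_000, 200)
print(result)
```

With these made-up counts the exposed group converts at 2.5% against 2.0% in the holdout: an absolute lift of 0.5 percentage points, a relative lift of 25%, and about 450 incremental conversions.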
Why a Holdout Test Matters for B2B Marketing
Holdout test design requires careful attention to randomization, sample size, and test duration. The holdout group must be selected at random, not on the basis of any behavioral characteristic; otherwise you introduce selection bias. Sample size must be large enough to detect meaningful effect sizes: detecting a 10% relative lift on a 2% baseline conversion rate (2.0% versus 2.2%) requires approximately 80,000 users per arm at 80% statistical power and a two-sided 5% significance level. Test duration must be long enough to capture full purchase-cycle effects: for B2B with 30–90 day decision timelines, a 6–8 week holdout is typically the minimum.
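The sample-size requirement can be checked with the standard normal-approximation formula for comparing two proportions. The sketch below is one common way to do it, not the only one. The function name `users_per_arm` is hypothetical, and the defaults (two-sided 5% significance, 80% power, equal-sized arms) are assumptions.

```python
from math import ceil
from statistics import NormalDist


def users_per_arm(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Users needed in EACH arm to detect the given relative lift.

    Assumes equal-sized arms, a two-sided test, and the normal
    approximation for the difference between two proportions.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


# 10% relative lift on a 2% baseline (2.0% vs 2.2%).
n = users_per_arm(baseline_rate=0.02, relative_lift=0.10)
print(f"{n:,} users per arm")
```

For a 10% relative lift on a 2% baseline, these defaults give roughly 80,000 users per arm. Looser conventions, such as a one-sided test or a 10% significance level, give smaller numbers, so state the convention alongside any sample-size figure.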
Holdout Test: Best Practices & Strategic Application
Holdout tests apply across B2B marketing activities: email campaign holdouts (exclude 10–20% of your list from a nurture sequence to measure email's incremental pipeline contribution); ad retargeting holdouts (withhold retargeting ads from a random 20% of your pixel audience to measure incremental conversion lift); SEO holdouts (measure organic traffic and lead volume changes in specific content clusters before and after publication); and budget holdout tests (go dark on a specific channel in a test market for 4 weeks and measure pipeline impact versus control markets).
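For list-based holdouts such as email or retargeting audiences, the split has to be independent of user behavior and stable for the whole test. One way to get both, sketched below, is to bucket users by a hash of their ID. This is a stand-in for drawing random numbers, not a method described in the text above. The function name `assign_group`, the test name, and the user IDs are all hypothetical.

```python
import hashlib


def assign_group(user_id: str, test_name: str, holdout_fraction: float = 0.2) -> str:
    """Deterministically place a user in 'holdout' or 'exposed'.

    Hashing the test name with the user ID gives each test its own split
    and returns the same answer every time the user is looked up.
    """
    digest = hashlib.sha256(f"{test_name}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 2**32
    return "holdout" if bucket < holdout_fraction else "exposed"


# Hypothetical audience of 10,000 users and a 20% holdout.
groups = [assign_group(f"user-{i}", "q3-nurture") for i in range(10_000)]
holdout_share = groups.count("holdout") / len(groups)
print(f"holdout share: {holdout_share:.1%}")
```

Because the assignment depends only on the ID and the test name, it cannot correlate with engagement or deal stage, and a user never switches groups mid-test.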
Agency Perspective: Holdout Test in Practice
The statistical requirements for holdout tests are frequently underestimated. The most common failure is a holdout group too small to produce statistically significant results — yielding inconclusive data that can't guide decisions. Use a power calculator (G*Power, Evan Miller's A/B test calculator) to determine required sample sizes before launching. Also account for novelty effects — newly launched campaigns often show higher-than-sustainable lift in the first 2–3 weeks. Run holdouts long enough to observe steady-state performance, not initial novelty spikes. Report results with 90% or 95% confidence intervals, not just point estimates.
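To illustrate reporting an interval rather than a point estimate, the sketch below computes a normal-approximation (Wald) confidence interval for the difference in conversion rates. The Wald interval is one common choice and is assumed here for simplicity. The function name `lift_confidence_interval` and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def lift_confidence_interval(exposed_users, exposed_conversions,
                             holdout_users, holdout_conversions,
                             confidence=0.95):
    """Wald interval for (exposed rate - holdout rate)."""
    p_exposed = exposed_conversions / exposed_users
    p_holdout = holdout_conversions / holdout_users
    difference = p_exposed - p_holdout
    standard_error = sqrt(
        p_exposed * (1 - p_exposed) / exposed_users
        + p_holdout * (1 - p_holdout) / holdout_users
    )
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return difference - z * standard_error, difference + z * standard_error


# Hypothetical counts: 2,250 of 90,000 exposed vs 200 of 10,000 held out.
low_95, high_95 = lift_confidence_interval(90_000, 2_250, 10_000, 200)
low_90, high_90 = lift_confidence_interval(90_000, 2_250, 10_000, 200,
                                           confidence=0.90)
print(f"95% CI for absolute lift: {low_95:.3%} to {high_95:.3%}")
print(f"90% CI for absolute lift: {low_90:.3%} to {high_90:.3%}")
```

If the interval excludes zero, the lift is distinguishable from no effect at that confidence level. A wide interval is usually a sign that the holdout group was too small.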
Frequently Asked Questions: Holdout Test
Holdout tests typically withhold 10–20% of the eligible audience, with the remaining 80–90% receiving the campaign. The holdout must be large enough to achieve statistical significance for your expected effect size. For high-volume campaigns (100,000+ eligible users), a 10% holdout is often sufficient. For smaller audiences (under 20,000 users), you may need a 20–30% holdout to detect meaningful lift, or the test may be underpowered regardless of holdout size.
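To see how audience size and holdout share interact, the sketch below estimates the smallest relative lift a given split can detect. It is a rough approximation that applies the baseline variance to both arms and assumes a two-sided 5% test at 80% power. The function name and the 2% baseline in the examples are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_relative_lift(audience_size, holdout_fraction,
                                     baseline_rate, alpha=0.05, power=0.80):
    """Approximate smallest relative lift detectable with this split."""
    holdout_users = audience_size * holdout_fraction
    exposed_users = audience_size - holdout_users
    z_total = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    # Approximation: both arms are given the baseline variance.
    standard_error = sqrt(
        baseline_rate * (1 - baseline_rate)
        * (1 / exposed_users + 1 / holdout_users)
    )
    return z_total * standard_error / baseline_rate


# Hypothetical scenarios, all at a 2% baseline conversion rate.
for audience, fraction in [(100_000, 0.10), (20_000, 0.20), (20_000, 0.30)]:
    mde = minimum_detectable_relative_lift(audience, fraction, 0.02)
    print(f"{audience:>7,} users, {fraction:.0%} holdout: ~{mde:.0%} relative lift")
```

Under these assumptions, a 100,000-user audience with a 10% holdout can detect a relative lift of roughly 21%, while a 20,000-user audience needs a lift of roughly 30–35% even with a 20–30% holdout. Whether a given holdout share is enough therefore depends on the baseline rate and the lift you expect.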
An A/B test compares two variants of a marketing element (ad creative A vs B, landing page version A vs B) to find which performs better — both groups receive some marketing stimulus. A holdout test compares a group that received the marketing stimulus against a group that received none (or a neutral PSA) — measuring whether the marketing activity generates lift over baseline. A/B tests optimize execution; holdout tests validate investment.
Holdout tests also work for email programs. Randomly withhold 10–15% of your email list from a specific nurture sequence for one quarter. Track whether this holdout group converts to sales-qualified lead (SQL) or customer at the same rate as the nurtured group. If conversion rates are similar, your email nurture sequence may not be incrementally valuable: the leads would have converted at the same rate organically. If the nurtured group converts at a significantly higher rate, you've confirmed email's incremental contribution. The "cost" is forgoing email touches on 10–15% of your list for one quarter.
MV3 Marketing helps B2B companies apply these strategies to drive measurable pipeline growth. Our team executes analytics setup for technology, SaaS, and professional services companies.