What’s the Minimum Sample Size You Need Before Trusting Your Test Results?
You’ve launched an A/B test, traffic’s flowing, and results look promising—but can you actually trust them yet? Knowing when your data is reliable is one of the hardest calls in experimentation.
Direct Answer: Your results are only as trustworthy as your sample size. Too small, and randomness wins. Too large, and you waste time and traffic. The goal is balance—aligning your confidence level, statistical power, and minimum detectable effect (MDE) to gather enough data to make a confident, timely decision.
Start With the Outcome That Matters
Before doing any math, get clear on what success looks like. Are you measuring purchases, sign-ups, or engagement time? Different metrics fluctuate differently, so each needs a different sample size to smooth out the noise.
- High-variance metrics (like revenue) need larger samples.
- Low-variance metrics (like click-through rate) can stabilize faster.
- Rare events always demand more data to separate signal from chance.
The more variable your data, the longer you need to run the test to get credible results.
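To make the variance point concrete, here is a minimal sketch using Python and statsmodels’ power calculators. The click-through baseline, revenue mean, and standard deviation are assumed figures for illustration only; with these assumptions, the noisier revenue metric needs several times more visitors to detect the same 10% relative lift.

```python
# Sketch: how a metric's variance drives the sample size it needs.
# All baseline numbers below are illustrative assumptions, not benchmarks.
from statsmodels.stats.power import NormalIndPower, TTestIndPower
from statsmodels.stats.proportion import proportion_effectsize

ALPHA, POWER = 0.05, 0.80  # 95% confidence, 80% power

# Lower-variance binary metric: click-through rate of 20%, detecting a 10% relative lift.
h = proportion_effectsize(0.22, 0.20)  # Cohen's h for 20% -> 22%
n_ctr = NormalIndPower().solve_power(effect_size=h, alpha=ALPHA, power=POWER)

# Higher-variance continuous metric: revenue per visitor with an assumed mean of $20
# and standard deviation of $80 (most visitors spend $0), detecting the same 10% lift ($2).
d = 2.0 / 80.0  # standardized effect = absolute lift / standard deviation
n_rev = TTestIndPower().solve_power(effect_size=d, alpha=ALPHA, power=POWER)

print(f"Visitors per variant, click-through rate:  {n_ctr:,.0f}")
print(f"Visitors per variant, revenue per visitor: {n_rev:,.0f}")
```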
Understand Confidence, Power, and the MDE
These three levers determine how much data you need:
- Confidence Level: How strongly you guard against false positives; a 95% confidence level means you accept a 5% significance threshold for calling a difference real.
- Statistical Power: The probability of detecting a true effect at least as large as your MDE (usually 80%).
- Minimum Detectable Effect (MDE): The smallest improvement that matters to your business.
They interact directly: raising your confidence level, raising your power, or lowering your MDE all increase the required sample size. Want higher confidence or to detect smaller changes? You’ll need more traffic and more patience. Use a sample size calculator to find the right number, but make the final decision with context: time, traffic, and business importance.
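Here is a minimal sketch of such a calculator, again using statsmodels, with an assumed 5% baseline conversion rate. Notice how halving the relative MDE roughly quadruples the required sample, and how tightening confidence from 95% to 99% pushes it up further.

```python
# Sketch: a proportion-based sample size calculator showing how confidence,
# power, and MDE interact. The 5% baseline conversion rate is an assumption.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

def visitors_per_variant(baseline_rate, mde_relative, confidence=0.95, power=0.80):
    """Visitors needed in each variant to detect a relative lift of mde_relative."""
    target_rate = baseline_rate * (1 + mde_relative)
    effect = proportion_effectsize(target_rate, baseline_rate)  # Cohen's h
    return NormalIndPower().solve_power(
        effect_size=effect, alpha=1 - confidence, power=power, ratio=1.0
    )

baseline = 0.05                  # assumed 5% baseline conversion rate
for mde in (0.20, 0.10, 0.05):   # 20%, 10%, 5% relative MDE
    n = visitors_per_variant(baseline, mde)
    print(f"MDE {mde:>4.0%} relative -> {n:>10,.0f} visitors per variant")

# Tightening another lever raises the requirement too, e.g. 99% confidence:
n99 = visitors_per_variant(baseline, 0.10, confidence=0.99)
print(f"MDE  10% relative at 99% confidence -> {n99:>10,.0f} visitors per variant")
```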
Avoid the Classic Pitfalls
- Stopping too early: Ending a test when results look “done” inflates false positives.
- Peeking too often: Repeated significance checks distort p-values.
- Ignoring traffic splits: If the actual split drifts from the one you configured (a sample ratio mismatch), results can skew.
- Underestimating variance: Always factor in natural volatility when calculating sample size.
Good experimentation is as much about discipline as it is about data. Commit to your calculated sample size before starting, and see it through.
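If you want to see why that discipline matters, here is a minimal A/A simulation sketch: both variants share the same true conversion rate, so every “significant” result is a false positive. The true rate, traffic volumes, and check schedule are illustrative assumptions; the pattern to look for is that checking at every interim look and stopping at the first “win” flags false winners far more often than the nominal 5%.

```python
# Sketch: A/A simulation showing how peeking inflates false positives.
# Both variants share the same true conversion rate, so any "significant"
# result is a false positive. Traffic numbers and the check schedule are
# illustrative assumptions.
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

rng = np.random.default_rng(42)
TRUE_RATE = 0.05          # identical for both variants (A/A test)
VISITORS_PER_CHECK = 500  # per variant, per interim look
CHECKS = 10               # 10 looks -> 5,000 visitors per variant in total
SIMS = 2000
ALPHA = 0.05

peeking_fp = end_only_fp = 0
for _ in range(SIMS):
    # Cumulative conversions for each variant at each checkpoint.
    conv_a = rng.binomial(VISITORS_PER_CHECK, TRUE_RATE, CHECKS).cumsum()
    conv_b = rng.binomial(VISITORS_PER_CHECK, TRUE_RATE, CHECKS).cumsum()
    visitors = VISITORS_PER_CHECK * np.arange(1, CHECKS + 1)

    p_values = [
        proportions_ztest([conv_a[i], conv_b[i]], [visitors[i], visitors[i]])[1]
        for i in range(CHECKS)
    ]
    peeking_fp += any(p < ALPHA for p in p_values)  # stop at the first "win"
    end_only_fp += p_values[-1] < ALPHA             # single look at the end

print(f"False positive rate, peeking at every check: {peeking_fp / SIMS:.1%}")
print(f"False positive rate, single final check:     {end_only_fp / SIMS:.1%}")
```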
The 3C Model: Calculate. Commit. Confirm.
- Calculate your required sample size before you launch.
- Commit to running the full duration—no early exits.
- Confirm your findings with post-test validation and clean data.
This simple framework protects you from bias and builds trust in your program’s insights.
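For the Confirm step, one common piece of post-test validation is a sample ratio mismatch (SRM) check: a chi-square goodness-of-fit test comparing the traffic each variant actually received against the split you configured. Here is a minimal sketch with placeholder visitor counts and an assumed 50/50 design; the 0.001 threshold is a deliberately conservative convention so the check only fires when the split itself looks broken.

```python
# Sketch: sample ratio mismatch (SRM) check as part of post-test validation.
# Visitor counts are placeholders; the designed split is assumed to be 50/50.
from scipy.stats import chisquare

visitors_a, visitors_b = 50_421, 49_313  # observed visitors per variant (placeholders)
total = visitors_a + visitors_b

stat, p_value = chisquare([visitors_a, visitors_b], f_exp=[total / 2, total / 2])
if p_value < 0.001:  # conservative threshold: only flag clearly broken splits
    print(f"Possible sample ratio mismatch (p = {p_value:.2g}); fix the split before trusting results.")
else:
    print(f"Traffic split is consistent with 50/50 (p = {p_value:.2g}).")
```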
FAQ
Q: How do I know if my sample size is large enough?
A: If you’ve collected the sample size your calculator gave you for your chosen confidence level (typically 95%), power (at least 80%), and MDE, you’re in safe territory.
Q: What happens if my test ends early?
A: You risk false positives—apparent “wins” that disappear when retested. Patience preserves credibility.
Q: Is there a minimum number of conversions I should aim for?
A: There’s no universal number, but most analysts prefer at least a few hundred conversions per variant to stabilize variance.