Practical guide to A/B test stopping rules: formulas, workflow, implementation pitfalls, and a direct execution playbook with A/B Test Calculator.
You launch an A/B test and check results on day 3. The p-value is 0.03 — significant! You stop the test and ship. Two weeks later, the lift disappears.
This happens because checking a running test repeatedly, and stopping the first time it looks significant, inflates the false positive rate. With alpha = 0.05 and daily checks over 14 days, the actual false positive rate climbs to roughly 20-30%. The reason: each peek is another chance for random noise to cross the significance threshold, and those chances accumulate across peeks.
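The inflation is easy to reproduce with a small simulation. The sketch below is an illustration under simplifying assumptions, not the method used by A/B Test Calculator: it models an A/A test (no real difference), treats each day's data as one standard normal increment of equal size, and stops at the first look where |z| exceeds 1.96. The function name and parameters are hypothetical.

```python
import math
import random


def peeking_false_positive_rate(looks=14, z_crit=1.96, n_sims=20000, seed=0):
    """Share of simulated A/A tests declared significant at ANY of `looks` checks.

    Assumption: every day contributes the same amount of data, so the running
    z-statistic after k days is (sum of k standard normals) / sqrt(k).
    """
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_sims):
        cumulative = 0.0
        for k in range(1, looks + 1):
            cumulative += rng.gauss(0.0, 1.0)  # one day of pure noise
            z = cumulative / math.sqrt(k)      # z-statistic on all data so far
            if abs(z) > z_crit:                # "significant", so the test is stopped
                false_positives += 1
                break
    return false_positives / n_sims


if __name__ == "__main__":
    print(f"single final check: {peeking_false_positive_rate(looks=1):.3f}")
    print(f"14 daily checks:    {peeking_false_positive_rate(looks=14):.3f}")
```

With a single final check the rate stays near the nominal 5%. With 14 daily checks it should come out several times higher, even though no real effect exists in any simulated test.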
Example: baseline conversion 4%, MDE 1 pp (absolute), alpha 0.05 (two-sided), power 80%. The standard two-proportion z-test formula gives a required sample of ~6,700 per variant. At 500 visitors/day split evenly across two variants, that is 27 days; round up to 28 so the test covers four full weeks. Set a calendar reminder for day 28. Do not peek.
Low-traffic sites (under 1,000 visitors/day) face a real problem: a test sized to detect a 1 pp lift might need 8 weeks. Options:
1. Increase your MDE threshold. Accept that you can only detect larger effects. At a 4% baseline, a 3 pp MDE instead of 1 pp cuts the required sample from ~6,700 to ~900 per variant. The trade-off: you might miss small wins.
2. Test bigger changes. Instead of testing button color, test an entirely different page layout. Bigger changes produce bigger effects, making them detectable with less traffic.
3. Use sequential testing. Methods like group sequential design let you peek at predefined intervals without inflating alpha. You pay a ~20-30% sample size premium, but you can stop early if the effect is large. See Sequential Testing and the Peeking Trap.
4. Extend the test window. If the business allows, run the test for 6-8 weeks. Just make sure to account for weekday/weekend cycles by running full weeks.
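To make option 3 concrete, here is a minimal sketch of one simple group sequential rule: a constant per-look threshold in the style of a Pocock boundary, calibrated by simulation instead of taken from published tables. It assumes equally spaced looks with equal data per look and a two-sided z-test. The function names are hypothetical and this is not necessarily what A/B Test Calculator implements.

```python
import math
import random


def simulate_max_abs_z(looks, n_sims, rng):
    """For each simulated A/A test, return the largest |z| seen across all looks."""
    maxima = []
    for _ in range(n_sims):
        cumulative, worst = 0.0, 0.0
        for k in range(1, looks + 1):
            cumulative += rng.gauss(0.0, 1.0)  # equal-sized batch of null data
            worst = max(worst, abs(cumulative) / math.sqrt(k))
        maxima.append(worst)
    return maxima


def constant_boundary(looks=5, alpha=0.05, n_sims=40000, seed=1):
    """Per-look |z| threshold that keeps the overall false positive rate near alpha.

    The threshold is the (1 - alpha) quantile of the maximum |z| under the null,
    so only about alpha of no-effect tests ever cross it at any look.
    """
    maxima = sorted(simulate_max_abs_z(looks, n_sims, random.Random(seed)))
    index = min(n_sims - 1, math.ceil((1.0 - alpha) * n_sims) - 1)
    return maxima[index]


if __name__ == "__main__":
    for looks in (1, 2, 5):
        print(f"{looks} planned look(s): stop early only if |z| > {constant_boundary(looks):.2f}")
```

For five planned looks the threshold should land near the published Pocock value of about 2.41, compared with 1.96 for a single look. The stricter threshold is the price of peeking: stop early only when |z| crosses it at a planned look, and otherwise run to the planned end.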
Site: 300 visitors/day, 5% conversion.
Conclusion: this site should target MDE of 2-3 pp and test bold changes, not micro-optimizations.
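The arithmetic behind this kind of conclusion can be sketched with the standard two-proportion sample size formula. The code below assumes a two-sided test, an unpooled variance term, an even 50/50 split and an absolute MDE. The function names are hypothetical, and A/B Test Calculator or Sample Size Calculator may use a slightly different formula variant, so treat the output as approximate.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline, mde_abs, alpha=0.05, power=0.80):
    """Visitors needed per variant to detect an absolute lift of `mde_abs`."""
    p1, p2 = baseline, baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # unpooled variance term
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


def days_needed(n_per_variant, visitors_per_day, variants=2):
    """Calendar days to collect the sample when traffic is split evenly."""
    return math.ceil(n_per_variant * variants / visitors_per_day)


if __name__ == "__main__":
    # Example site from the text: 300 visitors/day, 5% baseline conversion.
    for mde in (0.01, 0.02, 0.03):
        n = sample_size_per_variant(0.05, mde)
        print(f"MDE {mde * 100:.0f} pp: {n} per variant, {days_needed(n, 300)} days")
```

Under these assumptions a 1 pp MDE needs roughly eight weeks at 300 visitors/day, while a 2 pp MDE fits in about two weeks. Round the resulting duration up to whole weeks before setting the stop date.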
Open Sample Size Calculator, enter your traffic and baseline rate, and find the MDE your site can realistically detect in 2-4 weeks.
Run the workflow directly in A/B Test Calculator and save your baseline output before scaling traffic.
This article is reviewed by the Tools Hub editorial team for factual accuracy, practical relevance, and consistency with current product workflows.