Sequential Testing and the Peeking Trap

A practical guide to sequential A/B testing: formulas, workflow, implementation pitfalls, and a direct execution playbook with A/B Test Calculator.

By Tools Hub Editorial Team · Published February 11, 2026 · Time to read: 8 min

The peeking trap in numbers

You set alpha = 0.05 (5% false positive rate) and plan to run a test for 4 weeks. But you check results every day. After 28 checks on data that fluctuates randomly, the probability of *at least one* false significant result is not 5% — it rises to roughly 25-30%.

The reason: each check is a hypothesis test. Even if there is no real effect, random data occasionally looks significant, and more checks mean more chances for a false alarm. Formally, under the null the accumulated difference between variants behaves like a random walk, so the probability that the test statistic has crossed a fixed threshold at least once keeps growing with the number of looks.
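The inflation can be checked with a small simulation. This is a minimal sketch, assuming one look per day, equal amounts of data per day, and a z-statistic recomputed on all data so far; the function name `peeking_false_positive_rate` is illustrative, and the exact rate will vary slightly with the seed and the number of simulated experiments.

```python
import random
from math import sqrt


def peeking_false_positive_rate(looks=28, z_crit=1.96, sims=20_000, seed=1):
    """Share of no-effect experiments that look 'significant' at one or more looks."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(sims):
        total = 0.0
        for k in range(1, looks + 1):
            # One more day of standardized data with no true effect.
            total += rng.gauss(0.0, 1.0)
            # z-statistic computed on all data collected so far.
            if abs(total / sqrt(k)) > z_crit:
                hits += 1
                break  # the experimenter stops at the first "significant" look
    return hits / sims


rate_single = peeking_false_positive_rate(looks=1)
rate_daily = peeking_false_positive_rate(looks=28)
print(f"1 look: {rate_single:.3f}   28 daily looks: {rate_daily:.3f}")
```

With a single look the rate should sit near the nominal 5%; with 28 daily looks it should land several times higher.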

Alpha spending: the solution

Sequential testing methods control the overall false positive rate across multiple looks by "spending" alpha gradually. Instead of using alpha = 0.05 at every look, each interim analysis uses a smaller threshold, so the total across all looks stays at 0.05.

Two classic approaches:

O'Brien-Fleming: very conservative early, lenient late. With four looks, the first look requires roughly p < 0.0001 to stop, while the final look uses a threshold close to the original alpha. Best when you want to run the full test unless the effect is enormous. Approximate values for four equally spaced looks:

Look	Alpha spent (cumulative)	Boundary p-value
1 of 4	0.0001	0.0001
2 of 4	0.0054	0.0049
3 of 4	0.0221	0.0184
4 of 4	0.0500	0.0429
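The "alpha spent" column can be approximated with a spending function. The sketch below assumes the Lan-DeMets O'Brien-Fleming-type spending function, alpha(t) = 2 - 2Φ(z_{α/2} / √t), where t is the fraction of the target sample collected; the table above may have been produced with a slightly different variant, so expect values that are close but not identical.

```python
from math import sqrt
from statistics import NormalDist

NORMAL = NormalDist()


def obf_type_spending(t, alpha=0.05):
    """Cumulative two-sided alpha spent at information fraction t (0 < t <= 1),
    using the Lan-DeMets O'Brien-Fleming-type spending function."""
    z = NORMAL.inv_cdf(1 - alpha / 2)
    return 2 * (1 - NORMAL.cdf(z / sqrt(t)))


fractions = [0.25, 0.50, 0.75, 1.00]
spent = [obf_type_spending(t) for t in fractions]
for look, (t, a) in enumerate(zip(fractions, spent), start=1):
    print(f"look {look} of 4 (t = {t:.2f}): cumulative alpha spent = {a:.4f}")
```

Almost no alpha is spent at the first look, and the full 0.05 is reached only at the final one, which is why this design behaves much like a fixed-horizon test unless the early effect is very large.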

Pocock: uses approximately the same nominal threshold at every look (~0.018 for 4 looks). Easier to explain, but it requires a larger maximum sample size than O'Brien-Fleming because more alpha is "used up" at early looks.
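A quick simulation can confirm that both designs hold the overall false positive rate near 5% while unadjusted repeated testing does not. This is a sketch under stated assumptions: four equally spaced looks, two-sided alpha = 0.05, and the commonly tabulated critical values 2.024 (O'Brien-Fleming final look) and 2.361 (Pocock), which are hard-coded here rather than derived.

```python
import random
from math import sqrt

LOOKS = 4
# Assumed textbook critical values for 4 equally spaced looks, two-sided alpha = 0.05.
OBF_FINAL_Z = 2.024  # O'Brien-Fleming: boundary at look k is OBF_FINAL_Z * sqrt(LOOKS / k)
POCOCK_Z = 2.361     # Pocock: the same boundary at every look


def boundaries(kind):
    """z boundaries per look for the given design ('obf', 'pocock', or 'naive')."""
    if kind == "obf":
        return [OBF_FINAL_Z * sqrt(LOOKS / k) for k in range(1, LOOKS + 1)]
    if kind == "pocock":
        return [POCOCK_Z] * LOOKS
    return [1.96] * LOOKS  # naive: unadjusted threshold at every look


def overall_false_positive_rate(bounds, sims=40_000, seed=7):
    """Share of no-effect experiments that cross a boundary at any look."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(sims):
        total = 0.0
        for k, bound in enumerate(bounds, start=1):
            total += rng.gauss(0.0, 1.0)  # one more equal-sized batch of null data
            if abs(total / sqrt(k)) > bound:
                hits += 1
                break
    return hits / sims


rates = {kind: overall_false_positive_rate(boundaries(kind))
         for kind in ("naive", "obf", "pocock")}
for kind, rate in rates.items():
    print(f"{kind:>6}: {rate:.3f}")
```

The naive row should come out well above 0.05, while the two sequential designs should stay close to it.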

How to set up group sequential testing

1. Before the test: define the maximum number of looks (e.g., 4 looks = interim analyses at 25%, 50%, and 75% of the target sample, plus the final analysis at 100%).

2. Choose a spending function (O'Brien-Fleming is usually preferred).

3. Calculate the adjusted sample size: add a margin to your fixed-horizon estimate to account for sequential flexibility. About 15-20% is a safe budget (close to what Pocock needs with 4 looks); O'Brien-Fleming typically needs only a few percent.

4. At each interim look: compare the test statistic to the boundary for that look. If it crosses, stop and declare significance. If not, continue.

5. At the final look: apply the final boundary. If the result is still not significant, conclude that the test did not detect an effect of the planned size.
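Steps 4 and 5 can be sketched in a few lines. This is an illustrative sketch, not the calculator's implementation: it assumes a pooled two-proportion z-test, four equally spaced looks, and the tabulated O'Brien-Fleming final critical value 2.024 (valid only for that 4-look, two-sided alpha = 0.05 design). The function names and the conversion counts are hypothetical.

```python
from math import sqrt

# Assumed O'Brien-Fleming final-look critical value for 4 equally spaced looks,
# two-sided alpha = 0.05. Other designs need a different constant.
OBF_FINAL_Z = 2.024


def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-statistic (variant B minus control A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se


def obf_boundary(look, total_looks=4, final_z=OBF_FINAL_Z):
    """z boundary at a given look: very high early, final_z at the last look."""
    return final_z * sqrt(total_looks / look)


def interim_decision(look, total_looks, conv_a, n_a, conv_b, n_b):
    """Return (decision, z, boundary) for one planned look."""
    z = two_proportion_z(conv_a, n_a, conv_b, n_b)
    bound = obf_boundary(look, total_looks)
    if abs(z) > bound:
        return "stop: significant", z, bound
    if look == total_looks:
        return "stop: not significant", z, bound
    return "continue", z, bound


# Hypothetical counts: conversions and visitors per variant at each look.
first = interim_decision(1, 4, conv_a=15, n_a=300, conv_b=24, n_b=300)
second = interim_decision(2, 4, conv_a=60, n_a=1200, conv_b=102, n_b=1200)
for label, (decision, z, bound) in (("look 1", first), ("look 2", second)):
    print(f"{label}: z = {z:.2f}, boundary = {bound:.2f} -> {decision}")
```

Note that the same observed lift that would pass an unadjusted 1.96 threshold can still fall short of an early boundary; that is the intended behaviour of the design.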

Practical example

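Baseline: 5% conversion, MDE: 2 pp (5% → 7%), alpha: 0.05 (two-sided), power: 80%.

• Fixed-horizon sample: ~2,200 per variant (normal-approximation formula for two proportions).

• With 4 looks (O'Brien-Fleming): ~2,250-2,300 per variant. O'Brien-Fleming boundaries inflate the maximum sample by only about 2-3%; a 15-20% margin is a conservative upper budget.

• At 300 visitors/day per variant, the test runs about 8 days, with looks after roughly day 2, day 4, day 6, and day 8.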

Compute your required sample and schedule using A/B Test Calculator.
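The arithmetic behind a sample estimate and look schedule can be sketched as follows. This is a minimal sketch, assuming the unpooled normal-approximation formula for a two-sided two-proportion test and equally spaced looks; `fixed_horizon_n`, `look_schedule`, and the `INFLATION` factor are illustrative names and values, not part of A/B Test Calculator, and a given calculator may use a different formula and return somewhat different figures.

```python
from math import ceil
from statistics import NormalDist


def fixed_horizon_n(p_base, mde_abs, alpha=0.05, power=0.80):
    """Per-variant sample size for a two-sided two-proportion z-test
    (unpooled normal approximation)."""
    p_var = p_base + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_var * (1 - p_var)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


def look_schedule(max_n, looks, daily_per_variant):
    """(per-variant sample size, day) for each of `looks` equally spaced looks."""
    schedule = []
    for k in range(1, looks + 1):
        n_k = ceil(max_n * k / looks)
        schedule.append((n_k, ceil(n_k / daily_per_variant)))
    return schedule


# Assumed inflation of the maximum sample for O'Brien-Fleming with 4 looks;
# use a larger factor (for example 1.2) for Pocock-style boundaries.
INFLATION = 1.03

n_fixed = fixed_horizon_n(p_base=0.05, mde_abs=0.02)
n_max = ceil(n_fixed * INFLATION)
schedule = look_schedule(n_max, looks=4, daily_per_variant=300)

print(f"fixed-horizon: {n_fixed} per variant, sequential maximum: {n_max}")
for k, (n_k, day) in enumerate(schedule, start=1):
    print(f"look {k}: {n_k} per variant, reached on day {day}")
```

Keeping the inflation factor as an explicit parameter makes it easy to see how the schedule shifts when switching between boundary types.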

When NOT to use sequential testing

• If you have enough traffic to reach full sample size in under a week, just use fixed-horizon: the complexity is not worth it.

• If stakeholders will ignore boundaries and peek anyway, sequential design does not help.

Related resources

• When to Stop an A/B Test on Low Traffic

• How to Avoid False Positives in A/B Tests

• MDE in A/B Testing Explained

Next step

Decide on 3-5 interim looks, choose O'Brien-Fleming boundaries, and compute your adjusted sample size in A/B Test Calculator.

Editorial Standards

This article is reviewed by the Tools Hub editorial team for factual accuracy, practical relevance, and consistency with current product workflows.

Last reviewed: February 11, 2026
