Deciding How Long to Run an A/B Test
Humblytics’ Test‑Duration Calculator converts your traffic numbers and statistical settings into a recommended runtime—so you know exactly when to stop collecting data.
1. Why Duration Matters
Running a test for too short a time risks false winners; running it too long delays deployment and may expose users to sub‑optimal experiences. A calculated duration balances confidence with speed.
2. Input Definitions
| Field | What It Means | Example |
| --- | --- | --- |
| Average Daily Visitors | Unique users your site receives each day (use analytics data). | 1 200 |
| Baseline Conversion Rate | Current % of visitors that convert. | 4 % |
| Minimum Detectable Effect (MDE) | Smallest lift you care to detect. | +0.8 % absolute |
| Statistical Significance | Confidence level (95 % or 99 %). | 95 % |
| Statistical Power | Probability of catching a real effect (fixed at 80 % in this tool). | 80 % |
Tip: Lower traffic or a smaller MDE increases the recommended number of days; when traffic is scarce, consider prioritising bigger changes (see the sketch below).
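To get a feel for how sensitive duration is to the MDE: under the standard sample-size approximation described in section 5, the required sample scales roughly with 1 / MDE², so halving the MDE roughly quadruples the runtime. A quick illustration (the 16-day figure is taken from the worked example below; the numbers are a rule of thumb, not calculator output):

```python
# Rough sensitivity check: sample size (and therefore runtime) scales
# roughly with 1 / MDE^2 under the usual z-test approximation.
baseline_days = 16               # recommended days at a 1 % absolute MDE (worked example below)
scale = (1.0 / 0.5) ** 2         # effect of cutting the MDE from 1.0 pp to 0.5 pp
print(baseline_days * scale)     # -> 64.0 days, roughly four times as long
```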
3. Step‑by‑Step
1. Open the Test‑Duration Calculator in your Humblytics toolkit.
2. Enter all five inputs above.
3. Click Calculate.
4. Review the results panel:
   - Days to Run — minimum calendar days before analysing.
   - Visitors per Variant — the sample-size target for each group.
5. Optionally, click Export to CSV to share the plan with stakeholders.
4. Worked Example
Daily Visitors: 1 200
Baseline CVR: 4 %
MDE: 1 % absolute
Significance: 95 %
▶︎ Result: ≈ 16 days of traffic → ≈ 9 600 visitors per variant. End the test once both thresholds (days and visitors) are met.
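The two thresholds are linked: with a 50/50 split (as described in section 5), both variants share the daily traffic, so the day count follows directly from the per-variant target. A quick check using the figures above:

```python
import math

# Figures from the worked example above
visitors_per_variant = 9_600
daily_visitors = 1_200
variants = 2                     # a 50/50 A/B split

# Both variants share the daily traffic, so total sample / daily visitors = days
days = math.ceil(visitors_per_variant * variants / daily_visitors)
print(days)                      # -> 16
```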
5. Understanding the Math (High‑Level)
The calculator first computes the required sample size using your CVR, MDE, confidence, and power settings (same formula used in the Sample‑Size Calculator).
It then divides the combined sample size for both variants by your average daily visitors (assuming a 50/50 traffic split) to yield the recommended number of days; the sketch below illustrates the calculation.
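The exact formula Humblytics uses is not reproduced in this article, but a minimal sketch using the common pooled two-proportion z-test approximation looks like this (the function name and defaults are illustrative, and its output may differ slightly from the calculator's):

```python
import math
from statistics import NormalDist

def recommended_duration(daily_visitors: float,
                         baseline_cvr: float,   # e.g. 0.04 for a 4 % conversion rate
                         mde_abs: float,        # absolute lift, e.g. 0.01 for +1 pp
                         significance: float = 0.95,
                         power: float = 0.80) -> tuple[int, int]:
    """Return (visitors per variant, recommended days) using the pooled
    two-proportion z-test approximation and a 50/50 traffic split."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - significance) / 2)   # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = baseline_cvr + mde_abs / 2          # average of baseline and target rates
    n = 2 * (z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) / mde_abs ** 2
    per_variant = math.ceil(n)
    days = math.ceil(2 * per_variant / daily_visitors)           # both variants share traffic
    return per_variant, days

print(recommended_duration(1_200, 0.04, 0.01))   # roughly (6747, 12) with this approximation
```

Different tools pick slightly different variance approximations, which is one reason a hand calculation like this and the worked example above may not match to the visitor.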
6. Best‑Practice Checklist
Run the Full Duration — resist the temptation to stop early when trends look promising.
Include Full Business Cycles — ensure weekends, paydays, campaigns, etc., are represented.
Freeze Site Changes — avoid deploying unrelated changes mid‑test.
One Change at a Time — isolates causal impact.
Document Everything — hypothesis, settings, runtime, outcome.
Segment After Significance — check if the uplift holds across devices, channels, or geos.
Following these steps will keep every A/B test statistically sound while maximising learning velocity.