Deciding How Long to Run an A/B Test

Humblytics’ Test‑Duration Calculator converts your traffic numbers and statistical settings into a recommended runtime—so you know exactly when to stop collecting data.


1. Why Duration Matters

Running a test for too short a time risks false winners; running it too long delays deployment and may expose users to sub‑optimal experiences. A calculated duration balances confidence with speed.


2. Input Definitions

| Field | What It Means | Example |
| --- | --- | --- |
| Average Daily Visitors | Unique users your site receives each day (use analytics data). | 1 200 |
| Baseline Conversion Rate | Current % of visitors that convert. | 4 % |
| Minimum Detectable Effect (MDE) | Smallest lift you care to detect. | +0.8 % absolute |
| Statistical Significance | Confidence level (95 % or 99 %). | 95 % |
| Statistical Power | Probability of catching a real effect (fixed at 80 % in this tool). | 80 % |

Tip: Lower traffic or a smaller MDE will increase recommended days; consider prioritising bigger changes when traffic is scarce.
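
The reason is that required sample size grows roughly with the inverse square of the MDE, so halving the effect you want to detect roughly quadruples the runtime. A minimal Python sketch of that scaling (illustrative numbers, not the calculator's output):

```python
# Sample size scales roughly with 1 / MDE²: halving the detectable
# effect costs about four times the visitors (and therefore days).
for mde in (0.02, 0.01, 0.005):      # absolute lifts: 2, 1 and 0.5 points
    cost = (0.01 / mde) ** 2         # cost relative to a 1-point MDE
    print(f"MDE {mde:.3f} -> ~{cost:.2g}x the sample of a 1-point MDE")
```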


3. Step‑by‑Step

  1. Open the Test‑Duration Calculator in your Humblytics toolkit.

  2. Enter the four adjustable inputs above (statistical power is fixed at 80 %).

  3. Click Calculate.

  4. Review the results panel:

    • Days to Run — minimum calendar days before analysing.

    • Visitors per Variant — the sample size target for each group.

  5. Optionally click Export to CSV to share the plan with stakeholders.


4. Worked Example

  • Daily Visitors: 1 200

  • Baseline CVR: 4 %

  • MDE: 1 % (absolute)

  • Significance: 95 %

  • Power: 80 % (fixed)

▶︎ Result: ≈ 16 days of traffic → ≈ 9 600 visitors per variant. End the test once both thresholds (days and visitors) are met.
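
As a quick sanity check, the days figure follows directly from the visitors‑per‑variant figure and the 50/50 split. Here is the arithmetic in Python, using the sample size the tool reports:

```python
import math

daily_visitors = 1200
visitors_per_variant = 9600      # as reported by the calculator
variants = 2                     # 50/50 traffic split

days = math.ceil(visitors_per_variant * variants / daily_visitors)
print(days)                      # 16
```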


5. Understanding the Math (High‑Level)

  • The calculator first computes the required sample size using your CVR, MDE, confidence, and power settings (same formula used in the Sample‑Size Calculator); see the sketch after this list.

  • It then doubles that per‑variant sample size (to cover both groups in a 50/50 traffic split), divides by your average daily visitors, and rounds up to yield the recommended days.
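
Humblytics doesn't expose its exact formula in the UI, but the standard two‑proportion z‑test sample‑size calculation it describes can be sketched in Python. The function name here is illustrative, and real tools apply their own constants and corrections (one‑ vs two‑sided tests, continuity corrections, relative vs absolute MDE), so the output won't match the calculator exactly:

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline_cvr, mde_abs, significance=0.95, power=0.80):
    """Approximate visitors per variant for a two-sided two-proportion z-test.

    baseline_cvr and mde_abs are proportions, e.g. 0.04 and 0.01.
    """
    p1, p2 = baseline_cvr, baseline_cvr + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - (1 - significance) / 2)  # 1.96 at 95 %
    z_beta = NormalDist().inv_cdf(power)                        # 0.84 at 80 % power
    p_bar = (p1 + p2) / 2
    top = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(top / mde_abs ** 2)

print(sample_size_per_variant(0.04, 0.01))  # per-variant target for the worked example's inputs
```

Treat the calculator's own output as authoritative; the sketch is only meant to show the shape of the computation.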


6. Best‑Practice Checklist

  • Run the Full Duration — resist the temptation to stop early when trends look promising.

  • Include Full Business Cycles — ensure weekends, paydays, campaigns, etc., are represented (see the sketch after this checklist).

  • Freeze Site Changes — avoid deploying unrelated changes mid‑test.

  • One Change at a Time — isolates causal impact.

  • Document Everything — hypothesis, settings, runtime, outcome.

  • Segment After Significance — check if the uplift holds across devices, channels, or geos.
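
One simple way to honour the business‑cycle rule is to round the recommended duration up to whole weeks, as in this sketch (a hypothetical helper, not part of the calculator):

```python
import math

def round_up_to_full_weeks(recommended_days):
    """Extend a runtime to whole weeks so every weekday/weekend
    pattern is represented at least once."""
    return max(7, math.ceil(recommended_days / 7) * 7)

print(round_up_to_full_weeks(16))  # 21 -> run the worked example for three full weeks
```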

Following these steps will keep every A/B test statistically sound while maximising learning velocity.
