Why Sample Size Planning Matters
A Sample Size Calculator helps you plan how many responses, observations, or measurements you need before you start collecting data. Planning sample size is not just a “statistics checkbox” — it directly affects whether your results will be useful. Too small, and your estimates swing wildly or your test can miss a real difference. Too large, and you spend time and money collecting data you didn’t need.
The best sample size is the one that matches your goal. If you are estimating a percentage (like approval rate, conversion rate, defect rate, or prevalence), sample size depends on confidence level, margin of error, and an expected proportion. If you are estimating an average (like a mean price, mean duration, mean score, or average weight), sample size depends on confidence, margin of error in real units, and a standard deviation estimate. If you are comparing two groups (A/B testing, control vs treatment), you plan sample size using confidence, power, and your expected effect size.
Key Terms Used in This Sample Size Calculator
Sample size formulas look different across scenarios, but the core ideas repeat. Understanding these terms makes it much easier to choose the correct mode and interpret the output.
- Confidence level: How certain you want to be that your interval covers the true value (commonly 90%, 95%, 99%). Higher confidence increases required sample size.
- Critical value (z): The multiplier from the standard normal distribution associated with your confidence level (for two-sided intervals: 1.645, 1.960, 2.576 for 90/95/99%).
- Margin of error (MOE): The maximum tolerated error for your estimate. For proportions, this is usually in percentage points (±3%, ±5%). For means, it is in real units (±2 minutes, ±$5, ±0.3 kg).
- Expected proportion (p): Your best guess of the true percentage. If unknown, use 50% as a conservative choice (it produces the largest required sample).
- Standard deviation (σ): Your best estimate of spread for mean calculations. If unknown, use pilot data or a past study to approximate it.
- Power: For detecting differences (like A/B testing), power is the probability your study will detect a real effect. Typical choices are 80% or 90%.
- Finite population correction (FPC): A reduction in required sample size when your population is not large and you are sampling without replacement.
- Design effect (DEFF): Inflation factor for non-simple sampling designs (cluster sampling, weighting, unequal probabilities). DEFF ≥ 1.
- Non-response adjustment: Inflation to account for expected dropouts or non-responders.
Sample Size for a Proportion
Use the Proportion tab when your result is a percentage or rate: “What percent of customers are satisfied?”, “What share of voters supports option A?”, “What is the defect rate?” This case uses a normal approximation to the binomial proportion.
n₀ = (z² × p × (1 − p)) / E²
Here, E is your margin of error expressed as a decimal (5% → 0.05), and p is the expected proportion as a decimal (50% → 0.50). If you don’t know p, set p = 0.50. That “worst-case” assumption maximizes p(1−p) and therefore produces the largest sample size, so it is safe for planning.
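As a quick sanity check, the base formula above can be computed directly. A minimal sketch (the function name and the small z-value lookup table are illustrative, not part of the calculator):

```python
import math

# Two-sided z critical values for common confidence levels.
Z = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}

def n_proportion(confidence=0.95, p=0.50, moe=0.05):
    """Base sample size n0 = z^2 * p * (1 - p) / E^2, rounded up."""
    z = Z[confidence]
    n0 = (z ** 2) * p * (1 - p) / (moe ** 2)
    return math.ceil(n0)

# 95% confidence, p = 0.5 (worst case), ±5 percentage points:
print(n_proportion())  # 385
```

Note how using p = 0.50 maximizes p(1 − p) = 0.25; any other guess for p yields a smaller n₀.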
Finite Population Correction
If your population size N is not huge, and your sample is a meaningful fraction of that population, FPC reduces the required sample size. This commonly applies to internal surveys (employees, members, customers in a known list) or small populations.
n = n₀ / (1 + (n₀ − 1) / N)
A good rule of thumb is that FPC starts to matter when your sample would be more than about 5% of the population. If N is large or unknown, leave the population size blank — the calculator will treat it as an effectively infinite population.
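Applying the correction above to a known list, for example 385 required completes against an employee roster of 1,000 (the numbers and function name here are illustrative):

```python
import math

def fpc_adjust(n0, N):
    """Finite population correction: n = n0 / (1 + (n0 - 1) / N)."""
    return math.ceil(n0 / (1 + (n0 - 1) / N))

# 385 completes needed, population of 1,000:
print(fpc_adjust(385, 1000))  # 279

# With a very large N the correction is negligible:
print(fpc_adjust(385, 10**9))  # 385
```

The required sample drops from 385 to 279 because the sample would be well over 5% of the population.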
Design Effect and Non-Response Inflation
Real-world sampling isn’t always simple random sampling. If you are using cluster sampling (schools, stores, neighborhoods) or heavy weighting, variability often increases. That’s what design effect (DEFF) accounts for:
n_design = n × DEFF
Then apply non-response inflation. If you expect a non-response rate r (for example 20% → 0.20), divide by (1 − r):
n_final = n_design / (1 − r)
In practice, these two adjustments are a big reason a “simple” sample size can grow quickly. A base survey might need 385 completes at 95% and ±5% (with p=0.5), but if DEFF=1.5 and non-response=25%, your final target becomes 770 completes (385 × 1.5 / 0.75 = 770) — roughly double the base figure.
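The two adjustments above can be chained in one step, as in this sketch (the function name is illustrative):

```python
import math

def inflate(n, deff=1.0, nonresponse=0.0):
    """Apply n_design = n * DEFF, then n_final = n_design / (1 - r)."""
    return math.ceil(n * deff / (1 - nonresponse))

# Base requirement of 385 completes, DEFF = 1.5, 25% non-response:
print(inflate(385, deff=1.5, nonresponse=0.25))  # 770
```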
Sample Size for a Mean
Use the Mean tab when you want an average: “average waiting time,” “mean order value,” “average blood pressure,” “mean exam score.” The key extra input is an estimated standard deviation σ.
n₀ = (z² × σ²) / E²
Here, E is your desired margin of error in the original units. For example, if you want the mean waiting time within ±2 minutes, E=2. The required sample grows with σ², so if your data is very spread out, you need more observations for the same precision.
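A minimal sketch of the mean formula, using the waiting-time example above with an assumed σ of 10 minutes (σ, the z default, and the function name are illustrative):

```python
import math

def n_mean(sigma, moe, z=1.960):
    """Base sample size n0 = z^2 * sigma^2 / E^2, rounded up."""
    return math.ceil((z ** 2) * (sigma ** 2) / (moe ** 2))

# Mean waiting time within ±2 minutes, sigma estimated at 10 minutes:
print(n_mean(sigma=10, moe=2))  # 97
```

Because n grows with σ², doubling σ quadruples the required sample at the same margin of error.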
Two Proportions (A/B Testing) Sample Size
When comparing two proportions — for example conversion rate in A vs B, click-through rates, or defect rates under two processes — you typically plan sample size with both confidence and power. Confidence determines your false-positive tolerance (α), and power determines your false-negative tolerance (β).
This calculator uses a widely used normal approximation for two-proportion tests with equal group sizes:
n ≈ ( zα/2 √(2 p̄(1−p̄)) + zβ √(p₁(1−p₁) + p₂(1−p₂)) )² / (p₁ − p₂)²
Where p̄ is the average of p₁ and p₂. The key driver is the difference (p₁ − p₂). Small effect sizes require big samples. If you are testing a tiny lift (say 10.0% to 10.5%), it is normal to need thousands (or tens of thousands) per group.
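The two-proportion formula can be sketched with Python's standard library (NormalDist supplies the z values; the function name and the 10% → 12% example are illustrative):

```python
import math
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Per-group n for a two-sided two-proportion test, equal group sizes
    (normal approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = 0.05
    z_b = NormalDist().inv_cdf(power)          # ~0.84 for 80% power
    p_bar = (p1 + p2) / 2
    num = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p1 - p2) ** 2)

# Detecting a lift from 10% to 12%: thousands per group.
print(n_per_group(0.10, 0.12))
```

Halving the detectable difference roughly quadruples the required n per group, which is why tiny lifts are so expensive to test.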
How to Choose Inputs That Make Sense
A Sample Size Calculator is only as good as the assumptions you feed it. The following practical choices are common:
- Confidence: 95% is the default for many studies. Use 90% when speed matters and you can tolerate more uncertainty. Use 99% for high-stakes estimates.
- Power: 80% is a typical minimum for A/B tests; 90% is stricter and increases sample size.
- Expected proportion (p): Use historical data if available. If not, use 50% to be conservative.
- Standard deviation (σ): Use pilot data or prior studies. If you wildly underestimate σ, your sample size will be too small.
- Margin of error: Choose what is meaningful. ±5% might be fine for general polling, but too loose for process control or medical decisions.
- Population size: Only matters when N is small enough that sampling is a noticeable fraction of N.
Margin of Error from an Existing Sample Size
Sometimes you already have a fixed sample size and need to report precision. The Margin of Error tab does that. For proportions (survey percentages), margin of error (without FPC) is:
E = z × √(p(1−p)/n)
For means (averages), it’s:
E = z × (σ/√n)
This tool also applies effective sample size when you use a design effect (DEFF): n_eff = n / DEFF. That reduces effective precision when the design effect is above 1.
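A minimal sketch of the proportion version, including the effective-sample-size adjustment (the function name and the 1,000-complete example are illustrative):

```python
import math

def moe_proportion(n, p=0.5, z=1.960, deff=1.0):
    """Margin of error E = z * sqrt(p(1-p) / n_eff), with n_eff = n / DEFF."""
    n_eff = n / deff
    return z * math.sqrt(p * (1 - p) / n_eff)

# 1,000 completes, p = 0.5, simple random sampling:
print(round(moe_proportion(1000) * 100, 1))  # 3.1 (percentage points)
```

With DEFF above 1, n_eff shrinks and the margin of error widens for the same number of completes.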
Common Mistakes to Avoid
- Confusing percentage points with percent: A ±5% margin of error for a proportion means ±5 percentage points (0.05 as a decimal), not “5% of the value.”
- Forgetting non-response: If you need 400 completes and expect 25% non-response, you must invite/contact more than 400.
- Ignoring design effect: Cluster sampling can dramatically increase required sample size; DEFF is not optional when clusters are present.
- Using p far from reality: If you guess p=10% but the true proportion is near 50%, p(1−p) — and with it the required sample — is larger than you planned, so your study may miss its MOE goal.
- Assuming “bigger is always better”: Bigger reduces uncertainty, but costs grow. Choose a MOE and power that match the decision you need to make.
FAQ
Sample Size Calculator – Frequently Asked Questions
Answers about confidence, margin of error, power, population correction, design effect, and non-response.
What is sample size?
Sample size is the number of observations or respondents you need to estimate a population value (like a percentage or mean) with a chosen confidence level and margin of error, or to detect an effect with a chosen power.
What does confidence level mean?
Confidence level controls how often your method would capture the true value if you repeated the study many times. Higher confidence requires a larger sample size.
What is margin of error?
Margin of error is the maximum expected difference between the sample estimate and the true population value (within the chosen confidence level). A smaller margin of error requires a larger sample.
When does the finite population correction (FPC) matter?
FPC reduces the required sample size when the population is not very large and you are sampling without replacement. It matters most when your sample is a noticeable fraction of the population.
What proportion should I use if I have no estimate?
If you have no estimate, use 50% (p = 0.5). It’s the most conservative choice and produces the largest required sample size.
What is power?
Power is the probability of detecting a true effect (difference) when it exists. Higher power usually requires a larger sample size.
What is the design effect (DEFF)?
Design effect (DEFF) inflates sample size when your sampling design (like cluster sampling) increases variability compared to simple random sampling. A DEFF of 1 means no inflation.
How do I account for non-response?
If you expect non-response, inflate your required sample by dividing by (1 − non-response rate). For example, 20% non-response means divide by 0.80.
Does this calculator use z values or t values?
This tool uses z critical values (normal approximation), which is standard for planning survey sample size and large-sample settings. For very small samples with unknown standard deviation, a t-based approach may be preferred.