Sample Size Calculator

This calculator computes the minimum number of samples needed to meet your desired statistical constraints.

Inputs:

  • Confidence level (%): how sure you want to be that the results are accurate. The standard is 95%.
  • Margin of error (%): the error you can tolerate (e.g., +/- 5%). A lower margin requires more people.
  • Response distribution (%): the expected split of answers. Use 50% if unsure; it is the safest worst-case value.
  • Population size: the total number of people in your target group. Leave blank if unknown or very large.

Outputs:

  • Survey / Quant (for statistical significance): the number of participants needed at your chosen confidence level.
  • Interviews / Qual (for insight discovery): 5. According to Nielsen Norman, 5 users reveal 85% of usability issues.
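
Under the hood, the quantitative side of a calculator like this is standard: Cochran's formula, plus a finite population correction when a population size is supplied. A minimal sketch in Python (the function name, defaults, and z-score table are illustrative, not Glaut's implementation):

```python
import math

# z-scores for common confidence levels
Z_SCORES = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}

def sample_size(confidence=0.95, margin_of_error=0.05,
                proportion=0.5, population=None):
    """Minimum sample size via Cochran's formula, with a finite
    population correction when a population size is supplied."""
    z = Z_SCORES[confidence]
    # Unadjusted sample size for an effectively infinite population
    n0 = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    if population is None:
        return math.ceil(n0)
    # Finite population correction shrinks n for small populations
    return math.ceil(n0 / (1 + (n0 - 1) / population))

print(sample_size())                 # 385: 95% confidence, +/- 5%, p = 0.5
print(sample_size(population=1000))  # 278: same settings, N = 1,000
```

The qualitative output is different math entirely: Nielsen's problem-discovery model, P = 1 - (1 - λ)^n, where λ ≈ 0.31 is the average probability that a single user surfaces a given issue. At n = 5 that works out to roughly 85%.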

Sample size and data quality: the problem that starts before fieldwork

Getting your sample size right is necessary but not sufficient. A precisely calculated N only matters if the responses behind it are valid, and keeping them valid is increasingly difficult in online research.

Usable response rates in online surveys have dropped significantly over the past decade. The challenges are well-documented and compounding:

  • Bots sophisticated enough to pass attention checks, mimic human timing, and evade open-ended traps;
  • Real respondents using generative AI to answer open-ended questions: a study from Stanford and NYU found that nearly one in three online survey takers admitted to doing so;
  • Traditional quality controls that were never designed for this level of sophistication: one Pew Research study found 84% of fake respondents passed trap questions, and 87% evaded speed checks.

The impact on sample integrity is straightforward: fraudulent or AI-generated responses don't announce themselves by widening your confidence interval. They bias the estimate itself, so the interval stays reassuringly narrow around the wrong answer.
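
A toy simulation makes the point concrete. Here a hypothetical yes/no question with a true 40% agreement rate is contaminated by bots that always answer yes (the 15% contamination rate and the bot behavior are illustrative assumptions):

```python
import math
import random
import statistics

random.seed(7)
N = 400

# True population: 40% of genuine respondents would answer "yes" (1)
genuine = [1 if random.random() < 0.40 else 0 for _ in range(N)]

# Contaminated sample: 15% of rows replaced by bots that always answer "yes"
n_bots = int(N * 0.15)
contaminated = [1] * n_bots + genuine[n_bots:]

for label, data in [("clean", genuine), ("contaminated", contaminated)]:
    p = statistics.mean(data)
    half_width = 1.96 * math.sqrt(p * (1 - p) / len(data))
    print(f"{label:>12}: estimate {p:.3f}, 95% CI +/- {half_width:.3f}")
```

Both intervals come out around +/- 0.05 wide, but the contaminated estimate sits near 0.49 instead of 0.40: the interval looks just as trustworthy while being centered on the wrong value.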

"When fraudulent responses were removed from the analysis, results that had reached statistical significance no longer did — meaning flawed recommendations would have been made based on the contaminated dataset."
— PMC / Journal of Advanced Nursing

The conventional post-hoc response follows a predictable sequence, sketched in code after the list:

  1. Close fieldwork
  2. Run speeding and straight-lining checks
  3. Flag and remove outliers
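
A minimal version of steps 2 and 3, assuming a flat response table with grid items and completion times (the column names, data, and cutoffs are all illustrative; the one-third-of-median speeding rule is a common heuristic, not a standard):

```python
import pandas as pd

# Hypothetical response table: one row per respondent, five grid
# items (q1..q5) and completion time in seconds
df = pd.DataFrame({
    "duration_s": [610, 95, 540, 80, 700],
    "q1": [4, 5, 3, 2, 4],
    "q2": [3, 5, 4, 2, 5],
    "q3": [5, 5, 3, 2, 4],
    "q4": [2, 5, 4, 2, 3],
    "q5": [4, 5, 2, 2, 5],
})
grid = ["q1", "q2", "q3", "q4", "q5"]

# Speeding: completion faster than one third of the median duration
speeding = df["duration_s"] < df["duration_s"].median() / 3

# Straight-lining: the same answer across every grid item
straightlining = df[grid].nunique(axis=1) == 1

df["flagged"] = speeding | straightlining
print(df[["duration_s", "flagged"]])
```

The catch is the one the next paragraph names: these checks only run after fieldwork closes, so every flagged row represents budget already spent.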

The problem is structural: by the time you're cleaning, the bad data has already accumulated. In a controlled study conducted by the University of Mannheim, AI-moderated interviews produced zero gibberish responses, compared with a 10% gibberish rate in the equivalent static survey.

That's not a post-hoc cleaning result. It's a structural difference in how the data is collected.