Design Effect (DEFF)

Design effect (DEFF) measures how much the structure of a sampling design inflates the variance of your results compared with a simple random sample of the same size. The penalty grows with clustering. The closer DEFF is to 1, the better. Consumr.AI deliberately targets a DEFF of approximately 1.2.

Imagine you want to measure the heights of high school students and you need 100 data points. You have two approaches:

  1. Go to one school and measure 100 students there.
  2. Go to 100 different schools and measure 1 student at each.

If you go to a single school, all your measurements come from the same environment: the same socioeconomic zone, the same nutrition and genetics pool. Your data will be internally consistent, but it will be biased toward that one school’s population. Because students within one school resemble each other, each additional student adds less new information than an independent draw would, so the design effect is higher.

If you go to 100 different schools, each data point comes from a different context. You capture far more of the natural variation in the population. The design effect is closer to 1.

The more diverse and independent your data sources, the closer DEFF is to 1, and the less clustered bias you carry into your results.
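The school example can be put into rough numbers with Kish's approximation for equal-sized clusters, DEFF = 1 + (m - 1) × ρ, where m is the number of measurements per cluster and ρ is the intraclass correlation (how similar members of the same cluster are). This is a minimal sketch, not a description of how the platform computes DEFF; the function name and the ρ value of 0.05 are assumptions chosen purely for illustration.

```python
def kish_cluster_deff(avg_cluster_size: float, icc: float) -> float:
    """Kish's approximation for equal-sized clusters: DEFF = 1 + (m - 1) * rho."""
    if avg_cluster_size < 1:
        raise ValueError("avg_cluster_size must be at least 1")
    return 1.0 + (avg_cluster_size - 1.0) * icc


# ASSUMPTION: an intraclass correlation of 0.05 is a made-up illustrative value,
# not a measured figure for student heights or for any real survey.
ASSUMED_ICC = 0.05

# Approach 1: one school, 100 students measured in the same cluster.
one_school = kish_cluster_deff(avg_cluster_size=100, icc=ASSUMED_ICC)

# Approach 2: 100 schools, 1 student each, so every cluster has size 1.
hundred_schools = kish_cluster_deff(avg_cluster_size=1, icc=ASSUMED_ICC)

print(f"1 school x 100 students: DEFF = {one_school:.2f}")
print(f"100 schools x 1 student: DEFF = {hundred_schools:.2f}")
```

With clusters of size 1 the (m - 1) term vanishes, so DEFF is 1 regardless of ρ. With all 100 measurements in one cluster, even a small ρ produces a large penalty.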

| DEFF value | Interpretation |
| --- | --- |
| 1.0 | Equivalent to a perfect random sample; no clustering penalty |
| ~1.2 | Consumr.AI’s target: low clustering, acceptable respondent count |
| 2.0 | Significant clustering: sample needs to roughly double (about 100% more respondents) |

If your DEFF is 2, the effective sample size is half of what the raw count suggests, because effective sample size is the raw count divided by DEFF. To restore the statistical power you thought you had, you would need to roughly double the number of respondents. DEFF creep is a real operational cost, which is why staying close to 1.2 is a deliberate design goal.
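The relationship between raw respondent count, DEFF, and effective sample size can be sketched with the standard identity n_eff = n / DEFF. The function names below are hypothetical, and the 10,000 figure is used only as a convenient round respondent count.

```python
def effective_sample_size(n_raw: float, deff: float) -> float:
    """Number of independent observations the sample is 'worth': n / DEFF."""
    if deff <= 0:
        raise ValueError("deff must be positive")
    return n_raw / deff


def required_sample_size(n_target_effective: float, deff: float) -> float:
    """Raw respondents needed to reach a target effective sample size."""
    if deff <= 0:
        raise ValueError("deff must be positive")
    return n_target_effective * deff


n = 10_000  # illustrative raw respondent count

for deff in (1.0, 1.2, 2.0):
    n_eff = effective_sample_size(n, deff)
    n_needed = required_sample_size(n, deff)
    extra_pct = (deff - 1.0) * 100.0  # extra respondents vs. a DEFF of 1
    print(
        f"DEFF {deff:.1f}: effective n = {n_eff:,.0f}, "
        f"raw n needed for effective {n:,} = {n_needed:,.0f} "
        f"(+{extra_pct:.0f}%)"
    )
```

The extra respondents needed relative to an ideal random sample scale as (DEFF - 1), so small increases in DEFF translate directly into recruitment cost.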

A DEFF of exactly 1.0 would require perfectly random sampling across the entire population, an ideal that is nearly impossible to achieve in practice and would require a very large respondent pool to draw from. As you push toward 1.0, the number of completed responses needed for a given precision drops, but the practical constraints of respondent sourcing make that level of independence difficult to maintain.

At DEFF ~1.2, the platform achieves a balance: the clustering penalty is small enough that results remain statistically reliable, and the respondent count of approximately 10,000 is practical to work with. The design is “very close to 1” while still being achievable at the scale of a real research operation.
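To see why a DEFF of about 1.2 costs little in practice, one option is to compare margins of error for a proportion at the same raw sample size. This is a rough sketch under textbook assumptions (normal approximation, 95% confidence with z = 1.96, worst-case proportion p = 0.5). The function name and these defaults are illustrative and are not taken from the platform.

```python
import math


def margin_of_error(n_raw: float, deff: float, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion, adjusted for design effect.

    Uses the effective sample size n_raw / deff in the usual
    z * sqrt(p * (1 - p) / n) formula.
    """
    n_eff = n_raw / deff
    return z * math.sqrt(p * (1.0 - p) / n_eff)


n = 10_000  # approximate respondent count mentioned in the text

for deff in (1.0, 1.2, 2.0):
    moe = margin_of_error(n, deff)
    print(f"DEFF {deff:.1f}: margin of error = +/-{moe * 100:.2f} percentage points")
```

Because the margin of error grows with the square root of DEFF, moving from 1.0 to 1.2 widens it by under 10%, while a DEFF of 2 widens it by about 41%.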

You will typically see DEFF displayed as 1.20, 1.21, or 1.22 in platform outputs. Minor variation of this kind is acceptable. If you see DEFF climbing toward 2, that is a flag worth raising.