Effect Size: Cohen's d, Eta-Squared, and Interpretation

Learn what effect size is, how to calculate Cohen's d and eta-squared with worked examples, and why p-values aren't enough for research decisions.

What Is Effect Size?

Effect size is a quantitative measure of the magnitude of a phenomenon: how big the difference, relationship, or effect actually is. Unlike a p-value, it does not grow or shrink just because the sample does. A p-value tells you whether an effect is statistically distinguishable from zero; effect size tells you whether the effect is large enough to care about in practice. A drug trial might find that a new medication reduces headache duration by an average of 0.3 minutes with p < 0.001. That's statistically significant, but nobody's switching medications to save 18 seconds. Effect size separates statistical significance from practical significance. Many peer-reviewed journals now expect it to be reported, and it is a best practice in applied research.

Why Effect Size Matters

Statistical significance is a function of three things: the true effect, the variability in the data, and the sample size. With a large enough sample, even trivially small differences become statistically significant. A satisfaction survey with 50,000 respondents might flag a 0.02-point difference between regions as significant at p < 0.01. That difference is real in a statistical sense but meaningless for any business decision.

Effect size solves this problem by measuring the magnitude of the difference on a standardized scale. It lets you:

  • Prioritize findings: when multiple differences are significant, effect size tells you which ones are worth acting on
  • Compare across studies: a Cohen's d of 0.5 means the same thing whether the study used 50 respondents or 5,000
  • Plan future research: power analyses require an expected effect size to calculate the sample size you need
  • Communicate impact: telling a stakeholder "the effect was medium-sized" is more useful than "p = 0.03"

How Effect Size Works

Cohen's d: Comparing Two Group Means

Cohen's d measures the difference between two group means in standard deviation units.

Formula:

d = (x-bar1 - x-bar2) / s_pooled

Where s_pooled is the pooled standard deviation:

s_pooled = sqrt(((n1-1)*s1^2 + (n2-1)*s2^2) / (n1 + n2 - 2))

Worked Example: Cohen's d

A concept test compared two product designs. Design A (n = 45) received a mean appeal score of 7.2 (s = 1.5). Design B (n = 50) received a mean of 6.4 (s = 1.7).

Step 1: Calculate the pooled standard deviation.

s_pooled = sqrt(((45-1)*1.5^2 + (50-1)*1.7^2) / (45 + 50 - 2))
         = sqrt((44*2.25 + 49*2.89) / 93)
         = sqrt((99.0 + 141.61) / 93)
         = sqrt(240.61 / 93)
         = sqrt(2.587)
         = 1.608

Step 2: Calculate d.

d = (7.2 - 6.4) / 1.608 = 0.8 / 1.608 = 0.497

Interpretation: The difference between designs is approximately 0.50 standard deviations, a medium effect. This isn't just statistically detectable; it represents a meaningful difference in how respondents perceived the two designs.
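The arithmetic above can be reproduced in a few lines. This is a minimal sketch using only the summary statistics from the worked example; the function names `pooled_sd` and `cohens_d` are illustrative, not part of any particular library.

```python
import math

def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation from two groups' sizes and standard deviations."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def cohens_d(mean1, mean2, n1, s1, n2, s2):
    """Cohen's d: the mean difference expressed in pooled-SD units."""
    return (mean1 - mean2) / pooled_sd(n1, s1, n2, s2)

# Design A: n = 45, mean 7.2, s = 1.5; Design B: n = 50, mean 6.4, s = 1.7
sp = pooled_sd(45, 1.5, 50, 1.7)
d = cohens_d(7.2, 6.4, 45, 1.5, 50, 1.7)

print(f"s_pooled = {sp:.3f}")  # s_pooled = 1.608
print(f"d = {d:.3f}")          # d = 0.497
```

Swapping the order of the two designs flips the sign of d but leaves its magnitude unchanged.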

Cohen's Benchmarks

Jacob Cohen proposed these guidelines for interpreting d in behavioral and social science research:

| Effect Size | Cohen's d | What It Looks Like |
|---|---|---|
| Small | 0.20 | Difference is real but hard to see without data |
| Medium | 0.50 | Difference is noticeable and practically relevant |
| Large | 0.80 | Difference is obvious and substantial |

These are guidelines, not rules. In some contexts, a "small" effect of d = 0.20 is hugely important: a drug that reduces mortality by even a small effect size can save thousands of lives at scale. In market research, the practical threshold depends on the cost of acting on the finding versus the cost of ignoring it.

Eta-Squared: Effect Size for ANOVA

When comparing three or more groups (ANOVA), eta-squared measures how much of the total variance in the outcome is explained by group membership.

Formula:

eta^2 = SS_between / SS_total

Where SS_between is the sum of squares between groups and SS_total is the total sum of squares.

Worked example:

An ANOVA comparing satisfaction across three customer segments produced:

SS_between = 180
SS_within = 420
SS_total = 180 + 420 = 600

eta^2 = 180 / 600 = 0.30

Thirty percent of the variance in satisfaction is explained by segment membership. That's a large effect by Cohen's benchmarks:

| Effect Size | Eta-Squared |
|---|---|
| Small | 0.01 |
| Medium | 0.06 |
| Large | 0.14 |
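When you have raw scores rather than a finished ANOVA table, the sums of squares can be computed directly. The sketch below assumes a one-way design; the function name `eta_squared` and the segment ratings are hypothetical and only illustrate the formula.

```python
def eta_squared(groups):
    """Eta-squared for a one-way design: SS_between / SS_total."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    # Between-group sum of squares: group size times squared distance
    # of each group mean from the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Total sum of squares: squared distance of every score from the grand mean
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    return ss_between / ss_total

# Sums of squares from the worked example
print(180 / (180 + 420))  # 0.3

# Hypothetical satisfaction ratings for three segments (made-up data)
segments = [[6, 7, 8], [4, 5, 6], [7, 8, 9]]
print(f"{eta_squared(segments):.2f}")  # 0.70
```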

r: Effect Size for Correlations

The Pearson correlation coefficient (r) is itself an effect size measure. It ranges from -1 to +1.

| Effect Size | r |
|---|---|
| Small | 0.10 |
| Medium | 0.30 |
| Large | 0.50 |

An r of 0.35 between ad recall and purchase intent represents a medium effect. The two variables are related, but ad recall explains only about 12% of the variance in purchase intent (r^2 = 0.35^2 ≈ 0.12).
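For reference, r and the share of variance it explains can be computed from paired observations as sketched here. The helper `pearson_r` and the small data set are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Variance explained for the ad recall example
print(f"r^2 = {0.35**2:.4f}")  # r^2 = 0.1225

# Made-up paired scores, for illustration only
recall = [1, 2, 3, 4, 5]
intent = [2, 1, 4, 3, 5]
r = pearson_r(recall, intent)
print(f"r = {r:.2f}, r^2 = {r**2:.2f}")  # r = 0.80, r^2 = 0.64
```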

Why P-Values Aren't Enough

A p-value answers: "If there were no real effect, how likely would I be to see data this extreme?" It doesn't answer: "How big is the effect?" or "Does this matter practically?"

Consider two studies on the same intervention:

| Study | n per group | Mean difference | p-value | Cohen's d |
|---|---|---|---|---|
| Study A | 25 | 4.2 points | 0.08 | 0.50 |
| Study B | 10,000 | 0.4 points | < 0.001 | 0.05 |

Study A found a medium-sized effect that didn't reach significance (underpowered). Study B found a statistically significant effect that's trivially small. If you only look at p-values, Study B "worked" and Study A "failed." If you look at effect sizes, Study A found something worth investigating and Study B found something not worth acting on.
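To see how sample size alone drives the p-value, the sketch below holds d fixed and varies n. It assumes two equal-sized groups and uses a normal approximation to the two-sample test (z = d * sqrt(n / 2)), so the values are approximate; `approx_two_sided_p` is an illustrative name, not a library function.

```python
import math

def approx_two_sided_p(d, n_per_group):
    """Approximate two-sided p-value for a two-group comparison.

    Assumes equal group sizes and a normal approximation to the t-test.
    """
    z = abs(d) * math.sqrt(n_per_group / 2)
    # Two-sided tail area of the standard normal distribution
    return math.erfc(z / math.sqrt(2))

# The same trivially small effect at increasing sample sizes
for n in (100, 1_000, 10_000, 100_000):
    p = approx_two_sided_p(0.05, n)
    print(f"d = 0.05, n per group = {n:>7,}: p ~ {p:.4f}")
```

The effect size never changes in this loop; only the amount of data does, and with it the p-value.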

When to Use Effect Size

  • Concept testing: determining whether the difference between concepts is large enough to justify choosing one over the other
  • A/B testing: evaluating whether a statistically significant result represents a meaningful improvement worth implementing
  • Power analysis: estimating the sample size needed for a future study based on the effect size you want to detect
  • Meta-analysis: comparing and combining results across multiple studies that used different sample sizes and measures
  • Stakeholder reporting: translating statistical results into practical terms that inform decisions

Common Mistakes

  • Reporting only p-values without effect sizes: this leaves the most important question ("How big is it?") unanswered
  • Treating Cohen's benchmarks as absolute cutoffs: small, medium, and large are context-dependent; a "small" effect can be practically important depending on the domain
  • Using eta-squared instead of partial eta-squared in multi-factor designs: in factorial ANOVA, partial eta-squared isolates each factor's contribution more accurately
  • Ignoring confidence intervals for effect sizes: a Cohen's d of 0.50 with a 95% CI of [0.05, 0.95] tells a very different story than one with a CI of [0.35, 0.65]
  • Confusing effect size with effect importance: a large effect size in a trivial variable matters less than a small effect size in a critical outcome
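On the confidence interval point, a rough interval for Cohen's d can be obtained from a common large-sample approximation to its standard error. This is a sketch under that assumption (exact intervals use the noncentral t distribution, which dedicated software handles); `cohens_d_ci` is an illustrative name.

```python
import math

def cohens_d_ci(d, n1, n2, z=1.96):
    """Approximate 95% confidence interval for Cohen's d.

    Uses the large-sample standard error
    sqrt((n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2))).
    """
    se = math.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    return d - z * se, d + z * se

# Concept test from the worked example: d = 0.497, n1 = 45, n2 = 50
low, high = cohens_d_ci(0.497, 45, 50)
print(f"95% CI: [{low:.2f}, {high:.2f}]")  # 95% CI: [0.09, 0.91]
```

With fewer than 50 respondents per design, the interval runs from a small effect to a large one, which is worth stating alongside the point estimate.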

How Quali-Fi Supports Effect Size Analysis

Quali-Fi's Research plan ($1,061/month) reports effect sizes alongside significance tests in cross-tabulation outputs, giving you both the "is it real?" and "is it big enough?" answers in a single view. The Intelligence tier ($2,750+/project) includes effect size calculations in key driver analysis and experimental designs, with automated interpretation that flags practically meaningful differences versus merely statistically significant ones.

Frequently Asked Questions

Can effect size be negative?

Yes. Cohen's d can be negative; it just means the first group's mean is lower than the second group's. The absolute value tells you the magnitude. When reporting, researchers often arrange groups so that d is positive for clarity, but the sign is mathematically meaningful. A correlation r can also be negative, while eta-squared, as a proportion of variance, is never negative.

What effect size should I use for my power analysis?

Use the smallest effect size you'd consider practically meaningful. If a 0.3-point difference on a 10-point scale wouldn't change any decisions, don't power your study to detect it. In market research, d = 0.30 to 0.50 is a common target for concept tests and satisfaction studies.
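As a rough guide to what those targets imply, the sketch below estimates the sample size per group for a two-group comparison. It assumes a two-sided test, equal group sizes, and a normal approximation (a t-based calculation gives slightly larger numbers); `n_per_group` is an illustrative name, and the defaults of alpha = 0.05 and 80% power are conventional choices rather than anything stated above.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate sample size per group to detect effect size d.

    Normal approximation: n = 2 * (z_(1 - alpha/2) + z_power)^2 / d^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d**2)

for d in (0.30, 0.50):
    print(f"d = {d:.2f}: about {n_per_group(d)} respondents per group")
# d = 0.30: about 175 respondents per group
# d = 0.50: about 63 respondents per group
```

Because n scales with 1 / d^2, halving the effect size you want to detect roughly quadruples the required sample.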

How do I convert between effect size measures?

Common conversions: d = 2r / sqrt(1 - r^2) converts a correlation to Cohen's d. For eta-squared to d in a two-group comparison: d = 2 * sqrt(eta^2 / (1 - eta^2)). Both formulas assume roughly equal group sizes. Most statistical software and online calculators handle these conversions automatically.
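Both conversions are one-liners. The sketch below assumes two groups of roughly equal size, which is the setting where these formulas apply; the function names are illustrative.

```python
import math

def r_to_d(r):
    """Convert a correlation r to Cohen's d (assumes equal group sizes)."""
    return 2 * r / math.sqrt(1 - r**2)

def eta_squared_to_d(eta_sq):
    """Convert eta-squared from a two-group comparison to Cohen's d."""
    return 2 * math.sqrt(eta_sq / (1 - eta_sq))

print(f"r = 0.35     -> d = {r_to_d(0.35):.2f}")            # d = 0.75
print(f"eta^2 = 0.06 -> d = {eta_squared_to_d(0.06):.2f}")  # d = 0.51
```

The two functions agree with each other whenever eta^2 = r^2 and r is positive, since eta-squared discards the sign.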
