Effect Size: Cohen's d, Eta-Squared, and Interpretation

Learn what effect size is, how to calculate Cohen's d and eta-squared with worked examples, and why p-values aren't enough for research decisions.

What Is Effect Size?

Effect size is a quantitative measure of the magnitude of a phenomenon: how big the difference, relationship, or effect actually is, independent of sample size. While a p-value tells you whether an effect is statistically distinguishable from zero, effect size tells you whether it is large enough to matter in practice. A drug trial might find that a new medication reduces headache duration by an average of 0.3 minutes with p < 0.001. That's statistically significant, but nobody's switching medications to save 18 seconds. Effect size separates statistical significance from practical significance, and it has become a required reporting element in many peer-reviewed journals and a best practice in applied research.

Why Effect Size Matters

Statistical significance is a function of three things: the true effect, the variability in the data, and the sample size. With a large enough sample, even trivially small differences become statistically significant. A satisfaction survey with 50,000 respondents might flag a 0.02-point difference between regions as significant at p < 0.01. That difference is real in a statistical sense but meaningless for any business decision.

Effect size solves this problem by measuring the magnitude of the difference on a standardized scale. It lets you:

Prioritize findings: when multiple differences are significant, effect size tells you which ones are worth acting on
Compare across studies: a Cohen's d of 0.5 means the same thing whether the study used 50 respondents or 5,000
Plan future research: power analyses require an expected effect size to calculate the sample size you need
Communicate impact: telling a stakeholder "the effect was medium-sized" is more useful than "p = 0.03"

How Effect Size Works

Cohen's d: Comparing Two Group Means

Cohen's d measures the difference between two group means in standard deviation units.

Formula:

d = (x-bar1 - x-bar2) / s_pooled

Where s_pooled is the pooled standard deviation:

s_pooled = sqrt(((n1-1)*s1^2 + (n2-1)*s2^2) / (n1 + n2 - 2))

Worked Example: Cohen's d

A concept test compared two product designs. Design A (n = 45) received a mean appeal score of 7.2 (s = 1.5). Design B (n = 50) received a mean of 6.4 (s = 1.7).

Step 1: Calculate the pooled standard deviation.

s_pooled = sqrt(((45-1)*1.5^2 + (50-1)*1.7^2) / (45 + 50 - 2))
s_pooled = sqrt((44*2.25 + 49*2.89) / 93)
s_pooled = sqrt((99.0 + 141.61) / 93)
s_pooled = sqrt(240.61 / 93)
s_pooled = sqrt(2.587)
s_pooled = 1.608

Step 2: Calculate d.

d = (7.2 - 6.4) / 1.608 = 0.8 / 1.608 = 0.497

Interpretation: The difference between designs is approximately 0.50 standard deviations, a medium effect. This isn't just statistically detectable; it represents a meaningful difference in how respondents perceived the two designs.
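The two steps above can be sketched in a few lines of Python. This is a minimal illustration that works from summary statistics only; the function names pooled_sd and cohens_d are hypothetical helpers, not part of any particular library.

```python
import math

def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def cohens_d(mean1, s1, n1, mean2, s2, n2):
    """Cohen's d: the mean difference in pooled-standard-deviation units."""
    return (mean1 - mean2) / pooled_sd(n1, s1, n2, s2)

# Design A (n = 45, mean 7.2, s = 1.5) vs. Design B (n = 50, mean 6.4, s = 1.7)
d = cohens_d(7.2, 1.5, 45, 6.4, 1.7, 50)
print(round(pooled_sd(45, 1.5, 50, 1.7), 3))  # 1.608
print(round(d, 3))                            # 0.497
```

Swapping the order of the two groups flips the sign of d but leaves its magnitude unchanged.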

Cohen's Benchmarks

Jacob Cohen proposed these guidelines for interpreting d in behavioral and social science research:

Effect Size	Cohen's d	What It Looks Like
Small	0.20	Difference is real but hard to see without data
Medium	0.50	Difference is noticeable and practically relevant
Large	0.80	Difference is obvious and substantial

These are guidelines, not rules. In some contexts, a "small" effect of d = 0.20 is hugely important. A drug that reduces mortality by a small effect size saves thousands of lives at scale. In market research, the practical threshold depends on the cost of acting on the finding versus the cost of ignoring it.

Eta-Squared: Effect Size for ANOVA

When comparing three or more groups (ANOVA), eta-squared measures how much of the total variance in the outcome is explained by group membership.

Formula:

eta^2 = SS_between / SS_total

Where SS_between is the sum of squares between groups and SS_total is the total sum of squares.

Worked example:

An ANOVA comparing satisfaction across three customer segments produced:

SS_between = 180
SS_within = 420
SS_total = 180 + 420 = 600

eta^2 = 180 / 600 = 0.30

Thirty percent of the variance in satisfaction is explained by segment membership. That's a large effect by Cohen's benchmarks:

Effect Size	Eta-Squared
Small	0.01
Medium	0.06
Large	0.14
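When you have raw scores rather than an ANOVA table, the same ratio can be computed directly. The sketch below assumes a one-way design (a single grouping factor); the eta_squared helper and the three small score lists are made up for illustration and are not data from the worked example.

```python
def eta_squared(groups):
    """Eta-squared = SS_between / SS_total for a one-way design."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    # Between-groups sum of squares: group size times squared distance
    # of each group mean from the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Total sum of squares: squared distance of every score from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    return ss_between / ss_total

# Straight from the sums of squares in the worked example
print(180 / (180 + 420))  # 0.3

# From raw scores (hypothetical data for three segments)
segments = [[6, 7, 8], [4, 5, 6], [7, 8, 9]]
print(round(eta_squared(segments), 3))  # 0.7
```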

r: Effect Size for Correlations

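The Pearson correlation coefficient (r) is itself an effect size measure. It ranges from -1 to +1; the sign gives the direction of the relationship and the absolute value gives its strength, so Cohen's benchmarks apply to |r|.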

Effect Size	r
Small	0.10
Medium	0.30
Large	0.50

An r of 0.35 between ad recall and purchase intent represents a medium effect. The two variables are related, but ad recall explains only about 12% of the variance in purchase intent (r^2 = 0.35^2 = 0.1225).
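As a rough sketch, r and its squared value can be computed and labeled as follows. The pearson_r and label_r helpers are hypothetical names, the "negligible" label for values below 0.10 is an assumption added here rather than one of Cohen's categories, and the five paired scores are invented purely to show the calculation.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def label_r(r):
    """Rough label for |r| using Cohen's benchmarks as lower bounds."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "negligible"  # assumption: a label for values under the "small" benchmark

# The ad recall example: r = 0.35
print(label_r(0.35), round(0.35**2, 4))  # medium 0.1225

# Hypothetical paired scores
print(round(pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]), 2))  # 0.8
```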

Why P-Values Aren't Enough

A p-value answers: "If there were no real effect, how likely would I be to see data this extreme?" It doesn't answer: "How big is the effect?" or "Does this matter practically?"

Consider two studies on the same intervention:

Study	n per group	Mean difference	p-value	Cohen's d
Study A	25	4.2 points	0.08	0.50
Study B	9,000	0.4 points	0.001	0.05

Study A found a medium-sized effect that didn't reach significance (underpowered). Study B found a statistically significant effect that's trivially small. If you only look at p-values, Study B "worked" and Study A "failed." If you look at effect sizes, Study A found something worth investigating and Study B found something not worth acting on.
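To see how sample size alone drives the p-value, the sketch below holds the effect size fixed and varies n. It assumes two groups of equal size and uses a normal approximation to the two-sample t-test, so results for small samples are only approximate; approx_p_value is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def approx_p_value(d, n_per_group):
    """Approximate two-sided p-value for a two-group comparison.

    Assumes equal group sizes; the test statistic is d * sqrt(n / 2),
    evaluated against a standard normal rather than a t distribution.
    """
    z = abs(d) * math.sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(z))

# Same small effect (d = 0.10), increasingly large samples
for n in (100, 1_000, 10_000):
    print(n, approx_p_value(0.10, n))
```

With d fixed at 0.10, the result is far from significant at 100 per group, crosses p < 0.05 at 1,000 per group, and is far below 0.001 at 10,000 per group. For Study A's d of 0.50 with 25 per group, the same approximation gives roughly 0.08.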

When to Use Effect Size

Concept testing: determining whether the difference between concepts is large enough to justify choosing one over the other
A/B testing: evaluating whether a statistically significant result represents a meaningful improvement worth implementing
Power analysis: estimating the sample size needed for a future study based on the effect size you want to detect
Meta-analysis: comparing and combining results across multiple studies that used different sample sizes and measures
Stakeholder reporting: translating statistical results into practical terms that inform decisions

Common Mistakes

Reporting only p-values without effect sizes: this leaves the most important question ("How big is it?") unanswered
Treating Cohen's benchmarks as absolute cutoffs: small, medium, and large are context-dependent; a "small" effect can be practically important depending on the domain
Using eta-squared instead of partial eta-squared in multi-factor designs: in factorial ANOVA, partial eta-squared isolates each factor's contribution more accurately
Ignoring confidence intervals for effect sizes: a Cohen's d of 0.50 with a 95% CI of [0.05, 0.95] tells a very different story than one with a CI of [0.35, 0.65]
Confusing effect size with effect importance: a large effect size in a trivial variable matters less than a small effect size in a critical outcome
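On the confidence interval point, one common large-sample approximation for the standard error of d can be sketched as below. This is an approximation under a normality assumption, not an exact interval (exact methods use the noncentral t distribution), and d_confidence_interval is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def d_confidence_interval(d, n1, n2, level=0.95):
    """Approximate confidence interval for Cohen's d.

    Uses the large-sample standard error
    sqrt((n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2))).
    """
    se = math.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return d - z * se, d + z * se

# The concept test from the worked example: d = 0.497, n = 45 and 50
low, high = d_confidence_interval(0.497, 45, 50)
print(round(low, 2), round(high, 2))
```

For the concept test this gives an interval of roughly 0.09 to 0.91: the point estimate is medium-sized, but with fewer than 100 respondents the data are compatible with anything from a very small to a large effect.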

How Quali-Fi Supports Effect Size Analysis

Quali-Fi's Research plan ($1,061/month) reports effect sizes alongside significance tests in cross-tabulation outputs, giving you both the "is it real?" and "is it big enough?" answers in a single view. The Intelligence tier ($2,750+/project) includes effect size calculations in key driver analysis and experimental designs, with automated interpretation that flags practically meaningful differences versus merely statistically significant ones.

Frequently Asked Questions

Can effect size be negative?

Yes. Cohen's d can be negative; it just means the first group's mean is lower than the second group's. The absolute value tells you the magnitude. When reporting, researchers often arrange groups so that d is positive for clarity, but the sign is mathematically meaningful. A correlation r can also be negative, while eta-squared, as a proportion of variance, cannot.

What effect size should I use for my power analysis?

Use the smallest effect size you'd consider practically meaningful. If a 0.3-point difference on a 10-point scale wouldn't change any decisions, don't power your study to detect it. In market research, d = 0.30 to 0.50 is a common target for concept tests and satisfaction studies.
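As a rough illustration of how the chosen effect size drives sample size, the sketch below uses the standard normal-approximation formula for a two-group, two-sided comparison with equal group sizes. The defaults of alpha = 0.05 and 80% power are conventional assumptions rather than values from this guide, and n_per_group is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate respondents needed per group to detect effect size d.

    Normal approximation: n = 2 * (z_(1 - alpha/2) + z_power)^2 / d^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d**2)

for d in (0.30, 0.50):
    print(d, n_per_group(d))
# 0.3 175
# 0.5 63
```

Because this uses the normal distribution rather than the t distribution, it slightly underestimates the requirement; dedicated power software typically returns a respondent or two more per group. The main pattern holds either way: the required sample grows with the inverse square of d, so targeting d = 0.30 instead of 0.50 nearly triples it.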

How do I convert between effect size measures?

Common conversions: d = 2r / sqrt(1 - r^2) converts a correlation to Cohen's d. For eta-squared to d in a two-group comparison: d = 2 * sqrt(eta^2 / (1 - eta^2)). Both formulas assume two groups of roughly equal size. Most statistical software and online calculators handle these conversions automatically.
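These conversions are short enough to write out by hand. The sketch below assumes two groups of roughly equal size; the function names are hypothetical, and d_to_r is simply the algebraic inverse of the first formula rather than something stated above.

```python
import math

def r_to_d(r):
    """Correlation to Cohen's d: d = 2r / sqrt(1 - r^2)."""
    return 2 * r / math.sqrt(1 - r**2)

def d_to_r(d):
    """Inverse of r_to_d: r = d / sqrt(d^2 + 4)."""
    return d / math.sqrt(d**2 + 4)

def eta_squared_to_d(eta_sq):
    """Eta-squared to Cohen's d for a two-group comparison."""
    return 2 * math.sqrt(eta_sq / (1 - eta_sq))

print(round(r_to_d(0.35), 2))          # 0.75
print(round(d_to_r(r_to_d(0.35)), 2))  # 0.35
```

In a two-group comparison eta-squared equals r squared, so eta_squared_to_d(0.35**2) returns the same value as r_to_d(0.35).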
