Definition (plain): Degrees of freedom (df) is the number of independent values in a data set that can vary while still satisfying any constraints you impose. In other words, df counts how many observations you can choose freely before the remaining values are forced by a restriction (for example, a fixed total or an estimated parameter).
Simple rule and general formula
– Common simple case: df = N − 1, where N is sample size. This appears when you estimate one parameter from the data (typically the sample mean) and then measure variability around that estimate.
– More generally: df = N − P, where P is the number of parameters or independent constraints that have been estimated or imposed from the same data.
Why it matters
– The value of df determines the shape of sampling distributions used in hypothesis tests. For example, t-distributions depend on df; low df produce heavier tails (more extreme values are relatively likely), while large df make the t-distribution approach the normal distribution.
– Correct df are essential to compute p-values, critical values, confidence intervals, and to assess tests like t-tests and chi-square tests.
Worked numeric example (sample variance)
Step 1 — data and sample size:
– Observations: 3, 8, 5, 4, 10 → N = 5.
Step 2 — sample mean:
– Mean = (3 + 8 + 5 + 4 + 10) / 5 = 30 / 5 = 6.
Step 3 — squared deviations and sum:
– (3−6)^2 = 9
– (8−6)^2 = 4
– (5−6)^2 = 1
– (4−6)^2 = 4
– (10−6)^2 = 16
– Sum of squared deviations = 9 + 4 + 1 + 4 + 16 = 34.
Step 4 — compute variance estimates:
– Population variance (appropriate only if these five values were the entire population): σ^2_pop = 34 / 5 = 6.8.
– Sample (unbiased) variance estimator: s^2 = 34 / (N − 1) = 34 / 4 = 8.5.
Why divide by N − 1? Because the sample mean was estimated from the data; that estimation removes one degree of freedom. Using N − 1 corrects bias in the variance estimate when using the sample to infer the population variance.
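As a concrete check, here is a minimal Python sketch of the two estimators, using only the standard library and the data above:

```python
# Sample vs. population variance for the worked example.
data = [3, 8, 5, 4, 10]
n = len(data)                            # N = 5
mean = sum(data) / n                     # 6.0

ss = sum((x - mean) ** 2 for x in data)  # sum of squared deviations = 34

var_pop = ss / n            # divide by N      -> 6.8
var_sample = ss / (n - 1)   # divide by N - 1  -> 8.5 (unbiased)

print(var_pop, var_sample)  # 6.8 8.5
```

Python's statistics module implements the same pair as pvariance (divide by N) and variance (divide by N − 1).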
Common examples and quick rules (checklist)
– Determine constraints: is there a fixed sum, a fixed mean, or parameters estimated from the same data?
– Count N (observations).
– Count P (number of parameters estimated from the sample or independent constraints).
– Compute df = N − P.
– Use df to find critical values or p-values from the correct sampling distribution.
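Once df is known, the last step is a table lookup; a minimal sketch, assuming scipy is available:

```python
from scipy import stats

df = 4  # e.g., N = 5 observations minus one estimated mean

# Two-sided critical value at alpha = 0.05 (97.5th percentile of t with df = 4).
t_crit = stats.t.ppf(0.975, df)
print(t_crit)                # ~2.776 (vs. ~1.96 for the normal: heavier tails at low df)

# Two-sided p-value for an observed t statistic of 2.0.
p = 2 * stats.t.sf(2.0, df)
print(p)                     # ~0.116
```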
Typical df formulas for standard tests
– One-sample t-test: df = N − 1.
– Two-sample t-test (equal variances assumed): df = N1 + N2 − 2.
– Welch’s t-test (unequal variances): use the Welch–Satterthwaite approximation (non-integer df; software usually computes it).
– Chi-square goodness-of-fit: df = k − 1 − r, where k is number of categories and r is number of parameters estimated from the data (often r = 0).
– Chi-square test of independence: df = (r − 1) × (c − 1), where r is the number of rows and c is the number of columns in the contingency table (after any category collapsing).
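In code, the contingency-table df follows directly from the table's shape; a sketch with a made-up 2×3 table, assuming scipy is available:

```python
import numpy as np
from scipy import stats

# Hypothetical 2x3 table of observed counts.
observed = np.array([[10, 20, 30],
                     [15, 25, 20]])

r, c = observed.shape
df_manual = (r - 1) * (c - 1)  # (2-1)(3-1) = 2

# scipy reports the same df as part of the independence test.
chi2, p, df_scipy, expected = stats.chi2_contingency(observed)
print(df_manual, df_scipy)     # 2 2
```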
More common formulas and how to apply them
– One‑way ANOVA (k groups, total N observations)
– Between‑groups df = k − 1 (number of groups minus one).
– Within‑groups (error) df = N − k.
– Total df = N − 1.
– Example: k = 4 groups, N = 40 observations. Between df = 3, Within df = 36, Total df = 39.
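The three ANOVA dfs always satisfy between + within = total; a minimal sketch of the bookkeeping for this example:

```python
# One-way ANOVA df for k = 4 groups, N = 40 observations.
k, N = 4, 40
df_between = k - 1   # 3
df_within = N - k    # 36
df_total = N - 1     # 39
assert df_between + df_within == df_total
```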
– Linear regression (N observations, P parameters estimated)
– Residual (error) df = N − P.
– Regression df = P − 1 if one of the parameters is an intercept (this is the number of estimated slope coefficients).
– Total df = N − 1.
– Example: N = 50, model has intercept + 3 slopes (P = 4). Residual df = 50 − 4 = 46. Regression df = 3. Total df = 49.
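The same accounting identity holds for regression (regression df + residual df = total df); a sketch for this example:

```python
# Linear regression df for N = 50 observations, P = 4 parameters (intercept + 3 slopes).
N, P = 50, 4
df_regression = P - 1  # 3
df_residual = N - P    # 46
df_total = N - 1       # 49
assert df_regression + df_residual == df_total
```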
– Welch’s t-test (unequal variances) — Welch–Satterthwaite approximation
– When two samples have different variances, the df is approximated by
df ≈ ( (s1^2 / n1 + s2^2 / n2)^2 ) / ( (s1^4 / (n1^2 (n1 − 1))) + (s2^4 / (n2^2 (n2 − 1))) )
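The formula maps directly to code; a minimal sketch (the helper name welch_df is ours):

```python
def welch_df(s1_sq, n1, s2_sq, n2):
    """Welch-Satterthwaite approximate df from sample variances and sizes."""
    a = s1_sq / n1
    b = s2_sq / n2
    # s1^4 / (n1^2 (n1 - 1)) simplifies to (s1^2/n1)^2 / (n1 - 1) = a^2 / (n1 - 1).
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

print(welch_df(4.41, 10, 3.24, 12))  # ~17.9, matching the worked example below
```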
Notes and related formulas
– Pooled two-sample t-test (equal variances assumed)
– If you justify equal population variances, pool the sample variances and use df = n1 + n2 − 2.
– Use pooled variance only when the equal-variance assumption is defensible (formal test or strong subject-matter justification).
– Chi-square tests
– Goodness-of-fit (k categories): df = k − 1 (minus additional constraints if parameters are estimated from the data).
– Contingency table (r rows × c columns): df = (r − 1)(c − 1).
– ANOVA (analysis of variance)
– Between-groups df = k − 1 (k = number of groups).
– Within-groups (residual) df = N − k (N = total observations).
– Total df = N − 1.
– F-test
– An F statistic compares two variances; report numerator and denominator dfs separately (df1, df2), e.g., F(df1 = n1 − 1, df2 = n2 − 1).
– Linear regression (reminder)
– Residual (error) df = N − P, where P = number of parameters estimated (including intercept).
– Regression (model) df = P − 1 if one parameter is an intercept.
– Total df = N − 1.
Worked numeric example — Welch t-test
– Data: sample 1: n1 = 10, s1 = 2.10; sample 2: n2 = 12, s2 = 1.80.
– Compute components:
– s1^2 / n1 = 4.41 / 10 = 0.441
– s2^2 / n2 = 3.24 / 12 = 0.270
– Numerator = (0.441 + 0.270)^2 = 0.711^2 ≈ 0.5055
– s1^4 / (n1^2 (n1 − 1)) = 19.4481 / 900 = 0.02161
– s2^4 / (n2^2 (n2 − 1)) = 10.4976 / 1584 = 0.00663
– Denominator = 0.02161 + 0.00663 = 0.02824
– df ≈ 0.5055 / 0.02824 ≈ 17.9. Use the non-integer df directly in software; round down to 17 if you must use a printed t table.
– Compare: pooled t-test would give df = 10 + 12 − 2 = 20 (but pooled is inappropriate here unless equal-variance assumption is valid).
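A few lines of Python reproduce the arithmetic above:

```python
# Welch df check for n1 = 10, s1 = 2.10; n2 = 12, s2 = 1.80.
n1, s1 = 10, 2.10
n2, s2 = 12, 1.80

num = (s1**2 / n1 + s2**2 / n2) ** 2                           # ~0.5055
den = s1**4 / (n1**2 * (n1 - 1)) + s2**4 / (n2**2 * (n2 - 1))  # ~0.02824
print(num / den)                                               # ~17.9
```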
Practical checklist for performing and reporting a two-sample t-test (Welch)
1) Pick the right test
– If you suspect unequal variances (or you haven’t tested equality), use Welch’s t-test. It does not assume equal variances.
– If you are confident variances are equal and sample sizes are similar, a pooled (Student’s) t-test can be used — but check that assumption.
2) Verify assumptions (quick diagnostic)
– Independent samples: observations in one group don’t influence the other.
– Outcome approximately continuous and roughly symmetric, or samples large enough for the Central Limit Theorem to help (n ≥ ~30 per group).
– No extreme outliers; if present, consider robust alternatives (e.g., bootstrap, rank tests).
3) Compute the test statistic (steps)
– Gather: group sample sizes n1, n2; sample means x̄1, x̄2; sample variances s1^2, s2^2.
– Standard error of the difference: SE = sqrt(s1^2/n1 + s2^2/n2).
– t-statistic: t = (x̄1 − x̄2) / SE.
– Degrees of freedom (Welch approximation):
df ≈ ( (s1^2/n1 + s2^2/n2)^2 ) / ( (s1^4 / [n1^2 (n1 − 1)]) + (s2^4 / [n2^2 (n2 − 1)]) )
Notes on df: you may use the non-integer df directly with most software. If using tables that require an integer df, rounding down (conservative) is common but unnecessary when you have software.
4) Find p-value or critical t
– Two-sided test: p = 2 · P(T_df ≥ |t|), where T_df is a t-distributed random variable with the Welch df above.
– One-sided test: p = P(T_df ≥ t) when testing whether μ1 > μ2 (reverse the inequality for the opposite direction).
– Compare p to your α (commonly 0.05). If p ≤ α, reject H0; otherwise fail to reject H0.
5) Report the result and a confidence interval
– 95% CI for μ1 − μ2: (x̄1 − x̄2) ± tcrit(df, 0.975) · SE, where SE = sqrt(s1^2/n1 + s2^2/n2) and tcrit is the critical t-value for the chosen df.
– State sample sizes, means, variances (or SDs), t statistic, df (state approximation method), p-value, and the confidence interval.
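Steps 3–5 condense into a short function; a minimal sketch, assuming scipy (welch_test is our name; the usage line borrows the numbers from the worked example below):

```python
from math import sqrt
from scipy import stats

def welch_test(n1, m1, s1, n2, m2, s2, conf=0.95):
    """Welch two-sample t-test from summary stats: t, df, two-sided p, CI."""
    a, b = s1**2 / n1, s2**2 / n2
    se = sqrt(a + b)
    t = (m1 - m2) / se
    df = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    p = 2 * stats.t.sf(abs(t), df)
    t_crit = stats.t.ppf(0.5 + conf / 2, df)
    ci = ((m1 - m2) - t_crit * se, (m1 - m2) + t_crit * se)
    return t, df, p, ci

print(welch_test(25, 5.20, 1.20, 30, 4.80, 1.50))
# t ~ 1.10, df ~ 52.9, p ~ 0.28, CI ~ (-0.33, 1.13)
```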
When to use a pooled (Student’s) t-test instead
– If you have good evidence variances are equal (σ1^2 = σ2^2), use pooled t:
sp^2 = [ (n1−1)s1^2 + (n2−1)s2^2 ] / (n1 + n2 − 2)
SEpooled = sqrt( sp^2 · (1/n1 + 1/n2) )
t = (x̄1 − x̄2) / SEpooled
df = n1 + n2 − 2
– Do not pool when variances are clearly unequal. Pooled test can produce misleading p-values when variances differ.
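For comparison, the pooled computation in the same style (pooled_t is our name):

```python
from math import sqrt

def pooled_t(n1, m1, s1, n2, m2, s2):
    """Student's pooled two-sample t statistic and df (assumes equal variances)."""
    sp_sq = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    se = sqrt(sp_sq * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, n1 + n2 - 2
```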
Worked numeric example (step-by-step)
Data:
– Group 1: n1 = 25, x̄1 = 5.20, s1 = 1.20 → s1^2 = 1.44
– Group 2: n2 = 30, x̄2 = 4.80, s2 = 1.50 → s2^2 = 2.25
We’ll compare the two-sample t-tests step-by-step: first the unequal-variance (Welch) test, then the pooled (Student’s) test to show differences.
Worked example — Welch (unequal variances) t-test
1) Data recap
– Group 1: n1 = 25, x̄1 = 5.20, s1^2 = 1.44
– Group 2: n2 = 30, x̄2 = 4.80, s2^2 = 2.25
– Difference in sample means: Δ = x̄1 − x̄2 = 0.40
2) Standard error (SE) for unequal variances
SE = sqrt( s1^2/n1 + s2^2/n2 )
= sqrt( 1.44/25 + 2.25/30 )
= sqrt( 0.0576 + 0.0750 )
= sqrt( 0.1326 ) ≈ 0.364
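Both tests can be run directly from these summary statistics; a sketch assuming scipy (note that ttest_ind_from_stats takes standard deviations, not variances):

```python
from scipy import stats

welch = stats.ttest_ind_from_stats(5.20, 1.20, 25, 4.80, 1.50, 30, equal_var=False)
pooled = stats.ttest_ind_from_stats(5.20, 1.20, 25, 4.80, 1.50, 30, equal_var=True)
print(welch)   # t ~ 1.10, two-sided p ~ 0.28 (Welch df ~ 52.9)
print(pooled)  # t ~ 1.08, two-sided p ~ 0.29 (df = 25 + 30 - 2 = 53)
```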