ANOVA

Updated: September 22, 2025

What is ANOVA (analysis of variance)
– Definition: ANOVA is a statistical test that checks whether the means of three or more groups are likely to be different for reasons other than random chance. It compares how much of the total variability in a dataset is explained by group membership (systematic variance) versus unexplained fluctuations (random variance).
– Purpose: Use ANOVA when you want to test differences across multiple groups simultaneously instead of running many pairwise t‑tests.

Key terms (defined)
– Independent variable (factor): the categorical variable that defines groups (e.g., treatment A, B, C).
– Dependent variable (response): the continuous outcome measured for each subject (e.g., test score, cost).
– Null hypothesis: the hypothesis that all group means are equal.
– F statistic (F‑ratio): the test statistic equal to the ratio of variance explained by the factor to the unexplained variance. Values near 1 suggest no real group differences.
– F‑distribution: the probability distribution used to convert the F statistic into a p‑value; it depends on two degrees of freedom parameters (numerator and denominator).
– MST (mean sum of squares due to treatment): an estimate of variance between group means.
– MSE (mean sum of squares due to error): an estimate of variance within groups (error).

How ANOVA works (in plain steps)
1. Partition total variability into two parts: variability between groups (systematic) and variability within groups (random).
2. Compute MST (an average of squared deviations between group means and the grand mean).
3. Compute MSE (an average of squared deviations of observations from their group means).
4. Form the F statistic: F = MST / MSE.
5. Compare F to the appropriate F‑distribution (with numerator and denominator degrees of freedom) to judge whether group differences are unlikely under the null hypothesis.
6. If ANOVA is significant, follow up with additional tests to locate which groups differ.
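As a quick illustration, the steps above map directly onto SciPy's one-way ANOVA function (this assumes `scipy` is installed; the three groups are invented data):

```python
# One-way ANOVA in a few lines with SciPy (scipy is an assumed dependency).
from scipy.stats import f_oneway

# Illustrative data: three groups of measurements
group_a = [2, 3, 4]
group_b = [5, 6, 7]
group_c = [8, 9, 10]

# f_oneway partitions the variability and returns F = MST / MSE plus a p-value
result = f_oneway(group_a, group_b, group_c)
print(f"F = {result.statistic:.3f}, p = {result.pvalue:.4f}")  # F = 27.000, p = 0.0010
```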

One‑way vs two‑way ANOVA (when to use each)
– One‑way ANOVA: use when you have a single categorical factor and want to test whether its levels produce different means on a continuous outcome.
– Two‑way ANOVA: use when you have two categorical factors and you want to test (a) the main effect of each factor on the continuous outcome and (b) whether the factors interact (i.e., the effect of one factor depends on the level of the other).

Key assumptions (define tests/alternatives)
– Independence: observations are independent across and within groups. If violated (clustered or repeated measures), use repeated‑measures ANOVA or mixed models.
– Normality: residuals (differences between observations and their group means) are approximately normally distributed. Check with Q‑Q plots or tests (Shapiro‑Wilk). If sample sizes are large, the central limit theorem often mitigates modest departures.
– Homogeneity of variances: group variances are similar (also called homoscedasticity). Check with Levene’s test or Bartlett’s test. If variances differ, consider Welch’s ANOVA (robust to unequal variances) or a nonparametric alternative.
– Additivity (for factorial ANOVA): in a two‑way design, main effects are assumed to combine additively unless an interaction term is included; check the interaction term in the two‑way ANOVA before interpreting main effects.

Checklist to run ANOVA (practical workflow)
1. State hypotheses:
– H0: all group means equal.
– H1: at least one group mean differs.
2. Inspect data: summary stats and boxplots by group.
3. Check assumptions: independence (design), normality (residuals), homogeneity (Levene).
4. Compute ANOVA table (see formulas below).
5. If F is significant, run post‑hoc comparisons (Tukey HSD, Games‑Howell if unequal variances).
6. Report F, degrees of freedom, p‑value, and effect size (eta‑squared or partial eta‑squared); include confidence intervals for group differences where possible.
7. If assumptions fail, run robust or nonparametric tests (Welch’s ANOVA, Kruskal‑Wallis).
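Step 3 of the checklist can be sketched with SciPy's built-in tests (this assumes `scipy` is available; the groups are illustrative data):

```python
# Sketch of assumption checks for one-way ANOVA (scipy is an assumed dependency).
from scipy.stats import levene, shapiro

groups = [[2, 3, 4], [5, 6, 7], [8, 9, 10]]

# Homogeneity of variances: Levene's test (H0: all group variances equal)
lev = levene(*groups)
print(f"Levene: W = {lev.statistic:.3f}, p = {lev.pvalue:.3f}")

# Normality: Shapiro-Wilk on the pooled within-group residuals
residuals = [x - sum(g) / len(g) for g in groups for x in g]
sw = shapiro(residuals)
print(f"Shapiro-Wilk: W = {sw.statistic:.3f}, p = {sw.pvalue:.3f}")
```

Large p-values here mean the tests found no evidence against the assumption; they do not prove the assumption holds, so plots (boxplots, Q-Q plots) remain useful alongside the formal tests.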

Formulas and degrees of freedom (one‑way ANOVA)
– Notation: k = number of groups, ni = sample size in group i, N = total sample size, yi· = mean of group i, y·· = grand mean.
– Sum of squares between (SSB or SSTr): SSB = Σi ni(yi· − y··)²
– Sum of squares within (SSW or SSE): SSW = Σi Σj (yij − yi·)²
– Total sum of squares: SST = SSB + SSW = Σi Σj (yij − y··)²
– Degrees of freedom: df_between = k − 1; df_within = N − k
– Mean squares: MST (mean square treatment) = SSB / (k − 1); MSE (mean square error) = SSW / (N − k)
– F statistic: F = MST / MSE
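The formulas translate almost line-for-line into code. A minimal pure-Python sketch (no external libraries; the example data are illustrative):

```python
def anova_table(groups):
    """One-way ANOVA table from a list of groups (each a list of numbers)."""
    k = len(groups)                                   # number of groups
    n = sum(len(g) for g in groups)                   # total sample size N
    grand = sum(sum(g) for g in groups) / n           # grand mean
    means = [sum(g) / len(g) for g in groups]         # group means
    # SSB = sum of ni * (group mean - grand mean)^2
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    # SSW = sum of squared deviations of observations from their group mean
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    mst = ssb / (k - 1)                               # MST = SSB / df_between
    mse = ssw / (n - k)                               # MSE = SSW / df_within
    return {"SSB": ssb, "SSW": ssw, "SST": ssb + ssw,
            "df_between": k - 1, "df_within": n - k,
            "MST": mst, "MSE": mse, "F": mst / mse}

table = anova_table([[2, 3, 4], [5, 6, 7], [8, 9, 10]])
print(table["F"])  # 27.0
```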

Worked numeric example (one‑way ANOVA)
Data: three groups, n = 3 each
– Group 1: 2, 3, 4 (mean = 3.000)
– Group 2: 5, 6, 7 (mean = 6.000)
– Group 3: 8, 9, 10 (mean = 9.000)

Step 1 — grand mean
– Grand mean y·· = (3.000 + 6.000 + 9.000) / 3 = 6.000

Step 2 — sum of squares between (SSB)
– SSB = Σi ni(yi· − y··)²
– = 3*(3 − 6)² + 3*(6 − 6)² + 3*(9 − 6)²
– = 3*9 + 3*0 + 3*9 = 27 + 0 + 27 = 54.000

Step 3 — sum of squares within (SSW)
Compute within each group:
– Group 1: (2−3)² + (3−3)² + (4−3)² = 1 + 0 + 1 = 2
– Group 2: (5−6)² + (6−6)² + (7−6)² = 1 + 0 + 1 = 2
– Group 3: (8−9)² + (9−9)² + (10−9)² = 1 + 0 + 1 = 2
– SSW = 2 + 2 + 2 = 6.000

Step 4 — total sum of squares (SST)
– SST = SSB + SSW = 54 + 6 = 60.000

Step 5 — degrees of freedom
– k = 3 groups; N = 9 observations
– df_between = k − 1 = 2
– df_within = N − k = 6

Step 6 — mean squares
– MST = SSB / df_between = 54 / 2 = 27.000
– MSE = SSW / df_within = 6 / 6 = 1.000

Step 7 — F statistic and decision
– F = MST / MSE = 27 / 1 = 27.000
– For α = 0.05, critical F(2,6) ≈ 5.14. Since 27 >> 5.14, reject the null hypothesis that all group means are equal.
– p-value: ≈ 0.001 from the F(2, 6) distribution, consistent with rejecting H0.
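The critical value and p-value used in the decision can be checked against the F-distribution, for example with SciPy (assumed installed):

```python
# Checking step 7 against the F(2, 6) distribution (scipy assumed installed).
from scipy.stats import f

f_stat, df_between, df_within = 27.0, 2, 6
critical = f.ppf(0.95, df_between, df_within)  # critical value at alpha = 0.05
p_value = f.sf(f_stat, df_between, df_within)  # P(F > 27) under H0

print(f"critical F(2,6) = {critical:.3f}")  # critical F(2,6) = 5.143
print(f"p-value = {p_value:.4f}")           # p-value = 0.0010
```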

Interpretation
– The between-group variability is much larger than within-group variability, so the ANOVA indicates a statistically significant difference among the three group means.
– ANOVA itself does not say which pairs differ. To find specific differences, run a post-hoc test (e.g., Tukey’s HSD) or planned pairwise comparisons.

Effect size (simple)
– Eta-squared (η²) = SSB / SST = 54 / 60 = 0.90. This indicates a very large proportion of total variance is attributed to group differences (note: η² can be upward biased with small samples).

Practical checklist before using this ANOVA result
1. Independence: observations across and within groups should be independent.
2. Normality: residuals (within-group errors) roughly follow a normal distribution: roughly symmetric with no extreme skew or heavy tails. Check this by plotting residuals (histogram/Q–Q plot) or using a formal test (e.g., Shapiro–Wilk). Note: with large samples the normality requirement is less strict because of the central limit theorem.

3. Homogeneity of variances: the variances of the groups should be approximately equal. Check with Levene’s test or Bartlett’s test and by comparing group standard deviations. If variances differ substantially and sample sizes are unequal, the standard ANOVA F-test can be biased.

4. Independence of errors: observations (and therefore residuals) should not be correlated. If data are clustered or measured repeatedly on the same subjects, use a repeated-measures ANOVA or mixed model instead.

5. No influential outliers: extreme values can distort means and inflate between- or within-group variance. Identify outliers (boxplots, influence diagnostics) and decide on casewise handling before running ANOVA.
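One simple way to flag candidate outliers is the usual boxplot convention of fences at 1.5 × IQR beyond the quartiles. A pure-Python sketch (the data and the 1.5 multiplier are illustrative; flagged points still need case-by-case judgment):

```python
import statistics

def iqr_outliers(values):
    """Flag values outside the 1.5*IQR boxplot fences."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles (exclusive method)
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in values if x < lo or x > hi]

print(iqr_outliers([2, 3, 4, 3, 2, 40]))  # [40]
```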

What to do if assumptions are violated
– Heterogeneous variances (unequal variances): use Welch’s ANOVA (robust to unequal variances) or transform the data (log, square-root) if appropriate.
– Non-normal residuals with small samples: consider nonparametric Kruskal–Wallis test (ranks-based) or permutation/bootstrapping methods.
– Non-independence (repeated measures): use repeated-measures ANOVA or linear mixed-effects models that model within-subject correlation.
– Outliers: investigate reasons (data entry error, true but extreme observation). If legitimate, use robust methods (trimmed means, bootstrap, or robust ANOVA).
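As one concrete fallback from the list above, the Kruskal–Wallis test replaces the raw values with ranks, so it does not require normal residuals (this sketch assumes `scipy` is installed; the groups are illustrative):

```python
# Nonparametric alternative to one-way ANOVA: Kruskal-Wallis (rank-based).
from scipy.stats import kruskal

res = kruskal([2, 3, 4], [5, 6, 7], [8, 9, 10])
print(f"H = {res.statistic:.3f}, p = {res.pvalue:.4f}")
```

A small p-value here means at least one group's distribution is shifted relative to the others, the rank-based analogue of the ANOVA conclusion.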

Post-hoc comparisons (when omnibus ANOVA is significant)
– Omnibus ANOVA only indicates that at least two group means differ. To find which pairs differ, run post-hoc pairwise tests that control the family-wise error rate (the probability of at least one false positive among all comparisons).
– Common choices: Tukey’s HSD (honest significant difference) for all pairwise comparisons assuming equal variances; Welch-adjusted pairwise tests if variances unequal; Bonferroni or Holm’s sequential Bonferroni for simple control of Type I error; false-discovery-rate (Benjamini–Hochberg) when you prefer control of expected proportion of false discoveries.
– Report which method was used, adjusted p-values or confidence intervals, and effect sizes for pairwise differences.
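For equal-variance pairwise comparisons, recent SciPy versions (1.8+) ship a Tukey HSD implementation; a minimal sketch with invented data:

```python
# All-pairs comparisons with Tukey's HSD (requires SciPy >= 1.8).
from scipy.stats import tukey_hsd

group_a = [2, 3, 4]
group_b = [5, 6, 7]
group_c = [8, 9, 10]

res = tukey_hsd(group_a, group_b, group_c)
print(res)  # table of pairwise mean differences, CIs, and adjusted p-values
```

`res.pvalue` holds the matrix of adjusted p-values, and `res.confidence_interval()` gives simultaneous confidence intervals for the pairwise differences.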

Effect-size measures and interpretation
– Eta-squared (η²): proportion of total variance explained by the factor. η² = SSB / SST. It is intuitive but can be biased upward with small samples.
– Partial eta-squared (ηp²): the proportion of variance associated with a factor after removing variance explained by other factors in the model. For a single-factor (one-way) ANOVA, ηp² reduces to η² because there are no other factors to partial out. Formula (general factorial designs):
ηp² = SSeffect / (SSeffect + SSerror)
where SSeffect is the sum of squares for the effect and SSerror is the residual (error) sum of squares.

– Omega-squared (ω²): a less biased estimator of population effect size that adjusts for sampling error. For one-way ANOVA the common formula is

ω² = (SSbetween − dfbetween × MSerror) / (SStotal + MSerror)

where
– SSbetween = sum of squares between groups,
– SStotal = total sum of squares,
– MSerror = mean square error = SSerror / dferror,
– dfbetween = k − 1 (k = number of groups),
– dferror = N − k (N = total sample size).

Why use ω²? Eta-squared (η²) measures the proportion of observed variance explained by the factor, but it tends to be biased upward as an estimator of the population effect—especially with small samples. Omega-squared subtracts an estimate of sampling noise (dfbetween × MSerror) from the numerator and includes MSerror in the denominator, producing a smaller, less optimistic estimate that is closer to the population value on average.
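Plugging in the numbers from the worked example (SSB = 54, SSW = 6, k = 3, N = 9) makes the difference between the two measures concrete (pure Python, no libraries):

```python
# Eta-squared vs omega-squared for the worked example above.
ssb, ssw, k, n = 54.0, 6.0, 3, 9
sst = ssb + ssw
df_between, df_error = k - 1, n - k
ms_error = ssw / df_error

eta_sq = ssb / sst                                              # SSB / SST
omega_sq = (ssb - df_between * ms_error) / (sst + ms_error)     # bias-adjusted

print(round(eta_sq, 3))    # 0.9
print(round(omega_sq, 3))  # 0.852
```

As expected, ω² (≈ 0.852) is smaller than η² (0.90) because it discounts sampling noise.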

Key relationships and practical points
– For one-way ANOVA, partial eta-squared (ηp²) = eta-squared (η²), because there are no other factors to partial out.
– Generally: η² ≥ ω² (η² is usually larger because it does not subtract an estimate of sampling error).