t test (summary) - DominionFX

• A t‑test is an inferential statistical test that compares the means of two groups to determine whether their difference is statistically significant (unlikely to have arisen by chance).
– It is used when the data are approximately normally distributed and the population variances are unknown.
– The test produces a t‑statistic (t‑value) and degrees of freedom (df); the t‑value is compared to the t‑distribution (or used to compute a p‑value) to decide whether to reject or fail to reject the null hypothesis that the two means are equal.
Source: Investopedia

Key assumptions of t‑tests
1. Scale of measurement: outcome is continuous (interval or ratio).
2. Independence: observations are independent (except for paired tests where pairs are dependent).
3. Normality: the sampling distribution of the mean is approximately normal (small samples require approximate normality; larger samples invoke the central limit theorem).
4. Unknown variances: population variances are unknown (if known and sample large, a z‑test may be used).

Types of t‑tests and when to use them
– Paired (dependent) t‑test: use when observations are matched/paired (before–after measurements on the same subjects, matched pairs). Each subject serves as their own control.
– Independent (two‑sample) t‑test — equal variance (pooled): use when two independent groups are compared and their variances are assumed equal.
– Independent (two‑sample) t‑test — unequal variance (Welch’s test): use when two independent groups are compared but variances differ (recommended by default if variance equality is uncertain).

Formulas (practical presentation)

1) Paired t‑test
– Compute differences di for each pair; let mean_diff = mean(di), sd_diff = standard deviation of di, n = number of pairs.
– t = mean_diff / (sd_diff / sqrt(n))
– df = n − 1
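
The paired formula above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a reference implementation: the function name paired_t and the before/after scores are hypothetical and were chosen only to show the arithmetic.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, df) for a paired t-test on two equal-length sequences."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]  # di for each pair
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 in the denominator)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Hypothetical before/after scores for five subjects (illustrative only).
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 13, 14]
t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```

The sign of t depends on the direction of subtraction (here after minus before), so the direction should be stated when reporting.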

2) Independent t‑test — equal variances (pooled)
– Let means mean1, mean2; sample sizes n1, n2; sample variances s1^2, s2^2.
– Pooled variance: sp^2 = [ (n1−1)s1^2 + (n2−1)s2^2 ] / (n1 + n2 − 2)
– Standard error: SE = sqrt( sp^2 * (1/n1 + 1/n2) )
– t = (mean1 − mean2) / SE
– df = n1 + n2 − 2
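
As a rough sketch, the pooled formulas above translate directly into Python when only summary statistics are available. The function name pooled_t and the example numbers are assumptions made for illustration.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df, se) for the equal-variance two-sample t-test."""
    df = n1 + n2 - 2
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    return t, df, se

# Hypothetical summary statistics (illustrative only).
t, df, se = pooled_t(mean1=5.0, sd1=2.0, n1=10, mean2=4.0, sd2=2.0, n2=10)
print(f"t = {t:.3f}, df = {df}, SE = {se:.3f}")
```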

3) Independent t‑test — unequal variances (Welch)
– Standard error: SE = sqrt( s1^2/n1 + s2^2/n2 )
– t = (mean1 − mean2) / SE
– Welch df (approximate, Satterthwaite):
df ≈ (s1^2/n1 + s2^2/n2)^2 / [ (s1^4 / (n1^2 (n1−1))) + (s2^4 / (n2^2 (n2−1))) ]
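
A minimal Python sketch of Welch's statistic and the Satterthwaite degrees of freedom follows. The function name welch_t and the example inputs are hypothetical. Note that (s1^2/n1)^2 / (n1−1) is the same quantity as s1^4 / (n1^2 (n1−1)) in the formula above.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's unequal-variance t-test."""
    v1 = sd1**2 / n1  # squared standard error contributed by group 1
    v2 = sd2**2 / n2  # squared standard error contributed by group 2
    se = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    # Welch-Satterthwaite approximation; usually not an integer.
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Hypothetical groups with clearly different spreads and sizes.
t, df = welch_t(mean1=5.0, sd1=1.0, n1=10, mean2=4.0, sd2=3.0, n2=20)
print(f"t = {t:.3f}, df = {df:.2f}")

# With equal SDs and equal sample sizes, Welch's df equals the pooled df.
_, df_equal = welch_t(5.0, 2.0, 10, 4.0, 2.0, 10)
print(f"df with equal SDs and sizes = {df_equal:.2f}")
```

The Welch df never exceeds n1 + n2 − 2, which is why the test is slightly more conservative than the pooled version when the variances really are equal.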

Which t‑test to use (practical guidance)
– If samples are paired (e.g., before/after), use paired t‑test.
– If independent groups:
• If sample sizes are large and similar, the pooled and Welch results will be close, so Welch’s t (unequal variance) is a safe choice.
• If you can justify equal variances (by design or tests such as Levene’s), pooled t can be used.
• Many analysts default to Welch’s t because it is robust to variance differences.
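
The guidance above amounts to a small decision rule, which could be written as the following sketch. The function name choose_t_test and its return labels are hypothetical, and the rule encodes only this summary's advice (default to Welch unless equal variances can be justified).

```python
def choose_t_test(paired: bool, equal_variances_justified: bool = False) -> str:
    """Pick a t-test variant following the guidance in this summary."""
    if paired:
        # Matched or before/after observations: analyse the differences.
        return "paired"
    # Independent groups: pooled only if equal variances can be defended.
    return "pooled" if equal_variances_justified else "welch"

print(choose_t_test(paired=True))
print(choose_t_test(paired=False))
print(choose_t_test(paired=False, equal_variances_justified=True))
```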

How to use the t‑distribution (table or software)
– To use a t‑table:
1. Choose significance level α (e.g., 0.05) and whether test is one‑ or two‑tailed.
2. Look up critical t for the appropriate degrees of freedom (df).
3. Compare |t_calc| to t_critical: if |t_calc| > t_critical, reject H0.
– With software you typically obtain the t‑statistic and p‑value directly; compare p to α.
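
To make the table lookup concrete, here is a standard-library sketch that integrates the t density numerically (Simpson's rule) to get a two-tailed p‑value, then finds a critical value by bisection. It is a teaching aid under stated assumptions, not how statistical packages compute these quantities; for real work a statistics library is preferable. The function names, the integration step, and the search bound of 1000 are all choices made for this illustration, and the inputs t = −2.825, df = 48 are taken from this summary's worked example.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, step=0.005):
    """P(|T| >= |t|), using Simpson's rule on the density over [0, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    n = 2 * max(1, math.ceil(a / (2 * step)))  # even number of intervals
    h = a / n
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # P(-|t| < T < |t|), by symmetry
    return max(0.0, 1.0 - central_area)

def t_critical(alpha, df, two_tailed=True):
    """Critical t found by bisection (assumes the answer lies below 1000)."""
    target = alpha if two_tailed else 2 * alpha
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if two_tailed_p(mid, df) > target:
            lo = mid  # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

p = two_tailed_p(-2.825, 48)
crit = t_critical(0.05, 48)
print(f"p = {p:.4f}, critical t = {crit:.3f}")
print("reject H0" if abs(-2.825) > crit else "fail to reject H0")
```

Because df is treated as a real number, the same functions also accept the non-integer df produced by Welch's approximation.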

Step‑by‑step practical procedure to run a t‑test
1. State hypotheses:
• H0 (null): μ1 = μ2 (no difference)
• H1 (alternative): μ1 ≠ μ2 (two‑tailed), or μ1 > μ2 / μ1 < μ2 (one‑tailed)
2. Check assumptions: independence, approximate normality (inspect histograms, Q‑Q plots, Shapiro–Wilk for small samples), scale of measurement.
3. Decide test type (paired vs independent; pooled vs Welch).
4. Optionally test equality of variances (e.g., Levene’s test). Note: don’t rely solely on variance tests; sample sizes and context matter.
5. Compute t and df (use formulas or software).
6. Obtain p‑value or compare to critical t:
• If p ≤ α → reject H0 (evidence of a difference).
• If p > α → fail to reject H0 (no evidence of a difference).
7. Report results: t, df, p‑value, sample means, standard deviations, and confidence interval for difference (e.g., 95% CI).
8. Report effect size (Cohen’s d) and practical significance, not just statistical significance.
9. Discuss limitations and assumptions.

Worked numerical example (independent pooled t)
Scenario: A drug trial yields
– Treatment group: n2 = 25, mean2 = 4.0 years, sd2 = 1.3
– Control group: n1 = 25, mean1 = 3.0 years, sd1 = 1.2
We test whether the drug changes mean life expectancy (two‑tailed H1: the means differ).

1) Compute pooled variance:
– s1^2 = 1.2^2 = 1.44; s2^2 = 1.3^2 = 1.69
– sp^2 = [24(1.44) + 24(1.69)] / 48 = (34.56 + 40.56)/48 = 75.12/48 = 1.565

2) Standard error:
– SE = sqrt( sp^2 * (1/n1 + 1/n2) ) = sqrt(1.565*(2/25)) = sqrt(0.1252) ≈ 0.354

3) t statistic:
– t = (mean1 − mean2) / SE = (3.0 − 4.0)/0.354 ≈ −2.825 (absolute 2.825)

4) Degrees of freedom:
– df = n1 + n2 − 2 = 48

5) Compare to critical value or p‑value:
– For df = 48, two‑tailed α = 0.05, t_critical ≈ 2.011. Since 2.825 > 2.011, reject H0.
– p ≈ 0.007 (two‑tailed), so the difference is statistically significant at α = 0.05.

Interpretation: The observed increase of 1 year in mean life expectancy between groups is unlikely to be due to chance given these sample statistics. Also compute effect size (Cohen’s d) and confidence interval to assess practical significance.
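
The arithmetic of this example can be reproduced with a short Python sketch, which also adds the effect size and confidence interval mentioned above. Two assumptions are worth flagging: Cohen's d is computed here with the pooled standard deviation (one common convention), and the critical value 2.011 is simply the figure quoted in the text rather than something the code derives.

```python
import math

# Summary statistics from the worked example.
n1, mean1, sd1 = 25, 3.0, 1.2  # control
n2, mean2, sd2 = 25, 4.0, 1.3  # treatment
t_crit = 2.011  # two-tailed, alpha = 0.05, df = 48 (taken from the text)

df = n1 + n2 - 2
sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df  # pooled variance
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
diff = mean1 - mean2
t = diff / se

# Cohen's d using the pooled SD (assumed convention).
cohens_d = diff / math.sqrt(sp2)

# 95% confidence interval for the difference in means (control - treatment).
ci_low = diff - t_crit * se
ci_high = diff + t_crit * se

print(f"sp^2 = {sp2:.3f}, SE = {se:.3f}, t({df}) = {t:.3f}")
print(f"Cohen's d = {cohens_d:.2f}")
print(f"95% CI for mean1 - mean2: ({ci_low:.2f}, {ci_high:.2f})")
```

An interval that excludes zero agrees with rejecting H0 at α = 0.05, and the magnitude of d helps judge whether the difference matters in practice.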

Important reporting items (what to include in a results statement)
– Type of t‑test used (paired/independent; pooled/Welch)
– t statistic, df, and p‑value (e.g., t(48) = −2.83, p = 0.007)
– Sample sizes, sample means, standard deviations
– Confidence interval for mean difference
– Effect size (Cohen’s d)
– Assumptions checked and any violations or remedies (e.g., nonparametric test used)

When not to use a t‑test (alternatives and limitations)
– If distributions are markedly nonnormal and sample sizes are small: consider nonparametric alternatives (Mann–Whitney U for independent samples, Wilcoxon signed‑rank for paired).
– If variances are known and sample sizes large: z‑test could be appropriate (rare).
– For more than two groups/means: use ANOVA (and post hoc tests) rather than multiple t‑tests.
– If data are counts/categorical: use chi‑square or Fisher’s exact test.
– For multivariable or adjusted comparisons, use regression methods.

How to run t‑tests in common software
– R: t.test(x, y, paired = FALSE, var.equal = FALSE) performs Welch’s test, which is the default; set var.equal = TRUE for the pooled test and paired = TRUE for a paired t‑test.
– Python (SciPy): scipy.stats.ttest_ind(a, b, equal_var=False) for Welch; ttest_rel for paired.
– Excel: T.TEST function or Data Analysis Toolpak (t‑Test: Two‑Sample Assuming Equal Variances; t‑Test: Two‑Sample Assuming Unequal Variances; Paired Two‑Sample for Means).
– Most statistical packages also report confidence intervals; effect sizes such as Cohen’s d may require a separate function or a manual calculation.

Quick checklist before concluding
– Are observations independent (or correctly paired)?
– Is the outcome measured on a continuous scale, and are the data reasonably normal?
– Have you chosen the correct form (paired vs independent; pooled vs Welch)?
– Have you reported effect size and confidence intervals?
– Have you disclosed sample sizes and checked for influential outliers?

Bottom line
A t‑test is a straightforward, widely used method to compare two means. Choosing the proper type of t‑test, checking assumptions, and reporting t, df, p, confidence intervals, and effect sizes make the result meaningful. When assumptions are violated or there are more than two groups, consider nonparametric tests or ANOVA/regression alternatives.

Primary source for this summary: Investopedia, “T‑Test”.