– A quintile divides a data set into five equal parts; each part covers 20% of observations.
– The quintile cutoffs are the 20th, 40th, 60th and 80th percentiles; the fifth group is everything above the 80th percentile (the top 20%).
– Quintiles are useful for summarizing distributions, creating eligibility cutoffs (e.g., for subsidies), and comparing groups (income, returns, test scores).
– With small samples, consider coarser groupings (tertiles or quartiles); with tied values, state the tie-handling and interpolation rule explicitly; with weighted data, use weighted percentiles.
– Practical computation can be done by hand (sort + interpolate), in Excel (PERCENTILE functions), in R (quantile), or in Python (numpy.quantile / pandas.Series.quantile).
What is a quintile?
A quintile is one of five equal-sized groups created by dividing an ordered data set into fifths. If you rank observations from smallest to largest and split them so each group contains roughly 20% of the observations, each group is a quintile. The cutoff points that separate those groups are the 20th, 40th, 60th and 80th percentiles.
Common interpretations
– First (lowest) quintile = bottom 20% of observations.
– Fifth (highest) quintile = top 20% of observations.
Quintiles are descriptive tools — useful for summarizing how data are distributed and for creating thresholds used in policy or business decisions.
Understanding quintiles (how they’re computed)
Basic manual method (step-by-step)
1. Collect the data and sort it in ascending order.
2. Decide which quintiles you need: usually the 20th, 40th, 60th and 80th percentiles.
3. Compute the position for each percentile. A common formula for the percentile position is:
position = (n + 1) × P
where n = number of observations and P = percentile expressed as a fraction (for 20% use 0.20).
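4. If the position is an integer, the percentile is the value at that rank in the sorted data (counting from 1). If it is not an integer, interpolate linearly between the two neighboring values. Positions below 1 or above n fall outside the data; some tools clamp to the smallest or largest value, others return an error.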
Worked example
Data: [3, 7, 8, 5, 12, 14, 21, 13, 18, 20]
1. Sort: [3, 5, 7, 8, 12, 13, 14, 18, 20, 21], n = 10
2. First quintile (20%): position = (10 + 1) × 0.20 = 2.2 → interpolate between 2nd and 3rd values:
value = x2 + 0.2 × (x3 − x2) = 5 + 0.2 × (7 − 5) = 5.4
3. Repeat for 40%, 60%, 80% to get the other cutoffs.
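The manual steps above can be sketched in Python. This is a minimal illustration, assuming the (n + 1) × P position rule from the worked example. The function name percentile_n_plus_1 and the choice to clamp out-of-range positions to the smallest or largest value are assumptions made for this sketch, not part of any library.

```python
import math

def percentile_n_plus_1(values, p):
    """Percentile by the (n + 1) * P position rule with linear interpolation.

    Hypothetical helper for illustration; p is a fraction such as 0.20.
    """
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one observation")
    # 1-based position in the sorted data, clamped to [1, n] (an assumption).
    pos = min(max((n + 1) * p, 1), n)
    lower = math.floor(pos)   # rank of the value at or below the position
    frac = pos - lower        # fractional distance toward the next value
    if lower == n:            # last rank: nothing above it to interpolate with
        return float(xs[-1])
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])

data = [3, 7, 8, 5, 12, 14, 21, 13, 18, 20]
cutoffs = [percentile_n_plus_1(data, p) for p in (0.2, 0.4, 0.6, 0.8)]
print([round(c, 2) for c in cutoffs])  # [5.4, 9.6, 13.6, 19.6]
```

The first cutoff reproduces the 5.4 computed by hand; the remaining three follow from the same rule at positions 4.4, 6.6 and 8.8.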
Notes about methods and interpolation
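– There are several legitimate percentile definitions, each with its own interpolation rule, and software defaults differ: R’s quantile offers nine “types”, and Excel has PERCENTILE.INC and PERCENTILE.EXC. The (n + 1) × P rule above matches PERCENTILE.EXC and R’s type 6, while PERCENTILE.INC, R’s default (type 7), and the NumPy and pandas defaults place the position at (n − 1) × P + 1. Expect small numeric differences for small samples.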
– For large samples the different methods converge and differences are negligible.
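To see how much the convention matters on a small sample, one option is Python's standard-library statistics.quantiles, which offers an "exclusive" method (positions based on n + 1, as in the worked example) and an "inclusive" method (positions based on n − 1). A short sketch on the ten-value example data:

```python
import statistics

data = [3, 7, 8, 5, 12, 14, 21, 13, 18, 20]

# n=5 asks for the four cut points that split the data into five groups.
exclusive = statistics.quantiles(data, n=5, method="exclusive")
inclusive = statistics.quantiles(data, n=5, method="inclusive")

print(exclusive)  # [5.4, 9.6, 13.6, 19.6]
print(inclusive)  # [6.6, 10.4, 13.4, 18.4]
```

With only ten observations the 20th percentile moves from 5.4 to 6.6 depending on the rule, which is why the method should be stated alongside any reported cutoff.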
Practical computation (software)
– Excel:
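• Inclusive (defined for any percentile from 0 to 1): =PERCENTILE.INC(range, 0.2) for the 20th percentile (use 0.4, 0.6, 0.8 for the other cutoffs).
• Exclusive: =PERCENTILE.EXC(range, 0.2). This uses the (n + 1)-based convention and returns a #NUM! error when the requested percentile is below 1/(n + 1) or above n/(n + 1), so quintile cutoffs need at least four observations.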
– R:
• quantile(x, probs = c(0.2, 0.4, 0.6, 0.8), type = 7) — type=7 is R’s default and is commonly used; other types implement different interpolation rules.
– Python (NumPy / pandas):
• NumPy: np.quantile(data, [0.2, 0.4, 0.6, 0.8])
• pandas: pd.Series(data).quantile([.2, .4, .6, .8])
– Weighted quintiles:
• When observations carry weights (e.g., survey weights, population weights), compute the cumulative sum of weights and find values at cumulative-weight percentiles (20%, 40%, …). Many stats packages and custom functions support weighted quantiles.
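The cumulative-weight idea can be sketched as follows. This is one simple convention among several (it returns the smallest value whose cumulative weight share reaches the target, with no interpolation); the function name weighted_quantile and the income and weight figures are hypothetical and chosen only for illustration.

```python
from itertools import accumulate

def weighted_quantile(values, weights, p):
    """Smallest value whose cumulative weight share reaches p (no interpolation)."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal in length")
    pairs = sorted(zip(values, weights))              # order by value
    running = list(accumulate(w for _, w in pairs))   # cumulative weights
    target = p * running[-1]                          # weight mass to reach
    for (value, _), cum in zip(pairs, running):
        if cum >= target:
            return value
    return pairs[-1][0]

incomes = [12, 18, 25, 40, 90]        # hypothetical household incomes (thousands)
weights = [450, 300, 120, 80, 50]     # hypothetical survey weights

cutoffs = [weighted_quantile(incomes, weights, p) for p in (0.2, 0.4, 0.6, 0.8)]
print(cutoffs)  # [12, 12, 18, 25]
```

Because the low-income households carry most of the weight here, the weighted cutoffs sit well below what an unweighted split of the same five values would give.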
Common uses of quintiles
– Income and wealth analysis: rank households by income and compare consumption, taxes, or benefits by quintile.
– Policy design: set eligibility cutoffs (e.g., households below the 20th income percentile qualify for a subsidy).
– Finance: categorize returns, volatility, or risk into quintiles to compare performance of funds or securities.
– Education and social research: compare outcomes (test scores, health metrics) across population quintiles.
– Marketing: segment customers into quintiles by value (RFM models often use quintile-style segmentation).
Alternatives to quintiles and when to use them
– Tertiles (three groups of ~33.3%): useful for smaller samples or when coarse grouping suffices.
– Quartiles (four groups of 25%): common for boxplots and interquartile range analysis.
– Deciles (ten groups of 10%): finer-grained segmentation.
– Percentiles (100 groups): for very fine thresholds.
– Weighted quantiles: necessary when data includes sampling or population weights.
– Lorenz curve / Gini coefficient: when you want a continuous measure of inequality rather than group cutoffs.
Choose grouping size based on sample size, purpose, and interpretability: too many groups with few observations each reduces reliability.
Practical steps to use quintiles in analysis or policy
1. Define the objective: comparison, eligibility cutoff, segmentation, or reporting.
2. Prepare the data: clean, decide whether to weight observations, and remove impossible values.
3. Choose a quintile computation method (explicitly state which — e.g., interpolation rule, inclusive vs exclusive).
4. Compute quintile cutoffs and assign observations to quintile groups.
5. Validate:
• Check group sizes and ties (many identical values can lead to uneven group membership).
• Assess robustness to different percentile methods (sensitivity check).
• If used for policy thresholds, test coverage and false positives/negatives.
6. Visualize results: histogram with quintile cutoffs, boxplots by group, Lorenz curve, cumulative distribution function.
7. Report clearly: provide cutoffs, how they were computed, and any weighting or adjustments.
8. Reassess periodically: distributions change over time; update thresholds on a regular schedule if used operationally.
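Steps 4 and 5 (assigning groups and checking group sizes) might look like the sketch below. It assumes the inclusive rule from statistics.quantiles and a tie rule in which a value equal to a cutoff goes to the lower group; the helper name assign_quintiles and both choices are assumptions for illustration, and should be replaced by whatever method the analysis documents.

```python
import statistics
from bisect import bisect_left
from collections import Counter

def assign_quintiles(values):
    """Return (cutoffs, groups); groups run from 1 (lowest) to 5 (highest)."""
    cutoffs = statistics.quantiles(values, n=5, method="inclusive")
    # bisect_left counts the cutoffs strictly below v, so a value equal to a
    # cutoff lands in the lower group (the tie rule assumed in this sketch).
    groups = [bisect_left(cutoffs, v) + 1 for v in values]
    return cutoffs, groups

data = [3, 7, 8, 5, 12, 14, 21, 13, 18, 20]
cutoffs, groups = assign_quintiles(data)
print(groups)                   # [1, 2, 2, 1, 3, 4, 5, 3, 4, 5]
print(sorted(Counter(groups).items()))  # two observations in each group

# Validation check: heavy ties can leave some groups empty.
tied = [1, 1, 1, 1, 1, 1, 2, 3, 4, 5]
_, tied_groups = assign_quintiles(tied)
print(sorted(Counter(tied_groups).items()))  # [(1, 6), (4, 2), (5, 2)]
```

In the tied example the first two cutoffs both equal 1, so six observations fall into group 1 and groups 2 and 3 are empty, which is the kind of uneven membership the validation step is meant to catch.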
Interpretation tips and pitfalls
– Outliers: quintile cutoffs are driven by rank, so a few extreme values won’t distort cutoffs much — but they do affect means within quintiles.
– Small samples: quintile boundaries may be unstable; prefer tertiles or quartiles, or report exact percentiles with confidence intervals.
– Tied values: many identical values near a cutoff can make group membership ambiguous; document how ties are handled.
– Weighted data: ignoring the weights in survey or population data can give misleading quintiles, because the cutoffs then describe the sample rather than the population it represents.
– Communication: when presenting results (especially public policy), explain what “top 20%” means (by income, wealth, test score, etc.) and which percentile method and data year were used.
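The point about outliers can be checked with a small sketch. It reuses the ten-value example, swaps the largest value for an extreme one, and compares the cutoffs and the mean of the top group. The inclusive rule and the replacement value 2100 are arbitrary choices for illustration.

```python
import statistics

data = [3, 5, 7, 8, 12, 13, 14, 18, 20, 21]
with_outlier = data[:-1] + [2100]   # replace the largest value with an extreme one

cut_a = statistics.quantiles(data, n=5, method="inclusive")
cut_b = statistics.quantiles(with_outlier, n=5, method="inclusive")
print(cut_a == cut_b)  # True: the cutoffs depend on ranks, not on the extreme value

# The mean of the top group, by contrast, is pulled strongly upward.
top_a = [v for v in data if v > cut_a[-1]]
top_b = [v for v in with_outlier if v > cut_b[-1]]
print(statistics.mean(top_a), statistics.mean(top_b))  # 20.5 1060
```

Here the cutoffs are identical because the extreme value stays above the 80th-percentile position; in other data sets an outlier near a cutoff position could shift it slightly.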
Practical examples of application
– Income subsidy policy: compute household income quintiles using survey weights, identify the bottom quintile, simulate program costs and coverage if the subsidy is offered to the bottom quintile.
– Portfolio analysis: sort funds by past 5-year return, split into quintiles, and report average return and volatility for each quintile.
– Education research: split schools into quintiles by average test score, compare graduation rates or college enrollment across quintiles.
Quick glossary
– Quintile: each of five equal groups (20% each) made by sorting data.
– Fifth quintile: usually refers to the top (fifth) group — the highest 20% of observations.
– Tertile: each of three equal groups (~33.3% each).
– Percentile: a value below which a given percentage of observations fall (e.g., 20th percentile).
Bottom line
Quintiles are a simple, widely used way to summarize and segment distributions into five equal parts. They are especially useful for comparing groups (such as income groups) and creating policy thresholds. When applying quintiles, choose an appropriate computation method, handle weights and ties explicitly, and validate results with sensitivity checks and visualization.