A quartile is a value that divides an ordered data set into four equal parts, each containing (roughly) 25% of the observations. The three quartile values are:
– Q1 (lower/first quartile): the 25th percentile — about 25% of observations are ≤ Q1.
– Q2 (median/second quartile): the 50th percentile — about 50% of observations are ≤ Q2.
– Q3 (upper/third quartile): the 75th percentile — about 75% of observations are ≤ Q3.
Key takeaways
– Quartiles summarize the location and spread of a distribution and are robust to outliers.
– The interquartile range (IQR = Q3 − Q1) measures the spread of the middle 50% of the data.
– Different software and textbooks use different computational conventions (interpolation, inclusive/exclusive), so results can differ slightly.
– Spreadsheets provide functions (MEDIAN, QUARTILE.*) for quick computation.
Role of quartiles in data analysis
– Describe central tendency and dispersion together (median + spread around it).
– Visualize distribution with boxplots (box spans Q1 to Q3, median inside, whiskers often at 1.5×IQR).
– Identify skewness: if median is closer to Q1 or Q3, the data are skewed.
– Detect outliers using the 1.5×IQR rule (points Q3 + 1.5×IQR).
How to calculate quartiles in a spreadsheet (practical steps)
1. Put your data in one column (no header) and sort if you like; sorting is not required for the functions to work.
2. Median:
• Excel / Google Sheets: =MEDIAN(range)
3. First and third quartiles:
• Excel (modern): =QUARTILE.INC(range,1) for Q1, =QUARTILE.INC(range,3) for Q3
• Excel (older versions): =QUARTILE(range,1) and =QUARTILE(range,3)
• Excel (exclusive variant): =QUARTILE.EXC(range,1) / =QUARTILE.EXC(range,3) (EXC uses a different percentile definition)
• Google Sheets: =QUARTILE(range,1) and =QUARTILE(range,3) (uses the inclusive method)
4. Interquartile range (IQR):
• =QUARTILE.INC(range,3) − QUARTILE.INC(range,1)
Notes:
– Many spreadsheet QUARTILE functions implement a particular percentile/interpolation rule; results may differ from a manual method that uses (n+1)p (see below).
Manual method for calculating quartiles (step‑by‑step)
This describes a common textbook procedure using the position formula position = (n + 1) × p, where p is the percentile expressed as a fraction (0.25, 0.50, 0.75)
1. Sort the data in ascending order.
2. Let n = number of observations.
3. For each quartile, compute the position:
• Q1 position = (n + 1) × 0.25
• Q2 position = (n + 1) × 0.50 (this gives the median)
• Q3 position = (n + 1) × 0.75
4. If a position is an integer k, the quartile is the kth value in the sorted list.
5. If a position is fractional, say k + d where 0 < d < 1, interpolate linearly:
quartile = (1 − d) × x_k + d × x_{k+1}
(where x_k is the kth ordered value and x_{k+1} is the next larger value)
Worked example (manual)
Data (sorted): 4, 7, 8, 9, 10, 12, 13, 15, 18, 20, 22 (n = 11)
– Q1 position = (11 + 1) × 0.25 = 12 × 0.25 = 3 → Q1 = 3rd value = 8
– Q2 position = 12 × 0.50 = 6 → median = 6th value = 12
– Q3 position = 12 × 0.75 = 9 → Q3 = 9th value = 18
– IQR = Q3 − Q1 = 18 − 8 = 10
Example with interpolation
If n = 10 and Q1 position = (10 + 1) × 0.25 = 2.75, then Q1 lies 75% of the way between the 2nd and 3rd sorted values:
Q1 = 0.25 × x2 + 0.75 × x3 (or use linear interpolation as above).
How to find the lower quartile of a data set
– Spreadsheet: =QUARTILE.INC(range,1) (or older =QUARTILE(range,1)).
– Manual: sort the data and compute position = (n + 1) × 0.25; take the value at that position or interpolate between adjacent values.
How to find the upper quartile of a data set
– Spreadsheet: =QUARTILE.INC(range,3) (or older =QUARTILE(range,3)).
– Manual: sort the data and compute position = (n + 1) × 0.75; take value at that position or interpolate.
What is the interquartile range (IQR)?
– IQR = Q3 − Q1
– It measures the spread of the middle 50% of observations and is robust to extreme values.
– Common uses: summarize variability, create boxplots, detect outliers (1.5×IQR rule), and compare dispersion across groups.
Important considerations and common variations
– Multiple definitions: There are several legitimate methods for computing percentiles and quartiles (e.g., different rules used by statistical packages and textbooks). These methods differ in how they treat interpolation and whether the median is included when splitting data.
– Spreadsheet differences:
• Excel historically supplied QUARTILE which followed a certain convention; modern Excel includes QUARTILE.INC (inclusive) and QUARTILE.EXC (exclusive) with subtly different results for small samples.
• Google Sheets’ QUARTILE uses the inclusive method.
– Even vs odd n: When n is even, the median is the average of the two middle values. How you then split data to get Q1/Q3 depends on the chosen convention (some exclude the median value(s) when forming halves; others include them with interpolation).
– Reporting: When publishing results, indicate which method you used if exact reproducibility matters (for example: “quartiles computed using the (n+1)p method” or “QUARTILE.INC in Excel”).
Practical checklist (quick steps you can follow)
1. Decide whether you will use software or manual calculation.
2. If using software, pick the function (MEDIAN, QUARTILE.INC/EXC) and document it.
3. If manual:
• Sort data.
• Compute positions (use (n + 1) × p, or another agreed method).
• Extract or interpolate values.
4. Compute IQR = Q3 − Q1.
5. Optionally detect outliers: flags are values Q3 + 1.5×IQR.
6. Visualize with a boxplot for quick interpretation.
Applications and interpretation tips
– Use quartiles and IQR when you want robust measures that downweight outliers (financial returns, income data, test scores).
– Compare IQRs across groups to judge relative variability.
– Use quartile skewness or Bowley’s skewness = (Q3 + Q1 − 2×median) / (Q3 − Q1) for a simple skewness indicator.
– Remember that quartiles give no information about distribution shape inside each quartile (e.g., multi-modality).
Bottom line
Quartiles break a distribution into four parts and are simple, robust summaries of location and spread. Spreadsheets make their computation quick, but be aware that different formulas and software conventions can produce slightly different quartile values for the same data. For reproducibility, state the method or function used.
References
– Investopedia. “Quartile.”
– (For differences among percentile algorithms, see statistical software documentation such as Microsoft Excel’s QUARTILE.INC and QUARTILE.EXC documentation.)
(Continuing and expanding the article on quartiles)
Additional Sections
Differences in Quartile Calculation Methods
– Why methods differ: Quartiles are conceptually simple (split data into four equal parts), but different software and statistics texts use different interpolation rules or define how to split a middle observation. These choices change Q1 and Q3 when the target positions fall between observations.
– Common methods:
1. Inclusive (position method using (n + 1) × p): position = (n + 1) × p, where p = 0.25, 0.5, 0.75 for Q1, Q2, Q3. If the position is not an integer, interpolate between surrounding values. This is the traditional textbook rule that many introductory texts use.
2. Tukey’s hinges (split-by-median): Split the sorted data into lower and upper halves. If n is odd, exclude the median when forming halves; if n is even, split evenly. Take medians of halves as Q1 and Q3. This is used for box-plot hinges in some software and by some authors.
3. Software percentile rules (PERCENTILE.INC / PERCENTILE.EXC and percentile “types”): Modern software often implements several percentile algorithms (Excel has multiple methods, R’s quantile has types 1–9). Excel’s QUARTILE.INC and PERCENTILE.INC use one interpolation formula; QUARTILE.EXC / PERCENTILE.EXC use another and exclude endpoints. These produce different numeric results for many datasets.
– Practical implication: For large datasets differences are small; for small datasets or when a quartile position falls between values, the differences matter. Always document which method you used when reporting quartiles.
Step‑by‑Step: Manual Quartile Calculation (Inclusive position method)
1. Sort the data from smallest to largest.
2. Calculate Q1 position = (n + 1) × 0.25; Q2 position = (n + 1) × 0.5; Q3 position = (n + 1) × 0.75.
3. If a position is an integer, that numbered observation is the quartile. If not, interpolate between the surrounding observations.
4. Optionally, verify by splitting and taking medians (Tukey), and note any differences.
Example 1 — Odd n (illustrates concordance between methods)
Data (sorted): 3, 5, 7, 8, 12, 13, 14, 18, 21 (n = 9)
– Median (Q2) = 12 (5th value)
– Inclusive position method:
• Q1 pos = (9 + 1) × 0.25 = 2.5 → average of 2nd and 3rd values = (5 + 7)/2 = 6
• Q3 pos = (9 + 1) × 0.75 = 7.5 → average of 7th and 8th values = (14 + 18)/2 = 16
– Tukey’s hinges (exclude median):
• Lower half: 3, 5, 7, 8 → median = (5 + 7)/2 = 6 → Q1 = 6
• Upper half: 13, 14, 18, 21 → median = (14 + 18)/2 = 16 → Q3 = 16
– Excel PERCENTILE.INC / QUARTILE.INC (different interpolation rule): may return Q1 = 7 and Q3 = 14 for this dataset. Conclusion: method matters.
Example 2 — Detecting outliers with IQR
Data: 2, 4, 5, 6, 7, 8, 50 (n = 7)
– Sorted median Q2 = 6
– Lower half: 2, 4, 5 → Q1 = 4
– Upper half: 7, 8, 50 → Q3 = 8
– Interquartile range (IQR) = Q3 − Q1 = 8 − 4 = 4
– Outlier fences:
• Lower fence = Q1 − 1.5 × IQR = 4 − 6 = −2
• Upper fence = Q3 + 1.5 × IQR = 8 + 6 = 14
– Since 50 > 14, 50 is an outlier by the common 1.5×IQR rule.
How to Calculate Quartiles Using a Spreadsheet (practical)
– Excel:
• QUARTILE.INC(range, 1) returns Q1 (inclusive method).
• QUARTILE.INC(range, 2) returns the median.
• QUARTILE.INC(range, 3) returns Q3.
• QUARTILE.EXC(range, k) uses an exclusive method available for k = 1,2,3 (may produce different results).
• PERCENTILE.INC(range, p) and PERCENTILE.EXC(range, p) provide explicit control for p = 0.25, 0.5, 0.75.
– Google Sheets:
• QUARTILE(range, index) and PERCENTILE(range, p) functions can be used; verify whether the function uses inclusive/exclusive behavior for your Sheets version.
– Practical tip: If you need consistency across tools or to match a publication, use PERCENTILE.INC or PERCENTILE.EXC (or their equivalents) so you control the percentile method rather than relying on a legacy QUARTILE implementation.
Box Plots and Visualizing Quartiles
– A box plot visually shows Q1, median, Q3, whiskers (usually to the furthest non-outlying data within 1.5×IQR), and individual outliers beyond whiskers.
– Box height = IQR and indicates the spread of the middle 50% of data.
– The position of the median inside the box and the relative whisker lengths reveal skewness and spread.
Quartile Skewness (Bowley’s skewness)
– A robust measure of skewness using quartiles:
Bowley’s coefficient = (Q1 + Q3 − 2×Q2) ÷ (Q3 − Q1)
– Interpretation:
• Positive value: distribution is skewed right (longer tail to the right).
• Negative value: skewed left.
• Value near 0: roughly symmetric in the central half.
Important Considerations and Best Practices
– Always sort data first.
– Decide and document which method you use (inclusive/exclusive/Tukey/other percentile type).
– For small datasets, method choice can materially change quartiles—be explicit.
– Quartiles are robust measures of location and spread, less sensitive to outliers than mean and standard deviation.
– Use IQR and quartile-based rules to identify potential outliers but investigate context before removing data.
– For statistical modeling and reporting, indicate the quartile method or use standardized functions used in your field.
Practical Checklist — Calculating Quartiles (quick)
1. Sort your data.
2. Pick a method (typical choices: (n + 1)×p or split-by-median/Tukey; or use software function you understand).
3. Compute Q1, Q2 (median), Q3.
4. Calculate IQR = Q3 − Q1.
5. If needed, compute fences for outlier detection: Q1 − 1.5×IQR and Q3 + 1.5×IQR.
6. Visualize with a box plot to communicate results.
7. Document the method and software functions used.
Additional Examples (brief)
– Even n example (data 1–10):
• Using (n + 1)×p method: Q1 position = 11×0.25 = 2.75; interpolate between 2nd and 3rd values (2 and 3) => Q1 = 2.75.
• Tukey’s hinges: split into 1–5 and 6–10, medians of halves = 3 and 8 => Q1 = 3, Q3 = 8. You’ll see small but real differences based on method.
– Large dataset: Methods converge and differences become negligible.
Concluding Summary
Quartiles partition a dataset into four parts and provide robust, interpretable summaries of where observations lie relative to the distribution’s center. Q1 and Q3 mark the 25th and 75th percentiles, the median (Q2) marks the 50th percentile, and the interquartile range (IQR) measures spread of the middle 50%. Use spreadsheet functions (QUARTILE.INC/EXC, PERCENTILE.INC/EXC) for convenience, but be aware of method differences—especially for small samples. Apply quartiles and the IQR for descriptive summaries, box plots, and systematic outlier detection, and always document the calculation method used.
Sources and Further Reading
– Investopedia — “Quartile” (see:
– Scribbr — “Quartiles & Quantiles / Calculation, Definition & Interpretation”
– Microsoft Support — documentation on QUARTILE.INC / QUARTILE.EXC and PERCENTILE functions