A histogram is a graphical display that summarizes the distribution of a numerical data set by grouping observations into contiguous intervals (called bins) and plotting the frequency (count), percentage, or density for each interval as a contiguous bar. It condenses many individual data points into a visual representation that makes the center, spread, shape, modality, and outliers of the data easy to see (Investopedia; Penn State STAT 500).
Key takeaways
– A histogram shows the frequency (or proportion/density) of numerical data within contiguous intervals (bins).
– Unlike a bar chart for categorical data, histogram bars touch because the underlying variable is continuous.
– Choice of bin width and placement strongly affects interpretation; several rules of thumb can help choose bin size.
– Histograms are useful for exploring distribution shape (normal, skewed, bimodal), spread, and outliers.
– In finance, the MACD histogram is a related concept that plots the difference between the MACD line and its signal line to show momentum (Investopedia).
How histograms work (conceptual)
– Select a continuous numerical variable (e.g., ages, price returns, physical measurements).
– Partition the variable’s range into contiguous, non-overlapping intervals (bins).
– Count how many observations fall into each bin (frequency), or compute the percentage/density for each bin.
– Draw adjacent rectangles for each bin: the horizontal width equals the bin interval, and the vertical height equals the frequency (or percent/density). No gaps are shown between bars.
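The binning and counting steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `histogram_counts`, the sample ages, and the convention that each bin includes its left edge (with the last bin also including the maximum) are assumptions made for the example.

```python
def histogram_counts(values, low, high, n_bins):
    """Count values in n_bins equal-width bins covering [low, high]."""
    if n_bins < 1 or high <= low:
        raise ValueError("need n_bins >= 1 and high > low")
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if v < low or v > high:
            continue  # values outside the covered range are skipped
        # Bins are half-open [left, right); the last bin also includes `high`.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical sample of ten ages, grouped into four 20-year bins.
ages = [3, 17, 22, 25, 31, 38, 40, 44, 59, 80]
edges, counts = histogram_counts(ages, low=0, high=80, n_bins=4)
print(edges)   # bin boundaries
print(counts)  # number of observations per bin
```

Each count becomes the height of one bar, and consecutive edges give that bar's left and right sides.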
Histogram vs. bar chart
– Histogram: for continuous or interval data; bars touch; horizontal axis represents numeric intervals; vertical axis is frequency, percent, or density.
– Bar chart: for discrete or categorical data; bars usually spaced; horizontal axis shows categories; bar width is arbitrary (has no numeric meaning) (Investopedia; Montgomery College).
When to use a histogram
– Use histograms for exploring the distribution of numeric data: to identify center, spread, skewness, modality (uni-, bi-, multi-modal), and outliers.
– Use histograms when you need to compare distributional shape across groups (side-by-side histograms) or to check assumptions (for example, normality) before statistical modeling.
– Don’t use a histogram when data are categorical—use a bar chart instead.
Interpreting histogram shapes (practical guide)
– Symmetric, bell-shaped: consistent with a normal-like distribution.
– Right-skewed (tail to the right): many low values, few large values.
– Left-skewed (tail to the left): many high values, few small values.
– Bimodal/multimodal: two or more peaks — may indicate mixed populations or processes.
– Very narrow: low variability; very wide: high variability.
– Outliers: isolated bars far from main cluster indicate possible outliers or data-entry issues.
Choosing bins: practical rules and guidance
Bin choice can change what you see. Try different approaches and document the choice.
– Square-root rule: number of bins ≈ √n (simple, quick for exploratory work).
– Sturges’ rule: number of bins ≈ log2(n) + 1 (works well for approximately normal small-to-moderate samples).
– Freedman–Diaconis rule (robust): bin width = 2 × IQR × n^(−1/3). Bins get wider as the spread (IQR) grows and narrower as the sample size grows; because the rule uses the IQR, outliers have little effect on it.
– Practical tips:
• Start with square-root or Sturges for a quick view; switch to Freedman–Diaconis if the data are skewed or contain outliers.
• Try several bin widths and compare—look for a stable picture of modality and skew.
• When comparing groups, use the same bins for all histograms.
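As a rough sketch, the three rules above can be computed with the Python standard library. The function names and the sample data are illustrative assumptions. Quartile conventions differ between tools, so the IQR (and therefore the Freedman–Diaconis result) may vary slightly from what other software reports.

```python
import math
import statistics


def sqrt_bins(n):
    """Square-root rule: number of bins is about sqrt(n)."""
    return math.ceil(math.sqrt(n))


def sturges_bins(n):
    """Sturges' rule: number of bins is about log2(n) + 1."""
    return math.ceil(math.log2(n) + 1)


def freedman_diaconis_width(data):
    """Freedman-Diaconis rule: bin width = 2 * IQR * n^(-1/3)."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # default quartile method
    return 2 * (q3 - q1) * len(data) ** (-1 / 3)


def freedman_diaconis_bins(data):
    """Convert the Freedman-Diaconis width into a bin count."""
    width = freedman_diaconis_width(data)
    if width == 0:
        return 1  # degenerate case: no spread in the middle half of the data
    return math.ceil((max(data) - min(data)) / width)


data = list(range(1, 101))  # illustrative data: the integers 1..100
print(sqrt_bins(len(data)), sturges_bins(len(data)), freedman_diaconis_bins(data))
```

The rules can disagree noticeably on the same data, which is one reason to try more than one before settling on a bin count.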
How to make a histogram — by hand (step-by-step)
1. Gather and clean your data (remove or mark missing values).
2. Order the data and calculate min and max.
3. Choose the number of bins or bin width (use one of the rules above).
4. Define the bin edges so they’re contiguous and non-overlapping.
5. Count the observations that fall into each bin (include left edge or right edge consistently).
6. Decide the vertical scale: frequency (counts), percentage (counts / n × 100), or density (counts / (n × width)).
7. Draw the horizontal axis with bin intervals and the vertical axis with chosen scale.
8. Draw contiguous bars for each bin: width = bin width, height = frequency/percentage/density.
9. Label axes and add a title; annotate if you used a particular binning rule.
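The three vertical scales in step 6 are simple transformations of the same counts. The sketch below assumes equal-width bins and uses made-up counts; `vertical_scales` is a hypothetical helper name.

```python
def vertical_scales(counts, bin_width):
    """Return (percentages, densities) for equal-width bins."""
    n = sum(counts)
    percents = [100 * c / n for c in counts]            # counts / n * 100
    densities = [c / (n * bin_width) for c in counts]   # counts / (n * width)
    return percents, densities


counts = [2, 5, 8, 4, 1]   # illustrative frequencies for five bins
bin_width = 0.5
percents, densities = vertical_scales(counts, bin_width)

# With a density scale, the bar areas (height * width) add up to 1.
total_area = sum(d * bin_width for d in densities)
print(percents, densities, total_area)
```

The shape of the histogram is identical on all three scales; only the axis labels change. Density is the scale to use when overlaying a probability curve.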
How to make a histogram in Excel (practical steps)
Two common ways in modern Excel:
A. Built-in Histogram chart (recommended for most users)
1. Select your numeric data.
2. Insert tab → Charts group → Histogram (under Statistical charts) — Excel will auto-bin and draw the chart.
3. Adjust bin settings: right-click horizontal axis → Format Axis → Bin options (You can set bin width, number of bins, or overflow/underflow limits).
4. Format vertical axis (counts vs. percentage): the built-in chart plots counts. To show percentages or density, compute them in a helper column and plot them as a column chart with no gaps between bars.
B. Data Analysis ToolPak (older method)
1. Enable Data Analysis add-in (File → Options → Add-ins → Manage Excel Add-ins → Go → check Analysis ToolPak).
2. Data tab → Data Analysis → Histogram.
3. Input Range: your data. Bin Range: either leave blank (Excel creates bins) or supply a range of bin upper limits.
4. Choose Output Range and check Chart Output for an automatic histogram chart.
(Microsoft documentation)
Histogram example (age distribution)
– Suppose a town’s population is grouped by age into eight 10-year bins: 0–9, 10–19, …, 70–79.
– If counts are [500, 4000, 3500, 3000, 2500, 1500, 700, 200], a histogram with these eight contiguous bins shows where population concentrates and reveals any skew or modes.
– You can change to four 20-year bins to see different aggregation and patterns.
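Re-aggregating the example counts into wider bins is a one-line sum over adjacent pairs. The helper name `merge_adjacent` is an assumption for illustration; the counts are the ones given above.

```python
def merge_adjacent(counts, group_size):
    """Merge every `group_size` neighboring bins into one wider bin."""
    if len(counts) % group_size != 0:
        raise ValueError("number of bins must be a multiple of group_size")
    return [sum(counts[i:i + group_size])
            for i in range(0, len(counts), group_size)]


counts_10yr = [500, 4000, 3500, 3000, 2500, 1500, 700, 200]
counts_20yr = merge_adjacent(counts_10yr, 2)
print(counts_20yr)  # [4500, 6500, 4000, 900]
```

The merge changes the picture. With eight bins the tallest bar is the second one, but with four bins the tallest bar is the second wider bin, which is built from the third and fourth original bins. This is a small instance of how bin choice can shift the apparent peak.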
The MACD histogram — what it is and how traders use it
– What it is: In technical analysis, the MACD histogram plots the difference between the MACD line and the signal line (both derived from exponential moving averages). Each bar’s height equals MACD − signal; bars above zero indicate MACD > signal; bars below zero indicate MACD < signal (Investopedia).
– How traders interpret it:
• Positive, rising bars indicate increasing upward momentum.
• Negative, falling bars indicate increasing downward momentum.
• Shortening bars (bar smaller than previous) can warn of momentum slowing and sometimes give earlier signals than the MACD/Signal crossover (helps reduce lag).
• Use the histogram alongside price action and other indicators; it’s not reliable alone because the MACD lines are moving averages and inherently lagging.
– Risk management: combine histogram signals with confirmation (volume, trendlines, support/resistance) and apply stop-losses because signals can be late or false (Investopedia).
Limitations and cautions
– Sensitive to bin choice—different bins can imply different stories.
– Histograms hide individual data values; important features can be masked if bins are too wide.
– Not appropriate for categorical data.
– For small samples, histogram shape may be misleading; consider kernel density estimates or boxplots as complements.
– In trading, MACD histogram signals lag and may generate false signals in choppy markets; always manage risk.
Practical checklist for producing and using histograms
1. Clean data and decide the goal (explore distribution, test normality, compare groups).
2. Choose binning strategy and document it.
3. Create histogram (by hand or software) and try at least two bin widths for stability.
4. Examine shape, center, spread, modality, and outliers.
5. If comparing groups, use consistent bins and scales.
6. Complement histograms with summary statistics (mean, median, IQR) and other visualizations (boxplots, density plots).
7. For MACD histogram trading, combine with other technical/fundamental analysis and apply risk controls.
The bottom line
A histogram is an essential exploratory tool for summarizing the distribution of numeric data. It’s simple to construct yet powerful for revealing central tendency, dispersion, shape, and anomalies. Proper bin choice and complementary analyses are important to avoid misinterpretation. In finance, the MACD histogram is a specialized application that visualizes momentum (the gap between MACD and its signal line) and can provide useful—but not foolproof—signals when used with other tools (Investopedia; Penn State STAT 500; Microsoft).
Sources
– Investopedia. “Histogram.” (Includes MACD histogram discussion and general definitions)
– Penn State, Eberly College of Science. STAT 500, Applied Statistics: 1.6.2 – Histograms.
– Microsoft. “Create a Histogram.”
– Montgomery College, Pressbooks. “Statistics Study Guide: 2.2 Histograms, Frequency Polygons, and Time Series Graphs.”
Choosing bin size and binning strategy
– Why bin size matters: The number and width of bins (also called buckets) determine how much structure you see. Too few bins oversmooth the distribution and hide details; too many produce noisy, fragmented histograms that can mislead.
– Common rules for selecting bins:
• Sturges’ rule: k = ⌈log2(n) + 1⌉, where n is sample size. Simple and works reasonably for moderately sized, near-normal data.
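• Scott’s rule (for bin width h): h = 3.5 * σ * n^(−1/3), where σ is the sample standard deviation. It suits roughly normal data and, because σ is sensitive to extreme values, it is less robust to outliers than Freedman–Diaconis.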
• Freedman–Diaconis rule (for robust bin width h): h = 2 * IQR * n^(−1/3), where IQR is interquartile range. This is robust to outliers and often a good default.
– Practical tip: try several methods (Sturges, Scott, Freedman–Diaconis) and compare the results visually. Domain knowledge should guide the final choice. For example, demographic histograms often group ages into round decades (0–9, 10–19, etc.) rather than use a mathematically optimal bin width.
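One quick way to compare these rules is NumPy's `histogram_bin_edges`, which accepts `"sturges"`, `"scott"`, and `"fd"` as values for `bins`. The sketch below uses a synthetic normal sample as a stand-in for real data. NumPy rounds the bin count up and then spaces the edges evenly across the data range, so the reported width can be slightly smaller than the raw formula value.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
data = rng.normal(loc=0.0, scale=1.0, size=500)  # illustrative sample only

bin_counts = {}
for rule in ("sturges", "scott", "fd"):
    edges = np.histogram_bin_edges(data, bins=rule)
    bin_counts[rule] = len(edges) - 1
    width = edges[1] - edges[0]
    print(f"{rule:8s} bins={bin_counts[rule]:3d} width={width:.3f}")
```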
How to make a histogram — step-by-step (general workflow)
1. Define the question: Which variable do you want to understand? Frequency of returns? Age distribution? Time durations?
2. Prepare and clean the data: remove or flag missing values, check for obvious data errors, and decide whether to transform values (e.g., log-transform positive-skewed data such as incomes).
3. Decide on the measurement shown on the vertical axis: frequency (counts), relative frequency (percent of observations), or density (so area sums to 1).
4. Choose bin edges/width using one of the rules above or based on domain buckets.
5. Plot and label axes clearly, including unit labels and sample size (n).
6. Inspect the shape: modality (uni/bi/multi), skewness, kurtosis (tails), gaps, and outliers.
7. Iterate: adjust bins or transform data if necessary and, where appropriate, overlay a density estimate (KDE) or theoretical distribution (e.g., normal curve) for comparison.
Make a histogram in Microsoft Excel (practical steps)
– Excel (modern versions):
1. Insert your data in one column.
2. Select the data and go to Insert → Charts → Histogram. Excel computes bins automatically; to adjust the bin width, right-click the horizontal axis and choose Format Axis → Axis Options → Bin width.
3. Alternatively, use the Data Analysis ToolPak: Data → Data Analysis → Histogram. Provide data and bin range, then choose output options. This produces a frequency table and chart.
– By using formulas:
• Use FREQUENCY(data_array, bins_array) as an array formula to compute counts for custom bins, then plot the results as a column chart and set the gap width to 0% so the bars touch.
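For readers checking a spreadsheet result outside Excel, the following sketch mirrors how FREQUENCY is understood to count: each bin value is treated as an inclusive upper limit, and one extra count at the end holds everything above the highest limit. The Python function name `frequency` and the sample scores are illustrative assumptions, not part of Excel.

```python
from bisect import bisect_left


def frequency(data, bin_upper_limits):
    """Count values per bin, treating each limit as an inclusive upper bound.

    Returns len(bin_upper_limits) + 1 counts; the final count holds values
    greater than the highest limit.
    """
    limits = sorted(bin_upper_limits)
    counts = [0] * (len(limits) + 1)
    for x in data:
        # bisect_left finds the first limit that is >= x.
        counts[bisect_left(limits, x)] += 1
    return counts


scores = [55, 60, 61, 70, 79, 80, 95]    # hypothetical data
counts = frequency(scores, [60, 70, 80])
print(counts)  # [2, 2, 2, 1]
```

Note that a value equal to a limit (60, 70, or 80 here) falls in the bin that ends at that limit, which is the opposite edge convention from many plotting libraries.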
Make a histogram in Python (pandas/matplotlib/seaborn)
– Simple with pandas/matplotlib:
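• import pandas as pd
• import matplotlib.pyplot as plt
• df = pd.DataFrame({'column': [2.1, 3.4, 3.9, 4.2, 5.0, 5.3, 6.8]}) # placeholder values; load your own data here
• df['column'].hist(bins=30, density=False) # set density=True to plot probability density
• plt.xlabel('Value'); plt.ylabel('Frequency'); plt.title('Histogram')
• plt.show() # display the figure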
– Using seaborn for nicer defaults and KDE overlay:
• import seaborn as sns
• sns.histplot(df['column'], bins=30, kde=True)
– Use numpy.histogram or matplotlib.pyplot.hist for more control; specify bin edges, density, and cumulative options.
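A minimal sketch of the extra control that `numpy.histogram` gives: explicit bin edges, a density scale, and a cumulative share computed from the counts. The return values (in percent) are invented for illustration.

```python
import numpy as np

# Hypothetical daily returns in percent.
returns = np.array([-2.4, -1.1, -0.6, -0.2, 0.1, 0.3, 0.4, 0.9, 1.5, 2.2])
edges = np.array([-3, -2, -1, 0, 1, 2, 3])  # explicit 1-point-wide bins

# Counts per bin; bins include the left edge, and the last bin includes both edges.
counts, _ = np.histogram(returns, bins=edges)

# Density scale: heights chosen so that the total bar area equals 1.
density, _ = np.histogram(returns, bins=edges, density=True)

# Cumulative share of observations up to the right edge of each bin.
cumulative_share = np.cumsum(counts) / counts.sum()

print(counts)
print(density)
print(cumulative_share)
```

The same `edges` array can be passed to `matplotlib.pyplot.hist` through its `bins` argument, so that the table and the plot use identical bins.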
Making a histogram by hand (small example)
1. Collect data: e.g., daily returns for 30 days.
2. Choose bins: e.g., −3% to −2%, −2% to −1%, −1% to 0%, 0% to 1%, 1% to 2%, 2% to 3%.
3. Count frequency per bin.
4. Draw horizontal axis with bins and vertical axis with counts; draw adjacent rectangles whose heights equal the counts.
Interpreting histograms — what to look for
– Center: where the data concentrate. The tallest bar marks the modal bin; in a skewed distribution the mean and median may fall elsewhere.
– Spread: width of distribution; relationship to variance and standard deviation.
– Shape:
• Symmetric vs. skewed (right/positive skew = long tail to the right; left/negative skew = long tail to the left).
• Modality: unimodal, bimodal, multimodal (multiple peaks can indicate mixed populations).
– Tails: heavy (fat) tails vs. light tails; fat tails are common in financial returns and indicate higher probability of extreme outcomes than a normal distribution implies.
– Outliers and gaps: indicate data issues or rare events; investigate rather than ignore.
– Compare to theoretical curves: overlay normal or other relevant distributions to see deviations.
Examples and applied use cases
1) Demographic example (age frequencies)
– Goal: visualize how a town's population is distributed by age.
– Bins: decades (0–9, 10–19, …). Vertical axis: count or percent.
– Interpretation: identify dominant age groups, peaks (young population vs. aging population), and gaps.
2) Financial returns (risk analysis)
– Goal: understand distribution of daily returns for a stock or portfolio.
– Procedure:
• Compute log returns: r_t = ln(P_t / P_(t−1)).
• Plot histogram with enough bins (Freedman–Diaconis often appropriate).
• Overlay a normal distribution fitted to mean and std dev to highlight fat tails.
– Interpretation:
• Fat tails and skewness imply downside risk not captured by normal-based models.
• Use histogram insights to adjust Value at Risk (VaR) models or stress-testing.
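The log-return step and a rough fat-tail check can be sketched with the standard library. The price series and the helper names are hypothetical. The tail check simply compares the share of returns more than k standard deviations from the mean with the share a normal distribution would predict; it is an informal diagnostic under those assumptions, not a formal test.

```python
import math
import statistics


def log_returns(prices):
    """r_t = ln(P_t / P_(t-1)) for each consecutive pair of prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def tail_share(returns, k=2.0):
    """Observed share of returns more than k standard deviations from the mean."""
    mean = statistics.fmean(returns)
    sd = statistics.stdev(returns)
    extreme = sum(1 for r in returns if abs(r - mean) > k * sd)
    return extreme / len(returns)


def normal_tail_share(k=2.0):
    """Share a normal distribution places beyond k standard deviations."""
    return 2 * (1 - statistics.NormalDist().cdf(k))


# Hypothetical closing prices with one sharp drop.
prices = [100.0, 101.0, 99.5, 100.2, 93.0, 94.1, 94.0, 95.2, 95.0, 96.3]
r = log_returns(prices)
print(tail_share(r), normal_tail_share())
```

With a realistic sample size, an observed share well above the normal benchmark is the numeric counterpart of the fat tails visible in the histogram.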
3) Monte Carlo simulation outputs
– Goal: examine distribution of simulated portfolio values at a future horizon.
– Procedure:
• Run many simulations (e.g., 10,000).
• Plot histogram of end values or percent changes and mark percentiles (e.g., 5th, 95th).
– Interpretation:
• Assess probability of outcomes, tail risk, and likely ranges for planning.
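A toy version of this procedure, using only the standard library. It assumes independent, normally distributed monthly returns, which is a deliberate simplification that ignores the fat tails discussed above. All parameter values and the function name are made up for illustration.

```python
import random
import statistics


def simulate_end_values(start_value, mean_return, volatility,
                        periods, n_paths, seed=0):
    """Simulate end values by compounding random per-period returns."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    end_values = []
    for _ in range(n_paths):
        value = start_value
        for _ in range(periods):
            value *= 1 + rng.gauss(mean_return, volatility)
        end_values.append(value)
    return end_values


end_values = simulate_end_values(
    start_value=100.0, mean_return=0.005, volatility=0.04,
    periods=12, n_paths=10_000,
)

# n=20 yields 19 cut points at 5%, 10%, ..., 95%.
cuts = statistics.quantiles(end_values, n=20)
p5, p95 = cuts[0], cuts[-1]
median = statistics.median(end_values)
print(f"5th pct={p5:.1f}  median={median:.1f}  95th pct={p95:.1f}")
```

Plotting `end_values` as a histogram and drawing vertical lines at `p5` and `p95` gives the picture described in the procedure.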
The MACD histogram — deeper dive and example
– Recap of definitions:
• MACD line = EMAshort − EMAlong (commonly EMA12 − EMA26).
• Signal line = EMA of MACD line (commonly EMA9 of MACD).
• MACD histogram = MACD line − Signal line.
– Numeric illustration (conceptual):
• If MACD line today = 0.60 and signal line = 0.40, histogram = 0.20 (positive, bullish momentum).
• If MACD line = 0.20 and signal = 0.50, histogram = −0.30 (negative, bearish momentum).
– How to compute (practical steps):
1. Calculate short-term and long-term EMAs of price series (12-period and 26-period are standard).
2. Subtract long EMA from short EMA → MACD line.
3. Calculate the EMA of the MACD line (e.g., 9-period) → signal line.
4. Subtract signal line from MACD line → MACD histogram.
– Common signals using MACD histogram:
• Zero-line cross: a histogram cross from negative to positive (MACD crossing above its signal line) may indicate a bullish shift; a cross from positive to negative may indicate a bearish one.
• Divergence: price makes a new high but histogram fails to make a new high (bearish divergence), or price makes a new low but histogram does not (bullish divergence).
• Histogram bar shortening: when bar height shrinks (even while above zero), momentum may be weakening and a reversal or correction could follow.
– Example trading approach (illustrative — not investment advice):
• Entry: after confirmation from the histogram (e.g., turning positive and expanding) plus a supporting price pattern, such as a break above resistance.
• Exit/stop: Place stop-loss under recent swing low; consider exiting when histogram shrinks and reverses sign.
• Use other indicators (volume, price action, ADX, RSI) to avoid false signals.
– Limitations and mitigation:
• MACD and its histogram are lagging because EMAs rely on past prices. To reduce false signals, require additional confirmation, use shorter EMAs for more responsiveness (at cost of more noise), or apply filters (e.g., trend filter).
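The four computation steps above can be sketched in plain Python as follows. This is a minimal illustration: the EMA uses the common smoothing factor 2 / (span + 1) and is seeded with the first value, which is an assumption. Charting platforms may seed differently, so early values can differ from theirs. The price series is synthetic.

```python
def ema(values, span):
    """Exponential moving average, seeded with the first value."""
    alpha = 2 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd_histogram(prices, short=12, long=26, signal=9):
    """Return (macd_line, signal_line, histogram) as equal-length lists."""
    short_ema = ema(prices, short)                       # step 1
    long_ema = ema(prices, long)                         # step 1
    macd_line = [s - l for s, l in zip(short_ema, long_ema)]       # step 2
    signal_line = ema(macd_line, signal)                 # step 3
    histogram = [m - s for m, s in zip(macd_line, signal_line)]    # step 4
    return macd_line, signal_line, histogram


# Synthetic, steadily rising prices (illustration only).
prices = [100 + 0.5 * i for i in range(60)]
macd_line, signal_line, hist = macd_histogram(prices)
print(round(macd_line[-1], 3), round(signal_line[-1], 3), round(hist[-1], 3))
```

On a steadily rising series the short EMA stays above the long EMA and the MACD line stays above its signal line, so the histogram bars are positive. On a flat series every bar is zero.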
Advanced options and alternatives
– Kernel density estimation (KDE): a smoothed estimate of the distribution. Useful when you want a continuous curve rather than discrete bars. Combine histogram + KDE for complementary views.
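– Cumulative histograms: each bar shows the count or proportion of observations up to and including that bin. The result approximates the empirical cumulative distribution function (CDF) and makes percentiles easy to read.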
– Log binning: for heavy-tailed positive data (e.g., incomes, trade sizes), use logarithmic bins to display multiplicative scales better.
– Normal QQ-plots and goodness-of-fit tests: use these to formally assess whether data follow a specific distribution.
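Logarithmic bins, mentioned above, have edges that grow by a constant ratio instead of a constant difference. A minimal sketch, with a hypothetical helper name and an illustrative income range:

```python
def log_bin_edges(low, high, n_bins):
    """Return n_bins + 1 edges spaced by a constant ratio from low to high."""
    if low <= 0 or high <= low or n_bins < 1:
        raise ValueError("need 0 < low < high and n_bins >= 1")
    ratio = (high / low) ** (1 / n_bins)
    return [low * ratio ** i for i in range(n_bins + 1)]


# Three bins spanning 1,000 to 1,000,000: each edge is about 10x the previous one.
edges = log_bin_edges(1_000, 1_000_000, 3)
print([round(e) for e in edges])
```

When such edges are used, plotting the horizontal axis on a log scale makes the bars appear equally wide.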
Common mistakes and pitfalls
– Using categorical thinking with continuous data: histograms summarize continuous data — avoid arbitrary categorical labels unless they reflect real categories.
– Ignoring bin-sensitivity: conclusions about modality or tails can change with bin choice — always test robustness.
– Failing to annotate: omitting units, sample size, or axis scales leads to misinterpretation.
– Overreliance for decision-making in trading: histogram-based signals alone are not reliable; combine with other tools and risk management.
Checklist before presenting a histogram
– Is the axis labeled with units?
– Is sample size (n) indicated?
– Did you choose binning that reflects the question and data size?
– Have you checked for data issues (missing values, erroneous outliers)?
– Did you consider density vs. frequency and state which you used?
– Are any overlays (KDE, normal curve) clearly labeled?
Concluding summary
Histograms are a foundational visualization that condense many data points into an intuitive picture of a variable’s distribution. Proper construction — choosing sensible bin widths, cleaning data, and labeling axes — makes them powerful tools for exploratory data analysis, risk assessment, and communication. In finance, histograms clarify return distributions, highlight fat tails and skewness, and provide insight for models and stress tests. The MACD histogram is a specialized application in technical analysis that helps traders read momentum and possible trend shifts, but it should be combined with other indicators and disciplined risk management because of lag and potential false signals.
Recommended references and further reading
– Investopedia. “Histogram.”
– Penn State, Eberly College of Science. STAT 500, Applied Statistics: 1.6.2 – Histograms.
– Microsoft. “Create a Histogram.” (Support documentation for Excel histogram tools.)
– Montgomery College, Pressbooks. “Statistics Study Guide: 2.2 Histograms, Frequency Polygons, and Time Series Graphs.”
– Freedman, D. and Diaconis, P. (1981). “On the histogram as a density estimator: L2 theory.” (Formulation of the Freedman–Diaconis rule.)