
How to Draw a Box and Whisker Diagram: Complete Step-by-Step Guide (2026)
Master how to draw a box and whisker diagram with this complete tutorial covering hand calculations, Excel, Python, and R methods with worked examples.
Drawing a Box and Whisker Diagram: A Full Tutorial for Researchers and Students
Picture a column of numbers with no obvious pattern. How do you convey, in a single picture, where the bulk of those numbers sit, how widely they scatter, and whether a few of them stray far from the pack? A box and whisker diagram answers all three questions at once, using a footprint smaller than most paragraphs.
The format traces back to John Tukey, who introduced it around 1970 and popularized it in his 1977 book Exploratory Data Analysis. Today it travels under several interchangeable names: box plot, box and whisker plot, box and whisker diagram. Whatever you call it, the graphic distills a distribution down to five summary numbers. That economy explains why it turns up constantly in lab reports, journal figures, and lecture slides across ecology, neuroscience, mechanical engineering, economics, and beyond, wherever someone needs to line up several groups and ask how they differ.
The sections below walk through every route to the same chart, beginning with paper and a pencil and ending with reproducible scripts in Excel, Python, and R.

What a Box and Whisker Diagram Actually Shows
At its core, the chart encodes a distribution through five reference points: the lowest value, the first quartile (Q1), the median (Q2), the third quartile (Q3), and the highest value. Taken as a set, these numbers reveal three things simultaneously: where the data tends to center, how loosely or tightly it spreads, and whether stray observations sit far from everything else.
Compare that against the usual alternatives. A bar chart collapses everything into an average and hides the variation behind it. A histogram brings the shape back but forces you to pick a bin width that can change the story. The box plot threads the needle: it gives a tidy, bin-free portrait of spread that stays legible even when you stack a dozen groups along one axis.
Why Box Plots Appear Throughout Academic Research
A few concrete payoffs keep box plots in heavy rotation among researchers:
| Advantage | Description |
|---|---|
| Five numbers at a glance | The whole five-number summary fits into one compact mark |
| Spots the odd values | Points that fall outside the whiskers stand out right away |
| Lines up well | Stack several plots in a row to weigh groups against each other |
| Hints at shape | Where the median sits inside the box signals any lean in the data |
| Distribution-agnostic | Holds up for any continuous measurements, whatever the shape of their distribution |
We expand on how chart selection underpins clear scientific reporting in our research data visualization guide, where box plots repeatedly stand out as the natural pick for side-by-side distribution comparisons.
The Anatomy of a Box and Whisker Plot
Once you can name each part of the chart and say what it stands for, both drawing one and reading one become almost mechanical.
The Five-Number Summary
These are the five quantities that every box plot is assembled from:
- Minimum: The lowest reading that remains once outliers are set aside
- First Quartile (Q1): The cut point at the 25th percentile, with a quarter of the observations lying beneath it
- Median (Q2): The middle of the data at the 50th percentile, dividing the set into two equal halves
- Third Quartile (Q3): The cut point at the 75th percentile, with three quarters of the observations lying beneath it
- Maximum: The highest reading that remains once outliers are set aside
Visual Components Explained
| Component | What It Represents | Visual Element |
|---|---|---|
| Box | The interquartile range (IQR = Q3 - Q1), spanning the middle 50% of data | Rectangle |
| Median line | The center of the distribution | Line inside the box |
| Whiskers | Span from each box edge to the most extreme reading within 1.5 x IQR of that edge | Lines extending from the box |
| Outliers | Points beyond 1.5 x IQR from Q1 or Q3 | Individual dots or circles |
| Fences | Boundaries at Q1 - 1.5 x IQR and Q3 + 1.5 x IQR | Not drawn, but define whisker limits |

A scatter plot lays bare the individual points that a box plot later folds into five summary numbers. Looking at the two side by side gives you the fullest read on your data.
How to Draw a Box and Whisker Diagram by Hand
Nothing cements the concept like running the arithmetic yourself. We will track a small operations dataset all the way through: the delivery times, in minutes, recorded for twelve orders from a campus food stall.
Dataset: 14, 17, 19, 21, 22, 24, 26, 27, 29, 31, 34, 58
Notice that this set has an even number of readings and one suspiciously slow delivery, which makes it a good stress test for the procedure.
Step 1: Sort the Data
Put every reading in ascending order first. Ours already arrives sorted, which keeps the bookkeeping simple:
14, 17, 19, 21, 22, 24, 26, 27, 29, 31, 34, 58
Step 2: Find the Median (Q2)
With twelve readings, no single value sits dead center. The median lands between the 6th and 7th entries, so we average them:
Q2 (Median) = (24 + 26) / 2 = 25
Step 3: Find Q1 (First Quartile)
Q1 is the median of the lower half, the six readings that fall below the overall median:
14, 17, 19, 21, 22, 24
Again the half has an even count, so Q1 is the average of its two middle entries:
Q1 = (19 + 21) / 2 = 20
Step 4: Find Q3 (Third Quartile)
Q3 is the median of the upper half, the six readings above the overall median:
26, 27, 29, 31, 34, 58
Q3 = (29 + 31) / 2 = 30
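Steps 2 through 4 can be double-checked with a few lines of Python. This is a minimal sketch of the median-of-halves method used above, not a definitive implementation. The function name `median_of_halves` is ours, and for an odd number of readings it assumes the overall median is left out of both halves, which is one common convention among several.

```python
from statistics import median


def median_of_halves(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumption: with an odd count, the middle reading is excluded
    from both the lower and the upper half.
    """
    ordered = sorted(values)
    if len(ordered) < 2:
        raise ValueError("need at least two readings")
    half = len(ordered) // 2
    lower = ordered[:half]    # readings below the overall median
    upper = ordered[-half:]   # readings above the overall median
    return median(lower), median(ordered), median(upper)


deliveries = [14, 17, 19, 21, 22, 24, 26, 27, 29, 31, 34, 58]
q1, q2, q3 = median_of_halves(deliveries)
print(q1, q2, q3)  # 20.0 25.0 30.0
```

The printed values match the hand calculation: Q1 = 20, median = 25, Q3 = 30.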
Step 5: Compute the IQR and Fence Values
IQR = Q3 - Q1 = 30 - 20 = 10
- Lower fence: Q1 - 1.5 x IQR = 20 - 15 = 5
- Upper fence: Q3 + 1.5 x IQR = 30 + 15 = 45
Here the fences earn their keep. The 58-minute delivery sits above the upper fence of 45, so it qualifies as an outlier and gets plotted on its own.
Step 6: Set the Whisker Endpoints
- Lower whisker endpoint: the smallest reading still at or above the lower fence = 14
- Upper whisker endpoint: the largest reading still at or below the upper fence = 34
Step 7: Draw the Chart
- Lay down a number line that spans the entire range of readings
- Sketch a rectangle stretching from Q1 (20) to Q3 (30)
- Drop a vertical line inside the box at the median (25)
- Run a whisker from the left edge of the box down to 14
- Run a whisker from the right edge of the box out to 34
- Mark the lone 58-minute reading as a separate point past the right whisker
Pulling it together, the five-number summary plus the flagged outlier reads:
| Statistic | Value |
|---|---|
| Minimum (non-outlier) | 14 |
| Q1 | 20 |
| Median (Q2) | 25 |
| Q3 | 30 |
| Maximum (non-outlier) | 34 |
| Outlier | 58 |
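The hand method is only one of several quartile conventions, so software may report slightly different values for the same readings. As a hedged illustration, Python's standard-library `statistics.quantiles` offers an `exclusive` and an `inclusive` method, both of which interpolate between neighbouring readings. The variable names below are ours.

```python
from statistics import quantiles

deliveries = [14, 17, 19, 21, 22, 24, 26, 27, 29, 31, 34, 58]

# Two interpolating conventions from the standard library.
exclusive = quantiles(deliveries, n=4, method="exclusive")
inclusive = quantiles(deliveries, n=4, method="inclusive")
print(exclusive)  # [19.5, 25.0, 30.5]
print(inclusive)  # [20.5, 25.0, 29.5]

# The 58-minute delivery is flagged under either convention.
for q1, _, q3 in (exclusive, inclusive):
    upper_fence = q3 + 1.5 * (q3 - q1)
    print(upper_fence, 58 > upper_fence)
```

Neither method reproduces the hand-computed Q1 = 20 and Q3 = 30 exactly, yet the median and the outlier verdict are unchanged. With larger samples the differences between conventions shrink further.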
Creating Box Plots in Excel
From Excel 2016 onward, box and whisker is a first-class chart type, so you can hand the spreadsheet your raw numbers and let it do the quartile math.
Using the Built-In Chart Option
- Drop your values into one column, with a label in the top cell
- Highlight that column, header included
- Walk through Insert > Charts > Statistical > Box and Whisker
- Excel works out the quartiles on its own and draws the finished chart
Formatting Options
Right-click a box and choose Format Data Series to open a pane with these options:
- Show inner points: Plot every reading that lands inside the whiskers
- Show outlier points: Surface readings that sit past the whisker tips
- Show mean markers: Drop a marker at the arithmetic mean
- Show mean line: Trace a line connecting the means across groups
- Inclusive or Exclusive quartile calculation: Pick the formula Excel uses for Q1 and Q3
Getting Publication-Ready Output
| Setting | Recommendation |
|---|---|
| Colors | Use a scientific color palette |
| Fonts | Choose readable fonts (see our font guide) |
| Gridlines | Remove or lighten for a cleaner appearance |
| Legend | Include only when comparing multiple groups |
| Export | Save at high resolution as PNG or SVG |
Creating Box Plots in Python
When you want precise command over every line, fill, and marker, Matplotlib and Seaborn are hard to beat.
Basic Box Plot with Matplotlib
import matplotlib.pyplot as plt
# Delivery times in minutes from the worked example
data = [14, 17, 19, 21, 22, 24, 26, 27, 29, 31, 34, 58]
fig, ax = plt.subplots(figsize=(8, 5))
# patch_artist=True lets the box take a fill color
ax.boxplot(data, vert=False, patch_artist=True,
           boxprops=dict(facecolor='#4C72B0', alpha=0.7),
           medianprops=dict(color='white', linewidth=2))
ax.set_xlabel('Delivery Time (minutes)')
ax.set_title('Distribution of Order Delivery Times')
plt.tight_layout()
plt.savefig('boxplot.png', dpi=300)
plt.show()
Comparing Multiple Groups with Seaborn
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
# Sample data for three treatment groups
# The label counts must add up to the 40 values listed below
data = pd.DataFrame({
    'Group': ['Control']*10 + ['Treatment A']*10 + ['Treatment B']*20,
    'Value': [45, 48, 42, 50, 47, 44, 49, 51, 46, 43,
              53, 48, 55, 50, 52, 47, 54, 51, 49, 56,
              60, 62, 58, 65, 61, 59, 63, 57, 64, 66,
              55, 52, 58, 53, 57, 54, 56, 59, 51, 60]
})
fig, ax = plt.subplots(figsize=(8, 6))
sns.boxplot(x='Group', y='Value', data=data, palette='Set2', ax=ax)
ax.set_ylabel('Measurement Value')
ax.set_title('Treatment Comparison')
plt.tight_layout()
plt.savefig('comparison_boxplot.png', dpi=300)
plt.show()
Overlaying Individual Data Points
A growing number of journals ask you to show the underlying observations on top of the summary box:
fig, ax = plt.subplots(figsize=(8, 6))
sns.boxplot(x='Group', y='Value', data=data, palette='Set2', ax=ax)
sns.stripplot(x='Group', y='Value', data=data,
              color='black', alpha=0.4, size=4, ax=ax)
plt.tight_layout()
plt.savefig('boxplot_with_points.png', dpi=300)
plt.show()
Creating Box Plots in R
Across many quantitative fields, R paired with ggplot2 is the default workshop for statistical figures.
Basic Box Plot with ggplot2
library(ggplot2)
deliveries <- data.frame(value = c(14, 17, 19, 21, 22, 24,
                                   26, 27, 29, 31, 34, 58))
ggplot(deliveries, aes(y = value)) +
  geom_boxplot(fill = "#4C72B0", alpha = 0.7, width = 0.4) +
  labs(title = "Distribution of Order Delivery Times",
       y = "Delivery Time (minutes)") +
  theme_minimal()
Group Comparisons in R
# Sample grouped data
set.seed(42)  # fix the random draws so the figure is reproducible
treatment_data <- data.frame(
  Group = rep(c("Control", "Treatment A", "Treatment B"), each = 20),
  Value = c(rnorm(20, mean = 47, sd = 3),
            rnorm(20, mean = 52, sd = 4),
            rnorm(20, mean = 60, sd = 3))
)
ggplot(treatment_data, aes(x = Group, y = Value, fill = Group)) +
  geom_boxplot(alpha = 0.7) +
  geom_jitter(width = 0.15, alpha = 0.4, size = 2) +
  scale_fill_brewer(palette = "Set2") +
  labs(title = "Treatment Group Comparison",
       y = "Measurement Value") +
  theme_minimal() +
  theme(legend.position = "none")
Notched Box Plots for Visual Significance Testing
The notch carved around the median traces a rough 95% confidence interval. If the notches of two boxes fail to overlap, you have a quick visual hint that their medians genuinely differ:
ggplot(treatment_data, aes(x = Group, y = Value, fill = Group)) +
  geom_boxplot(notch = TRUE, alpha = 0.7) +
  scale_fill_brewer(palette = "Set2") +
  theme_minimal()
When Box Plots Are the Right Choice
For all their strengths, box plots are not a default for every dataset. The checklist below helps you decide when they earn their place.
Situations Where Box Plots Work Well
- You are contrasting two or more distributions on the same scale
- You need outliers in continuous measurements to jump out automatically
- You want spread captured compactly, median and quartiles and range together
- Several summary values matter at once and space is tight
- You are weighing median differences between experimental conditions
When Another Chart Type Fits Better
| Situation | Better Alternative | Why |
|---|---|---|
| Small sample size (n < 10) | Strip or dot plot | Box plots imply more statistical stability than small samples provide |
| Showing exact distribution shape | Violin plot or histogram | Box plots hide multimodal distributions |
| General audience presentations | Bar chart with error bars | Box plots require some statistical background to interpret |
| Describing a single group | Histogram or density plot | More detail on the underlying shape |
| Displaying every observation | Beeswarm or strip plot | Individual values stay visible |
Our guide to scientific diagrams makes the same point: a figure works best when its type fits both the shape of the data and the people reading it.

A bar chart hands you the average and stops there. A box plot keeps the spread, the skew, and the stray values that an average quietly erases.
Comparing Multiple Groups with Box Plots
Lining several groups up next to one another is probably the single most frequent reason people reach for a box and whisker diagram.
Side-by-Side Layout
Anchor every group to one common axis, then read across them:
- Center: Are the medians sitting at noticeably different heights?
- Spread: Is one group far more scattered than the rest?
- Shape: Do the distributions look balanced, or do they lean to one side?
- Outliers: Which groups are throwing off stray points?
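The checklist above can also be run numerically before you draw anything. The sketch below summarizes each group's center, spread, and outliers using the standard library. The readings are hypothetical values invented for this illustration, and `summarize` is our own helper name.

```python
from statistics import median, quantiles


def summarize(values):
    """Median, IQR, and Tukey outliers for one group."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "median": median(values),
        "iqr": iqr,
        "outliers": [v for v in values if v < low or v > high],
    }


# Hypothetical readings, invented for this illustration
groups = {
    "Control": [42, 44, 45, 46, 47, 48, 49, 50],
    "Treatment": [55, 57, 58, 60, 61, 63, 64, 70],
}
for name, values in groups.items():
    print(name, summarize(values))
```

Here the Treatment group has both the higher median and the larger IQR, which is exactly what the side-by-side boxes would show as a higher and taller box.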
Two Categorical Variables
If two factors organize your data at once, say treatment arm crossed with sampling time, grouped or faceted layouts keep both dimensions readable:
# Grouped box plot in Seaborn
sns.boxplot(x='Timepoint', y='Value', hue='Treatment', data=df)

# Faceted box plot in ggplot2
ggplot(df, aes(x = Treatment, y = Value)) +
  geom_boxplot() +
  facet_wrap(~Timepoint)
Interpreting What You See
| Observation | Interpretation |
|---|---|
| Boxes at different heights | Groups have different central tendencies |
| One box much wider than another | That group has more variability |
| Median line near box edge | Distribution is skewed away from that edge, toward the longer side of the box |
| Many outlier dots | Data may deviate substantially from normal |
| Non-overlapping boxes | Groups likely differ in a meaningful way |
Common Mistakes to Avoid When Drawing Box Plots
A handful of habits crop up again and again, and each one can quietly mislead a reader or chip away at the trust your figure deserves.
Mistake 1: Applying Box Plots to Very Small Samples
Problem: Feed a box plot five observations and it still dutifully renders a box and two whiskers, even though every feature now hangs on one or two numbers. The picture looks far more settled than the evidence warrants.
Fix: Below roughly ten observations, switch to a strip plot or dot plot that puts each value on the page individually.
Mistake 2: Overlooking Bimodal Distributions
Problem: When a dataset hides two separate clusters, the resulting box can look perfectly ordinary while erasing the twin peaks beneath it.
Fix: Layer a strip plot over the box (geom_jitter in R, sns.stripplot in Python), or reach for a violin plot that draws the whole shape.
Mistake 3: Inconsistent Whisker Definitions
Problem: Whisker rules vary by tool. One stretches to the extremes, another halts at 1.5 x IQR, and a third pins the ends to fixed percentiles such as the 5th and 95th.
Fix: Spell out the rule you used right in the caption. Tukey's 1.5 x IQR convention is the version most readers will assume.
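To see how much the choice matters, the sketch below applies three whisker rules to the delivery times from the worked example. It is an illustration under stated assumptions: `whisker_ends` is our own helper, and the percentile and Tukey branches rely on the `inclusive` interpolation of Python's `statistics.quantiles`, which may not match the formula your plotting tool uses.

```python
from statistics import quantiles


def whisker_ends(values, rule="tukey"):
    """Return (lower, upper) whisker ends under the chosen rule."""
    ordered = sorted(values)
    if rule == "minmax":
        return ordered[0], ordered[-1]
    if rule == "percentile":
        # n=20 yields cut points at 5%, 10%, ..., 95%
        cuts = quantiles(ordered, n=20, method="inclusive")
        return cuts[0], cuts[-1]
    if rule == "tukey":
        q1, _, q3 = quantiles(ordered, n=4, method="inclusive")
        iqr = q3 - q1
        inside = [v for v in ordered
                  if q1 - 1.5 * iqr <= v <= q3 + 1.5 * iqr]
        return inside[0], inside[-1]
    raise ValueError(f"unknown rule: {rule}")


deliveries = [14, 17, 19, 21, 22, 24, 26, 27, 29, 31, 34, 58]
for rule in ("minmax", "tukey", "percentile"):
    print(rule, whisker_ends(deliveries, rule))
```

The upper whisker lands at 58 under the min/max rule, at 34 under Tukey's rule, and at roughly 44.8 under the 5th-to-95th percentile rule, so the same data can yield three visibly different charts.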
Mistake 4: Missing Axis Labels
Problem: Strip away the axis labels, units, and scale, and your readers are left guessing what any of the numbers actually mean.
Fix: Label every axis clearly and name the units. Hold the scale constant across groups so the comparison stays honest.
Mistake 5: Crowding Too Many Groups Into One Panel
Problem: Cram a dozen or more boxes onto one panel and precise reading goes out the window.
Fix: Sort the groups by median, break them across faceted panels, or bundle related categories. Seven boxes per panel is a sensible ceiling.
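One way to apply this fix is to order the groups by median and then split them into panels before plotting. The sketch below does only that bookkeeping. The function name `plan_panels`, the `site_*` labels, and the readings are hypothetical, and the default of seven boxes per panel simply mirrors the ceiling suggested above.

```python
from statistics import median


def plan_panels(groups, max_per_panel=7):
    """Sort group names by median, then chunk them into panels."""
    ordered = sorted(groups, key=lambda name: median(groups[name]))
    return [ordered[i:i + max_per_panel]
            for i in range(0, len(ordered), max_per_panel)]


# Hypothetical data: nine groups whose medians are i + 2
groups = {f"site_{i}": [i, i + 2, i + 4] for i in (9, 3, 7, 1, 5, 8, 2, 6, 4)}
for panel in plan_panels(groups):
    print(panel)
```

Each inner list can then be passed to a faceted layout, so no single panel carries more than seven boxes.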
Mistake 6: Choosing Inaccessible Colors
Problem: Hues that blur together, or that colorblind readers cannot tell apart, add visual noise without carrying any information.
Fix: Reach for a scientific color palette designed with accessibility in mind, and let color mark real group differences rather than decorate.
Advanced Variations on the Standard Box Plot
The classic box and whisker diagram has spawned a family of richer formats, each adding a layer of analytical detail.
Violin Plots
A violin plot drapes a kernel density estimate over the box plot skeleton so the entire distribution becomes visible. It pays off most when the data is multimodal, exactly the case a plain box plot would gloss over.
Notched Box Plots
Cutting notches into the box near the median sketches an approximate 95% confidence interval. Two boxes whose notches do not overlap point to medians that likely differ. Treat this as a quick visual cue rather than a replacement for a formal test.
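For a sense of how wide a notch is, the sketch below computes one numerically. It assumes the convention described in the ggplot2 documentation, where notches extend 1.58 x IQR / sqrt(n) on either side of the median. Other tools may use a slightly different constant, and the helper names are ours.

```python
from math import sqrt


def notch_interval(med, iqr, n, c=1.58):
    """Approximate notch limits: median +/- c * IQR / sqrt(n)."""
    half_width = c * iqr / sqrt(n)
    return med - half_width, med + half_width


def notches_overlap(a, b):
    """True if two (low, high) notch intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


# Delivery-time example: median 25, IQR 10, twelve readings
low, high = notch_interval(25, 10, 12)
print(round(low, 2), round(high, 2))  # 20.44 29.56
```

With only twelve readings the notch spans nearly the whole box (20 to 30), a reminder that small samples leave the median poorly pinned down.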
Letter-Value Plots
Letter-value plots, introduced by Heike Hofmann and collaborators, push the quartile idea outward to further quantile levels. They shine on large datasets, where the bare five-number summary leaves real detail in the tails unaccounted for.
Raincloud Plots
A raincloud stacks three views in one frame: a half-violin, a box plot, and a jittered strip of raw points. Now common in psychology and neuroscience work, it keeps the individual data honest while still offering a distributional summary.
Box Plots with Jittered Points
Scattering the actual observations across the box as a jittered strip has become standard, and sometimes mandatory, journal practice. The hybrid keeps both the summary statistics and the raw evidence in plain sight.

Blending chart types is increasingly the norm in current research. A time series, for instance, can carry box plots at each time point to expose how the distribution shifts along the way.
Box Plot Interpretation Quick Reference
| Feature | What to Look For | Interpretation |
|---|---|---|
| Median position | Where the line sits within the box | Near center means symmetric; near an edge means skewed |
| Box size | Width of the IQR | Larger box means more variability in the central 50% |
| Whisker length | Distance from box to whisker ends | Longer whiskers indicate a wider overall data range |
| Asymmetric whiskers | One whisker substantially longer | Data skews toward the longer whisker side |
| Number of outlier dots | Points beyond the whiskers | Many outliers may suggest a heavy-tailed or non-normal distribution |
| Box overlap across groups | Whether boxes from different groups intersect | No overlap suggests the groups differ in a meaningful way |
Frequently Asked Questions
What is the difference between a box plot and a box and whisker diagram?
There is none. Box plot, box and whisker plot, and box and whisker diagram all point to the identical statistical graphic. Each one charts the same five summary numbers (minimum, Q1, median, Q3, maximum) together with any outliers. The wordier name is just a nod to how the thing looks, a rectangle with thin lines, the whiskers, trailing off either end.
How do you calculate quartiles for a box and whisker diagram?
Begin by ordering every value from low to high. The median (Q2) sits in the middle of the whole set. Q1 is then the median of the bottom half, and Q3 is the median of the top half. When a half contains an even count of values, take the average of its two central entries (the same rule applies to the overall median for an even-sized dataset). One caveat: quartile formulas differ slightly between software packages, so verify which one your tool relies on.
How do you identify outliers in a box and whisker plot?
Outliers are defined relative to the interquartile range (IQR = Q3 - Q1). Anything sitting below Q1 - 1.5 x IQR or above Q3 + 1.5 x IQR counts as an outlier and is drawn as its own point past the whiskers. For a stricter cutoff, some analysts reserve the label 'extreme' for values beyond Q1 - 3 x IQR or Q3 + 3 x IQR.
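The two cutoffs can be expressed as a small classifier. This is a sketch with a hypothetical helper name, `classify`, applied to the delivery example where Q1 = 20 and Q3 = 30.

```python
def classify(value, q1, q3):
    """Label a reading as 'inside', 'outlier', or 'extreme'."""
    iqr = q3 - q1
    if value < q1 - 3 * iqr or value > q3 + 3 * iqr:
        return "extreme"   # beyond the 3 x IQR cutoffs
    if value < q1 - 1.5 * iqr or value > q3 + 1.5 * iqr:
        return "outlier"   # beyond the 1.5 x IQR fences only
    return "inside"


print(classify(34, 20, 30))  # inside
print(classify(58, 20, 30))  # outlier
print(classify(61, 20, 30))  # extreme
```

The 58-minute delivery clears the 1.5 x IQR fence at 45 but not the 3 x IQR cutoff at 60, so it is an ordinary outlier rather than an extreme one.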
Can you make a box plot in Google Sheets?
As of 2026, Google Sheets still ships without a dedicated box plot chart type. You can fake one through a candlestick chart, but the setup is fiddly. The cleaner route for most people is to work out the five-number summary in Sheets and then render the figure elsewhere, in Python, in R, or through a purpose-built option such as Figviz's AI Chart Generator, which takes care of the drawing for you.
When should I use a box plot instead of a bar chart?
Reach for a box plot whenever the spread, the shape, and the stragglers in your data matter as much as the average. It is the better tool for comparing groups where variation counts, for catching skew, and for surfacing outliers. Save the bar chart for plain categorical counts, or for moments when your audience just needs a clean mean-versus-mean comparison without statistical baggage.
How do I draw a box and whisker diagram for grouped data?
Build one box plot per group and set them along a common axis. Hand most software a grouping variable and it arranges this for you. Keep the axis scale identical across every group so the comparison stays fair, and give each group a distinct color or label so readers can tell them apart at a glance.
What does a skewed box plot look like?
When data skews right (positive skew), the median line hugs the lower edge of the box, the upper whisker runs longer than the lower one, and outliers gather toward the top. Flip all of that for a left-skewed plot: the median rides near the upper edge, the lower whisker stretches out, and the stray points show up at the bottom.
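The position of the median inside the box can also be turned into a number. One option, which goes slightly beyond what this guide covers, is the quartile (Bowley) skewness coefficient: (Q3 + Q1 - 2 x median) / (Q3 - Q1). The sketch below uses a hypothetical helper name and made-up quartiles for the skewed case.

```python
def quartile_skew(q1, med, q3):
    """Bowley's coefficient: 0 is symmetric, positive leans right."""
    if q3 == q1:
        raise ValueError("IQR is zero, so the coefficient is undefined")
    return (q3 + q1 - 2 * med) / (q3 - q1)


print(quartile_skew(20, 25, 30))  # 0.0  (median centered in the box)
print(quartile_skew(10, 12, 20))  # 0.6  (median hugs the lower edge)
```

The coefficient only looks at the box, not at the whiskers or outliers, so a result of 0 for the delivery data does not rule out a long upper tail.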
How many data points do I need for a meaningful box plot?
Technically five values are enough to draw one, but the plot only carries real statistical weight from about 15 to 20 observations upward. With a thin sample, each whisker and quartile rests on just a value or two, which dresses up a shaky picture as a stable one. Under ten observations, a dot plot or strip plot showing every reading serves you better.
Summary
More than fifty years on, the box and whisker diagram still earns its keep by squeezing a surprising volume of information into very little ink. Whether you are a student meeting quartiles for the first time, a researcher pitting treatment arms against each other, or an analyst combing through a large dataset, knowing how to build and read one pays off.
To recap the workflow, drawing a box and whisker diagram comes down to:
- Order the data from lowest to highest
- Work out the five-number summary: minimum, Q1, median, Q3, maximum
- Find the IQR and use it to set the fences that flag outliers
- Lay down the box from Q1 to Q3 with the median marked inside
- Run the whiskers to the most extreme readings that still sit within the fences
- Mark each outlier as its own point past the whisker ends
- Finish with labels, a title, and units so the chart explains itself
For more on assembling figures that actually communicate, browse our posts on data visualization best practices, designing scientific diagrams, and creating publication-quality figures.
Additional Resources
- Box Plot - Wikipedia
- A Complete Guide to Box Plots - Atlassian
- Box Plot Explained with Examples - Statistics By Jim
- Reading a Box and Whisker Plot - Simply Psychology
- Research Data Visualization Best Practices
- Scientific Color Palette Guide
- Best Fonts for Scientific Figures
Want polished box plots without touching a line of code? Try Figviz and turn your raw numbers straight into publication-ready charts.