ACARA v9 CONTENT DESCRIPTION: “choose appropriate forms of display or visualisation for a given type of data; justify selections and interpret displays for a given context”
Builds on: Comparing Data Distributions. This unit builds on the displays met across the strand, including box plots, and on reading graphs critically. Choosing and justifying a display is the bridge to planning a full statistical investigation in the final unit.
The display is part of the message
Once you have data, you have to decide how to show it, and that decision is not just decoration. A well-chosen graph makes the message leap out, while a poorly chosen one can hide it entirely or even suggest something false. The right choice depends on two things: what kind of data you have, and what question you want the display to answer. This unit is about matching the display to the data, justifying that choice, and reading the display correctly.
Classifying the data first
The first step is always to classify the data. Categorical data sorts items into named groups, like favourite sport or eye colour, where the categories have no numerical value. Some categorical data is ordinal, carrying a natural order such as small, medium and large, while nominal categories like colours have no order at all. Numerical data, by contrast, consists of actual numbers, either discrete counts such as the number of pets in a household, or continuous measurements such as height or time that can take any value in a range. Knowing which kind you have narrows the sensible choices straight away.
Categorical or numerical?
Classify the data first; the type narrows the sensible displays.
Categorical data falls into named groups and suits a bar or pie chart; numerical data is made of numbers and suits a histogram, dot plot, box plot, scatter plot or line graph.
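The classification step can be sketched in a few lines of Python. This is only a rough illustration, not a definitive rule: the function names `classify` and `sensible_displays` are made up for this example, and the sketch assumes that text values are categorical, whole numbers are discrete counts, and decimals are continuous measurements. Real data can break that assumption, for instance heights recorded to the nearest centimetre are still continuous.

```python
# Displays named in this unit for each broad data type.
DISPLAYS = {
    "categorical": ["bar chart", "pie chart"],
    "numerical": ["histogram", "dot plot", "box plot", "scatter plot", "line graph"],
}


def classify(values, ordered=False):
    """Return 'nominal', 'ordinal', 'discrete' or 'continuous'.

    Heuristic assumption: text is categorical, whole numbers are discrete,
    and any decimal value makes the data continuous. Whether categories are
    ordered cannot be detected from the values, so the caller must say so.
    """
    if all(isinstance(v, str) for v in values):
        return "ordinal" if ordered else "nominal"
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "discrete"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "continuous"
    raise ValueError("mixed text and numbers: clean the data before classifying")


def sensible_displays(values, ordered=False):
    """Return the data type and the displays this unit suggests for it."""
    kind = classify(values, ordered)
    broad = "categorical" if kind in ("nominal", "ordinal") else "numerical"
    return kind, DISPLAYS[broad]


print(sensible_displays(["soccer", "netball", "soccer"]))
print(sensible_displays(["small", "medium", "large"], ordered=True))
print(sensible_displays([0, 2, 1, 3]))
print(sensible_displays([152.5, 160.0, 171.2]))
```

The type only narrows the options. Picking one display from the shortlist still depends on the question being asked.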
Displays for categorical data
For categorical data, the workhorse display is the bar chart, where each category gets a bar whose height shows its frequency. Bars are drawn with gaps between them, a visual signal that the categories are separate and could be reordered without changing the meaning. When you want to show how categories make up a single whole, a pie chart can display each as a slice of the total, but only when the parts genuinely add to one hundred percent and there are not so many slices that they become unreadable.
Bar chart for categories
Each category gets a bar; the gaps signal that the categories are separate.
Each sport is a separate category, so the bars are drawn with gaps. The bars can be reordered without changing the meaning, because the categories have no numerical order.
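Behind every bar chart is a frequency table. The sketch below counts how often each category occurs and prints one text bar per category. The survey responses are invented for illustration, and `text_bar_chart` is a hypothetical helper, not part of any library.

```python
from collections import Counter

# Invented survey responses, for illustration only.
responses = ["soccer", "netball", "soccer", "cricket", "soccer", "netball", "tennis"]

# The frequency table: each category mapped to its count.
frequencies = Counter(responses)


def text_bar_chart(freq, order=None):
    """Return one text bar per category, in the order given.

    If no order is supplied, categories run from most to least frequent.
    """
    names = order if order is not None else [name for name, _ in freq.most_common()]
    width = max(len(name) for name in names)
    return [f"{name:<{width}} | {'#' * freq[name]} {freq[name]}" for name in names]


# Ordered by frequency.
for line in text_bar_chart(frequencies):
    print(line)

print()

# Reordered alphabetically: the bar lengths are unchanged, only their positions move.
for line in text_bar_chart(frequencies, order=sorted(frequencies)):
    print(line)
```

Printing the chart in two different orders shows why the bars are kept separate: moving a category does not change its count.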
Displays for numerical data
Numerical data calls for different tools. A histogram shows the distribution of numerical data grouped into intervals, and crucially its bars touch with no gaps, because the horizontal axis is a continuous number line rather than a set of separate labels. This is the key difference between a histogram and a bar chart: gaps mean categories, touching bars mean numbers. For small numerical data sets a dot plot or stem-and-leaf plot shows every individual value, and a box plot, met earlier, is ideal for comparing the centre, spread and outliers of distributions.
Bar versus histogram
The key distinction: gaps mean categories; touching bars mean numbers.
A bar chart shows categories with gaps between the bars, while a histogram shows grouped numbers with the bars touching on a continuous axis.
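The reason histogram bars touch becomes clear once the grouping is written out. In the sketch below, each interval ends exactly where the next begins, so no part of the number line falls between two bars. The heights are invented, and the starting value, interval width and number of intervals are choices made for this example only.

```python
def histogram_counts(values, start, width, bins):
    """Count the values falling in each of `bins` equal intervals.

    Interval i covers start + i*width up to (but not including)
    start + (i+1)*width, so neighbouring intervals share an edge.
    Values outside the overall range are ignored.
    """
    counts = [0] * bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < bins:
            counts[index] += 1
    return counts


# Invented heights in centimetres, for illustration only.
heights = [148.5, 152.0, 155.3, 158.9, 160.0, 161.2, 163.7, 165.0, 169.4, 172.8]

start, width, bins = 145, 5, 6
counts = histogram_counts(heights, start, width, bins)

for i, count in enumerate(counts):
    low = start + i * width
    high = low + width
    # The upper edge of one interval is the lower edge of the next.
    print(f"{low} to under {high}: {'#' * count} {count}")
```

A value of exactly 160.0 is counted in the interval that starts at 160, not the one that ends there. Any consistent rule works, as long as each value lands in exactly one interval.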
Showing relationships and change over time
Some questions are about relationships or change rather than single distributions. To explore whether two numerical variables are related, such as height and shoe size, a scatter plot places each pair as a point and reveals any association at a glance. To show how a quantity changes over time, a line graph connects successive points, and the connecting lines are appropriate precisely because time flows continuously between the readings. The connected line is what makes a trend visible.
Scatter plot for a relationship
Each point is one person; the rising cloud shows an association.
A scatter plot shows the relationship between two numerical variables. Here taller students tend to have larger shoe sizes, an upward association you can see at a glance.
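What the eye sees in a scatter plot can also be checked with a number. The sketch below computes Pearson's correlation coefficient, which goes a step beyond this unit and is included only as a cross-check: values near 1 match a rising cloud, values near -1 a falling cloud, and values near 0 no clear straight-line pattern. The heights and shoe sizes are invented, and `pearson_r` is a name chosen for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations: positive when the variables rise together.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented (height in cm, shoe size) pairs, one pair per student.
heights = [150, 155, 160, 165, 170, 175]
shoe_sizes = [5, 6, 6, 7, 8, 9]

r = pearson_r(heights, shoe_sizes)
direction = "upward" if r > 0 else "downward"
print(f"r = {r:.2f}, an {direction} association")
```

The number summarises the pattern but does not replace the plot, which still shows outliers and curved relationships that a single coefficient can hide.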
Line graph over time
Connecting the readings is fair because time flows continuously between them.
A line graph connects successive readings over time. The connecting lines are appropriate because time flows continuously between the months, making the trend easy to read.
Justifying the choice
Choosing a display is therefore an argument you should be able to justify. The justification rests on the data type and the purpose together: a trend over months calls for a line graph, the breakdown of a household budget into shares suits a pie chart, and comparing the popularity of five sports is a job for a bar chart. A good display answers its question clearly; a poor one forces the reader to work against the graph rather than with it.
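The idea that a choice rests on data type and purpose together can be written as a small lookup table. This is a sketch of the pairings described in this unit, not a complete decision procedure: the purpose labels and the function name `choose_display` are invented for the example.

```python
# Pairings of (data type, purpose) with a display, taken from this unit.
RULES = {
    ("categorical", "a comparison of categories"): "bar chart",
    ("categorical", "parts of a whole"): "pie chart",
    ("numerical", "a distribution"): "histogram",
    ("numerical", "a comparison of distributions"): "box plot",
    ("numerical", "a relationship between two variables"): "scatter plot",
    ("numerical", "change over time"): "line graph",
}


def choose_display(data_type, purpose):
    """Return (display, justification) for a data type and purpose.

    Raises ValueError when the pairing is not covered by the table,
    since a missing rule is a prompt to think, not to guess.
    """
    key = (data_type, purpose)
    if key not in RULES:
        raise ValueError(f"no rule for {data_type} data showing {purpose}")
    display = RULES[key]
    justification = (
        f"{display}: the data is {data_type} and the purpose is to show {purpose}"
    )
    return display, justification


# The three examples from the paragraph above.
print(choose_display("numerical", "change over time")[1])
print(choose_display("categorical", "parts of a whole")[1])
print(choose_display("categorical", "a comparison of categories")[1])
```

Each justification names both the data type and the purpose, which is the shape a written justification should take as well.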
Avoiding mismatches and interpreting displays
The cost of a mismatch is a misleading or unreadable picture. Joining categorical data with a line graph implies a trend through categories that have no order, suggesting a pattern that does not exist. A pie chart with fifteen near-equal slices, or one whose parts do not form a whole, communicates almost nothing. Using a histogram for categorical data, with its touching bars, wrongly signals that the categories are numerical.
Choosing the display is only half the skill; interpreting it is the other half. A finished graph should be read for its message: which category is largest, whether a trend rises or falls, how spread out the values are, and whether any points stand apart as outliers. It should also be read critically, by checking the axes and scale. Matching the display to the data, justifying that match, and interpreting what it shows is what turns a table of numbers into understanding.
Match and mismatch
A line suits a time trend; joining unordered categories invents a trend that is not there.
On the left, a line graph suits a trend over time. On the right, joining unordered categories with a line invents a trend that does not exist, a common mismatch to avoid.
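The mismatches described in this section can be collected into a simple checklist. The sketch below is one possible way to do that, under stated assumptions: the function name `check_display` is invented, and the limit of eight slices is an arbitrary cut-off chosen for the example, since the unit says only that fifteen near-equal slices are too many.

```python
def check_display(display, data_type, shares=None, max_slices=8):
    """Return a list of mismatch warnings; an empty list means none were found.

    data_type is 'nominal', 'ordinal', 'discrete' or 'continuous'.
    shares is an optional list of pie-slice percentages.
    max_slices=8 is an assumed readability limit, not a fixed rule.
    """
    problems = []
    if display == "line graph" and data_type == "nominal":
        problems.append("a line through unordered categories implies a trend that does not exist")
    if display == "histogram" and data_type in ("nominal", "ordinal"):
        problems.append("touching bars signal numerical data, but the data is categorical")
    if display == "pie chart" and shares is not None:
        # Allow a little slack for rounded percentages.
        if abs(sum(shares) - 100) > 0.5:
            problems.append("the slices do not add to 100 percent")
        if len(shares) > max_slices:
            problems.append("too many slices to read")
    return problems


print(check_display("line graph", "nominal"))
print(check_display("histogram", "nominal"))
print(check_display("pie chart", "nominal", shares=[40, 30, 20]))
print(check_display("bar chart", "nominal"))
```

An empty list does not prove the display is a good one. It only means none of these particular mismatches were detected.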
Quick self-check
1. 100 students name their favourite sport (a categorical variable). Which display best compares the categories?
2. You record the daily temperature over a month to show how it changes. Which display is most appropriate?
3. What is the key visual difference between a bar chart and a histogram?
4. To explore whether height and shoe size are related, which display should you use?
5. Which is a common MISMATCH that misleads the reader?