Data is important. More and more, big decisions are being made based on data, from doctors using genetic analysis to choose the right medicine for each patient to Facebook deciding what to show in your news feed. Indeed, making decisions based on huge data sets, an approach known as "Big Data", is one of the defining developments of the 21st century.
But there's no point in collecting all this data if you don't have a good way of looking at it. The way in which you present data can make all the difference in actually allowing people to draw conclusions from it.
Let's get you started on data visualisation, beginning with the basics: which graph to use for which kind of data.
Your choice of graph should reflect what you want to show. The 5 main motives for creating a graph are to:
When you have a clear goal, gather data to suit it and select a graph that is appropriate for the data and achieves your goal. Take care to avoid the traps that lead to a misleading graph.
Let's look at a selection of graphs suitable for each purpose.
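As a rough overview, the pairings discussed in this article can be collected into a simple lookup. This is only a sketch: the goal labels (`"highlight"`, `"compare"` and so on) and the function name `suggest_graphs` are made up for illustration, while the graph names are the ones used in the text.

```python
# Goal labels are paraphrased for illustration; graph names come from the article.
GRAPHS_BY_GOAL = {
    "highlight": ["pictograph", "icon chart", "circle graph", "doughnut chart"],
    "compare": ["bar graph", "circle graph", "doughnut chart",
                "stacked bar graph", "tree map"],
    "change": ["broken-line graph", "area graph", "map chart"],
    "organise": ["table", "list", "flowchart", "Venn diagram", "mind map"],
    "relationship": ["scatterplot", "histogram"],
}


def suggest_graphs(goal):
    """Return candidate graph types for a goal, or an empty list if unknown."""
    return GRAPHS_BY_GOAL.get(goal.strip().lower(), [])


print(suggest_graphs("change"))
```

A lookup like this only narrows the options. The final choice still depends on the data itself, as the sections that follow explain.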
When you want to highlight a single idea or statistic, pictographs, icon charts, circle graphs or doughnut charts are a good choice.
[Figures: pictograph, icon chart, doughnut chart]
A doughnut chart is similar to a circle graph, and some argue that it is easier to compare the lengths of arcs visually than the areas of sectors. It also leaves space in the middle, which can be used to display important information.
When you want to compare values across categories, or compare the shape of a data set, common choices include bar graphs, circle graphs, doughnut charts, stacked bar graphs and tree maps.
[Figures: paired bar graph, circle graph, stacked bar chart]
If you want to compare values or frequencies across a few categories, a bar graph with vertical columns, often called a column graph, is a good first choice. It is simple to understand and gives a good visual sense of how much the values differ. Here is an example:
Column graphs like this are easy to understand because you can quickly compare the heights of the columns. Here we can see that the differences between Australia, the US and Japan are quite minor compared with the huge gap to Chad, which has the world's lowest life expectancy.
Often when textbooks introduce graphs, they describe bar graphs as a horizontal version of a column graph. So why do we need them? Why can't we just use column graphs?
The answer is that bar graphs have a definite advantage when the category names are very long, or when there are lots of them:
As you can see, the names would be too long for a column graph, and a column graph with this many columns would be confusing. This graph is from The Economist's daily chart, which contains many examples of well-crafted graphs.
Circle graphs, segmented bar graphs and doughnut charts are used to show the proportional breakdown of data. These are only appropriate when you are comparing separate parts that make up 100% of the data set. To make these charts easy to read and understand, it is good practice to order the segments from greatest to least and to limit the chart to at most 7 categories.
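The two habits above, ordering segments from greatest to least and keeping to at most 7 categories, can be sketched in a few lines of Python. The function name `prepare_segments` and the survey counts are made up for illustration. Merging the smallest categories into a single "Other" segment is an assumption of this sketch, one common way of staying under the limit rather than a rule from the text.

```python
def prepare_segments(counts, max_categories=7):
    """Sort categories from greatest to least and convert them to percentages.

    counts: dict mapping category name -> non-negative count.
    If there are more than max_categories, the smallest ones are merged into
    one "Other" segment placed last (an assumption of this sketch).
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must add up to a positive total")
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    if len(ordered) > max_categories:
        kept = ordered[:max_categories - 1]
        other = sum(value for _, value in ordered[max_categories - 1:])
        ordered = kept + [("Other", other)]
    return [(name, round(100 * value / total, 1)) for name, value in ordered]


# Made-up survey counts, for illustration only.
votes = {"Walk": 12, "Bus": 30, "Car": 48, "Bike": 10}
for name, percent in prepare_segments(votes):
    print(f"{name}: {percent}%")
```

Because every segment is expressed as a share of the same total, the parts describe the whole data set, which is the condition for using a circle graph or doughnut chart in the first place.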
This set of circle graphs, for example, makes its point about the relative frequency of climate change denial amongst scientists and members of the public very clearly. Even with a lot of exact figures and long category names, this graph does not feel overwhelming, as it might if we tried to do the same thing with a bar graph.
To display change in data over time, popular choices are broken-line graphs and area graphs; to show how a variable changes across locations, we can use map charts.
[Figures: broken-line graph, area chart, choropleth map]
Bar graphs are not recommended for showing change over time, particularly when there are many data points or the changes are small. As you can see in the bar graph below, the format is not well suited to such a large number of columns. Because we are interested in the change from year to year, this would be easier to see if the tops of the columns were connected with lines. To show the trend clearly, the graph has also been truncated, which misleads a viewer comparing the sizes of the columns.
A broken-line graph, such as the one below, lets us see the data and its trends clearly, and we can plot and compare several data series at once.
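To see why the truncated bar graph mentioned above misleads, it helps to put numbers on it. The sketch below assumes that "truncated" means the vertical axis starts above zero. It compares how tall two columns are drawn relative to each other. The function name `apparent_ratio` and the values are made up for illustration.

```python
def apparent_ratio(value_a, value_b, axis_start=0.0):
    """Ratio of the drawn column heights when the axis starts at axis_start."""
    height_a = value_a - axis_start
    height_b = value_b - axis_start
    if height_a <= 0 or height_b <= 0:
        raise ValueError("both values must be above the start of the axis")
    return height_a / height_b


# Made-up values: 52 is only 4% larger than 50.
print(apparent_ratio(52, 50))       # axis starts at 0
print(apparent_ratio(52, 50, 48))   # axis truncated to start at 48
```

With the axis starting at 0, the first column is drawn 1.04 times as tall as the second, matching the real difference. With the axis starting at 48, it is drawn twice as tall, even though the data has not changed.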
Tables, lists, flowcharts, Venn diagrams and mind-maps are popular ways to summarise and organise data to show how the data can be grouped or interrelated.
Venn diagram: showing groupings of the data.
Mind map: showing how concepts within a topic are related.
Flowchart: showing steps in a process or hierarchy.
Often we wish to further analyse data to find relationships between different variables or distributions of a variable. Scatterplots are commonly used to identify the relationship between two variables and histograms are useful for displaying the distribution of a continuous variable.
[Figures: scatterplot, histogram]
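A histogram shows the distribution of a continuous variable by grouping values into intervals of equal width and counting how many fall into each. Here is a minimal sketch of that counting step, using a hypothetical `histogram_counts` function and made-up heights. It prints a rough text version of the histogram rather than drawing one.

```python
def histogram_counts(values, bin_width, start):
    """Count the values in each interval [lower, lower + bin_width).

    Returns a dict mapping each interval's lower edge to its count.
    Intervals that contain no values are left out.
    """
    counts = {}
    for value in values:
        if value < start:
            raise ValueError("value lies below the first interval")
        index = int((value - start) // bin_width)
        lower = start + index * bin_width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


# Made-up heights in centimetres, for illustration only.
heights = [151, 158, 162, 163, 165, 167, 171, 174, 178, 183]
for lower, count in histogram_counts(heights, bin_width=10, start=150).items():
    print(f"{lower}-{lower + 10}: {'#' * count}")
```

The choice of interval width is a judgment call. Very narrow intervals make the shape look noisy, while very wide ones hide it.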
What would be the best choice of graph for:
Visual essays and articles rich with graphics are a popular way to engage readers. Find some articles on a topic that interests you that contain statistics and information presented in graphs.
Here are a few suggested articles with strong use of data visualisation:
The Pudding is a site that offers visual essays on a range of topics, including dialogue in movies broken down by age and gender, the vocabulary of rappers, and statistics on captive whales and dolphins.
Statistics Canada also strives to create articles with interactive graphics on topics including earnings and mobility and the use of protective equipment during COVID-19.
Analyse the graphs and statistics shown in your chosen articles and answer the following questions:
Now that you have new powers to use data to make a point, you must use them for good, not evil. Graphs are powerful visual tools that can convince people that "the facts" support what you are saying, so they can easily be used to mislead. Make sure you use your graph-creating powers to help show people what the data actually says, rather than what you would like it to say!