
11.01 Venn diagrams and two-way tables

Lesson

Concept summary

There are two important types of visual models we can use when dealing with bivariate data.

Venn diagram

A diagram which shows all possible logical relations between two or more sets.

Each circle in a Venn diagram represents a particular set or category.

The overlapping region in the middle, labeled b, represents elements that belong to both sets.

The region labeled a represents elements that belong only to the first set and not to the second set. Similarly, the region labeled c represents elements that belong only to the second set and not to the first set.

The region labeled d represents all elements that are included in the data but do not belong to either set.
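
The four regions can also be written as set operations. The following is a minimal sketch in Python; the sets `universe`, `first`, and `second` and their members are made-up illustration data, not part of the lesson.

```python
# Hypothetical example data: two small sets inside a universal set.
universe = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
first = {1, 2, 3, 4, 5}
second = {4, 5, 6, 7}

region_a = first - second               # only in the first set
region_b = first & second               # in both sets (the overlap)
region_c = second - first               # only in the second set
region_d = universe - (first | second)  # in neither set

print(sorted(region_a), sorted(region_b), sorted(region_c), sorted(region_d))

# The four regions never overlap, and together they cover every element.
region_sizes = [len(region_a), len(region_b), len(region_c), len(region_d)]
print(sum(region_sizes) == len(universe))
```

Because the regions do not overlap, their sizes add up to the size of the whole data set, which is the fact used in the worked examples.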

Two-way frequency table

A data display that can be used to summarize bivariate categorical data. Also known as a joint frequency table.

A two-way frequency table also displays how elements are distributed across two sets:

| | Set A | Not Set A | Total |
|---|---|---|---|
| Set B | 5 | 10 | 15 |
| Not Set B | 12 | 13 | 25 |
| Total | 17 | 23 | 40 |

Each row sums to give the totals on the right, and each column sums to give the totals along the bottom.
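
As a quick check of this property, here is a small Python sketch that recomputes the totals from the four inner counts in the table above. The variable names are chosen for illustration only.

```python
# Joint frequencies from the table above.
# Rows: Set B, Not Set B. Columns: Set A, Not Set A.
counts = [
    [5, 10],   # Set B
    [12, 13],  # Not Set B
]

row_totals = [sum(row) for row in counts]
column_totals = [sum(column) for column in zip(*counts)]
grand_total = sum(row_totals)

print(row_totals)     # [15, 25]
print(column_totals)  # [17, 23]
print(grand_total)    # 40
```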

Worked examples

Example 1

One hundred students in a school are asked about the subjects that they study. 58 of them are studying both math and science, 12 are studying math but not science, and 23 are not studying either math or science.

a

Represent the information in a Venn diagram.

Approach

There are two categories of data here: students who study math, and students who study science. So we can use a Venn diagram that looks like the following:

Solution

We know that 58 students study both math and science. So we can put this number in the center section, where the two circles overlap:

We also know that 12 students are studying math but not science. These students belong in the left circle but not the right circle:

Lastly, we have also been told that 23 students are not studying either of these subjects. This number should be in the rectangle that represents the whole set, but not in either of the circles:

b

Determine how many students are studying science in total.

Approach

We can use the fact that there are 100 students in the data set in total to find how many study science.

Solution

The total number of students will be equal to the number who study science at all, plus the number who do not study science.

The number who do not study science will be equal to the number who study math but not science, which is 12, plus the number who do not study either, which is 23.

If we let the number of students who study science be x, then we can put all of this together to get x + 23 + 12 = 100.

Solving this gives x = 65, and so there are 65 students in total who study science.

Reflection

Looking at the Venn diagram we have created, the students who are studying science will be those in the circle on the right. This includes both the 58 students in the overlap who study both math and science, as well as the unknown number who study science but not math.

So we could also have solved this by first finding the number of students who study science but not math, and then adding that amount to 58.
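
Both approaches can be checked with a short calculation. The sketch below uses only the numbers given in the question; the variable names are illustrative.

```python
total_students = 100
both = 58        # study math and science
math_only = 12   # study math but not science
neither = 23     # study neither subject

# First approach: subtract everyone who does not study science.
not_science = math_only + neither
science = total_students - not_science

# Second approach: find the science-only region, then add the overlap.
science_only = total_students - both - math_only - neither
science_alternative = science_only + both

print(science, science_only, science_alternative)
```

The two methods agree, and the second one also fills in the last missing region of the Venn diagram.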

Example 2

The two-way frequency table shows the number of people at Chili Fest, grouped by the kind of chili they bought and whether or not they added extra spice.

| | Added spice | Did not add spice | Total |
|---|---|---|---|
| Meat | 331 | 401 | 732 |
| Vegetarian | 125 | 43 | 168 |
| Total | 456 | 444 | |

a

Describe the group of people which has exactly 401 people in it.

Approach

We should find 401 in the table and look up and across to identify the corresponding headings.

Solution

The number 401 lies in the row for people who chose meat and in the column for people who did not add spice. This means that there were 401 people who chose meat chili and did not add spice.
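
The same lookup can be sketched in Python by storing each inner count under its row and column headings. The helper `find_group` is a hypothetical name introduced here for illustration.

```python
# Chili Fest counts keyed by (chili type, spice choice).
counts = {
    ("Meat", "Added spice"): 331,
    ("Meat", "Did not add spice"): 401,
    ("Vegetarian", "Added spice"): 125,
    ("Vegetarian", "Did not add spice"): 43,
}

def find_group(table, value):
    """Return every (row, column) heading pair whose count equals value."""
    return [labels for labels, count in table.items() if count == value]

matches = find_group(counts, 401)
print(matches)
```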

b

Calculate the total number of people who ate chili at Chili Fest.

Approach

We can find this total in a variety of ways:

  • Add the four numbers in the middle
  • Add the total numbers for each type of chili (the total column on the right)
  • Add the total numbers for each type of spice level (the total row along the bottom)

Solution

Let's add the numbers in the total column on the right:

\text{Meat}+\text{Vegetarian}=732+168=900

So there were 900 people who ate chili at Chili Fest.

Reflection

We can check this by adding up the four numbers in the middle, which represent the four specific combinations: meat and spice, meat and no spice, vegetarian and spice, and vegetarian and no spice:

331+401+125+43=900
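
All three ways of finding the total should agree. The following Python sketch confirms this using the values from the Chili Fest table; the variable names are illustrative.

```python
# Inner counts. Rows: Meat, Vegetarian. Columns: added spice, no spice.
inner = [
    [331, 401],
    [125, 43],
]
row_totals = [732, 168]     # total column on the right
column_totals = [456, 444]  # total row along the bottom

from_inner = sum(sum(row) for row in inner)
from_rows = sum(row_totals)
from_columns = sum(column_totals)

print(from_inner, from_rows, from_columns)
```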

Example 3

A scientist recorded some data on the flowering pattern and type of soil for a variety of coreopsis plants:

  • 1000 plants were observed
  • 136 plants did not flower
  • 394 plants that did flower were planted in peat soil
  • 30 plants that were planted in sandy soil did not flower

The data can be organized into a two-way frequency table:

| | Flowered | Did not flower | Total |
|---|---|---|---|
| Peat soil | | | |
| Sandy soil | | | |
| Total | | | |

a

Complete the table based on the given information about coreopsis plants.

Approach

We can first fill in the given information, and then use the sums of rows and columns to find missing values.

Solution

Filling in what we already know, we have:

| | Flowered | Did not flower | Total |
|---|---|---|---|
| Peat soil | 394 | | |
| Sandy soil | | 30 | |
| Total | | 136 | 1000 |

Based on this information, we can now calculate:

  • The total number that flowered: 1000-136=864
  • The number that did not flower and were planted in peat soil: 136-30=106

| | Flowered | Did not flower | Total |
|---|---|---|---|
| Peat soil | 394 | 106 | |
| Sandy soil | | 30 | |
| Total | 864 | 136 | 1000 |

With these newly calculated values, we can now find the remaining unknown values:

  • Total number planted in peat soil: 394+106=500
  • Then the total number planted in sandy soil must be: 1000-500=500
  • The number that flowered and were planted in sandy soil: 864-394=470

| | Flowered | Did not flower | Total |
|---|---|---|---|
| Peat soil | 394 | 106 | 500 |
| Sandy soil | 470 | 30 | 500 |
| Total | 864 | 136 | 1000 |

Reflection

We can always check by adding across each row and down each column to ensure that the totals add up.
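
The steps in this solution, together with the final check, can be sketched in Python. Only the four given values are entered directly; every other entry is derived from them. The variable names are illustrative.

```python
# Given information.
total = 1000
not_flowered_total = 136
peat_flowered = 394
sandy_not_flowered = 30

# Derived entries, in the same order as the solution.
flowered_total = total - not_flowered_total
peat_not_flowered = not_flowered_total - sandy_not_flowered
peat_total = peat_flowered + peat_not_flowered
sandy_total = total - peat_total
sandy_flowered = flowered_total - peat_flowered

# Check: the sandy soil row and both total lines should add up.
row_check = sandy_flowered + sandy_not_flowered == sandy_total
margin_check = peat_total + sandy_total == flowered_total + not_flowered_total == total

print(flowered_total, peat_not_flowered, peat_total, sandy_total, sandy_flowered)
print(row_check, margin_check)
```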

b

Keshawn said that because there were a total of 1000 plants observed and 30 plants that were planted in sandy soil did not flower, 1000-30=970 plants must have been planted in peat soil and flowered. Identify and correct his error.

Solution

Keshawn seems to have a misconception around how the frequencies relate to one another. He has stated that

\text{sandy and not flowered}+\text{peat and flowered}=\text{Total}

while it should be

\text{sandy and flowered}+\text{sandy and not flowered}+\text{peat and flowered}+\text{peat and not flowered}=\text{Total}

We need to consider all possible cases, not just opposite cases.

He should have said that because there were a total of 1000 plants observed and 30 plants that were planted in sandy soil did not flower, the remaining 1000-30=970 plants were either planted in peat soil (whether or not they flowered) or planted in sandy soil and did flower.
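
A short Python sketch makes the difference concrete, using the four inner counts from the completed table in part a. The variable names are illustrative.

```python
total = 1000
peat_flowered = 394
peat_not_flowered = 106
sandy_flowered = 470
sandy_not_flowered = 30

# Keshawn's subtraction removes only one of the four groups.
remaining = total - sandy_not_flowered

# What the remaining plants actually consist of: the other three groups.
other_three_groups = peat_flowered + peat_not_flowered + sandy_flowered

print(remaining, other_three_groups, peat_flowered)
```

The value 970 matches the sum of the other three groups, not the 394 plants that were planted in peat soil and flowered.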

Outcomes

G.S.CP.A.1.B

Flexibly move between visual models (Venn diagrams, frequency tables, etc.) and set notation.

G.MP1

Make sense of problems and persevere in solving them.

G.MP3

Construct viable arguments and critique the reasoning of others.

G.MP5

Use appropriate tools strategically.

G.MP6

Attend to precision.
