When thinking about what a Discrete Random Variable (or DRV for short) actually is, the name itself tells you most of what you need to know.
But before we get into explaining each of the words (discrete, random and variable), we'll start with an example which will make it very clear what we're talking about when we say DRV.
Let's say there's a cat who's about to give birth to three kittens. Before they are born we know each will either be male or female. The exact combination of male and female kittens is unknown before they are born. We can instead consider all possible outcomes of the birth. An easy way to do this is to represent the possibilities with the following tree diagram.
What we've done so far is create a simple sample space. You've done this heaps of times before. What we need to do now, though, is choose something to focus on in this situation. There are only two things to focus on here: either the number of female kittens or the number of male kittens.
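As a rough sketch, the same sample space can be listed in a few lines of Python. The labels `"M"` and `"F"` and the name `sample_space` are just illustrative choices, not part of the lesson itself.

```python
from itertools import product

# Each kitten is either male ("M") or female ("F"), and there are three kittens,
# so every outcome is a string of three letters such as "MFF".
sample_space = ["".join(outcome) for outcome in product("MF", repeat=3)]

print(len(sample_space))  # number of branches at the end of the tree
print(sample_space)
```

Each string corresponds to one path through the tree, read from the first kitten to the third.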
Let's focus on the number of female kittens.
How many female kittens will we see once all the kittens are born? Either $0$, $1$, $2$ or $3$.
So the number of female kittens will vary and will occur at random.
When a quantity varies, we can define a variable. In this case let's define $X$ as the number of female kittens born.
$x$ can take the values $0$, $1$, $2$ and $3$, where $x$ represents a possible value of the random variable $X$.
Each of these values of $X$ has a particular chance or probability of occurring. We can use our tree diagram and a table to summarise these probabilities.
$x$ | $0$ | $1$ | $2$ | $3$ |
---|---|---|---|---|
$P(X=x)$ | $\frac{1}{8}$ | $\frac{3}{8}$ | $\frac{3}{8}$ | $\frac{1}{8}$ |
What we've created here is a discrete probability distribution and represented it with an individual probability table.
$x$ represents an individual value that the random variable $X$ can take.
$P(X=x)$ represents the probability that the random variable $X$ takes the value $x$. Put more simply, it's the probability of each of the outcomes occurring.
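To see where these probabilities come from, here is a minimal Python sketch that counts the female kittens in every outcome. It assumes, as the table of eighths implies, that all eight outcomes are equally likely; the names `outcomes`, `counts` and `pmf` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 8 possible litters of three kittens, e.g. ("M", "F", "F").
outcomes = list(product("MF", repeat=3))

# For each outcome, count the females; then tally how often each count appears.
counts = Counter(outcome.count("F") for outcome in outcomes)

# Assumption: every outcome is equally likely, so P(X = x) = (tally for x) / 8.
pmf = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x, p in pmf.items():
    print(f"P(X={x}) = {p}")
```

Using `Fraction` keeps the probabilities exact, so they can be checked to add up to $1$.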
We could also represent this information with a cumulative probability table. That is, $P\left(X\le x\right)$ represents the probability of the outcome being less than or equal to $x$.
$x$ | $0$ | $1$ | $2$ | $3$ |
---|---|---|---|---|
$P\left(X\le x\right)$ | $\frac{1}{8}$ | $\frac{4}{8}$ | $\frac{7}{8}$ | $\frac{8}{8}$ |
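The cumulative values are just running totals of the individual probabilities. A short Python sketch of that idea follows, with the individual probabilities typed in from the kitten example; the names `pmf` and `cdf` are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

# Individual probabilities P(X = x) for the number of female kittens.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# Running total of the probabilities gives P(X <= x) for each x in order.
cdf = dict(zip(pmf, accumulate(pmf.values())))

for x, p in cdf.items():
    print(f"P(X<={x}) = {p}")
```

Note that `Fraction` simplifies automatically, so $\frac{4}{8}$ is shown as `1/2` and $\frac{8}{8}$ as `1`. The last cumulative value is always $1$, because $X$ is certain to be at most its largest possible value.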
Now that we've taken a look at an example, let's use it to really understand what a DRV is.
The masses of babies born in a local hospital in the last month have been recorded.
One midwife is interested in using this data to help predict the mass of the next baby to be born.
Can this situation be modelled by a discrete random variable?
Yes
No
What is the reason why this cannot be represented by a discrete random variable?
The outcomes are continuous.
The outcomes are categorical.
On average, the number of green snakes in each packet of snakes sold is $5$.
Can this data be represented by a discrete random variable?
Yes
No