9.02 Probability density functions

Lesson

Probability density function (PDF)

We recall that discrete probability distributions are represented graphically by a probability mass function, $f(x)=P\left(X=x\right)$. A probability mass function looks like a bar graph. The value of $f\left(x\right)$ gives the probability of the random variable having the outcome $x$. The sum of all of the probabilities must equal $1$. The two graphs below are examples of probability mass functions.

A probability density function (PDF) is used to describe the probability distribution of a continuous random variable. As we saw at the end of the last lesson, this function models the limiting case of a histogram as the amount of data increases and the class interval size decreases.

Properties of probability density functions

A probability density function, $f(x)$, must satisfy the following two properties:

  • $f(x)\ge0$ for all $x$ (since probabilities cannot be negative).
  • $\int_{-\infty}^{+\infty}\ f(x)\ dx=1$ (this is because the total probability must be $1$).

Note: Often a probability density function is non-zero only on an interval $[a,b]$ and is defined as $0$ elsewhere. The second property above then becomes $\int_a^b\ f(x)\ dx=1$.
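The two properties can be checked numerically for a candidate function. The following is a minimal sketch rather than a rigorous test: the helper names `integrate` and `looks_like_pdf` are hypothetical, the integral is approximated with the midpoint rule, and non-negativity is only checked at sample points on an assumed interval $[a,b]$ outside which the function is taken to be $0$.

```python
def integrate(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def looks_like_pdf(f, a, b, tol=1e-6, samples=1_000):
    """Check f >= 0 at sample points on [a, b] and that the area is close to 1."""
    non_negative = all(f(a + (b - a) * i / samples) >= 0 for i in range(samples + 1))
    total_area = integrate(f, a, b)
    return non_negative and abs(total_area - 1) < tol


# f(x) = 3x^2 on [0, 1] has area 1, so it passes both checks.
print(looks_like_pdf(lambda x: 3 * x**2, 0, 1))  # True
# f(x) = x^2 on [0, 1] has area 1/3, so it fails the second property.
print(looks_like_pdf(lambda x: x**2, 0, 1))      # False
```

A sampled check like this can miss a narrow region where the function dips below zero, so it complements an algebraic argument and does not replace it.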

 

 

Finding the probability for continuous probability distributions

Given there is a continuum of possibilities, the probability that a continuous random variable takes on a particular value is zero. For example, if we wanted to find the probability that the outcome of a continuous random variable is a specific value, say $2$, then $P(X=2)=0$. Instead, we can use a continuous probability distribution to find the probability that a random variable has an outcome in a particular interval.

This probability is the area under the graph of the probability density function over the interval. To find the area, we integrate $f(x)$ between $a$ and $b$, where $f(x)$ is the probability density function, and $a$ and $b$ are outcomes of the random variable within the given domain. We write this as:

$\int_a^b\ f(x)\ dx=P(a\le X\le b)$

Note: Since $P(X=a)=0$ and $P(X=b)=0$, the inclusion or not of the endpoints does not alter the probability. Thus, $P(a<X<b)=P(a\le X\le b)$.
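To see why a single value carries zero probability, it can help to compute the probability of an ever narrower interval around a point. The sketch below uses a hypothetical density, $f(x)=\frac{x}{2}$ on $[0,2]$, chosen only for illustration, and a midpoint-rule helper named `integrate` that is likewise an assumption of this example.

```python
def integrate(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def density(x):
    """Hypothetical PDF for illustration: f(x) = x/2 on [0, 2], 0 elsewhere."""
    return x / 2 if 0 <= x <= 2 else 0.0


# P(0.5 <= X <= 1.5) is the area under the density between 0.5 and 1.5.
print(round(integrate(density, 0.5, 1.5), 6))  # 0.5

# Shrinking an interval centred on x = 1: the probability is about width / 2,
# so it tends to 0 as the interval collapses to the single point.
for width in (0.1, 0.01, 0.001):
    probability = integrate(density, 1 - width / 2, 1 + width / 2)
    print(width, round(probability, 6))
```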

Worked example

Example 1

For the probability density function, find:

(a) $P(0\le X\le2)$

Think: To find the area when $x$ is between $0$ and $2$, we shade the corresponding area under the function and find its area by finding the area of a rectangle.

Do: The rectangle has dimensions $2$ by $\frac{1}{20}$, therefore:

$\text{Area}=2\times\frac{1}{20}=0.1$

And:

$P\left(0\le X\le2\right)=0.1$

 

(b) $P\left(4\le X\le15\right)$

Do: The relevant area is shaded below:

The area of a rectangle with dimensions $15-4$ by $\frac{1}{20}$ is given by:

$\text{Area}=\left(15-4\right)\times\frac{1}{20}=0.55$

Therefore:

$P\left(4\le X\le15\right)=0.55$
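The rectangle calculations in Example 1 can be reproduced with exact fractions. This is a sketch under a stated assumption: the graph is not reproduced here, so the support $[0,20]$ is inferred from the height $\frac{1}{20}$ (the width must be $20$ for the total area to be $1$), and the function name `uniform_probability` is hypothetical.

```python
from fractions import Fraction

HEIGHT = Fraction(1, 20)  # height of the rectangle used in the worked solution
LOW, HIGH = 0, 20         # ASSUMED support, chosen so that the total area is 1


def uniform_probability(a, b):
    """P(a <= X <= b) as width times height, clipped to the assumed support."""
    left, right = max(a, LOW), min(b, HIGH)
    return max(right - left, 0) * HEIGHT


print(uniform_probability(0, 2))   # 1/10
print(uniform_probability(4, 15))  # 11/20
```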

Example 2

Let $X$ be a continuous random variable whose probability density function is $f(x)=3x^2$ on the interval $0\le x\le1$. What is $P(0.5\le X\le1)$?

Think: To find the probability we need to find the area under the curve, which involves finding the integral of $f(x)$ between $0.5$ and $1$.

Do:

$P\left(0.5\le X\le1\right)$ $=$ $\int_{0.5}^1 3x^2\ dx$
  $=$ $\left[x^3\right]_{0.5}^1$
  $=$ $1^3-0.5^3$
  $=$ $\frac{7}{8}$

Thus, $P\left(0.5\le X\le1\right)=\frac{7}{8}$.
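The result of Example 2 can be confirmed with exact arithmetic. The sketch below evaluates the antiderivative $x^3$ of $f(x)=3x^2$ at the two endpoints; the function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def antiderivative(x):
    """An antiderivative of f(x) = 3x^2, namely x^3."""
    return x**3


def probability(a, b):
    """P(a <= X <= b) for 0 <= a <= b <= 1, by the fundamental theorem of calculus."""
    return antiderivative(b) - antiderivative(a)


p = probability(Fraction(1, 2), Fraction(1))
print(p)         # 7/8
print(float(p))  # 0.875
```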

Practice questions

Question 1

Consider the probability density function $p\left(x\right)$ drawn below for a random variable $X$.


  1. Calculate the area between $p(x)$ and the $x$-axis, without using integration. Show your working.

  2. Which features of $p\left(x\right)$ are also features of all continuous probability distribution functions? Select all options that apply.

    A. $p\left(x\right)$ is zero on both ends of the distribution.

    B. The area under $p\left(x\right)$ is a triangle.

    C. $p\left(x\right)$ is $0$ outside the region $0\le x\le5$.

    D. The area under $p\left(x\right)$ is equal to $1$.

  3. Calculate $P\left(X<3\right)$ using geometric reasoning.

  4. Calculate $P\left(X>3\mid X\le4\right)$ using geometric reasoning.

Question 2

A continuous random variable $X$ has probability density function $f\left(x\right)=\frac{6}{x^2}$ for $x$ in $\left[3,6\right]$ and $f\left(x\right)=0$ otherwise.

  1. Confirm that $\int_3^6\frac{6}{x^2}\ dx=1$.

  2. Determine $P\left(X<4\right)$.

  3. Determine $P\left(X>5\right)$.

  4. Determine $P\left(X<4\ \mid\ X>3.5\right)$.

Question 3

The probability density function of a random variable $X$, defined by $p\left(x\right)=k\sin2x$ for $0\le x\le\frac{\pi}{6}$ and $p\left(x\right)=0$ elsewhere, is drawn below. Note that $k$ is a constant.


  1. Use calculus techniques to determine the value for $k$.

  2. Calculate $P\left(X<\frac{\pi}{8}\right)$.

  3. Calculate $P\left(X\le\frac{7\pi}{48}\mid X\ge\frac{\pi}{24}\right)$. Give your answer correct to two decimal places.

 

Cumulative probability distribution function (CDF)

The cumulative distribution function (CDF) provides a general formula for finding the probabilities of continuous distribution functions without needing to integrate repeatedly. We have explored this idea in our section on the fundamental theorem of calculus.

The CDF gives us the probability of a random variable being less than or equal to a given cutoff.

We can use the CDF to find probabilities and measures of location such as the median and other quantiles.

 

Cumulative distribution function (CDF)

For a continuous random variable $X$, the cumulative distribution function (CDF) is:

$F(x)=P\left(X\le x\right)$ for all $x$.

This means that:

$F(x)=\int_{-\infty}^x\ f(t)\ dt$ where $f\left(t\right)$ is the probability density function.

Note: if $f\left(t\right)>0$ on an interval $[a,b]$ and $0$ elsewhere, then for $a\le x\le b$ we have $F(x)=\int_{-\infty}^x\ f(t)\ dt=\int_a^x\ f(t)\ dt$.

An identity that may be useful for piecewise functions is:

$\int_a^b\ f(x)\ dx=\int_a^c\ f(x)\ dx+\int_c^b\ f(x)\ dx$

where $x=c$ is the boundary value where $f(x)$ changes from one sub-function to another.

Worked examples

Example 3

A continuous probability density function is given by $f\left(x\right)=\frac{4x^3}{255}$ on the domain $[1,4]$, with $f(x)=0$ for all other $x$.

(a) Find the cumulative distribution function.

Think: The CDF is found by integrating $f\left(x\right)$.

Do:

$F\left(x\right)$ $=$ $\int_1^x\frac{4t^3}{255}\ dt$
  $=$ $\frac{1}{255}\int_1^x4t^3\ dt$
  $=$ $\frac{1}{255}\left[t^4\right]_1^x$
  $=$ $\frac{x^4-1}{255}$

 

So the cumulative distribution function is:

$F\left(x\right)=\begin{cases}0, & x<1\\ \frac{x^4-1}{255}, & 1\le x\le4\\ 1, & x>4\end{cases}$

 

(b) Use the CDF to find $P\left(X\le3\right)$.

Think: $P\left(X\le3\right)$ is the area under the function to the left of $x=3$.

Do: Using $F\left(x\right)$, we substitute $x=3$:

$P\left(X\le3\right)$ $=$ $F(3)$
  $=$ $\frac{3^4-1}{255}$
  $=$ $\frac{81-1}{255}$
  $=$ $\frac{80}{255}$
  $=$ $\frac{16}{51}$

Therefore, $P\left(X\le3\right)=\frac{16}{51}$.

(c) Use the CDF to find $P\left(1.5\le X\le3.1\right)$.

Think: The area under the curve that we are interested in is found by calculating the integral between $x=1.5$ and $x=3.1$, or simply by finding $F(3.1)-F(1.5)$ using the CDF.

Do:

$P(1.5\le X\le3.1)$ $=$ $F(3.1)-F(1.5)$
  $=$ $\frac{3.1^4-1}{255}-\frac{1.5^4-1}{255}$
  $\approx$ $0.3582-0.0159$
  $=$ $0.342$ (to $3$ d.p.)
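
Once the CDF from Example 3 is written as a function, each probability becomes a substitution. The sketch below is one possible way to encode it; the name `cdf` and the use of exact fractions are choices made for this illustration.

```python
from fractions import Fraction


def cdf(x):
    """CDF from Example 3: 0 below 1, (x^4 - 1)/255 on [1, 4], 1 above 4."""
    if x < 1:
        return Fraction(0)
    if x > 4:
        return Fraction(1)
    return (Fraction(x) ** 4 - 1) / 255


# Part (b): P(X <= 3) = F(3)
print(cdf(3))  # 16/51

# Part (c): P(1.5 <= X <= 3.1) = F(3.1) - F(1.5)
difference = cdf(Fraction(31, 10)) - cdf(Fraction(3, 2))
print(round(float(difference), 3))  # 0.342
```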
Example 4

A probability density function is defined piecewise by:

$f\left(x\right)=\begin{cases}k\left(5+x\right), & -3\le x\le0\\ k\left(5-x\right), & 0<x\le3\\ 0, & \text{elsewhere}\end{cases}$

 

(a) Find the value of the constant $k$, and hence write the equation of $f\left(x\right)$.

Think: The integral of $f(x)$ over the domain $[-3,3]$ must be $1$ because it is a probability density function. We can integrate the piecewise function by integrating the separate pieces over their respective domains. Then solve for $k$ by equating the integral to $1$.

Do:

$\int_{-3}^3\ f(x)\ dx$ $=$ $\int_{-3}^0k\left(5+x\right)dx+\int_0^3k\left(5-x\right)dx$
  $=$ $k\left[5x+\frac{x^2}{2}\right]_{-3}^0+k\left[5x-\frac{x^2}{2}\right]_0^3$
  $=$ $\frac{21}{2}k+\frac{21}{2}k$
  $=$ $21k$

For the integral to be $1$, the value of $k$ must be $\frac{1}{21}$.

The function is therefore:

$f\left(x\right)=\begin{cases}\frac{1}{21}(5+x), & -3\le x\le0\\ \frac{1}{21}(5-x), & 0<x\le3\\ 0, & \text{otherwise}\end{cases}$

 

(b) Find the cumulative distribution function, $F\left(x\right)$, for the probability density function given.

Think: Just as the probability density function is split into two, to find the cumulative distribution function we will find the function over each interval and then combine.

Do:

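For $-3\le x\le0$:

$F\left(x\right)$ $=$ $\int_{-3}^x\frac{1}{21}\left(5+t\right)dt$
  $=$ $\frac{1}{21}\left[5t+\frac{t^2}{2}\right]_{-3}^x$
  $=$ $\frac{1}{21}\left(5x+\frac{x^2}{2}+\frac{21}{2}\right)$

For $0<x\le3$, $F(x)$ gives the area under the curve up to $x$, so we require the area up to $x=0$ plus the area under the second piece from $0$ to $x$. Since $F\left(0\right)=\frac{1}{2}$ (from above), we have:

$F\left(x\right)$ $=$ $\frac{1}{2}+\int_0^x\frac{1}{21}\left(5-t\right)dt$
  $=$ $\frac{1}{2}+\frac{1}{21}\left[5t-\frac{t^2}{2}\right]_0^x$
  $=$ $\frac{1}{2}+\frac{1}{21}\left(5x-\frac{x^2}{2}\right)$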

Hence, the cumulative distribution function is:

$F(x)=\begin{cases}0, & x<-3\\ \frac{1}{21}\left(5x+\frac{x^2}{2}+\frac{21}{2}\right), & -3\le x\le0\\ \frac{1}{2}+\frac{1}{21}\left(5x-\frac{x^2}{2}\right), & 0<x\le3\\ 1, & x>3\end{cases}$
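
Both parts of Example 4 can be cross-checked numerically. The sketch below integrates each piece of the density over its own interval to recover $k$, then evaluates the piecewise CDF at the boundary points. The helper names `integrate`, `shape` and `cdf` are hypothetical, and the midpoint rule is an assumed choice of numerical method.

```python
def integrate(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def shape(x):
    """Example 4's density without the constant k."""
    if -3 <= x <= 0:
        return 5 + x
    if 0 < x <= 3:
        return 5 - x
    return 0.0


# Part (a): integrate each piece separately, splitting at x = 0.
area = integrate(shape, -3, 0) + integrate(shape, 0, 3)
k = 1 / area
print(round(area, 6), round(k, 6))  # 21.0 0.047619


def cdf(x):
    """Part (b): the piecewise cumulative distribution function."""
    if x < -3:
        return 0.0
    if x <= 0:
        return (5 * x + x**2 / 2 + 21 / 2) / 21
    if x <= 3:
        return 1 / 2 + (5 * x - x**2 / 2) / 21
    return 1.0


# The CDF should start at 0, reach 1/2 at the join, and end at 1.
print(cdf(-3), cdf(0), cdf(3))  # 0.0 0.5 1.0
```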

 

Finding quantiles

Using the probability density function or the cumulative distribution function, we can solve for the value of $x$ for which the chance of an outcome lying below this value is a given probability. For example, given the distribution of the lifespan of a certain battery, we could calculate the time by which we would expect $80\%$ of batteries to fail. Given the cumulative distribution function $F\left(x\right)$, we would be solving $F\left(x\right)=0.8$.

The median value of a continuous probability distribution is a special case of this where the probability of obtaining a value below or above this is $0.5$. We will look more closely at the median in the next lesson as a measure of the centre of the distribution.

Worked example

Example 5

Find the value of $x$ such that $P\left(X\le x\right)=0.125$ for the continuous probability distribution defined as $f\left(x\right)=3x^2$ in the domain $[0,1]$ and $0$ elsewhere.

Think: We want to find $x$ such that $\int_0^x\ f\left(t\right)\ dt=0.125$. We can do this by finding the cumulative distribution function $F\left(x\right)$ and then solving for $x$.

Do: Integrating $f(t)=3t^2$ from $0$ to $x$:

$F\left(x\right)$ $=$ $\int_0^x3t^2\ dt$
  $=$ $\left[t^3\right]_0^x$
  $=$ $x^3$

Solving $F(x)=0.125$:

$x^3$ $=$ $0.125$

Take the cube root of both sides

$\therefore x$ $=$ $0.5$

 

 

Hence, for this distribution, $12.5\%$ of outcomes will fall below $0.5$.
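When $F(x)=p$ cannot be solved by hand, a quantile can be approximated numerically. The sketch below uses bisection on the CDF $F(x)=x^3$ of Example 5. Bisection is an assumed technique chosen for this illustration (the lesson solves the equation algebraically), and the function names are hypothetical.

```python
def cdf(x):
    """CDF for f(x) = 3x^2 on [0, 1]: 0 below 0, x^3 on [0, 1], 1 above 1."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return x**3


def quantile(p, lo=0.0, hi=1.0, tol=1e-10):
    """Solve F(x) = p by bisection, assuming F is increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid  # the solution lies to the right of mid
        else:
            hi = mid  # the solution lies at or to the left of mid
    return (lo + hi) / 2


print(round(quantile(0.125), 6))  # 0.5
print(round(quantile(0.5), 4))    # the median, 0.7937
```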

 

Practice questions

Question 4

Consider the probability density function below:

$p\left(x\right)=\begin{cases}\frac{x}{96}, & 0\le x\le12\\ -\frac{x}{32}+\frac{1}{2}, & 12<x\le16\\ 0, & \text{otherwise}\end{cases}$

  1. $p\left(x\right)$ has already been graphed for the region outside $0\le x\le16$. Draw the rest of the function to complete the graph.


  2. Use geometric reasoning to calculate the area under $p\left(x\right)$.

  3. Which of the following properties of $p\left(x\right)$ are not features of probability distribution functions in general?

    A. $p\left(x\right)$ does not take a negative value over its whole domain.

    B. $p\left(x\right)=0$ for all negative values of $x$.

    C. The total area underneath $p\left(x\right)$ is equal to $1$.

    D. The area underneath $p\left(x\right)$ is a triangle.

  4. Determine the cumulative distribution function:

    $F\left(x\right)$ $=$
    $\editable{}$ $x<0$
    $\editable{}$ $0\le x\le12$
    $\editable{}$ $12<x\le16$
    $\editable{}$ $x>16$

Question 5

A continuous random variable $X$ has cumulative distribution function given below.

$F\left(x\right)=\begin{cases}0, & x<\frac{1}{3}\\ \frac{1}{4\ln3}\ln\left(9x^2\right), & \frac{1}{3}\le x\le3\\ 1, & x>3\end{cases}$

 

 
  1. Determine $P\left(X>2\right)$.

    Round your answer to two decimal places.

  2. Calculate $t$ such that $P\left(X<t\right)=\frac{1}{4}$.

Question 6

The length of time $X$ after a computer is turned on until it crashes can be modeled by the probability density function $p\left(x\right)=\frac{1}{36}e^{-\frac{x}{36}}$ when $x\ge0$, and $p\left(x\right)=0$ otherwise.

  1. Determine the cumulative probability distribution function for $X$, $F(x)$.

    $F\left(x\right)$ $=$
    $\editable{}$ $x<0$
    $\editable{}$ $x\ge0$

  2. Determine the probability that the computer crashes after exactly $10$ hours of use.

  3. Determine the probability that the computer can last at least $50$ hours of use. Give your answer correct to two decimal places.

  4. Determine the probability that a computer which has lasted $12$ hours will last $30$ hours without crashing. Give your answer correct to two decimal places.

Outcomes

4.4.1.2

understand the concepts of a probability density function, cumulative distribution function, and probabilities associated with a continuous random variable given by integrals; examine simple types of continuous random variables and use them in appropriate contexts
