In the previous lesson, we calculated probabilities for a continuous random variable by integrating its probability density function. The Cumulative Distribution Function (CDF) provides a general formula for finding these probabilities. The CDF is essentially an antiderivative (primitive function) of the probability density function.
The CDF gives us the probability of a random variable being less than or equal to a given cutoff.
We can use the CDF to find probabilities, measures of location and quantiles.
For a continuous random variable $X$ that takes values in a closed interval $\left[a,b\right]$, the cumulative distribution function (CDF) is
$F\left(x\right)=P\left(a\le X\le x\right)$ for all $x$ in the domain $\left[a,b\right]$
$F\left(x\right)=\int_a^x f\left(t\right)\,dt$, where $f\left(t\right)$ is the probability density function defined on the domain $\left[a,b\right]$
An identity that may prove to be useful here is:
$\int_a^x f\left(t\right)\,dt=\int_a^c f\left(t\right)\,dt+\int_c^x f\left(t\right)\,dt$ for $a\le c\le x$
In particular, we can make good use of this when $f\left(t\right)$ is a piecewise function and $t=c$ is the boundary value at which $f\left(t\right)$ changes from one sub-function to another.
A probability density function is defined piecewise by:
$$f\left(x\right)=\begin{cases}k\left(5+x\right), & -3\le x\le0\\ k\left(5-x\right), & 0\le x\le3\\ 0, & \text{elsewhere}\end{cases}$$
(a) Find the value of the constant $k$, and hence, write the equation of $f\left(x\right)$.
Think: The integral of $f\left(x\right)$ over the domain $\left[-3,3\right]$ must be $1$ because it is a probability density function. We can integrate the piecewise function by integrating the separate pieces over their respective domains. Then solve for $k$ by equating the integral to $1$.
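Do:
$$\begin{aligned}\int_{-3}^{0}k\left(5+x\right)\,dx+\int_{0}^{3}k\left(5-x\right)\,dx&=1\\ k\left[5x+\frac{x^2}{2}\right]_{-3}^{0}+k\left[5x-\frac{x^2}{2}\right]_{0}^{3}&=1\\ \frac{21}{2}k+\frac{21}{2}k&=1\\ 21k&=1\end{aligned}$$
For the integral to be $1$, the value of $k$ must be $\frac{1}{21}$.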
The function is therefore:
$$f\left(x\right)=\begin{cases}\frac{1}{21}\left(5+x\right), & -3\le x\le0\\ \frac{1}{21}\left(5-x\right), & 0\le x\le3\\ 0, & \text{elsewhere}\end{cases}$$
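The normalisation step can be checked with a few lines of Python. This is only a sketch: the helper name `area_linear` is made up for illustration, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

def area_linear(c0, c1, lo, hi):
    """Exact integral of c0 + c1*x over [lo, hi], using the antiderivative c0*x + c1*x^2/2."""
    def antiderivative(x):
        return c0 * x + Fraction(c1, 2) * x * x
    return antiderivative(Fraction(hi)) - antiderivative(Fraction(lo))

# Areas under the two pieces before multiplying by k
left_area = area_linear(5, 1, -3, 0)    # integral of (5 + x) on [-3, 0]
right_area = area_linear(5, -1, 0, 3)   # integral of (5 - x) on [0, 3]

# k * (total area) must equal 1
total_area = left_area + right_area
k = 1 / total_area
print(left_area, right_area, k)
```

Each piece contributes $\frac{21}{2}$ to the unscaled area, so the script reports $k=\frac{1}{21}$, in agreement with the working above.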
(b) Find the cumulative distribution function, $F\left(x\right)$, for the probability density function given.
Think: Just as the probability density function is split into two, to find the cumulative distribution function we will find the function over each interval and then combine.
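Do:
For $-3\le x\le0$:
$$F\left(x\right)=\int_{-3}^{x}\frac{1}{21}\left(5+t\right)\,dt=\frac{1}{21}\left[5t+\frac{t^2}{2}\right]_{-3}^{x}=\frac{1}{21}\left(5x+\frac{x^2}{2}+\frac{21}{2}\right)$$
For $0\le x\le3$, $F\left(x\right)$ gives the area under the curve up to $x$, so we require the area up to $x=0$ plus the area under the second piece from $0$ to $x$. Since $F\left(0\right)=\frac{1}{2}$ (from above), we have:
$$F\left(x\right)=\frac{1}{2}+\int_{0}^{x}\frac{1}{21}\left(5-t\right)\,dt=\frac{1}{2}+\frac{1}{21}\left(5x-\frac{x^2}{2}\right)$$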
Hence, the cumulative distribution function is:
$$F\left(x\right)=\begin{cases}0, & x<-3\\ \frac{1}{21}\left(5x+\frac{x^2}{2}+\frac{21}{2}\right), & -3\le x\le0\\ \frac{1}{2}+\frac{1}{21}\left(5x-\frac{x^2}{2}\right), & 0\le x\le3\\ 1, & x>3\end{cases}$$
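As a sanity check, the piecewise CDF can be compared with a direct numerical integration of the density. The sketch below assumes a simple midpoint rule (the helper `midpoint_integral` is illustrative, not a library function) and uses the fact that a CDF equals $0$ to the left of the domain and $1$ to the right of it.

```python
def pdf(x):
    """Piecewise density with k = 1/21."""
    if -3 <= x <= 0:
        return (5 + x) / 21
    if 0 < x <= 3:
        return (5 - x) / 21
    return 0.0

def cdf(x):
    """Piecewise CDF: 0 below the domain, 1 above it."""
    if x < -3:
        return 0.0
    if x <= 0:
        return (5 * x + x * x / 2 + 21 / 2) / 21
    if x <= 3:
        return 0.5 + (5 * x - x * x / 2) / 21
    return 1.0

def midpoint_integral(f, lo, hi, steps=20000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

# Compare the formula with the numerically accumulated area
for x in (-3, -1.5, 0, 2, 3):
    print(x, round(cdf(x), 6), round(midpoint_integral(pdf, -3, x), 6))
```

The two columns should agree, with $F\left(-3\right)=0$, $F\left(0\right)=\frac{1}{2}$ and $F\left(3\right)=1$ at the key points.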
A probability density function is given by $f\left(x\right)=\frac{4x^3}{255}$ on the domain $\left[1,4\right]$, where $f\left(x\right)=0$ for all other $x$.
(a) Find the cumulative distribution function.
Think: The CDF is found by integrating $f$ from the lower bound of the domain up to $x$.
Do:
$$F\left(x\right)=\int_1^x\frac{4t^3}{255}\,dt=\left[\frac{t^4}{255}\right]_1^x=\frac{x^4-1}{255}\quad\text{for }1\le x\le4$$
(b) Use the CDF to find $P\left(X\le3\right)$.
Think: $P\left(X\le3\right)$ is the area under the function to the left of $x=3$.
Do: Using $F\left(x\right)=\frac{x^4-1}{255}$, we substitute $x=3$:
$$\begin{aligned}P\left(X\le3\right)&=F\left(3\right)\\&=\frac{3^4-1}{255}\\&=\frac{81-1}{255}\\&=\frac{80}{255}\\&=\frac{16}{51}\end{aligned}$$
Therefore, $P\left(X\le3\right)=\frac{16}{51}$.
(c) Use the CDF to find $P\left(1.5\le X\le3.1\right)$.
Think: The area under the curve that we are interested in is found by calculating the integral between $x=1.5$ and $x=3.1$, or simply by finding $F\left(3.1\right)-F\left(1.5\right)$ using the CDF.
Do:
$$\begin{aligned}P\left(1.5\le X\le3.1\right)&=F\left(3.1\right)-F\left(1.5\right)\\&=\frac{3.1^4-1}{255}-\frac{1.5^4-1}{255}\\&\approx0.3582-0.0159\\&\approx0.342\ \text{(to three decimal places)}\end{aligned}$$
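Once a CDF is known, interval probabilities reduce to a subtraction. A short Python sketch of parts (b) and (c) follows; the function names `cdf` and `prob_between` are chosen here purely for illustration.

```python
def cdf(x):
    """CDF of f(x) = 4x^3/255 on [1, 4]: F(x) = (x^4 - 1)/255."""
    if x < 1:
        return 0.0
    if x > 4:
        return 1.0
    return (x ** 4 - 1) / 255

def prob_between(lo, hi):
    """P(lo <= X <= hi) = F(hi) - F(lo)."""
    return cdf(hi) - cdf(lo)

print(round(cdf(3), 4))                   # P(X <= 3)
print(round(prob_between(1.5, 3.1), 3))   # P(1.5 <= X <= 3.1)
```

The printed values should match $\frac{16}{51}\approx0.3137$ and $0.342$ from the working above.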
The mode is the data value with the highest frequency. For a continuous distribution, we look for the value of $x$ that gives the maximum point of a probability density function. Depending on the function, we may need to use calculus to help us find where the maximum value occurs.
A continuous probability distribution $f\left(x\right)=\frac{3x\left(6-x\right)}{100}$ is defined on the domain $\left[1,6\right]$. Find the mode of the distribution.
Think: The mode is the value of $x$ which gives the maximum point of the probability density function. We can use calculus to find the first derivative and solve $f'\left(x\right)=0$ to find the stationary point, then check that it lies within the given domain. Then we can use the second derivative to test that it is a maximum. Looking at the function, we can see that it is a concave down parabola, since the coefficient of $x^2$ is negative ($a<0$), so we can expect there to be a maximum point.
Do:
Differentiating:
$$\begin{aligned}f\left(x\right)&=\frac{3x\left(6-x\right)}{100}\\&=\frac{1}{100}\left(18x-3x^2\right)\\ \therefore f'\left(x\right)&=\frac{1}{100}\left(18-6x\right)\end{aligned}$$
Solving $f'\left(x\right)=0$ for the stationary point:
$$\begin{aligned}\frac{1}{100}\left(18-6x\right)&=0&&\text{Multiply both sides by }100\\ 18-6x&=0\\ 6x&=18\\ x&=3\end{aligned}$$
Therefore, there is a stationary point at $x=3$, which is within the domain of the probability density function. So if we confirm this stationary point is a maximum, we have found our mode.
Differentiating again to find the second derivative:
$f''\left(x\right)=-\frac{6}{100}$
At the point $x=3$:
$f''\left(3\right)=-\frac{6}{100}<0$
Therefore, since the graph is concave down, the maximum value does indeed occur at $x=3$ and this is the mode of the distribution.
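A numerical cross-check of the mode is to evaluate the density on a fine grid over the domain and pick the largest value. The sketch below assumes a plain grid search (the helper `grid_mode` is illustrative); it is a check on the calculus, not a replacement for it.

```python
def pdf(x):
    """Density from the worked example, defined on [1, 6]."""
    return 3 * x * (6 - x) / 100

def grid_mode(f, lo, hi, steps=5000):
    """Return the grid point in [lo, hi] (endpoints included) where f is largest."""
    width = (hi - lo) / steps
    candidates = (lo + i * width for i in range(steps + 1))
    return max(candidates, key=f)

mode = grid_mode(pdf, 1, 6)
print(round(mode, 3))
```

Including the endpoints in the grid matters: when a density is increasing or decreasing across its whole domain, $f'\left(x\right)=0$ has no solution in the domain and the maximum sits at a boundary.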
For a random variable, consider the following probability density function.
$$f\left(x\right)=\begin{cases}\frac{5x^4}{7775}, & 1\le x\le6\\ 0, & \text{otherwise}\end{cases}$$
State the cumulative distribution function $F\left(x\right)$ over $1\le x\le6$, where $F\left(x\right)=0$ for $x<1$ and $F\left(x\right)=1$ for $x>6$.
Use $C$ as the constant of integration.
Find $P\left(X\le2\right)$.
Find $P\left(X<5\right)$.
Find $P\left(2\le X\le4\right)$.
Find the mode of the following probability density functions.
$$f\left(x\right)=\begin{cases}\frac{3\left(9+8x-x^2\right)}{434}, & 0\le x\le7\\ 0, & \text{otherwise}\end{cases}$$
$$f\left(x\right)=\begin{cases}\frac{4e^{4x}}{e^8\left(e^{16}-1\right)}, & 2\le x\le6\\ 0, & \text{otherwise}\end{cases}$$
We know that the CDF gives us the probability that a random variable is less than or equal to a given value. We also know the total area under the probability density function is $1$. Knowing this, we can find various quantiles of the distribution by solving $F\left(x\right)=p$ for the required proportion $p$.
Because the area under a probability density function is $1$, it follows that the area on either side of the median value of a continuous probability distribution must be $0.5$.
Using the CDF, the median is the value of $x$ where $F\left(x\right)=\int_a^x f\left(t\right)\,dt=0.5$, where $f\left(t\right)$ is the probability density function defined on the domain $\left[a,b\right]$.
Find the median of the continuous probability distribution defined as $f\left(x\right)=\frac{1}{24}\left(x+3\right)$ on the domain $\left[1,5\right]$.
Think: We want to find $x$ such that $\int_1^x f\left(t\right)\,dt=0.5$. We can do this by finding the cumulative distribution function $F\left(x\right)$ and then solving $F\left(x\right)=0.5$.
Do: Integrating $f\left(t\right)=\frac{1}{24}\left(t+3\right)$ from $1$ to $x$:
$$F\left(x\right)=\int_1^x\frac{1}{24}\left(t+3\right)\,dt=\frac{1}{24}\left[\frac{t^2}{2}+3t\right]_1^x=\frac{1}{24}\left(\frac{x^2}{2}+3x-\frac{7}{2}\right)$$
Solving for $F\left(x\right)=\frac{1}{2}$:
$$\begin{aligned}\frac{1}{24}\left(\frac{x^2}{2}+3x-\frac{7}{2}\right)&=\frac{1}{2}&&\text{First multiply both sides by }24\\ \frac{x^2}{2}+3x-\frac{7}{2}&=12&&\text{Next take }12\text{ from both sides}\\ \frac{x^2}{2}+3x-\frac{31}{2}&=0&&\text{Now multiply both sides by }2\text{ to simplify}\\ x^2+6x-31&=0&&\text{Finally, use technology or the quadratic formula to solve}\\ \therefore x&=-3\pm2\sqrt{10}\end{aligned}$$
Since $1\le x\le5$, $x=-3+2\sqrt{10}\approx3.32$.
Hence, the median is approximately $3.32$.
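The "use technology" step can be carried out in Python. The sketch below applies the quadratic formula to $x^2+6x-31=0$, keeps the root that lies in $\left[1,5\right]$, and substitutes it back into the CDF as a check. The variable names are illustrative.

```python
import math

def cdf(x):
    """CDF of f(x) = (x + 3)/24 on [1, 5]."""
    return (x * x / 2 + 3 * x - 7 / 2) / 24

# Quadratic formula for x^2 + 6x - 31 = 0
a, b, c = 1, 6, -31
discriminant = b * b - 4 * a * c
roots = [(-b + sign * math.sqrt(discriminant)) / (2 * a) for sign in (1, -1)]

# Only the root inside the domain [1, 5] can be the median
median = next(r for r in roots if 1 <= r <= 5)
print(round(median, 2), round(cdf(median), 6))
```

The second printed value should be $0.5$, confirming that half of the area lies to the left of the median.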
Quartiles are the upper limits of particular proportions of a data set. Specifically, $Q_1$ is the upper limit of the lowest $25\%$ of the data set and $Q_3$ is the upper limit of the lowest $75\%$ of the data set. So when we want to find the lower quartile, $Q_1$, for example, we solve $F\left(x\right)=0.25$.
Deciles divide a data set into ten parts and percentiles divide a data set into one hundred parts. Therefore to find, for example, the 6th decile we solve $F\left(x\right)=0.6$. Similarly, to find the 78th percentile we solve $F\left(x\right)=0.78$.
$F\left(x\right)=\int_a^x f\left(t\right)\,dt$, where $f\left(t\right)$ is the probability density function defined on the domain $\left[a,b\right]$
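When $F\left(x\right)=p$ is awkward to solve by hand, any quantile can be approximated numerically. The sketch below assumes a bisection search, which works because a CDF never decreases; the helper `quantile` is illustrative, and the distribution used is the one from the median example, $f\left(x\right)=\frac{1}{24}\left(x+3\right)$ on $\left[1,5\right]$.

```python
def cdf(x):
    """CDF of f(x) = (x + 3)/24 on [1, 5]."""
    return (x * x / 2 + 3 * x - 7 / 2) / 24

def quantile(F, p, lo, hi, tol=1e-10):
    """Solve F(x) = p on [lo, hi] by bisection, assuming F is non-decreasing."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) < p:
            lo = mid   # too little area to the left: move right
        else:
            hi = mid   # enough area to the left: move left
    return (lo + hi) / 2

targets = (
    ("Q1", 0.25),
    ("median", 0.5),
    ("6th decile", 0.6),
    ("Q3", 0.75),
    ("78th percentile", 0.78),
)
for label, p in targets:
    print(label, round(quantile(cdf, p, 1, 5), 2))
```

Only the target proportion $p$ changes from one quantile to the next; the median is the special case $p=0.5$.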
For the following probability density function, find:
$$f\left(x\right)=\begin{cases}\frac{x^2}{168}, & 2\le x\le8\\ 0, & \text{elsewhere}\end{cases}$$
the median, $m$.
Round your answer to two decimal places.
the $3$rd quartile, $q$.
Round your answer to two decimal places.
the $67$th percentile, $p$.
Round your answer to two decimal places.
the $8$th decile, $r$.
Round your answer to two decimal places.