Lesson

A previous chapter introduced the idea of 'regression', or finding a line of best fit.

The three-median method is useful when it is assumed that there is a strong linear relation in the data.

- The data points are ordered by increasing value of the independent variable and then split into three approximately equal groups. If the number of data points is divisible by three, the groups are equal. If there is one extra point, it is placed in the middle group. If there are two extra points, one is placed in each outside group.
- The median data points are found for each group.
- The gradient of the regression line is the gradient of the line joining the lower and upper median points.
- The line is moved vertically one-third of the distance towards the central median point.
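The first two steps can be sketched in Python. This is only an illustration: the function names are hypothetical, and it assumes that the "median data point" of a group is the middle point in $x$-order, with the two middle points averaged when the group has an even number of points.

```python
def split_into_three(points):
    """Sort (x, y) points by x and split them into lower, middle and upper groups."""
    pts = sorted(points)                       # order by the independent variable
    base, extra = divmod(len(pts), 3)
    outer = base + (1 if extra == 2 else 0)    # two extra points: one per outside group
    middle = base + (1 if extra == 1 else 0)   # one extra point: goes to the middle group
    return pts[:outer], pts[outer:outer + middle], pts[outer + middle:]


def median_point(group):
    """Middle point of a group sorted by x (assumed convention, see above)."""
    k = len(group)
    if k % 2 == 1:
        return group[k // 2]
    # Even-sized group: average the two middle points coordinate by coordinate.
    (xa, ya), (xb, yb) = group[k // 2 - 1], group[k // 2]
    return ((xa + xb) / 2, (ya + yb) / 2)


def median_points(points):
    """Return the lower, central and upper median points."""
    return [median_point(group) for group in split_into_three(points)]
```

Calling `median_points` on a list of `(x, y)` pairs returns the three points used in the remaining steps.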

There are some calculations hidden in this description and they are illustrated in the following example.

| $x$ | $10$ | $13$ | $16$ | $19$ | $22$ | $25$ | $28$ | $31$ | $34$ | $37$ | $40$ | $43$ | $46$ | $49$ | $52$ | $55$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $y$ | $16$ | $15$ | $26$ | $23$ | $36$ | $39$ | $48$ | $61$ | $46$ | $73$ | $62$ | $70$ | $76$ | $65$ | $74$ | $94$ |

In this data set, there are $16$ observations. So, there will be $5$ data points in each of the outside groups and $6$ in the central group.

The median data points are: $(16,26)$, $(32.5,53.5)$ and $(49,65)$.

The gradient of the regression line is: $\frac{65-26}{49-16}=\frac{39}{33}=\frac{13}{11}$.

The equation of the line joining the lower and upper median points is $\frac{13}{11}=\frac{y-26}{x-16}$. After rearranging, this is $y=\frac{13}{11}x+\frac{78}{11}$.
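The rearrangement can be spelled out by multiplying both sides by $x-16$ and then collecting the constant terms:

$y-26=\frac{13}{11}(x-16)\;\Rightarrow\;y=\frac{13}{11}x-\frac{208}{11}+\frac{286}{11}=\frac{13}{11}x+\frac{78}{11}$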

So, at the central median point, where $x=32.5$, the point on the line has $y$-coordinate given by $y=\frac{13}{11}\times32.5+\frac{78}{11}$. This simplifies to $y=45.5$.
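Writing $32.5$ as $\frac{65}{2}$, the arithmetic runs as follows:

$y=\frac{13}{11}\times\frac{65}{2}+\frac{78}{11}=\frac{845}{22}+\frac{156}{22}=\frac{1001}{22}=45.5$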

The $y$-coordinate of the central median point is $53.5$, which is $8$ units above the line. So, we move the line vertically by $\frac{8}{3}$.
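A vertical shift leaves the gradient unchanged and alters only the intercept, so the new intercept is the old one plus $\frac{8}{3}$:

$\frac{78}{11}+\frac{8}{3}=\frac{234}{33}+\frac{88}{33}=\frac{322}{33}$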

The regression line must have the equation

$y=\frac{13x}{11}+\frac{322}{33}$
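As a check on the arithmetic, the last two steps of the method can be reproduced with exact fractions. The following is a minimal sketch: `line_from_medians` is a hypothetical helper name, and it takes the three median points as already known.

```python
from fractions import Fraction


def line_from_medians(lower, centre, upper):
    """Return (gradient, intercept) of the three-median line as exact fractions."""
    (x1, y1), (xc, yc), (x3, y3) = [
        (Fraction(x), Fraction(y)) for x, y in (lower, centre, upper)
    ]
    gradient = (y3 - y1) / (x3 - x1)         # slope through the outer median points
    intercept = y1 - gradient * x1           # line passing through the lower median point
    gap = yc - (gradient * xc + intercept)   # signed vertical distance to the central median
    return gradient, intercept + gap / 3     # move the line one-third of that distance


gradient, intercept = line_from_medians((16, 26), (32.5, 53.5), (49, 65))
print(gradient, intercept)  # 13/11 322/33
```

Using `Fraction` rather than floating-point numbers keeps the result in the same form as the hand calculation, so the two can be compared directly.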

The data and the regression line are shown in the graph below.

The median points are coloured black. The blue line is the line joining the lower and upper median points. The vertical black line passes through the central median point. The red line is the $3$-median regression line.

It is apparent in this case that the fit is not very good. The position of the line depends on just three median points, and at most six data points are needed to find these: one for each group with an odd number of points and two for each group with an even number. (In this case, only four were needed.) As a consequence, most of the information in the data set is ignored and the likelihood of a good fit is reduced accordingly.