In the sciences, and particularly in the biological sciences, researchers examine data to see whether values of one variable are related to values of another variable.
We speak of an independent or treatment variable and a corresponding dependent variable.
When the value of a dependent variable can be predicted (at least approximately) by the level of the independent variable, we say there is an association between the two variables. An association can be weak if the predicted response is likely to be only a very rough estimate, or it can be strong if the independent variable is expected to predict the response accurately.
When data consisting of pairs of numbers (values of the independent and dependent variables) are displayed in a scatter plot with the help of a spreadsheet program or other software, the program can be asked to find the line of best fit, or regression line. We can also calculate the correlation coefficient, r, which quantifies the strength and direction of a linear correlation. A related value is R^2, which is simply r^2. The R^2-number indicates the strength of the linear association between the two variables.
Because r lies between -1 and 1, the R^2-number can be anywhere between 0 and 1. Squaring r discards its sign, so the R^2-number describes the strength of the association but not its direction. An R^2-number near 0 means the linear association is weak or absent. When the R^2-number is near 1, the association is strong.
Data pairs are shown in the columns on the left. These are plotted and the regression line, its formula and the R^2-number are provided.
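For readers who prefer to see the calculation rather than rely on a spreadsheet, the regression line, r and the R^2-number can be computed with a few lines of Python. This is only a minimal sketch using the standard least-squares formulas; the function name `linear_fit` and the height and arm-span values are made up for illustration and are not taken from any real data set.

```python
import math


def linear_fit(xs, ys):
    """Return (slope, intercept, r, r_squared) for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squared deviations and of cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")

    slope = sxy / sxx                   # least-squares slope
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)      # correlation coefficient, -1 to 1
    return slope, intercept, r, r ** 2


# Invented example data: heights and arm spans in centimetres.
heights_cm = [150, 160, 165, 170, 180]
arm_spans_cm = [148, 162, 163, 172, 181]

slope, intercept, r, r2 = linear_fit(heights_cm, arm_spans_cm)
print(f"y = {slope:.3f}x + {intercept:.3f}, r = {r:.3f}, R^2 = {r2:.3f}")
```

A spreadsheet trendline should report the same slope, intercept and R^2-number for the same data pairs, up to rounding.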
An association between two variables does not imply a causal relationship.
Without further evidence, we are not able to say that the independent or treatment variable causes the variation in the dependent variable. It may be that both variables are changing in response to some other hidden factor.
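The effect of a hidden factor can be illustrated with a small constructed example. In the sketch below, every number is invented: a hypothetical hidden factor (the age of a child) drives both foot length and a vocabulary score, and neither of those two variables appears in the formula for the other. They nevertheless turn out to be strongly correlated.

```python
import math


def pearson_r(xs, ys):
    """Correlation coefficient r for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical hidden factor: age in years (invented values).
ages = [4, 5, 6, 7, 8, 9, 10, 11, 12]

# Small fixed deviations standing in for individual variation.
foot_noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3]
vocab_noise = [-2, 3, -1, 2, 0, -3, 1, -2, 2]

# Each variable depends only on age, not on the other variable.
foot_length_cm = [15.0 + 0.9 * a + d for a, d in zip(ages, foot_noise)]
vocab_score = [20 + 5 * a + d for a, d in zip(ages, vocab_noise)]

r = pearson_r(foot_length_cm, vocab_score)
print(f"r between foot length and vocabulary score: {r:.3f}")
```

The high value of r here reflects the shared dependence on age, not any effect of foot length on vocabulary or the reverse.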
The term biometric data is used by businesses dealing with security to mean measurements of facial features, fingerprints, iris color and pattern, and any other characteristics of a person relating to their identity.
The term is also used by the fitness industry when referring to a person's capacity in physical activities.
More broadly, biometric data arises wherever mathematical or statistical theory is applied to measurements in biology, human or otherwise. Thus, biometric data appears throughout the biological sciences, which include medical, veterinary and agricultural science.
To complete this learning activity, you should design and carry out your own investigation into a possible association between two biometric variables or between a biometric variable and a non-biometric variable.
We can use a procedure similar to the one in INVESTIGATION: Interpretation of data with awareness of issues.
There are many variables of the numerical kind that you could choose to study. Here are a few:
Some of these would need specialized techniques and measuring equipment for their collection, but others could be studied more conveniently. You should add to this list if necessary and choose variables that are of interest to you.