A guide to understanding these statistical measures, with worked examples using cats 😺
Imagine that you are working on a study that looks at the relationship between cats’ breeds and their physiques. You collect data from 50 cats and save their weight, body length, gender and breed in a spreadsheet. Now you’d like to summarise the average weight and body length of the cats, as well as how much these measurements differ across breeds. In statistics, the latter is called spread or dispersion, and the most commonly used metrics to quantify spread are variance, covariance and correlation.
Let’s start with the easiest one: Variance
Variance
Variance measures how far, on average, individual data points lie from the mean (average). In our example, we can use variance to describe how much cats’ weights vary within a given breed or gender. A high variance tells us that the values in our sample are far from their mean, while a low variance indicates that the values are closely clustered around the mean.
Variance is never negative (it is zero only when every value is identical), and it describes the dispersion of a single variable.
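To make the idea of high versus low variance concrete, here is a minimal Python sketch. The weights are made-up numbers for illustration, not data from the study, and the helper name `population_variance` is hypothetical. It assumes the population form of variance (dividing by the number of values); a sample variance would divide by one less than that.

```python
from statistics import mean

# Hypothetical cat weights in kg (illustrative values only)
clustered = [4.0, 4.1, 3.9, 4.0, 4.0]   # values close to their mean
spread_out = [2.0, 6.0, 3.0, 5.5, 3.5]  # values far from their mean


def population_variance(values):
    """Average squared distance of each value from the mean (divides by n)."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)


for label, weights in [("clustered", clustered), ("spread out", spread_out)]:
    print(label, "mean:", round(mean(weights), 2),
          "variance:", round(population_variance(weights), 3))
```

Both lists have the same mean of 4 kg, but the spread-out list has a far larger variance (2.3 versus roughly 0.004). Because every term in the sum is a squared distance, the result can never be negative.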
How to calculate Variance
Imagine that you collected the weight data from 5 male and 5 female cats. You can…