We study standard deviation. In statistics, we have learned to use the mean to find the average, and to use the range and the interquartile range to see the spread of the data. But neither the range nor the interquartile range is a very good tool for representing the spread, because each is determined by only two values (the extremes or the quartiles) rather than by every data value.

A better way is the measure called standard deviation:

Standard Deviation SD = sqrt(sum of (Xi – AvgX)^2/N)

where N is the total number of data values and AvgX is the mean of the data. Xi – AvgX is called the deviation of Xi from the mean AvgX, for each i = 1, 2, …, N.
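
To see the definition at work, here is a minimal Python sketch that follows it step by step. The function name population_sd and the list of scores are made up for illustration; they are not from the textbook.

```python
import math

def population_sd(data):
    """Standard deviation as defined above: sqrt(sum of (Xi - AvgX)^2 / N)."""
    n = len(data)                                   # N, the number of data values
    avg_x = sum(data) / n                           # AvgX, the mean
    squared_deviations = [(x - avg_x) ** 2 for x in data]
    return math.sqrt(sum(squared_deviations) / n)

# Made-up data: the mean is 5 and the squared deviations add up to 32.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_sd(scores))  # 2.0, since sqrt(32 / 8) = sqrt(4)
```

Note that this divides by N, as in the formula above, not by N – 1.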

For a set of grouped data in the form of a frequency table, we have

Mean AvgX = sum of fx / sum of f

where x is the class mark of each class and f is the frequency of the corresponding class.

Standard Deviation SD = sqrt(sum of f(x – AvgX)^2 / sum of f)
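
The grouped-data formulas can be sketched the same way in Python. The function name grouped_mean_sd and the small frequency table below are hypothetical, chosen only so the arithmetic is easy to check by hand.

```python
import math

def grouped_mean_sd(class_marks, frequencies):
    """Mean and standard deviation of grouped data, using class marks x and frequencies f."""
    total_f = sum(frequencies)                                            # sum of f
    pairs = list(zip(class_marks, frequencies))
    mean = sum(f * x for x, f in pairs) / total_f                         # sum of fx / sum of f
    variance = sum(f * (x - mean) ** 2 for x, f in pairs) / total_f
    return mean, math.sqrt(variance)

# Hypothetical table: classes 0-10, 10-20, 20-30 with class marks 5, 15, 25.
marks = [5, 15, 25]
freqs = [2, 6, 2]
mean, sd = grouped_mean_sd(marks, freqs)
print(mean)  # 15.0, since (10 + 90 + 50) / 10 = 15
print(sd)    # sqrt(40), about 6.32, since (200 + 0 + 200) / 10 = 40
```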

[Sorry, I can’t type the summation notation sigma that we learn in class, which would simplify the writing.]

Here is a written proof of going from the definition to a formula often used:

sqrt(∑(Xi – AvgX)^2 / N) = sqrt(∑Xi^2 / N – (∑Xi / N)^2)
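
For students who want to go through the steps again, here is one way to write out the algebra, given as a sketch in LaTeX notation. It writes \bar{X} for AvgX and uses only the fact that \sum X_i = N\bar{X}; all sums run over i = 1, 2, …, N.

```latex
\begin{align*}
\sum (X_i - \bar{X})^2
  &= \sum \left( X_i^2 - 2\bar{X}X_i + \bar{X}^2 \right) \\
  &= \sum X_i^2 - 2\bar{X}\sum X_i + N\bar{X}^2 \\
  &= \sum X_i^2 - 2N\bar{X}^2 + N\bar{X}^2 \\
  &= \sum X_i^2 - N\bar{X}^2 .
\end{align*}
Dividing by $N$ and taking the square root gives
\[
\sqrt{\frac{\sum (X_i - \bar{X})^2}{N}}
  = \sqrt{\frac{\sum X_i^2}{N} - \bar{X}^2}
  = \sqrt{\frac{\sum X_i^2}{N} - \left( \frac{\sum X_i}{N} \right)^2 } .
\]
```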

I showed the proof in the classroom, but students may not get it.

Homework:

Page 1 and Page 2: #1, #2, #3, #4, #5, #6.