Standard Deviation

The standard deviation is a measure of how widely the values in a set of data are spread around the mean.

It combines the distance of each item from the mean into a single value.

`\sigma = sqrt(frac(\Sigma(x_i - \bar(x))^2)(N))`

The value `\sigma` is the standard deviation.

The value `N` is the number of values in the data.

`\Sigma` means add all the items up. Each item has a value given by `x_i`: `x_1` is the value for the first item, `x_2` the value for the second, etc.

`\bar(x)` is the mean for that set of data. It may also be shown as `\mu`.

The process involves several steps.

Count the number of values, `N`.

Calculate the mean for the group, `\bar(x)`, by adding up all the values and dividing by `N`.

For each value in the group, work out the difference between the value and the mean `(x_i - \bar(x))`.

Square each difference `(x_i - \bar(x))^2`

Add all the squared differences up `\Sigma(x_i - \bar(x))^2`

Divide by the number of items `frac(\Sigma(x_i - \bar(x))^2)(N)`

Square root the answer `sqrt(frac(\Sigma(x_i - \bar(x))^2)(N))` to obtain the standard deviation.

Example 1

Widgets are produced at a factory over a 10-hour shift. The number of widgets that failed in each hour was as follows:

Hour No.	Failures
1	2
2	3
3	7
4	7
5	5
6	1
7	6
8	3
9	9
10	5

Hour No.	Failures	`(x_i - \bar(x))`	`(x_i - \bar(x))^2`
1	2	-2.8	7.84
2	3	-1.8	3.24
3	7	2.2	4.84
4	7	2.2	4.84
5	5	0.2	0.04
6	1	-3.8	14.44
7	6	1.2	1.44
8	3	-1.8	3.24
9	9	3.2	10.24
10	5	0.2	0.04
TOTAL	48		50.20

Calculate the mean (`\bar(x)`) for the number of failures

`\bar(x)` = 48 ÷ 10 = 4.8

Work out the deviation for each entry `x_i - \bar(x)`

e.g. for hour 1: `(x_i - \bar(x))` = 2 - 4.8 = -2.8

Square each entry

e.g. for hour 1: `(x_i - \bar(x))^2` = (-2.8)² = 7.84

Add the squared values

`\Sigma (x_i - \bar(x))^2` = 7.84 + 3.24 + … = 50.20

Divide by the number of entries `frac(\Sigma (x_i - \bar(x))^2)(N)`

50.20 ÷ 10 = 5.02

Square root the result: `sqrt(frac(\Sigma (x_i - \bar(x))^2)(N))`

= `sqrt(5.02)` = 2.24 (to 2 decimal places)

Answer: 2.24

Example 2

In the widget factory, tests were conducted on a further two consecutive days.

On the first day, the number of faulty widgets per hour had a mean of 5.2 and a standard deviation of 2.03. On the second day, the mean number of faulty widgets was 5.1 with a standard deviation of 2.29.

The foreman said that the second day was more consistent as the mean was lower. Is he correct?

Standard deviation shows the spread of data, which highlights consistency.

Answer: No. Consistency is shown by the standard deviation, not the mean. The first day has the lower standard deviation (2.03 compared with 2.29), so the first day was the more consistent.
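The comparison in Example 2 can be written as a small Python sketch. The figures are those given in the example; the dictionary layout and names are assumptions made for illustration.

```python
# Mean and standard deviation for each day, as given in Example 2
days = {
    "day 1": {"mean": 5.2, "std_dev": 2.03},
    "day 2": {"mean": 5.1, "std_dev": 2.29},
}

# The more consistent day is the one with the smaller standard deviation,
# regardless of which day has the lower mean.
most_consistent = min(days, key=lambda day: days[day]["std_dev"])
print(most_consistent)  # day 1
```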