Standard Deviation

## Standard Deviation

The standard deviation is a measure of how far the values in a set of data are spread out from their mean.

It combines the distance of every item from the mean into a single value: the larger the value, the more spread out the data.

$$\sigma = \sqrt{\frac{\Sigma (x_i - \bar{x})^2}{N}}$$

The value $\sigma$ is the standard deviation.

The value $N$ is the number of values in the data.

$\Sigma$ means add all the items up. Each item has a value given by $x_i$: $x_1$ is the value for the first item, $x_2$ the value for the second, etc.

$\bar{x}$ is the mean for that set of data. It may also be shown as $\mu$.

The calculation involves the following steps.

Count the number of values, $N$.

Calculate the mean $\bar{x}$ for the group: add up all the values and divide by $N$.

For each value in the group, work out the difference between the value and the mean, $x_i - \bar{x}$.

Square each difference: $(x_i - \bar{x})^2$.

Add up all the squared differences: $\Sigma (x_i - \bar{x})^2$.

Divide by the number of items: $\frac{\Sigma (x_i - \bar{x})^2}{N}$.

Take the square root of the answer, $\sqrt{\frac{\Sigma (x_i - \bar{x})^2}{N}}$, to obtain the standard deviation.
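The steps above can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the function name `population_std_dev` and the sample data are assumptions made for the example, and the function divides by $N$ exactly as the formula does.

```python
import math


def population_std_dev(values):
    """Standard deviation that divides by N, following the steps in the text."""
    n = len(values)                                      # count the values, N
    if n == 0:
        raise ValueError("at least one value is required")
    mean = sum(values) / n                               # the mean
    squared_diffs = [(x - mean) ** 2 for x in values]    # difference, then square
    variance = sum(squared_diffs) / n                    # add up and divide by N
    return math.sqrt(variance)                           # square root


# Hypothetical sample, chosen so the arithmetic is easy to check by hand.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_std_dev(sample))  # prints 2.0
```

For this sample the mean is 5, the squared differences add up to 32, and 32 ÷ 8 = 4, whose square root is 2.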

## Example 1

Widgets are produced at a factory over a 10-hour shift. The number of widgets that failed in each hour was as follows:

| Hour | No. Failures |
|---|---|
| 1 | 2 |
| 2 | 3 |
| 3 | 7 |
| 4 | 7 |
| 5 | 5 |
| 6 | 1 |
| 7 | 6 |
| 8 | 3 |
| 9 | 9 |
| 10 | 5 |

The table below adds the deviation from the mean and the squared deviation for each hour.

| Hour | No. Failures | $x_i - \bar{x}$ | $(x_i - \bar{x})^2$ |
|---|---|---|---|
| 1 | 2 | -2.8 | 7.84 |
| 2 | 3 | -1.8 | 3.24 |
| 3 | 7 | 2.2 | 4.84 |
| 4 | 7 | 2.2 | 4.84 |
| 5 | 5 | 0.2 | 0.04 |
| 6 | 1 | -3.8 | 14.44 |
| 7 | 6 | 1.2 | 1.44 |
| 8 | 3 | -1.8 | 3.24 |
| 9 | 9 | 4.2 | 17.64 |
| 10 | 5 | 0.2 | 0.04 |
| TOTAL | 48 | | 57.60 |

Calculate the mean ($\bar{x}$) of the number of failures:

$\bar{x} = 48 \div 10 = 4.8$

Work out the deviation $x_i - \bar{x}$ for each entry.

e.g. for hour 1: $x_i - \bar{x} = 2 - 4.8 = -2.8$

Square each deviation.

e.g. for hour 1: $(x_i - \bar{x})^2 = (-2.8)^2 = 7.84$

Add up the squared deviations:

$\Sigma (x_i - \bar{x})^2 = 7.84 + 3.24 + \dots = 57.60$

Divide by the number of entries, $\frac{\Sigma (x_i - \bar{x})^2}{N}$:

$57.60 \div 10 = 5.76$

Take the square root of the result, $\sqrt{\frac{\Sigma (x_i - \bar{x})^2}{N}}$:

$\sqrt{5.76} = 2.40$
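As a way of checking the hand calculation, the same working can be reproduced in Python. This is a sketch under stated assumptions: the variable names are illustrative, the hourly failure counts are copied from the table, and `statistics.pstdev` from the standard library is used only as a cross-check because it also divides by $N$.

```python
import math
import statistics

# Failures per hour for hours 1 to 10, copied from the table in the text.
failures = [2, 3, 7, 7, 5, 1, 6, 3, 9, 5]

n = len(failures)
mean = sum(failures) / n
deviations = [x - mean for x in failures]       # x_i minus the mean
squared = [d ** 2 for d in deviations]          # each deviation squared
std_dev = math.sqrt(sum(squared) / n)           # divide by N, then square root

# Print the working, one row per hour.
for hour, (x, d, s) in enumerate(zip(failures, deviations, squared), start=1):
    print(f"{hour:>4} {x:>3} {d:>6.1f} {s:>7.2f}")

print(f"total failures = {sum(failures)}, sum of squares = {sum(squared):.2f}")
print(f"mean = {mean:.1f}, standard deviation = {std_dev:.2f}")

# Cross-check against the standard library's population standard deviation.
print(f"statistics.pstdev gives {statistics.pstdev(failures):.2f}")
```

Printing every row makes it easy to compare each deviation and squared deviation with the hand-worked table, which is where arithmetic slips are most likely to occur.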

## Example 2

In the widget factory, tests were conducted on a further two consecutive days.

On the first day, the number of faulty widgets per hour had a mean of 5.2 and a standard deviation of 2.03. On the second day, the mean number of faulty widgets per hour was 5.1 with a standard deviation of 2.29.

The foreman said that the second day was more consistent as the mean was lower. Is he correct?

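Standard deviation shows the spread of the data, and a smaller spread means more consistent results. The mean only shows the typical level, not how much the values vary around it.

Answer: No. Consistency is given by the standard deviation, not the mean. The first day has the lower standard deviation (2.03 compared with 2.29), so the first day was the more consistent.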
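The comparison can also be written out as a short Python sketch. The `DaySummary` type and the variable names are assumptions made for illustration; the means and standard deviations are the figures quoted in the question.

```python
from typing import NamedTuple


class DaySummary(NamedTuple):
    """Summary statistics for one day of testing (hypothetical helper type)."""
    label: str
    mean: float
    std_dev: float


days = [
    DaySummary("first day", mean=5.2, std_dev=2.03),
    DaySummary("second day", mean=5.1, std_dev=2.29),
]

# Consistency is judged by the smallest spread, not the smallest mean.
most_consistent = min(days, key=lambda d: d.std_dev)
lowest_mean = min(days, key=lambda d: d.mean)

print(f"Most consistent: {most_consistent.label}")  # prints "Most consistent: first day"
print(f"Lowest mean: {lowest_mean.label}")          # prints "Lowest mean: second day"
```

The two selections pick different days, which is the point of the example: a lower mean does not imply a lower spread.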