Empirical Unbiased Sample

An empirical unbiased sample is a set of observations collected in such a way that the method of observation does not change the results.

For example, suppose a factory has four production lines producing widgets. Taking a sample from only one production line and using it to estimate the total number of faulty widgets produced in the factory would be biased, as that production line might produce more (or fewer) faulty widgets than the others.
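The effect of sampling from a single line can be sketched with a short calculation. The output and fault figures below are hypothetical, chosen only to illustrate the idea; they do not come from a real factory.

```python
# Hypothetical daily output and fault counts for four production lines
# (illustrative numbers only).
produced = {"A": 1000, "B": 1000, "C": 1000, "D": 1000}
faulty = {"A": 50, "B": 20, "C": 20, "D": 10}

total_produced = sum(produced.values())
true_faulty = sum(faulty.values())

# Biased estimate: scale up the fault rate of line A alone to the whole factory.
biased_estimate = faulty["A"] * total_produced / produced["A"]

print(true_faulty)      # 100
print(biased_estimate)  # 200.0
```

In this made-up case line A is the worst line, so using it alone doubles the estimate of faulty widgets for the factory.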

In addition, the larger the sample, the closer you are likely to be to the correct answer. If you could test every widget produced by the factory, then your result would be exactly correct for that day. If you only tested every other widget, your estimate would probably be less accurate. If you only tested 1 in every 100 widgets, it would probably be less accurate still.

Increasing the sample size (the number tested, out of the total number) will tend to give a more accurate answer.
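A minimal simulation of this idea, assuming a hypothetical day's output of 10,000 widgets of which 300 are faulty. The function name `estimate_fault_rate` and all of the figures are made up for illustration.

```python
import random

def estimate_fault_rate(population, sample_size, rng):
    """Estimate the fraction of faulty widgets from a simple random sample."""
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size

# Hypothetical output: 1 marks a faulty widget, 0 a good one.
population = [1] * 300 + [0] * 9700
rng = random.Random(0)  # fixed seed so the run can be repeated

for size in (100, 1000, 5000, 10000):
    print(size, estimate_fault_rate(population, size, rng))
```

The smaller samples can land some way from the true rate of 0.03, while testing all 10,000 widgets always gives exactly 0.03.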

Example 1

Joe threw a die (dice) 50 times. He threw a 6 four times. Joe suspects that the die may be biased. What should he do to confirm his suspicion?

The more times you repeat an experiment, the closer the observed results are likely to be to what would be expected in the long run.

Answer: Carry out a test with a larger number of throws.
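As a rough check on Joe's result, the number of sixes expected from a fair die can be compared with what he observed. This is only a sketch of the arithmetic, and the variable names are illustrative.

```python
throws = 50
observed_sixes = 4

# For a fair die, each face has probability 1/6.
expected_sixes = throws / 6
observed_frequency = observed_sixes / throws

print(round(expected_sixes, 2))  # 8.33
print(observed_frequency)        # 0.08
```

Joe threw about half as many sixes as a fair die would be expected to give, but 50 throws is a small sample, which is why more throws are needed before drawing a conclusion.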

Example 2

A factory has two production lines producing widgets. Each production line produces 10,000 widgets a day. The supervisor from production line A sampled 100 widgets and found 3 were faulty. The supervisor of the other production line B sampled 500 widgets and found 16 were faulty. Comment on these results.

Two points to note in the answer: first, there were some calculations to compare the two results. Secondly, the results were interpreted, i.e. the answer states what the results show and how accurate they are likely to be.

Answer: The estimated number of faulty widgets produced by line A = `frac(3)(100)` × 10000 = 300.

For line B = `frac(16)(500)` × 10000 = 320.

It appears that line B produces slightly more faulty widgets (an estimated 320 a day compared with 300). The estimate for line B is likely to be the more accurate of the two, as its sample was five times as large.
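The scaling-up calculation in this answer can be written as a short function. The name `estimated_faulty` is illustrative; the figures are those given in the question.

```python
def estimated_faulty(faulty_in_sample, sample_size, daily_output):
    """Scale the number of faulty widgets in a sample up to the full daily output."""
    return faulty_in_sample * daily_output / sample_size

line_a = estimated_faulty(3, 100, 10_000)
line_b = estimated_faulty(16, 500, 10_000)

print(line_a)  # 300.0
print(line_b)  # 320.0
```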