How to calculate a sample size for variable measurements.

How many times have you been faced with estimating the average weight, length, width, or specific gravity of a molded part in a supplier's lot? While 100% inspection is often recommended, it is not an option when the testing is destructive or when time is short. Likewise, an engineer troubleshooting a process would rather have a rough answer quickly than a precise answer in a day or two. Both situations can be addressed with effective sampling.

To determine the most effective sample size, one needs to know how accurate the determination must be and how much variation exists within the population. Accuracy is expressed in the units of measurement. For example, if one is measuring the temperature of a water bath, the accuracy could be to the nearest 5°F, 1°F, 0.1°F, 0.01°F, etc. The higher the accuracy, the greater the sampling requirements. If the required accuracy is unknown, base it on the specification. If a part has a specification of 5 - 10 units, then the variable should be known to the nearest 1 unit and not the nearest 0.1, since the extra precision presumably adds no value.

Next, one must decide how certain the final determination must be. This is the level of confidence. The most common values are 90%, 95%, 99%, and 99.9%. For typical troubleshooting, 90% is usually more than sufficient. For a product specification question, 95% is considered acceptable, with 99% for critical values. A 99.9% level of confidence should only be used when the prediction must be correct 999 times in 1000, which is rarely needed in practice.

Finally, the standard deviation of the population must be estimated. This can be quite tricky. Larger standard deviations require more samples, so if there is no prior knowledge, try estimating the standard deviation as 10% of the target. This is usually more than enough to cover the worst-case variation. If the process has some historical data, then use that value. If there is time and money, 15 or 30 samples can be taken and measured, and the sample standard deviation used to represent the population standard deviation. If the samples are truly random, this is usually the best alternative.
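The three options above can be sketched in Python as a simple fallback chain. This is only an illustration: the function name `estimate_std_dev`, its parameters, and the order of preference (pilot measurements first, then historical data, then the 10% rule of thumb) are assumptions made for this sketch.

```python
import statistics


def estimate_std_dev(target=None, historical_std=None, pilot_measurements=None):
    """Pick a standard deviation estimate (hypothetical helper).

    Assumed order of preference:
    1. sample standard deviation of a random pilot sample,
    2. a historical value for the process,
    3. the rule of thumb of 10% of the target.
    """
    if pilot_measurements is not None and len(pilot_measurements) >= 2:
        # statistics.stdev is the sample (n - 1) standard deviation.
        return statistics.stdev(pilot_measurements)
    if historical_std is not None:
        return historical_std
    if target is not None:
        return 0.10 * abs(target)
    raise ValueError("need a target, a historical value, or pilot measurements")


# No prior knowledge, only a target of 3.25 units.
print(estimate_std_dev(target=3.25))
```

A pilot sample costs parts and time, so for quick troubleshooting the historical value or the 10% rule may be the more practical starting point.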

Using the last three pieces of data, the number of samples can be calculated using the rearranged formula for a confidence interval:

n = (z * s / d)^2

Where s is the standard deviation, d is the accuracy, and z is a number based on the level of confidence. For reference, some values of z and their corresponding levels of confidence are:
Level of confidence    z
90%                    1.65
95%                    1.96
99%                    2.58
99.9%                  3.29
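For confidence levels not listed in the table, z can be computed rather than looked up. The sketch below uses `statistics.NormalDist` from the Python standard library and assumes a two-sided interval, which is what the tabulated values correspond to. The function names `z_for_confidence` and `sample_size` are illustrative only.

```python
import math
from statistics import NormalDist


def z_for_confidence(confidence):
    """Two-sided z value for a confidence level given as a fraction (0.95 = 95%)."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    # Half of the remaining probability sits in each tail.
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size(std_dev, accuracy, confidence=0.95):
    """Number of samples from n = (z * s / d)^2, rounded up to a whole part."""
    z = z_for_confidence(confidence)
    return math.ceil((z * std_dev / accuracy) ** 2)


print(round(z_for_confidence(0.95), 2))
print(sample_size(std_dev=0.5, accuracy=0.2, confidence=0.90))
```

The result is always rounded up, since a fraction of a part cannot be sampled and rounding down would fall short of the requested accuracy.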


For example, if there is a case of 1000 parts and I want to know the average part weight before the case is shipped to the customer, the thought process would go something like this:

* The specification is 3.0 - 3.5 grams, therefore I need to know the weight to the nearest 0.1 gram.
* Historical inspection data indicates a standard deviation of 0.2 grams.
* The level of confidence is 95% since it will be released to a customer.
* n = (1.96 * 0.2 / 0.1)^2 = 15.3664

Rounding up, it would take 16 parts to determine the average part weight of the case to within 0.1 gram, with a 5% chance of being wrong.
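As a cross-check, the formula can be run in the other direction: for a given number of samples, the half-width of the confidence interval is z * s / sqrt(n). The short sketch below applies this to the example values. The helper name `margin_of_error` is illustrative only.

```python
import math


def margin_of_error(n, std_dev, z):
    """Half-width of the confidence interval for the mean: z * s / sqrt(n)."""
    return z * std_dev / math.sqrt(n)


# Example values from the text: s = 0.2 grams, z = 1.96 (95% confidence).
for n in (15, 16):
    print(n, round(margin_of_error(n, std_dev=0.2, z=1.96), 4))
```

With 16 parts the half-width is 0.098 grams, just inside the required 0.1 gram, while 15 parts gives slightly more than 0.1 gram. That is why the calculated value of 15.3664 is rounded up, not to the nearest whole number.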

Quick sampling tips

  • Take random samples whenever possible to capture the full variation. Never pull everything from the same case, shift, or production run.
  • Always pull at least 10% extra samples to ensure that the minimum is covered.
  • For non-destructive tests, keep the samples until after all results have been disseminated and understood.
  • Look at the data and re-measure any values that differ from the rest by 100% or more. They are probably typographical errors.
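Some of these tips can be sketched in Python. All function names here are illustrative, and two details are assumptions made for this sketch: "the rest" of the data is represented by the median, and the extra allowance is rounded up to a whole part.

```python
import random
import statistics


def pull_count(minimum_n, extra_percent=10):
    """Minimum sample size plus an extra allowance, rounded up to a whole part."""
    extra = -(-minimum_n * extra_percent // 100)  # integer ceiling division
    return minimum_n + extra


def pick_random_parts(part_ids, count, seed=None):
    """Choose parts at random, without replacement, from the whole lot."""
    rng = random.Random(seed)
    return rng.sample(list(part_ids), count)


def flag_suspect_values(values):
    """Return values that differ from the median by 100% or more (assumed rule)."""
    center = statistics.median(values)
    if center == 0:
        return []  # a percentage difference is undefined around zero
    return [v for v in values if abs(v - center) >= abs(center)]


count = pull_count(16)                            # 16 required, plus 10% extra
parts = pick_random_parts(range(1, 1001), count)  # part numbers 1 to 1000
print(count, sorted(parts))
print(flag_suspect_values([3.2, 3.3, 3.1, 32.0, 3.25]))
```

Drawing part numbers from the whole lot only gives a random sample if the numbering itself spans every case, shift, and production run in the lot.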