Understanding Data Types
I was recently discussing Taguchi's robust design methodology with a reliability engineer who was positive that Taguchi was wrong. Because his data ranged from large positive to large negative numbers, he argued, the concept of a signal-to-noise ratio had no meaning, and that proved Taguchi wrong. It was a struggle to convince him that Taguchi was probably right, given the numerous books and articles on the method and even a quality award bearing his name! Still, he persisted in using his data as 'evidence.' So where was the problem? He did not have ratio data; he had interval data. Without a true zero, a signal-to-noise ratio does not mean anything. This issue of Quality Concepts reviews the four basic data types and the computations that can be used on them.
Nominal
Any type of data can be treated as nominal. Simply put, nominal data is the name given to a measurement. For example, automobile makes and models are nominal: Toyota Camry, Honda Accord, and Ford Taurus. The names have no order, so the only thing that can be done with them is to count how often each one occurs. Nominal data is distribution free, and a collection of nominal data is considered randomly distributed. The chi-squared test is therefore the most common test for nominal data.

| Make/Model | Situation A [1] | Situation B [2] |
|---|---|---|
| Ford Taurus | 20 | 58 |
| Toyota Camry | 20 | 1 |
| Honda Accord | 20 | 1 |
| Total cars | 60 | 60 |

[1] A chi-squared test would indicate nothing unusual.
[2] A chi-squared test would indicate that something unusual is happening here.

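
The two footnotes can be checked with a short calculation. The sketch below is one way to do it, under the assumption that the test is a goodness-of-fit test against equal expected counts for the three models (the article does not say which form of the test it has in mind). The function name `chi_squared_statistic`, the 5% significance level, and the use of the critical value 5.991 for 2 degrees of freedom are choices made for this illustration.

```python
import math


def chi_squared_statistic(observed):
    """Goodness-of-fit statistic, assuming equal expected counts per category."""
    expected = sum(observed) / len(observed)
    return sum((count - expected) ** 2 / expected for count in observed)


# Critical value of the chi-squared distribution for 2 degrees of freedom
# (3 categories - 1) at the 5% significance level (an assumed threshold).
CRITICAL_VALUE = 5.991

# Counts for Ford Taurus, Toyota Camry, Honda Accord, taken from the table.
counts = {
    "Situation A": [20, 20, 20],
    "Situation B": [58, 1, 1],
}

results = {}
for name, observed in counts.items():
    stat = chi_squared_statistic(observed)
    # With exactly 2 degrees of freedom the p-value has the closed form
    # exp(-stat / 2); this shortcut does not hold for other degrees of freedom.
    p_value = math.exp(-stat / 2)
    unusual = stat > CRITICAL_VALUE
    results[name] = (stat, p_value, unusual)
    print(f"{name}: chi-squared = {stat:.2f}, p = {p_value:.3g}, unusual = {unusual}")
```

Only the counts enter the calculation. The model names act purely as labels, which is what makes the data nominal.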
Ordinal
Ordinal data is more structured than nominal data because it can be ranked from high to low. Letter grades are an example: an A is higher than a B, which is higher than an F. Percentiles, quartiles, the median, and rank-order correlation are all allowable computations. For example, many students have seen a chart like the following:

| Letter Grade | Number of Students | Percentage | Cumulative Percentage |
|---|---|---|---|
| A | 5 | 11% | 100% |
| B | 10 | 22% | 89% |
| C | 15 | 33% | 67% |
| D | 10 | 22% | 33% |
| F | 5 | 11% | 11% |

Interval
Interval data is much more common than most suppose. The difference between values is constant, but there is no fixed zero. The Fahrenheit and Celsius temperature scales are the classic examples of interval data. While the difference between 90°F and 100°F is the same as the difference between 10°F and 20°F, 0°F has no physical meaning as a zero point. Compare that to the Kelvin temperature scale, where 0 K is absolute zero. Nothing is colder. So what kind of calculations can be made with interval data? The arithmetic mean, standard deviation, and correlation are all appropriate, but the coefficient of variation, geometric mean, and harmonic mean cannot be meaningfully computed, because each depends on where zero is placed.

Ratio
Data that has a true zero is called ratio data. All of the computations above can be applied to ratio data. Most people assume they are working with this type of data when it may actually be interval or ordinal. The next time a calculation looks funny or doesn't work out as expected, try determining what kind of data you are working with. It may surprise you.
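
The difference between interval and ratio data can be seen numerically. The sketch below uses a set of made-up temperature readings (the values and the helper name `coefficient_of_variation` are assumptions for illustration, not data from the article) and expresses them in Celsius, Fahrenheit, and Kelvin. The standard deviation changes only by the scale factor between units, but the coefficient of variation and the ratio of two readings change with the choice of zero, so they are only meaningful on the Kelvin scale.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


# Hypothetical temperature readings, invented for this illustration.
celsius = [10.0, 15.0, 20.0, 25.0, 30.0]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]  # interval scale
kelvin = [c + 273.15 for c in celsius]          # ratio scale (true zero)

# The standard deviation is well behaved: converting Celsius to Fahrenheit
# multiplies it by 1.8, and the shift of zero has no effect on it.
sd_c = stdev(celsius)
sd_f = stdev(fahrenheit)

# The coefficient of variation describes the same physical readings, yet
# it gives a different answer on each scale.
cv_c = coefficient_of_variation(celsius)
cv_f = coefficient_of_variation(fahrenheit)
cv_k = coefficient_of_variation(kelvin)

# "Twice as hot" is likewise an artifact of where zero sits.
ratio_c = celsius[2] / celsius[0]  # 20 C compared with 10 C
ratio_k = kelvin[2] / kelvin[0]    # the same two readings in kelvin

print(f"Std dev: {sd_c:.3f} C, {sd_f:.3f} F")
print(f"CV: {cv_c:.3f} (C), {cv_f:.3f} (F), {cv_k:.4f} (K)")
print(f"Ratio of 3rd to 1st reading: {ratio_c:.3f} (C), {ratio_k:.3f} (K)")
```

A quick test for whether data is ratio or interval follows from this: if a statistic changes when the same measurements are expressed in a different but equally valid unit, that statistic needs a true zero.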