Quality Concepts Matter

Basic QA Statistics Series (Part 2): Basic Measures of Central Tendency and Measurement Scales

Gary Cox is a great Quality resource in addition to being very funny! gcox@barringtongrp.ca

REFLECTION: FOR STUDENTS: Learning is not compulsory… neither is survival. – W. Edwards Deming

FOR ACADEMICS: Our schools must preserve and nurture the yearning for learning that everyone is born with. – W. Edwards Deming

FOR PROFESSIONALS/PRACTITIONERS: Data are not taken for museum purposes; they are taken as a basis for doing something. If nothing is to be done with the data, then there is no use in collecting any. The ultimate purpose of taking data is to provide a basis for action or a recommendation for action. The step intermediate between the collection of data and the action is prediction. -W. Edwards Deming

Foundation

The previous post covered the definitions of Population and Sample, and how each is described: Parameters describe a Population, and Statistics describe a Sample. We also mentioned data. To be able to communicate about data, we first have to define it. Defining your terms should always be the first step toward better understanding.

Data are characteristics or information (usually numerical) that are collected through observation. In a more technical sense, data consists of a set of values of qualitative or quantitative variables concerning one or more persons or objects.

The two broadest categories of data are Qualitative and Quantitative.

Qualitative data deals with characteristics and descriptors that cannot be easily measured but can be observed in terms of the attributes, properties, and of course, qualities of an object (such as color and shape). Quantitative data are data that can be measured, verified, and manipulated. Numerical data such as length and weight of objects are all Quantitative.

On the next level of Data are Discrete and Continuous Data.

Discrete Data – Pyzdek and Keller define discrete data as follows: “Data are said to be discrete when they take on only a finite number of points that can be represented by the non-negative integers” (Kubiak, 2017). Discrete data are count data, sometimes called categorical or attribute data. A count cannot be made more precise. You cannot have 2.2 fully functional cars.

Continuous Data – Pyzdek and Keller state: “Data are said to be Continuous when they exist on an interval, or on several intervals.” Another term for continuous data is Variable data. Height, weight, and temperature are continuous data because between any two values on the measurement scale there is an infinite number of other values (Kubiak, 2017).

Measurement Scales

  • Nominal
    • Classifies data into categories with no order implied
  • Ordinal
    • Refers to data positions within a set, where the order is essential, but precise differences between the values are not explicitly defined (example: poor, ok, excellent).
  • Interval
    • An Interval scale has meaningful differences but no absolute zero. (Ex: Temperature, excluding the Kelvin scale)
  • Ratio
    • Ratio scales have meaningful differences and an absolute zero. (Ex: Length, weight and age)

(Kubiak, 2017)
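As a rough illustration of why the scale matters, the sketch below shows what can sensibly be done with ordinal, interval, and ratio data. The rating ranks, temperatures, and weights are made-up example values, and the helper name `celsius_to_kelvin` is only illustrative.

```python
# Ordinal: the categories have an order, but the gaps between them are undefined.
# The rank numbers are an illustrative assumption used only to sort the ratings.
RANK = {"poor": 0, "ok": 1, "excellent": 2}
ratings = ["ok", "poor", "excellent", "ok", "excellent"]
ordered = sorted(ratings, key=RANK.get)
middle_rating = ordered[len(ordered) // 2]  # middle position of an odd-length list


def celsius_to_kelvin(celsius):
    """Convert an interval-scale reading (Celsius) to a ratio-scale one (Kelvin)."""
    return celsius + 273.15


# Interval: differences are meaningful and survive the change of scale...
diff_celsius = 20.0 - 10.0
diff_kelvin = celsius_to_kelvin(20.0) - celsius_to_kelvin(10.0)

# ...but ratios are not, because 0 degrees Celsius is not an absolute zero.
ratio_celsius = 20.0 / 10.0  # looks like "twice as hot"
ratio_kelvin = celsius_to_kelvin(20.0) / celsius_to_kelvin(10.0)  # only about 3.5% higher

# Ratio: weight has an absolute zero, so 20 kg really is twice 10 kg.
ratio_weight = 20.0 / 10.0

print("Middle rating:", middle_rating)
print("Differences (C, K):", diff_celsius, round(diff_kelvin, 2))
print("Ratios (C, K):", ratio_celsius, round(ratio_kelvin, 4))
print("Weight ratio:", ratio_weight)
```

The temperature difference is the same on both scales, but the "twice as hot" ratio disappears once the readings are expressed on a scale with an absolute zero.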

I know that it seems like a lot to digest, but recording data correctly is critical. Next, a brief word on the Central Limit Theorem. Per the central limit theorem, as the sample size increases, the distribution of sample means approaches a normal distribution centered on the population mean, notwithstanding the actual distribution of the data. The spread of those sample means also narrows, so the mean of a larger sample tends to fall closer to the mean of the overall population. In other words, the population itself does not have to be normally distributed (a bell curve) as long as the sample size is sufficiently large (Kubiak, 2017). There will eventually be one or more separate posts on sampling, distributions, and choosing the ideal sample size, but we are starting with the basics.
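A small simulation can make the central limit theorem more concrete. The sketch below is only an illustration under stated assumptions: the population is an exponential distribution with a mean of 1 (chosen because it is clearly not bell-shaped), and the sample sizes, number of samples, and function names are arbitrary choices, not from the handbook.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch gives the same numbers on every run

POPULATION_MEAN = 1.0  # mean of the assumed exponential population


def sample_means(sample_size, num_samples=2000):
    """Draw repeated samples from a skewed population and return each sample's mean."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]


def mean_median_gap(values):
    """Gap between mean and median; near zero for a symmetric, bell-like shape."""
    return abs(statistics.fmean(values) - statistics.median(values))


means_small = sample_means(sample_size=2)
means_large = sample_means(sample_size=50)

for label, means in (("n=2", means_small), ("n=50", means_large)):
    print(
        label,
        "| average of sample means:", round(statistics.fmean(means), 3),
        "| spread (std dev):", round(statistics.stdev(means), 3),
        "| mean-median gap:", round(mean_median_gap(means), 3),
    )
```

With the larger sample size, the sample means should cluster more tightly around the population mean, and the gap between their mean and median should shrink, which is a rough sign that their distribution is becoming more symmetric even though the population is skewed.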

Note: Ordinal data can be confusing. How it is classified depends on how the ordinal scale is arranged. The Likert Scale would be considered quantitative ordinal, while the Movie rating scale would be considered qualitative ordinal.

(Kubiak, 2017)

Measures of Central Tendency

Three common ways of quantifying the centrality of a population or sample are:

  • Mean
    • The arithmetic average of a data set. This is the sum of the values divided by the number of individual values. Ex: [1, 3, 5, 10]. The sum is 19 and there are 4 values, so the mean is 19/4 = 4.75.
  • Median
    • This is the middle value of an ordered data set. When the data are made up of an odd number of values, the median is the central value of the ordered set. Ex: [1, 3, 5]. The median is 3. When there is an even number of data points, the median is the average of the two middle values of the ordered set. Ex: [1, 3, 5, 10]. In this case, the median is the average of 3 and 5: (3+5)/2 = 4.
  • Mode
    • The mode is the most frequently found value in a data set. It is possible for there to be more than one mode. Ex: [1, 2, 3, 5, 1, 6, 8, 1, 8, 1, 3]. The value 1 appears four times, more than any other value, so the mode is 1.

(Kubiak, 2017)
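The examples in the list above can be checked with Python's standard `statistics` module. This is just a quick sketch of what stats software does behind the scenes. The last data set (`[1, 1, 2, 2, 3]`) is an added illustration of a set with more than one mode and is not from the handbook.

```python
import statistics

data = [1, 3, 5, 10]

# Mean: sum of the values divided by the number of values.
mean_value = statistics.mean(data)

# Median: middle value of the ordered set (odd count),
# or the average of the two middle values (even count).
median_odd = statistics.median([1, 3, 5])
median_even = statistics.median(data)

# Mode: the most frequent value. multimode() returns every value tied for most frequent.
modes = statistics.multimode([1, 2, 3, 5, 1, 6, 8, 1, 8, 1, 3])
two_modes = statistics.multimode([1, 1, 2, 2, 3])  # illustrative set with two modes

print("Mean:", mean_value)
print("Median (odd count):", median_odd)
print("Median (even count):", median_even)
print("Mode(s):", modes)
print("Two modes:", two_modes)
```

Using `multimode` rather than `mode` is a deliberate choice here, since it reports ties instead of returning only one value.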

Conclusion

Correctly recording Data and using the proper scale to track your Data is the first step to understanding your process outputs.
The next baby step is knowing how to measure your process based upon your data scale. Being able to calculate the Measures of Central Tendency helps, and stats software will do much of this for you. Still, you need to understand what you are seeing. Knowing what those software programs are doing with your data lets you defend your decisions more robustly. Next time we will go a little deeper and talk about Measures of Dispersion (and it is precisely what it sounds like!).


Bibliography

Kubiak, T. M., & Benbow, D. W. (2017). The Certified Six Sigma Black Belt Handbook, Third Edition. Milwaukee: ASQ Quality Press.