
Basic Statistics

What are Statistics?


Statistics can be defined as the practice of collecting, organizing, describing, and analyzing data,
which allows us to characterize and then optimize a process.

Additionally, statistics serve as a means to better understand data. They also allow us to
numerically describe and characterize a process. Statistics, when applied properly, can help us
make predictions about the future. And finally, statistics create a universal language everyone can
speak and understand.

All Around Us
To be sure, statistics are all around us. For example, we’re all aware of something called miles per
gallon or liters per 100 kilometers. This is nothing more than a statistic meant to describe the fuel
efficiency of our vehicles.

Another popular statistic is the median home price. This statistic is especially useful for those
looking for a new home.

And finally, the Dow Jones Industrial Average is an example of statistics at its finest, as are other
stock market indices from around the world.

Attribute Data
There are two main types of data. The first type of data I want to discuss is attribute data.

And when we speak of attribute data… there are actually two variations. The first type of attribute
data is called binary data. With binary data we’re dealing with two levels. For example, we either
pass or fail. The light is either on or off. The product is either good or bad.

The second form of attribute data is count data. With this type of data we’re able to count things, as
the name implies. For example, if someone fails the test, we can count how many answers they
missed. If the product is bad, we can count the number of defects, and so on.

Now, to be sure, if it’s available we’ll always want to use count data versus binary data since
we can learn so much more about the situation. For example, instead of saying a product is bad… it
would be much more useful if we could count the number of defects on the product. Or instead of
simply telling a student they failed the exam… it would be useful if we could tell them exactly how
many questions they missed.
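
To make the difference concrete, here is a minimal Python sketch. The inspection results, and the rule
that any defect at all makes a unit "bad", are made up for illustration and are not data from this
module.

```python
# Hypothetical inspection results: number of defects found on each unit.
defects_per_unit = [0, 3, 0, 1, 0, 0, 5, 0]

# Binary data: each unit is simply good (no defects) or bad (one or more).
is_bad = [count > 0 for count in defects_per_unit]
bad_units = sum(is_bad)

# Count data: we keep how many defects were found, not just good/bad.
total_defects = sum(defects_per_unit)
worst_unit = max(defects_per_unit)

print(f"Binary view: {bad_units} of {len(defects_per_unit)} units are bad")
print(f"Count view: {total_defects} defects in total, worst unit has {worst_unit}")
```

The binary view only says how many units were bad. The count view also shows that the bad units are
not equally bad, which is the extra information the text is describing.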

6. Basic Statistics Page 1


Variable Data
The second type of data is called variable data, sometimes referred to as continuous data. Variable
data comes from a measurement scale that can be divided into finer and finer increments. Things like
weight, distance, dimensions, and speed are all examples of variable data.

Which to Use?
So the question is… which type of data is best? Better stated… if both are available what type of
data should we seek to collect and analyze?

The answer is that, if it’s available, we always want to collect and analyze variable data. There are
some statistical reasons related to something called power and sample size… but the gist of it comes
down to the fact that variable data is more powerful, statistically speaking, than attribute data.

As such, we may only need 30 data points of variable data to characterize a process while we may
need 100 data points of attribute data to learn anything at all. So, again, when possible always seek to
collect and analyze variable data since we can learn so much more from a small amount of data.
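
One way to see why is to look at what is lost when a measurement is reduced to pass or fail. The
sketch below is only an illustration, not a formal power calculation. The diameters, the upper limit,
and the helper name pass_fail are all hypothetical.

```python
from statistics import mean

# Hypothetical shaft diameters in millimetres and a hypothetical upper limit.
upper_limit = 10.0
batch_a = [9.2, 9.3, 9.1, 9.2, 9.3]
batch_b = [9.8, 9.9, 9.9, 9.8, 9.9]

def pass_fail(batch):
    # Attribute view: each part is reduced to pass (True) or fail (False).
    return [x <= upper_limit for x in batch]

# As attribute data the two batches look identical: every part passes.
print(pass_fail(batch_a) == pass_fail(batch_b))

# As variable data, five measurements per batch already show how much
# room each batch has before it reaches the limit.
margin_a = upper_limit - mean(batch_a)
margin_b = upper_limit - mean(batch_b)
print(round(margin_a, 2), round(margin_b, 2))
```

With only five parts each, the pass/fail results cannot tell the batches apart, while the measurements
show batch B sitting roughly 0.14 mm from the limit versus roughly 0.78 mm for batch A.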

Descriptive Statistics
OK, now it’s time to learn how to work with and leverage either attribute or variable data. And to do
that we need to discuss something called descriptive statistics.

As the name implies… descriptive statistics help us to describe the data we’re studying. Specifically,
descriptive statistics help us learn two things about our data.

The first thing they help us describe is the central tendency of a data set. In other words, they help
us find the midpoint of the data set. This is useful as it can help us understand whether our process is
centered between customer requirements or whether it’s skewed to one side or another.
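
As a rough sketch, two common measures of central tendency are the mean and the median, and either can
be compared with the midpoint of the customer requirements. The fill weights and limits below are
hypothetical values chosen for illustration.

```python
from statistics import mean, median

# Hypothetical fill weights in grams and hypothetical customer limits.
weights = [498, 501, 503, 502, 504, 505, 503]
lower_limit, upper_limit = 495, 505

# Midpoint of the customer requirements.
target = (lower_limit + upper_limit) / 2

process_mean = mean(weights)
process_median = median(weights)

print(f"mean={process_mean:.1f}, median={process_median}, target={target}")
print(f"mean offset from target: {process_mean - target:+.1f}")
```

Here both the mean and the median sit above the target of 500, which suggests this made-up process is
running toward the upper limit rather than centered between the two.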

The second thing descriptive statistics do is describe the level of dispersion or variation in our
data set. In other words, they help us to understand how much spread exists within our data. This is
useful as it helps us understand how controlled our process is.

You see, a process with a lot of dispersion is often a process out of control. And if the process is out
of control, chances are it’s costing you money and impacting your customers in a negative way.
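
As a minimal sketch, two common ways to put a number on dispersion are the range and the standard
deviation. The two data sets below are hypothetical. They share the same mean so that only the spread
differs, and data_range is a helper name made up for this example.

```python
from statistics import stdev

# Two hypothetical processes with the same mean (10) but different spread.
tight = [9.9, 10.0, 10.1, 10.0, 10.0]
loose = [8.5, 11.2, 9.4, 10.9, 10.0]

def data_range(values):
    # Range: distance between the largest and smallest observation.
    return max(values) - min(values)

for name, values in (("tight", tight), ("loose", loose)):
    # stdev() is the sample standard deviation (divides by n - 1).
    print(f"{name}: range={data_range(values):.2f}, std dev={stdev(values):.2f}")
```

Both measures come out much larger for the second data set even though its center is the same, which is
why central tendency alone is not enough to describe a process.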

