
BASIC STATISTICS

A. CHAPTER OBJECTIVES
B. INTRODUCTION
C. SAMPLING
D. FUNDAMENTALS OF IMPROVEMENT
E. MEASURES OF CENTRAL TENDENCY
F. MEASURES OF VARIABILITY
G. NORMAL PROBABILITY TEST
H. HANDLING NON-NORMAL DATA
I. SAMPLING DISTRIBUTIONS
J. CENTRAL LIMIT THEOREM
K. STANDARD NORMAL PROBABILITY DISTRIBUTION

CHAPTER OBJECTIVES

INTRODUCTION
We want information on the entire population.
Since this is not always possible nor economically
realistic, we use statistics.
Statistics enable us to take a sample from the population
and estimate a characteristic about the population.
It is important to differentiate the sample statistics
(estimations) from the population parameters.

SAMPLING

A sample is a portion (subset) of a larger population about which information is required.
A sample can yield information that can be used to predict characteristics of a population.
Sample statistics provide an estimate of the population parameters.
The figure below displays basic symbols for sample statistics versus population parameters.
Sample Statistics vs. Population Parameters

Statistics (estimate)                     Parameters
\(\bar{X}\) = Sample Mean                 \(\mu\) = Population Mean
\(s\) = Sample Standard Deviation         \(\sigma\) = Population Standard Deviation

FUNDAMENTALS OF IMPROVEMENT
Variability & stability are used to determine the status of a
process.
We use the mean (\(\mu\)) to determine if the process is on target.
We use the standard deviation (\(\sigma\)) to determine variability.
Stability helps us to determine how well a process
performs over time.
Stability is represented by a constant mean and
predictable variability over time.
Every process displays variation; some display controlled variation while others display uncontrolled variation (Walter Shewhart).

VARIATION
Control chart A displays controlled variation; stable &
consistent pattern of variation over time.
Control chart B displays uncontrolled variation; variation
that changes over time.
[Figure: X-Bar Chart for Process A and X-Bar Chart for Process B; x-axis: Sample Number, y-axis: Sample Mean; each chart shows UCL, mean, and LCL lines]

VARIATION - cont.
Variation will be present in any process and can be tolerated if:
The variation of the output is relatively small compared to the
process specifications and the process is on target.
The process is stable over time.

OLD VERSUS NEW WAY OF THINKING
New Way - reduce variation/focus on nominal.
Old Way - as long as the process is within specifications,
everything is ok.
[Figure: Old Thinking vs. New Thinking; panels plot Cost against LSL, Nom, and USL, with an Acceptable region marked]

DATA ANALYSIS TASK



Our data analysis task is to:


Determine if the process is stable.
If the process is not stable, we should identify and remove
the cause(s) of instability.
If the process is stable, we should:
Estimate the magnitude of the total variability.
Identify the sources of the variability.
Reduce the variability.
We will now review some basic statistical concepts to
assist us in our task of data analysis.

MEASURE OF CENTRAL TENDENCY


We will review 3 common measures of central tendency:

Mean
Median
Mode


MEAN
Mean (\(\mu\) for a population, \(\bar{X}\) for a sample) is the arithmetic average of the data values (\(X_1, X_2, X_3, \ldots\)), expressed as follows.
Population: \(\mu = \frac{1}{N}\sum_{i=1}^{N} X_i\)
Sample: \(\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i\)

A sample yields an estimate \(\bar{X}\) of the true mean (\(\mu\)) of the population from which the sample is randomly drawn.
The mean is the most commonly used measure of location (central tendency).
The mean reflects the influence of all values, but is strongly influenced by extreme values.

MEDIAN
A single value from the data set that measures the central item in the data.
It is the middlemost item in the sorted data set.
Half the items lie above this point; the other half lie below it.
Reflects the 50% rank, or center number, after the data set has been sorted from high to low.
The median is robust to extreme values.
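The contrast between the mean and the median can be sketched in a few lines of Python. The cycle-time values below are hypothetical, made up for illustration; they are not from the course data files.

```python
from statistics import mean, median

# Hypothetical cycle times in minutes (illustrative only).
typical = [68, 69, 70, 71, 72]
# Same data, but the last reading is an extreme value.
with_outlier = [68, 69, 70, 71, 172]

mean_typical = mean(typical)
median_typical = median(typical)
mean_outlier = mean(with_outlier)      # pulled upward by the extreme value
median_outlier = median(with_outlier)  # still the middle item

print("Without outlier: mean =", mean_typical, "median =", median_typical)
print("With outlier:    mean =", mean_outlier, "median =", median_outlier)
```

One extreme reading moves the mean from 70 to 90, while the median stays at 70.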


MODE

Represents the most frequently observed value in a sample.
As a result, it is not representative of all the data.
Most useful for data collected on a nominal or ordinal scale (for example, a customer survey with a rating scale of 1 to 5).
Can be used to identify the most important interval when data are classified by frequencies (histogram).
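As a small illustration of the mode, the sketch below tallies a made-up set of survey ratings (hypothetical values, not course data) using Python's standard library. `multimode` is shown because a data set can have more than one most-frequent value.

```python
from collections import Counter
from statistics import mode, multimode

# Hypothetical customer survey ratings on a 1 to 5 scale.
ratings = [4, 5, 3, 4, 2, 4, 5, 1, 4, 3]

most_common_rating = mode(ratings)   # single most frequent value
frequency_table = Counter(ratings)   # counts per rating, like a histogram

# When two values tie for most frequent, multimode returns both.
tied_modes = multimode([1, 1, 2, 2, 3])

print("Mode:", most_common_rating)
print("Frequencies:", dict(sorted(frequency_table.items())))
print("Tied modes:", tied_modes)
```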


MEAN, MEDIAN, MODE

For a normal distribution, the mean, median & mode coincide.
When data is skewed to the left or right, the mode is the highest point, and the median is located between the mode & the mean.
For skewed data, the median is more representative of the location of the distribution than the mean.

[Figure: positions of the mode, median, and mean for a symmetrical distribution, a distribution skewed to the left, and a distribution skewed to the right]

APPLICATION EXERCISE

Minitab can easily calculate the mean & median as follows:
1. Open file Distskew.mtw located in Gbdata.
2. Stat > Basic Statistics > Display Descriptive Statistics.
3. In the dialog box:
Double click on C1, C2 and C3 to enter them into the Variables box.
Click OK.

SESSION WINDOW
Results for: DISTSKEW.MTW
Descriptive Statistics: Norm, Pos Skew, Neg Skew

Variable  N    N*  Mean    SE Mean  StDev   Minimum  Q1      Median  Q3      Maximum
Norm      500  0   70.000  0.447    10.000  29.824   63.412  69.977  76.653  103.301
Pos Skew  500  0   70.000  0.447    10.000  62.921   63.647  65.695  72.821  130.366
Neg Skew  500  0   70.000  0.447    10.000  1.866    67.891  73.783  76.290  77.106

Point out Mean, Median, Minimum and Maximum.
Next slide will run graphs for same data.
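For readers without Minitab, the same summary columns can be approximated with Python's standard library. This is a minimal sketch: the `describe` helper and its small input list are hypothetical, and quartile conventions differ between tools, so Q1 and Q3 may not match Minitab exactly. The last line checks that SE Mean in the session window equals StDev divided by the square root of N.

```python
from math import sqrt
from statistics import mean, median, quantiles, stdev

def describe(values):
    """Return the statistics shown in the session window for one column."""
    n = len(values)
    s = stdev(values)                   # sample standard deviation (n - 1)
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    return {
        "N": n,
        "Mean": mean(values),
        "SE Mean": s / sqrt(n),         # standard error of the mean
        "StDev": s,
        "Minimum": min(values),
        "Q1": q1,
        "Median": median(values),
        "Q3": q3,
        "Maximum": max(values),
    }

# Hypothetical data (not from Distskew.mtw).
summary = describe([62, 65, 68, 70, 71, 73, 75, 76])
for name, value in summary.items():
    print(f"{name:8s} {value}")

# Check against the session window: StDev = 10.000, N = 500.
se_from_slide = 10.000 / sqrt(500)
print("SE Mean from slide values:", round(se_from_slide, 3))
```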

Minitab can also display a histogram of these same data sets as follows:
1. Graph > Histogram.
2. In the Histogram dialog window:
Click Simple.
Click OK.
3. In the Histogram-Simple dialog window:
Double click on C1, C2 and C3 to enter them into the Graph Variables box.
Click OK.

HISTOGRAMS
[Figure: histograms of Neg Skew, Pos Skew, and Norm; y-axis: Frequency]
In Project Manager (Graphs): select all 3 graphs, right click the selected graphs, and select Tile.

MEASURES OF VARIABILITY
Mean, median & mode tell us only part of what we need to
know about the characteristics of data.
We must also measure dispersion (spread) or variability.
Three measures of variability will be reviewed: range, variance, and standard deviation.

RANGE

Difference between the highest & lowest observed values.
Easy to understand, but its usefulness as a measure of dispersion is limited.
Considers only the highest & lowest values; fails to take into account the other data in the set.
Heavily influenced by extreme values.
Inefficient for samples larger than n = 10.
Generally used for developing control limits on process control charts.

VARIANCE
Variance (\(\sigma^2\) population, \(s^2\) sample) - the sum of the squared distances between the mean and each item, divided by the total number of elements in the population (or by \(n - 1\) for a sample).
Formulas:
Population: \(\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}\)
Sample: \(s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}\)
Variance is expressed in the square of the units (for example, squared dollars), which is not easily interpreted.
We have to make a change to the variance to compute a useful measure of deviation.
This measure is called the standard deviation.
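The two variance formulas differ only in the divisor. The sketch below works both out by hand on a small hypothetical data set and compares them with Python's `statistics.pvariance` (divides by N) and `statistics.variance` (divides by n - 1).

```python
from statistics import mean, pvariance, variance

# Hypothetical data values (illustrative only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

center = mean(data)
sum_of_squares = sum((x - center) ** 2 for x in data)

pop_var_by_hand = sum_of_squares / len(data)         # divide by N
samp_var_by_hand = sum_of_squares / (len(data) - 1)  # divide by n - 1

pop_var = pvariance(data)
samp_var = variance(data)

print("Sum of squared distances:", sum_of_squares)
print("Population variance:", pop_var, "by hand:", pop_var_by_hand)
print("Sample variance:    ", samp_var, "by hand:", samp_var_by_hand)
```

The sample variance is always the larger of the two, because the same sum of squares is divided by a smaller number.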

STANDARD DEVIATION
Standard deviation (\(\sigma\) population, \(s\) sample) - quantifies data variability and is the square root of the variance.
Enables us to determine where the values of a frequency distribution are located relative to the mean.
This can be shown with a normal curve and related probability areas.

In these cases we can say that:
About 68% of the values in the population will fall within plus or minus 1 standard deviation of the mean.
About 95% of the values in the population will fall within plus or minus 2 standard deviations of the mean.
About 99.73% of the values in the population will fall within plus or minus 3 standard deviations of the mean.
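These percentages can be checked against the standard normal distribution. A minimal sketch using Python's `statistics.NormalDist`; the helper name `fraction_within` is ours, not from the course material.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Area under the normal curve within plus or minus k standard deviations."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

within = {k: fraction_within(k) for k in (1, 2, 3)}
for k, area in within.items():
    print(f"Within +/- {k} standard deviation(s): {area:.4%}")
```

The exact figures are about 68.27%, 95.45%, and 99.73%, which the slide rounds to 68%, 95%, and 99.73%.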

STANDARD DEVIATION
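Formulas:
Population: \(\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}\)
Sample: \(s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}\)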
An important Six Sigma principle indicates that the total variation (variance) of a process output variable can be partitioned into the variation due to the process inputs as follows:
If \(\sigma^2_{Total}\) = variance of the process output;
\(\sigma^2_{X1}\) = variance due to input variable x1;
\(\sigma^2_{X2}\) = variance due to input variable x2;
Then, \(\sigma^2_{Total} = \sigma^2_{X1} + \sigma^2_{X2}\)
So, \(\sigma_{Total} = \sqrt{\sigma^2_{X1} + \sigma^2_{X2}}\)
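A short numeric sketch of this principle, assuming the inputs vary independently of one another (the condition under which variances add). The input standard deviations 3.0 and 4.0 are hypothetical values chosen for easy arithmetic.

```python
from math import sqrt

def total_sigma(*input_sigmas):
    """Combine independent input standard deviations into the output sigma."""
    return sqrt(sum(s ** 2 for s in input_sigmas))

# Hypothetical standard deviations due to inputs x1 and x2.
sigma_x1, sigma_x2 = 3.0, 4.0

sigma_total = total_sigma(sigma_x1, sigma_x2)
share_x1 = sigma_x1 ** 2 / sigma_total ** 2  # fraction of total variance from x1
share_x2 = sigma_x2 ** 2 / sigma_total ** 2  # fraction of total variance from x2

print("Total standard deviation:", sigma_total)
print(f"Share of variance from x1: {share_x1:.0%}")
print(f"Share of variance from x2: {share_x2:.0%}")
```

Variances add, standard deviations do not: the total here is 5.0, not 7.0, and x2 accounts for 64% of the total variance.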

NORMAL PROBABILITY TEST

We can test whether a given data set can be described as normal with a normal probability test.
If the distribution is close to normal, the normal probability plot will be a straight line.

NORMAL PROBABILITY TEST

Open Distskew.mtw, located in the Gbdata file, and proceed as follows:
1. Stat > Basic Statistics > Normality Test
2. Using the dialog window, produce 3 separate normal probability plots, one each for C1, C2 and C3 (test for normality: Anderson-Darling).
Note: If the normality test shows a P-value equal to or less than 0.05, the data is NOT represented by a normal distribution.
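To show what the Anderson-Darling statistic measures, here is a minimal Python sketch of the A-squared calculation for a normal fit with the mean and standard deviation estimated from the data. It does not reproduce Minitab's p-value. Instead it compares the small-sample adjusted statistic with 0.752, the commonly tabulated 5% critical value for this case, which is an assumption brought in from standard references rather than from the slides. The two test data sets are synthetic: one built from normal quantiles and one from exponential quantiles.

```python
from math import log
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(values):
    """A-squared statistic for a normal fit with estimated mean and StDev."""
    y = sorted(values)
    n = len(y)
    fitted = NormalDist(mean(y), stdev(y))
    cdf = [fitted.cdf(v) for v in y]
    total = sum(
        (2 * i + 1) * (log(cdf[i]) + log(1 - cdf[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n

def adjusted(a_squared, n):
    """Small-sample adjustment applied before comparing with 0.752."""
    return a_squared * (1 + 0.75 / n + 2.25 / n ** 2)

n = 200
positions = [(i + 0.5) / n for i in range(n)]
# Synthetic, evenly spaced quantiles: one bell shaped set, one skewed set.
normal_like = [NormalDist(70, 10).inv_cdf(p) for p in positions]
skewed = [60 - 10 * log(1 - p) for p in positions]

a2_normal = adjusted(anderson_darling_normal(normal_like), n)
a2_skewed = adjusted(anderson_darling_normal(skewed), n)

CRITICAL_5_PERCENT = 0.752  # assumed tabulated value, see lead-in
print("Normal-like data: A2* =", round(a2_normal, 3), "reject:", a2_normal > CRITICAL_5_PERCENT)
print("Skewed data:      A2* =", round(a2_skewed, 3), "reject:", a2_skewed > CRITICAL_5_PERCENT)
```

A statistic above the critical value corresponds to a P-value below 0.05, which is the rejection rule stated in the note.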

Normality Test Dialog Window
Note: We use the Anderson-Darling test.
Run once for each of Norm, Pos Skew, and Neg Skew.

NORMAL PROBABILITY TEST RESULTS

[Figure: normal probability plots of Norm, Neg Skew, and Pos Skew; y-axis: Percent]

Variable  Mean   StDev  N    AD      P-Value
Norm      70.00  10.00  500  0.418   0.328
Neg Skew  70.00  10.00  500  44.491  <0.005
Pos Skew  70.00  10.00  500  46.489  <0.005

What are the results?
If the normality test shows a P-value equal to or less than 0.05, the data is NOT represented by a normal distribution.

MYSTERY DISTRIBUTION
Following the previous procedure, generate a normal probability plot for the Mystery variable in column C4 of the Distskew.mtw file.
P-Value is less than 0.05; distribution is nonnormal.
NOTE: Observe the bimodal distribution in the plot.

Graphical Summary
Next, create a histogram and descriptive statistics summary of the Mystery data located in C4 as follows.
Stat > Basic Statistics > Graphical Summary
Enter C4 into the Variables window.
Click OK.

RESULT
Minitab provides a histogram along with the related
descriptive statistics.
Summary for Mystery

[Figure: histogram of Mystery with 95% confidence intervals for the mean and median]

Anderson-Darling Normality Test
A-Squared     27.11
P-Value <     0.005

Mean          100.00
StDev         32.38
Variance      1048.78
Skewness      0.00716
Kurtosis      -1.63184
N             500

Minimum       41.77
1st Quartile  68.69
Median        104.20
3rd Quartile  130.81
Maximum       162.82

95% Confidence Interval for Mean    97.15 to 102.85
95% Confidence Interval for Median  82.78 to 117.66
95% Confidence Interval for StDev   30.49 to 34.53

HANDLING NONNORMAL DATA

Nonnormal distributions are common for some measurements. Minitab can be utilized to analyze the capability or performance of a process using nonnormal data.
First, you should attempt to determine the cause(s) of the nonnormal data.
Typical examples:
Two different machines produce a bimodal distribution. As a result, analyze the data for each machine separately.
Data comes from an unstable process. As a result, the process must be stabilized before reliable statistical results can be obtained.

HANDLING NONNORMAL DATA

In instances where the process is stable and predictable and the data proves to be nonnormal, there are a couple of options:
Normalize the data via a transformation (transformations are beyond the scope of our training).
Utilize a nonnormal probability model (Weibull, lognormal, exponential, etc.) to analyze overall capability (Pp, Ppk, PPU, and PPL).
Prior to using any data, we analyze its normality.
Verify whether the data are normal using Minitab (Stat > Basic Statistics > Normality Test).
If the data proves to be nonnormal, utilize Minitab's Stat > Quality Tools > Individual Distribution Identification.
This allows you to evaluate the optimal distribution for your data based on probability plots and goodness-of-fit tests prior to conducting a capability analysis study.

NONNORMAL DATA EXERCISE

Open a new project (close without saving):
File > New > Project > OK
Open Cltest.mtw, located in your Gbdata file, and follow along as we go through this exercise.
First we will determine the normality of the data using:
Stat > Basic Statistics > Normality Test
Select Variable C3 Dist3

[Figure: normal probability plot of Dist3; x-axis: Dist3, y-axis: Percent]

Mean     0.9100
StDev    0.8654
N        500
AD       19.095
P-Value  <0.005

Results: P-value < 0.005 indicates the data is nonnormal. As a result, we will proceed to identify the optimal distribution for our data.
INDIVIDUAL DISTRIBUTION IDENTIFICATION
Using Cltest.mtw, perform an Individual Distribution Identification test as follows:
1. Stat > Quality Tools > Individual Distribution Identification
2. In the dialog window:
Enter C3 Dist3 into Single column.
Select Specify.
Use the default distributions (Normal, Exponential, Weibull, Gamma).
Note: We will use these settings to simplify the example. We could have used the Use all distributions option to look at 10 additional distributions. However, in this instance we know that one of these will best fit the distribution.
Click OK.

INDIVIDUAL DISTRIBUTION IDENTIFICATION


Review the session window - Goodness of Fit Test section.
Point out that 3 distributions exhibit a good fit to the fitted line as identified by the Anderson-Darling (AD) statistic.
The AD statistic is a measure of how far the plot points fall from the fitted line in a probability plot.
The smaller the AD statistic, the better the fit!
Point out the p-value and note that a p-value greater than alpha (.05) means we cannot reject the hypothesis that the data follow that distribution.
Exponential: AD = 1.032, P = 0.109
Review the distribution curves on each plot and point out specifically the Exponential, Weibull and Gamma distributions.
Each plot is similar, the difference being the confidence interval (outside lines).
Statistically, the Exponential is the best fit.
Remember: the p-value is the probability of seeing a fit at least this poor if the data really came from that distribution; it is not the probability that the data come from that distribution.

CAPABILITY ANALYSIS
Now that we are comfortable that the Exponential distribution is a reasonable fit for our data, we are able to perform a capability analysis.
Using Cltest.mtw, we will perform a capability analysis for nonnormal data and fit the data with the Exponential distribution.
1. Stat > Quality Tools > Capability Analysis > Nonnormal
2. In the dialog window:
Enter C3 Dist3 into Single column.
In Fit data with, select Distribution, then select Exponential.
Lower spec: enter 0.
Upper spec: enter 3.
Click OK.
Next slide has graphs.

Exponential Distribution Model
Pp = 0.50
Ppk = 0.44
Exp. Overall Performance = 37000.5 PPM

[Figure: Process Capability of Dist3, calculations based on Exponential distribution model; histogram with LSL and USL marked]

Process Data
LSL          0.00000
Target       *
USL          3.00000
Sample Mean  0.90997
Sample N     500
Mean         0.90997

Overall Capability
Pp   0.50
PPL  1.00
PPU  0.44
Ppk  0.44

Exp. Overall Performance
PPM < LSL  0.0
PPM > USL  37000.5
PPM Total  37000.5

Observed Performance
PPM < LSL  0
PPM > USL  32000
PPM Total  32000

We are able to predict the long term process capability.
Note: Short term capability is not calculated for nonnormal data.
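The overall capability figures above can be reproduced approximately in Python. This is a sketch under two assumptions that the slides do not state: the exponential model is fitted with its mean set to the sample mean (0.90997), and the indices use percentile-based definitions built on the 0.135th, 50th, and 99.865th percentiles of the fitted distribution. With those assumptions the results agree with the reported values to the precision shown.

```python
from math import exp, log

def exp_quantile(p, mean):
    """Quantile of an exponential distribution with the given mean."""
    return -mean * log(1 - p)

mean = 0.90997        # sample mean from the Process Data table
lsl, usl = 0.0, 3.0   # specification limits entered in the dialog

x_low = exp_quantile(0.00135, mean)   # 0.135th percentile
x_med = exp_quantile(0.5, mean)       # median
x_high = exp_quantile(0.99865, mean)  # 99.865th percentile

pp = (usl - lsl) / (x_high - x_low)
ppl = (x_med - lsl) / (x_med - x_low)
ppu = (usl - x_med) / (x_high - x_med)
ppk = min(ppl, ppu)

# Expected parts per million above USL under the fitted exponential model.
ppm_above_usl = 1e6 * exp(-usl / mean)

print(f"Pp  = {pp:.2f}")
print(f"PPL = {ppl:.2f}")
print(f"PPU = {ppu:.2f}")
print(f"Ppk = {ppk:.2f}")
print(f"Expected PPM > USL = {ppm_above_usl:.1f}")
```

Because the exponential distribution is bounded below by 0 and LSL = 0, the expected PPM below LSL is 0, and Ppk is driven entirely by the upper specification.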

SAMPLING DISTRIBUTIONS
If we selected 10 groups of 25 samples from a continuous process & computed the mean and standard deviation of the length for each sample group, the mean and standard deviation of each sample group would be different.
Sampling distribution of the mean - a probability distribution of all the possible means of the samples.
A sampling distribution can be partially described by its mean and standard deviation.
Rather than say standard deviation of the distribution of sample means, we call it the standard error of the mean.
The standard error indicates the size of the chance error and the accuracy we will likely get if we use a sample statistic to estimate a population parameter.

EXAMINING SAMPLING DISTRIBUTIONS

Population A is a distribution with mean (\(\mu\)) and standard deviation (\(\sigma\)).
In B, we take ongoing samples of 10 and calculate the mean & standard deviation for each sample.
The sample means would not be the same as the population mean.
C is a distribution of all the means from every sample taken.
This distribution is called the sampling distribution of the mean.

[Figure: A. population distribution; B. distributions of the individual samples; C. sampling distribution of the mean]

SAMPLING DISTRIBUTION OF THE MEAN

The sampling distribution has a mean equal to the population mean (\(\mu_{\bar{X}} = \mu\)).
The sampling distribution has a standard deviation (a standard error) equal to the population standard deviation divided by the square root of the sample size (\(\sigma_{\bar{X}} = \sigma / \sqrt{n}\)).
The sampling distribution is normally distributed when the population is normal, and approximately so for sufficiently large samples otherwise.
The equation for the standard error (standard deviation) of the mean for an infinite population is:
\(\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}\)
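The standard error formula is easy to check numerically. The sketch below applies it to the values from the earlier descriptive statistics output for the Norm column (StDev = 10.000, N = 500), where Minitab reported SE Mean = 0.447. The subgroup size of 4 in the second call is a hypothetical example.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / sqrt(n)

# StDev and N from the Norm column of the descriptive statistics output.
se_mean = standard_error(10.000, 500)

# Hypothetical subgroup size: averaging 4 values halves the spread.
se_subgroup_of_4 = standard_error(10.000, 4)

print("SE Mean for N = 500:", round(se_mean, 3))
print("Standard error for subgroups of 4:", se_subgroup_of_4)
```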

CENTRAL LIMIT THEOREM

The mean of the sampling distribution will equal the population mean even if the population is non-normal (regardless of sample size).
The relationship between the shape of the population distribution and the shape of the sampling distribution of the mean is called the Central Limit Theorem.
This theorem is perhaps the most important theorem in all of statistical inference and is the basis upon which control charts work.
It assures us that the form of the distribution of sample means approaches the form of the normal distribution as the sample size increases.

CENTRAL LIMIT THEOREM


What this means is that...
If you have a group of data whose distribution has any shape, and you create subgroups out of that data, the distribution of the averages of those subgroups will always be a narrower and more normally shaped distribution.
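The effect can be demonstrated with a small simulation. This sketch assumes a strongly right-skewed (exponential) population generated with Python's `random` module, and a subgroup size of 5; the `skewness` helper is our own simple moment-based measure, not a Minitab function.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the run is repeatable

# Simulated right-skewed population (exponential with mean 1).
population = [random.expovariate(1.0) for _ in range(5000)]

# Split into consecutive subgroups of 5 and average each subgroup.
subgroup_size = 5
subgroup_means = [
    mean(population[i:i + subgroup_size])
    for i in range(0, len(population), subgroup_size)
]

def skewness(values):
    """Simple moment-based skewness; 0 for a symmetric distribution."""
    m, s, n = mean(values), stdev(values), len(values)
    return sum(((v - m) / s) ** 3 for v in values) / n

sd_individuals = stdev(population)
sd_means = stdev(subgroup_means)

print("StDev of individuals:    ", round(sd_individuals, 3))
print("StDev of subgroup means: ", round(sd_means, 3))
print("Skewness of individuals: ", round(skewness(population), 2))
print("Skewness of subgroup means:", round(skewness(subgroup_means), 2))
```

The subgroup means show a smaller standard deviation (narrower) and a skewness closer to 0 (more normally shaped) than the individual values.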

GRAPHICAL EXERCISE
Turn to the population graphical
exercise located on the next page of
your student manual and proceed as
follows:
1. Select 2 dots (at random).
2. Using the selected dots, draw a
new dot in-between the two.
3. Repeat steps 1 and 2 until all
preprinted dots are used only
once.
4. Circle the new dots, ignoring the
original dots.


GRAPHICAL EXERCISE

Questions:
1. Is the spread of the new population different from the original?
2. What about the shape?
3. What differences are there between the original population of dots and the population resulting from the subgroups?

CONTROL CHART EXERCISE


This exercise looks at the effects of the central limit theorem on 2 different SPC charts using the same data.

1. Open Cenlimit.mtw, located in your Gbdata file.
2. Perform 3 analyses:
A. Choose Stat > Control Charts > Variables Charts for Individuals > Individuals.
Variables: select C1
B. Choose Stat > Control Charts > Variables Charts for Subgroups > Xbar.
All observations for a chart are in one column
Select C1 Output
Subgroup sizes: enter 5
Select Xbar Options
Select Storage tab
Select Point Plotted (stores the subgroup mean in the worksheet for analysis)
C. Choose Stat > Basic Statistics > Display Descriptive Statistics
In field Variables: select C1 Output and C2 PPOI1
Click Graphs
Uncheck First quartile and Third quartile
3. Students are to investigate the upper and lower control limits.
How do they compare?
Why the difference; after all, it's the same data?

Provide enough time for students to review, and allow students to display and discuss their opinions of the results before going to the next slide!

RESULTS OF CONTROL CHART EXERCISE


[Figure: Xbar Chart of Output (y-axis: Sample Mean, x-axis: Sample) and I Chart of Output (y-axis: Individual Value, x-axis: Observation)]

Chart                 LCL    Center line  UCL
Xbar Chart of Output  55.86  68.28        80.70
I Chart of Output     39.97  68.28        96.59

Control limits are tighter on the Xbar chart.
The standard deviation is smaller on the Xbar chart than on the individuals chart.

Descriptive Statistics
Variable  N    N*  Mean    SE Mean  StDev  Minimum  Median  Maximum
Output    150  0   68.280  0.776    9.498  43.000   68.000  92.000
PPOI1     30   0   68.280  0.858    4.701  58.000   67.600  80.800
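One way to see the central limit effect in these charts is to back out the standard deviation each chart implies. The sketch below assumes both charts use the conventional three-sigma limits (the slides do not state this), so the distance from LCL to UCL is six standard deviations. The control limit values are copied from the charts above.

```python
from math import sqrt

# Control limits read from the two charts of the same Output data.
i_chart = {"UCL": 96.59, "center": 68.28, "LCL": 39.97}
xbar_chart = {"UCL": 80.70, "center": 68.28, "LCL": 55.86}
subgroup_size = 5

def implied_sigma(chart):
    """Sigma implied by the limits, assuming UCL and LCL sit at +/- 3 sigma."""
    return (chart["UCL"] - chart["LCL"]) / 6

sigma_individuals = implied_sigma(i_chart)
sigma_means = implied_sigma(xbar_chart)
ratio = sigma_individuals / sigma_means

print("Implied sigma, individuals:   ", round(sigma_individuals, 2))
print("Implied sigma, subgroup means:", round(sigma_means, 2))
print("Ratio:", round(ratio, 2), "vs. sqrt(5) =", round(sqrt(subgroup_size), 2))
```

The ratio is close to, but not exactly, the square root of the subgroup size, since the two charts estimate sigma from the data in different ways.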

TEAM EXERCISE - CENTRAL LIMIT APPLICATION
Working as a team, you will analyze 2 different populations and 2 data sets containing the means of the subgroups from the 2 populations. Be prepared to display and discuss results.
DO NOT LOOK AT THE RESULTS IN YOUR STUDENT MANUAL UNTIL YOU COMPLETE THE EXERCISE.
1. Open the file Cltest.mtw located in Gbdata.
2. Analyze columns C1 and C2 against C7 and C8.
Note: using column C5 as a subgroup reference, 2 data sets containing the means of the subgroups were created (Mean1 and Mean2).
3. Use a flipchart or computer to investigate, note and report:
Mean and standard deviation of these groups; what is the difference?
What is the relation between the individual standard deviations (C1 and C2) and the standard deviations of the means (C7 and C8)?
Create a normal probability plot for both sets and compare.

TEAM EXERCISE RESULTS


Session Window:
Descriptive Statistics: Dist1, Dist2, Mean1, Mean2

Variable  N    N*  Mean     SE Mean  StDev    Minimum  Median   Maximum
Dist1     500  0   0.90016  0.00445  0.09952  0.56399  0.89696  1.24185
Dist2     500  0   0.90005  0.00291  0.06497  0.62989  0.91351  0.99842
Mean1     100  0   0.90016  0.00408  0.04082  0.79356  0.90219  0.97541
Mean2     100  0   0.90005  0.00311  0.03106  0.82731  0.90221  0.97392
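The relation between the individual and subgroup-mean standard deviations can be checked directly from the session window values. This sketch assumes equal subgroup sizes, inferred from the counts (500 individual values giving 100 means), and compares each observed standard deviation of the means with the individual standard deviation divided by the square root of the subgroup size.

```python
from math import sqrt

# 500 individual values produced 100 subgroup means.
subgroup_size = 500 // 100

# (StDev of individuals, StDev of subgroup means) from the session window.
observed = {
    "Dist1 / Mean1": (0.09952, 0.04082),
    "Dist2 / Mean2": (0.06497, 0.03106),
}

predicted = {}
for name, (sd_individuals, sd_means) in observed.items():
    predicted[name] = sd_individuals / sqrt(subgroup_size)
    print(f"{name}: predicted {predicted[name]:.4f}, observed {sd_means:.4f}")
```

The predicted and observed values are close but not identical, as expected for estimates computed from finite samples.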

TEAM EXERCISE RESULTS


DIST1 vs. MEAN1 - Normal Plot

[Figure: normal probability plots of Dist1 and Mean1; y-axis: Percent]

Variable  Mean    StDev    N    AD     P-Value
Dist1     0.9002  0.09952  500  0.213  0.852
Mean1     0.9002  0.04082  100  0.404  0.348

TEAM EXERCISE RESULTS


DIST2 vs. MEAN2 - Normal Plot

[Figure: normal probability plots of Dist2 and Mean2; y-axis: Percent]

Variable  Mean    StDev    N    AD      P-Value
Dist2     0.9001  0.06497  500  10.132  <0.005
Mean2     0.9001  0.03106  100  0.478   0.232

STANDARD NORMAL PROBABILITY DISTRIBUTION

In Chapter 8, you were introduced to the standard normal table for determining the area under the normal curve and how to determine a Sigma level (Z value).
We can also use the normal table to compute the probability (area under the curve) of being within a certain distance (i.e., spec limits) from the mean in units of standard deviation (Z values).
The Z standard transform equation produces a value from a distribution where mean = 0 and \(\sigma\) = 1.
\(Z = \frac{X - \bar{X}}{\sigma}\)
The Z value indicates how far the number is from the mean in units of standard deviations.
For estimating a process yield, we can substitute the upper and lower spec limits for X in the equation. We can then calculate the proportion of product that is out-of-spec.

Z TRANSFORM EXAMPLE

Let's determine estimates of the proportion of the normal curve that is outside of the upper and lower specs where:
Mean = 1.03
\(\sigma\) = .0573
LSL = .90
USL = 1.10
These spec limits are displayed below.

Z TRANSFORM EXAMPLE
To perform the calculations, let's proceed as follows.
1. Calculate the Z score for each specification limit (upper and lower).
\(Z_{LSL} = \frac{LSL - \bar{X}}{\sigma} = \frac{.9 - 1.03}{.0573} = -2.27\)
\(Z_{USL} = \frac{USL - \bar{X}}{\sigma} = \frac{1.1 - 1.03}{.0573} = 1.22\)
2. Calculate the areas below the lower specification and above the upper specification using the normal table.
Table A (Area Under the Standardized Normal Curve), located in the Gbdata file, gives us an area of .0116 (1.16%) for a Z value of 2.27 (disregard the negative sign), and it gives us an area of .1112 (11.12%) for a Z value of 1.22.
If we add these 2 areas under the curve together, we get 12% (.0116 + .1112 = .1228 or 12%).
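The same calculation can be done without a printed table. The sketch below uses Python's `statistics.NormalDist` in place of Table A and rounds each Z score to two decimals, as a table lookup would; the variable names are ours.

```python
from statistics import NormalDist

mean, sigma = 1.03, 0.0573
lsl, usl = 0.90, 1.10

# Z scores for each specification limit, rounded as for a table lookup.
z_lsl = round((lsl - mean) / sigma, 2)
z_usl = round((usl - mean) / sigma, 2)

std_normal = NormalDist()  # mean 0, standard deviation 1
area_below_lsl = std_normal.cdf(z_lsl)       # left-hand tail
area_above_usl = 1 - std_normal.cdf(z_usl)   # right-hand tail
total_out_of_spec = area_below_lsl + area_above_usl

print("Z for LSL:", z_lsl, "area below:", round(area_below_lsl, 4))
print("Z for USL:", z_usl, "area above:", round(area_above_usl, 4))
print("Total out of spec:", round(total_out_of_spec, 4))
```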

Z TRANSFORM EXAMPLE
This is shown graphically below.

[Figure: normal curve centered at \(\bar{X}\) = 1.03, with area .0116 below LSL = .9 (Z = -2.27) and area .1112 above USL = 1.1 (Z = 1.22)]

Z TRANSFORM EXAMPLE - cont.

1. Determine the sigma level (Z score).
To calculate the sigma level (Z score) for this process, we proceed as follows.
Add the areas outside the upper and lower specification limits (.0116 + .1112 = .1228).
Using Minitab's Inverse Cumulative Distribution function, we will convert the area outside of the specification to a Z score. This means calculating the Z score (sigma level) for .1228 based upon the standard normal distribution that has a mean = 0 and a \(\sigma\) = 1.
Note: You could also use any of the other methods previously reviewed to determine the sigma value (i.e., normal table, etc.).

Z TRANSFORM EXAMPLE - cont.


To calculate a Z score (sigma level) for .1228, proceed as follows.
Calc > Probability Distributions > Normal
Click on Inverse cumulative probability with mean = 0, standard deviation = 1.
Click on Input constant and enter .1228.
Click OK.
Result:
P(X <= x) = .1228
x = -1.1611 (sigma level)
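The inverse cumulative probability step has a direct equivalent in Python's standard library. A minimal sketch using `NormalDist.inv_cdf`:

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

# Z value that has an area of .1228 to its left.
z = std_normal.inv_cdf(0.1228)

print("Z for a left-tail area of .1228:", round(z, 4))
```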

Z TRANSFORM EXAMPLE - cont.


The sigma level of -1.1611 is graphically displayed below.

[Figure: standard normal curve with area .1228 to the left of Z = -1.161]

INDIVIDUAL Z TRANSFORM EXERCISE


Now it's your turn! Given the following situation, determine the required probability using the Z transform. Be prepared to explain and display your results.
DO NOT LOOK AT THE RESULTS IN YOUR STUDENT MANUAL UNTIL YOU HAVE COMPLETED THE EXERCISE!
The Finance Director for a company claims that the sum of the monthly customer payments, in millions of dollars, received by Accounts Payable on the first day of the month is normally distributed with a mean of $10.1 million and a standard deviation of $2.6 million.
A. Find the probability that, on the first day of a randomly selected month, the payment receipts would be less than $6 million.
B. Find the probability that, on the first day of a randomly selected month, the payment receipts would be between $6 million and $14 million.

RESULTS OF INDIVIDUAL
Z TRANSFORM EXERCISE
Results of A are as follows:
First, calculate the Z value for X = $6 million, where:
\(Z = \frac{X - \bar{X}}{\sigma} = \frac{6 - 10.1}{2.6} \approx -1.57\)
Next, referring to Table A (Gbdata file), Z = -1.57 corresponds to an area of .0582 or 5.82%.

RESULTS OF INDIVIDUAL
Z TRANSFORM EXERCISE
The area shown below lies between Z = -1.57 and the left-hand tail (disregard the minus sign).

[Figure: normal curve centered at 10.1, with area .0582 to the left of 6 (Z = -1.57)]

RESULTS OF INDIVIDUAL
Z TRANSFORM EXERCISE
Results of B are as follows:
First, calculate the Z value for X = $6 million.
Z = -1.57 (as indicated in A)
Then, calculate the Z value for X = $14 million as follows.
\(Z = \frac{X - \bar{X}}{\sigma} = \frac{14 - 10.1}{2.6} = 1.5\)
Next, using Table A, find the probabilities for each of the Z values.
Z = -1.57 corresponds to an area of .0582. This is the area between Z = -1.57 and the left-hand tail.
Z = 1.5 corresponds to an area of .0668. This is the area between Z = 1.5 and the right-hand tail.

RESULTS OF INDIVIDUAL
Z TRANSFORM EXERCISE
Lastly, add the 2 probabilities together and subtract from 1 to determine the area between Z = -1.57 and Z = 1.5.
.0582 (area between Z = -1.57 and the left-hand tail)
+ .0668 (area between Z = 1.5 and the right-hand tail)
= .1250 (total area below Z = -1.57 and above Z = 1.5)
1 - .1250 = .875 or 87.5% (area between Z = -1.57 and Z = 1.5)
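Both parts of the exercise can be cross-checked in Python. This sketch works directly with the stated mean and standard deviation rather than a rounded Z value, so the results differ slightly in the third or fourth decimal place from the table-based answers above.

```python
from statistics import NormalDist

# Payment receipts in millions of dollars, as claimed by the Finance Director.
receipts = NormalDist(mu=10.1, sigma=2.6)

# A. Probability that receipts are less than $6 million.
p_less_than_6 = receipts.cdf(6)

# B. Probability that receipts are between $6 million and $14 million.
p_between_6_and_14 = receipts.cdf(14) - receipts.cdf(6)

print("A. P(X < 6)      =", round(p_less_than_6, 4))
print("B. P(6 < X < 14) =", round(p_between_6_and_14, 4))
```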


This is depicted graphically as shown.

[Figure: normal curve centered at 10.1, with area .0582 below 6 (Z = -1.57), area .0668 above 14 (Z = 1.5), and area .875 between them]