
Discrete Data Analysis


The table below summarizes the methods used to analyze discrete
data.
Z is a value from the normal distribution for the required level of confidence.

Z       2-Sided Confidence Level    1-Sided Confidence Level
1.282   80.00%                      90.00%
1.645   90.00%                      95.00%
1.960   95.00%                      97.50%
2.326   98.00%                      99.00%
2.576   99.00%                      99.50%
Method selection for proportions:

1 Proportion:
- Normal Approximation (large n (sample size), p not too close to 0 or 1,
  np>10 and n(1-p)>10): $\hat{p} \pm Z\sqrt{\hat{p}(1-\hat{p})/n}$
- Poisson Approximation (large n (sample size), small proportion defective,
  p<0.10): Poisson Confidence Interval
- Otherwise: Exact Binomial Test

Comparing 2 Proportions:
- Normal Approximation: $(\hat{p}_1 - \hat{p}_2) \pm Z\sqrt{\bar{p}(1-\bar{p})(1/n_1 + 1/n_2)}$

More than 2 Proportions (and 2-way tables):
- $\chi^2$ (Chi-square)
Normal Approximation - One Proportion
Use this approximation when the sample size is large and the number of
defects in the sample is greater than 10 (np>10), and the number of good
parts in the sample is greater than 10 (n(1-p)>10).
$\hat{p} = \frac{\#\ \text{Defects}}{\text{Sample Size}}$
A two sided confidence interval for the proportion (p) that are defective in
the population is given by the equation:
$\hat{p} \pm Z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
The result provides the lower and upper limits of the range of all
plausible values for the proportion defective in the population.
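Outside MINITAB, this interval is easy to compute directly. This is a minimal Python sketch, not part of the original guide; the defect counts are made-up illustration values.

import math

# Normal-approximation CI for one proportion: p-hat +/- Z*sqrt(p-hat(1-p-hat)/n).
# Valid when np > 10 and n(1-p) > 10.
def proportion_ci(defects, n, z=1.960):        # z = 1.960 for 95% 2-sided
    p_hat = defects / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

low, high = proportion_ci(30, 150)             # illustrative: 30 defects in 150 parts
print(f"95% CI for p: ({low:.4f}, {high:.4f})")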
Comparing Two Proportions

If evaluating two different sample sets with proportion-defective data, the
confidence interval for the difference in proportions defective between the
two sample sets is given by:

$(\hat{p}_1 - \hat{p}_2) \pm Z\sqrt{\bar{p}(1-\bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$

Where, if $k_i$ = # of defects in the i-th sample and $n_i$ = sample size of the
i-th sample:

$\hat{p}_1 = k_1/n_1$,  $\hat{p}_2 = k_2/n_2$,  $\bar{p} = \frac{k_1 + k_2}{n_1 + n_2}$

The result provides the lower and upper limits of the range of all plausible
values of the difference between the proportions defective in the two
populations. If 0 is included within the range of plausible values, then there
is not strong evidence that the proportions of defects in the two populations
are different.
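A matching sketch for the two-proportion interval, again with illustrative, made-up counts:

import math

def two_proportion_ci(k1, n1, k2, n2, z=1.960):
    p1, p2 = k1 / n1, k2 / n2
    p_bar = (k1 + k2) / (n1 + n2)              # pooled proportion, as defined above
    hw = z * math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    return (p1 - p2) - hw, (p1 - p2) + hw

low, high = two_proportion_ci(k1=24, n1=200, k2=12, n2=180)
print(f"95% CI for p1 - p2: ({low:.4f}, {high:.4f})")
# If 0 lies inside this interval, there is not strong evidence that the
# two population proportions differ.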
Confidence Intervals
A confidence interval is a range of plausible values for a population parameter, such as
the mean of the population, $\mu$.
For example, a test of 8 units might give an average efficiency of 86.2%. This is the
most likely estimate of the efficiency of the entire population. However, observations
vary, so the true population efficiency might be somewhat higher or lower than 86.2%. A
95% confidence interval for the efficiency might be (81.2%, 91.2%). 95% of the intervals
constructed in this manner will contain the true population parameter.
The confidence interval for the mean of one sample is:

$\bar{x} \pm t\frac{s}{\sqrt{n}}$

t comes from the t tables (Page 65) with n-1 degrees of freedom and with the desired
level of confidence.

The confidence interval for the difference in the means of 2 samples, if the variances
of the 2 samples are assumed to be equal, is:

$(\bar{x}_1 - \bar{x}_2) \pm t \cdot s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$

t comes from the t tables (Page 65) with n1+n2-2 degrees of freedom, and with the
desired level of confidence. Sp is the pooled standard deviation:

$s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$

In MINITAB, confidence intervals are calculated using the 1-Sample and 2-Sample t
methods, above. In the text output shown below, the 95% confidence interval for the
difference between the mean of Manu_a and the mean of Manu_b is 6.65 to 8.31. Since 0
is not within this interval, this statement points to accepting Ha, that the means
are different.

95% CI for mu Manu_a - mu Manu_b: ( 6.65, 8.31)
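As a cross-check on the t-based intervals above, here is a short Python sketch. The eight readings echo the 86.2% efficiency example but are invented values; scipy's t.ppf supplies the t percentile in place of the printed t table.

import math
import statistics
from scipy import stats

def one_sample_ci(x, conf=0.95):
    n = len(x)
    t = stats.t.ppf(1 - (1 - conf) / 2, df=n - 1)   # replaces the t table lookup
    hw = t * statistics.stdev(x) / math.sqrt(n)
    m = statistics.mean(x)
    return m - hw, m + hw

def two_sample_ci(x1, x2, conf=0.95):
    n1, n2 = len(x1), len(x2)
    s1, s2 = statistics.stdev(x1), statistics.stdev(x2)
    # pooled standard deviation Sp, as defined above
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    t = stats.t.ppf(1 - (1 - conf) / 2, df=n1 + n2 - 2)
    hw = t * sp * math.sqrt(1 / n1 + 1 / n2)
    diff = statistics.mean(x1) - statistics.mean(x2)
    return diff - hw, diff + hw

print(one_sample_ci([85.1, 87.0, 86.4, 85.9, 86.8, 85.5, 87.2, 86.2]))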
Using the 1 Sample t test
Run Stat > Basic Statistics > 1-Sample t. In the dialog box, identify the variable or
variables to be tested.
Select the test to be performed, Confidence Interval or Test Mean.
If the Confidence Interval is to be calculated at other than a value of 95%, change to
the appropriate number.
If Test Mean is selected, identify the desired mean to be tested (the mean of the null
hypothesis) and, in the alternative box, select the alternative hypothesis which is
appropriate for the analysis. This will determine the test used for the analysis (one
tailed or two tailed).
If graphic output is needed, select the graphs button and choose among Histogram,
Dotplot and Boxplot output.
Click OK to run the analysis.
Analyzing the test results.
If running the Confidence Interval option, Minitab will calculate the t statistic and will
calculate a confidence interval for the data.
If using the Test Mean option, Minitab will provide descriptive statistics for the tested
distribution(s), the t statistic and a p value.
The graphic outputs will all have a graphical representation of the confidence interval of
the mean, shown by a red line with a dot at the mean for the sample population.
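For comparison outside Minitab, a one-sample t test can be run in a few lines of Python; the data and target below are illustrative, not from the guide.

from scipy import stats

data = [85.1, 87.0, 86.4, 85.9, 86.8, 85.5, 87.2, 86.2]
t_stat, p_value = stats.ttest_1samp(data, popmean=85.0)   # two-tailed by default
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
# p < alpha (e.g., 0.05) -> reject Ho; the mean differs from the target.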
$\chi^2$ Test for Independence

The $\chi^2$ test for independence tests the null hypothesis (Ho) that two
discrete variables are independent.
Data relating two discrete variables are used to create a contingency table.
For each of the cells in the contingency table the observed frequency is
compared to the expected frequency in order to test for independence.
The expected frequencies in each cell must be at least five (5) for the $\chi^2$
test to be valid.
For continuous data, it is best to test for dependency, or correlation, by using
scatter plots and regression analysis.
Example of $\chi^2$ Analysis
1. There are 2 variables to be studied, height and weight. The null hypothesis Ho is
that weight is independent of height.
2. For each variable 2 conditions (categories) are defined.
Weight: < 140 lbs, > 140 lbs
Height: < 56, > 56
3. The data has been accumulated as shown below:
                        Height below 56   Height above 56   Row Totals
Weight below 140 LBS    20                13                33
Weight above 140 LBS    11                22                33
Column Totals           31                35                N=66
Manual $\chi^2$ Calculation
1. Compute fexp for each cell ij: fexp,ij = (Row Total)i x (Column Total)j / N.
2. N is the total of all fobs for all 4 cells. (For our example, N = 66 and
   fexp,1,2 = (33*35)/66 = 1155/66 = 17.5.)
3. Calculate $\chi^2_{calc}$, where $\chi^2_{calc} = \sum (f_{obs} - f_{exp})^2 / f_{exp}$ = 4.927.
4. Calculate the degrees of freedom df = (Number of Rows - 1)(Number of
   Columns - 1). For our example, df = (2-1)*(2-1) = 1.
5. Determine $\chi^2_{crit}$ from the $\chi^2$ table for the degrees of freedom and confidence
   level desired (usually 5% risk). For 1 df and 5% risk, $\chi^2_{crit}$ = 3.841.
6. If $\chi^2_{calc} > \chi^2_{crit}$, then reject Ho and accept Ha, i.e., that weight depends on
   height. In this example, we reject Ho.
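The same table can be checked outside Minitab; scipy's chi2_contingency reproduces the manual steps above (expected counts, chi-square statistic and df) and supplies the p value as well.

from scipy.stats import chi2_contingency

table = [[20, 13],    # Weight below 140 LBS: height below 56, height above 56
         [11, 22]]    # Weight above 140 LBS
chi2, p, df, expected = chi2_contingency(table, correction=False)
print(chi2, p, df)    # ~4.927, ~0.026, 1 -- matching the manual result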
Using Minitab to perform $\chi^2$ Analysis
Minitab can be used to analyze data using $\chi^2$ with two different processes:
Stat>Tables>Chi Square Test and Stat>Tables>Cross Tabulation. Chi Square
Test analyzes data which is in a table. Cross Tabulation analyzes data which is
in columns with subscripted categories. Since Minitab commonly needs data in
columns to graph, Cross Tabulation is the preferred method for most analyses.
t Test
A t test tests the hypothesis that the means of two distributions are equal. It can be
used to demonstrate a shift of the mean after a process change. If there has
been a change to a process, and it must be determined whether or not the mean
of the output was changed, compare samples before and after the change using
the t test.
Your ability to detect a shift (or change) is improved by increasing the size of
your samples and by increasing the size of the shift (or change) that you are
trying to detect, or by decreasing the variation (See Sample size; page 27 -
28).
There are two tests for means, a One Sample t test and a Two Sample t
test.
The one sample t test (Stat > Basic Statistics > 1-Sample t)
compares a single distribution average to a target or hypothesized value.
The two sample t test (Stat > Basic Statistics > 2-Sample t) analyzes
the means of two separate distributions.
Using the 2 Sample t test
1. Pull samples in a random manner from the distributions whose means are being
evaluated. In Minitab, the data can be in separate columns or in a single column
with a subscript column.
2. Determine the Null Hypothesis Ho and Alternative Hypothesis Ha (Less than,
Equal to, or Greater than).
3. Confirm if variances are similar using F test or Homogeneity of Variance (page
30).
4. Run Stat > Basic Statistics > 2-Sample t. In the dialog box, select Samples in
One Column and identify the data column and subscript column, or Samples in
Different Columns and identify both columns.
5. In the alternative box, select the alternative hypothesis which is appropriate for the
analysis. This will determine the test used for the analysis (one tailed or two
tailed).
6. If the variances are similar, check the Assume Equal Variances box.
7. If graphic output is needed, select the graphs button and choose between dotplot
and boxplot output.
8. Click Ok to run analysis.
Analyzing the test results.
Minitab will provide a calculation of descriptive statistics for each distribution, provide
a Confidence Interval statement (page 32) and provide a statement of the t test as
a test of the difference between two means. The output will provide a t statistic,
a p value and the degrees of freedom statistic. To use the t distribution table
on page 65, the t statistic and the degrees of freedom are required. Analysis
can be made using that table or the p value.
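A two-sample t test can also be reproduced outside Minitab; the sketch below uses invented Manu_a/Manu_b-style data, not the guide's.

from scipy import stats

manu_a = [46.2, 47.1, 45.8, 46.9, 47.3, 46.5]
manu_b = [39.0, 39.8, 38.7, 39.5, 39.2, 40.1]
# equal_var=True gives the pooled 2-sample t; confirm with the F test first
t_stat, p_value = stats.ttest_ind(manu_a, manu_b, equal_var=True)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
# A small p (< alpha) rejects Ho and supports Ha: the means differ.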
Minitab: Stat>Tables>Chi Square
1. Create the table shown in the example in
Minitab.
2. Run Stat>Tables>Chi Square Test. In the
dialog box select the columns containing the
tabular data, in this case, C2 and C3. Click
OK to run.
3. In the Session window, the table that is created will show the expected value
   for each of the data cells under the actual data for the cell, plus the $\chi^2$
   calculation, $\chi^2_{calc} = \sum (f_{obs} - f_{exp})^2 / f_{exp}$.
4. The Chi Square calculation is $\chi^2_{calc}$ = 4.927. The p value for this
   test is 0.026.
5. The degrees of freedom df = (Number of Rows -
1)(Number of Columns -1) is shown, df = 1.
6. Determine $\chi^2_{crit}$ from the $\chi^2$ table for the degrees of freedom
   and confidence level desired (usually 5% risk). $\chi^2_{crit}$ = 3.841.
7. Since $\chi^2_{calc} > \chi^2_{crit}$, reject Ho.
Because the data is in tabular form in Minitab, no
other analysis can be done.
Minitab: Stat>Tables>Cross Tabulation
If additional analysis of data is desired, including any graphical analysis, the
Stat>Tables>Cross Tabulation procedure is preferred. This procedure uses data
in the common Minitab column format. Note that the data is in a single column
and the factors or variables being considered are shown as subscripted values.
In this graphic, the data is in column C6 and the appropriate subscripts are in
columns C4 and C5.
1. Run Stat>Tables>Cross Tabulation. In the dialog box, select the columns
   identifying the factors or variables in the Classification Variables box.
2. Click Chi Square Analysis and select Above
and expected count.
3. Select the column containing the response data
in the Frequencies in box, in this case, data.
4. Click Run.
5. The output in the session window is very
similar to the output for Stat>Tables>Chi
Square Test, except that it does not show
the Chi Square calculation for each cell.
6. Analysis of the test is done as before, either by using the generated p
   value or by using the calculated $\chi^2$ and degrees of freedom and entering
   the tables with that information to find $\chi^2_{crit}$.
Testing Equality of Variances
The F test is used to compare the variances of two distributions. It tests the
hypothesis, Ho, that the variances of two distributions are equal. It is performed by
forming a ratio of two variances from two samples and comparing the ratio with a value
in the F distribution table. The F test can be used to demonstrate that the variance
has been increased or decreased after a process change. Since t tests and
ANOVA need to know if population variance is the same or different, this test is also
a prerequisite for doing other types of hypothesis testing. In Minitab, this test is done
as Homogeneity of Variance.
The F test is also used during the ANOVA process to confirm or reject hypotheses
about the equality of averages of several populations.
Performing an F Test
1. Pull samples in a random manner from the two distributions for which you are
comparing the variances. Prior to running the test confirm sample distribution
normality for each sample (page 17).
2. Compute the F statistic, $F_{calc} = s_1^2 / s_2^2$. The F statistic should always be
   calculated so that the larger variance is in the numerator.
3. Calculate the degrees of freedom for each sample. Degrees of freedom = ni - 1,
   where ni is the sample size for the i-th sample, i.e., n1-1 & n2-1.
4. Specify the risk level that you can tolerate for making an error in your decision
(usually set at 5%.)
5. Use the F distribution table (p 59 - 60) to determine Fcrit for the degrees of
   freedom in your samples and for the risk level you have chosen.
6. Compare Fcalc to Fcrit. If Fcalc < Fcrit, the null hypothesis, Ho, which implies
   that the variances from both distributions are equal, cannot be rejected. If
   Fcalc > Fcrit, reject the null hypothesis and conclude that the samples have
   different variances.
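The manual F-test steps translate directly to a short script. This is a sketch with illustrative data; the alpha/2 percentile is used because forcing the larger variance into the numerator makes the ratio test two-sided.

import statistics
from scipy import stats

def f_test(x1, x2, alpha=0.05):
    v1, v2 = statistics.variance(x1), statistics.variance(x2)
    if v1 < v2:                               # larger variance in the numerator
        x1, x2, v1, v2 = x2, x1, v2, v1
    f_calc = v1 / v2
    df1, df2 = len(x1) - 1, len(x2) - 1
    f_crit = stats.f.ppf(1 - alpha / 2, df1, df2)   # replaces the printed F table
    return f_calc, f_crit, f_calc > f_crit    # True -> reject Ho (variances differ)

print(f_test([4.1, 4.4, 3.9, 4.6, 4.2, 4.0], [4.3, 5.1, 3.6, 5.4, 3.2, 4.8]))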
Using Homogeneity of Variance (For MINITAB Analysis)
1. Homogeneity of Variance will allow analysis of multiple population variances
simultaneously. It will also allow analysis of non-normal distributions. Data from
all sample groups must be stacked in a single column with the samples
identified with a separate subscript or factor column.
2. In Minitab, use STAT>ANOVA>HOMOGENEITY OF VARIANCE. In the dialog
box, identify the single response column and a separate Factors column or
columns.
3. Analysis of the test will be done using the p value. If the data is Normal (See
Normality, page 15), use Bartlett's Test. Use Levene's Test when the data come
from continuous, but not necessarily normal distributions.
4. The computations for the homogeneity of variance test require that at least one
cell contains a non-zero standard deviation. Normally, it is possible to compute a
standard deviation for a factor if it contains at least two observations.
5. Two standard deviations are necessary to calculate Bartlett's and Levene's test
statistics.
Poisson Approximation

Use this approximation when the sample size is large and the probability of
defects (p) in the sample is less than 0.10. In such a situation:

$\hat{p} = k/n$

Where: k = number of defects, n = number of sample parts.

The confidence interval for this proportion defective can be found using the
Poisson distribution:
1. Determine the desired confidence level (80%, 90% or 95%).
2. Find the lower and upper confidence interval factors for that level of
   confidence for the number of failures found in the sample.
3. Divide these factors by the actual sample size used.
4. The result of the two calculations gives the range of plausible values for
   the proportion of the population that is defective.

Example: k=2; n=200 (2 defects in 200 sampled parts or CTQ outputs)
Then: $\hat{p} = 2/200 = 0.0100$, and the 95% 2-sided confidence interval is:
Lower confidence factor = 0.619
Upper confidence factor = 7.225
CI = (0.619/200, 7.225/200) = (0.0031, 0.0361)
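The confidence factors can be reproduced from the chi-square distribution. In the sketch below, 2(k+1) degrees of freedom are used for both limits, which matches this guide's factors (0.619 and 7.225 for k=2); note that some references instead use 2k degrees of freedom for the lower limit.

from scipy.stats import chi2

def poisson_ci(k, n, conf=0.95):
    alpha = 1 - conf
    df = 2 * (k + 1)
    lower = chi2.ppf(alpha / 2, df) / 2       # lower confidence factor
    upper = chi2.ppf(1 - alpha / 2, df) / 2   # upper confidence factor
    return lower / n, upper / n

print(poisson_ci(2, 200))   # ~(0.0031, 0.0361), as in the example above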
Hypothesis Statements (cont.)

F Test - Compares the variances of two distributions.
H0 - The sample variances tested are statistically the same: $\sigma_0^2 = \sigma_1^2$
HA - The sample variances tested are not equal: $\sigma_0^2 \neq \sigma_1^2$
Homogeneity of Variance - Compares the variances of multiple distributions.
H0 - The sample variances tested are statistically the same: $\sigma_0^2 = \sigma_1^2 = \sigma_2^2 = \ldots = \sigma_k^2$
HA - At least one of the sample variances tested is not equal: at least one $\sigma_i^2 \neq \sigma_j^2$
Bartlett's Test - Tests Normal Distributions
Levene's Test - Tests non-normal distributions
$\chi^2$ - Tests the hypothesis that two discretely measured variables operate
independently of each other.
Ho: Independent (There is no relationship between the populations)
Ho: $p_1 = p_2 = p_3 = \ldots = p_n$
Ha: Dependent (There is a relationship between the populations)
Ha: At least one of the equalities does not hold
What is a p-value?
Statistical definitions of p-value:
The observed level of significance.
The chance of claiming a difference if there is no difference.
The smallest value of alpha that will result in rejecting the null hypothesis.
If p < Alpha, then the difference is statistically significant. Reject the null
hypothesis and declare that there is a difference.
Think of (1 - p) as the degree of confidence that there is a difference.
Example: p = .001, so (1 - p) = .999, or 99.9%.
You can think of this as 99.9% confidence that there is a difference.
How do I use it?

Y = f(X) - The Transfer Function. The Output (the Effect, Y) is a function of
the Inputs (the Root Causes, the Xs). What is the mathematical relationship
between the Y and the Xs?

If the Output (Y) is Continuous:
- If Inputs are Discrete: ANOVA; t Tests; F Tests; Confidence Intervals; DOE
- If Inputs are Continuous: Regression; Analysis of Covariance
If the Output (Y) is Discrete:
- If Inputs are Discrete: Logistic Regression; $\chi^2$; Confidence Intervals -
  Proportions; DOE
- If Inputs are Continuous: Logistic Regression

Copyright 1995 Six Sigma Academy, Inc.
Stating the Hypothesis Ho and Ha

The starting point for a hypothesis test is the null hypothesis, Ho. Ho is the
hypothesis of sameness, or no difference.
Example: The population mean equals the test mean.
The second hypothesis is Ha, the alternative hypothesis. It represents the
hypothesis of difference.
Example: The population mean does not equal the test mean.
You usually want to show that there is a difference (Ha). Start by assuming
equality (Ho). If the data show they are not equal, then they must be
different (Ha).
Hypothesis Statements

1 Sample t - Compares a single distribution to a target or hypothesized value.
H0 - The sample tested equals the target: $\mu_0$ = Target
Ha - The sample tested is not equal to the target, or greater than/less than
the target: $\mu_0 \neq$ Target; $\mu_0 >$ Target; $\mu_0 <$ Target

2 Sample t - Compares the means of two separate distributions.
H0 - The samples tested are statistically the same: $\mu_0 = \mu_1$
Ha - The samples tested are not equal, or one is greater than/less than the
other: $\mu_0 \neq \mu_1$; $\mu_0 > \mu_1$; $\mu_0 < \mu_1$
ANOVA - One Way

ANOVA, ANalysis Of VAriance, is a technique used to determine the statistical
significance of the relationship between a dependent variable (Y) and a single or
multiple independent variable(s) or factors (Xs).
ANOVA should be used when the independent variables (Xs) are categorical (not
continuous). Regression Analysis (Pages 43 - 45) is a technique for performing a
similar analysis with continuous independent variables.
ANOVA determines if the differences between the averages of the levels are greater
than the expected variation. It answers the question: Is the signal between levels
greater than the noise within levels?
ANOVA allows the investigator to compare several means simultaneously with the
correct overall level of risk.
Basic Assumptions for using ANOVA
Equal Variances (or close to the same) for each subgroup.
Independent and normally distributed observations.
Data must represent the population variation.
Acceptable Gage R&R
The ANOVA test for equality of means is fairly robust to the assumption of normality
for moderately large sample sizes, so normality is often not a major concern.
The One Way ANOVA enables the investigation of a single factor at multiple levels
with a continuous dependent variable. The primary investigation question is: Do any
of the populations of Y stemming from the levels of X have different means?
MINITAB will do this analysis either with the data in table form, with data for each
level of X in separate columns (STAT>ANOVA>ONE WAY (UNSTACKED)), or with all the
data in a single column and the factor levels identified by a separate subscript
column (STAT>ANOVA>ONE WAY). For the data below, use One-Way (Unstacked) for data
in columns c1-c3 and One-Way for data in columns c4-c5.
In the dialog box, for One Way
(Unstacked), identify each of the
columns containing the data.
In the dialog box for One-way,
identify the column containing the
Response (Y) and the Factor (X) as
appropriate.
For both analyses, if graphic
analysis is desired select the
Graphs button and select between
Dotplots and Boxplots .
Click OK to run. For analysis, see
page 41.
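For analysis outside MINITAB, a one-way ANOVA takes one sequence per level of X (like the unstacked layout). This sketch uses illustrative data, not the guide's.

from scipy.stats import f_oneway

level1 = [24.1, 23.8, 24.6, 24.3, 24.0]   # Y values at level 1 of X
level2 = [25.2, 25.7, 25.1, 25.9, 25.4]   # level 2
level3 = [24.9, 24.4, 25.0, 24.6, 24.8]   # level 3
f_stat, p_value = f_oneway(level1, level2, level3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# p < 0.05 suggests at least one level of X has a different mean of Y.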
Hypothesis Testing

Steps in Hypothesis Testing
1. Define the problem; state the objective of the Test.
2. Define the Null and Alternate Hypotheses.
3. Decide on the appropriate statistical hypothesis test: Variance (Page 30);
   Mean (t Test - Page 31 - 32); Frequency of Occurrence (Discrete - $\chi^2$ -
   Page 35 - 36).
4. Define the acceptable $\alpha$ and $\beta$ risk.
5. Define the sample size required. (Page 27 - 28)
6. Develop the sampling plan and collect samples.
7. Calculate the test statistic from the data.
8. Compare the calculated test statistic to a predicted test statistic for the risk
   levels defined. If the calculated statistic is larger than the predicted test
   statistic, the statistic indicates a difference.
Since all data are variable, an observed change could be due to chance and may not
be repeatable. Hypothesis testing determines if the change could be due to chance
alone, or if there is strong evidence that the change is real and repeatable.
In order to show that a change is real and not due to chance alone, first assume
there is no change (Null Hypothesis, Ho). If the observed change is larger than the
change expected by chance, then the data are inconsistent with the null hypothesis
of no change. We then reject the null hypothesis of no change and accept the
alternative hypothesis, Ha.
The null hypothesis might be that two suppliers provide parts with the same average
flatness (Ho: $\mu_1 = \mu_2$, the mean for supplier 1 is the same as the mean for
supplier 2). In this case, the alternative hypothesis is that average flatness is
not equal (Ha: $\mu_1 \neq \mu_2$).
If the means are equal and your decision is that they are equal (top left box),
then you made the correct decision.
If the means are not equal and your decision is that they are not equal (bottom
right box), then you made the right decision.
If the means are equal but your decision is that they are not equal (bottom left
box), then you made a Type 1 error. The probability of this error is alpha ($\alpha$).
If the means are not equal but your decision is that they are equal (top right
box), then you made a Type 2 error. The probability of this error is beta ($\beta$).
Decision \ Real World   |  $\mu_1 = \mu_2$          |  $\mu_1 \neq \mu_2$
$\mu_1 = \mu_2$         |  Correct Decision          |  Type 2 Error ($\beta$)
$\mu_1 \neq \mu_2$      |  Type 1 Error ($\alpha$)   |  Correct Decision
ANOVA - Two Way
Two-way ANOVA evaluates the effect of two separate factors on a single response.
Each cell (combination of independent variables) must contain an equal number of
observations (the design must be balanced). See General Linear Model (Page 42) for
unbalanced data sets. In the data set on the right, Strength is the response (Y)
and Chem and Fabric are the separate factors (X1 and X2). To analyze the
significance of these factors on Y, run STAT>ANOVA>TWO WAY. In the dialog box,
identify the Response (Y), Strength. In the Row Factor box, identify the first of
the two factors (X) for analysis. In the Column Factor box, identify the second
vital X. Select the Display Means box for each factor to gain confidence interval
and means analysis.
Select STORE RESIDUALS and then STORE
FITS.
If graphical analysis of the ANOVA data is
desired, select the Graphs button and choose
one, or all of the four diagnostic graphs available.
This analysis does not produce F and p-values,
since you can not specify whether the effects are
fixed or random. Use Balanced ANOVA (Page
36) to perform a two-way analysis of variance,
specify fixed or random effects, and display the F
and p-values when you have balanced data. If you have unbalanced data and random
effects, use General Linear Model (Page 42) with Options to display the appropriate test
results.
It can be seen from the SS column
that the error SS is very small
relative to the other terms. In the
graphic Confidence interval analysis
it is clear that both factors are
statistically significant, since some
of the confidence intervals do not
overlap.
Calculating Sample Size

$n = \frac{2(Z_{\alpha/2} + Z_\beta)^2}{(\delta/\sigma)^2}$

Example: $\alpha$ = .10, $\beta$ = .01, $\delta/\sigma$ = .3

$n = \frac{2(1.645 + 2.326)^2}{(.3)^2} \approx 350$

$\alpha$    $\alpha/2$    $Z_{\alpha/2}$    $\beta$    $Z_\beta$
.20         .10           1.282             .20        0.842
.10         .05           1.645             .10        1.282
.05         .025          1.960             .05        1.645
.01         .005          2.576             .01        2.326
To calculate the actual sample size without the table,
or to program a spreadsheet to calculate sample size,
use this equation.
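For example, in Python (scipy's norm.ppf supplies the Z values in place of the table):

from scipy.stats import norm

def sample_size(alpha, beta, delta_over_sigma):
    z_a = norm.ppf(1 - alpha / 2)          # Z for alpha/2 (two-sided)
    z_b = norm.ppf(1 - beta)               # Z for beta
    return 2 * (z_a + z_b) ** 2 / delta_over_sigma ** 2

print(sample_size(alpha=0.10, beta=0.01, delta_over_sigma=0.3))   # ~350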
ANOVA - Balanced
The Balanced ANOVA allows the analysis of process data with two or more factors. As
with the Two Way ANOVA, Balanced ANOVA allows analysis of the effect of multiple
factors, at multiple levels simultaneously. A factor (B) is nested within another factor (A)
if the level of B appears with only a single level of A. Two factors are crossed if every
level of one factor appears with every level of the other factor. The data for individual
levels of factors must be balanced: each combination of independent variables (cell)
must have an equal number of observations. See General Linear Model (Page 38) for
analysis of unbalanced designs. Guidelines for normality and variance remain the
same as shown on page 38.
Figure 1 shows how some of the factors and
data might look in the MINITAB worksheet.
Note there are five (5) data points for each
combination of the three factors.
To analyze for significance of these factors (Xij) on the response variable (Y), run
STAT>ANOVA>BALANCED ANOVA. In the
dialog box (Figure 2) , Identify the
Y variable in the Response box and identify
the factors in the Model box. Note that the
pipes [Shift \] indicate the model analyzed is
to include factor interactions. Select Storage
to store residuals and fits for later analysis.
Select Options and select Display means... to
display information about data means for each
factor and level.
Figure 3 is the primary output of this analysis. There is no significant graphic analysis
for the balanced ANOVA. See page 41 for analysis of this output.
[Figure 1: MINITAB worksheet with three factors and five data points per combination]
[Figure 3: Balanced ANOVA session output]
Sample Size Determination
When using sampling to analyze processes, sample size must be consciously selected
based on the allowable $\alpha$ and $\beta$ risk, the smallest amount of true
difference ($\delta$) that you need to observe for the change to be of practical
significance, and the variation of the characteristic being measured ($\sigma$).
As variation decreases or sample size increases it is easier to detect a difference.
Steps to defining sample size
1. Determine the smallest true difference to be detected, the gap ($\delta$).
2. Confirm the process variation ($\sigma$) of the processes to be evaluated.
3. Calculate $\delta/\sigma$.
4. Determine the acceptable $\alpha$ and $\beta$ risk.
5. Use the chart on page 58 to read the sample size required for each level of the
   factor tested.
For example --
Assume the direction of the effect is unknown, but you need to see a delta/sigma
($\delta/\sigma$) of 1.0 in order to say the change is important. For an $\alpha$
risk of 5% and a $\beta$ risk of 10%, we would need to use 21 samples. Remember that
we would need 21 at each level of the factor tested. If, for the same $\delta$,
$\sigma$ were reduced so that $\delta/\sigma$ were 2, only 5 samples would be
required. In general, the smaller the shift ($\delta/\sigma$) you are trying to
detect, and/or the lower the tolerable risk, the greater the number of samples
required.
Sample size sensitivity is a function of the standard error of the mean
($\sigma/\sqrt{n}$). Smaller samples are less sensitive than larger samples.
[Figure: "Today" vs. "Desired" distributions, showing the gap delta ($\delta$) and the variation ($\sigma$)]
Continuous Data Analysis
Interpreting the ANOVA Output
The first table lists the factors and levels. In the table shown there are three factors,
Region, Shift and WorkerEx. There are three levels each for Region and Shift. The
values assigned for the Region and Shift levels are 1,2 &3. WorkerEx is a two level
factor and has level values of 1&2.
The second table is the ANOVA output. The columns are as defined below.
Source - The source shows the identified factors from the model, showing both the
single factor information (i.e., Region) and the interaction information (i.e.,
Region*Shift).
DF - Degrees of Freedom for the particular factor. Region and Shift have 3 levels
and 3-1=2 df, and WorkerEx has 2 levels and 2-1=1 df.
SS - Factor Sum of Squares, a measure of the variation of the sample means of that
factor.
MS - Factor Mean Square, the SS divided by the DF.
F - The Fcalc value is the MS of the factor divided by the MS of the Error term. In
the case of Region, F = 90.577/3.325 = 27.24. If using Fcrit to analyze for
significance, enter the table with DF degrees of freedom and $\alpha$ = .05.
Compare Fcalc to Fcrit. If Fcalc is greater than Fcrit, the factor is significant.
P - The calculated P value, the observed level of significance. If P < .05, the
factor is statistically significant at the 95% level of confidence.
Note: The relative size of the error SS to the total SS indicates the percent of
variation left unexplained by the model. In this case, the unexplained variation is
39.16% of the total variation in this model. The $\sigma$ of this unexplained
variation is the square root of the MS of the Error term (3.325). In this case the
within-group variation has a sigma of 1.82. If this remaining variation does not
enable the process to achieve the desired performance state, look for additional
factors.
Analysis and Improve Tools
Tool selection by data type of the Y (output) and the Xs (inputs):

Y Discrete, X Discrete: Tables (Cross tab); Chi Square; Confidence intervals for
proportions; Pareto
Y Discrete, X Continuous: Logistic regression; Discriminant Analysis; CART
(Classification and Regression Trees)
Y Continuous, X Discrete: Confidence intervals; t test; ANOVA; Homogeneity of
Variance; GLM; DOE (factorial fit)
Y Continuous, X Continuous: Linear regression; Multiple regression; Stepwise
Regression; DOE response surface

Logistic Regression, Discriminant Analysis and CART (Classification and
Regression Trees) are advanced topics not taught in Six Sigma Training.
The following references may be helpful.
Breiman, Friedman, Olshen and Stone; Classification and Regression Trees;
Chapman and Hall, 1984
Hosmer and Lemeshow; Applied Logistic Regression; Wiley, 1989
Minitab Help - Additional information about Discriminant Analysis
General Linear Model
The General Linear Model (GLM) can handle unbalanced data - such as data sets with
missing observations. Where the Balanced ANOVA required the number of observations
to be equal in each factor/level grouping, GLM can work around this limitation.
The data must be full rank (enough data to estimate the terms in the model). But you
don't have to worry about this, because Minitab will tell you if your data isn't full rank!
Interpretation:
Temp1 is a significant X variable, because it explains 62% of the total variation
(528.04/850.4). (Temp1 also has a p-value < 0.05, indicating that it is statistically
significant)
Neither Oxygen1 nor the interaction between Oxygen and Temperature appears
significant.
The unexplained variation represents 30.95% ((263.17/850.4)*100) and the estimate of
the within-subgroup variation is 5.4 (the square root of 29.24).
In the data set shown in Figure 1, note that there is only
one data point in Rot1, the response column for factor
Temp1- level 10 / Oxygen1- level 10(rows 8 & 9), and
only two data points for Temp1 - level 16 / Oxygen1 -
Level 6 (Row 14). In such case, Balanced ANOVA
would not run because the requirement of equal
observation would require three data points in each cell
(factor and level combination).
Run STAT>ANOVA>GENERAL LINEAR MODEL. In
the Dialog box, identify the response variable in the
Response box and the factors in the Model box. Use
the pipe (shifted \) to include interactions in the
analysis.
Figure 2 is the primary output of this analysis. There is
no graphic analysis of this output.
[Figure 1: GLM data set with unbalanced cells]
[Figure 2: General Linear Model session output]
Pareto Diagrams
Stat>Quality Tools>Pareto Chart
When analyzing categorical defect data , it is
useful to use the Pareto chart to visualize the
relative defect frequency. A Pareto Chart is a
frequency ordered column chart. The analysis can
either analyze raw defect data, such as scratch,
dent, etc, or it can analyze count data such as is
made available from Assembly Line Defects
reports. The graphic on the left is from count data.
Set up the worksheet with two columns, the first with the defect cause descriptor
and the second with the count or frequency of occurrences. In the PARETO CHART
dialog box, select Chart DEFECTS TABLE. Link the cause descriptor to the LABELS IN
box and the counts to the FREQUENCY box. Click OK. For more information, see the
Minitab context-sensitive help in the Pareto dialog box.

[Pareto Chart for Defects - Count (Percent, Cum %): Missing Screws 274 (64.8,
64.8); Missing Clips 59 (13.9, 78.7); Leaky Gasket 43 (10.2, 88.9); Defective
Housing 19 (4.5, 93.4); Incomplete Part 10 (2.4, 95.7); Others 18 (4.3, 100.0)]
To interpret the Pareto, look for a sharp gradient to the categories, with 80% of
counted defects attributable to 20-30% of the identified categories. If the Pareto
is flat, with all categories linked to approximately the same number of defects,
try to restate the question to redefine the categorical splits.
Cause and Effect Diagrams
Fishbone Diagrams: Stat>Quality Tools>Cause & Effect

[Cause-and-Effect Diagram example, with branches for Men (Operators, Training,
Supervisors, Shifts), Machines (Speed, Lathes, Bits, Sockets), Materials
(Suppliers, Lubricants, Alloys), Methods (Brake, Engager, Angle), Measurements
(Inspectors, Microscopes, Micrometers) and Environment (Exhaust Quality,
Condensation, Moisture%)]
When working with the Advocacy team to define the
potential factors (Xs), it is often helpful to use a
Cause and Effect Diagram or Fishbone to display
the factors. The arrangement helps in the discovery
of potential interactions between Factors (Xs).
Use Minitab worksheet columns to record
descriptors for the factors identified during the team
brainstorming session. Group the descriptors in
columns by categories such as the 5Ms. Once the
factors are all recorded, open the Minitab
Stat>Quality Tools>Cause and Effect dialog
box.
The dialog box will have the 5Ms and Environment shown as default categories of factors. If using
these categories, link the worksheet columns of categorized descriptors to the dialog box categories.
If the team has elected to use other Category names, replace the default names and link the
appropriate columns. Click OK.
To interpret the Cause and Effect Diagram, look for places where a factor in one category could also
be included in another category. Question the Advocacy team about priority or significance of the
factors in each category. Then prioritize the factors as a whole. For the most significant factors, ask
the team where there is the potential for changes in one factor to influence the actions of another
factor. Use this information to plan analysis work.
Regression Analysis
Regression can be used to describe the mathematical relationship between the response
variable and the vital few Xs, if you have continuous data for your Xs. Also, after the
vital few variables have been isolated, solving a regression equation can be used to
determine what tolerances are needed on the vital few variables in order to assure that
the response variable is within a desired tolerance.
Regression analysis can find a linear fit between the response variable Y and the
vital few input variables X1 and X2. This linear equation can be used to decide
what tolerances must be maintained on X1 and X2 in order to hold a desired
tolerance on the variable Y.
(Start with a scatter diagram to examine the data.)
$Y = B_0 + B_1 X_1 + B_2 X_2 + \text{error}$

$\hat{Y} = B_0 + B_1 X_1 + B_2 X_2$
Regression analysis can be done using several of the MINITAB tools.
Stat>Regression>Fitted Line Plot is explained on Page 20. This section will discuss
Regression>Regression.
Data must be paired in the MINITAB worksheet. That is, one measurement from each
input factor (x) is paired with the response data (Y) for that particular measurement point.
Plot the data first using Minitab Stat>Plot. Analyze the data using
Stat>Regression>Regression. In the dialog box indicate the Response (Y) in the
Response box and the
expected factors (Xs) in the
Predictors box. Select the
Storage button and in that
dialog box select Fits and
Residuals. Click OK twice to
run the analysis. The output
will appear as shown in the
figure to the right.
The full regression equation is
shown at the top of the output.
Predictor influence can be
evaluated using the p column in
the first table. Analysis of the
second table is done in similar
fashion to the ANOVA analysis
on page 41. Note that R-sq(adj) is similar to R-sq but is modified to reflect the
number of terms in the regression. If there are many terms in the model, and the
sample size is small, then R-sq(adj) can be much lower than R-sq, and you may be
over-fitting. In this example, the total sample size is large (n=560), so R-sq and
R-sq(adj) are similar.
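An equivalent analysis outside MINITAB might use statsmodels' OLS (an assumption, not the guide's tool); the data below are synthetic.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x1 = rng.normal(10, 2, 60)                       # illustrative predictor X1
x2 = rng.normal(5, 1, 60)                        # illustrative predictor X2
y = 3.0 + 1.5 * x1 - 2.0 * x2 + rng.normal(0, 1, 60)

X = sm.add_constant(np.column_stack([x1, x2]))   # adds the B0 intercept term
model = sm.OLS(y, X).fit()
print(model.summary())                           # coefficients, p values, R-sq, R-sq(adj)
fits, residuals = model.fittedvalues, model.resid   # the "Fits" and "Residuals"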
Time Series Plot
Graph>Time Series Plot
The time series plot is useful as a
diagnostic tool. Use it to analyze data
collection processes, non-normal data sets,
etc. In GRAPH VARIABLES: Identify any
number of variables (Y) from the worksheet
you wish to look at over time. Minitab
assumes the values are entered in the order
they occurred. Enter one column at a time.
Minitab will automatically sequence to the
next graph for each column. The X axis is
the time axis and is set by selecting the
appropriate setting in TIME SCALE. Each
Time series plot will display on a separate graph. In FRAME, ANNOTATE and
OPTIONS, you can change chart axes, display multiple charts, etc. In analyzing the
Time Series Plot, look for a story. Look for trends, sudden shifts, a regular
cycle, extreme values, etc. If any of these exist, they can be used as a lead into
problem solving.
Box-Cox Transformation
Stat>Control Charts>Box-Cox Transformation

[Box-Cox Plot for Skewed data: 95% confidence interval for lambda; last iteration
info: lambda Est = 0.113 (Low = 0.056, Up = 0.170), StDev Est = 2.782]
BOX-COX TRANSFORMATION is a useful tool for finding a transformation that will
make a data set closer to a normal distribution. Once it is confirmed that the
distribution is non-normal, use Box-Cox to find an appropriate transformation.
Box-Cox provides an exponent used in the transformation, called lambda ($\lambda$).
The transformed data is the original data raised to the power of $\lambda$.
Subgroup data can be in columns or across rows. In the dialog box, indicate how
DATA ARE ARRANGED and where located. If data is subgrouped and subgroups are in
rows, identify the configuration. To store transformed data, select STORE
TRANSFORMED DATA IN and indicate the new location.
The Box-Cox transformation can be useful for correcting non-normality in process
data, and for correcting problems due to unstable process variation. Under most
conditions, it is not necessary to correct for non-normality unless the data are
highly skewed. It may not be necessary to transform data which are used in control
charts, because control charts work well in situations where data are not normally
distributed.
Note: You can only use this procedure with positive data.
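Outside MINITAB, scipy offers a Box-Cox routine that estimates lambda by maximum likelihood; the skewed sample below is synthetic.

import numpy as np
from scipy import stats

skewed = np.random.default_rng(7).lognormal(mean=0.0, sigma=0.5, size=200)
# scipy applies the (x**lam - 1)/lam form, whereas Minitab simply raises the
# data to the power lambda -- the two differ only by shift and scale.
transformed, lam = stats.boxcox(skewed)
print(f"estimated lambda = {lam:.3f}")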
Stepwise Regression
Stepwise Regression is useful to search for leverage factors in a data set with many
factors (xs) and a response variable (Y). The tool can analyze up to 100 factors. But,
while this enables the analysis of Baseline data for potential Vital Xs, be careful not
to draw conclusions about significance of Xs without first confirming with a DOE.
To use Stepwise regression, the data needs to be entered in Minitab with each
variable in a separate column and each row representing a single data point. Next
select Stat>Regression>Stepwise.
In the dialog box, identify the column containing the response (Y) data in the
Response box. In the Predictor box, identify the columns containing the factors
(Xs) you want Minitab to use. If their F-statistic falls below the value in the F to
remove text box under Options (Default = 4), Minitab removes them. By selecting
the Options button, you can change the Fcrit value for adding and removing factors
from the selection and also reduce the number of steps of analysis the tool goes
through before asking for your input.
Minitab will prioritize the leverage X variables and run the first regression step on
the factor with the greatest influence. It continues to add variables as long as the t
value is greater than the SQRT of the identified F statistic limit (Default = 4). The
Minitab output includes:
1) the constant and the factor coefficients for the significant terms.
2) the t value for the factors included.
3) the s for the unexplained variation based on the current model.
4) the R-sq for the current model.
If you have chosen 1 step between pauses, Minitab will then ask if you wish to run
more. Type yes and enter. Continue this procedure until MINITAB won't calculate
any more. At that point, you will have identified your potential leverage Xs.
Output
In this output, there are five potential predictors identified by stepwise
regression. The steps are shown by the numbered columns and include the
regression information for included factors. The information in column 1
represents the regression equation information if only Form is used. In column 5,
the regression equation information includes five factors, but the s is .050 and
the R-sq is only 25%. In all probability, the analyst will choose to gather
information including additional factors during the next runs.
Regression with Curves (Quadratic) & Interactions
When analyzing multiple factor relationships, it is important to consider if there is
potential for quadratic (curved) relationships and interactions. Normal graphic analysis
techniques and Regression do not allow analysis of the effects of interrelated factors. To
accomplish this, the data must be analyzed in an orthogonal array (See Page 49). In
order to create an orthogonal array with continuous data, the factor (x) data must be
centered. Do this as follows:
1. The data to be analyzed need to be in columns, with the response in one column
and the values of the factors paired with the response and recorded in separate
columns.
2. Use Stat>DOE>Define Custom RS Design
In the dialog box, identify the columns containing the factor settings.
3. Next, analyze the model using Stat>DOE>Analyze RS Design.
Identify the column containing the response data
Check: Analyze Data using Coded Units.
4. Click on Storage and select Fits and Residuals for later regression diagnostics.
Click OK. Click on Graphs and select the desired graphs for analysis diagnostics.
The initial analysis will include all terms in the potential equation including full
quadratic. Analysis of the output will be similar to that for
Regression>Regression (Page 43).
5. Where elements are insignificant, revert to the Stat>DOE>Analyze RS
Design>Terms dialog box to eliminate them. In the case of
this example, the equation
can be analyzed as a linear
relationship, so select
Linear in the Include the
following terms box. Note
that this removes all the
interaction and quadratic
terms. Re-run the
regression. Once an
appropriate regression
analysis, including leverage
factors has been obtained, validate the adequacy of the model by using the
regression diagnostic plots Stat>Regression>Residuals Plots (Page 22).
Once an appropriate regression equation has been determined, remember this
analysis was done with centered data for the factors. The centering will have to be
reversed in order to make the equation useful from a practical standpoint. To
create a graphic of the model, use Stat>DOE>RSPlots (Page 52). From this
dialog box a contour plot of the results can be created.
Box Plot
GRAPH>BOXPLOT
The boxplot is useful for comparing
multiple distributions (Continuous Y and
discrete X).
In the GRAPH section of the dialog box, fill
in the column(s) you want to show for Y
and if a column is used to identify various
categories of X, i.e., subgroup coding, etc.
Click FRAME Button to give you the
options of setting common axes or multiple
graphs on the same page. To generate
multiple plots on a single page, select
FRAME>MULTIPLE GRAPHS>OVERLAY GRAPHS... Click ATTRIBUTES to allow you
to change individual box colors. Click OK
The box represents the middle 50% of the distribution. The horizontal line is the
median (the middlemost value). The whiskers each represent a region sized at
1.5*(Q3-Q1), where (Q3-Q1) is the region shown by the box. Interpretation can be
that the box represents the hump of the distribution and the whiskers represent
the tails. Asterisks represent points which fall outside the lower or upper limits
of expected values.
Interval Plot
GRAPH>INTERVAL PLOT

Useful for comparison of multiple distributions. Shows the spread of data around
the mean by plotting standard error bars or confidence intervals.
The default form of the plot provides error bars extending one standard error
(standard deviation/square root of n) above and below a symbol at the mean of the
data.

Y variable: Select the column to be plotted on the y-axis.
Group variable: Select the column containing the groups (or categories). This
variable is plotted on the x-axis.

Type of interval plot
Standard error: Choose to display standard error bars where the error bars extend
one standard error away from the mean of each subgroup.
Multiple: Enter a positive number to be used as the multiplier for standard errors
(1 is the default).
Confidence interval: Choose to display confidence intervals instead of standard
error bars. The confidence intervals assume a normal distribution for the data and
use t-distribution critical values.
Level: Enter the level of confidence for the intervals. The default confidence
coefficient is 95%.

[Interval plot example: thickness by type (new vs. existing), y-axis 128.95 to 129.25]
One-Variable Regression
STAT>REGRESSION>FITTED LINE PLOT
In the STAT>REGRESSION>FITTED
LINE PLOT dialog box, identify
Response Variable (Y). Identify one (1)
Predictor (X). Select TYPE OF MODEL
(Linear, Quadratic or Cubic). Click on
STORAGE. Select RESIDUALS and
FITS.
If you need to transform data, use
OPTIONS and select Transformation.
In OPTIONS, select DISPLAY
CONFIDENCE BANDS and DISPLAY
PREDICTION BANDS. Click OK.
The output from the fitted line plot contains an equation which relates your predictor
(input variable) to your response (output variable). A plot of the data will indicate
whether or not a linear relationship between x and y is a sensible approximation.
These observations are modeled by the equation:
Y = b + mx + error
Confidence Bands are 95% confidence limits for data means. Prediction Bands
are limits for 95% of individual data points.
The R-sq is the square of the correlation coefficient. It is also the fraction of the
variation in the output (response) variable that is explained by the equation. What is
a good value? It depends... chemists may require an R-sq of .99. We may be
satisfied with an R-sq of .80.
Use Residual Plots (Below) to plot the residuals vs predicted values ( Fits) and
determine if there are additional patterns in the data.
[Regression Plot: Abrasion vs. Hardness; Y = 2692.80 - 3.16067X; R-Sq = 0.784;
shown with the fitted regression line, 95% CI and 95% PI bands]
Residual Plots
Stat>Regression>Residual Plots
[Residual diagnostics for Weld Temp Fits: Histogram of Residuals; I Chart of
Residuals (X=0.000, 3.0SL=0.9631, -3.0SL=-0.9631); Residuals vs. Fits; Normal
Plot of Residuals]
Any time a model has been created for an X/Y
relationship, through ANOVA, DOE, Regression, the
quality of that model can be evaluated by analysis of
the error in the equation.
When doing the REGRESSION (Page 37-38), or the
FITTED LINE PLOT (above), be sure to select store
FITS and RESIDUALS in the STORAGE dialog
box. If the fit is good, the error should be normally
distributed with an average of zero and there should
be no pattern to the error over the range.
Then in the RESIDUAL PLOTS dialog box, identify
the column where the residuals are stored
in the Residuals box and the fits storage column in the Fits box.
The output includes a normal plot of residuals, a histogram of residuals, an Individuals Chart of
Residuals and a scatter plot of Residuals versus Fits.
Analysis of the Normal Plot should show a relatively straight line if the residuals are normally
distributed. The I chart should be analyzed as a control chart. The histogram should be a bell-shaped
distribution. The residuals vs fits scatter plot should show no pattern, with a constant spread over the
range.
Binary Logistic Regression
In binary logistic regression the predicted value (Y) will be probabilities p(d) of
an event such as success or failure occurring. The predicted values will be
bounded between zero and one (because they are probabilities).
Example: Predict the success or failure of winning a contract based on the
response cycle time to a request for proposal and the proposal team leader.
The probability of an event, $\pi(x)$ or Y, is not linear with respect to the Xs.
The change in $\pi(x)$ for a unit change becomes progressively smaller as $\pi(x)$
approaches zero or one. Logistic regression develops a function to model this.
$\pi(x)/(1-\pi(x))$ is the odds. The Logit is the log of the odds. Ultimately the
transfer function being developed will solve for $\pi(x)$:

$\pi(x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$

To analyze the binary logistic problem, use STAT>REGRESSION>BINARY LOGISTIC
REGRESSION. The data set used for Response will be Discrete and Binary (Yes/No;
Success/Failure). In the Model dialog box, enter all factors to be analyzed. In
the Factors dialog box, enter those factors which are discrete. Use the Storage
button and select Event probability. This will store the calculated event
probability for each unique value of the function.
Analyze the Session Window
output.
1. Analyze the Hypothesis Test
for the model as a whole.
Check for a p value indicating
model significance.
2. Check for statistical
significance of the individual
factors separately. Use the P
Value.
3. Check the Odds ratios for the
individual predictor levels.
4. Use the Confidence Interval to
confirm significance. Where
confidence interval includes
1.0, the odds are not
significant.
5. Evaluate the model for goodness of fit. Use
Hosmer-Lemeshow if there is a continuous X in the model.
6. Assess the measures of association. Note % Concordant is a measure
similar to R-sq. A higher value here indicates a better predictive model.
Binary Logistic Regression
Link Function: Logit
Response Information
Variable Value Count
Bid Yes 113 (Event)
No 110
Total 223
Logistic Regression Table
Odds 95% CI
Predictor Coef StDev Z P Ratio Lower Upper
Constant 7.410 1.670 4.44 0.000
Index -8.530 1.799 -4.74 0.000 0.00 0.00 0.01
Brand Sp
Yes 1.2109 0.3005 4.03 0.000 3.36 1.86 6.05
Log-Likelihood = -134.795
Test that all slopes are zero: G = 39.513, DF = 2, P-Value = 0.000
Goodness-of-Fit Tests
Method            Chi-Square   DF   P
Pearson           187.820      116  0.000
Deviance          224.278      116  0.000
Hosmer-Lemeshow   7.138        7    0.415
Table of Observed and Expected Frequencies:
(See Hosmer-Lemeshow Test for the Pearson Chi-Square Statistic)
Measures of Association:
(Between the Response Variable and Predicted Probabilities)
Pairs       Number  Percent  Summary Measures
Concordant  9060    72.9%    Somers' D              0.47
Discordant  3260    26.2%    Goodman-Kruskal Gamma  0.47
Ties        110     0.9%     Kendall's Tau-a        0.23
Total       12430   100.0%
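An analogous fit outside MINITAB could use statsmodels' Logit (an assumption, not the guide's tool); the Index/Brand-style data below are synthetic, loosely echoing the output above.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
index = rng.uniform(0, 1, 223)                     # continuous predictor
brand = rng.integers(0, 2, 223)                    # discrete factor (0/1)
logit_p = 7.4 - 8.5 * index + 1.2 * brand          # assumed model for simulation
win = (rng.random(223) < 1 / (1 + np.exp(-logit_p))).astype(int)

X = sm.add_constant(np.column_stack([index, brand]))
fit = sm.Logit(win, X).fit()
print(fit.summary())                               # coefficients, z, p values
print(np.exp(fit.params))                          # odds ratios per predictor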
Normal Plot
STAT>BASIC STATISTICS>NORMALITY TEST
Identify the variable you will be testing
in the Variable box.
Click OK (Use default Anderson Darling
test).
A Normal probability plot is a graphical method
to help you determine whether your data is
normally distributed. To graphically analyze
your data, look at the plotted points relative to
the sloped line. A normal distribution will yield
plotted points which closely hug the line. Non-
normal data will generally show points which
significantly stray from the line.
The test statistics displayed on the plot are A-
squared and p-value.
The A-squared value is an output of a test for normality. Focus your analysis on
the p value. The p value is the probability of claiming the data are not normal if
the data are truly from a normal distribution (a type I error). A high p-value
would therefore be consistent with a normal distribution. A low p-value would
indicate non-normality. Use the appropriate type I error probability for judging
this result.
[Normal Probability Plot: Anderson-Darling Normality Test; N: 500; Average:
70.0000; StDev: 10.0000; A-Squared: 0.418; P-Value: 0.328]
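The same Anderson-Darling check can be run outside MINITAB; note that scipy reports critical values rather than a p value. The sample below is synthetic.

import numpy as np
from scipy import stats

data = np.random.default_rng(5).normal(70, 10, 500)
result = stats.anderson(data, dist='norm')
print(f"A-squared = {result.statistic:.3f}")
for crit, sig in zip(result.critical_values, result.significance_level):
    print(f"  reject normality at {sig}% significance if A-squared > {crit}")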
Descriptive Statistics
Stat>Basic Statistics>Descriptive Statistics
[Graphical Summary for Variable C1:
Anderson-Darling Normality Test: A-Squared 0.235, P-Value 0.790
Mean 69.3824; StDev 9.8612; Variance 97.2442; Skewness -5.0E-03;
Kurtosis 3.89E-02; N 500
Minimum 38.9111; 1st Quartile 62.5858; Median 69.6848; 3rd Quartile 75.8697;
Maximum 99.6154
95% Confidence Interval for Mu: 68.5160 to 70.2489
95% Confidence Interval for Sigma: 9.2856 to 10.5136
95% Confidence Interval for Median: 68.6347 to 70.8408]
The Descriptive Statistics>Graphs>Graphical
Summary graphic provides a histogram of the data
with a superimposed normal curve, a normality
check, a table of descriptive statistics, a box plot of
the data and confidence interval plots for mean and
median.
In the Descriptive Statistics dialog box select the
variables for which you wish to create the descriptive
statistics. If choosing a stacked variable with a
category column, check the BY VARIABLE box and
indicate the location of the category identifier.
Use the Graphs button to open the graphs dialog
box. In the graphs dialog box, select Graphical
Summary. When using this tool to interpret normality, confirm the p value and
evaluate the shape of the histogram. Remember that the p value is the probability
of claiming the data are not normal if the data are truly from a normal
distribution (a type I error). A high p-value would therefore be consistent with a
normal distribution. A low p-value would indicate non-normality. When evaluating
the shape of the histogram graphically, determine: Is it bimodal? Is it skewed? If
yes, investigate potential causes for the non-normality. Improve if possible or
analyze groups separately. If no special cause is found for the non-normality, the
distribution may be non-normal naturally and you may need to transform the data
(page 22) prior to calculating your Z.
Design for Six Sigma - Tolerance Analysis

[Tolerance loop diagram: vector A from the Datum across the assembly, vectors B1
through B4 returning to close the loop at the Gap]
Tolerance Analysis is a design method used to determine the impact that
individual parts of a system have on the overall requirement for that
system.
Most often, Tolerance Analysis is applied to dimensional characteristics in
order to see the impact the dimensions have on the final assembly in
terms of a gap or interference. In this application, a tolerance loop may
be used to illustrate the relationship.
Purpose
To graphically show the relationships of multiple parts in a system which
result in a desired technical requirement in terms of a gap or
interference.
Process
1. Generate a layout drawing of your assembly. A hand sketch is all that is
required.
2. Clearly identify the gap in the most severe condition.
3. Select a DATUM or point from which to start your loop. (It is easier to
start the loop at one of the interfaces of the Gap.)
4. Use drawing dimensions as vectors to connect the two sides of the gap.
5. Assign sign convention (+/-) to vectors
In the diagram above, the relationship can be explained as:
GAP = A - B1 - B2 - B3 - B4
Because the relationship can be explained using only + & - signs, the
equation is considered LINEAR, and can be analyzed using a method
known as Root Sum of Squares (RSS) analysis.
Vector Assignment
Assign a positive (+) vector when
An increase in the dimension
increases the gap.
An increase in the dimension
reduces the interference.
Assign a negative (-) vector when
An increase in the dimension
reduces the gap.
An increase in the dimension
increases the interference.
Scatter Plot
GRAPH>PLOT
[Scatter plot example: EXISTING (y-axis, 8.5-9.7) vs. NEW (x-axis, 8.5-9.7)]
The scatter plot is a useful tool for
understanding the relationship between two
variables.
In the GRAPH VARIABLES box select each X
and Y variable you wish to plot. MINITAB will
create individual plots for each pair of variables
selected. In the Six Sigma method, the
selected Y should be the dependent variable
and the selected X the independent variable.
Select as many combinations as you wish.
Click the OPTIONS button to add jitter to
the graph. Where there are multiple data points with the same value, this will allow
each data point to be seen.
Click the FRAME button for options to set common axes or create multiple graphs.
Click the ATTRIBUTES button for options to change graphic colors or fill type.
Click the ANNOTATION button for options to change the appearance of the data
points or to add titles, data labels, or text.
Results: If Y changes as X changes, there is a potential relationship. Use the graph to
check visually for linearity or non-linearity.
Histogram
GRAPH>HISTOGRAM
The histogram is useful to look
graphically at the distribution of data.
In the GRAPH VARIABLES Box select
each variable you wish to graph
individually.
Click the OPTIONS button to change the
histogram displayed:
Type of Histogram - Frequency (default); Percent; Density
Type of Intervals - Midpoint (default) or Cutpoint
Definition of Intervals - Automatic (default) or manual definition
See Minitab Help for explanation of how to use these options. Click HELP in the
HISTOGRAM>OPTIONS Dialog Box.
Click the FRAME button for options to set common axes or create multiple
graphs.
Click the ATTRIBUTES button for options to change graphic colors or fill
type.
Design for Six Sigma - Tolerance Analysis (continued)
Gather and prepare the required data:
gap nominals
gap specification limits
process data for each component:
process mean
process sigma, short term (s.st)
process sigma, long term (s.lt)
process data assumptions if data are not available: try 'expert' data sources,
or estimate process s.lt when data are not available:
s.st and s.lt from capability data
Z-shift assumptions (long term to short term) when only one is known:
multiply s.st by 1.6 as a variance inflation factor
multiply s.st by more than 1.6 for a process that has less control long term
(divide s.lt by the same factor if long-term historical data is known)
The Tolanal.xls spreadsheet performs its analysis using the Root
Sum of Squares method, and should only be applied to LINEAR
relationships (i.e. Y=X1+X2-X3). Non-linear relationships require
more detailed analysis using advanced DFSS tools such as
Monte Carlo or the ANALYSIS.XLS spreadsheet. Contact a
Master Blackbelt for support.
Using RSS, the statistics for the GAP can be expressed as
follows. A bar over a term designates its mean value, and S
designates its standard deviation.

$\overline{GAP} = \overline{A} - \overline{B_1} - \overline{B_2} - \overline{B_3} - \overline{B_4}$
Given these equations, the impact that each individual part has
on the entire system can be analyzed. In order to perform this
analysis, follow these steps:
Linear Tolerance Spreadsheet
Once the data has been collected, it can be analyzed using the
Tolanal.xls spreadsheet described on the next page. The
Tolanal.xls spreadsheet can be found on the GEA website, under
Six Sigma, Forms & Tools.
$S_{gap} = \sqrt{S_A^2 + S_{B_1}^2 + S_{B_2}^2 + S_{B_3}^2 + S_{B_4}^2}$
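The same RSS arithmetic is easy to script; a minimal sketch in Python (the nominal dimensions, sigmas, and vector signs below are made-up placeholders, not values from Tolanal.xls):

```python
import math

# Hypothetical loop: vector A is positive, B1..B4 are negative (per the loop diagram).
nominals = {"A": 50.0, "B1": 12.0, "B2": 12.0, "B3": 12.0, "B4": 12.0}
sigmas   = {"A": 0.05, "B1": 0.02, "B2": 0.02, "B3": 0.02, "B4": 0.02}
vectors  = {"A": +1, "B1": -1, "B2": -1, "B3": -1, "B4": -1}

# Mean gap: signed sum of the nominal dimensions.
mean_gap = sum(vectors[k] * nominals[k] for k in nominals)

# RSS standard deviation: the signs drop out when the terms are squared.
s_gap = math.sqrt(sum(s ** 2 for s in sigmas.values()))

# %RSS contribution of each part, used to target the biggest contributors first.
for k, s in sigmas.items():
    print(f"{k}: {100 * s**2 / s_gap**2:.1f}% of gap variance")
print(f"mean gap = {mean_gap:.3f}, S(gap) = {s_gap:.4f}")
```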
Every problem solving task is focused on finding out something. The investigation will
be more effective if it is planned. Planning is appropriate for Gage R&R, characterizing
the process, analyzing the process for difference (hypothesis testing), design of
experiments or confirmation run analysis. In short, it is appropriate in every phase of the
MAIC or MAD process.
This investigative process is called reverse loading because the process begins with a
question focusing on what is desired at the end of the process.
The Planning Questions
1) What do you want to know?
2) How do you want to see what it is that
you need to know?
3) What type of tool will generate what it
is that you need to see?
4) What type of data is required of the
selected tool ?
5) Where can you get the required type
of data?
[The Principle of Reverse Loading: Critical Questions -> Plan -> Execute; Know -> See -> Tool -> Data -> Where. Copyright 1995 Six Sigma Academy, Inc.]
Tolerancing Analysis - Linear Spreadsheet
Input your process data into the spreadsheet
input gap technical requirements
input interface dimension 'nominals' as the baseline case
input design vectors from the loop diagram
input process sigmas (long and short term, from available data)
Analyze the initial output
Gap: Min/Max of constraints (info only) vs. Z.upper and Z.lower for the 'Gap' (CTQ)
Parts: short-term RSS contribution by each part: target the high RSS% first
Modify the input variables to optimize the Z-gap
...change:
Part nominals vs. means (implies a process shift... tooling sensitive?)
Component sigma.st (caution here... if you reduce it, you are making 'big' assumptions)
Z-shift factor (change the sigma.lt, using actual data or assumptions)
Target CTQ specifications (if not constrained... negotiate with the Customer)
Review your output for the optimized Z-gap condition
If the initial Z-gap is very high, you can move part nominals or 'open' the variation (don't penalize
yourself by constraining processes too much)
If you cannot get the Z-gap to your goals, re-design should be considered
Understand ALL implications of your changes to any of the input variables
Establish your tolerances based on %RSS contribution, sensitivity to Z.gap, and the desired
sigma level of the particular contributing dimension; know the effect on your NOMINAL
design
The highest %RSS contributors will have the most impact. Iterate by moving the nominal by 1.0 sigma.st
in both directions... continue iterating to 2, 4, 5 sigma.st, etc.
Understand and weigh your risks in increasing a tolerance (effect on nominals, subsequent
operations, etc.)... how is the process managed and what are the controls? Who supplies?
[Spreadsheet screenshot: 1. Input the technical requirements. 2. Input target dimensions and vector direction. 3. Input short- and long-term sigmas of part dimensions. The spreadsheet identifies the major contributors to system variation.]
Six Sigma Process Report
Analysis of Continuous data
[Report 1 graphics: Process Performance plot showing the Actual (LT) and Potential (ST) distributions against the USL and LSL; PPM-versus-subgroup convergence chart (log scale, 1 to 1,000,000); Process Benchmarks: Potential (ST) Z.Bench = 3.52, PPM = 215.402; Actual (LT) Z.Bench = 2.17, PPM = 15034.0; Process Demographics panel]
Report 1: Executive Summary - the demographics column, in the exact order required:
Date: 06/31/96
Reported By: Duke Brewster
Project: Shoe Cast
Department: Brake Division
Process: Casting
Characteristic: Hardness
Units: Brinell
Upper Spec: 42
Lower Spec: 38
Nominal: 40
Opportunity:
Data Source:
Time Span: 01/01/96 - 06/31/96
Data Trace: Bin # 1057a-9942
The Six Sigma Process Report, Six
Sigma>Process Report displays data to enable
the analysis of continuous process data. The
default reports are the Executive Summary (Report
1) and the Process Capability Report (Report 2).
To use this tool effectively, the response data (Y)
must be collected in rational subgroups of two (2)
or more data points. In addition to the Y data, a
demographics column may be added to provide the
demographic information on the right side of Report
1. The demographics column must be entered in
exact order shown if used. See figure.
Once the data is entered, create the report by
calling Six Sigma>Process Report.
1. Identify the configuration of the Y data, and
the location of the useful data. (columns or
rows).
2. Identify the CTQ specifications or the location
of the demographic information.
3. If detailed demographic information is to be
used, select the Demographics button. Either
enter the data for the information (shown at
the left) in the dialog box or reference a
spreadsheet column with this information
listed as shown.
4. When the report is generated with only this
information, the default reports will be shown.
If additional reports are desired, they can be
accessed through the Reports button.
[Report 2 graphics: Xbar and S chart over 8 subgroups (Xbar chart: X = 12.44, 3.0SL = 15.74, -3.0SL = 9.133; S chart: S = 1.691, 3.0SL = 4.342, -3.0SL = 0.000); bar diagrams comparing process tolerance to the specification (8 to 22): Potential (ST) capability 9.2773 to 20.7227, Actual (LT) capability 6.2592 to 18.6145]
Capability Indices
              LT         ST
Mean       12.4368    15.0000
StDev       2.0454     1.8917
Z.USL       4.6756     3.7003
Z.LSL       2.1692     3.7003
Z.Bench     2.1692     3.5205
Z.Shift     1.3513     1.3513
P.USL      0.000001   0.000108
P.LSL      0.015033   0.000108
P.Total    0.015034   0.000215
Yield      98.4966    99.9785
PPM        15034.0    215.402
Cp             -        1.22
Cpk            -        0.78
Pp           1.13         -
Ppk          0.72         -
Data Source:
Time Span:
Executive Summary - Top left graphic displays the predicted distribution based on data.
MINITAB assumes normal data and will display a normal curve whether the data is
normal or not. The lower left hand graphic displays the expected PPM defect rates as
subgroups are added to the prediction. When this curve stabilizes (levels off), enough
data has been taken. The Process Benchmarks show the reported Z Benchmark scores
and PPM (Defects in both tails are combined) (Page 8).
Capability Study - The control charts provide an excellent means for diagnosing the
rational subgrouping process. Use normal techniques for analysis of this chart (Page 54).
The capability indices on the right provide tabular results of the study. The bar diagrams
at the bottom of the report show comparative graphics of the short term and long term
process predictions.
Design Of Experiments
Baselining data collection is considered passive observation. The process is monitored
and recorded without intentional changes or tweaking. In Designed Experiments,
independent variables (Factors) are actively manipulated and recorded and the effect on
the dependent variable (Response) is observed. Designed experiments are used to:
Determine which factors (Xs) have the greatest impact on the response (Y).
Quantify the effects of the factors (Xs) on the response (Y).
Prove the factors (Xs) you think are important really do affect the process.
Orthogonality
Since our goal in experimentation is to determine the effect each factor has on the
response independent of the effects of other factors, experiments must be designed so as
to be horizontally and vertically balanced. An experimental array is vertically balanced if
there are an equal number of high and low values in each column. The array is
horizontally balanced if for each level within each factor we are testing an equal number of
high and low values from each of the other factors. If we have a balanced design in this
manner, it is Orthogonal. Standard generated designs are orthogonal. When modifying or
fractionating standard designs, take care to maintain orthogonality.
Repetition
Completing a run more than once without resetting the independent variables is called
repetition. It is commonly used to minimize the effect of measurement variation and to
analyze factors affecting short-term variation in the response.
Replication
Running experimental trials more than once, resetting the independent variables between
runs, is called replication. It is commonly used to assure generalization of results over
longer-term conditions. When using MINITAB for experimental designs, replications can be
programmed during the design creation.
Randomization
Running experimental trials in a random sequence is a common, recommended practice
that assures that variables that change over time have an equal opportunity to affect all the
runs. When possible, randomizing should be used for designed experimental plans. It is
the default setting when MINITAB generates the design, but can be deselected using the
OPTIONS button.
Blocking
A block is a group of homogeneous units. It may be a group of units made at the same
time, such as a block by shift or lot, or it may be a group of units made from the same
material, such as a raw material lot or manufacturer. When blocking an experiment, you are
adding a factor to the design; i.e., a full factorial 2^4 experiment with blocking will
actually analyze as a 2^(5-1) experiment. When analyzing processes subject to multiple
shifts, multiple raw material flows, etc., blocking by those conditions is recommended.
Normality of Data
[Fig 2: Descriptive Statistics graphical summary for Variable 'Normal': N = 500; Mean = 70.0000; StDev = 10.0000; Variance = 100.000; Skewness = -5.0E-02; Kurtosis = 0.393445; Minimum = 29.824; 1st Quartile = 63.412; Median = 69.977; 3rd Quartile = 76.653; Maximum = 103.301; Anderson-Darling A-Squared = 0.418, P-Value = 0.328; 95% confidence intervals for Mu (69.121, 70.879), Sigma (9.416, 10.662), and Median (69.021, 70.737)]
[Fig 1: Normal Probability Plot for the same data - Anderson-Darling Normality Test, N = 500, Average = 70.0000, StDev = 10.0000, A-Squared = 0.418, P-Value = 0.328]
[Fig 3: Xbar/R Chart for Mystery - Means chart: X = 100, 3.0SL = 113; Ranges chart: R = 22.83, 3.0SL = 48.2; numerous points on both charts are flagged '1' as out of control]
Data from many processes can be approximated by a normal distribution. Additionally,
the Central Limit Theorem states that characteristics which are the average of individual
values are likely to have an approximately normal distribution. Prior to characterizing your
project Y, it is valuable to analyze the data for normality to confirm whether the data does
follow a normal distribution. If there is strong evidence that the data do not follow a normal
distribution, then predictions of future performance should not be made using the normal
distribution.
Use Stat>Basic Stats>Normality Test
(Fig 1) (Page 21) or Stat>Basic
Stat>Descriptive Statistics (Fig 2) (Page 21)
with Graphs>Graphical Summary checked.
If using Normality Test, the default is
Anderson-Darling. Use that test for most
investigations. Use other tests with caution.
For example, Kolmogorov-Smirnov is actually a
less sensitive test.
The test statistic for primary use in analyzing the
test results is the P value. The null hypothesis,
H0, states that the process is normal, so if the p
value < .05, then there is evidence that the data
do not follow a normal distribution. If the
process shows non-normality, either there are
special causes of variation that cause the non-
normality, or the common cause variation is not
normal. Analyze first for special cause.
Use Stat>Control Charts (Fig 3) (Page 49) or
Plot>Time Series Plot (Page 24) to look for
out of control points or drifts of the process
over time. Try to determine the cause of those
points and separate, or stratify, the data using
that knowledge. If the levels of Xs have been
captured, use graphics to aid in visualizing the
process stratified by the Xs. If the data can be
stratified and within the strata the data is
normal, the process can be characterized at the
individual levels and perhaps characterized
using the Product Report (page 15). The
discovery of a special cause contributing to non-
normality may lead to improving the process. If
the common cause variation is non-normal, it
may be possible to transform the data to an
approximately normal distribution. MINITAB
provides such a tool in Stat>Control
Charts>Box-Cox Transformation (Page 24).
Additional notes on data transformation can be
found in the Quality Handbook; Juran Chap 22.
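If a Box-Cox transformation is needed outside MINITAB, SciPy provides one; a minimal sketch on simulated right-skewed placeholder data, with a before/after normality check:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
skewed = rng.lognormal(mean=0.0, sigma=0.6, size=500)  # strictly positive, right-skewed

# boxcox estimates the lambda that makes the data most nearly normal.
transformed, lam = stats.boxcox(skewed)
print(f"estimated lambda: {lam:.3f}")

# Shapiro-Wilk as a quick normality check (null hypothesis: data are normal).
for label, x in (("raw", skewed), ("transformed", transformed)):
    stat, p = stats.shapiro(x)
    print(f"{label}: Shapiro-Wilk p = {p:.4f}")
```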
Factorial Designs
Factorial Designs are primarily used to analyze the effects of two or more factors and
their interactions. Based on the level of risk acceptable, experiments may be either full
factorial, looking at each factor combination , or fractional factorial, looking at a fraction of
the factor combinations. Fractional Factorial experiments are an economical way to
screen for vital Xs. They only look at a fraction of the factor combinations. Their results
may be misleading because of confounding, the mixing of the effect of one factor with the
effect of a second factor or interaction. In planning a fractional factorial experiment, it is
important to know the confounding patterns, and confirm that they will not prevent
achievement of the goals of the DOE.
To create a Factorial Experiment using
MINITAB, select STAT>DOE>CREATE
FACTORIAL DESIGN. In the dialog box
(Fig 1) select the Number of Factors and
then the Designs Button. If the number
of factors allows both a fractional and full
factorial design, the Designs dialog box
(Fig 2) will show the available selections
including both full and fractional designs.
Resolution, which is a measure of
confounding, is shown for each displayed
design. While in this dialog box, identify
the number of replicates and blocks to be
used in the design. Select OK to return to
the initial dialog box. Select Options. In
the Options dialog box select Randomize
Run if planned. Finally, select the
Factors button and in that dialog box,
name the factors being studied and the
factor experimental levels. Click OK twice
to generate the completed design. The
design will be generated on the MINITAB
worksheet as shown in Fig 3. An analysis
of the design, including the design
Resolution and confounding will be
generated in the MINITAB
Session Window.
Now run the experiment and
collect the data. Record the run
data in a new column, in the same
row as that run's factor settings.
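Outside MINITAB, a two-level full factorial run sheet can be generated in a few lines; a minimal sketch (the factor names and levels are made-up placeholders), with the run order randomized as recommended above:

```python
import itertools
import random

factors = {
    "Feedrate": (10, 15),        # (low, high) - placeholder levels
    "Catalyst": (1, 2),
    "Temperature": (140, 180),
}

# Full factorial: every combination of the factor levels (2^3 = 8 runs).
runs = [dict(zip(factors, combo))
        for combo in itertools.product(*factors.values())]

random.seed(42)
random.shuffle(runs)  # randomize run order so time-varying noise spreads over all runs

for order, run in enumerate(runs, start=1):
    print(order, run)
```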
[Fig 1: Create Factorial Design dialog box; Fig 2: available designs dialog box]
Characterizing the Process - Rational Subgrouping
To separate the measurement of Z.ST and Z.LT and fully understand how the process
operates, capture data in such a way as to see both short-term variation, inherent to the
technology being used, and long-term variation, which reflects the variation induced
by outside influences. The process of collecting data in such a manner is called
Rational Subgrouping. Analyzing rational subgroups allows analysis of centering
vs. spread and control vs. technology.
Steps to characterize a process using Rational Subgroups
1. Work with operational advocacy team to define the factors (Xs) suspected as
influential in causing output variation (Y). Confirm which of these factors are
operationally controllable and which are environmental. Prioritize and understand the
cycle time for sensing the identified factors. Be sure to question the effect of
elements of all the 5Ms of process variation:
Machine - Technology; Maintenance; Setup
Materials - Batch/Lot/Coil differences
Method - MTS; Workstation layout; Operator method
Manpower - Station rotation; Shift changes; Skill levels
Measurement - R&R; Calibration effects
Environment - Weather; Job site or shop
2. Define a data collection plan over time that captures data within each subgroup
taken over a period of time short enough that only the variation inherent to the
technology occurs. Subgroup size can be two (2) or more; at least two
measured data points are necessary to see subgroup variation. Larger subgroups
provide greater sensitivity to the process changes, so the choice of subgroup size
must be made to balance the needs of the business and the need for process
understanding. This variation is called common cause and represents the best the
process can achieve. In planning data collection use of The Planning Questions
(Page 19) is helpful.
3. Define the plan to allow for collection of the subgroups over a long period of time
which allows the elements of long term variation and systematic effects of potentially
important variables to influence the subgroup results. Do not tweak, or purposely
adjust the process, but rather recognize that the process will drift over time and plan
the data collection accordingly.
4. Capture data and analyze data using Control Charts (Page 53 - 55) and 6 Sigma
Process Report (Page 18) during the data collection period. Stay close to the
process and watch for data shifts and causes for the shifts. Capture data
documenting the levels of the identified vital Xs. This data may be helpful in
analyzing the causes of process variation. During data collection it may be helpful to
maintain a control chart or some other visual means of sensing process shift.
5. Capture sufficient subgroups of data to allow for multiple changes in all the
identified vital Xs and also to allow for a stable estimate of the mean and variation in
the output variable (Y). See 6 Sigma Process Report (Page 18) for explanation of
graphic indicator of estimation stability.
DOE Analysis
Analysis of DOEs includes both graphical and tabular information. Once the data for the
experimental runs has been collected and entered in the MINITAB worksheet, analyze
with STAT>DOE>ANALYZE FACTORIAL DESIGN. In the ANALYZE FACTORIAL
DESIGN dialog box, identify the column(s) with the response data in the Responses
box. Select the GRAPHS button. In the GRAPHS dialog box, select PARETO for the
effects plots and change ALPHA (the level of significance) to .05. Click OK twice. Note
that we have not used the other options buttons at this time. Leave Randomize at default
settings. The initial analysis provides a session window output and a Pareto graph.
[Pareto Chart of the Effects (response is PCReact, Alpha = .05): the largest effects are B, D, BD, DE, and E, followed by CE, A, BC, AB, BE, AE, AD, AC, CD, C. Factors - A: Feedrate, B: Catalyst, C: Agitate, D: Temperature, E: Concentration]
Analysis of the DOE requires both graphic and model analysis; however, the model should
be generated and analyzed before the full graphic analysis can be completed. An analysis of
the fitted model in the MINITAB Session window shows the size of each effect and the model
coefficients. Most important, though, is the ANOVA table. This table may show the
significant factors or interactions (see Balanced ANOVA, Page 40). In this case, the F
score is shown as ** and there are no p values. This indicates that the model as
defined is too complex to be analyzed with the number of data points taken. The model
needs to be simplified. The Pareto graphic is a helpful tool for that. Note that effects B,D
and E and interactions BD and DE show as significant effects. Remaining non-significant
effects can be eliminated.
Rerun STAT>DOE>ANALYZE FACTORIAL DESIGN. This time select the TERMS
option button. In the dialog box, deselect the terms not shown as significant in the Pareto.
Click OK. Select STORAGE and select RESIDUALS and FITS. Click OK twice. The
resulting ANOVA table shows the significance of the factors and the model coefficients are
provided. Next, run STAT>DOE>FACTORIAL PLOTS. Select and set up each of the
plots, MAIN EFFECTS, INTERACTIONS and CUBE as follows. Identify the response
column in the RESPONSES box. Select only the significant factors to be included in the
plot. Click OK twice to generate the plots. Confirm the significance of effects and
interactions graphically using MAIN EFFECTS and INTERACTIONS plots. Use the CUBE
PLOT to identify the select factor levels for achieving the most desirable response.
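The effects MINITAB reports can be checked by hand from a coded design; a minimal sketch on a made-up 2x2 data set (the responses loosely borrow the example's cube-plot corner means, but the factor mapping is an assumption), where an effect is the mean response at the high level minus the mean at the low level:

```python
import numpy as np

# Coded design matrix (-1 = low, +1 = high) for two factors, plus the response.
B = np.array([-1, +1, -1, +1])
D = np.array([-1, -1, +1, +1])
y = np.array([53.0, 64.5, 55.5, 80.0])  # placeholder responses

def effect(x, y):
    """Average response at the high level minus average at the low level."""
    return y[x == +1].mean() - y[x == -1].mean()

print(f"effect of B:    {effect(B, y):+.2f}")
print(f"effect of D:    {effect(D, y):+.2f}")
print(f"BD interaction: {effect(B * D, y):+.2f}")  # product column codes the interaction
```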
[Factorial plots for PCReact: Interaction Plot (Concentr, Temperat, Catalyst), Main Effects Plot (Catalyst, Temperat, Concentr; response roughly 55-75), and Cube Plot of means for PCReact (corner means from 47.0 to 94.0)]
Response Surface
Central Composite Design (CCD)
Response Surface analysis is a type of Designed Experiment that allows
investigation of non-linear relationships. It is a tool for fine tuning process
optimization once the region of optimal process conditions is known. Using the CCD-
type RS design, you will be designing an experiment that tests each factor at five
levels, one which can also be used to augment a factorial experiment that has
already been completed. The CCD design will include FACTORIAL points, STAR points,
and CENTER points.
Start by running STAT>DOE>CREATE RS DESIGN. Select CENTRAL
COMPOSITE from the design type choices in the dialog box. Identify the number of
factors to be studied and click the DESIGN button. In the DESIGN dialog box,
select the experiment design desired, including the blocks. Click OK and then select
the FACTORS button. In that dialog box identify the factors and their high and low
factorial settings and click OK. Randomize runs is found in the OPTIONS dialog box.
Click OK to generate the design. The design will be placed on a new worksheet.
Collect data for each of the scheduled trials defined by the design. Note that there
will be multiple points run at the centerpoint of each factor and there will be star
points for each factor beyond the factor ranges identified in the design.
Analyze the data using STAT>DOE>ANALYZE RS DESIGN. In the dialog box
identify the response column. Leave the Use Coded Units selected and choose the
appropriate setting for the USE BLOCKS box, depending on plan. Click OK and run.
The resulting output is a combination of the Regression output (Page 43) and the
ANOVA output (Page 41). The regression output analyzes how the individual
factors and interactions fit the model. The ANOVA table will analyze the type of
relationship and also the total fit of the model. If Lack of Fit error is significant,
another model may be appropriate. Simplify the model for terms and regression
complexity as appropriate. See DOE Analysis (Page 51). Rerun
STAT>DOE>ANALYZE RS DESIGN and select the TERMS button. Before
rerunning the simplified analysis, select STORAGE and select FITS and
RESIDUALS.
Continue simplification and tabular analysis to attempt to find a simple model that
explains a large portion of the variation. Confirm regression fit quality using Residual
Plots (Page 22). The terms in the ANOVA Matrix should show significance, except
that Lack of Fit term should become insignificant (p>.05). Next run
STAT>DOE>RSPLOTS. Select either CONTOUR or SURFACE plot and SETUP
for the selection. In the SETUP dialog box, confirm that the appropriate factors are
included for the plot, noting that each plot will have only the factor pair shown.
Check that the plot is displayed using UNCODED units and run. Use the graphic
generated to visually analyze for optimal factor setting or use the model coefficients
and solve for the optimal settings mathematically.
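The quadratic model behind a response surface can also be fit by ordinary least squares; a minimal sketch for two coded factors (the CCD-style data points are placeholders, not the strength study shown below):

```python
import numpy as np

# Placeholder CCD-style data: two coded factors and a response.
x1 = np.array([-1, 1, -1, 1, -1.414, 1.414, 0, 0, 0, 0, 0])
x2 = np.array([-1, -1, 1, 1, 0, 0, -1.414, 1.414, 0, 0, 0])
y  = np.array([20.1, 24.8, 23.5, 30.2, 19.0, 28.5, 21.2, 27.9, 29.8, 30.1, 29.9])

# Full quadratic model: intercept, linear, squared, and interaction terms.
X = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

for name, c in zip(["b0", "x1", "x2", "x1^2", "x2^2", "x1*x2"], coef):
    print(f"{name:>6}: {c:+.3f}")
```

Setting the two partial derivatives of the fitted quadratic to zero then gives the stationary point, the same "solve for the optimal settings mathematically" step described above.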
[Contour Plot of strength: Volume (25-31) on the x-axis vs. Composition (75-95) on the y-axis, with contours at strength 20, 30, and 40; companion surface plot of strength over the same ranges]
Six Sigma Product Report
The Six Sigma Product Report Six Sigma>Product Report is used to
calculate and aggregate Z values from discrete data and data from multiple
normal processes.
Enter # defects, # units and # opportunities data in separate columns in
MINITAB. When Z shift is included in the calculation (1.5 is the default), the
reported Z.bench is short term. If zero is entered, the reported Z.bench is
long term.
Defect count - Enter the actual defects recorded in the sample population.
If using defect data from Continuous Process Study, use PPM for long term.
If this report is a rollup of subordinate processes, use the defect count from
the subordinate process totals.
Units - Enter the actual number of parts included in the sample population
evaluated. If using data from Continuous Process Study, use 1,000,000. If
this report is a rollup of subordinate processes, use the actual number of
parts included in the sample population evaluated.
Opportunities - At the lowest level,
use one (1) for the number of
opportunities. One (1) is the
number of CTQs characterized at
the lowest level of analysis. If this
report is a rollup of subordinate
processes, use the total number of
opportunities accounted for in the
subordinate process.
Characteristics (Optional) - Enter
the test name for the
Characteristic, CTQ, or subprocess.
Shift - Process Z.SHIFT can be entered three ways. If the report is an aggregate
of a number of continuous-data-based studies, for example, a part with
multiple CTQs, the Z.SHIFT data can be entered in the worksheet as a separate
column and referred to in the Product Report dialog box. A fixed Z.SHIFT of 1.5
is the default and will be used if nothing is specified. A Z.SHIFT of zero (0) will
produce a report that shows only the long-term results.
As the levels are rolled up, the data from the totals in the subordinate
processes become line items in the higher-level breakdown. In the example
report, the process reported includes data from 12 subprocesses. Door
Assy South includes a process with six (6) CTQs characterized.
Analyzing the report
The far right-hand column of the report shows the Z.Bench for the individual
processes and for the cumulative Z.Bench. The number at the bottom of the
DPO column, in this case 0.081917, reports the P(d), probability of a defect,
at the end of the line.
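The rollup arithmetic is straightforward to sketch; the example below sums defects and opportunities over two rows taken from the report that follows, then converts the pooled DPO to a Z the same way the report does (the full report's totals, 120,089 defects over 1,465,992 opportunities, give DPO 0.081917 and a cumulative Z.Bench of 2.892 with the default 1.5 shift):

```python
from scipy.stats import norm

# (defects, total opportunities) per subprocess - two rows from the example report.
subprocesses = {"CGCase": (46332, 199908), "C83": (2174, 66636)}

total_defects = sum(d for d, _ in subprocesses.values())
total_opps = sum(o for _, o in subprocesses.values())
dpo = total_defects / total_opps        # probability of a defect, P(d)

z_lt = norm.isf(dpo)                    # long-term Z from the normal table
z_bench_st = z_lt + 1.5                 # add the default 1.5 Z shift
print(f"DPO = {dpo:.6f}, Z.lt = {z_lt:.3f}, Z.Bench(ST) = {z_bench_st:.3f}")
```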
Report 7: Product Performance
Characteristic        Defs   Units  Opps  TotOpps    DPU      DPO       PPM    ZShift  ZBench
CGCase               46332   66636    3    199908   0.695   0.231767  231767   1.500   2.233
C83                   2174   66636    1     66636   0.033   0.032627   32627   1.500   3.344
Sealed System High     554   66636    2    133272   0.008   0.004157    4157   1.500   4.139
Sealed System Low     3540   66636    3    199908   0.053   0.017708   17708   1.500   3.604
C84                   3643   66636    1     66636   0.055   0.054667   54667   1.500   3.101
C85                   1947   66636    1     66636   0.029   0.029223   29223   1.500   3.392
Door Assy South      37052   66636    6    399816   0.556   0.092673   92673   1.500   2.824
C86                    811   66636    1     66636   0.012   0.012174   12174   1.500   3.752
Plastics             14869   66636    1     66636   0.223   0.223144  223144   1.500   2.262
C87                   2901   66636    1     66636   0.044   0.043534   43534   1.500   3.211
C90                   1544   66636    1     66636   0.023   0.023166   23166   1.500   3.492
C91                   4721   66636    1     66636   0.071   0.070852   70852   1.500   2.969
Total               120089                1465992           0.081917   81917   1.500   2.892
Control Charts
Control charts are a practical tool for detecting product and/or process performance
changes (in X-bar and R) over time in relation to historical performance. Since they are a
rigorous maintenance tool, control charts should be used as an alternative to closed-loop
process control, such as mechanical sensing and process adjustment.
Common- and special-cause variation can be seen in rationally subgrouped samples:
common-cause variation characterized by steady state stable process variation
(captured by the within subgroup variation).
special-cause variation characterized by outside assignable causes on the process
variation (captured by the between subgroup variation).
Control Chart Analysis signals when the steady state process variation has been
influenced by outside assignable causes.
Variables Control Charts
Variable Control Charts are used in pairs. One chart characterizes the variation of
subgroup averages, and the other chart characterizes the variation of the spread of the
subgroups.
Individual Charts (X/Moving Range): These charts are excellent for tracking long term
variation changes. Because they use a single measurement for each data point, they are
not a tool of choice where measurement variation is involved, such as with part
dimensions. They work well with temperatures, pressures, concentration, etc.
Subgroup Charts (XBar R or Xbar S): These charts are excellent for tracking changes
in short term variation as well as variation over time. They require multiple measurements
(two or more) in each subgroup. Using rational subgroup techniques with this chart
enables graphic analysis of both short term variation changes (Range or S) and long term
variation (X Bar chart). This chart is the chart of choice where measurement variation is
involved. It is also an excellent tool for tracking processes during baselining or
rebaselining, since it assists in pointing to special cause influence on results. Because
there is usually no change in temperature, pressures or concentration in the short term,
they are not used for that type of measurement.
Attribute Charts
Attribute Control Charts are a single chart. The common difference between these charts
is whether they track proportion, a ratio, or defects, a count.
Proportion Defective Charts (P charts): This chart tracks proportion. The data point
plotted is the ratio (Number of Defects) / (Number of Pieces Inspected). In using
proportion defective charts, the number of pieces in a sample can vary, and the control
limits for the chart will vary based on that sample size.
Number Defective Charts (nP Charts): This chart tracks defect count. The data point
plotted is the number of defects in a sample. Because the data point is a number relative
to a sample size, it is important that the sample size be relatively constant between
samples. The sample size should be defined so that the average number of defects is at
least five in order for this chart to be effective.
In setting up Control Charts, use the Planning Questions (Page 19) first. Those questions
along with these notes will help define the type of chart needed. Use SETCIM, MINITAB,
SPQ (Supplier Process Quality) or other electronic methods for long term charting.
Rolled Throughput Yield
Normalized Average Yield

Normalized Average Yield (Y.NA) is the average yield of one opportunity. It answers
the question: What is the probability that the output of this process will meet the
output requirements? The Y.NA of a process is an average defect rate, and can be used
for comparing processes with differing levels of complexity.

$Y_{NA} = (Y_{RT})^{1/\#Opportunities}$

Normalized Average Yield (Y.NA) is the probability of good product, so if we calculate
1 - Y.NA, we can find the probability of a defect, P(d). With this we can find the
Z.LT score for a process.

$P(d) = 1 - Y_{NA}$

Rolled Throughput Yield (Y.RT) is the probability of completing all the opportunities
in a process without a defect. As such, it is a tool which can focus the investigation
when narrowing down the problem from a larger business problem.

$Y_{RT} = Y_1 \times Y_2 \times Y_3 \times \ldots \times Y_N$

where Y1, Y2, Y3, ..., YN are the yields of the individual stations or operations in a
process.

In a process which has 18 stations, each with 5 opportunities, and DPO = 0.001, the
Y.RT is .9139, calculated as follows:

$Y_{RT} = (Yield)^{\#Opportunities/Station \times \#Stations} = (0.999^5)^{18} = (0.995)^{18} = 0.91389$

In addition to the straight multiplication method of calculating Y.RT, Y.RT can also be
estimated using the Poisson approximation:

$Y_{RT} = e^{-DPU}$

and conversely,

$DPU = -\ln(Y_{RT})$
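These relationships are one-liners in code; a minimal sketch reproducing the 18-station example above:

```python
import math

stations, opps_per_station, dpo = 18, 5, 0.001
yield_per_opp = 1 - dpo

# Rolled throughput yield: every opportunity completed defect-free.
y_rt = yield_per_opp ** (opps_per_station * stations)
print(f"Y.RT = {y_rt:.5f}")                       # ~0.91389

# Normalized average yield: Y.RT spread back over one opportunity.
y_na = y_rt ** (1 / (opps_per_station * stations))
print(f"Y.NA = {y_na:.5f}, P(d) = {1 - y_na:.5f}")

# Poisson approximation linking DPU and Y.RT.
dpu = -math.log(y_rt)
print(f"DPU = {dpu:.4f}, e^-DPU = {math.exp(-dpu):.5f}")
```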
Interpreting Variables Control Charts
A lack of control (out of control) is indicated when one
or more of the following rules apply to your chart
data:
1. A single point above or below a control limit
2. Two out of three consecutive points are on the same side of the
mean, in Zone A or beyond
(related run rules: 10 of 11 points on the same side of the mean;
12 of 14 points on the same side of the mean)
3. Four out of five consecutive points are on the same side of the mean,
in Zone B or beyond
4. At least eight consecutive points are on the same side of the mean,
in Zone C.
5. 7 points in a row trending up or 7 points in a row trending down
6. 14 points sequentially alternating up then down then up, etc..
7. 14 points in a row in Zone C on both sides of the mean.
8. 8 points in a row alternating in Zone B or beyond.
[Control chart zone diagram: UCL and LCL around the center line X-bar, with zones A, B, C above and below the mean; Rules 1-5 illustrated. A, B, and C represent the plus-and-minus one, two, and three sigma zones from the overall process average.]
DPU / DPO
DPU
DPU is the number of defects per unit produced. It's an average. This
means that, on average, each unit produced will have so many defects.
DPU gives us an index of quality generated by the effects of process,
material, design, environmental, and human factors. Keep in mind that
DPU measures symptoms, not problems. (It's the Y, not the Xs.)
DPU = (# Defects) / (# units)
[DPU is the average number of defects in a unit]
DPU forms the foundation for Six Sigma. From DPU and a knowledge of
the opportunities, we can calculate the long term capability of the process.
Opportunity
An opportunity is anything you measure, test or inspect. It may be a part,
product or service CTQ. It can be each of the elements of an assembly or
subassembly.
DPO
DPO is the number of defects per opportunity. It is a probability.
[DPO is the probability of a defect on any one CTQ or step of a process]
Yield = 1-DPO
DPO is the foundation for determining the Z value when using discrete
data. To find Z, given DPU, convert DPU to DPO. Then look up the P(d)
for DPO in the body of the Z table. Convert to Z score (page 7).
$Total\ Opportunities = \#Units \times \#Opportunities/Unit$

$Defects\ per\ Opportunity\ (DPO) = \frac{\#Defects}{\#Units \times \#Opportunities/Unit} = \frac{DPU}{\#Opportunities/Unit}$
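A minimal sketch of the DPU-to-DPO-to-Z chain (the defect counts are placeholders; SciPy's inverse survival function stands in for the Z-table lookup):

```python
from scipy.stats import norm

defects, units, opps_per_unit = 30, 1000, 5     # placeholder counts

dpu = defects / units                           # defects per unit (an average)
total_opps = units * opps_per_unit
dpo = defects / total_opps                      # defects per opportunity (a probability)
assert abs(dpo - dpu / opps_per_unit) < 1e-12   # the same quantity two ways

z_lt = norm.isf(dpo)                            # Z-table lookup for P(d) = DPO
print(f"DPU = {dpu:.4f}, DPO = {dpo:.5f}, Yield = {1 - dpo:.5f}, Z.lt = {z_lt:.3f}")
```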
Analysis Criteria
A desirable system will have a Gage R&R < 10% and Categories of
Discrimination > 5.
The system is acceptable if Gage R&R is > 10% but < 20% and Categories of
Discrimination = 5.
If Gage R&R is > 20% but < 30% and Categories of Discrimination = 4, the
decision about acceptability will be based on the importance of measuring the
characteristic and the business cost.
If Gage R&R is > 30%, or the Categories of Discrimination < 4, the
measurement system is not considered acceptable and needs to be
improved.
MINITAB Analysis Outputs
MINITAB provides a tabular and graphical
output. The tabular output has three
tables; the first an ANOVA table (see
ANOVA Interpretation; Page 37). The
second table provides raw calculated
results of the study and the third table
provides the percent contribution results.
Interpretation of Gage R&R results is
focused on the third table. The third table
displays the % Contribution and %
Study Variation. % Contribution and %
Study Variation figures are interpreted as
Gage R&R. If you have included a
tolerance range with the Options button,
this table will also report a % Tolerance result.
The Number of Distinct Categories is also provided. This number indicates how many
classifications can be reliably distinguished given the observed process variation.
The graphical analysis provides
several important graphic tools.
The control chart should appear
out of control. Operator to
operator variation defines control
limits. If the gage has adequate
sensitivity beyond its own noise,
more than 50% of the points will
be outside the control limits. If
this is not the case, the system
is inadequate to detect part-to-
part variations.
The range chart should be in
control, showing consistency between the operators. If there are only two or three
distinct ranges recorded, it may indicate lack of gage resolution.
The column chart shows the graphic picture of data provided in table three of the
tabular report. The graphics on the right show various interaction patterns that may
be helpful in troubleshooting a problem measurement system.
(1) Measurement Systems Analysis Reference Manual; AIAG 1994
12
Individual X / Moving Range Chart
$\bar{X} = \frac{X_1 + X_2 + \ldots + X_k}{k}$
$\bar{R} = \frac{R_1 + R_2 + \ldots + R_{k-1}}{k-1}$, where $R_i = |X_{i+1} - X_i|$
$UCL_X = \bar{X} + E_2\,\bar{R}$ and $LCL_X = \bar{X} - E_2\,\bar{R}$, where $E_2 = 2.66$ (the n = 1 entry in the constants table below)
$UCL_R = D_4\,\bar{R}$ and $LCL_R = D_3\,\bar{R}$
Control Chart Constants
n     A2     A3     D3     D4     B3     B4     d2     c4
1 2.660 3.760 - - - - - -
2 1.880 2.659 0 3.267 0 3.267 1.128 0.7979
3 1.023 1.954 0 2.575 0 2.568 1.693 0.8862
4 0.729 1.628 0 2.282 0 2.266 2.059 0.9213
5 0.577 1.427 0 2.115 0 2.089 2.326 0.9400
6 0.483 1.287 0 2.004 0.03 1.970 2.534 0.9515
7 0.419 1.182 0.076 1.924 0.118 1.882 2.704 0.9594
8 0.373 1.099 0.136 1.864 0.185 1.815 2.847 0.9650
9 0.337 1.032 0.184 1.816 0.239 1.761 2.970 0.9693
10 0.308 0.975 0.223 1.777 0.284 1.716 3.078 0.9727
Variables Control Chart Control Limit Constants
Average/Range Chart
$\bar{\bar{X}} = \frac{\bar{X}_1 + \bar{X}_2 + \ldots + \bar{X}_k}{k}$, where $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
$\bar{R} = \frac{R_1 + R_2 + \ldots + R_k}{k}$
$UCL_{\bar{X}} = \bar{\bar{X}} + A_2\,\bar{R}$ and $LCL_{\bar{X}} = \bar{\bar{X}} - A_2\,\bar{R}$
$UCL_R = D_4\,\bar{R}$ and $LCL_R = D_3\,\bar{R}$
np Charts
np = # defective in each subgroup
$\overline{np} = \frac{\sum np}{k}$, for all k subgroups
$UCL_{np} = \overline{np} + 3\sqrt{\overline{np}\,(1 - \bar{p})}$ and $LCL_{np} = \overline{np} - 3\sqrt{\overline{np}\,(1 - \bar{p})}$

p Charts
$p = \frac{np}{n}$ for each subgroup and $\bar{p} = \frac{N}{\sum n}$
$UCL_p = \bar{p} + 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$ and $LCL_p = \bar{p} - 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$
np = number of defectives
n = subgroup size
N = total number of defectives for all subgroups
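The Xbar/R limits follow directly from the constants table; a minimal sketch for subgroups of size 5 (the subgroup data are simulated placeholders):

```python
import numpy as np

# A2, D3, D4 for n = 5, taken from the constants table above.
A2, D3, D4 = 0.577, 0.0, 2.115

rng = np.random.default_rng(3)
subgroups = rng.normal(loc=12.0, scale=2.0, size=(25, 5))  # 25 subgroups of 5

xbars = subgroups.mean(axis=1)
ranges = subgroups.max(axis=1) - subgroups.min(axis=1)
xbarbar, rbar = xbars.mean(), ranges.mean()

print(f"Xbar chart: CL={xbarbar:.3f}, UCL={xbarbar + A2 * rbar:.3f}, "
      f"LCL={xbarbar - A2 * rbar:.3f}")
print(f"R chart:    CL={rbar:.3f}, UCL={D4 * rbar:.3f}, LCL={D3 * rbar:.3f}")
```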
Gage R&R (1)
What it is:
Gage R&R is a means for checking the measurement system (gage plus
operator) to gain a better understanding of the variation and sources from the
measurement system.
$Gage\ R\&R = 5.15\,\sigma_m$, or $5.15\sqrt{EV^2 + AV^2}$
where $\sigma_m$ = Measurement System standard deviation.
Components of Measurement Error
Repeatability = Equipment Variation (EV): The variation in
measurements attributable to one measurement instrument when used
several times by one appraiser to measure the identical characteristic
on the same part.
Reproducibility = Appraisal Variation (AV): The variation in
measurements attributable to different appraisers using the same
measurement instrument to measure the same characteristic on the
same part.
How to do the gage R&R study.
1. Determine how the gage is going to be used; i.e., Product Acceptance
or Process Control.
Gage must have resolution 10X finer than the process variation it is
intended to measure. (i.e., measurement of parts with process
variation of .001 requires a gage with .0001 resolution)
2. Select approximately ten parts which represent the entire expected
range of the process variation, including several beyond the normally
acceptable range. Code (blind) the parts.
3. Identify two or three Gage R&R participants from the people who actually
do the measurement. Have them each measure each part two or three
times. The measurements should be done with the samples randomized and
blinded.
4. Record results on a MINITAB worksheet as follows:
a) One Column - Coded Part Numbers (PARTS)
b) One Column - Appraiser number or name (OPER)
c) One Column - Recorded Measurement (RESP)
5. Analyze using MINITAB by running Stat>Quality Tools>GageR&R
a) In the initial dialog box choose ANOVA method.
b) Identify the appropriate columns for PARTS, OPERATOR,
and MEASUREMENT data.
c) If you wish to include the analysis for process tolerance, select
the OPTIONS button. This is only to be used if the gage is
for pass fail decisions only, not for process control.
d) If you wish to show demographic information on the graphic
output, including gage number, etc, select the Gage
Information button.
Precontrol
Why use it? Provides an ongoing, visual means of on-the-floor process control.
What does it do? Gives operators decision rules for continuing or stopping
production. The rules are based on the probability that the population mean
has shifted.
How do I do it?
1. Establish control zones:
2. When five parts in a row are green, the process is
qualified.
3. Sample two consecutive parts on a periodic basis.
4. Decision rules for operators:
A. If first part is green, no action needed, continue to run.
B. If first part is yellow, then check a second part.
If second part is green, no action needed.
If second part is yellow on same side, then adjust
If second part is yellow on opposite side, stop, call
support engineer.
C. If any part is red, stop, call support engineer.
5. After correcting and restarting a process, must achieve 5
consecutive green samples to re-qualify.
[Precontrol zones: Green within +/-1.5s of target (~86% of a centered distribution); Yellow between 1.5s and 3.0s on either side (~7% each); Red beyond +/-3.0s]
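The zones translate directly into a classifier; a minimal sketch (the target and sigma values are placeholders):

```python
def precontrol_zone(x, target, sigma):
    """Classify a measurement into the green / yellow / red precontrol zones."""
    dev = abs(x - target)
    if dev <= 1.5 * sigma:
        return "green"       # ~86% of a centered process falls here
    if dev <= 3.0 * sigma:
        return "yellow"      # ~7% on each side
    return "red"             # beyond +/-3 sigma: stop, call support

target, sigma = 10.0, 0.1    # placeholder process values
for part in (10.02, 10.21, 9.62):
    print(part, precontrol_zone(part, target, sigma))
```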
Data Validity Studies
Non-measurement data is data that is not the result of a measurement
using a gage.
Examples:
Finance data (T&L cost; Cost & Benefits; Utility Costs; Sales, etc.)
Sales Data (Units sold; Items purchased, etc.)
HR data (Employee Information; medical service provider
information)
Customer Invoice Data
Samples of data should be selected to assure they represent the
population. A minimum of 100 data points is desirable. The data is then
analyzed for agreement by comparing each data point (as reported by
the standard reporting mechanism) to its true observed value.
The validity of the data is reported as % Agreement.
$\%\,Agreement = \frac{Number\ of\ Agreements}{Number\ of\ Observations} \times 100$
% Agreement should be very good. Typically this measure is much
greater than 95%.
% Agreement for Binary (Pass / Fail) Data
Calculate % Agreement in a similar manner to non-measurement data, except
using the following equation:

$\%\,Agreement = \frac{Number\ of\ Agreements}{Number\ of\ Opportunities} \times 100$

where the number of opportunities is found from the following equations, with
n = total number of assessments per sample and s = number of samples:

If n is odd: $\#\,Opportunities = s\,\frac{n^2 - 1}{4}$

If n is even: $\#\,Opportunities = s\,\frac{n^2}{4}$
Overall % Agreement = Agreement rate for all opportunities
Repeatability % Agreement = Compare the assessments for one operator over
multiple assessment opportunities. (Fix this problem first)
Reproducibility % Agreement = Compare assessments of the same part from
operator to operator.
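A minimal sketch of the opportunity count and % Agreement arithmetic (the assessment counts are placeholders):

```python
def binary_opportunities(n, s):
    """Number of agreement opportunities for s samples with n assessments each."""
    return s * (n * n - 1) // 4 if n % 2 else s * n * n // 4

n, s = 3, 100               # placeholder: 3 assessments of each of 100 samples
agreements = 185            # placeholder count of agreements observed

opps = binary_opportunities(n, s)
print(f"opportunities = {opps}, % agreement = {100 * agreements / opps:.1f}%")
```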
Project Closure
At Closure, the project must be positioned so that the changes made to the process are
sustainable over time. Doing so requires the completion of a number of tasks.
1. The improvement must be fully implemented, with leverage factors identified and
controlled. The process must have been re-baselined to confirm the degree of
improvement.
2. Process owners must be fully trained and running the process, controlling the
leverage factors and monitoring the Response (Y).
3. Required Quality Plan and Control Procedures, drawings, documents, policies,
generated reports, or institutionalized rigor must be completed.
Workstation Instructions
Job Descriptions
Preventive Maintenance Plan
Written Policy or controlled ISO documents
Documented training procedures
Periodic internal audits or review meetings.
4. The project History Binder must be completed which records key information about
the project work in hard copy. Where MINITAB has been used for analysis, hard
copies of the generated graphics and tables should be included.
Initial baseline data
Gage R & R calculations
Statistical characterization of the process
DOE (Design of Experiments)
Hypothesis testing
Any data from Design Change Process activities (described on the next page),
Failure Modes and Effects Analysis (FMEA), Design for Six Sigma (DFSS),
etc.
Copies of engineering part and tooling drawing changes showing Z score
values on the drawings.
Confirmation run data
Financial data (costs and benefits)
Final decision on improvement and conclusions
All related quality system documents
A scorecard (with frequency of reporting)
Documented control plan
5. All data entries must be complete in PROTRAK
Response Variable Z scores at initial Baselining
Response Variable Z scores at Re-baselining,
Project Definition
Improvements Made
Accomplishments, Barriers and Milestones for all project phases
Tools used for all project phases.
6. Costs and Benefits for the project must be reconfirmed with site finance.
7. Investigate potential transfer opportunities where project lessons learned can be
applied to other business processes.
8. Submit closure package for signoff through the site approval channels.
Sample Size
                 alpha = 20%            alpha = 10%            alpha = 5%             alpha = 1%
        beta:  20%  10%  5%   1%      20%  10%  5%   1%      20%  10%  5%   1%      20%  10%  5%   1%
delta/sigma
0.2 225 328 428 651 309 428 541 789 392 525 650 919 584 744 891 1202
0.3 100 146 190 289 137 190 241 350 174 234 289 408 260 331 396 534
0.4 56 82 107 163 77 107 135 197 98 131 162 230 146 186 223 300
0.5 36 53 69 104 49 69 87 126 63 84 104 147 93 119 143 192
0.6 25 36 48 72 34 48 60 88 44 58 72 102 65 83 99 134
0.7 18 27 35 53 25 35 44 64 32 43 53 75 48 61 73 98
0.8 14 21 27 41 19 27 34 49 25 33 41 57 36 46 56 75
0.9 11 16 21 32 15 21 27 39 19 26 32 45 29 37 44 59
1.0 9 13 17 26 12 17 22 32 16 21 26 37 23 30 36 48
1.1 7 11 14 22 10 14 18 26 13 17 21 30 19 25 29 40
1.2 6 9 12 18 9 12 15 22 11 15 18 26 16 21 25 33
1.3 5 8 10 15 7 10 13 19 9 12 15 22 14 18 21 28
1.4 5 7 9 13 6 9 11 16 8 11 13 19 12 15 18 25
1.5 4 6 8 12 5 8 10 14 7 9 12 16 10 13 16 21
1.6 4 5 7 10 5 7 8 12 6 8 10 14 9 12 14 19
1.7 3 5 6 9 4 6 7 11 5 7 9 13 8 10 12 17
1.8 3 4 5 8 4 5 7 10 5 6 8 11 7 9 11 15
1.9 2 4 5 7 3 5 6 9 4 6 7 10 6 8 10 13
2.0 2 3 4 7 3 4 5 8 4 5 6 9 6 7 9 12
2.1 2 3 4 6 3 4 5 7 4 5 6 8 5 7 8 11
2.2 2 3 4 5 3 4 4 7 3 4 5 8 5 6 7 10
2.3 2 2 3 5 2 3 4 6 3 4 5 7 4 6 7 9
2.4 2 2 3 5 2 3 4 5 3 4 5 6 4 5 6 8
2.5 1 2 3 4 2 3 3 5 3 3 4 6 4 5 6 8
2.6 1 2 3 4 2 3 3 5 2 3 4 5 3 4 5 7
2.7 1 2 2 4 2 2 3 4 2 3 4 5 3 4 5 7
2.8 1 2 2 3 2 2 3 4 2 3 3 5 3 4 5 6
2.9 1 2 2 3 1 2 3 4 2 2 3 4 3 4 4 6
3.0 1 1 2 3 1 2 2 4 2 2 3 4 3 3 4 5
3.1 1 1 2 3 1 2 2 3 2 2 3 4 2 3 4 5
3.2 1 1 2 3 1 2 2 3 2 2 3 4 2 3 3 5
3.3 1 1 2 2 1 2 2 3 1 2 2 3 2 3 3 4
3.4 1 1 1 2 1 1 2 3 1 2 2 3 2 3 3 4
3.5 1 1 1 2 1 1 2 3 1 2 2 3 2 2 3 4
3.6 1 1 1 2 1 1 2 2 1 2 2 3 2 2 3 4
3.7 1 1 1 2 1 1 2 2 1 2 2 3 2 2 3 4
3.8 1 1 1 2 1 1 1 2 1 1 2 3 2 2 2 3
3.9 1 1 1 2 1 1 1 2 1 1 2 2 2 2 2 3
4.0 1 1 1 2 1 1 1 2 1 1 2 2 1 2 2 3
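The table entries are consistent with the two-sample formula n = 2((z(alpha/2) + z(beta)) / (delta/sigma))^2 per group, rounded up; a minimal sketch that reproduces a few cells (this formula is an inference from the table values, not stated in the source):

```python
import math
from scipy.stats import norm

def n_per_group(delta_over_sigma, alpha, beta):
    """Sample size per group to detect a shift of delta (in sigma units)."""
    z = norm.isf(alpha / 2) + norm.isf(beta)
    return math.ceil(2 * (z / delta_over_sigma) ** 2)

print(n_per_group(1.0, 0.05, 0.05))   # table: 26
print(n_per_group(0.5, 0.10, 0.10))   # table: 69
```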
Fulfillment & Span
Fulfillment - Providing what the customer wants,
when the customer wants it.
Fulfillment is a highly segmented metric and typically does not follow a normal
distribution. Because the data is non-normal, some of the traditional 6 Sigma
tools should not be used (such as the 6 Sigma Process Report). Therefore,
Median and Span will be used to measure Fulfillment.
Median - the middle value in a data set
Span - the difference between two values in the data set
(e.g., the 1/99 Span = the difference between the 99th percentile and the 1st percentile)
Example
A sample of 100 delivery times has
a high value of 40 days.
If that one value had instead been
30 days, the 1/99 span would
change by 10 days.
The 10/90 span is not affected by what
happens to that highest point.
We don't want our decision to be influenced by a single data point.
Therefore, the Span calculation is dependent on the sample size; larger
data sets use a wider span. Following are corporate guidelines on the
Span calculation:
Sample Size Span
100-500 10/90 Span
500-5000 5/95 Span
>5000 1/99 Span
In order to analyze a fulfillment process, the data should be segmented by the
variables that may affect the process. Each segment of data should be
compared to identify if the segmenting factor had an influence on the Median
and the Span. Mood's Median test is a tool that can be used to identify
significant differences in Median. Factors that are identified as having an
influence on Span and Median, should be evaluated further through designed
experimentation.
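Median and span fall directly out of percentiles; a minimal sketch (the delivery times are simulated placeholder data):

```python
import numpy as np

rng = np.random.default_rng(5)
delivery_days = rng.lognormal(mean=2.0, sigma=0.4, size=400)  # placeholder data

median = np.median(delivery_days)
# 400 points falls in the 100-500 guideline band, so use the 10/90 span.
p10, p90 = np.percentile(delivery_days, [10, 90])
print(f"median = {median:.1f} days, 10/90 span = {p90 - p10:.1f} days")
```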
F Distribution (alpha = .05)
Numerator degrees of freedom across the top; denominator degrees of freedom down the side.
Denom DF 1 2 3 4 5 6 7 8 9 10
1 161.40 199.50 215.70 224.60 230.20 234.00 236.80 238.90 240.50 241.90
2 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38 19.40
3 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 8.79
4 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96
5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 4.74
6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06
7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.64
8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.35
9 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.14
10 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.98
11 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90 2.85
12 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80 2.75
13 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71 2.67
14 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65 2.60
15 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59 2.54
16 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54 2.49
17 4.45 3.59 3.20 2.96 2.81 2.70 2.61 2.55 2.49 2.45
18 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46 2.41
19 4.38 3.52 3.13 2.90 2.74 2.63 2.54 2.48 2.42 2.38
20 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39 2.35
21 4.32 3.47 3.07 2.84 2.68 2.57 2.49 2.42 2.37 2.32
22 4.30 3.44 3.05 2.82 2.66 2.55 2.46 2.40 2.34 2.30
23 4.28 3.42 3.03 2.80 2.64 2.53 2.44 2.37 2.32 2.27
24 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30 2.25
25 4.24 3.39 2.99 2.76 2.60 2.49 2.40 2.34 2.28 2.24
26 4.23 3.37 2.98 2.74 2.59 2.47 2.39 2.32 2.27 2.22
27 4.21 3.35 2.96 2.73 2.57 2.46 2.37 2.31 2.25 2.20
28 4.20 3.34 2.95 2.71 2.56 2.45 2.36 2.29 2.24 2.19
29 4.18 3.33 2.93 2.70 2.55 2.43 2.35 2.28 2.22 2.18
30 4.17 3.32 2.92 2.69 2.53 2.42 2.33 2.27 2.21 2.16
40 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12 2.08
60 4.00 3.15 2.76 2.53 2.37 2.25 2.17 2.10 2.04 1.99
120 3.92 3.07 2.68 2.45 2.29 2.17 2.09 2.02 1.96 1.91
∞ 3.84 3.00 2.60 2.37 2.21 2.10 2.01 1.94 1.88 1.83
Z - An Important Measure

Z Short Term
$Z_{ST} = \frac{|SL - Target|}{s_{ST}}$
Z.st describes how the process performs at any given moment in time. It is referred to
as instantaneous capability, short-term capability, or process entitlement. It is used
when referring to the SIGMA of a process. It is the process capability if everything is
controlled so that only background noise (common cause variation) is present. This metric
assumes the process is centered and the data were gathered in accordance with the
principles and spirit of a rational subgrouping plan (p. 14). The Target assumes that
each subgroup average is aligned to this number, so that all subgroup means are artificially
centered on this number. The s.st used in this equation can be estimated by the square root
of the Mean Square Error term in the ANOVA table. Since the data are centered, Z.st can be
calculated from either one of the Specification Limits (SL).

Z Long Term
$Z_{LT} = \min\!\left(\frac{USL - \bar{X}}{s_{LT}},\ \frac{\bar{X} - LSL}{s_{LT}}\right)$
Z.LT describes the sustained reproducibility of a process. It is also called long-term
capability. It reflects all of the sources of operational variation: the influence of
common cause variation, dynamic nonrandom process centering error, and any static offset
present in the process mean. This metric assumes the data were gathered in accordance with
the principles and spirit of a rational sampling plan (p. 14). This equation is applicable
to all types of tolerances. It is used to estimate the long-term process PPM.

Z Shift
$Z_{SHIFT} = Z_{ST} - Z_{LT}$
Z.SHIFT describes how well the process being measured is controlled over time. It reflects
the difference between the short-term and long-term capability. It focuses on the dynamic
nonrandom process centering error and any static offset present in the process mean.
Interpretation of the Z.shift is only valid when following the principles of rational
subgrouping (p. 14).

Z Benchmark
$Z_{Benchmark} = Z\text{-}score\,(P_{USL} + P_{LSL})$
While the Z values above are all calculated in reference to a single spec limit,
Z Benchmark is the Z score of the summation of the probabilities of defects in both
tails of the distribution. To find it, sum the probability of a defect at the Lower Spec
Limit (P.LSL) and the probability of a defect at the Upper Spec Limit (P.USL), then look
up the combined probability in a normal table to find the corresponding Z value.
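The four metrics chain together; a minimal sketch whose placeholder inputs echo the capability-table example from the Six Sigma Process Report earlier:

```python
from scipy.stats import norm

usl, lsl, target = 22.0, 8.0, 15.0       # placeholder specification limits
mean_lt, s_st, s_lt = 12.44, 1.89, 2.05  # placeholder process statistics

z_st = (usl - target) / s_st             # process assumed centered: either limit works
z_lt = min((usl - mean_lt) / s_lt, (mean_lt - lsl) / s_lt)
z_shift = z_st - z_lt

# Z benchmark: combine the defect probabilities in both tails, then invert.
p_total = norm.sf((usl - mean_lt) / s_lt) + norm.sf((mean_lt - lsl) / s_lt)
z_bench = norm.isf(p_total)
print(f"Z.st={z_st:.2f}  Z.lt={z_lt:.2f}  Z.shift={z_shift:.2f}  Z.bench={z_bench:.2f}")
```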
The Standard Normal Curve
** Area under the curve = 1, the center is 0 **
Z Area Z Area Z Area Z Area
0.00 .500000000 1.51 .065521615 3.02 .001263795 4.53 .000002999
0.05 .480061306 1.56 .059379869 3.07 .001070234 4.58 .000002369
0.10 .460172290 1.61 .053698886 3.12 .000904215 4.63 .000001867
0.15 .440382395 1.66 .048457216 3.17 .000762175 4.68 .000001469
0.20 .420740315 1.71 .043632958 3.22 .000640954 4.73 .000001153
0.25 .401293634 1.76 .039203955 3.27 .000537758 4.78 .000000903
0.30 .382088486 1.81 .035147973 3.32 .000450127 4.83 .000000705
0.35 .363169226 1.86 .031442864 3.37 .000375899 4.88 .000000550
0.40 .344578129 1.91 .028066724 3.42 .000313179 4.93 .000000428
0.45 .326355105 1.96 .024998022 3.47 .000260317 4.98 .000000332
0.50 .308537454 2.01 .022215724 3.52 .000215873 5.03 .000000258
0.55 .291159644 2.06 .019699396 3.57 .000178601 5.08 .000000199
0.60 .274253121 2.11 .017429293 3.62 .000147419 5.13 .000000154
0.65 .257846158 2.16 .015386434 3.67 .000121399 5.18 .000000118
0.70 .241963737 2.21 .013552660 3.72 .000099739 5.23 .000000091
0.75 .226627465 2.26 .011910681 3.77 .000081753 5.28 .000000070
0.80 .211855526 2.31 .010444106 3.82 .000066855 5.33 .000000053
0.85 .197662672 2.36 .009137469 3.87 .000054545 5.38 .000000041
0.90 .184060243 2.41 .007976235 3.92 .000044399 5.43 .000000031
0.95 .171056222 2.46 .006946800 3.97 .000036057 5.48 .000000024
1.00 .158655319 2.51 .006036485 4.02 .000029215 5.53 .000000018
1.05 .146859086 2.56 .005233515 4.07 .000023617 5.58 .000000014
1.10 .135666053 2.61 .004527002 4.12 .000019047 5.63 .000000010
1.15 .125071891 2.66 .003906912 4.17 .000015327 5.68 .000000008
1.20 .115069593 2.71 .003364033 4.22 .000012305 5.73 .000000006
1.25 .105649671 2.76 .002889938 4.27 .000009857 5.78 .000000004
1.30 .096800364 2.81 .002476947 4.32 .000007878 5.83 .000000003
1.35 .088507862 2.86 .002118083 4.37 .000006282 5.88 .000000003
1.40 .080756531 2.91 .001807032 4.42 .000004998 5.93 .000000002
1.45 .073529141 2.96 .001538097 4.47 .000003968 5.98 .000000001
1.50 .066807100 3.01 .001306156 4.52 .000003143 6.03 .000000001
Table of Area Under the Normal Curve: this table lists the tail area to the right of Z (example: Z = 2.76).
Copyright 1995 Six Sigma Academy, Inc.
The Z value is a measure of process capability and is often referred to as the sigma of the process. A Z = 1 indicates a process for which the performance limit falls one standard deviation from the mean. If we calculate the standard normal deviate for a given performance limit and discover that Z = 2.76, the probability of a defect (P(d)) is the probability of a point lying beyond the Z value of 2.76.
(Figure: standard normal curve in Z units of measure, with the mean at Z = 0, points of inflection at Z = ±1, and total area under the curve = 1; the tail beyond the performance limit is the probability of a defect, in this example .00289.)
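The tail-area lookup in the example above can be reproduced directly; a brief sketch (scipy assumed available):

```python
# Reproduce the example above: the tail area to the right of Z = 2.76
# is the probability of a defect.
from scipy.stats import norm

p_defect = norm.sf(2.76)          # survival function: P(Z > 2.76)
print(f"P(d) = {p_defect:.5f}")   # 0.00289, matching the table entry
```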
Denom DF 12 15 20 24 30 40 60 120 ∞
1 243.90 245.90 248.00 249.10 250.10 251.10 252.20 253.30 254.30
2 19.41 19.43 19.45 19.45 19.46 19.47 19.48 19.49 19.50
3 8.74 8.70 8.66 8.64 8.62 8.59 8.57 8.55 8.53
4 5.91 5.86 5.80 5.77 5.75 5.72 5.69 5.66 5.63
5 4.68 4.62 4.56 4.53 4.50 4.46 4.43 4.40 4.36
6 4.00 3.94 3.87 3.84 3.81 3.77 3.74 3.70 3.67
7 3.57 3.51 3.44 3.41 3.38 3.34 3.30 3.27 3.23
8 3.28 3.22 3.15 3.12 3.08 3.04 3.01 2.97 2.93
9 3.07 3.01 2.94 2.90 2.86 2.83 2.79 2.75 2.71
10 2.91 2.85 2.77 2.74 2.70 2.66 2.62 2.58 2.54
11 2.79 2.72 2.65 2.61 2.57 2.53 2.49 2.45 2.40
12 2.69 2.62 2.54 2.51 2.47 2.43 2.38 2.34 2.30
13 2.60 2.53 2.46 2.42 2.38 2.34 2.30 2.25 2.21
14 2.53 2.46 2.39 2.35 2.31 2.27 2.22 2.18 2.13
15 2.48 2.40 2.33 2.29 2.25 2.20 2.16 2.11 2.07
16 2.42 2.35 2.28 2.24 2.19 2.15 2.11 2.06 2.01
17 2.38 2.31 2.23 2.19 2.15 2.10 2.06 2.01 1.96
18 2.34 2.27 2.19 2.15 2.11 2.06 2.02 1.97 1.92
19 2.31 2.23 2.16 2.11 2.07 2.03 1.98 1.93 1.88
20 2.28 2.20 2.12 2.08 2.04 1.99 1.95 1.90 1.84
21 2.25 2.18 2.10 2.05 2.01 1.96 1.92 1.87 1.81
22 2.23 2.15 2.07 2.03 1.98 1.94 1.89 1.84 1.78
23 2.20 2.13 2.05 2.01 1.96 1.91 1.86 1.81 1.76
24 2.18 2.11 2.03 1.98 1.94 1.89 1.84 1.79 1.73
25 2.16 2.09 2.01 1.96 1.92 1.87 1.82 1.77 1.71
26 2.15 2.07 1.99 1.95 1.90 1.85 1.80 1.75 1.69
27 2.13 2.06 1.97 1.93 1.88 1.84 1.79 1.73 1.67
28 2.12 2.04 1.96 1.91 1.87 1.82 1.77 1.71 1.65
29 2.10 2.03 1.94 1.90 1.85 1.81 1.75 1.70 1.64
30 2.09 2.01 1.93 1.89 1.84 1.79 1.74 1.68 1.62
40 2.00 1.92 1.84 1.79 1.74 1.69 1.64 1.58 1.51
60 1.92 1.84 1.75 1.70 1.65 1.59 1.53 1.47 1.39
120 1.83 1.75 1.66 1.61 1.55 1.50 1.43 1.35 1.25
∞ 1.75 1.67 1.57 1.52 1.46 1.39 1.32 1.22 1.00
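As a hedged aside, any cell in these F tables can be checked in software; this sketch assumes scipy and uses the α = .05 upper-tail critical value:

```python
# Check one F-table cell: numerator df = 12, denominator df = 10, alpha = .05.
from scipy.stats import f

critical = f.isf(0.05, dfn=12, dfd=10)   # upper-tail critical value
print(round(critical, 2))                # 2.91, matching the table
```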
Chi-Square Distribution
df (upper-tail area) .995 .990 .975 .950 .900 .750 .500
1 .000039 .000160 .000980 .003930 .015800 .101500 .455000
2 0.010 0.020 0.051 0.103 0.211 0.575 1.386
3 0.072 0.115 0.216 0.352 0.584 1.213 2.366
4 0.207 0.297 0.484 0.711 1.064 1.923 3.357
5 0.412 0.554 0.831 1.145 1.610 2.675 4.351
6 0.676 0.872 1.237 1.635 2.204 3.455 5.348
7 0.989 1.239 1.690 2.167 2.833 4.255 6.346
8 1.344 1.646 2.180 2.733 3.490 5.071 7.344
9 1.735 2.088 2.700 3.325 4.168 5.899 8.343
10 2.156 2.558 3.247 3.940 4.865 6.737 9.342
11 2.603 3.053 3.816 4.575 5.578 7.584 10.341
12 3.074 3.571 4.404 5.226 6.304 8.438 11.340
13 3.565 4.107 5.009 5.892 7.042 9.299 12.340
14 4.075 4.660 5.629 6.571 7.790 10.165 13.339
15 4.601 5.229 6.262 7.261 8.547 11.036 14.339
16 5.142 5.812 6.908 7.962 9.312 11.912 15.338
17 5.697 6.408 7.564 8.672 10.085 12.792 16.338
18 6.265 7.015 8.231 9.390 10.865 13.675 17.338
19 6.844 7.633 8.907 10.117 11.651 14.562 18.338
20 7.434 8.260 9.591 10.851 12.443 15.452 19.337
21 8.034 8.897 10.283 11.591 13.240 16.344 20.337
22 8.643 9.542 10.982 12.338 14.041 17.240 21.337
23 9.260 10.196 11.688 13.091 14.848 18.137 22.337
24 9.886 10.856 12.401 13.848 15.659 19.037 23.337
25 10.520 11.524 13.120 14.611 16.473 19.939 24.337
26 11.160 12.198 13.844 15.379 17.292 20.843 25.336
27 11.808 12.879 14.573 16.151 18.114 21.749 26.336
28 12.461 13.565 15.308 16.928 18.939 22.657 27.336
29 13.121 14.256 16.047 17.708 19.768 23.567 28.336
30 13.787 14.953 16.791 18.493 20.599 24.478 29.336
7 Basic QC Tools - Ishikawa
The seven basic QC tools are the simplest, quickest tools for
structured problem solving. In many cases these tools will define
the appropriate area in which to focus to solve quality problems.
They are an integral part of the Six Sigma DMAIC process toolkit.
Brainstorming: Allows generation of a high volume of ideas quickly. Generally used together with the advocacy team when identifying the potential Xs.
Pareto: Helps to define the potential vital few Xs. The Pareto links data to problem causes and aids in making data-based decisions (Page 23).
Histogram: Displays the frequency of occurrence of various categories in chart form; can be used as a first cut at the mean, variation, and distribution of the data. An important part of process data analysis (Page 18).
Cause & Effect / Fishbone Diagram: Helps identify potential problem causes and focus brainstorming (Page 23).
Flowcharting / Process Mapping: Displays the actual steps of a process; provides a basis for examining potential areas of improvement.
Scatter Charts: Show the relationship between two variables (Page 18).
Check Sheets: Capture data in a format that facilitates interpretation.
(Figures: example Pareto chart and fishbone (Ishikawa) diagram.)
Chi-Square Distribution (continued)
df (upper-tail area) .250 .100 .050 .025 .010 .005 .001
1 1.323 2.706 3.841 5.024 6.635 7.879 10.828
2 2.773 4.605 5.991 7.378 9.210 10.597 13.816
3 4.108 6.251 7.815 9.348 11.345 12.838 16.266
4 5.385 7.779 9.488 11.143 13.277 14.860 18.467
5 6.626 9.236 11.070 12.832 15.086 16.750 20.515
6 7.841 10.645 12.592 14.449 16.812 18.548 22.458
7 9.037 12.017 14.067 16.013 18.475 20.278 24.322
8 10.219 13.362 15.507 17.535 20.090 21.955 26.125
9 11.389 14.684 16.919 19.023 21.666 23.589 27.877
10 12.549 15.987 18.307 20.483 23.209 25.188 29.588
11 13.701 17.275 19.675 21.920 24.725 26.757 31.264
12 14.845 18.549 21.026 23.337 26.217 28.300 32.909
13 15.984 19.812 22.362 24.736 27.688 29.819 34.528
14 17.117 21.064 23.685 26.119 29.141 31.319 36.123
15 18.245 22.307 24.996 27.488 30.578 32.801 37.697
16 19.369 23.542 26.296 28.845 32.000 34.267 39.252
17 20.489 24.769 27.587 30.191 33.409 35.718 40.790
18 21.605 25.989 28.869 31.526 34.805 37.156 42.312
19 22.718 27.204 30.144 32.852 36.191 38.582 43.820
20 23.828 28.412 31.410 34.170 37.566 39.997 45.315
21 24.935 29.615 32.671 35.479 38.932 41.401 46.797
22 26.039 30.813 33.924 36.781 40.289 42.796 48.268
23 27.141 32.007 35.172 38.076 41.638 44.181 49.728
24 28.241 33.196 36.415 39.364 42.980 45.558 51.179
25 29.339 34.382 37.652 40.646 44.314 46.928 52.620
26 30.434 35.563 38.885 41.923 45.642 48.290 54.052
27 31.528 36.741 40.113 43.194 46.963 49.645 55.476
28 32.620 37.916 41.337 44.461 48.278 50.993 56.892
29 33.711 39.087 42.557 45.722 49.588 52.336 58.302
30 34.800 40.256 43.773 46.979 50.892 53.672 59.703
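These chi-square tables can also be spot-checked in software; a sketch assuming scipy, where the column heading is the upper-tail area:

```python
# Check one chi-square cell: df = 10 at upper-tail area .050.
from scipy.stats import chi2

critical = chi2.isf(0.05, df=10)   # upper-tail critical value
print(round(critical, 3))          # 18.307, matching the table
```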
Practical Problem Statement
A major cause of futile attempts to solve a problem is a poor up-front statement of the problem. Define the problem using available facts and the planned improvement.
1. Write an initial "as is" problem statement. This statement describes the problem as it exists now: a statement of what hurts or what bugs you. The statement should contain data-based measures of the hurt. For example:
As Is: The response time for 15% of our service calls is more than 24 hours.
2. Be sure the problem statement meets the following criteria:
Is as specific as possible
Contains no potential causes
Contains no conclusions or potential solutions
Is sufficiently narrow in scope
The most common mistake in developing a Problem Statement is stating the problem at too high a level, or too broadly, for effective investigation. Use the Structure Tree (Page 25), Pareto (Page 25) or Rolled Throughput Yield analysis (Page 14) to break the problem down further.
3. Avoid the following in wording problem statements (see the table below):
4. Determine if you have identified the correct level to address the problem. Ask: Is my Y response variable (Output) defined at a level at which it can be solved by direct interaction with its independent variables (Xs, Inputs)?
5. Determine if correcting the Y response variable will result in the desired
improvement in the problem as stated.
6. Describe the desired state, a description of what you want to achieve by solving the problem, as objectively as possible. As with the "as is" statement, be sure the desired state is in measurable, observable terms. For example:
Desired State: The response time for all our service calls is less than 24 hours.
Avoid: Questions
  Ineffective Problem Statement: How can we reduce the downtime on the Assembly Line?
  Effective Problem Statement: Assembly Line downtime currently runs 15% of operating hours.
Avoid: The word "lack"
  Ineffective Problem Statement: We lack word processing software.
  Effective Problem Statement: Material to be typed is backlogged by five days.
Avoid: A solution masquerading as a problem
  Ineffective Problem Statement: We need to hire another warehouse shipping clerk.
  Effective Problem Statement: 50% of the scheduled day's shipments are not being pulled on time.
Avoid: Blaming people instead of processes
  Ineffective Problem Statement: File Clerks aren't doing their jobs.
  Effective Problem Statement: Files cannot be located within the allowed 5 minutes after being requested.
Defining a Six Sigma Project
A well-defined problem is the first step in a successful project!
Normal Distribution
Z 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 5.00E-01 4.96E-01 4.92E-01 4.88E-01 4.84E-01 4.80E-01 4.76E-01 4.72E-01 4.68E-01 4.64E-01
0.1 4.60E-01 4.56E-01 4.52E-01 4.48E-01 4.44E-01 4.40E-01 4.36E-01 4.33E-01 4.29E-01 4.25E-01
0.2 4.21E-01 4.17E-01 4.13E-01 4.09E-01 4.05E-01 4.01E-01 3.97E-01 3.94E-01 3.90E-01 3.86E-01
0.3 3.82E-01 3.78E-01 3.75E-01 3.71E-01 3.67E-01 3.63E-01 3.59E-01 3.56E-01 3.52E-01 3.48E-01
0.4 3.45E-01 3.41E-01 3.37E-01 3.34E-01 3.30E-01 3.26E-01 3.23E-01 3.19E-01 3.16E-01 3.12E-01
0.5 3.09E-01 3.05E-01 3.02E-01 2.98E-01 2.95E-01 2.91E-01 2.88E-01 2.84E-01 2.81E-01 2.78E-01
0.6 2.74E-01 2.71E-01 2.68E-01 2.64E-01 2.61E-01 2.58E-01 2.55E-01 2.51E-01 2.48E-01 2.45E-01
0.7 2.42E-01 2.39E-01 2.36E-01 2.33E-01 2.30E-01 2.27E-01 2.24E-01 2.21E-01 2.18E-01 2.15E-01
0.8 2.12E-01 2.09E-01 2.06E-01 2.03E-01 2.01E-01 1.98E-01 1.95E-01 1.92E-01 1.89E-01 1.87E-01
0.9 1.84E-01 1.81E-01 1.79E-01 1.76E-01 1.74E-01 1.71E-01 1.69E-01 1.66E-01 1.64E-01 1.61E-01
1.0 1.59E-01 1.56E-01 1.54E-01 1.52E-01 1.49E-01 1.47E-01 1.45E-01 1.42E-01 1.40E-01 1.38E-01
1.1 1.36E-01 1.34E-01 1.31E-01 1.29E-01 1.27E-01 1.25E-01 1.23E-01 1.21E-01 1.19E-01 1.17E-01
1.2 1.15E-01 1.13E-01 1.11E-01 1.09E-01 1.08E-01 1.06E-01 1.04E-01 1.02E-01 1.00E-01 9.85E-02
1.3 9.68E-02 9.51E-02 9.34E-02 9.18E-02 9.01E-02 8.85E-02 8.69E-02 8.53E-02 8.38E-02 8.23E-02
1.4 8.08E-02 7.93E-02 7.78E-02 7.64E-02 7.49E-02 7.35E-02 7.21E-02 7.08E-02 6.94E-02 6.81E-02
1.5 6.68E-02 6.55E-02 6.43E-02 6.30E-02 6.18E-02 6.06E-02 5.94E-02 5.82E-02 5.71E-02 5.59E-02
1.6 5.48E-02 5.37E-02 5.26E-02 5.16E-02 5.05E-02 4.95E-02 4.85E-02 4.75E-02 4.65E-02 4.55E-02
1.7 4.46E-02 4.36E-02 4.27E-02 4.18E-02 4.09E-02 4.01E-02 3.92E-02 3.84E-02 3.75E-02 3.67E-02
1.8 3.59E-02 3.52E-02 3.44E-02 3.36E-02 3.29E-02 3.22E-02 3.14E-02 3.07E-02 3.01E-02 2.94E-02
1.9 2.87E-02 2.81E-02 2.74E-02 2.68E-02 2.62E-02 2.56E-02 2.50E-02 2.44E-02 2.39E-02 2.33E-02
2.0 2.28E-02 2.22E-02 2.17E-02 2.12E-02 2.07E-02 2.02E-02 1.97E-02 1.92E-02 1.88E-02 1.83E-02
2.1 1.79E-02 1.74E-02 1.70E-02 1.66E-02 1.62E-02 1.58E-02 1.54E-02 1.50E-02 1.46E-02 1.43E-02
2.2 1.39E-02 1.36E-02 1.32E-02 1.29E-02 1.26E-02 1.22E-02 1.19E-02 1.16E-02 1.13E-02 1.10E-02
2.3 1.07E-02 1.04E-02 1.02E-02 9.90E-03 9.64E-03 9.39E-03 9.14E-03 8.89E-03 8.66E-03 8.42E-03
2.4 8.20E-03 7.98E-03 7.76E-03 7.55E-03 7.34E-03 7.14E-03 6.95E-03 6.76E-03 6.57E-03 6.39E-03
2.5 6.21E-03 6.04E-03 5.87E-03 5.70E-03 5.54E-03 5.39E-03 5.23E-03 5.09E-03 4.94E-03 4.80E-03
2.6 4.66E-03 4.53E-03 4.40E-03 4.27E-03 4.15E-03 4.02E-03 3.91E-03 3.79E-03 3.68E-03 3.57E-03
2.7 3.47E-03 3.36E-03 3.26E-03 3.17E-03 3.07E-03 2.98E-03 2.89E-03 2.80E-03 2.72E-03 2.64E-03
2.8 2.56E-03 2.48E-03 2.40E-03 2.33E-03 2.26E-03 2.19E-03 2.12E-03 2.05E-03 1.99E-03 1.93E-03
2.9 1.87E-03 1.81E-03 1.75E-03 1.70E-03 1.64E-03 1.59E-03 1.54E-03 1.49E-03 1.44E-03 1.40E-03
3.0 1.35E-03 1.31E-03 1.26E-03 1.22E-03 1.18E-03 1.14E-03 1.11E-03 1.07E-03 1.04E-03 1.00E-03
3.1 9.68E-04 9.35E-04 9.04E-04 8.74E-04 8.45E-04 8.16E-04 7.89E-04 7.62E-04 7.36E-04 7.11E-04
3.2 6.87E-04 6.64E-04 6.41E-04 6.19E-04 5.98E-04 5.77E-04 5.57E-04 5.38E-04 5.19E-04 5.01E-04
3.3 4.84E-04 4.67E-04 4.50E-04 4.34E-04 4.19E-04 4.04E-04 3.90E-04 3.76E-04 3.63E-04 3.50E-04
3.4 3.37E-04 3.25E-04 3.13E-04 3.02E-04 2.91E-04 2.80E-04 2.70E-04 2.60E-04 2.51E-04 2.42E-04
3.5 2.33E-04 2.24E-04 2.16E-04 2.08E-04 2.00E-04 1.93E-04 1.86E-04 1.79E-04 1.72E-04 1.66E-04
3.6 1.59E-04 1.53E-04 1.47E-04 1.42E-04 1.36E-04 1.31E-04 1.26E-04 1.21E-04 1.17E-04 1.12E-04
3.7 1.08E-04 1.04E-04 9.97E-05 9.59E-05 9.21E-05 8.86E-05 8.51E-05 8.18E-05 7.85E-05 7.55E-05
3.8 7.25E-05 6.96E-05 6.69E-05 6.42E-05 6.17E-05 5.92E-05 5.68E-05 5.46E-05 5.24E-05 5.03E-05
3.9 4.82E-05 4.63E-05 4.44E-05 4.26E-05 4.09E-05 3.92E-05 3.76E-05 3.61E-05 3.46E-05 3.32E-05
4.0 3.18E-05 3.05E-05 2.92E-05 2.80E-05 2.68E-05 2.57E-05 2.47E-05 2.36E-05 2.26E-05 2.17E-05
4.1 2.08E-05 1.99E-05 1.91E-05 1.82E-05 1.75E-05 1.67E-05 1.60E-05 1.53E-05 1.47E-05 1.40E-05
4.2 1.34E-05 1.29E-05 1.23E-05 1.18E-05 1.13E-05 1.08E-05 1.03E-05 9.86E-06 9.43E-06 9.01E-06
4.3 8.62E-06 8.24E-06 7.88E-06 7.53E-06 7.20E-06 6.88E-06 6.57E-06 6.28E-06 6.00E-06 5.73E-06
4.4 5.48E-06 5.23E-06 5.00E-06 4.77E-06 4.56E-06 4.35E-06 4.16E-06 3.97E-06 3.79E-06 3.62E-06
4.5 3.45E-06 3.29E-06 3.14E-06 3.00E-06 2.86E-06 2.73E-06 2.60E-06 2.48E-06 2.37E-06 2.26E-06
4.6 2.15E-06 2.05E-06 1.96E-06 1.87E-06 1.78E-06 1.70E-06 1.62E-06 1.54E-06 1.47E-06 1.40E-06
4.7 1.33E-06 1.27E-06 1.21E-06 1.15E-06 1.10E-06 1.05E-06 9.96E-07 9.48E-07 9.03E-07 8.59E-07
4.8 8.18E-07 7.79E-07 7.41E-07 7.05E-07 6.71E-07 6.39E-07 6.08E-07 5.78E-07 5.50E-07 5.23E-07
4.9 4.98E-07 4.73E-07 4.50E-07 4.28E-07 4.07E-07 3.87E-07 3.68E-07 3.50E-07 3.32E-07 3.16E-07
SPECIFIC: The issue is clearly defined to the lowest level of cause and effect. The project should have a response variable (Y) with specifications and constraints (e.g., cycle time for returned parts, washer base width) and should be bound by clearly defined goals. If it looks big, it is. A poorly defined project will require greater scoping time and will have a longer completion time than one that is clearly defined.
VALUE-ADDED: Financially justifiable; directly impacts a business metric that returns value: PPM, reliability, yield, pricing errors, field returns, factory yield, overtime, transportation, warehousing, availability, SCR, rework, under-billing and scrap.
MEASURABLE: The response variable (Y) must have reasonable historical data, or you must have the ability to capture a reliable data stream. Having a method for measuring vital Xs is also essential for in-depth process analysis with data. Discrete data can be effectively used for problem investigation, but variable (continuous) data is better. Projects based on unreliable data have unreliable results.
LOCALLY ACTIONABLE: The selected project should be one which can be addressed by the local organization. Adequate support is needed to ensure successful project completion and permanent change to the process. It is difficult to manage improvements in Louisville from the field.
CUSTOMER FOCUSED: The Project Y should be clearly linked to a specific customer want or need, and can result in improved customer perception or consumer satisfaction (Customer WOW): on-time delivery, billing accuracy, call answer rate.
Normal Distribution (continued)
Z 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
5.0 3.00E-07 2.85E-07 2.71E-07 2.58E-07 2.45E-07 2.32E-07 2.21E-07 2.10E-07 1.99E-07 1.89E-07
5.1 1.80E-07 1.71E-07 1.62E-07 1.54E-07 1.46E-07 1.39E-07 1.31E-07 1.25E-07 1.18E-07 1.12E-07
5.2 1.07E-07 1.01E-07 9.59E-08 9.10E-08 8.63E-08 8.18E-08 7.76E-08 7.36E-08 6.98E-08 6.62E-08
5.3 6.27E-08 5.95E-08 5.64E-08 5.34E-08 5.06E-08 4.80E-08 4.55E-08 4.31E-08 4.08E-08 3.87E-08
5.4 3.66E-08 3.47E-08 3.29E-08 3.11E-08 2.95E-08 2.79E-08 2.64E-08 2.50E-08 2.37E-08 2.24E-08
5.5 2.12E-08 2.01E-08 1.90E-08 1.80E-08 1.70E-08 1.61E-08 1.53E-08 1.44E-08 1.37E-08 1.29E-08
5.6 1.22E-08 1.16E-08 1.09E-08 1.03E-08 9.78E-09 9.24E-09 8.74E-09 8.26E-09 7.81E-09 7.39E-09
5.7 6.98E-09 6.60E-09 6.24E-09 5.89E-09 5.57E-09 5.26E-09 4.97E-09 4.70E-09 4.44E-09 4.19E-09
5.8 3.96E-09 3.74E-09 3.53E-09 3.34E-09 3.15E-09 2.97E-09 2.81E-09 2.65E-09 2.50E-09 2.36E-09
5.9 2.23E-09 2.11E-09 1.99E-09 1.88E-09 1.77E-09 1.67E-09 1.58E-09 1.49E-09 1.40E-09 1.32E-09
6.0 1.25E-09 1.18E-09 1.11E-09 1.05E-09 9.88E-10 9.31E-10 8.78E-10 8.28E-10 7.81E-10 7.36E-10
6.1 6.94E-10 6.54E-10 6.17E-10 5.81E-10 5.48E-10 5.16E-10 4.87E-10 4.59E-10 4.32E-10 4.07E-10
6.2 3.84E-10 3.61E-10 3.40E-10 3.21E-10 3.02E-10 2.84E-10 2.68E-10 2.52E-10 2.38E-10 2.24E-10
6.3 2.11E-10 1.98E-10 1.87E-10 1.76E-10 1.66E-10 1.56E-10 1.47E-10 1.38E-10 1.30E-10 1.22E-10
6.4 1.15E-10 1.08E-10 1.02E-10 9.59E-11 9.02E-11 8.49E-11 7.98E-11 7.51E-11 7.06E-11 6.65E-11
6.5 6.25E-11 5.88E-11 5.53E-11 5.20E-11 4.89E-11 4.60E-11 4.32E-11 4.07E-11 3.82E-11 3.59E-11
6.6 3.38E-11 3.18E-11 2.98E-11 2.81E-11 2.64E-11 2.48E-11 2.33E-11 2.19E-11 2.06E-11 1.93E-11
6.7 1.82E-11 1.71E-11 1.60E-11 1.51E-11 1.42E-11 1.33E-11 1.25E-11 1.17E-11 1.10E-11 1.04E-11
6.8 9.72E-12 9.13E-12 8.57E-12 8.05E-12 7.56E-12 7.10E-12 6.66E-12 6.26E-12 5.87E-12 5.52E-12
6.9 5.18E-12 4.86E-12 4.56E-12 4.28E-12 4.02E-12 3.77E-12 3.54E-12 3.32E-12 3.12E-12 2.93E-12
7.0 2.75E-12 2.58E-12 2.42E-12 2.27E-12 2.13E-12 2.00E-12 1.87E-12 1.76E-12 1.65E-12 1.55E-12
7.1 1.45E-12 1.36E-12 1.28E-12 1.20E-12 1.12E-12 1.05E-12 9.88E-13 9.26E-13 8.69E-13 8.15E-13
7.2 7.64E-13 7.16E-13 6.72E-13 6.30E-13 5.90E-13 5.54E-13 5.19E-13 4.86E-13 4.56E-13 4.28E-13
7.3 4.01E-13 3.76E-13 3.52E-13 3.30E-13 3.09E-13 2.90E-13 2.72E-13 2.55E-13 2.39E-13 2.24E-13
7.4 2.10E-13 1.96E-13 1.84E-13 1.72E-13 1.62E-13 1.51E-13 1.42E-13 1.33E-13 1.24E-13 1.17E-13
7.5 1.09E-13 1.02E-13 9.58E-14 8.98E-14 8.41E-14 7.87E-14 7.38E-14 6.91E-14 6.47E-14 6.06E-14
7.6 5.68E-14 5.32E-14 4.98E-14 4.66E-14 4.37E-14 4.09E-14 3.83E-14 3.58E-14 3.36E-14 3.14E-14
7.7 2.94E-14 2.76E-14 2.58E-14 2.42E-14 2.26E-14 2.12E-14 1.98E-14 1.86E-14 1.74E-14 1.63E-14
7.8 1.52E-14 1.42E-14 1.33E-14 1.25E-14 1.17E-14 1.09E-14 1.02E-14 9.58E-15 8.97E-15 8.39E-15
7.9 7.85E-15 7.35E-15 6.88E-15 6.44E-15 6.02E-15 5.64E-15 5.28E-15 4.94E-15 4.62E-15 4.32E-15
8.0 4.05E-15 3.79E-15 3.54E-15 3.31E-15 3.10E-15 2.90E-15 2.72E-15 2.54E-15 2.38E-15 2.22E-15
8.1 2.08E-15 1.95E-15 1.82E-15 1.70E-15 1.59E-15 1.49E-15 1.40E-15 1.31E-15 1.22E-15 1.14E-15
8.2 1.07E-15 9.99E-16 9.35E-16 8.74E-16 8.18E-16 7.65E-16 7.16E-16 6.69E-16 6.26E-16 5.86E-16
8.3 5.48E-16 5.12E-16 4.79E-16 4.48E-16 4.19E-16 3.92E-16 3.67E-16 3.43E-16 3.21E-16 3.00E-16
8.4 2.81E-16 2.62E-16 2.45E-16 2.30E-16 2.15E-16 2.01E-16 1.88E-16 1.76E-16 1.64E-16 1.54E-16
8.5 1.44E-16 1.34E-16 1.26E-16 1.17E-16 1.10E-16 1.03E-16 9.60E-17 8.98E-17 8.40E-17 7.85E-17
8.6 7.34E-17 6.87E-17 6.42E-17 6.00E-17 5.61E-17 5.25E-17 4.91E-17 4.59E-17 4.29E-17 4.01E-17
8.7 3.75E-17 3.51E-17 3.28E-17 3.07E-17 2.87E-17 2.68E-17 2.51E-17 2.35E-17 2.19E-17 2.05E-17
8.8 1.92E-17 1.79E-17 1.68E-17 1.57E-17 1.47E-17 1.37E-17 1.28E-17 1.20E-17 1.12E-17 1.05E-17
8.9 9.79E-18 9.16E-18 8.56E-18 8.00E-18 7.48E-18 7.00E-18 6.54E-18 6.12E-18 5.72E-18 5.35E-18
9.0 5.00E-18 4.68E-18 4.37E-18 4.09E-18 3.82E-18 3.57E-18 3.34E-18 3.13E-18 2.92E-18 2.73E-18
9.1 2.56E-18 2.39E-18 2.23E-18 2.09E-18 1.95E-18 1.83E-18 1.71E-18 1.60E-18 1.49E-18 1.40E-18
9.2 1.31E-18 1.22E-18 1.14E-18 1.07E-18 9.98E-19 9.33E-19 8.73E-19 8.16E-19 7.63E-19 7.14E-19
9.3 6.67E-19 6.24E-19 5.83E-19 5.46E-19 5.10E-19 4.77E-19 4.46E-19 4.17E-19 3.90E-19 3.65E-19
9.4 3.41E-19 3.19E-19 2.98E-19 2.79E-19 2.61E-19 2.44E-19 2.28E-19 2.14E-19 2.00E-19 1.87E-19
9.5 1.75E-19 1.63E-19 1.53E-19 1.43E-19 1.34E-19 1.25E-19 1.17E-19 1.09E-19 1.02E-19 9.56E-20
9.6 8.94E-20 8.37E-20 7.82E-20 7.32E-20 6.85E-20 6.40E-20 5.99E-20 5.60E-20 5.24E-20 4.90E-20
9.7 4.58E-20 4.29E-20 4.01E-20 3.75E-20 3.51E-20 3.28E-20 3.07E-20 2.87E-20 2.69E-20 2.52E-20
9.8 2.35E-20 2.20E-20 2.06E-20 1.93E-20 1.80E-20 1.69E-20 1.58E-20 1.48E-20 1.38E-20 1.29E-20
9.9 1.21E-20 1.13E-20 1.06E-20 9.90E-21 9.26E-21 8.67E-21 8.11E-21 7.59E-21 7.10E-21 6.64E-21
10.0 6.22E-21 5.82E-21 5.44E-21 5.09E-21 4.77E-21 4.46E-21 4.17E-21 3.91E-21 3.66E-21 3.42E-21
Six Sigma Problem Solving Processes
Step | Description | Focus | Tools | Deliverables
Define
A | Identify Project CTQs | Y | VOC; Process Map; CAP | Project CTQs (1)
B | Develop Team Charter | Project | CAP | Approved Charter (2)
C | Define Process Map | Y=f(x) | Process Map | High Level Process Map (3)
Measure
1 | Select CTQ Characteristics | Y | VOC; QFD; FMEA | Project Y (4)
2 | Define Performance Standards | Y | VOC; Blueprints | Performance Standard for Project Y (5)
3 | Measurement System Analysis | Y & X | Continuous Gage R&R; Test/Retest; Attribute R&R | Data Collection Plan & MSA (6), Data for Project Y (7)
Analyze
4 | Establish Process Capability | Y | Capability Indices | Process Capability for Project Y (8)
5 | Define Performance Objectives | Y | Team; Benchmarking | Improvement Goal for Project Y (9)
6 | Identify Variation Sources | X | Process Analysis; Graphical Analysis; Hypothesis Tests | Prioritized List of all Xs (10)
Improve
7 | Screen Potential Causes | X | DOE-Screening | List of Vital Few Xs (11)
8 | Discover Variable Relationships | X | Factorial Designs | Proposed Solution (13)
9 | Establish Operating Tolerances | X | Simulation | Piloted Solution (14)
Control
10 | Define & Validate Measurement System on Xs in Actual Application | X, Y | Continuous Gage R&R; Test/Retest; Attribute R&R | MSA
11 | Determine New Process Capability | X, Y | Capability Indices | Process Capability Y, X
12 | Implement Process Control | X | Control Charts; Mistake Proofing; FMEA | Sustained Solution (15), Project Documentation (16)
t-Distribution
(1 - α)
df .600 .700 .800 .900 .950 .975 .990 .995
1 0.325 0.727 1.376 3.078 6.314 12.706 31.821 63.657
2 0.289 0.617 1.061 1.886 2.920 4.303 6.965 9.925
3 0.277 0.584 0.978 1.638 2.353 3.182 4.541 5.841
4 0.271 0.569 0.941 1.533 2.132 2.776 3.747 4.604
5 0.267 0.559 0.920 1.476 2.015 2.571 3.365 4.032
6 0.265 0.553 0.906 1.440 1.943 2.447 3.143 3.707
7 0.263 0.549 0.896 1.415 1.895 2.365 2.998 3.499
8 0.262 0.546 0.889 1.397 1.860 2.306 2.896 3.355
9 0.261 0.543 0.883 1.383 1.833 2.262 2.821 3.250
10 0.260 0.542 0.879 1.372 1.812 2.228 2.764 3.169
11 0.260 0.540 0.876 1.363 1.796 2.201 2.718 3.106
12 0.259 0.539 0.873 1.356 1.782 2.179 2.681 3.055
13 0.259 0.538 0.870 1.350 1.771 2.160 2.650 3.012
14 0.258 0.537 0.868 1.345 1.761 2.145 2.624 2.977
15 0.258 0.536 0.866 1.341 1.753 2.131 2.602 2.947
16 0.258 0.535 0.865 1.337 1.746 2.120 2.583 2.921
17 0.257 0.534 0.863 1.333 1.740 2.110 2.567 2.898
18 0.257 0.534 0.862 1.330 1.734 2.101 2.552 2.878
19 0.257 0.533 0.861 1.328 1.729 2.093 2.539 2.861
20 0.257 0.533 0.860 1.325 1.725 2.086 2.528 2.845
21 0.257 0.532 0.859 1.323 1.721 2.080 2.518 2.831
22 0.256 0.532 0.858 1.321 1.717 2.074 2.508 2.819
23 0.256 0.532 0.858 1.319 1.714 2.069 2.500 2.807
24 0.256 0.531 0.857 1.318 1.711 2.064 2.492 2.797
25 0.256 0.531 0.856 1.316 1.708 2.060 2.485 2.787
26 0.256 0.531 0.856 1.315 1.706 2.056 2.479 2.779
27 0.256 0.531 0.855 1.314 1.703 2.052 2.473 2.771
28 0.256 0.530 0.855 1.313 1.701 2.048 2.467 2.763
29 0.256 0.530 0.854 1.311 1.699 2.045 2.462 2.756
30 0.256 0.530 0.854 1.310 1.697 2.042 2.457 2.750
40 0.255 0.529 0.851 1.303 1.684 2.021 2.423 2.704
60 0.254 0.527 0.848 1.296 1.671 2.000 2.390 2.660
120 0.254 0.526 0.845 1.289 1.658 1.980 2.358 2.617
∞ 0.253 0.524 0.842 1.282 1.645 1.960 2.326 2.576
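Any entry in the t table above can be reproduced in software; a sketch assuming scipy, where the column is the cumulative probability 1 - α:

```python
# Check one t-table cell: df = 10 at the .975 column (1 - alpha = .975).
from scipy.stats import t

critical = t.ppf(0.975, df=10)   # cumulative-probability quantile
print(round(critical, 3))        # 2.228, matching the table
```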
Six Sigma Toolkit - Index
Analysis and Improve Tools Selection Matrix 26
ANOVA
  ANOVA / ANOVA One Way 41
  ANOVA Two Way 42
  ANOVA - Balanced 43
  Interpreting the ANOVA Output 44
Calculating Sample Size (Equation for Manual Calculation) 28
Characterizing the Process - Rational Subgrouping 16
Control Chart Constants 59
Control Charts 57-58
Data Validity Studies / % Agreement on Binary (Pass/Fail) Data 10
Defining a Six Sigma Project 4
Definition of Z 8
Design for Six Sigma
  Loop Diagrams 50
  Tolerancing Analysis 51-52
Discrete Data Analysis 35
DOE
  Design of Experiments 53
  Factorial Designs 54
  DOE Analysis 55
DPU / DPO 13
Gage R&R 11-12
General Linear Model 45
Hypothesis Statements 30-31
Hypothesis Testing 29
Minitab Graphics
  Histogram / Scatter Plot 20
  Descriptive Statistics / Normal Plot 21
  One Variable Regression / Residual Plots 22
  Boxplot / Interval Plot 23
  Time Series Plot / Box-Cox Transformation 24
  Pareto Diagrams / Cause & Effect Diagrams 25
Normal Approximation 36
χ² Test (Test for Independence) 37-38
Poisson Approximation 39
Normality of Data 17
Planning Questions 19
Practical Problem Statement 5
Precontrol 60
Project Closure 61
Regression Analysis
  Regression 46
  Stepwise Regression 47
  Regression with Curves (Quadratic) and Interactions 48
  Binary Logistic Regression 49
Response Surface - CCD 56
Rolled Throughput Yield 14
Sample Size Determination 27
Seven Basic Tools 6
Six Sigma Problem Solving Processes 3
Six Sigma Process Report 18
Six Sigma Product Report 15
Stable Ops and 6 Sigma 9
t Test (Testing Means) (1 Sample t; 2 Sample t; Confidence Intervals) 33-34
Tables
  Determining Sample Size 62
  F Test 63-64
  χ² Test 65-66
  Normal Distribution 67-68
  t Test 69
Testing Equality of Variance (F test; Homogeneity of Variance) 32
The Normal Curve 7
The Transfer Function 40
The material in this Toolkit is a combination of material
developed by the GEA Master Black Belts and Dr. Mikel Harry
(The Six Sigma Academy, Inc.). Worksheets, statistical tables
and graphics are outputs of MINITAB for Windows Version
12.2, Copyright 1998, Minitab, Inc. It is intended for use as a
quick reference for trained Black Belts and Green Belts.
More detailed information is available from the Quality Coach Website, SSQC.ge.com.
If you need more GEA Six Sigma Information, visit the GE
Appliances Six Sigma Website at
http://genet.appl.ge.com/sixsigma
For information on GE Corporate Certification Testing, go to
the Green Belt Training Site via the GE Appliances Six Sigma
Website.
For information about other GE Appliances Six Sigma Training, contact a member of the GEA Six Sigma Training Team:
Jeff Keller - Ext 7649
Email: jeff.keller@appl.ge.com
Irene Ligon - Ext 4562
Email: Irene.ligon@appl.ge.com
Broadcast Group eMail:
GEASixSigmaTrainingTeam@Exchange.appl.ge.com
The Toolkit - A Six Sigma Resource
GLOSSARY OF SIX SIGMA TERMS
1. α (alpha risk) - Probability of falsely accepting the alternative hypothesis (HA) of difference
2. ANOVA - Analysis of Variance
3. β (Beta risk) - Probability of falsely accepting the null hypothesis (H0) of no difference
4. χ² - Tests for independent relationship between two discrete variables
5. Δ - Difference between two means
6. DOE - Design of Experiments
7. DPU - Defects per unit
8. e^-DPU - Rolled throughput yield
9. F-Test - Used to compare the variances of two distributions
10. g - number of subgroups
11. FIT - The point estimate of the mean response for each level of the independent variable
12. H0 - Null hypothesis
13. HA - Alternative hypothesis
14. LSL - Lower spec limit
15. μ - Population mean
16. x̄ - Sample mean
17. n - number of samples in a subgroup
18. N - Number in the total population
19. P Value - If the calculated value of p is lower than the alpha (α) risk, reject the null hypothesis and conclude that there is a difference. Often referred to as the observed level of significance.
20. Residual - The difference between the observed values and the Fit; the error in the model
21. σ - Population standard deviation
22. Σ - Summation
23. s - Sample standard deviation
24. Stratify - Divide or arrange data in organized classes or segments, based on known characteristics or factors
25. SS - Sum of squares
26. t-Test - Used to compare the means of two distributions
27. Transfer Function - Prediction Equation - Y = f(x)
28. USL - Upper spec limit
29. X̄ - mean
30. X̿ - mean of the means
31. Z - Transforms a set of data such that μ = 0 and σ = 1
32. Z.lt - Z Long term
33. Z.st - Z Short term
34. Z.shift - Z.st - Z.lt
Revision 4.5 - September 2001
GE Appliances Copyright 2001
GE Appliances
Six Sigma Toolkit
Rev 4.5 9/2001
GE Appliances Proprietary