Normal Approximation
One Proportion
Use this approximation when the sample size is large and the number of
defects in the sample is greater than 10 (np>10), and the number of good
parts in the sample is greater than 10 (n(1-p)>10).
p = (# Defects) / (Sample Size)

A two sided confidence interval for the proportion (p) that are defective in
the population is given by the equation:

p ± Z √( p(1 - p) / n )
The resultant will provide the lower and upper limits of a range of all
plausible values for the proportion of defects of that population.
( p1 - p2 ) ± Z √( p1(1 - p1)/n1 + p2(1 - p2)/n2 )
The resultant will provide the lower and upper limits of a range of all
plausible values of the difference between the proportion defective in the
populations.
If 0 is included within the range of plausible values, then there is not
strong evidence that the proportions of defects in the two populations are
different.
Comparing Two Proportions
If evaluating two different sample sets with proportion defective data, the
confidence interval for the difference in proportion defectives between the
two sample sets is given by:
Where:
ki = # of defects in the ith sample & ni = sample size of the ith sample
p1 = k1 / n1
p2 = k2 / n2
p (pooled) = ( k1 + k2 ) / ( n1 + n2 )
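These proportion intervals can be sketched in Python. This is an illustrative sketch, not from the handbook; it uses the standard library's NormalDist for the Z quantile, and the data values and variable names are our own:

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_ci(defects, n, confidence=0.95):
    """Normal-approximation CI: p +/- Z*sqrt(p(1-p)/n).
    Valid when np > 10 and n(1-p) > 10."""
    p = defects / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided Z
    half = z * sqrt(p * (1 - p) / n)
    return p - half, p + half

def two_proportion_ci(k1, n1, k2, n2, confidence=0.95):
    """CI for p1 - p2; if 0 lies inside the interval, there is not strong
    evidence that the two population proportions differ."""
    p1, p2 = k1 / n1, k2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - half, (p1 - p2) + half

lo, hi = one_proportion_ci(defects=30, n=200)  # p = 0.15
```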
Confidence Intervals
A confidence interval is a range of plausible values for a population parameter, such as
the mean of the population, μ.
For example, a test of 8 units might give an average efficiency of 86.2%. This is the
most likely estimate of the efficiency of the entire population. However, observations
vary, so the true population efficiency might be somewhat higher or lower than 86.2%. A
95% confidence interval for the efficiency might be (81.2%, 91.2%). 95% of the intervals
constructed in this manner will contain the true population parameter.
The confidence interval for the mean of one sample is
t comes from the t tables (Page 65) with n-1 degrees of freedom and with the desired
level of confidence.
Confidence Interval for the difference in the means of 2 samples, if the variances of 2
samples are assumed to be equal is:
t comes from the t tables (Page 65) with n1 + n2 - 2 degrees of freedom, and with the
desired level of confidence. Sp is the pooled standard deviation:
In MINITAB, confidence Intervals are calculated using 1 Sample and 2 Sample t
methods, above. In the text output shown below, the 95% confidence interval for the
difference between the mean of Manu_a and the mean of Manu_b is 6.65 to 8.31. This
statement points to accepting Ha, that the means are different.
95% CI for mu Manu_a - mu Manu_b: ( 6.65, 8.31)
CI for the mean of one sample:
x̄ ± t s / √n

CI for the difference in the means of 2 samples (pooled):
( x̄1 - x̄2 ) ± t Sp √( 1/n1 + 1/n2 )

Pooled standard deviation:
Sp = √[ ( (n1 - 1)s1² + (n2 - 1)s2² ) / ( n1 + n2 - 2 ) ]
Using the 1 Sample t test
Run Stat >Basic Statistics >1-Sample t. In the dialog box, identify the variable or
variables to be tested.
Select the test to be performed, Confidence Interval or Test Mean.
If the Confidence Interval is to be calculated at a level other than the default of 95%,
change to the appropriate number.
If Test Mean is selected, identify the desired mean to be tested (the mean of the null
hypothesis) and, in the alternative box, select the alternative hypothesis which is
appropriate for the analysis. This will determine the test used for the analysis (one tailed
or two tailed).
If graphic output is needed, select the graphs button and choose among Histograms,
Dotplot and Boxplot output.
Click Ok to run analysis.
Analyzing the test results.
If running the Confidence Interval option, Minitab will calculate the t statistic and will
calculate a confidence interval for the data.
If using the Test Mean option, Minitab will provide descriptive statistics for the tested
distribution(s), the t statistic and a p value.
The graphic outputs will all have a graphical representation of the confidence interval of
the mean, shown by a red line with a dot at the mean for the sample population.
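As an illustrative sketch of the one-sample interval x̄ ± t·s/√n (SciPy is assumed available for the t quantile; the data are hypothetical, chosen so the mean is 86.2 to echo the efficiency example above):

```python
from math import sqrt
from statistics import mean, stdev
from scipy.stats import t

def one_sample_t_ci(data, confidence=0.95):
    """Two-sided CI for the mean: x_bar +/- t*s/sqrt(n),
    with t taken from the t distribution at n-1 degrees of freedom."""
    n = len(data)
    x_bar, s = mean(data), stdev(data)
    t_crit = t.ppf(1 - (1 - confidence) / 2, df=n - 1)
    half = t_crit * s / sqrt(n)
    return x_bar - half, x_bar + half

# hypothetical efficiencies (%) for 8 tested units
effs = [84.1, 88.0, 85.5, 87.2, 83.9, 86.8, 89.0, 85.1]
lo, hi = one_sample_t_ci(effs)  # interval centered on the sample mean, 86.2
```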
The χ² Test for Independence tests the null hypothesis (Ho) that two
discrete variables are independent.
Data relating two discrete variables are used to create a contingency table.
For each of the cells in the contingency table the observed frequency is
compared to the expected frequency in order to test for independence.
The expected frequencies in each cell must be at least five (5) for the χ² test to
be valid.
For continuous data, it is best to test for dependency, or correlation, by using
scatter plots and regression analysis.
Example of χ² Analysis
1. There are 2 variables to be studied, height and weight. The null hypothesis Ho is
that weight is independent of height.
2. For each variable 2 conditions (categories) are defined.
Weight: < 140 lbs, > 140 lbs
Height: < 56, > 56
3. The data has been accumulated as shown below:
                       Height below 56   Height above 56   Row Totals
Weight below 140 LBS          20                13              33
Weight above 140 LBS          11                22              33
Column Totals                 31                35            N = 66
Manual χ² Calculation
1. Compute fexp for each cell ij: fexp ij = (Row Total)i x (Column Total)j / N.
2. N is total of all fobs for all 4 cells. (For our example, N = 66 and fexp1,2 =
(33*35)/66 = 1155 / 66 = 17.5).
3. Calculate χ²calc, where χ²calc = Σ [ (fobs - fexp)² / fexp ] = 4.927
4. Calculate the degrees of freedom df = (Number of Rows - 1)(Number of
Columns -1). For our example, df = (2-1) * (2-1) = 1.
5. Determine χ²crit from the χ² table for the degrees of freedom and confidence
level desired (usually 5% risk). For 1 df and 5% risk, χ²crit = 3.841
6. If χ²calc > χ²crit, then reject Ho and accept Ha, i.e. that weight depends on height.
In this example, we reject Ho.
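The worked example above can be checked in Python (SciPy assumed available; correction=False turns off Yates' continuity correction so the result matches the hand calculation):

```python
from scipy.stats import chi2_contingency

# contingency table from the example: rows = weight class, columns = height class
observed = [[20, 13],
            [11, 22]]

chi2, p, df, expected = chi2_contingency(observed, correction=False)
# chi2 ~ 4.927, p ~ 0.026, df = 1; since chi2 > 3.841 (and p < .05), reject Ho
```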
Using Minitab to perform χ² Analysis
Minitab can be used to analyze data using χ² with two different processes:
Stat>Tables>Chi Square Test and Stat>Tables>Cross Tabulation. Chi Square
Test analyzes data which is in a table. Cross Tabulation analyzes data which is
in columns with subscripted categories. Since Minitab commonly needs data in
columns to graph, Cross Tabulation is a preferred method for most analysis.
χ² -- Test for Independence
t Test
A t test tests the hypothesis that the means of two distributions are equal. It can be
used to demonstrate a shift of the mean after a process change. If there has
been a change to a process, and it must be determined whether or not the mean
of the output was changed, compare samples before and after the change using
the t test.
Your ability to detect a shift (or change) is improved by increasing the size of
your samples and by increasing the size of the shift (or change) that you are
trying to detect, or by decreasing the variation (See Sample size; page 27 -
28).
There are two tests for means, a One Sample t test and a Two Sample t
test.
The one sample t test Stat >Basic Statistics >1-Sample t
compares a single distribution average to a target or hypothesized value.
The two sample t test Stat > Basic Statistics > 2-Sample t analyzes
the means of two separate distributions.
Using the 2 Sample t test
1. Pull samples in a random manner from the distributions whose means are being
evaluated. In Minitab, the data can be in separate columns or in a single column
with a subscript column.
2. Determine the Null Hypothesis Ho and Alternative Hypothesis Ha (Less than,
Equal to, or Greater than).
3. Confirm if variances are similar using F test or Homogeneity of Variance (page
30).
4. Run Stat > Basic Statistics > 2-Sample t. In the dialog box, select Samples in
One Column and identify the data column and subscript column, or Samples in
Different Columns and identify both columns.
5. In the alternative box, select the alternative hypothesis which is appropriate for the
analysis. This will determine the test used for the analysis (one tailed or two
tailed).
6. If the variances are similar, check the Assume Equal Variances box.
7. If graphic output is needed, select the graphs button and choose between dotplot
and boxplot output.
8. Click Ok to run analysis.
Analyzing the test results.
Minitab will provide a calculation of descriptive statistics for each distribution, provide
a Confidence Interval statement (page 32) and provide a statement of the t test as
a test of the difference between two means. The output will provide a t statistic,
a p value and the degrees of freedom statistic. To use the t distribution table
on page 65, the t statistic and the degrees of freedom are required. Analysis
can be made using that table or the p value.
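A sketch of the same 2-sample analysis in Python (SciPy assumed; the data are hypothetical, with names echoing the Manu_a / Manu_b output shown earlier):

```python
from scipy.stats import ttest_ind

# hypothetical samples from two distributions being compared
manu_a = [25.1, 26.3, 24.8, 25.9, 26.0, 25.4]
manu_b = [18.2, 17.9, 18.6, 18.1, 17.5, 18.4]

# equal_var=True mirrors checking "Assume Equal Variances" in Minitab
t_stat, p_value = ttest_ind(manu_a, manu_b, equal_var=True)
# a small p value (< .05) points to accepting Ha: the means are different
```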
Minitab: Stat>Tables>Chi Square
1. Create the table shown in the example in
Minitab.
2. Run Stat>Tables>Chi Square Test. In the
dialog box select the columns containing the
tabular data, in this case, C2 and C3. Click
OK to run.
3. In the Session window, the table that is created
will show the expected value for each of the data
cells under the actual data for the cell, plus the
χ² calculation, χ²calc = Σ [ (fobs - fexp)² / fexp ].
4. The Chi Square calculation is χ²calc = 4.927.
The p value for this test is 0.026.
5. The degrees of freedom df = (Number of Rows -
1)(Number of Columns -1) is shown, df = 1.
6. Determine χ²crit from the χ² table for the degrees
of freedom and confidence level desired (usually
5% risk). χ²crit = 3.841.
7. Since χ²calc > χ²crit, reject Ho.
Because the data is in tabular form in Minitab, no
other analysis can be done.
Minitab: Stat>Tables>Cross Tabulation
If additional analysis of data is desired, including any graphical analysis, the
Stat>Tables>Cross Tabulation procedure is preferred. This procedure uses data in
the common Minitab column format. Note that the data is in a single column and the
factors or variables being considered are shown as subscripted values. In this graphic,
the data is in column C6 and the appropriate subscripts are in columns C4 and C5.
1. Run Stat>Tables>Cross Tabulation. In the dialog box select the columns identifying
the factors or variables in the Classification
Variables box.
2. Click Chi Square Analysis and select Above
and expected count.
3. Select the column containing the response data
in the Frequencies in box, in this case, data.
4. Click Run.
5. The output in the session window is very
similar to the output for Stat>Tables>Chi
Square Test, except that it does not show
the Chi Square calculation for each cell.
6. Analysis of the test is done as before,
either by using the generated p value or by
using the calculated χ² and degrees of
freedom and entering the tables with that
information to find χ²crit.
Testing Equality of Variances
The F test is used to compare the variances of two distributions. It tests the
hypothesis, Ho, that the variances of two distributions are equal. It is performed by
forming a ratio of two variances from two samples and comparing the ratio with a value
in the F distribution table. The F test can be used to demonstrate that the variance
has been increased or decreased after a process change. Since t tests and
ANOVA need to know if population variance is the same or different, this test is also
a prerequisite for doing other types of hypothesis testing. In Minitab, this test is done
as Homogeneity of Variance.
The F test is also used during the ANOVA process to confirm or reject hypotheses
about the equality of averages of several populations.
Performing an F Test
1. Pull samples in a random manner from the two distributions for which you are
comparing the variances. Prior to running the test confirm sample distribution
normality for each sample (page 17).
2. Compute the F statistic, Fcalc = s1² / s2². The F statistic should always be
calculated so that the larger variance is in the numerator.
3. Calculate the degrees of freedom for each sample. Degrees of freedom = ni - 1,
where ni is the sample size for the ith sample, i.e., n1 - 1 & n2 - 1.
4. Specify the risk level that you can tolerate for making an error in your decision
(usually set at 5%.)
5. Use the F distribution table (p 59 - 60) to determine Fcrit for the degrees of
freedom in your samples and for the risk level you have chosen.
6. Compare Fcalc to Fcrit. If Fcalc < Fcrit, the null hypothesis, Ho, which implies that
the variances from both distributions are equal, cannot be rejected. If Fcalc > Fcrit,
reject the null hypothesis and conclude that the samples have different variances.
Using Homogeneity of Variance (For MINITAB Analysis)
1. Homogeneity of Variance will allow analysis of multiple population variances
simultaneously. It will also allow analysis of non-normal distributions. Data from
all sample groups must be stacked in a single column with the samples
identified with a separate subscript or factor column.
2. In Minitab, use STAT>ANOVA>HOMOGENEITY OF VARIANCE. In the dialog
box, identify the single response column and a separate Factors column or
columns.
3. Analysis of the test will be done using the p value. If the data is Normal (See
Normality, page 15), use Bartlett's Test. Use Levene's Test when the data come
from continuous, but not necessarily normal distributions.
4. The computations for the homogeneity of variance test require that at least one
cell contains a non-zero standard deviation. Normally, it is possible to compute a
standard deviation for a factor if it contains at least two observations.
5. Two standard deviations are necessary to calculate Bartlett's and Levene's test
statistics.
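In Python, SciPy offers both tests directly. A sketch with hypothetical subgroups (in Minitab these would be stacked in one column with a factor column):

```python
from scipy.stats import bartlett, levene

# three hypothetical subgroups whose variances are being compared
g1 = [4.2, 4.5, 4.1, 4.4, 4.3]
g2 = [4.0, 4.6, 4.2, 4.5, 4.1]
g3 = [4.3, 4.4, 4.2, 4.6, 4.0]

stat_b, p_b = bartlett(g1, g2, g3)  # use when the data are normal
stat_l, p_l = levene(g1, g2, g3)    # use for continuous but non-normal data
# here both p values exceed .05, so Ho (equal variances) cannot be rejected
```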
H0 - The sample variances tested are statistically the same: σ0² = σ1²
HA - The sample variances tested are not equal: σ0² ≠ σ1²

Homogeneity of Variance - Compares the variances of multiple distributions
H0 - The sample variances tested are statistically the same: σ0² = σ1² = σ2² = ... = σk²
HA - At least one of the sample variances tested is not equal

Bartlett's Test - Tests normal distributions
Levene's Test - Tests non-normal distributions
χ² - Tests the hypothesis that two discretely measured variables operate
independently of each other
Ho: Independent (There is no relationship between the populations); Ho: p1 = p2 = p3 = ... = pn
Ha: Dependent (There is a relationship between the populations); Ha: At least one of the equalities does not hold
Hypothesis Statements (cont.)
What is a p-value?
Statistical definitions of p-value:
The observed level of significance.
The chance of claiming a difference if there is no difference.
The smallest value of alpha that will result in rejecting the null hypothesis.
If p < Alpha, then the difference is statistically significant. Reject the null
hypothesis and declare that there is a difference.
Think of (1 - p) as the degree of confidence that there is a difference.
Example: p = .001, so (1 - p) = .999, or 99.9%.
You can think of this as 99.9% confidence that there is a difference.
How do I use it?
Y = f(X) - The Transfer Function: the Output (Y, the Effect) is a function of the
Inputs (Xs, the Root Causes). What is the mathematical relationship between the Y
and the Xs?
Copyright 1995 Six Sigma Academy, Inc.

If the Output (Y) is Continuous:
If Inputs are Continuous: Regression; Analysis of Covariance
If Inputs are Discrete: ANOVA; t Tests; F Tests; Confidence Intervals; DOE

If the Output (Y) is Discrete:
If Inputs are Continuous: Logistic Regression
If Inputs are Discrete: Logistic Regression; χ²; Confidence Intervals - Proportions; DOE
Stating the Hypothesis HO and HA
The starting point for a hypothesis test is the null hypothesis - Ho.
Ho is the hypothesis of sameness, or no difference.
Example: The population mean equals the test mean.
The second hypothesis is Ha - the alternative hypothesis. It represents
the hypothesis of difference.
Example: The population mean does not equal the test mean.
You usually want to show that there is a difference (Ha).
Start by assuming equality (Ho).
If the data show they are not equal, then they must be different (Ha).
Hypothesis Statements
1 Sample t - Compares a single distribution to a target or hypothesized value.
H0 - The sample tested equals the target: μ0 = Target
Ha - The sample tested is not equal to the target, or greater than/less than
the target: μ0 ≠ Target; μ0 > Target; μ0 < Target

2 Sample t - Compares the means of two separate distributions.
H0 - The samples tested are statistically the same: μ0 = μ1
Ha - The samples tested are not equal, or one is greater than/less than
the other: μ0 ≠ μ1; μ0 > μ1; μ0 < μ1
ANOVA - One Way
ANOVA, ANalysis Of VAriance, is a technique used to determine the statistical
significance of the relationship between a dependent variable (Y) and one or
more independent variables, or factors (Xs).
ANOVA should be used when the independent variables (Xs) are categorical (not
continuous). Regression Analysis (Pages 43 - 45) is a technique for performing a
similar analysis with continuous independent variables.
ANOVA determines if the differences between the averages of the levels is greater
than the expected variation. It answers the question: Is the signal between levels
greater than the noise within levels?
ANOVA allows the investigator to compare several means simultaneously with the
correct overall level of risk.
Basic Assumptions for using ANOVA
Equal Variances (or close to the same) for each subgroup.
Independent and normally distributed observations.
Data must represent the population variation.
Acceptable Gage R&R
ANOVA tests for equality of means is fairly robust to the assumption of normality
for moderately large sample sizes, so normality is often not a major concern.
ANOVA
The One Way ANOVA enables the investigation of a single factor at multiple levels
with a continuous dependent variable. The primary investigation question is Do any of
the populations of Y stemming from the levels of X have different means?
MINITAB will do this analysis either with the data in table form, with data for each level
of X in separate columns (STAT>ANOVA>ONE WAY (UNSTACKED) ) or with all the
data in a single column and the factor levels identified by a separate subscript column
(STAT>ANOVA>ONE WAY). For the data below, use One-Way (Unstacked) for data in
columns c1-c3 and One-Way for data in columns c4-c5.
In the dialog box, for One Way
(Unstacked), identify each of the
columns containing the data.
In the dialog box for One-way,
identify the column containing the
Response (Y) and the Factor (X) as
appropriate.
For both analyses, if graphic
analysis is desired select the
Graphs button and select between
Dotplots and Boxplots .
Click OK to run. For analysis, see
page 41.
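A one-way ANOVA on data like this can be sketched with SciPy (hypothetical numbers; each list holds the continuous response Y at one level of the factor X):

```python
from scipy.stats import f_oneway

# hypothetical response Y at three levels of a single factor X
level_1 = [10.2, 10.5, 10.1, 10.4, 10.3]
level_2 = [11.8, 12.1, 11.9, 12.3, 12.0]
level_3 = [10.3, 10.6, 10.2, 10.5, 10.4]

f_stat, p_value = f_oneway(level_1, level_2, level_3)
# p < .05 -> at least one level mean differs: the signal between levels
# is greater than the noise within levels
```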
Hypothesis Testing
Steps in Hypothesis Testing
1. Define the problem; state the objective of the Test.
2. Define the Null and Alternate Hypotheses.
3. Decide on the appropriate statistical hypothesis test; Variance (Page 30);
Mean (t Test - Page 31 - 32); Frequency of Occurrence (Discrete - χ² -
Page 35 - 36).
4. Define the acceptable α and β risk.
5. Define the sample size required. (Page 27 - 28)
6. Develop the sampling plan and collect samples.
7. Calculate the test statistic from the data.
8. Compare the calculated test statistic to a predicted test statistic for the risk
levels defined.
If the calculated test statistic is larger than the predicted test statistic, the
statistic indicates a difference.
Since all data are variable, an observed change could be due to chance and may not
be repeatable. Hypothesis testing determines if the change could be due to chance
alone, or if there is strong evidence that the change is real and repeatable.
In order to show that a change is real and not due to chance alone, first assume there
is no change (Null Hypothesis, HO). If the observed change is larger than the change
expected by chance, then the data are inconsistent with the null hypothesis of no
change. We then reject the null hypothesis of no change and accept the alternative
hypothesis, HA.
The null hypothesis might be that two suppliers provide parts with the same average
flatness (HO: μ1 = μ2, the mean for supplier 1 is the same as the mean for supplier 2). In
this case, the alternative hypothesis is that average flatness is not equal (HA: μ1 ≠ μ2).
If the means are equal and your decision
is that they are equal (top left box), then
you made the correct decision.
If the means are not equal and your
decision is that they are not equal(bottom
right box), then you made the right
decision.
If the means are equal but your decision
is that they are not equal (bottom left
box), then you made a Type 1 error. The
probability of this error is alpha (α).
If the means are not equal but your
decision is that they are equal (top right
box), then you made a Type 2 error. The
probability of this error is beta (β).
                          Real World: μ1 = μ2    Real World: μ1 ≠ μ2
Decision: μ1 = μ2         Correct Decision       Type 2 Error
Decision: μ1 ≠ μ2         Type 1 Error           Correct Decision
ANOVA - Two Way
Two way ANOVA evaluates the effect of two separate factors on a single response.
Each cell (combination of independent variables) must contain an equal number
of observations (must be balanced). See General Linear Model (Page 42) for unbalanced
data sets. In the data set on the right, Strength is
the response (Y) and Chem and Fabric are the
separate factors (X1 and X2). To analyze the
significance of these factors on Y, run
STAT>ANOVA>TWO WAY. In the dialog box,
identify the Response (Y), Strength. In the Row
Factor box, identify the first of two factors (X) for
analysis. In the Column Factor, Identify the
second vital X. Select the Display Means box
for each factor to gain Confidence interval and
means analysis.
Select STORE RESIDUALS and then STORE
FITS.
If graphical analysis of the ANOVA data is
desired, select the Graphs button and choose
one, or all of the four diagnostic graphs available.
This analysis does not produce F and p-values,
since you can not specify whether the effects are
fixed or random. Use Balanced ANOVA (Page
36) to perform a two-way analysis of variance,
specify fixed or random effects, and display the F
and p-values when you have balanced data. If you have unbalanced data and random
effects, use General Linear Model (Page 42) with Options to display the appropriate test
results.
It can be seen from the SS column
that the error SS is very small
relative to the other terms. In the
graphic Confidence interval analysis
it is clear that both factors are
statistically significant, since some
of the confidence intervals do not
overlap.
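The sum-of-squares partition behind a balanced two-way ANOVA can be sketched by hand (an illustrative sketch using NumPy; the layout y[row level][column level][replicate] and the names are our own):

```python
import numpy as np

def two_way_ss(y):
    """Balanced two-way ANOVA sums of squares.
    y has shape (a, b, r): a row levels, b column levels, r replicates."""
    y = np.asarray(y, dtype=float)
    a, b, r = y.shape
    grand = y.mean()
    ss_total = ((y - grand) ** 2).sum()
    ss_a = b * r * ((y.mean(axis=(1, 2)) - grand) ** 2).sum()  # row factor
    ss_b = a * r * ((y.mean(axis=(0, 2)) - grand) ** 2).sum()  # column factor
    cell_means = y.mean(axis=2)
    ss_cells = r * ((cell_means - grand) ** 2).sum()
    ss_ab = ss_cells - ss_a - ss_b        # interaction
    ss_error = ss_total - ss_cells        # within-cell variation (noise)
    return ss_a, ss_b, ss_ab, ss_error, ss_total
```

A small error SS relative to the other terms, as in the output discussed above, means most of the variation is explained by the two factors.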
Calculating Sample Size
n = 2 ( zα/2 + zβ )² / ( δ/σ )²

Example: α = .10, β = .01, δ/σ = .3
n = 2 ( 1.645 + 2.326 )² / ( .3 )² ≈ 350

α     α/2    Zα/2      β     Zβ
.20   .10    1.282    .20   0.842
.10   .05    1.645    .10   1.282
.05   .025   1.960    .05   1.645
.01   .005   2.576    .01   2.326
To calculate the actual sample size without the table,
or to program a spreadsheet to calculate sample size,
use this equation.
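A spreadsheet-style sketch of the same equation in Python (standard library only; the factor of 2 and the Z quantiles match the table above):

```python
from statistics import NormalDist

def sample_size(alpha, beta, delta_over_sigma):
    """n per group = 2*(Z_alpha/2 + Z_beta)^2 / (delta/sigma)^2,
    two-sided test; round the result up in practice."""
    z = NormalDist().inv_cdf
    z_a = z(1 - alpha / 2)   # e.g. alpha = .10 -> 1.645
    z_b = z(1 - beta)        # e.g. beta  = .01 -> 2.326
    return 2 * (z_a + z_b) ** 2 / delta_over_sigma ** 2

n = sample_size(alpha=0.10, beta=0.01, delta_over_sigma=0.3)  # ~350
```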
ANOVA - Balanced
Figure 2
The Balanced ANOVA allows the analysis of process data with two or more factors. As
with the Two Way ANOVA, Balanced ANOVA allows analysis of the effect of multiple
factors, at multiple levels simultaneously. A factor (B) is nested within another factor (A)
if the level of B appears with only a single level of A. Two factors are crossed if every
level of one factor appears with every level of the other factor. The data for individual
levels of factors must be balanced: each combination of independent variables (cell)
must have an equal number of observations. See General Linear Model (Page 38) for
analysis of unbalanced designs. Guidelines for normality and variance remain same as
shown on page 38.
Figure 1 shows how some of the factors and
data might look in the MINITAB worksheet.
Note there are five (5) data points for each
combination of the three factors.
To analyze for significance of these factors (Xij)
on the response variable (Y), run
STAT>ANOVA>BALANCED ANOVA. In the
dialog box (Figure 2) , Identify the
Y variable in the Response box and identify
the factors in the Model box. Note that the
pipes [Shift \] indicate the model analyzed is
to include factor interactions. Select Storage
to store residuals and fits for later analysis.
Select Options and select Display means... to
display information about data means for each
factor and level.
Figure 3 is the primary output of this analysis. There is no significant graphic analysis
for the balanced ANOVA. See page 41 for analysis of this output.
Figure 1
Figure 3
Sample Size Determination
When using sampling to analyze processes, sample size must be consciously selected
based on the allowable α and β risk, the smallest amount of true difference (δ) that
you need to observe for the change to be of practical significance, and the variation
of the characteristic being measured (σ). As variation decreases or sample size
increases it is easier to detect a difference.
Steps to defining sample size
1. Determine the smallest true difference to be detected, the gap (δ).
2. Confirm the process variation (σ) of the processes to be evaluated.
3. Calculate δ/σ.
4. Determine acceptable α and β risk.
5. Use chart on page 58 to read the sample size required for each level of the factor
tested.
For example --
Assume the direction of the effect is unknown, but you need to see a delta/sigma (δ/σ)
of 1.0 in order to say the change is important. For an α risk of 5% and a β risk of 10%,
we would need to use 21 samples. Remember that we would need 21 at each level of
the factor tested. If for the same δ, σ were reduced so that δ/σ were 2, only 5 samples
would be required. In general, the smaller the shift (δ/σ) you are trying to detect, and/or
the lower the tolerable risk, the greater the number of samples required.
Sample size sensitivity is a function of the standard error of the mean (σ/√n).
Smaller samples are less sensitive than larger samples.
[Figure: two views of Today vs. Desired distributions, illustrating the gap delta (δ)
between the means and the variation (σ) within each process.]
Continuous Data Analysis
Interpreting the ANOVA Output
The first table lists the factors and levels. In the table shown there are three factors,
Region, Shift and WorkerEx. There are three levels each for Region and Shift. The
values assigned for the Region and Shift levels are 1,2 &3. WorkerEx is a two level
factor and has level values of 1&2.
The second table is the ANOVA output. The columns are as defined below.
Source
The source shows the identified factors from the model, showing
both the single factor information (i.e., Region) and the interaction
information (i.e., Region*Shift)
DF
Degrees of Freedom for the particular factor. Region and shift have
3 levels and 3-1=2 df, and workerex has 2 levels and 2-1=1 df.
SS
Factor Sum of Squares is a measure of the variation of the sample
means of that factor.
MS
Factor Mean Square is the SS divided by the DF.
F
The Fcalc value is the MS of the factor divided by the MS of the Error
term. In the case of Region, F = 90.577/3.325 = 27.24. If using Fcrit
to analyze for significance, enter the table with DF degrees of freedom
and α = .05. Compare Fcalc to Fcrit. If Fcalc is greater than Fcrit, the
factor is significant.
P
The calculated P value, the observed level of significance. If P<.05,
the factor is statistically significant at the 95% level of confidence.
Note: The relative size of the error SS to total SS indicates the percent of variation left
unexplained by the model. In this case, the unexplained variation is 39.16% of the total
variation in this model. The s of this unexplained variation is the square root of the MS of
the Error term (3.325). In this case the within group variation has a sigma of 1.82. If this
remaining variation does not enable the process to achieve the desired performance state,
look for additional factors.
Analysis and Improve Tools
Tool selection by input (X) and output (Y) data type:

X Discrete, Y Discrete: Tables (Cross tab); Chi Square; Confidence intervals for
proportions; Pareto
X Discrete, Y Continuous: Confidence intervals; t test; ANOVA; Homogeneity of
Variance; GLM; DOE (factorial fit)
X Continuous, Y Discrete: Logistic regression; Discriminant Analysis; CART
(Classification and Regression Trees)
X Continuous, Y Continuous: Linear regression; Multiple regression; Stepwise
Regression; DOE response surface

Logistic Regression, Discriminant Analysis and CART (Classification and
Regression Trees) are advanced topics not taught in Six Sigma Training.
The following references may be helpful.
Breiman, Friedman, Olshen and Stone; Classification and Regression Trees;
Chapman and Hall, 1984
Hosmer and Lemeshow; Applied Logistic Regression; Wiley, 1989
Minitab Help - Additional information about Discriminant Analysis
General Linear Model
The General Linear Model (GLM) can handle unbalanced data - such as data sets with
missing observations. Where the Balanced ANOVA required the number of observations
to be equal in each factor/level grouping, GLM can work around this limitation.
The data must be full rank (enough data to estimate the terms in the model), but you
don't have to worry about this, because Minitab will tell you if your data isn't full rank.
Interpretation:
Temp1 is a significant X variable, because it explains 62% of the total variation
(528.04/850.4). (Temp1 also has a p-value < 0.05, indicating that it is statistically
significant)
Neither Oxygen1 nor the interaction between Oxygen and Temperature appears
significant.
The unexplained variation represents 30.95% ((263.17/850.4)*100) and the estimate of
the within subgroup variation is 5.4 (square root of 29.24).
In the data set shown in Figure 1, note that there is only
one data point in Rot1, the response column for factor
Temp1- level 10 / Oxygen1- level 10(rows 8 & 9), and
only two data points for Temp1 - level 16 / Oxygen1 -
Level 6 (Row 14). In such case, Balanced ANOVA
would not run because the requirement of equal
observation would require three data points in each cell
(factor and level combination).
Run STAT>ANOVA>GENERAL LINEAR MODEL. In
the Dialog box, identify the response variable in the
Response box and the factors in the Model box. Use
the pipe (shifted \) to include interactions in the
analysis.
Figure 2 is the primary output of this analysis. There is
no graphic analysis of this output.
Figure 1
Figure 2
Pareto Diagrams
Stat>Quality Tools>Pareto Chart
Cause and Effect Diagrams
Fishbone Diagrams: Stat>Quality Tools>Cause & Effect
When analyzing categorical defect data , it is
useful to use the Pareto chart to visualize the
relative defect frequency. A Pareto Chart is a
frequency ordered column chart. The analysis can
either analyze raw defect data, such as scratch,
dent, etc, or it can analyze count data such as is
made available from Assembly Line Defects
reports. The graphic on the left is from count data.
Set up the worksheet with two columns, the first
with the defect cause descriptor and the second
with count or frequency of occurrences. In the
PARETO CHART dialog box, select Chart Defects Table. Link the cause descriptor to the
LABELS IN box and the counts to the FREQUENCY box. Click OK. For more information,
see Minitab Context sensitive help in the Pareto Dialog box.

[Pareto Chart for Defects]
Defect              Count   Percent   Cum %
Missing Screws        274      64.8    64.8
Missing Clips          59      13.9    78.7
Leaky Gasket           43      10.2    88.9
Defective Housing      19       4.5    93.4
Incomplete Part        10       2.4    95.7
Others                 18       4.3   100.0
To interpret the Pareto, look for a sharp gradient to the categories, with 80% of counted defects attributable to 20-30% of the identified categories. If the Pareto is flat, with all categories linked to approximately the same number of defects, try to restate the question to redefine the categorical splits.
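The count, percent, and cumulative-percent arithmetic a Pareto chart displays can be sketched in a few lines of Python. The defect names and counts below mirror the example chart and are for illustration only.

```python
# Sketch: counts, percents, and cumulative percents for categorical
# defect data, ordered by frequency as on a Pareto chart.
from collections import Counter

defects = Counter({
    "Missing Screws": 274, "Missing Clips": 59, "Leaky Gasket": 43,
    "Defective Housing": 19, "Others": 18, "Incomplete Part": 10,
})

total = sum(defects.values())
cum = 0.0
for name, count in defects.most_common():   # frequency-ordered
    pct = 100.0 * count / total
    cum += pct
    print(f"{name:18s} {count:4d} {pct:5.1f}% cum {cum:5.1f}%")
```

A sharp gradient shows up here as the cumulative percent climbing quickly within the first two or three categories.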
[Figure: Cause-and-Effect Diagram. Branches: Men (Operators, Training, Supervisors, Shifts), Machines (Speed, Lathes, Bits, Sockets), Materials (Suppliers, Lubricants, Alloys), Methods (Brake Engager Angle), Measurements (Inspectors, Microscopes, Micrometers), Environment (Exhaust Quality, Condensation, Moisture%)]
When working with the Advocacy team to define the
potential factors (Xs), it is often helpful to use a
Cause and Effect Diagram or Fishbone to display
the factors. The arrangement helps in the discovery
of potential interactions between Factors (Xs).
Use Minitab worksheet columns to record
descriptors for the factors identified during the team
brainstorming session. Group the descriptors in
columns by categories such as the 5Ms. Once the
factors are all recorded, open the Minitab
Stat>Quality Tools>Cause and Effect dialog
box.
The dialog box will have the 5Ms and Environment shown as default categories of factors. If using
these categories, link the worksheet columns of categorized descriptors to the dialog box categories.
If the team has elected to use other Category names, replace the default names and link the
appropriate columns. Click OK.
To interpret the Cause and Effect Diagram, look for places where a factor in one category could also
be included in another category. Question the Advocacy team about priority or significance of the
factors in each category. Then prioritize the factors as a whole. For the most significant factors, ask
the team where there is the potential for changes in one factor to influence the actions of another
factor. Use this information to plan analysis work.
Regression Analysis
Regression can be used to describe the mathematical relationship between the response
variable and the vital few Xs, if you have continuous data for your Xs. Also, after the
vital few variables have been isolated, solving a regression equation can be used to
determine what tolerances are needed on the vital few variables in order to assure that
the response variable is within a desired tolerance.
Regression analysis can find a linear fit between the response variable Y and the vital few input variables X1 and X2:

Y = B0 + B1·X1 + B2·X2 + error

This linear equation can be used to decide what tolerances must be maintained on X1 and X2 in order to hold a desired tolerance on the variable Y. (Start with a scatter diagram to examine the data.)
Regression analysis can be done using several of the MINITAB tools.
Stat>Regression>Fitted Line Plot is explained on Page 20. This section will discuss
Stat>Regression>Regression.
Data must be paired in the MINITAB worksheet. That is, one measurement from each
input factor (x) is paired with the response data (Y) for that particular measurement point.
Plot the data first using Minitab Stat>Plot. Analyze the data using
Stat>Regression>Regression. In the dialog box indicate the Response (Y) in the
Response box and the
expected factors (Xs) in the
Predictors box. Select the
Storage button and in that
dialog box select Fits and
Residuals. Click OK twice to
run the analysis. The output
will appear as shown in the
figure to the right.
The full regression equation is shown at the top of the output. Predictor influence can be evaluated using the p column in the first table. Analysis of the second table is done in similar fashion to the ANOVA analysis on page 41. Note that R²(adj) is similar to R² but is modified to reflect the number of terms in the regression. If there are many terms in the model and the sample size is small, then R²(adj) can be much lower than R², and you may be over-fitting. In this example, the total sample size is large (n=560), so R² and R²(adj) are similar.
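The least-squares computation behind this output can be sketched in plain Python for two predictors. The data values and the tiny normal-equations solver below are illustrative stand-ins, not Minitab's internals.

```python
# Sketch: least-squares fit of Y = B0 + B1*X1 + B2*X2 by solving the
# normal equations with Gauss-Jordan elimination (illustrative data).

def solve(a, b):
    """Solve the linear system a*x = b by Gauss-Jordan with pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [v - f * w for v, w in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(xs, y):
    """xs: list of predictor columns; returns [B0, B1, ...]."""
    cols = [[1.0] * len(y)] + xs                # prepend intercept column
    xtx = [[sum(a * b for a, b in zip(c1, c2)) for c2 in cols] for c1 in cols]
    xty = [sum(a * b for a, b in zip(c1, y)) for c1 in cols]
    return solve(xtx, xty)

x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
y = [5.1, 6.9, 10.8, 12.2, 15.9]     # roughly 2 + 2*x1 + 0.5*x2, with noise
b0, b1, b2 = ols([x1, x2], y)
print("B0 =", b0, "B1 =", b1, "B2 =", b2)
```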
Time Series Plot
Graph>Time Series Plot
The time series plot is useful as a
diagnostic tool. Use it to analyze data
collection processes, non-normal data sets,
etc. In GRAPH VARIABLES: Identify any
number of variables (Y) from the worksheet
you wish to look at over time. Minitab
assumes the values are entered in the order
they occurred. Enter one column at a time.
Minitab will automatically sequence to the
next graph for each column. The X axis is
the time axis and is set by selecting the
appropriate setting in TIME SCALE. Each time series plot will display on a separate graph. In FRAME, ANNOTATE and OPTIONS, you can change chart axes, display multiple charts, etc. In analyzing the Time Series Plot, look for a story. Look for trends, sudden shifts, a regular cycle, extreme values, etc. If any of these exist, they can be used as a lead-in to problem solving.
[Figure: Box-Cox Plot for Skewed data. Last iteration info (StDev, Lambda): Up 2.783, 0.170; Est 2.782, 0.113; Low 2.784, 0.056; 95% confidence interval shown for lambda]
Box-Cox Transformation
Stat>Control Charts>Box-Cox Transformation
The transformed data are the original data raised to the power of lambda (λ). Subgroup data
can be in columns or across rows. In the dialog box, indicate how DATA ARE
ARRANGED and where located. If data is subgrouped and subgroups are in rows,
identify configuration. To store transformed data, select STORE TRANSFORMED
DATA IN and indicate new location.
The Box-Cox transformation can be useful for correcting non-normality in process
data, and for correcting problems due to unstable process variation. Under most
conditions, it is not necessary to correct for non-normality unless the data are highly
skewed. It may not be necessary to transform data which are used in control charts,
because control charts work well in situations where data are not normally
distributed.
Note: You can only use this procedure with positive data.
BOX-COX TRANSFORMATION is a useful tool for finding a transformation that will make a data set closer to a normal distribution. Once it is confirmed that the distribution is non-normal, use Box-Cox to find an appropriate transformation. Box-Cox provides an exponent used in the transformation, called lambda (λ).
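A minimal Python sketch of the Box-Cox family follows. It searches lambda by minimizing the skewness of the transformed data, a crude stand-in for the criterion Minitab actually optimizes, and the data values are made up.

```python
# Sketch: the Box-Cox family y**lambda (ln(y) at lambda = 0) and a crude
# lambda scan that minimizes skewness of the transformed data.
import math

def boxcox(y, lam):
    """Transform positive data: y**lam, or ln(y) when lam == 0."""
    return [math.log(v) if lam == 0 else v ** lam for v in y]

def skewness(y):
    n = len(y)
    m = sum(y) / n
    s = math.sqrt(sum((v - m) ** 2 for v in y) / n)
    return sum(((v - m) / s) ** 3 for v in y) / n

data = [0.5, 1.1, 1.8, 2.9, 4.6, 7.4, 11.9]     # skewed, positive
lambdas = [i / 10 for i in range(-20, 21)]      # scan -2.0 .. 2.0
best = min(lambdas, key=lambda l: abs(skewness(boxcox(data, l))))
print("lambda with least skew:", best)
```

Remember the restriction noted above: the transformation only applies to positive data.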
Stepwise Regression
Stepwise Regression is useful for searching for leverage factors in a data set with many factors (Xs) and a response variable (Y). The tool can analyze up to 100 factors. While this enables the analysis of Baseline data for potential Vital Xs, be careful not to draw conclusions about the significance of Xs without first confirming with a DOE.
To use Stepwise regression, the data needs to be entered in Minitab with each
variable in a separate column and each row representing a single data point. Next
select Stat>Regression>Stepwise.
In the dialog box, identify the column containing the response (Y) data in the Response box. In the Predictors box, identify the columns containing the factors (Xs) you want Minitab to use. If a factor's F-statistic falls below the value in the F to remove text box under Options (default = 4), Minitab removes it. By selecting the Options button, you can change the F-critical value for adding and removing factors from the selection and also reduce the number of steps of analysis the tool goes through before asking for your input.
Minitab will prioritize the leverage X variables and run the first regression step on
the factor with the greatest influence. It continues to add variables as long as the t
value is greater than the SQRT of the identified F statistic limit (Default = 4). The
Minitab output includes:
1) the constant and the factor coefficients for the significant terms,
2) the t value for the factors included,
3) the s (residual standard deviation) for the unexplained variation based on the current model, and
4) the R² for the current model.
If you have chosen 1 step between pauses, Minitab will then ask if you wish to run more. Type yes and press Enter. Continue this procedure until Minitab won't calculate any more. At that point, you will have identified your potential leverage Xs.
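The stepping idea can be sketched in Python. This simplified version ranks candidate Xs by squared correlation with Y against an entry threshold, a stand-in for Minitab's F-to-enter rule; the factor data are made up.

```python
# Sketch: rank candidate factors by squared correlation with Y and keep
# those above an entry threshold (a simplified stand-in for F-to-enter).
import math

def corr(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (v - mb) for x, v in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((v - mb) ** 2 for v in b))
    return cov / (sa * sb)

def rank_factors(factors, y, r2_to_enter=0.25):
    """factors: {name: column}; returns names ordered by leverage."""
    scored = [(corr(col, y) ** 2, name) for name, col in factors.items()]
    scored.sort(reverse=True)
    return [name for r2, name in scored if r2 >= r2_to_enter]

factors = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [1, 0, 1, 0, 1, 0],
    "x3": [3, 1, 4, 1, 5, 9],
}
y = [2.1, 4.2, 5.9, 8.3, 9.8, 12.1]
selected = rank_factors(factors, y)
print("candidate leverage Xs, in order:", selected)
```

As the section warns, treat any factor surfaced this way as a candidate to confirm with a DOE, not a conclusion.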
Output
In this output, there are five potential
predictors identified by stepwise
regression. The steps are shown by
the numbered columns and include
the regression information for
included factors. The information in
column 1 represents the regression
equation information if only Form is
used. In column 5, the regression
equation information includes five
factors, but the s is .050 and the R² is only 25%. In all probability, the analyst will choose to gather information including additional factors during the next runs.
Regression with Curves (Quadratic) & Interactions
When analyzing multiple factor relationships, it is important to consider if there is
potential for quadratic (curved) relationships and interactions. Normal graphic analysis
techniques and Regression do not allow analysis of the effects of interrelated factors. To
accomplish this, the data must be analyzed in an orthogonal array (See Page 49). In
order to create an orthogonal array with continuous data, the factor (x) data must be
centered. Do this as follows:
1. The data to be analyzed need to be in columns, with the response in one column and the values of the factors paired with the response and recorded in separate columns.
2. Use Stat>DOE>Define Custom RS Design
In the dialog box, identify the columns containing the factor settings.
3. Next, analyze the model using Stat>DOE>Analyze RS Design.
Identify the column containing the response data
Check: Analyze Data using Coded Units.
4. Click on Storage and select Fits and Residuals for later regression diagnostics.
Click OK. Click on Graphs and select the desired graphs for analysis diagnostics.
The initial analysis will include all terms in the potential equation including full
quadratic. Analysis of the output will be similar to that for
Regression>Regression (Page 43).
5. Where elements are insignificant, revert to the Stat>DOE>Analyze RS Design>Terms dialog box to eliminate them. In the case of this example, the equation can be analyzed as a linear relationship, so select Linear in the Include the following terms box. Note that this removes all the interaction and quadratic terms. Re-run the regression. Once an appropriate regression analysis, including leverage factors, has been obtained, validate the adequacy of the model by using the regression diagnostic plots, Stat>Regression>Residual Plots (Page 22).
Once an appropriate regression equation has been determined, remember this
analysis was done with centered data for the factors. The centering will have to be
reversed in order to make the equation useful from a practical standpoint. To
create a graphic of the model, use Stat>DOE>RSPlots (Page 52). From this
dialog box a contour plot of the results can be created.
Box Plot
GRAPH>BOXPLOT
The boxplot is useful for comparing
multiple distributions (Continuous Y and
discrete X).
In the GRAPH section of the dialog box, fill
in the column(s) you want to show for Y
and if a column is used to identify various
categories of X, i.e., subgroup coding, etc.
Click FRAME Button to give you the
options of setting common axes or multiple
graphs on the same page. To generate
multiple plots on a single page, select
FRAME>MULTIPLE GRAPHS>OVERLAY GRAPHS... Click ATTRIBUTES to allow you
to change individual box colors. Click OK
The box represents the middle 50% of the distribution. The horizontal line is the median (the middlemost value). The whiskers each represent a region sized at 1.5*(Q3-Q1), where Q3-Q1 is the interquartile range shown by the box. Interpretation can be that the box represents the hump of the distribution and the whiskers represent the tails. Asterisks represent points which fall outside the lower or upper limits of expected values.
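The quantities behind the boxplot can be computed directly. This sketch uses a simple linear-interpolation quartile convention, which may differ slightly from Minitab's, and made-up data.

```python
# Sketch: quartiles, 1.5*IQR whisker limits, and the points a boxplot
# would flag with asterisks.
def quartiles(data):
    s = sorted(data)
    n = len(s)
    def q(p):
        i = p * (n - 1)                     # linear interpolation
        lo, hi = int(i), min(int(i) + 1, n - 1)
        return s[lo] + (i - lo) * (s[hi] - s[lo])
    return q(0.25), q(0.5), q(0.75)

data = [4.1, 4.4, 4.6, 4.7, 4.8, 5.0, 5.1, 5.3, 5.4, 9.9]
q1, med, q3 = quartiles(data)
iqr = q3 - q1
lo_limit, hi_limit = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < lo_limit or x > hi_limit]
print("median", med, "box", (q1, q3), "asterisk points", outliers)
```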
Y variable: Select the column to be plotted on the y-axis.
Group variable: Select the column containing the groups (or categories). This variable is
plotted on the x-axis.
Type of interval plot
Standard error: Choose to display standard error bars where the error bars extend one
standard error away from the mean of each subgroup.
Multiple: Enter a positive number to be used as the multiplier for standard errors (1 is the
default).
Confidence interval: Choose to display confidence intervals instead of standard error
bars. The confidence intervals assume a normal distribution for the data and use t-
distribution critical values.
Level: Enter the level of confidence for the intervals. The default confidence coefficient is
95%.
Interval Plot
GRAPH>INTERVAL PLOT
[Figure: interval plot of thickness by type (new vs. existing); y axis runs from about 128.95 to 129.25]
Useful for comparison of multiple
distributions. Shows the spread of data
around the mean by plotting standard error
bars or confidence intervals.
The default form of the plot provides error
bars extending one standard error (standard
deviation/square root of n) above and below
a symbol at the mean of the data.
One-Variable Regression
STAT>REGRESSION>FITTED LINE PLOT
In the STAT>REGRESSION>FITTED
LINE PLOT dialog box, identify
Response Variable (Y). Identify one (1)
Predictor (X). Select TYPE OF MODEL
(Linear, Quadratic or Cubic). Click on
STORAGE. Select RESIDUALS and
FITS.
If you need to transform data, use
OPTIONS and select Transformation.
In OPTIONS, select DISPLAY
CONFIDENCE BANDS and DISPLAY
PREDICTION BANDS. Click OK.
The output from the fitted line plot contains an equation which relates your predictor
(input variable) to your response (output variable). A plot of the data will indicate
whether or not a linear relationship between x and y is a sensible approximation.
These observations are modeled by the equation:
Y = b + mx + error
Confidence Bands are 95% confidence limits for data means. Prediction Bands
are limits for 95% of individual data points.
The R-sq is the square of the correlation coefficient. It is also the fraction of the
variation in the output (response) variable that is explained by the equation. What is
a good value? It depends... chemists may require an R-sq of .99. We may be satisfied with an R² of .80.
Use Residual Plots (Below) to plot the residuals vs predicted values ( Fits) and
determine if there are additional patterns in the data.
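The fitted line and its R-sq can be computed directly. The data below are made up to loosely resemble the hardness/abrasion example in the plot.

```python
# Sketch: simple linear fit Y = b + m*x and its R-sq, computed from the
# usual least-squares formulas (illustrative data).
def fit_line(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    m = sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
        sum((a - mx) ** 2 for a in x)
    b = my - m * mx
    ss_res = sum((v - (b + m * u)) ** 2 for u, v in zip(x, y))
    ss_tot = sum((v - my) ** 2 for v in y)
    return b, m, 1 - ss_res / ss_tot        # intercept, slope, R-sq

x = [660, 680, 700, 720, 740, 760]
y = [610, 540, 480, 420, 345, 290]
b, m, r2 = fit_line(x, y)
print(f"Y = {b:.1f} + {m:.3f}X, R-sq = {r2:.3f}")
```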
[Figure: Regression Plot of Abrasion vs. Hardness. Y = 2692.80 - 3.16067X, R-Sq = 0.784, with regression line, 95% CI and 95% PI bands]
Residual Plots
Stat>Regression>Residual Plots
[Figure: residual plots for Weld Temp fits: Histogram of Residuals; I Chart of Residuals (X=0.000, 3.0SL=0.9631, -3.0SL=-0.9631); Residuals vs. Fits; Normal Plot of Residuals]
Any time a model has been created for an X/Y relationship, through ANOVA, DOE, or Regression, the quality of that model can be evaluated by analysis of the error in the equation.
When doing the REGRESSION (Page 37-38), or the
FITTED LINE PLOT (above), be sure to select store
FITS and RESIDUALS in the STORAGE dialog
box. If the fit is good, the error should be normally
distributed with an average of zero and there should
be no pattern to the error over the range.
Then in the RESIDUAL PLOTS dialog box, identify
the column where the residuals are stored
in the Residuals box and the fits storage column in the Fits box.
The output includes a normal plot of residuals, a histogram of residuals, an Individuals Chart of
Residuals and a scatter plot of Residuals versus Fits.
Analysis of the Normal Plot should show a relatively straight line if the residuals are normally
distributed. The I chart should be analyzed as a control chart. The histogram should be a bell-shaped
distribution. The residuals vs fits scatter plot should show no pattern, with a constant spread over the
range.
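The graphical checks can be approximated numerically. This sketch, with made-up fits and residuals, checks that the residuals average near zero and that their spread is roughly constant across the fitted range.

```python
# Sketch: numeric stand-ins for two of the residual plots: mean residual
# (should be near zero) and residual spread in the lower vs. upper half
# of the fitted range (should be similar if variance is constant).
def residual_checks(fits, residuals):
    n = len(residuals)
    mean_r = sum(residuals) / n
    order = sorted(range(n), key=lambda i: fits[i])
    half = n // 2
    def spread(idx):
        vals = [residuals[i] for i in idx]
        m = sum(vals) / len(vals)
        return (sum((v - m) ** 2 for v in vals) / len(vals)) ** 0.5
    return mean_r, spread(order[:half]), spread(order[half:])

fits = [4.0, 5.5, 6.1, 7.0, 8.2, 9.5, 10.1, 11.4]
resid = [0.2, -0.3, 0.1, -0.2, 0.3, -0.1, 0.2, -0.2]
mean_r, s_low, s_high = residual_checks(fits, resid)
print("mean residual", mean_r, "spread low/high", s_low, s_high)
```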
Binary Logistic Regression
In binary logistic regression the predicted value (Y) will be probabilities p(d) of
an event such as success or failure occurring. The predicted values will be
bounded between zero and one (because they are probabilities).
Example: Predict the success or failure of winning a contract based on the
response cycle time to a request for proposal and the proposal team leader.
The probability of an event, π(x) or Y, is not linear with respect to the Xs. The change in π(x) for a unit change becomes progressively smaller as π(x) approaches zero or one. Logistic regression develops a function to model this. π(x)/(1-π(x)) is the odds. The Logit is the log of the odds. Ultimately the transfer function being developed will solve for π(x).
To analyze the binary logistic problem, use STAT>REGRESSION>BINARY
LOGISTIC REGRESSION. The data set used for Response will be Discrete
and Binary (Yes/No;Success/Failure). In the Model dialog box, enter all
factors to be analyzed. In the Factors dialog box, enter those factors which
are discrete. Use the Storage button and select Event probability. This will
store the Calculated Event probability for each unique value of the function.
odds = π(x) / (1 - π(x))

π(x) = e^(β0 + β1x) / (1 + e^(β0 + β1x))
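The transfer function can be sketched directly in Python. The coefficients below are the Constant and Index values from the session output on this page; the Brand Sp factor is ignored to keep the sketch to one X.

```python
# Sketch: the logistic transfer function pi(x) and the logit (log odds).
import math

def pi(x, b0, b1):
    """Event probability pi(x) = e^(b0+b1x) / (1 + e^(b0+b1x))."""
    z = b0 + b1 * x
    return math.exp(z) / (1 + math.exp(z))

def logit(p):
    """Log of the odds p / (1 - p)."""
    return math.log(p / (1 - p))

p = pi(0.5, 7.410, -8.530)   # Constant and Index from the output
print("event probability at Index = 0.5:", p)
```

Note that logit(pi(x)) recovers the linear predictor b0 + b1*x, which is why the model is fit on the logit scale.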
Analyze the Session Window
output.
1. Analyze the Hypothesis Test
for the model as a whole.
Check for a p value indicating
model significance.
2. Check for statistical
significance of the individual
factors separately. Use the P
Value.
3. Check the Odds ratios for the
individual predictor levels.
4. Use the Confidence Interval to
confirm significance. Where
confidence interval includes
1.0, the odds are not
significant.
5. Evaluate the Model for
Goodness of fit. Use
Hosmer-Lemeshow if there is a continuous X in the model.
6. Assess the measures of association. Note % Concordant is a measure similar to R². A higher value here indicates a better predictive model.
Binary Logistic Regression
Link Function: Logit
Response Information
Variable Value Count
Bid Yes 113 (Event)
No 110
Total 223
Logistic Regression Table
Odds 95% CI
Predictor Coef StDev Z P Ratio Lower Upper
Constant 7.410 1.670 4.44 0.000
Index -8.530 1.799 -4.74 0.000 0.00 0.00 0.01
Brand Sp
Yes 1.2109 0.3005 4.03 0.000 3.36 1.86 6.05
Log-Likelihood = -134.795
Test that all slopes are zero: G = 39.513, DF = 2, P-Value = 0.000
Method Chi-Square DF P
Pearson 187.820 116 0.000
Deviance 224.278 116 0.000
Hosmer-Lemeshow 7.138 7 0.415
Table of Observed and Expected Frequencies:
(See Hosmer-Lemeshow Test for the Pearson Chi-Square Statistic)
(Between the Response Variable
and Predicted Probabilities)
Pairs Number Percent Summary Measures
Concordant 9060 72.9% Somers' D 0.47
Discordant 3260 26.2% Goodman-Kruskal Gamma 0.47
Ties 110 0.9% Kendall's Tau-a 0.23
Total 12430 100.0%
Normal Plot
STAT>BASIC STATISTICS>NORMALITY TEST
Identify the variable you will be testing
in the Variable box.
Click OK (Use default Anderson Darling
test).
A Normal probability plot is a graphical method
to help you determine whether your data is
normally distributed. To graphically analyze
your data, look at the plotted points relative to
the sloped line. A normal distribution will yield
plotted points which closely hug the line. Non-
normal data will generally show points which
significantly stray from the line.
The test statistics displayed on the plot are A-
squared and p-value.
The A-squared value is an output of a test for normality. Focus your analysis on the p value. The p value is the probability of claiming the data are not normal if the data are truly from a normal distribution, i.e., a type I error. A high p-value would therefore be consistent with a normal distribution. A low p-value would indicate non-normality. Use the appropriate type I error probability for judging this result.
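The A-squared statistic can be sketched in Python. For simplicity this version tests against a normal distribution with a given mean and sigma; Minitab estimates both from the data and applies a small-sample correction, so its value will differ. The sample data are made up.

```python
# Sketch: Anderson-Darling A-squared against a normal distribution with
# known mean and sigma (no parameter estimation or correction).
import math

def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def a_squared(data, mu, sigma):
    z = sorted(normal_cdf(x, mu, sigma) for x in data)
    n = len(z)
    s = sum((2 * i + 1) * (math.log(z[i]) + math.log(1 - z[n - 1 - i]))
            for i in range(n))
    return -n - s / n

sample = [66.1, 68.4, 69.0, 69.9, 70.2, 70.8, 71.5, 73.6]
print("A-squared:", a_squared(sample, mu=70.0, sigma=2.0))
```

A small A-squared (and correspondingly high p value) is consistent with normality, matching the interpretation above.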
[Figure: Normal Probability Plot. Anderson-Darling Normality Test: A-Squared 0.418, P-Value 0.328; N 500, Average 70.0000, StDev 10.0000]
Descriptive Statistics
Stat>Basic Statistics>Descriptive Statistics
[Figure: Graphical Summary for Variable C1. Anderson-Darling Normality Test: A-Squared 0.235, P-Value 0.790. Mean 69.3824, StDev 9.8612, Variance 97.2442, Skewness -5.0E-03, Kurtosis 3.89E-02, N 500. Minimum 38.9111, 1st Quartile 62.5858, Median 69.6848, 3rd Quartile 75.8697, Maximum 99.6154. 95% CI for Mu (68.5160, 70.2489); 95% CI for Sigma (9.2856, 10.5136); 95% CI for Median (68.6347, 70.8408)]
The Descriptive Statistics>Graphs>Graphical
Summary graphic provides a histogram of the data
with a superimposed normal curve, a normality
check, a table of descriptive statistics, a box plot of
the data and confidence interval plots for mean and
median.
In the Descriptive Statistics dialog box select the
variables for which you wish to create the descriptive
statistics. If choosing a stacked variable with a
category column, check the BY VARIABLE box and
indicate the location of the category identifier.
Use the Graphs button to open the graphs dialog
box. In the graphs dialog box, select Graphical
Summary. When using this tool to interpret normality, confirm the p value and evaluate the shape of the histogram. Remember that the p value is the probability of claiming the data are not normal if the data are truly from a normal distribution, a type I error. A high p-value would therefore be consistent with a normal distribution. A low p-value would indicate non-normality. When evaluating the shape of the histogram graphically, determine: Is it bimodal? Is it skewed? If yes, investigate potential causes for the non-normality. Improve if possible or analyze groups separately. If no special cause is found for the non-normality, the distribution may be non-normal naturally and you may need to transform the data (page 22) prior to calculating your Z.
Design for Six Sigma - Tolerance Analysis
[Figure: tolerance loop sketch. Vector A runs from the Datum; vectors B1, B2, B3, B4 return along the other side of the loop, closing at the Gap (+)]
Tolerance Analysis is a design method used to determine the impact that individual parts of a system have on the overall requirement for that system.
Most often, Tolerance Analysis is applied to dimensional characteristics in
order to see the impact the dimensions have on the final assembly in
terms of a gap or interference. In this application, a tolerance loop may
be used to illustrate the relationship.
Purpose
To graphically show the relationships of multiple parts in a system which
result in a desired technical requirement in terms of a gap or
interference.
Process
1. Generate a layout drawing of your assembly. A hand sketch is all that is
required.
2. Clearly identify the gap in the most severe condition.
3. Select a DATUM or point from which to start your loop. (It is easier to
start the loop at one of the interfaces of the Gap.)
4. Use drawing dimensions as vectors to connect the two sides of the gap.
5. Assign sign convention (+/-) to vectors
In the diagram above, the relationship can be explained as:
GAP = A - B1 - B2 - B3 - B4
Because the relationship can be explained using only + & - signs, the
equation is considered LINEAR, and can be analyzed using a method
known as Root Sum of Squares (RSS) analysis.
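The RSS arithmetic for GAP = A - B1 - B2 - B3 - B4 can be sketched directly; the nominal values and sigmas below are illustrative only.

```python
# Sketch: Root Sum of Squares for GAP = A - B1 - B2 - B3 - B4.
# The gap mean follows the signed vectors; the sigmas combine by RSS,
# where the signs drop out.
import math

nominals = {"A": 50.0, "B1": 12.0, "B2": 12.0, "B3": 12.0, "B4": 12.0}
signs = {"A": +1, "B1": -1, "B2": -1, "B3": -1, "B4": -1}
sigmas = {"A": 0.03, "B1": 0.02, "B2": 0.02, "B3": 0.02, "B4": 0.02}

gap_mean = sum(signs[k] * nominals[k] for k in nominals)
gap_sigma = math.sqrt(sum(sigmas[k] ** 2 for k in sigmas))
print(f"gap mean = {gap_mean}, gap sigma = {gap_sigma:.4f}")
```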
Vector Assignment
Assign a positive (+) vector when
An increase in the dimension
increases the gap.
An increase in the dimension
reduces the interference.
Assign a negative (-) vector when
An increase in the dimension
reduces the gap.
An increase in the dimension
increases the interference.
[Figure: scatter plot of NEW vs. EXISTING; both axes run from 8.5 to 9.7]
Scatter Plot
GRAPH>PLOT
The scatter plot is a useful tool for
understanding the relationship between two
variables.
In the GRAPH VARIABLES box select each X
and Y variable you wish to plot. MINITAB will
create individual plots for each pair of variables
selected. In the Six Sigma method, the
selected Y should be the dependent variable
and the selected X the independent variable.
Select as many combinations as you wish.
Click the OPTIONS button to add jitter to
the graph. Where there are multiple data points with the same value, this will allow
each data point to be seen.
Click FRAME button to give the options for setting common axes or multiple graphs.
Click ATTRIBUTES button to access options for changing Graphic colors or fill type.
Click ANNOTATION button to access options for changing the appearance of the data
points or to add titles, data labels or text.
Results: If Y changes as X changes, there is a potential relationship. Use the graph to
check visually for linearity or non-linearity.
Histogram
GRAPH>HISTOGRAM
The histogram is useful to look
graphically at the distribution of data.
In the GRAPH VARIABLES Box select
each variable you wish to graph
individually.
Click the OPTIONS button to change the
Histogram displayed:
Type of Histogram - Frequency (Default); Percent; Density
Type of Intervals - Midpoint (Default) or Cutpoint
Definition of intervals - Automatic (Default) or manual definition
See Minitab Help for explanation of how to use these options. Click HELP in the
HISTOGRAM>OPTIONS Dialog Box.
Click FRAME button to give the options for setting common axes or multiple
graphs.
Click ATTRIBUTES button to access options for changing graphic colors or fill
type.
Design for Six Sigma - Tolerance Analysis (continued)
Gather and prepare the required data:
gap nominals
gap specification limits
process data for each component: process mean, process s.st (short-term sigma), process s.lt (long-term sigma)
process data assumptions, if data are not available: try 'expert data sources' for s.lt estimates, or derive s.st and s.lt from capability data
Z-shift assumptions, long-term to short-term, when only one is known: multiply s.st by 1.6 as a variance inflation factor, or by 1.6+ for a process that has less control long term (divide s.lt by the above factor if long-term historical data are known)
The Tolanal.xls spreadsheet performs its analysis using Root
Sum of Squares method, and should only be applied to LINEAR
relationships (i.e. Y=X1+X2-X3). Non-linear relationships require
more detailed analysis using advanced DFSS tools such as
Monte Carlo or the ANALYSIS.XLS spreadsheet. Contact a
Master Blackbelt for support.
Using RSS, the statistics for the GAP can be explained as follows (the bar indicates the mean of each dimension):

mean(GAP) = mean(A) - mean(B1) - mean(B2) - mean(B3) - mean(B4)
σ(GAP) = sqrt( σ(A)² + σ(B1)² + σ(B2)² + σ(B3)² + σ(B4)² )
Control Chart Constants
n A2 A3 D3 D4 B3 B4 d2 c4
1 2.660 3.760 - - - - - -
2 1.880 2.659 0 3.267 0 3.267 1.128 0.7979
3 1.023 1.954 0 2.575 0 2.568 1.693 0.8862
4 0.729 1.628 0 2.282 0 2.266 2.059 0.9213
5 0.577 1.427 0 2.115 0 2.089 2.326 0.9400
6 0.483 1.287 0 2.004 0.03 1.970 2.534 0.9515
7 0.419 1.182 0.076 1.924 0.118 1.882 2.704 0.9594
8 0.373 1.099 0.136 1.864 0.185 1.815 2.847 0.9650
9 0.337 1.032 0.184 1.816 0.239 1.761 2.970 0.9693
10 0.308 0.975 0.223 1.777 0.284 1.716 3.078 0.9727
Variables Control Chart Control Limit Constants
Xbarbar = (Xbar1 + Xbar2 + ... + Xbark) / k, where Xbar = (Σ Xi) / n
Rbar = (R1 + R2 + ... + Rk) / k
UCLxbar = Xbarbar + A2·Rbar and LCLxbar = Xbarbar - A2·Rbar
UCLR = D4·Rbar and LCLR = D3·Rbar
Average/Range Chart
Individual X /Moving Range Chart
np Charts
np = number defective for each subgroup
npbar = (Σ np) / k, for all k subgroups
UCLnp = npbar + 3·sqrt( npbar·(1 - pbar) )
LCLnp = npbar - 3·sqrt( npbar·(1 - pbar) )
p Charts
p = np / n and pbar = N / (Σ n)
UCLp = pbar + 3·sqrt( pbar·(1 - pbar) / n )
LCLp = pbar - 3·sqrt( pbar·(1 - pbar) / n )
where:
np = number of defectives
n = subgroup size
N = total number of defectives for all subgroups
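The Average/Range chart limits can be sketched using the constants from the table above for n = 5 (A2 = 0.577, D3 = 0, D4 = 2.115); the subgroup statistics below are made up.

```python
# Sketch: X-bar/R control limits from subgroup means and ranges, using
# the table constants for subgroup size n = 5.
def xbar_r_limits(xbars, ranges, a2=0.577, d3=0.0, d4=2.115):
    xbarbar = sum(xbars) / len(xbars)   # grand average
    rbar = sum(ranges) / len(ranges)    # average range
    return {
        "xbar": (xbarbar - a2 * rbar, xbarbar + a2 * rbar),
        "range": (d3 * rbar, d4 * rbar),
    }

xbars = [10.1, 9.9, 10.0, 10.2, 9.8]    # subgroup means (illustrative)
ranges = [0.5, 0.4, 0.6, 0.5, 0.5]      # subgroup ranges
print(xbar_r_limits(xbars, ranges))
```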
What it is:
Gage R&R is a means for checking the measurement system (gage plus
operator) to gain a better understanding of the variation and sources from the
measurement system.
Gage R&R = 5.15·σm, or 5.15·sqrt( EV² + AV² )
where σm = measurement system standard deviation
Components of Measurement Error
Repeatability = Equipment Variation (EV): The variation in
measurements attributable to one measurement instrument when used
several times by one appraiser to measure the identical characteristic
on the same part.
Reproducibility = Appraisal Variation (AV): The variation in
measurements attributable to different appraisers using the same
measurement instrument to measure the same characteristic on the
same part.
How to do the gage R&R study.
1. Determine how the gage is going to be used; i.e., Product Acceptance
or Process Control.
Gage must have resolution 10X finer than the process variation it is
intended to measure. (i.e., measurement of parts with process
variation of .001 requires a gage with .0001 resolution)
2. Select approximately ten parts which represent the entire expected
range of the process variation, including several beyond the normally
acceptable range. Code (blind) the parts.
3. Identify two or three Gage R&R participants from the people who actually do the measurement. Have them each measure each part two or three times. The measurements should be done with samples randomized and blinded.
4. Record results on a MINITAB worksheet as follows:
a) One Column - Coded Part Numbers (PARTS)
b) One Column - Appraiser number or name (OPER)
c) One Column - Recorded Measurement (RESP)
5. Analyze using MINITAB by running Stat>Quality Tools>GageR&R
a) In the initial dialog box choose ANOVA method.
b) Identify the appropriate columns for PARTS, OPERATOR,
and MEASUREMENT Data .
c) If you wish to include the analysis for process tolerance, select
the OPTIONS button. This is only to be used if the gage is
for pass fail decisions only, not for process control.
d) If you wish to show demographic information on the graphic
output, including gage number, etc, select the Gage
Information button.
Gage R&R (1)
Precontrol
Provide ongoing visual means of on-the-floor process control.
Gives operators decision rules for continuing or stopping
production.
Rules are based on probability that population mean has
shifted.
Why use it?
What does it do?
How do I do it?
1. Establish control zones:
2. When five parts in a row are green, the process is
qualified.
3. Sample two consecutive parts on a periodic basis.
4. Decision rules for operators:
A. If the first part is green, no action is needed; continue to run.
B. If the first part is yellow, then check a second part.
If the second part is green, no action is needed.
If the second part is yellow on the same side, then adjust.
If the second part is yellow on the opposite side, stop and call the support engineer.
C. If any part is red, stop and call the support engineer.
5. After correcting and restarting a process, must achieve 5
consecutive green samples to re-qualify.
[Figure: precontrol zones. Green between -1.5s and +1.5s; Yellow between 1.5s and 3.0s on each side; Red beyond ±3.0s]
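The decision rules above can be sketched as a small Python function; the target and sigma values are illustrative.

```python
# Sketch: precontrol zones and the operator decision rules.
def zone(x, target, s):
    d = abs(x - target)
    if d <= 1.5 * s:
        return "green"
    if d <= 3.0 * s:
        return "yellow"
    return "red"

def decide(first, second=None, target=0.0, s=1.0):
    z1 = zone(first, target, s)
    if z1 == "red":
        return "stop, call support engineer"
    if z1 == "green":
        return "continue to run"
    # first part yellow: a second part is required
    if second is None:
        return "check a second part"
    z2 = zone(second, target, s)
    if z2 == "green":
        return "no action needed"
    if z2 == "red":
        return "stop, call support engineer"
    same_side = (first - target) * (second - target) > 0
    return "adjust" if same_side else "stop, call support engineer"

print(decide(2.0, 2.2))   # both yellow on the same side
```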
% Agreement should be very good. Typically this measure is much greater than 95%.
% Agreement for Binary (Pass / Fail) Data
Calculate % Agreement in similar manner to Non Measurement, except
using the following equation.
% Agreement = (Number of Agreements / Number of Opportunities) × 100
Where the number of opportunities is found by the following equations:
n = total number of assessments per sample
s = number of samples
If n is odd, then # Opportunities = s·(n² - 1) / 4
If n is even, then # Opportunities = s·n² / 4
Overall % Agreement = Agreement rate for all opportunities
Repeatability % Agreement = Compare the assessments for one operator over
multiple assessment opportunities. (Fix this problem first)
Reproducibility % Agreement = Compare assessments of the same part from
operator to operator.
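A minimal sketch of the % Agreement calculation, using the odd/even opportunity formulas above (the function names are illustrative):

```python
def opportunities(n, s):
    """Number of agreement opportunities per the guide's formulas.

    n: assessments per sample, s: number of samples.
    """
    # s(n^2 - 1)/4 when n is odd, s*n^2/4 when n is even
    return s * (n * n - 1) // 4 if n % 2 else s * n * n // 4


def pct_agreement(agreements, n, s):
    """% Agreement = (agreements / opportunities) * 100."""
    return 100.0 * agreements / opportunities(n, s)
```

For example, with 10 samples assessed 3 times each there are 10 x (9 - 1)/4 = 20 opportunities, so 19 agreements gives 95% agreement.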
Project Closure
At Closure, the project must be positioned so that the changes made to the process are
sustainable over time. Doing so requires the completion of a number of tasks.
1. The improvement must be fully implemented, with leverage factors identified and
controlled. The process must have been re-baselined to confirm the degree of
improvement.
2. Process owners must be fully trained and running the process, controlling the
leverage factors and monitoring the Response (Y).
3. Required Quality Plan and Control Procedures, drawings, documents, policies,
generated reports, and institutionalized rigor must be completed:
Workstation Instructions
Job Descriptions
Preventive Maintenance Plan
Written Policy or controlled ISO documents
Documented training procedures
Periodic internal audits or review meetings.
4. The project History Binder, which records key information about the project
work in hard copy, must be completed. Where MINITAB has been used for analysis,
hard copies of the generated graphics and tables should be included.
Initial baseline data
Gage R & R calculations
Statistical characterization of the process
DOE (Design of Experiments)
Hypothesis testing
Any data from Design Change Process activities (described on the next page),
Failure Modes and Effects Analysis (FMEA), Design for Six Sigma (DFSS),
etc.
Copies of engineering part and tooling drawing changes showing Z score
values on the drawings.
Confirmation run data
Financial data (costs and benefits)
Final decision on improvement and conclusions
All related quality system documents
A scorecard (with frequency of reporting)
Documented control plan
5. All data entries must be complete in PROTRAK:
Response Variable Z scores at initial Baselining
Response Variable Z scores at Re-baselining
Project Definition
Improvements Made
Accomplishments, Barriers and Milestones for all project phases
Tools used for all project phases.
6. Costs and Benefits for the project must be reconfirmed with site finance.
7. Investigate potential transfer opportunities where project lessons learned can be
applied to other business processes.
8. Submit closure package for signoff through the site approval channels.
Sample Size
          α = 20%            α = 10%            α = 5%             α = 1%
Δ/σ   β: 20% 10% 5% 1%   β: 20% 10% 5% 1%   β: 20% 10% 5% 1%   β: 20% 10% 5% 1%
0.2 225 328 428 651 309 428 541 789 392 525 650 919 584 744 891 1202
0.3 100 146 190 289 137 190 241 350 174 234 289 408 260 331 396 534
0.4 56 82 107 163 77 107 135 197 98 131 162 230 146 186 223 300
0.5 36 53 69 104 49 69 87 126 63 84 104 147 93 119 143 192
0.6 25 36 48 72 34 48 60 88 44 58 72 102 65 83 99 134
0.7 18 27 35 53 25 35 44 64 32 43 53 75 48 61 73 98
0.8 14 21 27 41 19 27 34 49 25 33 41 57 36 46 56 75
0.9 11 16 21 32 15 21 27 39 19 26 32 45 29 37 44 59
1.0 9 13 17 26 12 17 22 32 16 21 26 37 23 30 36 48
1.1 7 11 14 22 10 14 18 26 13 17 21 30 19 25 29 40
1.2 6 9 12 18 9 12 15 22 11 15 18 26 16 21 25 33
1.3 5 8 10 15 7 10 13 19 9 12 15 22 14 18 21 28
1.4 5 7 9 13 6 9 11 16 8 11 13 19 12 15 18 25
1.5 4 6 8 12 5 8 10 14 7 9 12 16 10 13 16 21
1.6 4 5 7 10 5 7 8 12 6 8 10 14 9 12 14 19
1.7 3 5 6 9 4 6 7 11 5 7 9 13 8 10 12 17
1.8 3 4 5 8 4 5 7 10 5 6 8 11 7 9 11 15
1.9 2 4 5 7 3 5 6 9 4 6 7 10 6 8 10 13
2.0 2 3 4 7 3 4 5 8 4 5 6 9 6 7 9 12
2.1 2 3 4 6 3 4 5 7 4 5 6 8 5 7 8 11
2.2 2 3 4 5 3 4 4 7 3 4 5 8 5 6 7 10
2.3 2 2 3 5 2 3 4 6 3 4 5 7 4 6 7 9
2.4 2 2 3 5 2 3 4 5 3 4 5 6 4 5 6 8
2.5 1 2 3 4 2 3 3 5 3 3 4 6 4 5 6 8
2.6 1 2 3 4 2 3 3 5 2 3 4 5 3 4 5 7
2.7 1 2 2 4 2 2 3 4 2 3 4 5 3 4 5 7
2.8 1 2 2 3 2 2 3 4 2 3 3 5 3 4 5 6
2.9 1 2 2 3 1 2 3 4 2 2 3 4 3 4 4 6
3.0 1 1 2 3 1 2 2 4 2 2 3 4 3 3 4 5
3.1 1 1 2 3 1 2 2 3 2 2 3 4 2 3 4 5
3.2 1 1 2 3 1 2 2 3 2 2 3 4 2 3 3 5
3.3 1 1 2 2 1 2 2 3 1 2 2 3 2 3 3 4
3.4 1 1 1 2 1 1 2 3 1 2 2 3 2 3 3 4
3.5 1 1 1 2 1 1 2 3 1 2 2 3 2 2 3 4
3.6 1 1 1 2 1 1 2 2 1 2 2 3 2 2 3 4
3.7 1 1 1 2 1 1 2 2 1 2 2 3 2 2 3 4
3.8 1 1 1 2 1 1 1 2 1 1 2 3 2 2 2 3
3.9 1 1 1 2 1 1 1 2 1 1 2 2 2 2 2 3
4.0 1 1 1 2 1 1 1 2 1 1 2 2 1 2 2 3
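The table entries are consistent with the standard two-sample formula n = 2(z_{α/2} + z_β)² / (Δ/σ)² per group, rounded to the nearest integer. A sketch follows; note that the formula choice is an inference from the table values, not stated in the guide.

```python
from statistics import NormalDist


def sample_size(delta_over_sigma, alpha, beta):
    """Sample size per group to detect a shift of delta_over_sigma std devs.

    Two-sided test at significance alpha with power 1 - beta:
    n = 2 * (z_{alpha/2} + z_beta)^2 / (delta/sigma)^2
    """
    nd = NormalDist()
    z_a = nd.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_b = nd.inv_cdf(1 - beta)       # power requirement
    return round(2 * (z_a + z_b) ** 2 / delta_over_sigma ** 2)
```

For example, detecting a one-standard-deviation difference at α = 5% with β = 20% gives n = 16 per group, matching the table row Δ/σ = 1.0.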
Fulfillment & Span
Fulfillment: providing what the customer wants, when the customer wants it.
Fulfillment is a highly segmented metric and typically does not follow a normal
distribution. Because the data are non-normal, some of the traditional 6 Sigma
tools (such as the 6 Sigma Process Report) should not be used. Therefore,
Median and Span will be used to measure Fulfillment.
Median: the middle value in a data set
Span: the difference between two values in the data set
(e.g. 1/99 Span = the difference between the 99th percentile and the
1st percentile)
Example:
A sample of 100 delivery times has a high value of 40 days.
If that one value had instead been 30 days, the 1/99 span would change by
10 days.
The 10/90 span is not affected by what happens to that highest point.
We don't want our decision to be influenced by a single data point.
Therefore, the Span calculation depends on the sample size: larger
data sets use a wider span. Following are corporate guidelines on the
Span calculation:
Sample Size Span
100-500 10/90 Span
500-5000 5/95 Span
>5000 1/99 Span
In order to analyze a fulfillment process, the data should be segmented by the
variables that may affect the process. Each segment of data should be
compared to identify whether the segmenting factor had an influence on the
Median and the Span. Mood's Median test is a tool that can be used to identify
significant differences in Median. Factors that are identified as having an
influence on Span and Median should be evaluated further through designed
experimentation.
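Median and Span can be computed directly from a sorted sample. A minimal sketch, assuming linear interpolation between order statistics (one common percentile convention; other conventions give slightly different values):

```python
def percentile(data, p):
    """p-th percentile by linear interpolation between order statistics."""
    xs = sorted(data)
    k = (len(xs) - 1) * p / 100.0
    f = int(k)                      # lower order statistic
    c = min(f + 1, len(xs) - 1)     # upper order statistic
    return xs[f] + (xs[c] - xs[f]) * (k - f)


def span(data, lo, hi):
    """Span between two percentiles, e.g. span(times, 5, 95) = 5/95 Span."""
    return percentile(data, hi) - percentile(data, lo)
```

Per the guidelines above, a 250-point data set would use span(data, 10, 90), while a data set of more than 5000 points would use span(data, 1, 99).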
F Distribution
(α = .05; columns are numerator degrees of freedom)
Denom DF 1 2 3 4 5 6 7 8 9 10
1 161.40 199.50 215.70 224.60 230.20 234.00 236.80 238.90 240.50 241.90
2 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38 19.40
3 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 8.79
4 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96
5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 4.74
6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06
7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.64
8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.35
9 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.14
10 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.98
11 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90 2.85
12 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80 2.75
13 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71 2.67
14 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65 2.60
15 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59 2.54
16 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54 2.49
17 4.45 3.59 3.20 2.96 2.81 2.70 2.61 2.55 2.49 2.45
18 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46 2.41
19 4.38 3.52 3.13 2.90 2.74 2.63 2.54 2.48 2.42 2.38
20 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39 2.35
21 4.32 3.47 3.07 2.84 2.68 2.57 2.49 2.42 2.37 2.32
22 4.30 3.44 3.05 2.82 2.66 2.55 2.46 2.40 2.34 2.30
23 4.28 3.42 3.03 2.80 2.64 2.53 2.44 2.37 2.32 2.27
24 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30 2.25
25 4.24 3.39 2.99 2.76 2.60 2.49 2.40 2.34 2.28 2.24
26 4.23 3.37 2.98 2.74 2.59 2.47 2.39 2.32 2.27 2.22
27 4.21 3.35 2.96 2.73 2.57 2.46 2.37 2.31 2.25 2.20
28 4.20 3.34 2.95 2.71 2.56 2.45 2.36 2.29 2.24 2.19
29 4.18 3.33 2.93 2.70 2.55 2.43 2.35 2.28 2.22 2.18
30 4.17 3.32 2.92 2.69 2.53 2.42 2.33 2.27 2.21 2.16
40 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12 2.08
60 4.00 3.15 2.76 2.53 2.37 2.25 2.17 2.10 2.04 1.99
120 3.92 3.07 2.68 2.45 2.29 2.17 2.09 2.02 1.96 1.91
∞ 3.84 3.00 2.60 2.37 2.21 2.10 2.01 1.94 1.88 1.83
Z - An Important Measure

Z Short Term
Z.ST = (SL - Target) / s.ST
Z.ST describes how the process performs at any given moment in time. It is
referred to as instantaneous capability, short-term capability or process
entitlement. It is used when referring to the SIGMA of a process. It is the
process capability if everything is controlled so that only background noise
(common cause variation) is present. This metric assumes the process is
centered and the data were gathered in accordance with the principles and
spirit of a rational subgrouping plan (p. 14). The Target assumes that each
subgroup average is aligned to this number, so that all subgroup means are
artificially centered on this number. The s.ST used in this equation can be
estimated by the square root of the Mean Square Error term in the ANOVA
table. Since the data are centered, Z.ST can be calculated from either one of
the Specification Limits (SL).

Z Long Term
Z.LT = minimum of (USL - mean) / s.LT or (mean - LSL) / s.LT
Z.LT describes the sustained reproducibility of a process. It is also called
long-term capability. It reflects all of the sources of operational variation:
the influence of common cause variation, dynamic nonrandom process centering
error, and any static offset present in the process mean. This metric assumes
the data were gathered in accordance with the principles and spirit of a
rational sampling plan (p. 14). This equation is applicable to all types of
tolerances. It is used to estimate the long-term process PPM.

Z Shift
Z.SHIFT = Z.ST - Z.LT
Z.SHIFT describes how well the process being measured is controlled over
time. It reflects the difference between the short-term and long-term
capability. It focuses on the dynamic nonrandom process centering error, and
any static offset present in the process mean. Interpretation of the Z.SHIFT
is only valid when following the principles of rational subgrouping (p. 14).

Z Benchmark
Z.Benchmark = Z score of (P.LSL + P.USL)
While the Z values above are all calculated in reference to a single spec
limit, Z Benchmark is the Z score of the summation of the probabilities of
defects in both tails of the distribution. To find it, sum the probability of
defect at the Lower Spec Limit (P.LSL) and the probability of defect at the
Upper Spec Limit (P.USL). Look up the sum of the combined probabilities in a
normal table to find the corresponding Z value.
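The Z metrics above can be sketched as follows, assuming normally distributed data; the function and variable names (mean, s_st, s_lt) are illustrative:

```python
from statistics import NormalDist


def z_st(sl, target, s_st):
    # Short-term capability from either spec limit (centered data).
    return abs(sl - target) / s_st


def z_lt(usl, lsl, mean, s_lt):
    # Long-term capability: worst case of the two spec limits.
    return min((usl - mean) / s_lt, (mean - lsl) / s_lt)


def z_shift(z_st_value, z_lt_value):
    # Z.SHIFT = Z.ST - Z.LT
    return z_st_value - z_lt_value


def z_benchmark(usl, lsl, mean, s_lt):
    # Sum the defect probabilities in both tails, then convert the
    # combined tail area back to a Z score.
    nd = NormalDist()
    p_lsl = nd.cdf((lsl - mean) / s_lt)        # defects below LSL
    p_usl = 1.0 - nd.cdf((usl - mean) / s_lt)  # defects above USL
    return nd.inv_cdf(1.0 - (p_lsl + p_usl))
```

For a process with USL = 10, LSL = 0, long-term mean 6 and s.LT = 1, Z.LT = min(4, 6) = 4; because nearly all the defect probability sits in one tail, Z.Benchmark comes out just under 4 as well.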
The Standard Normal Curve
** Area under the curve = 1, the center is 0 **
Z Area Z Area Z Area Z Area
0.00 .500000000 1.51 .065521615 3.02 .001263795 4.53 .000002999
0.05 .480061306 1.56 .059379869 3.07 .001070234 4.58 .000002369
0.10 .460172290 1.61 .053698886 3.12 .000904215 4.63 .000001867
0.15 .440382395 1.66 .048457216 3.17 .000762175 4.68 .000001469
0.20 .420740315 1.71 .043632958 3.22 .000640954 4.73 .000001153
0.25 .401293634 1.76 .039203955 3.27 .000537758 4.78 .000000903
0.30 .382088486 1.81 .035147973 3.32 .000450127 4.83 .000000705
0.35 .363169226 1.86 .031442864 3.37 .000375899 4.88 .000000550
0.40 .344578129 1.91 .028066724 3.42 .000313179 4.93 .000000428
0.45 .326355105 1.96 .024998022 3.47 .000260317 4.98 .000000332
0.50 .308537454 2.01 .022215724 3.52 .000215873 5.03 .000000258
0.55 .291159644 2.06 .019699396 3.57 .000178601 5.08 .000000199
0.60 .274253121 2.11 .017429293 3.62 .000147419 5.13 .000000154
0.65 .257846158 2.16 .015386434 3.67 .000121399 5.18 .000000118
0.70 .241963737 2.21 .013552660 3.72 .000099739 5.23 .000000091
0.75 .226627465 2.26 .011910681 3.77 .000081753 5.28 .000000070
0.80 .211855526 2.31 .010444106 3.82 .000066855 5.33 .000000053
0.85 .197662672 2.36 .009137469 3.87 .000054545 5.38 .000000041
0.90 .184060243 2.41 .007976235 3.92 .000044399 5.43 .000000031
0.95 .171056222 2.46 .006946800 3.97 .000036057 5.48 .000000024
1.00 .158655319 2.51 .006036485 4.02 .000029215 5.53 .000000018
1.05 .146859086 2.56 .005233515 4.07 .000023617 5.58 .000000014
1.10 .135666053 2.61 .004527002 4.12 .000019047 5.63 .000000010
1.15 .125071891 2.66 .003906912 4.17 .000015327 5.68 .000000008
1.20 .115069593 2.71 .003364033 4.22 .000012305 5.73 .000000006
1.25 .105649671 2.76 .002889938 4.27 .000009857 5.78 .000000004
1.30 .096800364 2.81 .002476947 4.32 .000007878 5.83 .000000003
1.35 .088507862 2.86 .002118083 4.37 .000006282 5.88 .000000003
1.40 .080756531 2.91 .001807032 4.42 .000004998 5.93 .000000002
1.45 .073529141 2.96 .001538097 4.47 .000003968 5.98 .000000001
1.50 .066807100 3.01 .001306156 4.52 .000003143 6.03 .000000001
Table of Area Under the Normal Curve: this table lists the tail area to the
right of Z.
[Figure: standard normal curve, total area = 1; Z = 0 at the mean; points of
inflection at Z = ±1; a performance limit marked at Z = 2.76, with the tail
beyond it shaded as the probability of a defect.]
The Z value is a measure of process capability and is often referred to as
the sigma of the process. A Z = 1 indicates a process for which the
performance limit falls one standard deviation from the mean. If we calculate
the standard normal deviate for a given performance limit and discover that
Z = 2.76, the probability of a defect (P(d)) is the probability of a point
lying beyond the Z value of 2.76 (from the table, .00289).
Copyright 1995 Six Sigma Academy, Inc.
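The tail areas in the table can be reproduced with Python's statistics.NormalDist; a minimal sketch:

```python
from statistics import NormalDist


def tail_area(z):
    # Area under the standard normal curve to the right of z,
    # i.e. the probability of a point lying beyond the Z value.
    return 1.0 - NormalDist().cdf(z)
```

For the worked example above, tail_area(2.76) returns approximately .00289, matching the table entry.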
F Distribution (α = .05, continued; columns are numerator degrees of freedom)
Denom DF 12 15 20 24 30 40 60 120 ∞
1 243.90 245.90 248.00 249.10 250.10 251.10 252.20 253.30 254.30
2 19.41 19.43 19.45 19.45 19.46 19.47 19.48 19.49 19.50
3 8.74 8.70 8.66 8.64 8.62 8.59 8.57 8.55 8.53
4 5.91 5.86 5.80 5.77 5.75 5.72 5.69 5.66 5.63
5 4.68 4.62 4.56 4.53 4.50 4.46 4.43 4.40 4.36
6 4.00 3.94 3.87 3.84 3.81 3.77 3.74 3.70 3.67
7 3.57 3.51 3.44 3.41 3.38 3.34 3.30 3.27 3.23
8 3.28 3.22 3.15 3.12 3.08 3.04 3.01 2.97 2.93
9 3.07 3.01 2.94 2.90 2.86 2.83 2.79 2.75 2.71
10 2.91 2.85 2.77 2.74 2.70 2.66 2.62 2.58 2.54
11 2.79 2.72 2.65 2.61 2.57 2.53 2.49 2.45 2.40
12 2.69 2.62 2.54 2.51 2.47 2.43 2.38 2.34 2.30
13 2.60 2.53 2.46 2.42 2.38 2.34 2.30 2.25 2.21
14 2.53 2.46 2.39 2.35 2.31 2.27 2.22 2.18 2.13
15 2.48 2.40 2.33 2.29 2.25 2.20 2.16 2.11 2.07
16 2.42 2.35 2.28 2.24 2.19 2.15 2.11 2.06 2.01
17 2.38 2.31 2.23 2.19 2.15 2.10 2.06 2.01 1.96
18 2.34 2.27 2.19 2.15 2.11 2.06 2.02 1.97 1.92
19 2.31 2.23 2.16 2.11 2.07 2.03 1.98 1.93 1.88
20 2.28 2.20 2.12 2.08 2.04 1.99 1.95 1.90 1.84
21 2.25 2.18 2.10 2.05 2.01 1.96 1.92 1.87 1.81
22 2.23 2.15 2.07 2.03 1.98 1.94 1.89 1.84 1.78
23 2.20 2.13 2.05 2.01 1.96 1.91 1.86 1.81 1.76
24 2.18 2.11 2.03 1.98 1.94 1.89 1.84 1.79 1.73
25 2.16 2.09 2.01 1.96 1.92 1.87 1.82 1.77 1.71
26 2.15 2.07 1.99 1.95 1.90 1.85 1.80 1.75 1.69
27 2.13 2.06 1.97 1.93 1.88 1.84 1.79 1.73 1.67
28 2.12 2.04 1.96 1.91 1.87 1.82 1.77 1.71 1.65
29 2.10 2.03 1.94 1.90 1.85 1.81 1.75 1.70 1.64
30 2.09 2.01 1.93 1.89 1.84 1.79 1.74 1.68 1.62
40 2.00 1.92 1.84 1.79 1.74 1.69 1.64 1.58 1.51
60 1.92 1.84 1.75 1.70 1.65 1.59 1.53 1.47 1.39
120 1.83 1.75 1.66 1.61 1.55 1.50 1.43 1.35 1.25
∞ 1.75 1.67 1.57 1.52 1.46 1.39 1.32 1.22 1.00
Chi-Square Distribution
df .995 .990 .975 .950 .900 .750 .500
1 .000039 .000160 .000980 .003930 .015800 .101500 .455000
2 0.010 0.020 0.051 0.103 0.211 0.575 1.386
3 0.072 0.115 0.216 0.352 0.584 1.213 2.366
4 0.207 0.297 0.484 0.711 1.064 1.923 3.357
5 0.412 0.554 0.831 1.145 1.610 2.675 4.351
6 0.676 0.872 1.237 1.635 2.204 3.455 5.348
7 0.989 1.239 1.690 2.167 2.833 4.255 6.346
8 1.344 1.646 2.180 2.733 3.490 5.071 7.344
9 1.735 2.088 2.700 3.325 4.168 5.899 8.343
10 2.156 2.558 3.247 3.940 4.865 6.737 9.342
11 2.603 3.053 3.816 4.575 5.578 7.584 10.341
12 3.074 3.571 4.404 5.226 6.304 8.438 11.340
13 3.565 4.107 5.009 5.892 7.042 9.299 12.340
14 4.075 4.660 5.629 6.571 7.790 10.165 13.339
15 4.601 5.229 6.262 7.261 8.547 11.036 14.339
16 5.142 5.812 6.908 7.962 9.312 11.912 15.338
17 5.697 6.408 7.564 8.672 10.085 12.792 16.338
18 6.265 7.015 8.231 9.390 10.865 13.675 17.338
19 6.844 7.633 8.907 10.117 11.651 14.562 18.338
20 7.434 8.260 9.591 10.851 12.443 15.452 19.337
21 8.034 8.897 10.283 11.591 13.240 16.344 20.337
22 8.643 9.542 10.982 12.338 14.041 17.240 21.337
23 9.260 10.196 11.688 13.091 14.848 18.137 22.337
24 9.886 10.856 12.401 13.848 15.659 19.037 23.337
25 10.520 11.524 13.120 14.611 16.473 19.939 24.337
26 11.160 12.198 13.844 15.379 17.292 20.843 25.336
27 11.808 12.879 14.573 16.151 18.114 21.749 26.336
28 12.461 13.565 15.308 16.928 18.939 22.657 27.336
29 13.121 14.256 16.047 17.708 19.768 23.567 28.336
30 13.787 14.953 16.791 18.493 20.599 24.478 29.336
7 Basic QC Tools - Ishikawa
The seven basic QC tools are the simplest, quickest tools for
structured problem solving. In many cases these tools will define
the appropriate area in which to focus to solve quality problems.
They are an integral part of the Six Sigma DMAIC process toolkit.
Brainstorming: Allows generation of a high volume of ideas quickly.
Generally used integrally with the advocacy team when identifying the
potential Xs.
Pareto: Helps to define the potential vital few Xs. The Pareto chart links
data to problem causes and aids in making data-based decisions (Page
23).
Histogram: Displays frequency of occurrence of various categories in
chart form, can be used as first cut at mean, variation, distribution of
data. An important part of process data analysis. (Page 18).
Cause & Effect / Fishbone Diagram: Helps identify potential problem
causes and focus brainstorming. (Page 23).
Flowcharting / Process Mapping: Displays actual steps of process.
Provides basis for examining potential areas of improvement.
Scatter Charts: Shows relationship between two variables.
(Page 18).
Check Sheets: Capture data in a format that facilitates interpretation.
[Figures: Pareto Chart; Fishbone (Ishikawa) Diagram]
Chi-Square Distribution (continued)
df .250 .100 .050 .025 .010 .005 .001
1 1.323 2.706 3.841 5.024 6.635 7.879 10.828
2 2.773 4.605 5.991 7.378 9.210 10.597 13.816
3 4.108 6.251 7.815 9.348 11.345 12.838 16.266
4 5.385 7.779 9.488 11.143 13.277 14.860 18.467
5 6.626 9.236 11.070 12.832 15.086 16.750 20.515
6 7.841 10.645 12.592 14.449 16.812 18.548 22.458
7 9.037 12.017 14.067 16.013 18.475 20.278 24.322
8 10.219 13.362 15.507 17.535 20.090 21.955 26.125
9 11.389 14.684 16.919 19.023 21.666 23.589 27.877
10 12.549 15.987 18.307 20.483 23.209 25.188 29.588
11 13.701 17.275 19.675 21.920 24.725 26.757 31.264
12 14.845 18.549 21.026 23.337 26.217 28.300 32.909
13 15.984 19.812 22.362 24.736 27.688 29.819 34.528
14 17.117 21.064 23.685 26.119 29.141 31.319 36.123
15 18.245 22.307 24.996 27.488 30.578 32.801 37.697
16 19.369 23.542 26.296 28.845 32.000 34.267 39.252
17 20.489 24.769 27.587 30.191 33.409 35.718 40.790
18 21.605 25.989 28.869 31.526 34.805 37.156 42.312
19 22.718 27.204 30.144 32.852 36.191 38.582 43.820
20 23.828 28.412 31.410 34.170 37.566 39.997 45.315
21 24.935 29.615 32.671 35.479 38.932 41.401 46.797
22 26.039 30.813 33.924 36.781 40.289 42.796 48.268
23 27.141 32.007 35.172 38.076 41.638 44.181 49.728
24 28.241 33.196 36.415 39.364 42.980 45.558 51.179
25 29.339 34.382 37.652 40.646 44.314 46.928 52.620
26 30.434 35.563 38.885 41.923 45.642 48.290 54.052
27 31.528 36.741 40.113 43.194 46.963 49.645 55.476
28 32.620 37.916 41.337 44.461 48.278 50.993 56.892
29 33.711 39.087 42.557 45.722 49.588 52.336 58.302
30 34.800 40.256 43.773 46.979 50.892 53.672 59.703
Practical Problem Statement
A major cause of futile attempts to solve a problem is a poor up-front
statement of the problem. Define the problem using available facts and the
planned improvement.
1. Write an initial "as is" problem statement. This statement describes the
problem as it exists now. It is a statement of what hurts or what bugs you.
The statement should contain data-based measures of the hurt. For example:
As Is: The response time for 15% of our service calls is more than 24 hours.
2. Be sure the problem statement meets the following criteria:
Is as specific as possible
Contains no potential causes
Contains no conclusions or potential solutions
Is sufficiently narrow in scope
The most common mistake in developing a Problem Statement is the problem is
stated at too high a level or is too broad for effective investigation. Use the
Structure Tree (Page 25), Pareto (Page 25) or Rolled Throughput Yield analysis
(Page 14) to break the problem down further.
3. Avoid the following in wording problem statements:
4. Determine if you have identified the correct level to address the problem.
Ask: Is my Y response variable (Output) defined at a level at which it can
be solved by direct interaction with its independent variables (Xs, the Inputs)?
5. Determine if correcting the Y response variable will result in the desired
improvement in the problem as stated.
6. Describe the desired state, a description of what you want to achieve by
solving the problem, as objectively as possible. As with the "as is"
statement, be sure the desired state is in measurable, observable terms. For
example:
Desired State: The response time for all our service calls is less than 24 hours.
Avoid | Ineffective Problem Statement | Effective Problem Statement
Questions | How can we reduce the downtime on the Assembly Line? | Assembly Line downtime currently runs 15% of operating hours.
The word "lack" | We lack word processing software. | Material to be typed is backlogged by five days.
Solution masquerading as a problem | We need to hire another warehouse shipping clerk. | 50% of the scheduled day's shipments are not being pulled on time.
Blaming people instead of processes | File Clerks aren't doing their jobs. | Files cannot be located within the allowed 5 minutes after being requested.
Defining a Six Sigma Project
A well defined problem is the first step
in a successful project!
Normal Distribution
Z 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 5.00E-01 4.96E-01 4.92E-01 4.88E-01 4.84E-01 4.80E-01 4.76E-01 4.72E-01 4.68E-01 4.64E-01
0.1 4.60E-01 4.56E-01 4.52E-01 4.48E-01 4.44E-01 4.40E-01 4.36E-01 4.33E-01 4.29E-01 4.25E-01
0.2 4.21E-01 4.17E-01 4.13E-01 4.09E-01 4.05E-01 4.01E-01 3.97E-01 3.94E-01 3.90E-01 3.86E-01
0.3 3.82E-01 3.78E-01 3.75E-01 3.71E-01 3.67E-01 3.63E-01 3.59E-01 3.56E-01 3.52E-01 3.48E-01
0.4 3.45E-01 3.41E-01 3.37E-01 3.34E-01 3.30E-01 3.26E-01 3.23E-01 3.19E-01 3.16E-01 3.12E-01
0.5 3.09E-01 3.05E-01 3.02E-01 2.98E-01 2.95E-01 2.91E-01 2.88E-01 2.84E-01 2.81E-01 2.78E-01
0.6 2.74E-01 2.71E-01 2.68E-01 2.64E-01 2.61E-01 2.58E-01 2.55E-01 2.51E-01 2.48E-01 2.45E-01
0.7 2.42E-01 2.39E-01 2.36E-01 2.33E-01 2.30E-01 2.27E-01 2.24E-01 2.21E-01 2.18E-01 2.15E-01
0.8 2.12E-01 2.09E-01 2.06E-01 2.03E-01 2.01E-01 1.98E-01 1.95E-01 1.92E-01 1.89E-01 1.87E-01
0.9 1.84E-01 1.81E-01 1.79E-01 1.76E-01 1.74E-01 1.71E-01 1.69E-01 1.66E-01 1.64E-01 1.61E-01
1.0 1.59E-01 1.56E-01 1.54E-01 1.52E-01 1.49E-01 1.47E-01 1.45E-01 1.42E-01 1.40E-01 1.38E-01
1.1 1.36E-01 1.34E-01 1.31E-01 1.29E-01 1.27E-01 1.25E-01 1.23E-01 1.21E-01 1.19E-01 1.17E-01
1.2 1.15E-01 1.13E-01 1.11E-01 1.09E-01 1.08E-01 1.06E-01 1.04E-01 1.02E-01 1.00E-01 9.85E-02
1.3 9.68E-02 9.51E-02 9.34E-02 9.18E-02 9.01E-02 8.85E-02 8.69E-02 8.53E-02 8.38E-02 8.23E-02
1.4 8.08E-02 7.93E-02 7.78E-02 7.64E-02 7.49E-02 7.35E-02 7.21E-02 7.08E-02 6.94E-02 6.81E-02
1.5 6.68E-02 6.55E-02 6.43E-02 6.30E-02 6.18E-02 6.06E-02 5.94E-02 5.82E-02 5.71E-02 5.59E-02
1.6 5.48E-02 5.37E-02 5.26E-02 5.16E-02 5.05E-02 4.95E-02 4.85E-02 4.75E-02 4.65E-02 4.55E-02
1.7 4.46E-02 4.36E-02 4.27E-02 4.18E-02 4.09E-02 4.01E-02 3.92E-02 3.84E-02 3.75E-02 3.67E-02
1.8 3.59E-02 3.52E-02 3.44E-02 3.36E-02 3.29E-02 3.22E-02 3.14E-02 3.07E-02 3.01E-02 2.94E-02
1.9 2.87E-02 2.81E-02 2.74E-02 2.68E-02 2.62E-02 2.56E-02 2.50E-02 2.44E-02 2.39E-02 2.33E-02
2.0 2.28E-02 2.22E-02 2.17E-02 2.12E-02 2.07E-02 2.02E-02 1.97E-02 1.92E-02 1.88E-02 1.83E-02
2.1 1.79E-02 1.74E-02 1.70E-02 1.66E-02 1.62E-02 1.58E-02 1.54E-02 1.50E-02 1.46E-02 1.43E-02
2.2 1.39E-02 1.36E-02 1.32E-02 1.29E-02 1.26E-02 1.22E-02 1.19E-02 1.16E-02 1.13E-02 1.10E-02
2.3 1.07E-02 1.04E-02 1.02E-02 9.90E-03 9.64E-03 9.39E-03 9.14E-03 8.89E-03 8.66E-03 8.42E-03
2.4 8.20E-03 7.98E-03 7.76E-03 7.55E-03 7.34E-03 7.14E-03 6.95E-03 6.76E-03 6.57E-03 6.39E-03
2.5 6.21E-03 6.04E-03 5.87E-03 5.70E-03 5.54E-03 5.39E-03 5.23E-03 5.09E-03 4.94E-03 4.80E-03
2.6 4.66E-03 4.53E-03 4.40E-03 4.27E-03 4.15E-03 4.02E-03 3.91E-03 3.79E-03 3.68E-03 3.57E-03
2.7 3.47E-03 3.36E-03 3.26E-03 3.17E-03 3.07E-03 2.98E-03 2.89E-03 2.80E-03 2.72E-03 2.64E-03
2.8 2.56E-03 2.48E-03 2.40E-03 2.33E-03 2.26E-03 2.19E-03 2.12E-03 2.05E-03 1.99E-03 1.93E-03
2.9 1.87E-03 1.81E-03 1.75E-03 1.70E-03 1.64E-03 1.59E-03 1.54E-03 1.49E-03 1.44E-03 1.40E-03
3.0 1.35E-03 1.31E-03 1.26E-03 1.22E-03 1.18E-03 1.14E-03 1.11E-03 1.07E-03 1.04E-03 1.00E-03
3.1 9.68E-04 9.35E-04 9.04E-04 8.74E-04 8.45E-04 8.16E-04 7.89E-04 7.62E-04 7.36E-04 7.11E-04
3.2 6.87E-04 6.64E-04 6.41E-04 6.19E-04 5.98E-04 5.77E-04 5.57E-04 5.38E-04 5.19E-04 5.01E-04
3.3 4.84E-04 4.67E-04 4.50E-04 4.34E-04 4.19E-04 4.04E-04 3.90E-04 3.76E-04 3.63E-04 3.50E-04
3.4 3.37E-04 3.25E-04 3.13E-04 3.02E-04 2.91E-04 2.80E-04 2.70E-04 2.60E-04 2.51E-04 2.42E-04
3.5 2.33E-04 2.24E-04 2.16E-04 2.08E-04 2.00E-04 1.93E-04 1.86E-04 1.79E-04 1.72E-04 1.66E-04
3.6 1.59E-04 1.53E-04 1.47E-04 1.42E-04 1.36E-04 1.31E-04 1.26E-04 1.21E-04 1.17E-04 1.12E-04
3.7 1.08E-04 1.04E-04 9.97E-05 9.59E-05 9.21E-05 8.86E-05 8.51E-05 8.18E-05 7.85E-05 7.55E-05
3.8 7.25E-05 6.96E-05 6.69E-05 6.42E-05 6.17E-05 5.92E-05 5.68E-05 5.46E-05 5.24E-05 5.03E-05
3.9 4.82E-05 4.63E-05 4.44E-05 4.26E-05 4.09E-05 3.92E-05 3.76E-05 3.61E-05 3.46E-05 3.32E-05
4.0 3.18E-05 3.05E-05 2.92E-05 2.80E-05 2.68E-05 2.57E-05 2.47E-05 2.36E-05 2.26E-05 2.17E-05
4.1 2.08E-05 1.99E-05 1.91E-05 1.82E-05 1.75E-05 1.67E-05 1.60E-05 1.53E-05 1.47E-05 1.40E-05
4.2 1.34E-05 1.29E-05 1.23E-05 1.18E-05 1.13E-05 1.08E-05 1.03E-05 9.86E-06 9.43E-06 9.01E-06
4.3 8.62E-06 8.24E-06 7.88E-06 7.53E-06 7.20E-06 6.88E-06 6.57E-06 6.28E-06 6.00E-06 5.73E-06
4.4 5.48E-06 5.23E-06 5.00E-06 4.77E-06 4.56E-06 4.35E-06 4.16E-06 3.97E-06 3.79E-06 3.62E-06
4.5 3.45E-06 3.29E-06 3.14E-06 3.00E-06 2.86E-06 2.73E-06 2.60E-06 2.48E-06 2.37E-06 2.26E-06
4.6 2.15E-06 2.05E-06 1.96E-06 1.87E-06 1.78E-06 1.70E-06 1.62E-06 1.54E-06 1.47E-06 1.40E-06
4.7 1.33E-06 1.27E-06 1.21E-06 1.15E-06 1.10E-06 1.05E-06 9.96E-07 9.48E-07 9.03E-07 8.59E-07
4.8 8.18E-07 7.79E-07 7.41E-07 7.05E-07 6.71E-07 6.39E-07 6.08E-07 5.78E-07 5.50E-07 5.23E-07
4.9 4.98E-07 4.73E-07 4.50E-07 4.28E-07 4.07E-07 3.87E-07 3.68E-07 3.50E-07 3.32E-07 3.16E-07
SPECIFIC: Issue is clearly defined to the lowest level of cause and effect.
The project should have a response variable (Y) with specifications and
constraints (i.e. cycle time for returned parts, washer base width). It
should be bound by clearly defined goals. If it looks big, it is. A poorly
defined project will require greater scoping time and will have a longer
completion time than one that is clearly defined.

VALUE-ADDED: Financially justifiable - directly impacts a business metric
that returns value: PPM, reliability, yield, pricing errors, field returns,
factory yield, overtime, transportation, warehousing, availability, SCR,
rework, under-billing and scrap.

MEASURABLE: The response variable (Y) must have reasonable historical DATA,
or you must have the ability to capture a reliable data stream. Having a
method for measuring vital Xs is also essential for in-depth process
analysis with data. Discrete data can be effectively used for problem
investigation, but variable (continuous) data is better. Projects based on
unreliable data have unreliable results.

LOCALLY ACTIONABLE: The selected project should be one which can be
addressed by the accepted local organization. Adequate support is needed to
ensure successful project completion and permanent change to the process. It
is difficult to manage improvements in Louisville from the field.

CUSTOMER FOCUSED: The Project Y should be clearly linked to a specific
customer want or need - can result in improved customer perception or
consumer satisfaction (Customer WOW): on-time delivery, billing accuracy,
call answer rate.
Normal Distribution (continued)
Z 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
5.0 3.00E-07 2.85E-07 2.71E-07 2.58E-07 2.45E-07 2.32E-07 2.21E-07 2.10E-07 1.99E-07 1.89E-07
5.1 1.80E-07 1.71E-07 1.62E-07 1.54E-07 1.46E-07 1.39E-07 1.31E-07 1.25E-07 1.18E-07 1.12E-07
5.2 1.07E-07 1.01E-07 9.59E-08 9.10E-08 8.63E-08 8.18E-08 7.76E-08 7.36E-08 6.98E-08 6.62E-08
5.3 6.27E-08 5.95E-08 5.64E-08 5.34E-08 5.06E-08 4.80E-08 4.55E-08 4.31E-08 4.08E-08 3.87E-08
5.4 3.66E-08 3.47E-08 3.29E-08 3.11E-08 2.95E-08 2.79E-08 2.64E-08 2.50E-08 2.37E-08 2.24E-08
5.5 2.12E-08 2.01E-08 1.90E-08 1.80E-08 1.70E-08 1.61E-08 1.53E-08 1.44E-08 1.37E-08 1.29E-08
5.6 1.22E-08 1.16E-08 1.09E-08 1.03E-08 9.78E-09 9.24E-09 8.74E-09 8.26E-09 7.81E-09 7.39E-09
5.7 6.98E-09 6.60E-09 6.24E-09 5.89E-09 5.57E-09 5.26E-09 4.97E-09 4.70E-09 4.44E-09 4.19E-09
5.8 3.96E-09 3.74E-09 3.53E-09 3.34E-09 3.15E-09 2.97E-09 2.81E-09 2.65E-09 2.50E-09 2.36E-09
5.9 2.23E-09 2.11E-09 1.99E-09 1.88E-09 1.77E-09 1.67E-09 1.58E-09 1.49E-09 1.40E-09 1.32E-09
6.0 1.25E-09 1.18E-09 1.11E-09 1.05E-09 9.88E-10 9.31E-10 8.78E-10 8.28E-10 7.81E-10 7.36E-10
6.1 6.94E-10 6.54E-10 6.17E-10 5.81E-10 5.48E-10 5.16E-10 4.87E-10 4.59E-10 4.32E-10 4.07E-10
6.2 3.84E-10 3.61E-10 3.40E-10 3.21E-10 3.02E-10 2.84E-10 2.68E-10 2.52E-10 2.38E-10 2.24E-10
6.3 2.11E-10 1.98E-10 1.87E-10 1.76E-10 1.66E-10 1.56E-10 1.47E-10 1.38E-10 1.30E-10 1.22E-10
6.4 1.15E-10 1.08E-10 1.02E-10 9.59E-11 9.02E-11 8.49E-11 7.98E-11 7.51E-11 7.06E-11 6.65E-11
6.5 6.25E-11 5.88E-11 5.53E-11 5.20E-11 4.89E-11 4.60E-11 4.32E-11 4.07E-11 3.82E-11 3.59E-11
6.6 3.38E-11 3.18E-11 2.98E-11 2.81E-11 2.64E-11 2.48E-11 2.33E-11 2.19E-11 2.06E-11 1.93E-11
6.7 1.82E-11 1.71E-11 1.60E-11 1.51E-11 1.42E-11 1.33E-11 1.25E-11 1.17E-11 1.10E-11 1.04E-11
6.8 9.72E-12 9.13E-12 8.57E-12 8.05E-12 7.56E-12 7.10E-12 6.66E-12 6.26E-12 5.87E-12 5.52E-12
6.9 5.18E-12 4.86E-12 4.56E-12 4.28E-12 4.02E-12 3.77E-12 3.54E-12 3.32E-12 3.12E-12 2.93E-12
7.0 2.75E-12 2.58E-12 2.42E-12 2.27E-12 2.13E-12 2.00E-12 1.87E-12 1.76E-12 1.65E-12 1.55E-12
7.1 1.45E-12 1.36E-12 1.28E-12 1.20E-12 1.12E-12 1.05E-12 9.88E-13 9.26E-13 8.69E-13 8.15E-13
7.2 7.64E-13 7.16E-13 6.72E-13 6.30E-13 5.90E-13 5.54E-13 5.19E-13 4.86E-13 4.56E-13 4.28E-13
7.3 4.01E-13 3.76E-13 3.52E-13 3.30E-13 3.09E-13 2.90E-13 2.72E-13 2.55E-13 2.39E-13 2.24E-13
7.4 2.10E-13 1.96E-13 1.84E-13 1.72E-13 1.62E-13 1.51E-13 1.42E-13 1.33E-13 1.24E-13 1.17E-13
7.5 1.09E-13 1.02E-13 9.58E-14 8.98E-14 8.41E-14 7.87E-14 7.38E-14 6.91E-14 6.47E-14 6.06E-14
7.6 5.68E-14 5.32E-14 4.98E-14 4.66E-14 4.37E-14 4.09E-14 3.83E-14 3.58E-14 3.36E-14 3.14E-14
7.7 2.94E-14 2.76E-14 2.58E-14 2.42E-14 2.26E-14 2.12E-14 1.98E-14 1.86E-14 1.74E-14 1.63E-14
7.8 1.52E-14 1.42E-14 1.33E-14 1.25E-14 1.17E-14 1.09E-14 1.02E-14 9.58E-15 8.97E-15 8.39E-15
7.9 7.85E-15 7.35E-15 6.88E-15 6.44E-15 6.02E-15 5.64E-15 5.28E-15 4.94E-15 4.62E-15 4.32E-15
8.0 4.05E-15 3.79E-15 3.54E-15 3.31E-15 3.10E-15 2.90E-15 2.72E-15 2.54E-15 2.38E-15 2.22E-15
8.1 2.08E-15 1.95E-15 1.82E-15 1.70E-15 1.59E-15 1.49E-15 1.40E-15 1.31E-15 1.22E-15 1.14E-15
8.2 1.07E-15 9.99E-16 9.35E-16 8.74E-16 8.18E-16 7.65E-16 7.16E-16 6.69E-16 6.26E-16 5.86E-16
8.3 5.48E-16 5.12E-16 4.79E-16 4.48E-16 4.19E-16 3.92E-16 3.67E-16 3.43E-16 3.21E-16 3.00E-16
8.4 2.81E-16 2.62E-16 2.45E-16 2.30E-16 2.15E-16 2.01E-16 1.88E-16 1.76E-16 1.64E-16 1.54E-16
8.5 1.44E-16 1.34E-16 1.26E-16 1.17E-16 1.10E-16 1.03E-16 9.60E-17 8.98E-17 8.40E-17 7.85E-17
8.6 7.34E-17 6.87E-17 6.42E-17 6.00E-17 5.61E-17 5.25E-17 4.91E-17 4.59E-17 4.29E-17 4.01E-17
8.7 3.75E-17 3.51E-17 3.28E-17 3.07E-17 2.87E-17 2.68E-17 2.51E-17 2.35E-17 2.19E-17 2.05E-17
8.8 1.92E-17 1.79E-17 1.68E-17 1.57E-17 1.47E-17 1.37E-17 1.28E-17 1.20E-17 1.12E-17 1.05E-17
8.9 9.79E-18 9.16E-18 8.56E-18 8.00E-18 7.48E-18 7.00E-18 6.54E-18 6.12E-18 5.72E-18 5.35E-18
9.0 5.00E-18 4.68E-18 4.37E-18 4.09E-18 3.82E-18 3.57E-18 3.34E-18 3.13E-18 2.92E-18 2.73E-18
9.1 2.56E-18 2.39E-18 2.23E-18 2.09E-18 1.95E-18 1.83E-18 1.71E-18 1.60E-18 1.49E-18 1.40E-18
9.2 1.31E-18 1.22E-18 1.14E-18 1.07E-18 9.98E-19 9.33E-19 8.73E-19 8.16E-19 7.63E-19 7.14E-19
9.3 6.67E-19 6.24E-19 5.83E-19 5.46E-19 5.10E-19 4.77E-19 4.46E-19 4.17E-19 3.90E-19 3.65E-19
9.4 3.41E-19 3.19E-19 2.98E-19 2.79E-19 2.61E-19 2.44E-19 2.28E-19 2.14E-19 2.00E-19 1.87E-19
9.5 1.75E-19 1.63E-19 1.53E-19 1.43E-19 1.34E-19 1.25E-19 1.17E-19 1.09E-19 1.02E-19 9.56E-20
9.6 8.94E-20 8.37E-20 7.82E-20 7.32E-20 6.85E-20 6.40E-20 5.99E-20 5.60E-20 5.24E-20 4.90E-20
9.7 4.58E-20 4.29E-20 4.01E-20 3.75E-20 3.51E-20 3.28E-20 3.07E-20 2.87E-20 2.69E-20 2.52E-20
9.8 2.35E-20 2.20E-20 2.06E-20 1.93E-20 1.80E-20 1.69E-20 1.58E-20 1.48E-20 1.38E-20 1.29E-20
9.9 1.21E-20 1.13E-20 1.06E-20 9.90E-21 9.26E-21 8.67E-21 8.11E-21 7.59E-21 7.10E-21 6.64E-21
10.0 6.22E-21 5.82E-21 5.44E-21 5.09E-21 4.77E-21 4.46E-21 4.17E-21 3.91E-21 3.66E-21 3.42E-21
Six Sigma
Problem Solving Processes
Define
  A. Identify Project CTQs. Focus: Y. Tools: VOC; Process Map; CAP. Deliverable: Project CTQs (1).
  B. Develop Team Charter. Focus: Project. Tools: CAP. Deliverable: Approved Charter (2).
  C. Define Process Map. Focus: Y=f(x). Tools: Process Map. Deliverable: High Level Process Map (3).
Measure
  1. Select CTQ Characteristics. Focus: Y. Tools: VOC; QFD; FMEA. Deliverable: Project Y (4).
  2. Define Performance Standards. Focus: Y. Tools: VOC; Blueprints. Deliverable: Performance Standard for Project Y (5).
  3. Measurement System Analysis. Focus: Y & X. Tools: Continuous Gage R&R; Test/Retest; Attribute R&R. Deliverables: Data Collection Plan & MSA (6), Data for Project Y (7).
Analyze
  4. Establish Process Capability. Focus: Y. Tools: Capability Indices. Deliverable: Process Capability for Project Y (8).
  5. Define Performance Objectives. Focus: Y. Tools: Team; Benchmarking. Deliverable: Improvement Goal for Project Y (9).
  6. Identify Variation Sources. Focus: X. Tools: Process Analysis; Graphical Analysis; Hypothesis Tests. Deliverable: Prioritized List of all Xs (10).
Improve
  7. Screen Potential Causes. Focus: X. Tools: DOE-Screening. Deliverable: List of Vital Few Xs (11).
  8. Discover Variable Relationships. Focus: X. Tools: Factorial Designs. Deliverable: Proposed Solution (13).
  9. Establish Operating Tolerances. Focus: X. Tools: Simulation. Deliverable: Piloted Solution (14).
Control
  10. Define & Validate Measurement System on Xs in Actual Application. Focus: X, Y. Tools: Continuous Gage R&R; Test/Retest; Attribute R&R. Deliverable: MSA.
  11. Determine New Process Capability. Focus: X, Y. Tools: Capability Indices. Deliverable: Process Capability Y, X.
  12. Implement Process Control. Focus: X. Tools: Control Charts; Mistake Proofing; FMEA. Deliverables: Sustained Solution (15), Project Documentation (16).
Six Sigma Toolkit - Index
t-Distribution
(columns give 1-α; rows give degrees of freedom)
df .600 .700 .800 .900 .950 .975 .990 .995
1 0.325 0.727 1.376 3.078 6.314 12.706 31.821 63.657
2 0.289 0.617 1.061 1.886 2.920 4.303 6.965 9.925
3 0.277 0.584 0.978 1.638 2.353 3.182 4.541 5.841
4 0.271 0.569 0.941 1.533 2.132 2.776 3.747 4.604
5 0.267 0.559 0.920 1.476 2.015 2.571 3.365 4.032
6 0.265 0.553 0.906 1.440 1.943 2.447 3.143 3.707
7 0.263 0.549 0.896 1.415 1.895 2.365 2.998 3.499
8 0.262 0.546 0.889 1.397 1.860 2.306 2.896 3.355
9 0.261 0.543 0.883 1.383 1.833 2.262 2.821 3.250
10 0.260 0.542 0.879 1.372 1.812 2.228 2.764 3.169
11 0.260 0.540 0.876 1.363 1.796 2.201 2.718 3.106
12 0.259 0.539 0.873 1.356 1.782 2.179 2.681 3.055
13 0.259 0.538 0.870 1.350 1.771 2.160 2.650 3.012
14 0.258 0.537 0.868 1.345 1.761 2.145 2.624 2.977
15 0.258 0.536 0.866 1.341 1.753 2.131 2.602 2.947
16 0.258 0.535 0.865 1.337 1.746 2.120 2.583 2.921
17 0.257 0.534 0.863 1.333 1.740 2.110 2.567 2.898
18 0.257 0.534 0.862 1.330 1.734 2.101 2.552 2.878
19 0.257 0.533 0.861 1.328 1.729 2.093 2.539 2.861
20 0.257 0.533 0.860 1.325 1.725 2.086 2.528 2.845
21 0.257 0.532 0.859 1.323 1.721 2.080 2.518 2.831
22 0.256 0.532 0.858 1.321 1.717 2.074 2.508 2.819
23 0.256 0.532 0.858 1.319 1.714 2.069 2.500 2.807
24 0.256 0.531 0.857 1.318 1.711 2.064 2.492 2.797
25 0.256 0.531 0.856 1.316 1.708 2.060 2.485 2.787
26 0.256 0.531 0.856 1.315 1.706 2.056 2.479 2.779
27 0.256 0.531 0.855 1.314 1.703 2.052 2.473 2.771
28 0.256 0.530 0.855 1.313 1.701 2.048 2.467 2.763
29 0.256 0.530 0.854 1.311 1.699 2.045 2.462 2.756
30 0.256 0.530 0.854 1.310 1.697 2.042 2.457 2.750
40 0.255 0.529 0.851 1.303 1.684 2.021 2.423 2.704
60 0.254 0.527 0.848 1.296 1.671 2.000 2.390 2.660
120 0.254 0.526 0.845 1.289 1.658 1.980 2.358 2.617
∞ 0.253 0.524 0.842 1.282 1.645 1.960 2.326 2.576
69
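Critical values like those tabulated above can be checked numerically. The following is a minimal, stdlib-only sketch (the function names and step count are illustrative, not part of the Toolkit) that integrates the t density with Simpson's rule and confirms a table entry:

```python
import math

def t_density(x, df):
    """Density of Student's t with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=10000):
    """P(T <= t) for t >= 0, via Simpson's rule on [0, t] plus symmetry about 0."""
    h = t / steps  # steps is even, as composite Simpson requires
    total = t_density(0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 0.5 + total * h / 3

# The table gives 2.228 as the .975 entry for df = 10:
print(round(t_cdf(2.228, 10), 3))  # ≈ 0.975
```

The same check works for any cell; for example, the df = 1 row's .950 entry, 6.314, returns a CDF of about 0.95.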
Index
Analysis and Improve Tools Selection Matrix 26
ANOVA
  ANOVA / ANOVA One Way 41
  ANOVA Two Way 42
  ANOVA - Balanced 43
  Interpreting the ANOVA Output 44
Calculating Sample Size (Equation for manual Calculation) 28
Characterizing the Process - Rational Subgrouping 16
Control Chart Constants 59
Control Charts 57-58
Data Validity Studies/% Agreement on Binary (Pass / Fail) Data 10
Defining a Six Sigma Project 4
Definition of Z 8
Design for Six Sigma
  Loop Diagrams 50
  Tolerancing Analysis 51-52
Discrete Data Analysis 35
DOE
  Design of Experiments 53
  Factorial Designs 54
  DOE Analysis 55
DPU / DPO 13
Gage R & R 11 - 12
General Linear Model 45
Hypothesis Statements 30-31
Hypothesis Testing 29
Minitab Graphics
  Histogram / Scatter Plot 20
  Descriptive Statistics / Normal Plot 21
  One Variable Regression / Residual Plots 22
  Boxplot / Interval Plot 23
  Time Series Plot / Box-Cox Transformation 24
  Pareto Diagrams / Cause & Effect Diagrams 25
Normal Approximation 36
χ2 Test (Test for Independence) 37-38
Poisson Approximation 39
Normality of Data 17
Planning Questions 19
Practical Problem Statement 5
Precontrol 60
Project Closure 61
Regression Analysis
  Regression 46
  Stepwise Regression 47
  Regression with Curves (Quadratic) and Interactions 48
  Binary Logistic Regression 49
Response Surface - CCD 56
Rolled Throughput Yield 14
Sample Size Determination 27
Seven Basic Tools 6
Six Sigma Problem Solving Processes 3
Six Sigma Process Report 18
Six Sigma Product Report 15
Stable Ops and 6 Sigma 9
t Test (Testing Means) (1 Sample t; 2 Sample t; Confidence Intervals) 33-34
Tables
  Determining Sample Size 62
  F Test 63-64
  χ2 Test 65-66
  Normal Distribution 67-68
  t Test 69
Testing Equality of Variance (F test; Homogeneity of Variance) 32
The Normal Curve 7
The Transfer Function 40
The material in this Toolkit is a combination of material
developed by the GEA Master Black Belts and Dr. Mikel Harry
(The Six Sigma Academy, Inc.). Worksheets, statistical tables
and graphics are outputs of MINITAB for Windows Version
12.2, Copyright 1998, Minitab, Inc. It is intended for use as a
quick reference for trained Black Belts and Green Belts.
More detailed information is available from the Quality Coach
Website SSQC.ge.com.
If you need more GEA Six Sigma Information, visit the GE
Appliances Six Sigma Website at
http://genet.appl.ge.com/sixsigma
For information on GE Corporate Certification Testing, go to
the Green Belt Training Site via the GE Appliances Six Sigma
Website.
For information about other GE Appliances Six Sigma
Training, contact a member of the GEA Six Sigma Training
Team
Jeff Keller - Ext 7649
Email: jeff.keller@appl.ge.com
Irene Ligon - Ext 4562
Email: Irene.ligon@appl.ge.com
Broadcast Group eMail:
GEASixSigmaTrainingTeam@Exchange.appl.ge.com
The Toolkit - A Six Sigma Resource
GLOSSARY OF SIX SIGMA TERMS
1. α - Alpha risk - Probability of falsely accepting the alternative hypothesis (HA) of difference
2. ANOVA - Analysis of Variance
3. β - Beta risk - Probability of falsely accepting the null hypothesis (H0) of no difference
4. χ2 - Tests for independent relationship between two discrete variables
5. δ - Difference between two means
6. DOE - Design of Experiments
7. DPU - Defects per unit
8. e^-DPU - Rolled throughput yield
9. F-Test - Used to compare the variances of two distributions
10. g - Number of subgroups
11. FIT - The point estimate of the mean response for each level of the independent variable
12. H0 - Null hypothesis
13. HA - Alternative hypothesis
14. LSL - Lower spec limit
15. μ - Population mean
16. X̄ - Sample mean
17. n - Number of samples in a subgroup
18. N - Number in the total population
19. P Value - If the calculated value of p is lower than the alpha (α) risk, then reject the null hypothesis and conclude that there is a difference. Often referred to as the observed level of significance.
20. Residual - The difference between the observed values and the Fit; the error in the model
21. σ - Population standard deviation
22. Σ - Summation
23. s - Sample standard deviation
24. Stratify - Divide or arrange data in organized classes or segments, based on known characteristics or factors
25. SS - Sum of squares
26. t-Test - Used to compare the means of two distributions
27. Transfer Function - Prediction Equation - Y = f(x)
28. USL - Upper spec limit
29. X̄ - Mean
30. X̿ - Mean of the means
31. Z - Transforms a set of data such that μ = 0 and σ = 1
32. ZLT - Z long term
33. ZST - Z short term
34. ZSHIFT = ZST - ZLT
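Two of the quantities defined above can be sketched in a few lines of Python. The data values here are hypothetical, chosen only to show the arithmetic:

```python
import math
from statistics import mean, pstdev

# Rolled throughput yield from defects per unit (glossary: e^-DPU)
dpu = 25 / 100          # hypothetical: 25 defects found across 100 units
rty = math.exp(-dpu)    # expected fraction of units with zero defects
print(round(rty, 4))    # ≈ 0.7788

# Z transform (glossary: Z): rescale data so the mean is 0 and stdev is 1
data = [4.0, 5.0, 6.0, 8.0]          # hypothetical measurements
m, s = mean(data), pstdev(data)
z = [(x - m) / s for x in data]      # standardized values
```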
Revision 4.5 - September 2001
GE Appliances Copyright 2001
GE Appliances
Six Sigma Toolkit
Rev 4.5 9/2001
GE Appliances Proprietary