
SIX SIGMA

Six Sigma is a business improvement methodology. Its main objective is to implement a rigorous process to systematically eliminate defects and inefficiency. It was originally developed by Motorola in the early 1980s and, because of its effectiveness, has become extremely popular in many corporate and small-business environments around the world.

Six Sigma's main objective is to deliver high performance, value and reliability to the customer. It is regarded and used around the world as one of the major themes of TQM (Total Quality Management).

Six Sigma was developed by Bill Smith at Motorola in the early 1980s. It was originally designed as a way to measure defects and to improve overall quality. A central claim of Six Sigma is that, by applying the methodology, a process can lower its defect rate to 3.4 DPMO (defects per million opportunities). This figure corresponds to a process whose specification limits sit six standard deviations (sigma) from the process centerline, so that the specification range spans twelve sigma; the conventional 3.4 DPMO value also allows for a long-term shift of 1.5 sigma in the process mean. (The name "Six Sigma" comes from this technical term used in statistics.)
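To make the arithmetic behind that figure concrete, here is a minimal Python sketch (assuming SciPy is installed) that converts a sigma level into DPMO using the conventional 1.5-sigma long-term shift; the helper function and the numbers are illustrative, not part of the original methodology documents.

```python
# Illustrative sketch: converting a sigma level into DPMO.
# Assumes the conventional 1.5-sigma long-term shift used in Six Sigma tables.
from scipy.stats import norm

def dpmo(sigma_level, shift=1.5):
    """Defects per million opportunities for a one-sided specification limit
    sitting `sigma_level` short-term standard deviations from the mean."""
    return norm.sf(sigma_level - shift) * 1_000_000

print(round(dpmo(6.0), 1))  # about 3.4 DPMO at six sigma
print(round(dpmo(3.0)))     # about 66807 DPMO at three sigma
```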

While originally developed for quality control, Six Sigma is used in many different ways, such
as improving communications with customers, employees and shareholders and improving the
total process of interaction, communication and product design.

It should be noted that the term "Six Sigma" is a registered trademark, owned by Motorola.
According to Motorola, this methodology has saved the company over 17 billion dollars from its
inception to 2006.
The Six Sigma Methodology
Six Sigma includes two key methodologies: DMAIC and DMADV. DMAIC is used for an existing process. DMADV is used when creating a new product or process. Using DMADV for new projects usually results in a more predictable process and, ultimately, a higher-quality product.

DMAIC
There are 5 important steps included in DMAIC. They are:

• D - Define goals for improving the overall process, aligning your company strategy with your customers' demands (this can also refer to your group and the groups or individuals that you support).
• M - Measure your current processes. Collect relevant data on your current processes and then use this data as a baseline for future comparisons.
• A - Analyze the relationships within the process. It is important to understand these relationships in order to determine the factors that keep your company's strategy in line with your customers' demands.
• I - Improve the process. It is important to constantly improve and optimize the process, using analysis and other techniques. One technique that is often used is Design of Experiments. (This is a technique for testing a hypothesis using sound experimental design.)
• C - Control. It is important to ensure that you can control and correct any variances, avoiding potentially costly defects and loss of quality. Pilot runs are often set up to study process capability and production transition. These pilot runs can help fine-tune the process or add additional control mechanisms.
DMADV
There are 5 important steps included in DMADV. They are:

• D - Define goals that are consistent with your business strategy and customer demands.
• M - Measure CTQs (critical-to-quality characteristics). CTQs cover the production process, the capabilities for producing the product, the capability of the product itself, and any risk assessments.
• A - Analyze and evaluate many different designs, choosing the best design for its overall qualities.
• D - Design details. It is important not only to design a product but to optimize its design features. In order to fully optimize a design feature, you may be required to create multiple designs or simulations.
• V - Verify the design. Important steps in verifying a design include setting up pilot runs and running a short production. This step also requires you to hand over the design to the process owners.
Statistics
Statistics is at the core of the Six Sigma methodology. Six Sigma focuses on using data to solve problems and to create systematic approaches to lowering deficiencies. Because data is at the core of the Six Sigma methodology, statistical analysis and tools are commonly used. It is important to note that while the Six Sigma methodology is data-driven at its core, rudimentary statistical tools and analysis are usually sufficient.

Implementation of Roles in Six Sigma Methodology


There are many roles that are used in the Six Sigma methodology. While most of the roles below appear in many organizations' Six Sigma implementations, it should be noted that they are not universal. The roles include:

Executive Leadership - Top level executives are responsible for vision and ultimately
implementation of the Six Sigma Methodology. They also empower others to take initiative and
ownership of the Six Sigma principles.

Champions - Champions are usually members of upper management who are responsible for the implementation of Six Sigma throughout their organization.

Master Black Belts - are usually hand-picked by Champions to coach others within the organization on the Six Sigma methodologies. They allocate all or most of their time to Six Sigma. They also usually have mentoring responsibilities, coaching and training the lower roles, including Black Belts and Green Belts (see below).

Experts - while this role does not exist in every organization, it can be very influential in major engineering or manufacturing sectors. Experts improve overall services, products, and processes for their end customers.

Black Belts - Black Belts focus on Six Sigma execution. They are usually middle managers.

Green Belts - These roles are usually taken on by employees who help Black Belts execute specific projects alongside their other job responsibilities.

Downsides of the Six Sigma Methodology


For the vast majority of organizations, the Six Sigma methodology has helped them stay competitive and reduce costs; however, some downsides do exist.

In order to implement the Six Sigma methodology in an organization, it is extremely important to have buy-in from employees at all levels. If associates, middle managers or high-level executives are not enthusiastic about using the Six Sigma methodology, it can ultimately lead to failure.

Another downside of using Six Sigma is that, in some instances, its effectiveness has never been measured or cannot be measured. Without such measurements, it is unclear whether Six Sigma is actually helpful.

Finally, many organizations use the Six Sigma methodology as a way of protecting themselves from liability. For instance, if a company produces a product that is low in quality or can harm its users, the organization can defend itself by pointing out that quality is at the forefront of its operations. In this respect, it is unclear whether an organization has implemented Six Sigma for its methodology or to cover its liability.

Historical overview
Six Sigma originated as a set of practices designed to improve manufacturing processes and
eliminate defects, but its application was subsequently extended to other types of business
processes as well.[4] In Six Sigma, a defect is defined as any process output that does not meet
customer specifications, or that could lead to creating an output that does not meet customer
specifications.[3]
Bill Smith first formulated the particulars of the methodology at Motorola in 1986.[1] Six Sigma
was heavily inspired by six preceding decades of quality improvement methodologies such as
quality control, TQM, and Zero Defects,[5][6] based on the work of pioneers such as Shewhart,
Deming, Juran, Ishikawa, Taguchi and others.
Like its predecessors, Six Sigma doctrine asserts that:
• Continuous efforts to achieve stable and predictable process results (i.e., reduce process
variation) are of vital importance to business success.
• Manufacturing and business processes have characteristics that can be measured,
analyzed, improved and controlled.
• Achieving sustained quality improvement requires commitment from the entire
organization, particularly from top-level management.
Features that set Six Sigma apart from previous quality improvement initiatives include:
• A clear focus on achieving measurable and quantifiable financial returns from any Six
Sigma project.[3]
• An increased emphasis on strong and passionate management leadership and support.[3]
• A special infrastructure of "Champions," "Master Black Belts," "Black Belts," "Green
Belts", etc. to lead and implement the Six Sigma approach.[3]
• A clear commitment to making decisions on the basis of verifiable data, rather than
assumptions and guesswork.[3]
The term "Six Sigma" comes from a field of statistics known as process capability studies.
Originally, it referred to the ability of manufacturing processes to produce a very high proportion
of output within specification. Processes that operate with "six sigma quality" over the short term
are assumed to produce long-term defect levels below 3.4 defects per million opportunities
(DPMO).[7][8] Six Sigma's implicit goal is to improve all processes to that level of quality or
better.
Six Sigma is a registered service mark and trademark of Motorola Inc.[9] As of 2006, Motorola reported over US$17 billion in savings[10] from Six Sigma.
Other early adopters of Six Sigma who achieved well-publicized success include Honeywell
(previously known as AlliedSignal) and General Electric, where Jack Welch introduced the
method.[11] By the late 1990s, about two-thirds of the Fortune 500 organizations had begun Six
Sigma initiatives with the aim of reducing costs and improving quality.[12]
In recent years, some practitioners have combined Six Sigma ideas with lean manufacturing to yield a methodology named Lean Six Sigma.
Quality management tools and methods used in Six Sigma
Within the individual phases of a DMAIC or DMADV project, Six Sigma utilizes many
established quality-management tools that are also used outside of Six Sigma. The following
table shows an overview of the main methods used.
• 5 Whys
• Analysis of variance
• ANOVA Gauge R&R
• Axiomatic design
• Business Process Mapping
• Cause & effects diagram (also known as fishbone or Ishikawa diagram)
• Chi-square test of independence and fits
• Control chart
• Correlation
• Cost-benefit analysis
• CTQ tree
• Design of experiments
• Failure mode and effects analysis (FMEA)
• General linear model
• Histograms
• Quality Function Deployment (QFD)
• Pareto chart
• Pick chart
• Process capability
• Quantitative marketing research through use of Enterprise Feedback Management (EFM) systems
• Regression analysis
• Root cause analysis
• Run charts
• SIPOC analysis (Suppliers, Inputs, Process, Outputs, Customers)
• Taguchi methods
• Taguchi Loss Function
• TRIZ
Now let me give an explanation of each one of them:

*5 Whys

The 5 Whys is a question-asking method used to explore the cause-and-effect relationships underlying a particular problem. Ultimately, the goal of applying the 5 Whys method is to determine the root cause of a defect or problem.

Example
The following example demonstrates the basic process:
• My car will not start. (the problem)
1. Why? - The battery is dead. (first why)
2. Why? - The alternator is not functioning. (second why)
3. Why? - The alternator belt has broken. (third why)
4. Why? - The alternator belt was well beyond its useful service life and has never been
replaced. (fourth why)
5. Why? - I have not been maintaining my car according to the recommended service
schedule. (fifth why, a root cause)
6. Why? - Replacement parts are not available because of the extreme age of my vehicle.
(sixth why, optional footnote)
• I will start maintaining my car according to the recommended service schedule. (solution)
The questioning for this example could be taken further to a sixth, seventh, or even greater level.
This would be legitimate, as the "five" in 5 Whys is not gospel; rather, it is postulated that five
iterations of asking why is generally sufficient to get to a root cause. The real key is to encourage
the troubleshooter to avoid assumptions and logic traps and instead to trace the chain of causality
in direct increments from the effect through any layers of abstraction to a root cause that still has
some connection to the original problem. Note that in this example the fifth why suggests a
broken process or an alterable behavior, which is typical of reaching the root-cause level.
It is interesting to note that the last answer points to a process. This is actually one of the most important aspects of the 5 Whys approach: the real root cause should point toward a process that is either not working well or does not exist.
History
The technique was originally developed by Sakichi Toyoda and was later used within Toyota
Motor Corporation during the evolution of their manufacturing methodologies. It is a critical
component of problem solving training delivered as part of the induction into the Toyota
Production System. The architect of the Toyota Production System, Taiichi Ohno, described the
5 whys method as "the basis of Toyota's scientific approach . . . by repeating why five times, the
nature of the problem as well as its solution becomes clear."[1] The tool has seen widespread use
beyond Toyota, and is now used within Kaizen, lean manufacturing, and Six Sigma.
Criticism
While the 5 Whys is a powerful tool for engineers or technically savvy individuals to help get to the true causes of problems, it has been criticized by Teruyuki Minoura, former managing director of global purchasing for Toyota, as being too basic a tool to analyze root causes to the depth needed to ensure that the causes are fixed. Reasons for this criticism include:
• Tendency for investigators to stop at symptoms rather than going on to lower level root
causes.
• Inability to go beyond the investigator's current knowledge - the investigator cannot find causes that they do not already know about
• Lack of support to help the investigator to ask the right "why" questions.
• Results aren't repeatable - different people using 5 Whys come up with different causes
for the same problem.
• The tendency to isolate a single root cause, whereas each question could elicit many
different root causes
These can be significant problems when the method is applied through deduction only. On-the-
spot verification of the answer to the current "why" question, before proceeding to the next, is
recommended as a good practice to avoid these issues.
*Analysis of variance
In statistics, analysis of variance (ANOVA) is a collection of statistical models, and their associated procedures, in which the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether or not the means of several groups are all equal, and therefore generalizes the t-test to more than two groups. ANOVAs are helpful because they possess an advantage over multiple two-sample t-tests: doing multiple two-sample t-tests would result in an increased chance of committing a Type I error. For this reason, ANOVAs are useful for comparing three or more means.
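To illustrate the Type I error point, the following hypothetical Python simulation (using NumPy and SciPy, with invented normally distributed groups) compares the false-positive rate of running all pairwise t-tests against a single one-way ANOVA when all three group means are actually equal.

```python
# Hypothetical simulation: three groups with identical means, alpha = 0.05.
# Running all pairwise t-tests inflates the chance of at least one false positive,
# whereas a single one-way ANOVA keeps it near the nominal 5%.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n_sims, n = 0.05, 5000, 20

t_false, f_false = 0, 0
for _ in range(n_sims):
    a, b, c = (rng.normal(0, 1, n) for _ in range(3))
    pairwise_p = [stats.ttest_ind(x, y).pvalue for x, y in [(a, b), (a, c), (b, c)]]
    t_false += any(p < alpha for p in pairwise_p)       # any pairwise rejection
    f_false += stats.f_oneway(a, b, c).pvalue < alpha   # single ANOVA rejection

print("pairwise t-tests:", t_false / n_sims)   # roughly 0.10-0.13
print("one-way ANOVA:  ", f_false / n_sims)    # roughly 0.05
```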

Models
Fixed-effects models (Model 1)
The fixed-effects model of analysis of variance applies to situations in which the experimenter
applies one or more treatments to the subjects of the experiment to see if the response variable
values change. This allows the experimenter to estimate the ranges of response variable values
that the treatment would generate in the population as a whole.
Random-effects models (Model 2)
Main article: Random effects model

Random effects models are used when the treatments are not fixed. This occurs when the various
factor levels are sampled from a larger population. Because the levels themselves are random
variables, some assumptions and the method of contrasting the treatments differ from ANOVA
model 1.
Most random-effects or mixed-effects models are not concerned with making inferences
concerning the particular sampled factors. For example, consider a large manufacturing plant in
which many machines produce the same product. The statistician studying this plant would have
very little interest in comparing the three particular machines to each other. Rather, inferences
that can be made for all machines are of interest, such as their variability and the mean.
However, if one is interested in the realized value of the random effect best linear unbiased
prediction can be used to obtain a "prediction" for the value.
Assumptions of ANOVA
There are several approaches to the analysis of variance. However, all approaches use a linear
model that relates the response to the treatments and blocks. Even when the statistical model is
nonlinear, it can be approximated by a linear model for which an analysis of variance may be
appropriate.
A model often presented in textbooks
Many textbooks present the analysis of variance in terms of a linear model, which makes the
following assumptions about the probability distribution of the responses:
• Independence of cases – this is an assumption of the model that simplifies the statistical
analysis.
• Normality – the distributions of the residuals are normal.
• Equality (or "homogeneity") of variances, called homoscedasticity — the variance of data
in groups should be the same. Model-based approaches usually assume that the variance
is constant. The constant-variance property also appears in the randomization (design-
based) analysis of randomized experiments, where it is a necessary consequence of the
randomized design and the assumption of unit treatment additivity (Hinkelmann and
Kempthorne): If the responses of a randomized balanced experiment fail to have constant
variance, then the assumption of unit treatment additivity is necessarily violated.
To test the hypothesis that all treatments have exactly the same effect, the F-test's p-values
closely approximate the permutation test's p-values: The approximation is particularly close
when the design is balanced.[1] Such permutation tests characterize tests with maximum power
against all alternative hypotheses, as observed by Rosenbaum.[2] The anova F–test (of the null-
hypothesis that all treatments have exactly the same effect) is recommended as a practical test,
because of its robustness against many alternative distributions.[3][4] The Kruskal–Wallis test is a
nonparametric alternative that does not rely on an assumption of normality. And the Friedman
test is the nonparametric alternative for one-way repeated measures ANOVA.
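As a small, hedged illustration of those nonparametric alternatives, the snippet below applies SciPy's Kruskal-Wallis and Friedman tests; the group values are made up purely for demonstration.

```python
# Illustrative only: invented measurements for three groups / three repeated conditions.
from scipy import stats

group_a = [12.1, 11.8, 12.5, 12.0, 11.9]
group_b = [12.4, 12.6, 12.3, 12.8, 12.5]
group_c = [11.7, 11.9, 11.6, 12.0, 11.8]

# Kruskal-Wallis: nonparametric analogue of one-way ANOVA (independent groups).
h, p_kw = stats.kruskal(group_a, group_b, group_c)

# Friedman: nonparametric analogue of one-way repeated-measures ANOVA
# (here the three lists are treated as three conditions on the same five subjects).
chi2_stat, p_fr = stats.friedmanchisquare(group_a, group_b, group_c)

print(f"Kruskal-Wallis H = {h:.2f}, p = {p_kw:.3f}")
print(f"Friedman chi2 = {chi2_stat:.2f}, p = {p_fr:.3f}")
```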
The separate assumptions of the textbook model imply that the errors are independently, identically, and normally distributed for fixed-effects models; that is, the errors are independent and identically distributed normal random variables with zero mean and constant variance.

Randomization-based analysis

In a randomized controlled experiment, the treatments are randomly assigned to experimental units, following the experimental protocol. This randomization is objective and declared before the experiment is carried out. The objective random assignment is used to test the significance of the null hypothesis, following the ideas of C. S. Peirce and Ronald A. Fisher. This design-based analysis was discussed and developed by Francis J. Anscombe at Rothamsted Experimental Station and by Oscar Kempthorne at Iowa State University. Kempthorne and his students make an assumption of unit treatment additivity, which is discussed in the books of Kempthorne and David R. Cox.
Unit-treatment additivity
In its simplest form, the assumption of unit-treatment additivity states that the observed response y_ij from experimental unit i when receiving treatment j can be written as the sum of the unit's response y_i and the treatment effect t_j, that is

y_ij = y_i + t_j.[6]

The assumption of unit-treatment additivity implies that, for every treatment j, the jth treatment has exactly the same effect t_j on every experimental unit.
The assumption of unit treatment additivity usually cannot be directly falsified, according to Cox
and Kempthorne. However, many consequences of treatment-unit additivity can be falsified. For
a randomized experiment, the assumption of unit-treatment additivity implies that the variance is
constant for all treatments. Therefore, by contraposition, a necessary condition for unit-treatment
additivity is that the variance is constant.
The property of unit-treatment additivity is not invariant under a "change of scale", so
statisticians often use transformations to achieve unit-treatment additivity. If the response
variable is expected to follow a parametric family of probability distributions, then the
statistician may specify (in the protocol for the experiment or observational study) that the
responses be transformed to stabilize the variance.[7] Also, a statistician may specify that
logarithmic transforms be applied to the responses, which are believed to follow a multiplicative
model.[8]
The assumption of unit-treatment additivity was enunciated in experimental design by
Kempthorne and Cox. Kempthorne's use of unit treatment additivity and randomization is similar
to the design-based inference that is standard in finite-population survey sampling.
Derived linear model
Kempthorne uses the randomization-distribution and the assumption of unit treatment additivity
to produce a derived linear model, very similar to the textbook model discussed previously.
The test statistics of this derived linear model are closely approximated by the test statistics of an
appropriate normal linear model, according to approximation theorems and simulation studies by
Kempthorne and his students (Hinkelmann and Kempthorne). However, there are differences.
For example, the randomization-based analysis results in a small but (strictly) negative
correlation between the observations (Hinkelmann and Kempthorne, volume one, chapter 7;
Bailey chapter 1.14).[9] In the randomization-based analysis, there is no assumption of a normal
distribution and certainly no assumption of independence. On the contrary, the observations are
dependent!
The randomization-based analysis has the disadvantage that its exposition involves tedious
algebra and extensive time. Since the randomization-based analysis is complicated and is closely
approximated by the approach using a normal linear model, most teachers emphasize the normal
linear model approach. Few statisticians object to model-based analysis of balanced randomized
experiments.
Statistical models for observational data
However, when applied to data from non-randomized experiments or observational studies,
model-based analysis lacks the warrant of randomization. For observational data, the derivation
of confidence intervals must use subjective models, as emphasized by Ronald A. Fisher and his
followers. In practice, the estimates of treatment effects from observational studies are often inconsistent. In practice, "statistical models" and observational data are useful for suggesting hypotheses that should be treated very cautiously by the public.[10]
Logic of ANOVA
Partitioning of the sum of squares
The fundamental technique is a partitioning of the total sum of squares (abbreviated SS) into components related to the effects used in the model. For example, for a simplified ANOVA with one type of treatment at different levels, the total sum of squares is split into a treatment component and an error component:

SS_Total = SS_Treatments + SS_Error.

The number of degrees of freedom (abbreviated df) can be partitioned in a similar way (df_Total = df_Treatments + df_Error), and each component specifies the chi-square distribution that describes the associated sum of squares.

See also Lack-of-fit sum of squares.


The F-test
Main article: F-test

The F-test is used for comparisons of the components of the total deviation. For example, in one-way or single-factor ANOVA, statistical significance is tested for by comparing the F test statistic

F = (between-group variability) / (within-group variability)
  = [SS_Treatments / (I − 1)] / [SS_Error / (n_T − I)]

where

I = number of treatments

and

n_T = total number of cases

to the F-distribution with I − 1, n_T − I degrees of freedom. Using the F-distribution is a natural candidate because the test statistic is the quotient of two mean sums of squares, each of which follows a (scaled) chi-square distribution.
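A minimal sketch of this computation, using three invented samples, derives the F statistic by hand from the sums of squares and checks it against SciPy's f_oneway; the data and variable names are illustrative assumptions.

```python
# Illustrative one-way ANOVA on invented data: I = 3 treatments, n_T = 15 cases.
import numpy as np
from scipy import stats

groups = [np.array([6.0, 8.0, 4.0, 5.0, 3.0]),
          np.array([8.0, 12.0, 9.0, 11.0, 6.0]),
          np.array([13.0, 9.0, 11.0, 8.0, 7.0])]

all_obs = np.concatenate(groups)
grand_mean = all_obs.mean()
I, n_T = len(groups), all_obs.size

ss_treat = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)

f_manual = (ss_treat / (I - 1)) / (ss_error / (n_T - I))
p_manual = stats.f.sf(f_manual, I - 1, n_T - I)

f_scipy, p_scipy = stats.f_oneway(*groups)
print(f"manual: F = {f_manual:.3f}, p = {p_manual:.4f}")
print(f"scipy : F = {f_scipy:.3f}, p = {p_scipy:.4f}")  # should match the manual values
```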
Power analysis
Power analysis is often applied in the context of ANOVA in order to assess the probability of
successfully rejecting the null hypothesis if we assume a certain ANOVA design, effect size in
the population, sample size and alpha level. Power analysis can assist in study design by
determining what sample size would be required in order to have a reasonable chance of
rejecting the null hypothesis when the alternative hypothesis is true.
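As a sketch of such a calculation (assuming the statsmodels package is available and Cohen's f is used as the effect-size measure; the numbers are illustrative), the snippet below solves for the total sample size needed to reach 80% power and, conversely, the power achieved by a fixed sample.

```python
# Hypothetical power calculation for a one-way ANOVA with 3 groups,
# medium effect size f = 0.25, alpha = 0.05, target power = 0.80.
from statsmodels.stats.power import FTestAnovaPower

analysis = FTestAnovaPower()
total_n = analysis.solve_power(effect_size=0.25, alpha=0.05,
                               power=0.80, k_groups=3)
print(f"total sample size required: about {total_n:.0f} observations")

# Conversely, the power achieved with 30 observations per group (90 total):
power = analysis.power(effect_size=0.25, nobs=90, alpha=0.05, k_groups=3)
print(f"power with 90 observations: {power:.2f}")
```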
Effect size
Several standardized measures of effect gauge the strength of the association between a predictor
(or set of predictors) and the dependent variable. Effect-size estimates facilitate the comparison
of findings in studies and across disciplines. Common effect size estimates reported in
univariate-response anova and multivariate-response manova include the following: eta-squared,
partial eta-squared, omega, and intercorrelation.
η2 ( eta-squared ): Eta-squared describes the ratio of variance explained in the dependent
variable by a predictor while controlling for other predictors. Eta-squared is a biased estimator of
the variance explained by the model in the population (it estimates only the effect size in the
sample). On average it overestimates the variance explained in the population. As the sample
size gets larger the amount of bias gets smaller.

Partial η2 (Partial eta-squared): Partial eta-squared describes the "proportion of total variation
attributable to the factor, partialling out (excluding) other factors from the total nonerror
variation" (Pierce, Block & Aguinis, 2004, p. 918). Partial eta squared is often higher than eta
squared.

A popular set of benchmarks for effect size comes from Cohen (1988, 1992): 0.20 is a small effect (though it may still be interpreted as practically important), 0.50 is a medium effect, and anything equal to or greater than 0.80 is a large effect size.

Cohen's ƒ: This measure of effect size represents the square root of the ratio of variance explained to variance not explained.
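The short sketch below, using invented sums of squares from a one-way ANOVA, shows how eta-squared and Cohen's f follow from these definitions.

```python
# Illustrative effect-size arithmetic from invented one-way ANOVA sums of squares.
ss_treat = 84.0   # variation attributable to the treatment factor
ss_error = 68.0   # residual (within-group) variation
ss_total = ss_treat + ss_error

eta_squared = ss_treat / ss_total                     # proportion of total variance explained
cohens_f = (eta_squared / (1 - eta_squared)) ** 0.5   # sqrt(explained / unexplained)

print(f"eta-squared = {eta_squared:.3f}")
print(f"Cohen's f   = {cohens_f:.3f}")
```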
Follow up tests
A statistically significant effect in ANOVA is often followed up with one or more different
follow-up tests. This can be done in order to assess which groups are different from which other
groups or to test various other focused hypotheses. Follow up tests are often distinguished in
terms of whether they are planned (a priori) or post hoc. Planned tests are determined before
looking at the data and post hoc tests are performed after looking at the data. Post hoc tests such
as Tukey's range test most commonly compare every group mean with every other group mean
and typically incorporate some method of controlling for Type I errors. Comparisons, which are
most commonly planned, can be either simple or compound. Simple comparisons compare one
group mean with one other group mean. Compound comparisons typically compare two sets of group means, where one set has two or more groups (e.g., comparing the average group mean of groups A, B and C with that of group D). Comparisons can also look at tests of trend, such as linear and
quadratic relationships, when the independent variable involves ordered levels.
Study designs and ANOVAs
There are several types of ANOVA. Many statisticians base ANOVA on the design of the experiment, especially on the protocol that specifies the random assignment of treatments to subjects: this protocol's description of the assignment mechanism should include a specification of the structure of the treatments and of any blocking. It is also common to apply ANOVA to observational data using an appropriate statistical model.
Some popular designs use the following ANOVAs:
• One-way ANOVA is used to test for differences among two or more independent groups. Typically, however, the one-way ANOVA is used to test for differences among at least three groups, since the two-group case can be covered by a t-test (Gosset, 1908). When there are only two means to compare, the t-test and the ANOVA F-test are equivalent; the relation between ANOVA and t is given by F = t² (see the sketch after this list).
• Factorial ANOVA is used when the experimenter wants to study the interaction effects
among the treatments.
• Repeated measures ANOVA is used when the same subjects are used for each treatment
(e.g., in a longitudinal study).
• Multivariate analysis of variance (MANOVA) is used when there is more than one
response variable.
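To illustrate the F = t² identity mentioned in the list above, this small sketch (with two invented samples) runs a two-sample t-test and a two-group one-way ANOVA and compares the statistics.

```python
# Illustrative check that, with two groups, the one-way ANOVA F equals t squared.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(10.0, 2.0, 15)
y = rng.normal(11.0, 2.0, 15)

t_stat, p_t = stats.ttest_ind(x, y)     # two-sample t-test (equal variances)
f_stat, p_f = stats.f_oneway(x, y)      # one-way ANOVA with two groups

print(f"t^2 = {t_stat**2:.4f}, F = {f_stat:.4f}")       # identical up to rounding
print(f"p-values: t-test {p_t:.4f}, ANOVA {p_f:.4f}")   # also identical
```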

History
The analysis of variance was used informally by researchers in the 1800s using least squares. In physics and psychology, researchers included a term for the operator effect, the influence of a particular person on measurements, according to Stephen Stigler's histories.

Sir Ronald Fisher proposed a formal analysis of variance in a 1918 article The Correlation
Between Relatives on the Supposition of Mendelian Inheritance.[11] His first application of the
analysis of variance was published in 1921.[12] Analysis of variance became widely known after
being included in Fisher's 1925 book Statistical Methods for Research Workers.

*ANOVA Gauge R&R

ANOVA Gauge R&R (or ANOVA Gauge Repeatability & Reproducibility) is a Measurement Systems Analysis technique that uses an Analysis of Variance (ANOVA) random-effects model to assess a measurement system.
The evaluation of a measurement system is not limited to gauges (or gages) but extends to all types of measuring instruments, test methods, and other measurement systems.
Purpose
ANOVA Gauge R&R measures the amount of variability induced in measurements by the
measurement system itself, and compares it to the total variability observed to determine the
viability of the measurement system. There are several factors affecting a measurement system,
including:
• Measuring instruments, the gauge or instrument itself and all mounting blocks, supports, fixtures, load cells, etc. The machine's ease of use, sloppiness among mating parts, and "zero" blocks are examples of sources of variation in the measurement system. In systems making electrical measurements, sources of variation include electrical noise and analog-to-digital converter resolution.
• Operators (people), the ability and/or discipline of a person to follow the written or
verbal instructions.
• Test methods, how the devices are set up, the test fixtures, how the data is recorded, etc.
• Specification, the measurement is reported against a specification or a reference value.
The range or the engineering tolerance does not affect the measurement, but is an
important factor in evaluating the viability of the measurement system.
• Parts or specimens (what is being measured), some items are easier to measure than others. A measurement system may be good for measuring steel block length but not for measuring rubber pieces, for example.
There are two important aspects of a Gauge R&R:
• Repeatability: The variation in measurements taken by a single person or instrument on
the same item and under the same conditions.
• Reproducibility: The variability induced by the operators. It is the variation induced
when different operators (or different laboratories) measure the same part.
It is important to understand the difference between accuracy and precision to understand the
purpose of Gauge R&R. Gauge R&R addresses only the precision of a measurement system. It is
common to examine the P/T ratio which is the ratio of the precision of a measurement system to
the (total) tolerance of the manufacturing process of which it is a part. If the P/T ratio is low, the
impact on product quality of variation due to the measurement system is small. If the P/T ratio is
larger, it means the measurement system is "eating up" a large fraction of the tolerance, so that parts outside the tolerance may be measured as acceptable by the measurement system. Generally, a P/T ratio less than 0.1 indicates that the measurement system
can reliably determine whether any given part meets the tolerance specification. A P/T ratio
greater than 0.3 suggests that unacceptable parts will be measured as acceptable (or vice-versa)
by the measurement system, making the system inappropriate for the process for which it is
being used.
Anova Gauge R&R is an important tool within the Six Sigma methodology, and it is also a
requirement for a Production Part Approval Process (PPAP) documentation package.
How to perform a Gauge R&R
The Gauge R&R (GRR) is performed by measuring parts using the established measurement system. The goal is to capture as many sources of measurement variation as possible so that they can be assessed and understood. Note that the objective is not for the parts to "pass": a deceptively small variation (an apparently favorable result) can come out of a GRR study simply because an important source of error was missed in the process.
To capture reproducibility errors, multiple operators are needed. Some (ASTM E691 Standard
Practice for Conducting an Interlaboratory Study to Determine the Precision of a Test Method)
call for at least ten operators (or laboratories) but others use only two or three to measure the
same parts.
To capture repeatability errors, the same part is usually measured several times by each operator.
Each measurement cycle on an individual part must include the full set of operations required if
the operator were testing multiple different parts, including the complete handling, loading, and
unloading of the part from the measurement system.
To capture interactions of operators with parts (e.g. one part may be more difficult to measure
than another), usually between five and ten parts are measured.
There is no universal criterion for minimum sample requirements in the GRR matrix; it is a matter for the Quality Engineer to assess the risks, depending on how critical the measurement is and how costly it is. The "10x2x2" design (ten parts, two operators, two repetitions) is an acceptable sampling plan for some studies, although it has very few degrees of freedom for the operator component. Several methods of determining the sample size and degree of replication are used.
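As a rough sketch of the variance-component arithmetic behind a crossed GRR study, the snippet below analyzes a hypothetical "5 parts x 2 operators x 2 repeats" layout with invented measurements and an assumed engineering tolerance. Real studies normally rely on validated statistical software, and some references use 5.15 rather than 6 standard deviations in the P/T ratio.

```python
# Illustrative (not a validated GRR implementation): crossed ANOVA variance
# components for a hypothetical "5 parts x 2 operators x 2 repeats" study.
import numpy as np

# data[part, operator, repeat] -- invented measurements
data = np.array([
    [[2.48, 2.51], [2.49, 2.52]],
    [[2.60, 2.63], [2.62, 2.61]],
    [[2.55, 2.54], [2.56, 2.58]],
    [[2.70, 2.69], [2.71, 2.73]],
    [[2.43, 2.44], [2.45, 2.44]],
])
p, o, r = data.shape
grand = data.mean()

# Mean squares from the two-way crossed ANOVA with replication.
part_means = data.mean(axis=(1, 2))
oper_means = data.mean(axis=(0, 2))
cell = data.mean(axis=2)
ms_part = o * r * ((part_means - grand) ** 2).sum() / (p - 1)
ms_oper = p * r * ((oper_means - grand) ** 2).sum() / (o - 1)
ss_inter = r * ((cell - part_means[:, None] - oper_means[None, :] + grand) ** 2).sum()
ms_inter = ss_inter / ((p - 1) * (o - 1))
ms_error = ((data - cell[:, :, None]) ** 2).sum() / (p * o * (r - 1))

# Expected-mean-square estimates of the variance components (clipped at zero).
var_repeat = ms_error
var_inter = max((ms_inter - ms_error) / r, 0.0)
var_oper = max((ms_oper - ms_inter) / (p * r), 0.0)
var_part = max((ms_part - ms_inter) / (o * r), 0.0)

var_grr = var_repeat + var_oper + var_inter    # repeatability + reproducibility
tolerance = 0.40                               # assumed engineering tolerance
pt_ratio = 6 * np.sqrt(var_grr) / tolerance    # common "6-sigma" P/T convention

print(f"repeatability variance  : {var_repeat:.6f}")
print(f"reproducibility variance: {var_oper + var_inter:.6f}")
print(f"part-to-part variance   : {var_part:.6f}")
print(f"P/T ratio               : {pt_ratio:.2f}")
```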

*Axiomatic design

Axiomatic design is a systems design methodology using matrix methods to systematically analyze the transformation of customer needs into functional requirements, design parameters, and process variables.[1]
The method gets its name from its use of design principles or design Axioms (i.e., given without
proof) governing the analysis and decision making process in developing high quality product or
system designs. Axiomatic design is considered to be a design method that addresses
fundamental issues in Taguchi methods.
The methodology has been developed by Dr. Suh Nam Pyo at MIT's Department of Mechanical Engineering since the 1990s. A series of academic conferences have been held to present current developments of the methodology. The most recent International Conference on Axiomatic Design (ICAD) was held in 2009 in Portugal.

*Business process mapping

Business process mapping refers to activities involved in defining exactly what a business
entity does, who is responsible, to what standard a process should be completed and how the
success of a business process can be determined. Once this is done, there can be no uncertainty
as to the requirements of every internal business process. A business process illustration is
produced. The first step in gaining control over an organization is to know and understand the
basic processes (Deming, 1982; Juran, 1988; Taylor, 1911).
ISO 9001 requires a business entity to follow a process approach when managing its business,
and to this end creating business process maps will assist. The entity can then work towards
ensuring its processes are effective (the right process is followed the first time), and efficient
(continually improved to ensure processes use the least amount of resources).
History
Early history
The first structured method for documenting process flow, the flow process chart, was
introduced by Frank Gilbreth to members of ASME in 1921 as the presentation “Process Charts
—First Steps in Finding the One Best Way”. Gilbreth's tools quickly found their way into
industrial engineering curricula. In the early 1930s, an industrial engineer, Allan H. Mogensen
began training business people in the use of some of the tools of industrial engineering at his
Work Simplification Conferences in Lake Placid, New York. A 1944 graduate of Mogensen's
class, Art Spinanger, took the tools back to Procter and Gamble where he developed their
Deliberate Methods Change Program. Another 1944 graduate, Ben S. Graham, Director of
Formcraft Engineering at Standard Register Corporation, adapted the flow process chart to
information processing with his development of the multi-flow process chart to display multiple
documents and their relationships. In 1947, ASME adopted a symbol set derived from Gilbreth's
original work as the ASME Standard for Process Charts.
Recent developments
Process mapping has in recent years developed due to software tools that can attach metadata to
activities, drivers and triggers to provide a more complete understanding of processes. For
example, data elements, KPIs, Times, Volumes, documents, files, databases, compliance
applying to an activity can be attached to improve understanding and achieve several business
goals simultaneously. Valuable analysis might include identification of duplicated use of data
elements, proof or lack of proof of compliance.
The developments mean that process mapping is no longer two-dimensional but multi-
dimensional; capable of achieving several important business goals:
• Business process re-engineering
• Regulatory compliance
• Activity analysis
• Service level agreement (SLA) role clarity (RACI)
• Simulation
Making process maps available through a web browser means that stakeholders can easily access and share them, thus improving compliance, training, and end-to-end process understanding.
Legislation such as "The Sarbanes-Oxley Act" (also known as SOX) has increased the
requirements for improved process understanding and visibility of compliance issues.
Quality improvement practitioners have noted that various graphical descriptions of processes
can be useful. These include: detailed flow-charts, work flow diagrams and value stream maps.
Each map is helpful depending on the process questions and theories being considered. In these
situations process map implies the use of process flow and the current understanding of the
causal structure. The purpose of these process maps is to document and stimulate the
understanding of y=f(x); where the “y” represents the outputs of a process and x represents the
various inputs. These inputs would include sources of noise otherwise described as nuisance
variables.
Six Sigma practitioners use the term Business Process Architecture to describe the mapping of business processes as a series of cross-functional flowcharts. Under this school of thought, each flowchart is of a certain level (between 0 and 4) based on the amount of detail the flowchart contains. A level 0 flowchart represents the least amount of detail, and usually contains one or two steps. A level 4 flowchart represents the greatest amount of detail, and can include hundreds of steps. At this level every task, however minor, is represented.

*Ishikawa diagram

One of the Seven Basic Tools of Quality
First described by: Kaoru Ishikawa
Purpose: To break down (in successive layers of detail) root causes that potentially contribute to a particular effect
Ishikawa diagrams (also called fishbone diagrams, cause-and-effect diagrams or Fishikawa)
are diagrams that show the causes of a certain event -- created by Kaoru Ishikawa (1990)[1].
Common uses of the Ishikawa diagram are product design and quality defect prevention, to
identify potential factors causing an overall effect. Each cause or reason for imperfection is a
source of variation. Causes are usually grouped into major categories to identify these sources of
variation. The categories typically include:
• People: Anyone involved with the process
• Methods: How the process is performed and the specific requirements for doing it, such
as policies, procedures, rules, regulations and laws
• Machines: Any equipment, computers, tools etc. required to accomplish the job
• Materials: Raw materials, parts, pens, paper, etc. used to produce the final product
• Measurements: Data generated from the process that are used to evaluate its quality
• Environment: The conditions, such as location, time, temperature, and culture in which
the process operates

(Figure: Ishikawa diagram, in fishbone shape, showing factors of Equipment, Process, People, Materials, Environment and Management, all affecting the overall problem. Smaller arrows connect the sub-causes to major causes.)

Ishikawa diagrams were proposed by Kaoru Ishikawa[2] in the 1960s, who pioneered quality
management processes in the Kawasaki shipyards, and in the process became one of the
founding fathers of modern management.
It was first used in the 1960s, and is considered one of the seven basic tools of quality control.[3]
It is known as a fishbone diagram because of its shape, similar to the side view of a fish skeleton.
Mazda Motors famously used an Ishikawa diagram in the development of the Miata sports car,
where the required result was "Jinba Ittai" or "Horse and Rider as One". The main causes
included such aspects as "touch" and "braking" with the lesser causes including highly granular
factors such as "50/50 weight distribution" and "able to rest elbow on top of driver's door". Every
factor identified in the diagram was included in the final design.
Causes
Causes in the diagram are often categorized, such as into the 8 Ms described below. Cause-and-effect diagrams can reveal key relationships among various variables, and the possible causes provide additional insight into process behavior.
Causes can be derived from brainstorming sessions. These groups can then be labeled as
categories of the fishbone. They will typically be one of the traditional categories mentioned
above but may be something unique to the application in a specific case. Causes can be traced
back to root causes with the 5 Whys technique.
Typical categories are:
The 8 Ms (used in manufacturing)
• Machine (technology)
• Method (process)
• Material (Includes Raw Material, Consumables and Information.)
• Man Power (physical work)/Mind Power (brain work): Kaizens, Suggestions
• Measurement (Inspection)
• Milieu/Mother Nature (Environment)
• Management/Money Power
• Maintenance
The 8 Ps (used in service industry)
• Product=Service
• Price
• Place
• Promotion
• People
• Process
• Physical Evidence
• Productivity & Quality
The 4 Ss (used in service industry)
• Surroundings
• Suppliers
• Systems
• Skills
Questions to ask while building an Ishikawa Diagram
• Man
– Was the document properly interpreted? – Was the information properly disseminated? – Did
the recipient understand the information? – Was the proper training to perform the task
administered to the person? – Was too much judgment required to perform the task? – Were
guidelines for judgment available? – Did the environment influence the actions of the individual?
– Are there distractions in the workplace? – Is fatigue a mitigating factor? – How much
experience does the individual have in performing this task?
• Machine
– Was the correct tool used? – Are files saved with the correct extension to the correct location?
– Is the equipment affected by the environment? – Is the equipment being properly maintained
(i.e., daily/weekly/monthly preventative maintenance schedule) – Does the software or hardware
need to be updated? – Does the equipment or software have the features to support our
needs/usage? – Was the machine properly programmed? – Is the tooling/fixturing adequate for
the job? – Does the machine have an adequate guard? – Was the equipment used within its
capabilities and limitations? – Are all controls including emergency stop button clearly labeled
and/or color coded or size differentiated? – Is the equipment the right application for the given
job?
• Measurement
– Does the gage have a valid calibration date? – Was the proper gage used to measure the part,
process, chemical, compound, etc.? – Was a gage capability study ever performed? - Do
measurements vary significantly from operator to operator? - Do operators have a tough time
using the prescribed gage? - Is the gage fixturing adequate? – Does the gage have proper
measurement resolution? – Did the environment influence the measurements taken?
• Material (Includes Raw Material, Consumables and Information )
– Is all needed information available and accurate? – Can information be verified or cross-
checked? – Has any information changed recently / do we have a way of keeping the information
up to date? – What happens if we don't have all of the information we need? – Is a Material
Safety Data Sheet (MSDS) readily available? – Was the material properly tested? – Was the
material substituted? – Is the supplier’s process defined and controlled? – Were quality
requirements adequate for part function? – Was the material contaminated? – Was the material
handled properly (stored, dispensed, used & disposed)?
• Milieu
– Is the process affected by temperature changes over the course of a day? – Is the process
affected by humidity, vibration, noise, lighting, etc.? – Does the process run in a controlled
environment? – Are associates distracted by noise, uncomfortable temperatures, fluorescent
lighting, etc.?
• Method
– Was the canister, barrel, etc. labeled properly? – Were the workers trained properly in the
procedure? – Was the testing performed statistically significant? – Was data tested for true root
cause? – How many “if necessary” and “approximately” phrases are found in this process? –
Was this a process generated by an Integrated Product Development (IPD) Team? – Was the IPD
Team properly represented? – Did the IPD Team employ Design for Environmental (DFE)
principles? – Has a capability study ever been performed for this process? – Is the process under
Statistical Process Control (SPC)? – Are the work instructions clearly written? – Are mistake-
proofing devices/techniques employed? – Are the work instructions complete? – Is the tooling
adequately designed and controlled? – Is handling/packaging adequately specified? – Was the
process changed? – Was the design changed? – Was a process Failure Modes Effects Analysis
(FMEA) ever performed? – Was adequate sampling done? – Are features of the process critical
to safety clearly spelled out to the Operator?
Criticism of Ishikawa Diagrams
In a discussion of the nature of a cause, it is customary to distinguish between necessary and sufficient conditions for the occurrence of an event. A necessary condition for the occurrence of
a specified event is a circumstance in whose absence the event cannot occur. A sufficient
condition for the occurrence of an event is a circumstance in whose presence the event must
occur[4].
Ishikawa diagrams have been criticized for failing to make the distinction between necessary conditions and sufficient conditions. It seems that Ishikawa was not even aware of this distinction.[5]

*Chi-square test

A chi-square test (also chi squared test or χ2 test) is any statistical hypothesis test in which the
sampling distribution of the test statistic is a chi-square distribution when the null hypothesis is
true, or any in which this is asymptotically true, meaning that the sampling distribution (if the
null hypothesis is true) can be made to approximate a chi-square distribution as closely as
desired by making the sample size large enough.
Some examples of chi-squared tests where the chi-square distribution is only approximately
valid:
• Pearson's chi-square test, also known as the chi-square goodness-of-fit test or chi-square test for independence. When "chi-square test" is mentioned without any modifiers or other precluding context, this is usually the test meant (for an exact test used in place of χ2, see Fisher's exact test).
• Yates' chi-square test, also known as Yates' correction for continuity
• Mantel–Haenszel chi-square test.
• Linear-by-linear association chi-square test.
• The portmanteau test in time-series analysis, testing for the presence of autocorrelation
• Likelihood-ratio tests in general statistical modelling, for testing whether there is
evidence of the need to move from a simple model to a more complicated one (where the
simple model is nested within the complicated one).
One case where the distribution of the test statistic is an exact chi-square distribution is the test that the variance of a normally distributed population has a given value, based on a sample variance. Such a test is uncommon in practice because values of variances to test against are seldom known exactly.

Chi-square test for variance in a normal population


If a sample of size n is taken from a population having a normal distribution, then there is a well-
known result (see distribution of the sample variance) which allows a test to be made of whether
the variance of the population has a pre-determined value. For example, a manufacturing process
might have been in stable condition for a long period, allowing a value for the variance to be
determined essentially without error. Suppose that a variant of the process is being tested, giving
rise to a small sample of product items whose variation is to be tested. The test statistic T in this
instance could be set to be the sum of squares about the sample mean, divided by the nominal
value for the variance (i.e. the value to be tested as holding). Then T has a chi-square distribution
with n–1 degrees of freedom. For example, if the sample size is 21, the acceptance region for T for a significance level of 5% is the interval 9.59 to 34.17.
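A minimal sketch of that variance test, with an invented sample of 21 items and an assumed nominal variance, reproduces the 9.59 to 34.17 acceptance region using SciPy.

```python
# Illustrative chi-square test for a variance against a nominal value sigma0^2.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(42)
sample = rng.normal(loc=50.0, scale=2.0, size=21)   # invented measurements
nominal_variance = 4.0                              # sigma0^2 to be tested

n = sample.size
T = ((sample - sample.mean()) ** 2).sum() / nominal_variance  # chi-square, n-1 df

lower, upper = chi2.ppf([0.025, 0.975], df=n - 1)
print(f"T = {T:.2f}")
print(f"5% acceptance region for n = 21: {lower:.2f} to {upper:.2f}")  # ~9.59 to 34.17
print("within acceptance region" if lower <= T <= upper else "reject nominal variance")
```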

*Control chart

One of the Seven Basic Tools of Quality
First described by: Walter A. Shewhart
Purpose: To determine whether a process should undergo a formal examination for quality-related problems

Control charts, also known as Shewhart charts or process-behaviour charts, are tools used in statistical process control to determine whether or not a manufacturing or business process is in a state of statistical control.
If analysis of the control chart indicates that the process is currently under control (i.e. is stable,
with variation only coming from sources common to the process) then data from the process can
be used to predict the future performance of the process. If the chart indicates that the process
being monitored is not in control, analysis of the chart can help determine the sources of
variation, which can then be eliminated to bring the process back into control. A control chart is
a specific kind of run chart that allows significant change to be differentiated from the natural
variability of the process.
The control chart can be seen as part of an objective and disciplined approach that enables
correct decisions regarding control of the process, including whether or not to change process
control parameters. Process parameters should never be adjusted for a process that is in control,
as this will result in degraded process performance.[1]
The control chart is one of the seven basic tools of quality control.[2]
History
The control chart was invented by Walter A. Shewhart while working for Bell Labs in the 1920s.
The company's engineers had been seeking to improve the reliability of their telephony
transmission systems. Because amplifiers and other equipment had to be buried underground,
there was a business need to reduce the frequency of failures and repairs. By 1920 the engineers
had already realized the importance of reducing variation in a manufacturing process. Moreover,
they had realized that continual process-adjustment in reaction to non-conformance actually
increased variation and degraded quality. Shewhart framed the problem in terms of Common-
and special-causes of variation and, on May 16, 1924, wrote an internal memo introducing the
control chart as a tool for distinguishing between the two. Dr. Shewhart's boss, George Edwards,
recalled: "Dr. Shewhart prepared a little memorandum only about a page in length. About a third
of that page was given over to a simple diagram which we would all recognize today as a
schematic control chart. That diagram, and the short text which preceded and followed it, set
forth all of the essential principles and considerations which are involved in what we know today
as process quality control." [3] Shewhart stressed that bringing a production process into a state of
statistical control, where there is only common-cause variation, and keeping it in control, is
necessary to predict future output and to manage a process economically.
Dr. Shewhart created the basis for the control chart and the concept of a state of statistical control
by carefully designed experiments. While Dr. Shewhart drew from pure mathematical statistical
theories, he understood data from physical processes typically produce a "normal distribution
curve" (a Gaussian distribution, also commonly referred to as a "bell curve"). He discovered that
observed variation in manufacturing data did not always behave the same way as data in nature
(Brownian motion of particles). Dr. Shewhart concluded that while every process displays
variation, some processes display controlled variation that is natural to the process, while others
display uncontrolled variation that is not present in the process causal system at all times.[4]
In 1924 or 1925, Shewhart's innovation came to the attention of W. Edwards Deming, then
working at the Hawthorne facility. Deming later worked at the United States Department of
Agriculture and then became the mathematical advisor to the United States Census Bureau. Over
the next half a century, Deming became the foremost champion and proponent of Shewhart's
work. After the defeat of Japan at the close of World War II, Deming served as statistical
consultant to the Supreme Commander of the Allied Powers. His ensuing involvement in
Japanese life, and long career as an industrial consultant there, spread Shewhart's thinking, and
the use of the control chart, widely in Japanese manufacturing industry throughout the 1950s and
1960s.
Chart details
A control chart consists of:
• Points representing a statistic (e.g., a mean, range, proportion) of measurements of a
quality characteristic in samples taken from the process at different times [the data]
• The mean of this statistic using all the samples is calculated (e.g., the mean of the means,
mean of the ranges, mean of the proportions)
• A center line is drawn at the value of the mean of the statistic
• The standard error (e.g., standard deviation/sqrt(n) for the mean) of the statistic is also
calculated using all the samples
• Upper and lower control limits (sometimes called "natural process limits"), which indicate the threshold at which the process output is considered statistically 'unlikely'; they are typically drawn at 3 standard errors from the center line (see the computational sketch below)
The chart may have other optional features, including:
• Upper and lower warning limits, drawn as separate lines, typically two standard errors
above and below the center line
• Division into zones, with the addition of rules governing frequencies of observations in
each zone
• Annotation with events of interest, as determined by the Quality Engineer in charge of
the process's quality
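Here is the computational sketch referred to above: it computes a center line and 3-standard-error control limits for an X-bar chart from invented subgroup data. It uses the plain standard-error convention described in the list; production charts often use range- or s-based constants (such as A2 or c4) instead.

```python
# Illustrative X-bar chart limits from invented subgroup data.
import numpy as np

rng = np.random.default_rng(7)
subgroups = rng.normal(loc=100.0, scale=2.0, size=(25, 5))  # 25 samples of size 5

xbar = subgroups.mean(axis=1)           # the plotted statistic: subgroup means
center = xbar.mean()                    # center line
std_err = subgroups.std(ddof=1) / np.sqrt(subgroups.shape[1])  # rough overall estimate

ucl = center + 3 * std_err              # upper control limit
lcl = center - 3 * std_err              # lower control limit

print(f"center line: {center:.2f}, UCL: {ucl:.2f}, LCL: {lcl:.2f}")
out_of_control = np.where((xbar > ucl) | (xbar < lcl))[0]
print("points outside the limits:", out_of_control)  # expected ~1 in 370 by chance
```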

Chart usage


If the process is in control, all points will plot within the control limits. Any observations outside
the limits, or systematic patterns within, suggest the introduction of a new (and likely
unanticipated) source of variation, known as a special-cause variation. Since increased variation
means increased quality costs, a control chart "signaling" the presence of a special-cause requires
immediate investigation.
This makes the control limits very important decision aids. The control limits tell you about
process behavior and have no intrinsic relationship to any specification targets or engineering
tolerance. In practice, the process mean (and hence the center line) may not coincide with the
specified value (or target) of the quality characteristic because the process' design simply can't
deliver the process characteristic at the desired level.
Control charts omit specification limits or targets because of the tendency of those involved with the process (e.g., machine operators) to focus on performing to specification when in fact the least-cost course of action is to keep process variation as low as possible. Attempting to make a
process whose natural center is not the same as the target perform to target specification
increases process variability and increases costs significantly and is the cause of much
inefficiency in operations. Process capability studies do examine the relationship between the
natural process limits (the control limits) and specifications, however.
The purpose of control charts is to allow simple detection of events that are indicative of actual
process change. This simple decision can be difficult where the process characteristic is
continuously varying; the control chart provides statistically objective criteria of change. When change is detected and considered good, its cause should be identified and possibly become the new way of working; where the change is bad, its cause should be identified and eliminated.
The purpose in adding warning limits or subdividing the control chart into zones is to provide
early notification if something is amiss. Instead of immediately launching a process
improvement effort to determine whether special causes are present, the Quality Engineer may
temporarily increase the rate at which samples are taken from the process output until it's clear
that the process is truly in control. Note that with three sigma limits, one expects to be signaled
approximately once out of every 370 points on average, just due to common-causes.
[edit] Choice of limits
Shewhart set 3-sigma (3-standard error) limits on the following basis.
• The coarse result of Chebyshev's inequality that, for any probability distribution, the probability of an outcome greater than k standard deviations from the mean is at most 1/k².
• The finer result of the Vysochanskii-Petunin inequality, that for any unimodal probability distribution, the probability of an outcome greater than k standard deviations from the mean is at most 4/(9k²).
• The empirical investigation of sundry probability distributions reveals that at least 99% of
observations occurred within three standard deviations of the mean.
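A quick calculation shows how the three arguments compare at k = 3; the normal-distribution tail probability is the commonly quoted 0.27%, while the other two figures follow directly from the inequalities.

```python
k = 3.0

chebyshev_bound = 1 / k**2   # any distribution: at most ~11.1% beyond k standard deviations
vp_bound = 4 / (9 * k**2)    # any unimodal distribution: at most ~4.94%
normal_tail = 0.0027         # exactly normal distribution: about 0.27% (quoted value)

print(f"Chebyshev:            at most {chebyshev_bound:.1%}")
print(f"Vysochanskii-Petunin: at most {vp_bound:.2%}")
print(f"Normal distribution:  about   {normal_tail:.2%}")
```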
Shewhart summarized the conclusions by saying:
... the fact that the criterion which we happen to use has a fine ancestry in highbrow statistical
theorems does not justify its use. Such justification must come from empirical evidence that it
works. As the practical engineer might say, the proof of the pudding is in the eating.
Though he initially experimented with limits based on probability distributions, Shewhart
ultimately wrote:
Some of the earliest attempts to characterize a state of statistical control were inspired by the
belief that there existed a special form of frequency function f and it was early argued that the
normal law characterized such a state. When the normal law was found to be inadequate, then
generalized functional forms were tried. Today, however, all hopes of finding a unique
functional form f are blasted.
The control chart is intended as a heuristic. Deming insisted that it is not a hypothesis test and is
not motivated by the Neyman-Pearson lemma. He contended that the disjoint nature of
population and sampling frame in most industrial situations compromised the use of
conventional statistical techniques. Deming's intention was to seek insights into the cause system
of a process ...under a wide range of unknowable circumstances, future and past .... He claimed
that, under such conditions, 3-sigma limits provided ... a rational and economic guide to
minimum economic loss... from the two errors:
1. Ascribe a variation or a mistake to a special cause when in fact the cause belongs to the
system (common cause). (Also known as a Type I error)
2. Ascribe a variation or a mistake to the system (common causes) when in fact the cause
was special. (Also known as a Type II error)
[edit] Calculation of standard deviation
As for the calculation of control limits, the standard deviation (error) required is that of the
common-cause variation in the process. Hence, the usual estimator, in terms of sample variance,
is not used as this estimates the total squared-error loss from both common- and special-causes
of variation.
An alternative method is to use the relationship between the range of a sample and its standard
deviation derived by Leonard H. C. Tippett, an estimator which tends to be less influenced by the
extreme observations which typify special-causes.
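One common application of this range-based idea, for example on an individuals (XmR) chart, divides the average moving range of successive observations by the bias-correction constant d2 ≈ 1.128 for ranges of two. The sketch below illustrates that approach; the data are invented, and the moving-range form is only one of several ways the range-to-standard-deviation relationship is applied.

```python
# Estimate the common-cause standard deviation from the average moving range,
# rather than from the overall sample standard deviation.
data = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 10.0, 9.7, 10.1, 10.0]

moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
average_mr = sum(moving_ranges) / len(moving_ranges)

D2 = 1.128                  # bias-correction constant d2 for ranges of two observations
sigma_hat = average_mr / D2

print(f"average moving range = {average_mr:.3f}")
print(f"estimated common-cause sigma = {sigma_hat:.3f}")
```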
[edit] Rules for detecting signals
The most common sets are:
• The Western Electric rules
• The Wheeler rules (equivalent to the Western Electric zone tests[5])
• The Nelson rules
There has been particular controversy as to how long a run of observations, all on the same side
of the centre line, should count as a signal, with 6, 7, 8 and 9 all being advocated by various
writers.
The most important principle for choosing a set of rules is that the choice be made before the
data is inspected. Choosing rules once the data have been seen tends to increase the Type I error
rate owing to testing effects suggested by the data.
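As an illustration of such rule sets (not a full implementation of any of the published ones), the sketch below checks two commonly cited tests: a single point beyond the 3-sigma limits, and a run of eight consecutive points on the same side of the center line, eight being only one of the run lengths mentioned above.

```python
def detect_signals(points, center, sigma, run_length=8):
    """Return (index, rule) tuples for two simple detection rules."""
    signals = []
    for i, x in enumerate(points):
        # Rule 1: a single point beyond the 3-sigma control limits.
        if abs(x - center) > 3 * sigma:
            signals.append((i, "beyond 3-sigma limits"))
    # Rule 2: a run of `run_length` consecutive points on one side of the center line.
    # Note that a continuing run is reported again as each new point extends it.
    for i in range(len(points) - run_length + 1):
        window = points[i:i + run_length]
        if all(x > center for x in window) or all(x < center for x in window):
            signals.append((i + run_length - 1, f"run of {run_length} on one side"))
    return signals

# Illustrative data with a drift in the later points.
data = [10.0, 9.9, 10.1, 10.05, 10.2, 10.1, 10.2, 10.3, 10.2, 10.1, 10.15, 10.45]
print(detect_signals(data, center=10.0, sigma=0.1))
```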
[edit] Alternative bases
In 1935, the British Standards Institution, under the influence of Egon Pearson and against
Shewhart's spirit, adopted control charts, replacing 3-sigma limits with limits based on
percentiles of the normal distribution. This move continues to be represented by John Oakland
and others but has been widely deprecated by writers in the Shewhart-Deming tradition.
[edit] Performance of control charts
When a point falls outside of the limits established for a given control chart, those responsible
for the underlying process are expected to determine whether a special cause has occurred. If one
has, it is appropriate to determine if the results with the special cause are better than or worse
than results from common causes alone. If worse, then that cause should be eliminated if
possible. If better, it may be appropriate to intentionally retain the special cause within the
system producing the results.[citation needed]
It is known that even when a process is in control (that is, no special causes are present in the
system), there is approximately a 0.27% probability of a point exceeding 3-sigma control limits.
Since the control limits are evaluated each time a point is added to the chart, it readily follows
that every control chart will eventually signal the possible presence of a special cause, even
though one may not have actually occurred. For a Shewhart control chart using 3-sigma limits,
this false alarm occurs on average once every 1/0.0027 or 370.4 observations. Therefore, the in-
control average run length (or in-control ARL) of a Shewhart chart is 370.4.[citation needed]
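The in-control ARL quoted above is simply the reciprocal of the per-point false-alarm probability, as the one-line calculation below shows.

```python
p_false_alarm = 0.0027                 # probability of exceeding 3-sigma limits when in control
in_control_arl = 1 / p_false_alarm     # expected number of points between false alarms
print(f"in-control ARL = {in_control_arl:.1f} points")   # about 370.4
```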
Meanwhile, if a special cause does occur, it may not be of sufficient magnitude for the chart to
produce an immediate alarm condition. If a special cause occurs, one can describe that cause by
measuring the change in the mean and/or variance of the process in question. When those
changes are quantified, it is possible to determine the out-of-control ARL for the chart.[citation needed]
It turns out that Shewhart charts are quite good at detecting large changes in the process mean or
variance, as their out-of-control ARLs are fairly short in these cases. However, for smaller
changes (such as a 1- or 2-sigma change in the mean), the Shewhart chart does not detect these
changes efficiently. Other types of control charts have been developed, such as the EWMA chart
and the CUSUM chart, which detect smaller changes more efficiently by making use of
information from observations collected prior to the most recent data point.[citation needed]
[edit] Criticisms
Several authors have criticised the control chart on the grounds that it violates the likelihood
principle.[citation needed] However, the principle is itself controversial and supporters of control charts
further argue that, in general, it is impossible to specify a likelihood function for a process not in
statistical control, especially where knowledge about the cause system of the process is weak.
[citation needed]

Some authors have criticised the use of average run lengths (ARLs) for comparing control chart
performance, because that average usually follows a geometric distribution, which has high variability and associated difficulties of interpretation.[citation needed]
[edit] Types of charts

Chart | Process observation | Process observations relationships | Process observations type | Size of shift to detect
x̄ and R chart | Quality characteristic measurement within one subgroup | Independent | Variables | Large (≥ 1.5σ)
x̄ and s chart | Quality characteristic measurement within one subgroup | Independent | Variables | Large (≥ 1.5σ)
Shewhart individuals control chart (ImR chart or XmR chart) | Quality characteristic measurement for one observation | Independent | Variables† | Large (≥ 1.5σ)
Three-way chart | Quality characteristic measurement within one subgroup | Independent | Variables | Large (≥ 1.5σ)
p-chart | Fraction nonconforming within one subgroup | Independent | Attributes† | Large (≥ 1.5σ)
np-chart | Number nonconforming within one subgroup | Independent | Attributes† | Large (≥ 1.5σ)
c-chart | Number of nonconformances within one subgroup | Independent | Attributes† | Large (≥ 1.5σ)
u-chart | Nonconformances per unit within one subgroup | Independent | Attributes† | Large (≥ 1.5σ)
EWMA chart | Exponentially weighted moving average of quality characteristic measurement within one subgroup | Independent | Attributes or variables | Small (< 1.5σ)
CUSUM chart | Cumulative sum of quality characteristic measurement within one subgroup | Independent | Attributes or variables | Small (< 1.5σ)
Time series model | Quality characteristic measurement within one subgroup | Autocorrelated | Attributes or variables | N/A
Regression control chart | Quality characteristic measurement within one subgroup | Dependent of process control variables | Variables | Large (≥ 1.5σ)

Some practitioners also recommend the use of Individuals charts for attribute data, particularly
when the assumptions of either binomially-distributed data (p- and np-charts) or Poisson-
distributed data (u- and c-charts) are violated.[6] Two primary justifications are given for this
practice. First, normality is not necessary for statistical control, so the Individuals chart may be
used with non-normal data.[7] Second, attribute charts derive the measure of dispersion directly
from the mean proportion (by assuming a probability distribution), while Individuals charts
derive the measure of dispersion from the data, independent of the mean, making Individuals
charts more robust than attributes charts to violations of the assumptions about the distribution of
the underlying population.[8] It is sometimes noted that the substitution of the Individuals chart
works best for large counts, when the binomial and Poisson distributions approximate a normal
distribution, i.e. when the number of trials n > 1000 for p- and np-charts or λ > 500 for u- and c-
charts.
Critics of this approach argue that control charts should not be used when their underlying
assumptions are violated, such as when process data is neither normally distributed nor
binomially (or Poisson) distributed. Such processes are not in control and should be improved
before the application of control charts. Additionally, application of the charts in the presence of
such deviations increases the type I and type II error rates of the control charts, and may make
the chart of little practical use.

*Correlation and dependence

Correlation and dependence are any of a broad class of statistical relationships between two or more random variables or observed data values.
Familiar examples of dependent phenomena include the correlation between the physical
statures of parents and their offspring, and the correlation between the demand for a product and
its price. Correlations are useful because they can indicate a predictive relationship that can be
exploited in practice. For example, an electrical utility may produce less power on a mild day
based on the correlation between electricity demand and weather. Correlations can also suggest
possible causal, or mechanistic relationships; however, statistical dependence is not sufficient to
demonstrate the presence of such a relationship.
Formally, dependence refers to any situation in which random variables do not satisfy a
mathematical condition of probabilistic independence. In general statistical usage, correlation or
co-relation can refer to any departure of two or more random variables from independence, but
most commonly refers to a more specialized type of relationship between mean values. There are
several correlation coefficients, often denoted ρ or r, measuring the degree of correlation. The
most common of these is the Pearson correlation coefficient, which is sensitive only to a linear
relationship between two variables (which may exist even if one is a nonlinear function of the
other). Other correlation coefficients have been developed to be more robust than the Pearson
correlation, or more sensitive to nonlinear relationships.

Several sets of (x, y) points, with the Pearson correlation coefficient of x and y for each set. Note
that the correlation reflects the noisiness and direction of a linear relationship (top row), but not
the slope of that relationship (middle), nor many aspects of nonlinear relationships (bottom).
N.B.: the figure in the center has a slope of 0 but in that case the correlation coefficient is
undefined because the variance of Y is zero.

Pearson's product-moment coefficient


Main article: Pearson product-moment correlation coefficient
The most familiar measure of dependence between two quantities is the Pearson product-moment
correlation coefficient, or "Pearson's correlation." It is obtained by dividing the covariance of the
two variables by the product of their standard deviations. Karl Pearson developed the coefficient
from a similar but slightly different idea by Francis Galton.[4]
The population correlation coefficient ρX,Y between two random variables X and Y with expected values μX and μY and standard deviations σX and σY is defined as:

ρX,Y = corr(X, Y) = cov(X, Y) / (σX σY) = E[(X − μX)(Y − μY)] / (σX σY),

where E is the expected value operator, cov denotes covariance, and corr is a widely used alternative notation for Pearson's correlation.
The Pearson correlation is defined only if both of the standard deviations are finite and both of
them are nonzero. It is a corollary of the Cauchy–Schwarz inequality that the correlation cannot
exceed 1 in absolute value. The correlation coefficient is symmetric: corr(X,Y) = corr(Y,X).
The Pearson correlation is +1 in the case of a perfect positive (increasing) linear relationship
(correlation), −1 in the case of a perfect decreasing (negative) linear relationship
(anticorrelation) [5], and some value between −1 and 1 in all other cases, indicating the degree of
linear dependence between the variables. As it approaches zero there is less of a relationship
(closer to uncorrelated). The closer the coefficient is to either −1 or 1, the stronger the correlation
between the variables.
If the variables are independent, Pearson's correlation coefficient is 0, but the converse is not true
because the correlation coefficient detects only linear dependencies between two variables. For
example, suppose the random variable X is symmetrically distributed about zero, and Y = X2.
Then Y is completely determined by X, so that X and Y are perfectly dependent, but their
correlation is zero; they are uncorrelated. However, in the special case when X and Y are jointly
normal, uncorrelatedness is equivalent to independence.
If we have a series of n measurements of X and Y written as xi and yi where i = 1, 2, ..., n, then the sample correlation coefficient r can be used to estimate the population Pearson correlation ρ between X and Y. The sample correlation coefficient is written

r = Σ (xi − x̄)(yi − ȳ) / ((n − 1) sx sy),

where x̄ and ȳ are the sample means of X and Y, and sx and sy are the sample standard deviations of X and Y.
This can also be written as:

r = Σ (xi − x̄)(yi − ȳ) / sqrt( Σ (xi − x̄)² · Σ (yi − ȳ)² )
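The sample coefficient can be computed directly from these definitions. The following sketch does so for a small invented data set, using the deviation-product form given above.

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# r = sum((xi - x_bar)(yi - y_bar)) / sqrt(sum of squared deviations of x * of y)
num = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
den = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) * sum((yi - mean_y) ** 2 for yi in y))
r = num / den

print(f"sample Pearson correlation r = {r:.4f}")   # close to +1 for this nearly linear data
```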
[edit] Rank correlation coefficients


Main articles: Spearman's rank correlation coefficient and Kendall tau rank correlation
coefficient

Rank correlation coefficients, such as Spearman's rank correlation coefficient and Kendall's rank
correlation coefficient (τ) measure the extent to which, as one variable increases, the other
variable tends to increase, without requiring that increase to be represented by a linear
relationship. If, as the one variable increases, the other decreases, the rank correlation
coefficients will be negative. It is common to regard these rank correlation coefficients as
alternatives to Pearson's coefficient, used either to reduce the amount of calculation or to make
the coefficient less sensitive to non-normality in distributions. However, this view has little
mathematical basis, as rank correlation coefficients measure a different type of relationship than
the Pearson product-moment correlation coefficient, and are best seen as measures of a different
type of association, rather than as an alternative measure of the population correlation coefficient.[6][7]

To illustrate the nature of rank correlation, and its difference from linear correlation, consider the
following four pairs of numbers (x, y):
(0, 1), (10, 100), (101, 500), (102, 2000).

As we go from each pair to the next pair x increases, and so does y. This relationship is perfect,
in the sense that an increase in x is always accompanied by an increase in y. This means that we
have a perfect rank correlation, and both Spearman's and Kendall's correlation coefficients are 1,
whereas in this example Pearson product-moment correlation coefficient is 0.7544, indicating
that the points are far from lying on a straight line. In the same way if y always decreases when x
increases, the rank correlation coefficients will be −1, while the Pearson product-moment
correlation coefficient may or may not be close to -1, depending on how close the points are to a
straight line. Although in the extreme cases of perfect rank correlation the two coefficients are
both equal (being both +1 or both −1) this is not in general so, and values of the two coefficients
cannot meaningfully be compared.[6] For example, for the three pairs (1, 1) (2, 3) (3, 2)
Spearman's coefficient is 1/2, while Kendall's coefficient is 1/3.
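These figures can be checked numerically; the sketch below, which assumes SciPy is available, reproduces the Pearson, Spearman and Kendall values for the four pairs listed above.

```python
from scipy.stats import pearsonr, spearmanr, kendalltau

x = [0, 10, 101, 102]
y = [1, 100, 500, 2000]

print("Pearson :", round(pearsonr(x, y)[0], 4))    # ~0.7544: points are far from a straight line
print("Spearman:", round(spearmanr(x, y)[0], 4))   # 1.0: the ranks agree perfectly
print("Kendall :", round(kendalltau(x, y)[0], 4))  # 1.0: every pair of points is concordant
```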
[edit] Other measures of dependence among random variables
The information given by a correlation coefficient is not enough to define the dependence
structure between random variables. The correlation coefficient completely defines the
dependence structure only in very particular cases, for example when the distribution is a
multivariate normal distribution. (See diagram above.) In the case of elliptic distributions it
characterizes the (hyper-)ellipses of equal density, however, it does not completely characterize
the dependence structure (for example, a multivariate t-distribution's degrees of freedom
determine the level of tail dependence).
Distance correlation and Brownian covariance / Brownian correlation [8][9] were introduced to
address the deficiency of Pearson's correlation that it can be zero for dependent random
variables; zero distance correlation and zero Brownian correlation imply independence.
The correlation ratio is able to detect almost any functional dependency, and the entropy-based mutual information and total correlation are capable of detecting even more general dependencies. The latter are sometimes referred to as multi-moment correlation measures, in comparison to those that consider only second-moment (pairwise or quadratic) dependence.
The polychoric correlation is another correlation applied to ordinal data that aims to estimate the
correlation between theorised latent variables.
One way to capture a more complete view of dependence structure is to consider a copula
between them.
[edit] Sensitivity to the data distribution
The degree of dependence between variables X and Y does not depend on the
scale on which the variables are expressed. That is, if we are analyzing the relationship between
X and Y, most correlation measures are unaffected by transforming X to a + bX and Y to c + dY,
where a, b, c, and d are constants. This is true of some correlation statistics as well as their
population analogues. Some correlation statistics, such as the rank correlation coefficient, are
also invariant to monotone transformations of the marginal distributions of X and/or Y.

Pearson/Spearman correlation coefficients between X and Y are shown when the two variables'
ranges are unrestricted, and when the range of X is restricted to the interval (0,1).

Most correlation measures are sensitive to the manner in which X and Y are sampled.
Dependencies tend to be stronger if viewed over a wider range of values. Thus, if we consider
the correlation coefficient between the heights of fathers and their sons over all adult males, and
compare it to the same correlation coefficient calculated when the fathers are selected to be
between 165 cm and 170 cm in height, the correlation will be weaker in the latter case.
Various correlation measures in use may be undefined for certain joint distributions of X and Y.
For example, the Pearson correlation coefficient is defined in terms of moments, and hence will
be undefined if the moments are undefined. Measures of dependence based on quantiles are
always defined. Sample-based statistics intended to estimate population measures of dependence
may or may not have desirable statistical properties such as being unbiased, or asymptotically
consistent, based on the spatial structure of the population from which the data were sampled.
The degree of dependence between spatially aggregated variables X and Y depends strongly on
the scale on which the variables are expressed.[10] See Ecological fallacy and Modifiable Areal
Unit Problem for more detail.

[edit] Correlation matrices


The correlation matrix of n random variables X1, ..., Xn is the n × n matrix whose i,j entry is
corr(Xi, Xj). If the measures of correlation used are product-moment coefficients, the correlation
matrix is the same as the covariance matrix of the standardized random variables Xi /σ (Xi) for i =
1, ..., n. This applies to both the matrix of population correlations (in which case "σ " is the
population standard deviation), and to the matrix of sample correlations (in which case "σ "
denotes the sample standard deviation). Consequently, each is necessarily a positive-semidefinite
matrix.
The correlation matrix is symmetric because the correlation between Xi and Xj is the same as the
correlation between Xj and Xi.
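In practice a sample correlation matrix of this kind can be computed in one call, for example with NumPy's corrcoef; the three variables below are invented for illustration, and the final line simply confirms the symmetry and unit diagonal described above.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=200)   # constructed to correlate with x1
x3 = rng.normal(size=200)                    # roughly independent of both

R = np.corrcoef([x1, x2, x3])   # 3 x 3 matrix of sample product-moment correlations
print(R.round(2))
print("symmetric:", np.allclose(R, R.T), "| unit diagonal:", np.allclose(np.diag(R), 1.0))
```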
[edit] Common misconceptions
[edit] Correlation and causality
Main article: Correlation does not imply causation

The conventional dictum that "correlation does not imply causation" means that correlation
cannot be used to infer a causal relationship between the variables.[11] This dictum should not be
taken to mean that correlations cannot indicate the potential existence of causal relations.
However, the causes underlying the correlation, if any, may be indirect and unknown, and high
correlations also overlap with identity relations, where no causal process exists. Consequently,
establishing a correlation between two variables is not a sufficient condition to establish a causal
relationship (in either direction). For example, one may observe a correlation between an
ordinary alarm clock ringing and daybreak, though there is no causal relationship between these
phenomena.
A correlation between age and height in children is fairly causally transparent, but a correlation
between mood and health in people is less so. Does improved mood lead to improved health; or
does good health lead to good mood; or both? Or does some other factor underlie both? In other
words, a correlation can be taken as evidence for a possible causal relationship, but cannot
indicate what the causal relationship, if any, might be.
[edit] Correlation and linearity
Four sets of data with the same correlation of 0.816

The Pearson correlation coefficient indicates the strength of a linear relationship between two
variables, but its value generally does not completely characterize their relationship. In
particular, if the conditional mean of Y given X, denoted E(Y|X), is not linear in X, the correlation
coefficient will not fully determine the form of E(Y|X).
The image on the right shows scatterplots of Anscombe's quartet, a set of four different pairs of
variables created by Francis Anscombe.[12] The four y variables have the same mean (7.5), variance (4.12), correlation (0.816) and regression line (y = 3 + 0.5x). However, as can
be seen on the plots, the distribution of the variables is very different. The first one (top left)
seems to be distributed normally, and corresponds to what one would expect when considering
two variables correlated and following the assumption of normality. The second one (top right) is
not distributed normally; while an obvious relationship between the two variables can be
observed, it is not linear. In this case the Pearson correlation coefficient does not indicate that
there is an exact functional relationship: only the extent to which that relationship can be
approximated by a linear relationship. In the third case (bottom left), the linear relationship is
perfect, except for one outlier which exerts enough influence to lower the correlation coefficient
from 1 to 0.816. Finally, the fourth example (bottom right) shows another example when one
outlier is enough to produce a high correlation coefficient, even though the relationship between
the two variables is not linear.
These examples indicate that the correlation coefficient, as a summary statistic, cannot replace
the individual examination of the data. Note that the examples are sometimes said to demonstrate
that the Pearson correlation assumes that the data follow a normal distribution, but this is not
correct.[13]
If a pair (X, Y) of random variables follows a bivariate normal distribution, the conditional mean
E(X|Y) is a linear function of Y, and the conditional mean E(Y|X) is a linear function of X. The
correlation coefficient r between X and Y, along with the marginal means and variances of X and
Y, determines this linear relationship:

E(Y | X) = E(Y) + r · σY · (X − E(X)) / σX,

where E(X) and E(Y) are the expected values of X and Y, respectively, and σX and σY are the standard deviations of X and Y, respectively.
[edit] Partial correlation
Main article: Partial correlation

If a population or data-set is characterized by more than two variables, a partial correlation coefficient measures the strength of dependence between a pair of variables that is not accounted for by the way in which they both change in response to variations in a selected subset of the other variables.

*Cost-benefit analysis

Cost-benefit analysis is a term that refers both to:


• helping to appraise, or assess, the case for a project, programme or policy proposal;
• an approach to making economic decisions of any kind.
Under both definitions the process involves, whether explicitly or implicitly, weighing the total
expected costs against the total expected benefits of one or more actions in order to choose the
best or most profitable option. The formal process is often referred to as either CBA (Cost-
Benefit Analysis) or BCA (Benefit-Cost Analysis).
Benefits and costs are often expressed in money terms, and are adjusted for the time value of
money, so that all flows of benefits and flows of project costs over time (which tend to occur at
different points in time) are expressed on a common basis in terms of their “present value.”
Closely related, but slightly different, formal techniques include cost-effectiveness analysis,
economic impact analysis, fiscal impact analysis and Social Return on Investment (SROI)
analysis. The latter builds upon the logic of cost-benefit analysis, but differs in that it is explicitly
designed to inform the practical decision-making of enterprise managers and investors focused
on optimizing their social and environmental impacts.
Theory
Cost–benefit analysis is often used by governments to evaluate the desirability of a given
intervention. It is heavily used in today's government. It is an analysis of the cost effectiveness of
different alternatives in order to see whether the benefits outweigh the costs. The aim is to gauge
the efficiency of the intervention relative to the status quo. The costs and benefits of the impacts
of an intervention are evaluated in terms of the public's willingness to pay for them (benefits) or
willingness to pay to avoid them (costs). Inputs are typically measured in terms of opportunity
costs - the value in their best alternative use. The guiding principle is to list all parties affected by
an intervention and place a monetary value of the effect it has on their welfare as it would be
valued by them.
The process involves monetary value of initial and ongoing expenses vs. expected return.
Constructing plausible measures of the costs and benefits of specific actions is often very
difficult. In practice, analysts try to estimate costs and benefits either by using survey methods or
by drawing inferences from market behavior. For example, a product manager may compare
manufacturing and marketing expenses with projected sales for a proposed product and decide to
produce it only if he expects the revenues to eventually recoup the costs. Cost–benefit analysis
attempts to put all relevant costs and benefits on a common temporal footing. A discount rate is
chosen, which is then used to compute all relevant future costs and benefits in present-value
terms. Most commonly, the discount rate used for present-value calculations is an interest rate
taken from financial markets (R.H. Frank 2000). This can be very controversial; for example, a
high discount rate implies a very low value on the welfare of future generations, which may have
a huge impact on the desirability of interventions to help the environment. Empirical studies
suggest that in reality, people's discount rates do decline over time. Because cost–benefit analysis
aims to measure the public's true willingness to pay, this feature is typically built into studies.
During cost–benefit analysis, monetary values may also be assigned to less tangible effects such
as the various risks that could contribute to partial or total project failure, such as loss of
reputation, market penetration, or long-term enterprise strategy alignments. This is especially
true when governments use the technique, for instance to decide whether to introduce business
regulation, build a new road, or offer a new drug through the state healthcare system. In this
case, a value must be put on human life or the environment, often causing great controversy. For
example, the cost–benefit principle says that we should install a guardrail on a dangerous stretch
of mountain road if the dollar cost of doing so is less than the implicit dollar value of the injuries,
deaths, and property damage thus prevented (R.H. Frank 2000).
Cost–benefit calculations typically involve using time value of money formulas. This is usually
done by converting the future expected streams of costs and benefits into a present value amount.
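A minimal sketch of that present-value conversion follows; the cash flows and the 5% discount rate are illustrative assumptions, not recommendations.

```python
def present_value(flows, rate):
    """Discount a stream of yearly amounts (year 0 first) back to present value."""
    return sum(amount / (1 + rate) ** year for year, amount in enumerate(flows))

# Hypothetical project: costs mostly up front, benefits spread over later years.
costs = [1000.0, 200.0, 200.0, 200.0, 200.0]
benefits = [0.0, 450.0, 450.0, 450.0, 450.0]
rate = 0.05

pvc = present_value(costs, rate)       # present value of costs (PVC)
pvb = present_value(benefits, rate)    # present value of benefits (PVB)

print(f"PVC = {pvc:.1f}, PVB = {pvb:.1f}")
print(f"NPV (PVB - PVC) = {pvb - pvc:.1f}")
print(f"BCR (PVB / PVC) = {pvb / pvc:.2f}")
```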
Application and history
The practice of cost–benefit analysis differs between countries and between sectors (e.g.,
transport, health) within countries. Some of the main differences include the types of impacts
that are included as costs and benefits within appraisals, the extent to which impacts are
expressed in monetary terms, and differences in the discount rate between countries. Agencies
across the world rely on a basic set of key cost–benefit indicators, including the following:
• NPV (net present value)
• PVB (present value of benefits)
• PVC (present value of costs)
• BCR (benefit cost ratio = PVB / PVC)
• Net benefit (= PVB - PVC)
• NPV/k (where k is the level of funds available)
The concept of CBA dates back to an 1848 article by Dupuit and was formalized in subsequent
works by Alfred Marshall. The practical application of CBA was initiated in the US by the Corps
of Engineers, after the Federal Navigation Act of 1936 effectively required cost–benefit analysis
for proposed federal waterway infrastructure.[1] The Flood Control Act of 1939 was instrumental
in establishing CBA as federal policy. It specified the standard that "the benefits to whomever they accrue [be] in excess of the estimated costs."[2]
Subsequently, cost–benefit techniques were applied to the development of highway and
motorway investments in the US and UK in the 1950s and 1960s. An early and often-quoted,
more developed application of the technique was made to London Underground's Victoria Line.
Over the last 40 years, cost–benefit techniques have gradually developed to the extent that
substantial guidance now exists on how transport projects should be appraised in many countries
around the world.
In the UK, the New Approach to Appraisal (NATA) was introduced by the then Department for
Transport, Environment and the Regions. This brought together cost–benefit results with those
from detailed environmental impact assessments and presented them in a balanced way. NATA
was first applied to national road schemes in the 1998 Roads Review but subsequently rolled out
to all modes of transport. It is now a cornerstone of transport appraisal in the UK and is
maintained and developed by the Department for Transport.[11]
The EU's 'Developing Harmonised European Approaches for Transport Costing and Project
Assessment' (HEATCO) project, part of its Sixth Framework Programme, has reviewed transport
appraisal guidance across EU member states and found that significant differences exist between
countries. HEATCO's aim is to develop guidelines to harmonise transport appraisal practice
across the EU.[12][13] [3]
Transport Canada has also promoted the use of CBA for major transport investments since the
issuance of its Guidebook in 1994.[4]
More recent guidance has been provided by the United States Department of Transportation and
several state transportation departments, with discussion of available software tools for
application of CBA in transportation, including HERS, BCA.Net, StatBenCost, CalBC, and
TREDIS. Available guides are provided by the Federal Highway Administration[5][6], Federal
Aviation Administration[7], Minnesota Department of Transportation[8], California Department of
Transportation (Caltrans)[9], and the Transportation Research Board Transportation Economics
Committee [10].
In the early 1960s, CBA was also extended to assessment of the relative benefits and costs of
healthcare and education in works by Burton Weisbrod.[11][12] Later, the United States Department
of Health and Human Services issued its CBA Guidebook
Accuracy problems
The accuracy of the outcome of a cost–benefit analysis depends on how accurately costs and
benefits have been estimated.
A peer-reviewed study [14] of the accuracy of cost estimates in transportation infrastructure
planning found that for rail projects actual costs turned out to be on average 44.7 percent higher
than estimated costs, and for roads 20.4 percent higher (Flyvbjerg, Holm, and Buhl, 2002). For
benefits, another peer-reviewed study [15] found that actual rail ridership was on average 51.4
percent lower than estimated ridership; for roads it was found that for half of all projects
estimated traffic was wrong by more than 20 percent (Flyvbjerg, Holm, and Buhl, 2005).
Comparative studies indicate that similar inaccuracies apply to fields other than transportation.
These studies indicate that the outcomes of cost–benefit analyses should be treated with caution
because they may be highly inaccurate. Inaccurate cost–benefit analyses are likely to lead to
inefficient decisions, as defined by Pareto and Kaldor-Hicks efficiency ([16] Flyvbjerg,
Bruzelius, and Rothengatter, 2003).These outcomes (almost always tending to underestimation
unless significant new approaches are overlooked) are to be expected because such estimates:
1. Rely heavily on past like projects (often differing markedly in function or size and
certainly in the skill levels of the team members)
2. Rely heavily on the project's members to identify (remember from their collective past
experiences) the significant cost drivers
3. Rely on very crude heuristics to estimate the money cost of the intangible elements
4. Are unable to completely dispel the usually unconscious biases of the team members
(who often have a vested interest in a decision to go ahead) and the natural psychological
tendency to "think positive" (whatever that involves)
Reference class forecasting was developed to increase accuracy in estimates of costs and
benefits.[14]
Another challenge to cost–benefit analysis comes from determining which costs should be
included in an analysis (the significant cost drivers). This is often controversial because
organizations or interest groups may think that some costs should be included or excluded from a
study.
In the case of the Ford Pinto (where, because of design flaws, the Pinto was liable to burst into
flames in a rear-impact collision), the Ford company's decision was not to issue a recall. Ford's
cost–benefit analysis had estimated that based on the number of cars in use and the probable
accident rate, deaths due to the design flaw would run about $49.5 million (the amount Ford
would pay out of court to settle wrongful death lawsuits). This was estimated to be less than the
cost of issuing a recall ($137.5 million) [17]. In the event, Ford overlooked (or considered
insignificant) the costs of the negative publicity so engendered, which turned out to be quite
significant (because it led to the recall anyway and to measurable losses in sales).
In the field of health economics, some analysts think cost–benefit analysis can be an inadequate
measure because willingness-to-pay methods of determining the value of human life can be
subject to bias according to income inequity. They support use of variants such as cost-utility
analysis and quality-adjusted life year to analyze the effects of health policies.
Use in regulation
Cost-benefit analysis was widely used in the United States under the Bush administration to prevent regulatory initiatives, and there is some debate about whether it is neutral to regulatory initiatives or whether it is anti-regulatory and undervalues human life, health, and the environment.[15] In the
case of environmental and occupational health regulation, it has been argued that if modern cost-
benefit analyses had been applied prospectively to proposed regulations such as removing lead
from gasoline, not turning the Grand Canyon into a hydroelectric dam, and regulating workers'
exposure to vinyl chloride, these regulations would not have been implemented even though they
are considered to be highly successful in retrospect.[15] The Clean Air Act has been cited in
retrospective studies as a case where benefits exceeded costs, but the knowledge of the benefits
(attributable largely to the benefits of reducing particulate pollution) was not available until
many years later.

*CTQ tree
A CTQ tree (Critical to Quality tree) is used to decompose broad customer requirements into
more easily quantified requirements. CTQ Trees are often used in the Six Sigma methodology.
CTQs are derived from customer needs. Customer delight may be an add-on while deriving
Critical To Quality parameters. For cost considerations one may remain focused on customer
needs at the initial stage.
CTQs (Critical to Quality) are the key measurable characteristics of a product or process whose
performance standards or specification limits must be met in order to satisfy the customer. They
align improvement or design efforts with customer requirements.
CTQs represent the product or service characteristics that are defined by the customer (internal
or external). They may include the upper and lower specification limits or any other factors
related to the product or service. A CTQ usually must be interpreted from a qualitative customer
statement to an actionable, quantitative business specification.
To put it in layman's terms, CTQs are what the customer expects of a product... the spoken needs
of the customer. The customer may often express this in plain English, but it is up to the CTQ
expert to convert them to measurable terms using tools such as DFMEA, etc.

*Design of experiments
In general usage, design of experiments (DOE) or experimental design is the design of any
information-gathering exercises where variation is present, whether under the full control of the
experimenter or not. However, in statistics, these terms are usually used for controlled
experiments. Other types of study, and their design, are discussed in the articles on opinion polls
and statistical surveys (which are types of observational study), natural experiments and quasi-
experiments (for example, quasi-experimental design). See Experiment for the distinction
between these types of experiments or studies.
In the design of experiments, the experimenter is often interested in the effect of some process or
intervention (the "treatment") on some objects (the "experimental units"), which may be people,
parts of people, groups of people, plants, animals, materials, etc. Design of experiments is thus a
discipline that has very broad application across all the natural and social sciences.
History of development
[edit] Controlled experimentation on scurvy
In 1747, while serving as surgeon on HM Bark Salisbury, James Lind carried out a controlled
experiment to develop a cure for scurvy.[1]
Lind selected 12 men from the ship, all suffering from scurvy, and divided them into six pairs,
giving each group different additions to their basic diet for a period of two weeks. The treatments
were all remedies that had been proposed at one time or another. They were:
• A quart of cider every day
• Twenty five gutts (drops) of elixir vitriol (sulphuric acid) three times a day upon an
empty stomach,
• One half-pint of seawater every day
• A mixture of garlic, mustard, and horseradish in a lump the size of a nutmeg
• Two spoonfuls of vinegar three times a day
• Two oranges and one lemon every day.
The men who had been given citrus fruits recovered dramatically within a week. One of them
returned to duty after 6 days and the other became nurse to the rest. The others experienced some
improvement, but nothing was comparable to the citrus fruits, which were proved to be
substantially superior to the other treatments.
In this study his subjects' cases "were as similar as I could have them", that is he provided strict
entry requirements to reduce extraneous variation. The men were paired, which provided
blocking. From a modern perspective, the main thing that is missing is randomized allocation of
subjects to treatments.
Principles of experimental design, following Ronald A. Fisher
A methodology for designing experiments was proposed by Ronald A. Fisher, in his innovative
book The Design of Experiments (1935). As an example, he described how to test the hypothesis
that a certain lady could distinguish by flavour alone whether the milk or the tea was first placed
in the cup. While this sounds like a frivolous application, it allowed him to illustrate the most
important ideas of experimental design:
Comparison
In many fields of study it is hard to reproduce measured results exactly. Comparisons between treatments are much more reproducible and are usually preferable. Often one compares against a standard, scientific control, or traditional treatment that acts as baseline.
Randomization
There is an extensive body of mathematical theory that explores the consequences of making the allocation of units to treatments by means of some random mechanism such as tables of random numbers, or the use of randomization devices such as playing cards or dice. Provided the sample size is adequate, the risks associated with random allocation (such as failing to obtain a representative sample in a survey, or having a serious imbalance in a key characteristic between a treatment group and a control group) are calculable and hence can be managed down to an acceptable level. Random does not mean haphazard, and great care must be taken that appropriate random methods are used.
Replication
Measurements are usually subject to variation and uncertainty. Measurements are repeated and full experiments are replicated to help identify the sources of variation and to better estimate the true effects of treatments.
Blocking
Blocking is the arrangement of experimental units into groups (blocks) that are similar to one another. Blocking reduces known but irrelevant sources of variation between units and thus allows greater precision in the estimation of the source of variation under study.
Orthogonality
(Figure: example of an orthogonal factorial design)
Orthogonality concerns the forms of comparison (contrasts) that can be legitimately and efficiently carried out. Contrasts can be represented by vectors and sets of orthogonal contrasts are uncorrelated and independently distributed if the data are normal. Because of this independence, each orthogonal treatment provides different information to the others. If there are T treatments and T – 1 orthogonal contrasts, all the information that can be captured from the experiment is obtainable from the set of contrasts.
Factorial experiments
Use of factorial experiments instead of the one-factor-at-a-time method. These are efficient at evaluating the effects and possible interactions of several factors (independent variables); a short enumeration sketch is given below this list.

Analysis of the design of experiments was built on the foundation of the analysis of variance, a
collection of models in which the observed variance is partitioned into components due to
different factors which are estimated and/or tested.
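As a small illustration of the factorial idea referred to in the list above, the sketch below enumerates a two-level full factorial design for three factors, giving all 2^3 = 8 treatment combinations rather than varying one factor at a time; the factor names are invented.

```python
from itertools import product

# Three hypothetical factors, each at a low (-1) and high (+1) level.
factors = ["temperature", "pressure", "catalyst"]
levels = [-1, +1]

design = list(product(levels, repeat=len(factors)))   # 2**3 = 8 runs of a full factorial

for run, settings in enumerate(design, 1):
    print(f"run {run}: " + ", ".join(f"{f}={s:+d}" for f, s in zip(factors, settings)))
```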
[edit] Example

This example is attributed to Harold Hotelling.[9] It conveys some of the flavor of those aspects
of the subject that involve combinatorial designs.
The weights of eight objects are to be measured using a pan balance and set of standard weights.
Each weighing measures the weight difference between objects placed in the left pan vs. any
objects placed in the right pan by adding calibrated weights to the lighter pan until the balance is
in equilibrium. Each measurement has a random error. The average error is zero; the standard
deviations of the probability distribution of the errors is the same number σ on different
weighings; and errors on different weighings are independent. Denote the true weights by θ1, θ2, ..., θ8.
We consider two different experiments:
1. Weigh each object in one pan, with the other pan empty. Let Xi be the measured weight
of the ith object, for i = 1, ..., 8.
2. Do the eight weighings according to the following schedule and let Yi be the measured
difference for i = 1, ..., 8:
Then the estimated value of the weight θ1 is

Similar estimates can be found for the weights of the other items. For example

The question of design of experiments is: which experiment is better?


The variance of the estimate X1 of θ1 is σ2 if we use the first experiment. But if we use the second
experiment, the variance of the estimate given above is σ2/8. Thus the second experiment gives
us 8 times as much precision for the estimate of a single item, and estimates all items
simultaneously, with the same precision. What is achieved with 8 weighings in the second
experiment would require 64 weighings if items are weighed separately. However, note that the
estimates for the items obtained in the second experiment have errors which are correlated with
each other.
Many problems of the design of experiments involve combinatorial designs, as in this example.
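The variance comparison in this example can be checked by simulation. The sketch below assumes a standard eight-run orthogonal (Hadamard-type) weighing schedule in which object 1 is placed in the left pan on every weighing; since the original schedule table is not reproduced above, that design matrix is an assumption of the sketch, not a quotation of it.

```python
import numpy as np

rng = np.random.default_rng(42)
true_weights = np.array([5.0, 3.0, 8.0, 2.0, 7.0, 4.0, 6.0, 1.0])
sigma = 0.1       # standard deviation of each weighing error
n_sim = 20000     # number of simulated repetitions of each experiment

# Assumed +/-1 design (Sylvester construction of an 8 x 8 Hadamard matrix):
# +1 puts the object in the left pan, -1 in the right pan. Column 0 is all +1,
# so object 1 is weighed in the left pan on every run.
H2 = np.array([[1, 1], [1, -1]])
H = np.kron(np.kron(H2, H2), H2).astype(float)

# Experiment 1: weigh each object separately, so the estimate of theta_1 is just X_1.
est1 = true_weights[0] + rng.normal(0.0, sigma, n_sim)

# Experiment 2: eight combined weighings Y = H @ theta + error; because the
# columns of H are orthogonal, theta_1 is estimated as (Y . H[:, 0]) / 8.
errors = rng.normal(0.0, sigma, (n_sim, 8))
Y = errors + H @ true_weights
est2 = (Y @ H[:, 0]) / 8.0

print(f"variance of estimate, separate weighings: {est1.var():.6f}  (theory: {sigma**2:.6f})")
print(f"variance of estimate, combined weighings: {est2.var():.6f}  (theory: {sigma**2/8:.6f})")
```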
[edit] Statistical control
It is best for a process to be in reasonable statistical control prior to conducting designed
experiments. When this is not possible, proper blocking, replication, and randomization allow for
the careful conduct of designed experiments.[12]
[edit] Experimental designs after Fisher
Some efficient designs for estimating several main effects simultaneously were found by Raj
Chandra Bose and K. Kishen in 1940 at the Indian Statistical Institute, but remained little known
until the Plackett-Burman designs were published in Biometrika in 1946. About the same time,
C. R. Rao introduced the concepts of orthogonal arrays as experimental designs. This was a
concept which played a central role in the development of Taguchi methods by Genichi Taguchi,
which took place during his visit to Indian Statistical Institute in early 1950s. His methods were
successfully applied and adopted by Japanese and Indian industries and subsequently were also
embraced by US industry albeit with some reservations.
In 1950, Gertrude Mary Cox and William Gemmell Cochran published the book Experimental
Designs which became the major reference work on the design of experiments for statisticians
for years afterwards.
Developments of the theory of linear models have encompassed and surpassed the cases that
concerned early writers. Today, the theory rests on advanced topics in linear algebra, algebra and
combinatorics.
As with other branches of statistics, experimental design is pursued using both frequentist and
Bayesian approaches: In evaluating statistical procedures like experimental designs, frequentist
statistics studies the sampling distribution while Bayesian statistics updates a probability
distribution on the parameter space.
Some important contributors to the field of experimental designs are C. S. Peirce, R. A. Fisher,
F. Yates, C. R. Rao, R. C. Bose, J. N. Srivastava, Shrikhande S. S., D. Raghavarao, W. G.
Cochran, O. Kempthorne, W. T. Federer, A. S. Hedayat, J. A. Nelder, R. A. Bailey, J. Kiefer, W.
J. Studden, F. Pukelsheim, D. R. Cox, H. P. Wynn, A. C. Atkinson, G. E. P. Box and G. Taguchi.
The textbooks of D. Montgomery and R. Myers have reached generations of students and practitioners.
*Failure mode and effects analysis
A failure modes and effects analysis (FMEA) is a procedure in product development and
operations management for analysis of potential failure modes within a system for classification
by the severity and likelihood of the failures. A successful FMEA activity helps a team to
identify potential failure modes based on past experience with similar products or processes,
enabling the team to design those failures out of the system with the minimum of effort and
resource expenditure, thereby reducing development time and costs. It is widely used in
manufacturing industries in various phases of the product life cycle and is now increasingly
finding use in the service industry. Failure modes are any errors or defects in a process, design,
or item, especially those that affect the customer, and can be potential or actual. Effects analysis
refers to studying the consequences of those failures.

Basic terms
Failure
" The LOSS of an intended function of a device under stated conditions."
Failure mode
"The manner by which a failure is observed; it generally describes the way the failure
occurs."
Failure effect
Immediate consequences of a failure on operation, function or functionality, or status of
some item
Indenture levels
An identifier for item complexity. Complexity increases as levels are closer to one.
Local effect
The Failure effect as it applies to the item under analysis.
Next higher level effect
The Failure effect as it applies at the next higher indenture level.
End effect
The failure effect at the highest indenture level or total system.
Failure cause
Defects in design, process, quality, or part application, which are the underlying cause of
the failure or which initiate a process which leads to failure.
Severity
"The consequences of a failure mode. Severity considers the worst potential consequence
of a failure, determined by the degree of injury, property damage, or system damage that
could ultimately occur."

FMEA cycle

History
Learning from each failure is both costly and time consuming; it is considered better to first conduct some thought experiments, and FMEA provides a more systematic method of studying failure.
FMEA was formally introduced in the late 1940s for military usage by the US Armed Forces.[2]
Later it was used for aerospace/rocket development to avoid errors in small sample sizes of
costly rocket technology. An example of this is the Apollo Space program. It was also used as
application for HACCP for the Apollo Space Program, and later the food industry in general.[3]
The primary push came during the 1960s, while developing the means to put a man on the moon
and return him safely to earth. In the late 1970s the Ford Motor Company introduced FMEA to
the automotive industry for safety and regulatory consideration after the Pinto affair. They
applied the same approach to processes (PFMEA) to consider potential process induced failures
prior to launching production.
Although initially developed by the military, FMEA methodology is now extensively used in a
variety of industries including semiconductor processing, food service, plastics, software, and
healthcare.[4][5] It is integrated into the Automotive Industry Action Group's (AIAG) Advanced
Product Quality Planning (APQP) process to provide risk mitigation, in both product and process
development phases. Each potential cause must be considered for its effect on the product or
process and, based on the risk, actions are determined and risks revisited after actions are
complete. Toyota has taken this one step further with its Design Review Based on Failure Mode
(DRBFM) approach. The method is now supported by the American Society for Quality which
provides detailed guides on applying the method.[6]
[edit] Implementation
In FMEA, failures are prioritized according to how serious their consequences are, how
frequently they occur and how easily they can be detected. An FMEA also documents current
knowledge and actions about the risks of failures for use in continuous improvement. FMEA is
used during the design stage with an aim to avoid future failures (sometimes called DFMEA in
that case). Later it is used for process control, before and during ongoing operation of the
process. Ideally, FMEA begins during the earliest conceptual stages of design and continues
throughout the life of the product or service.
The outcome of an FMEA development is actions to prevent or reduce the severity or likelihood
of failures, starting with the highest-priority ones. It may be used to evaluate risk management
priorities for mitigating known threat vulnerabilities. FMEA helps select remedial actions that
reduce cumulative impacts of life-cycle consequences (risks) from a systems failure (fault).
It is used in many formal quality systems such as QS-9000 or ISO/TS 16949.
[edit] Using FMEA when designing
FMEA can provide an analytical approach, when dealing with potential failure modes and their
associated causes. When considering possible failures in a design – like safety, cost,
performance, quality and reliability – an engineer can get a lot of information about how to alter
the development/manufacturing process, in order to avoid these failures. FMEA provides an easy tool to determine which risks are of greatest concern, and therefore where action is needed to prevent a problem before it arises. The development of these specifications will ensure the
product will meet the defined requirements.
[edit] The pre-work
The process for conducting an FMEA is straightforward. It is developed in three main phases, in
which appropriate actions need to be defined. But before starting with an FMEA, it is important
to complete some pre-work to confirm that robustness and past history are included in the
analysis.
A robustness analysis can be obtained from interface matrices, boundary diagrams, and
parameter diagrams. A lot of failures are due to noise factors and shared interfaces with other
parts and/or systems, because engineers tend to focus on what they control directly.
To start it is necessary to describe the system and its function. A good understanding simplifies
further analysis. This way an engineer can see which uses of the system are desirable and which
are not. It is important to consider both intentional and unintentional uses. Unintentional uses are
a form of hostile environment.
Then, a block diagram of the system needs to be created. This diagram gives an overview of the
major components or process steps and how they are related. These are called logical relations
around which the FMEA can be developed. It is useful to create a coding system to identify the
different system elements. The block diagram should always be included with the FMEA.
Before starting the actual FMEA, a worksheet needs to be created, which contains the important
information about the system, such as the revision date or the names of the components. On this
worksheet all the items or functions of the subject should be listed in a logical manner, based on
the block diagram.
Example FMEA Worksheet

A single example row, with one entry per worksheet column:

Function: Fill tub
Failure mode: High level sensor never trips
Effects: Liquid spills on customer floor
S (severity rating): 8
Cause(s): Level sensor failed; level sensor disconnected
O (occurrence rating): 2
Current controls: Fill timeout based on time to fill to low level sensor
D (detection rating): 5
CRIT (critical characteristic): N
RPN (risk priority number): 80
Recommended actions: Perform cost analysis of adding an additional sensor halfway between the low and high level sensors
Responsibility and target completion date: Jane Doe, 10-Oct-2010
Action taken: —
[edit] Step 1: Severity


Determine all failure modes based on the functional requirements and their effects. Examples of
failure modes are: electrical short-circuiting, corrosion or deformation. A failure mode in one
component can lead to a failure mode in another component; therefore each failure mode should
be listed in technical terms and by function. The ultimate effect of each failure mode then needs
to be considered. A failure effect is defined as the result of a failure mode on the function of the
system as perceived by the user, so it is convenient to write these effects down in terms of what
the user might see or experience. Examples of failure effects are: degraded performance, noise or
even injury to a user. Each effect is given a severity number (S) from 1 (no danger) to 10
(critical). These numbers help an engineer to prioritize the failure modes and their effects. If the
severity of an effect is rated 9 or 10, actions are considered to change the design by eliminating
the failure mode, if possible, or protecting the user from the effect. A severity rating of 9 or 10 is
generally reserved for those effects which would cause injury to a user or otherwise result in
litigation.
[edit] Step 2: Occurrence
In this step it is necessary to look at the cause of a failure mode and how many times it occurs.
This can be done by looking at similar products or processes and the failure modes that have
been documented for them. A failure cause is looked upon as a design weakness. All the
potential causes for a failure mode should be identified and documented. Again this should be in
technical terms. Examples of causes are: erroneous algorithms, excessive voltage or improper
operating conditions. A failure mode is given an occurrence ranking (O), again 1–10. Actions
need to be determined if the occurrence is high (meaning > 4 for non-safety failure modes and
> 1 when the severity-number from step 1 is 9 or 10). This step is called the detailed
development section of the FMEA process. Occurrence can also be expressed as a percentage; for
example, a non-safety issue that occurs in less than 1% of cases may be rated 1. The exact rating
scale depends on the product and the customer specification.
[edit] Step 3: Detection
When appropriate actions are determined, it is necessary to test their effectiveness. In addition,
design verification is needed and the proper inspection methods need to be chosen. First, an
engineer should look at the current controls of the system that prevent failure modes from
occurring or that detect the failure before it reaches the customer. Then one should
identify testing, analysis, monitoring and other techniques that can be or have been used on
similar systems to detect failures. From these controls an engineer can learn how likely it is for a
failure to be identified or detected. Each combination from the previous 2 steps receives a
detection number (D). This ranks the ability of planned tests and inspections to remove defects or
detect failure modes in time. The assigned detection number measures the risk that the failure
will escape detection. A high detection number indicates that the chances are high that the failure
will escape detection, or in other words, that the chances of detection are low.
After these three basic steps, risk priority numbers (RPN) are calculated
[edit] Risk priority numbers
RPNs do not by themselves determine the choice of an action against a failure mode; rather, they
act as threshold values for evaluating and prioritizing these actions.
After ranking the severity, occurrence and detectability the RPN can be easily calculated by
multiplying these three numbers: RPN = S × O × D
This has to be done for the entire process and/or design. Once this is done it is easy to determine
the areas of greatest concern. The failure modes that have the highest RPN should be given the
highest priority for corrective action. This means it is not always the failure modes with the
highest severity numbers that should be treated first. There could be less severe failures, but
which occur more often and are less detectable.
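
As a minimal sketch of this calculation, the following hypothetical failure modes (the descriptions and ratings are invented for illustration, loosely echoing the worksheet example above) are scored and ranked by RPN:

# Minimal sketch of RPN calculation and ranking (hypothetical data).
# RPN = severity (S) x occurrence (O) x detection (D), each rated 1-10.

failure_modes = [
    # (description, S, O, D)
    ("High level sensor never trips", 8, 2, 5),
    ("Fill valve sticks open",        6, 4, 3),
    ("Drain pump runs continuously",  3, 5, 2),
]

ranked = sorted(
    ((desc, s, o, d, s * o * d) for desc, s, o, d in failure_modes),
    key=lambda row: row[4],
    reverse=True,  # highest RPN first, i.e. highest priority for action
)

for desc, s, o, d, rpn in ranked:
    print(f"RPN {rpn:4d}  (S={s}, O={o}, D={d})  {desc}")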
After these values are allocated, recommended actions with targets, responsibility and dates of
implementation are noted. These actions can include specific inspection, testing or quality
procedures, redesign (such as selection of new components), adding more redundancy and
limiting environmental stresses or operating range. Once the actions have been implemented in
the design/process, the new RPN should be checked, to confirm the improvements. These tests
are often put in graphs, for easy visualization. Whenever a design or a process changes, an
FMEA should be updated.
A few logical but important thoughts come to mind:
• Try to eliminate the failure mode (some failures are more preventable than others)
• Minimize the severity of the failure
• Reduce the occurrence of the failure mode
• Improve the detection
[edit] Timing of FMEA
An FMEA should be created or updated whenever:
• A new cycle begins (new product/process)
• Changes are made to the operating conditions
• A change is made in the design
• New regulations are instituted
• Customer feedback indicates a problem
[edit] Uses of FMEA
• Development of system requirements that minimize the likelihood of failures.
• Development of methods to design and test systems to ensure that the failures have been
eliminated.
• Evaluation of the requirements of the customer to ensure that those do not give rise to
potential failures.
• Identification of design characteristics that contribute to failures, and minimization or
elimination of their effects.
• Tracking and managing potential risks in the design. This helps avoid the same failures in
future projects.
• Ensuring that any failure that could occur will not injure the customer or seriously impact
a system.
• Production of world-class quality products.
[edit] Advantages
• Improve the quality, reliability and safety of a product/process
• Improve company image and competitiveness
• Increase user satisfaction
• Reduce system development time and cost
• Collect information to reduce future failures and capture engineering knowledge
• Reduce the potential for warranty concerns
• Identify and eliminate potential failure modes early
• Emphasize problem prevention
• Minimize late changes and associated cost
• Act as a catalyst for teamwork and idea exchange between functions
• Reduce the possibility of the same kind of failure recurring in the future
[edit] Limitations
Since FMEA is effectively dependent on the members of the committee which examines product
failures, it is limited by their experience of previous failures. If a failure mode cannot be
identified, then external help is needed from consultants who are aware of the many different
types of product failure. FMEA is thus part of a larger system of quality control, where
documentation is vital to implementation. General texts and detailed publications are available in
forensic engineering and failure analysis. It is a general requirement of many specific national
and international standards that FMEA is used in evaluating product integrity. If used as a top-
down tool, FMEA may only identify major failure modes in a system. Fault tree analysis (FTA)
is better suited for "top-down" analysis. When used as a "bottom-up" tool FMEA can augment or
complement FTA and identify many more causes and failure modes resulting in top-level
symptoms. It is not able to discover complex failure modes involving multiple failures within a
subsystem, or to report expected failure intervals of particular failure modes up to the upper level
subsystem or system.[citation needed]
Additionally, the multiplication of the severity, occurrence and detection rankings may result in
rank reversals, where a less serious failure mode receives a higher RPN than a more serious
failure mode.[7] The reason for this is that the rankings are ordinal scale numbers, and
multiplication is not defined for ordinal numbers. The ordinal rankings only say that one ranking
is better or worse than another, but not by how much. For instance, a ranking of "2" may not be
twice as bad as a ranking of "1," or an "8" may not be twice as bad as a "4," but multiplication
treats them as though they are. See Level of measurement for further discussion.
[edit] Software
Most FMEAs are created as a spreadsheet. Specialized FMEA software packages exist that offer
some advantages over spreadsheets.
[edit] Types of FMEA
• Process: analysis of manufacturing and assembly processes
• Design: analysis of products prior to production
• Concept: analysis of systems or subsystems in the early design concept stages
• Equipment: analysis of machinery and equipment design before purchase
• Service: analysis of service industry processes before they are released to impact the
customer
• System: analysis of the global system functions
• Software: analysis of the software functions
*General linear model

The general linear model (GLM) is a statistical linear model. It may be written as[1]

Y = XB + U

where Y is a matrix with a series of multivariate measurements, X is a matrix that might be a
design matrix, B is a matrix containing parameters that are usually to be estimated and U is a
matrix containing errors or noise. The errors are usually assumed to follow a multivariate normal
distribution. If the errors do not follow a multivariate normal distribution, generalized linear
models may be used to relax assumptions about Y and U.
The general linear model incorporates a number of different statistical models: ANOVA,
ANCOVA, MANOVA, MANCOVA, ordinary linear regression, t-test and F-test. If there is only
one column in Y (i.e., one dependent variable) then the model can also be referred to as the
multiple regression model (multiple linear regression).
Hypothesis tests with the general linear model can be made in two ways: multivariate or as
several independent univariate tests. In multivariate tests the columns of Y are tested together,
whereas in univariate tests the columns of Y are tested independently, i.e., as multiple univariate
tests with the same design matrix.
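
As a minimal sketch of how such a model can be estimated, the following uses ordinary least squares on simulated data; the sizes, random seed and noise level are arbitrary assumptions chosen only to make the example self-contained:

# Sketch: estimating B in the general linear model Y = XB + U with least squares.
# The design matrix X and true coefficients are simulated for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n, p, m = 100, 3, 2          # observations, predictors (incl. intercept), responses

X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])  # design matrix
B_true = rng.normal(size=(p, m))                                # true parameters
Y = X @ B_true + 0.1 * rng.normal(size=(n, m))                  # responses + noise

# Least-squares estimate of B (one column of coefficients per response in Y)
B_hat, residuals, rank, _ = np.linalg.lstsq(X, Y, rcond=None)
print(B_hat.round(2))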
Applications
An application of the general linear model appears in the analysis of multiple brain scans in
scientific experiments, where Y contains data from brain scanners and X contains experimental
design variables and confounds. It is usually tested in a univariate way (usually referred to as
mass-univariate in this setting) and is often referred to as statistical parametric mapping.

*Histogram
In statistics, a histogram is a graphical representation showing a visual impression of the
distribution of experimental data. It is an estimate of the probability distribution of a continuous
variable and was first introduced by Karl Pearson.[1] A histogram consists of tabular frequencies,
shown as adjacent rectangles, erected over discrete intervals (bins), with an area equal to the
frequency of the observations in the interval. The height of a rectangle is also equal to the
frequency density of the interval, i.e., the frequency divided by the width of the interval. The
total area of the histogram is equal to the number of data points. A histogram may also be
normalized to display relative frequencies; it then shows the proportion of cases that fall into
each of several categories, with the total area equaling 1. The categories are usually specified as
consecutive, non-overlapping intervals of a variable. The categories (intervals) must be adjacent,
and are often chosen to be of the same size.[2]
Histograms are used to plot density of data, and often for density estimation: estimating the
probability density function of the underlying variable. The total area of a histogram used for
probability density is always normalized to 1. If the lengths of the intervals on the x-axis are all 1,
then a histogram is identical to a relative frequency plot.
An alternative to the histogram is kernel density estimation, which uses a kernel to smooth
samples. This will construct a smooth probability density function, which will in general more
accurately reflect the underlying variable.
The histogram is one of the seven basic tools of quality control

One of the Seven Basic Tools of Quality
First described by: Karl Pearson
Purpose: To roughly assess the probability distribution of a given variable by depicting the
frequencies of observations occurring in certain ranges of values

Etymology

An example histogram of the heights of 31 Black Cherry trees.

The etymology of the word histogram is uncertain. Sometimes it is said to be derived from the
Greek histos 'anything set upright' (as the masts of a ship, the bar of a loom, or the vertical bars
of a histogram); and gramma 'drawing, record, writing'. It is also said that Karl Pearson, who
introduced the term in 1895, derived the name from "historical diagram".[4]

[edit] Examples
As an example we consider data collected by the U.S. Census Bureau on time to travel to work
(2000 census, [1], Table 2). The census found that there were 124 million people who work
outside of their homes. An interesting feature of this graph is that the number recorded for "at
least 15 but less than 20 minutes" is higher than for the bands on either side. This is likely to
have arisen from people rounding their reported journey time.[original research?] This rounding is a
common phenomenon when collecting data from people.
Histogram of travel time, US 2000 census. Area under the curve equals the total number of
cases. This diagram uses Q/width from the table.

Data by absolute numbers

Interval  Width  Quantity  Quantity/width
0         5      4180      836
5         5      13687     2737
10        5      18618     3723
15        5      19634     3926
20        5      17981     3596
25        5      7190      1438
30        5      16369     3273
35        5      3212      642
40        5      4122      824
45        15     9200      613
60        30     6461      215
90        60     3435      57

This histogram shows the number of cases per unit interval: the height of each bar is Q/width, so
the area of each bar represents the number of people in that interval. The area under the curve
represents the total number of cases (124 million). This type of histogram shows absolute
numbers, with Q in thousands.

Histogram of travel time, US 2000 census. Area under the curve equals 1. This diagram uses
Q/total/width from the table.

Data by proportion

Interval  Width  Quantity (Q)  Q/total/width
0         5      4180          0.0067
5         5      13687         0.0221
10        5      18618         0.0300
15        5      19634         0.0316
20        5      17981         0.0290
25        5      7190          0.0116
30        5      16369         0.0264
35        5      3212          0.0052
40        5      4122          0.0066
45        15     9200          0.0049
60        30     6461          0.0017
90        60     3435          0.0005
This histogram differs from the first only in the vertical scale. The height of each bar is the
quantity divided by both the total and the bin width (Q/total/width), so the area of each bar is the
decimal fraction of the total that its category represents, and the total area of all the bars is equal
to 1, the decimal equivalent of 100%. The curve displayed is a simple density estimate. This
version shows proportions, and is also known as a unit area histogram.

In other words, a histogram represents a frequency distribution by means of rectangles whose
widths represent class intervals and whose areas are proportional to the corresponding
frequencies. The intervals are placed together in order to show that the data represented by the
histogram, while exclusive, is also continuous. (E.g., in a histogram it is possible to have two
connecting intervals of 10.5–20.5 and 20.5–33.5, but not two connecting intervals of 10.5–20.5
and 22.5–32.5. Empty intervals are represented as empty and not skipped.)[5]
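
As a small worked check, the two tables above can be recomputed from the interval data; the interval starts, widths and quantities are copied from the tables, and only the Q/width and Q/total/width columns are derived (the printed values match the tables up to rounding):

# Sketch: recomputing the travel-time histogram columns (Q/width and Q/total/width).
# Interval start points, widths and quantities (in thousands) are copied from the tables above.
starts = [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 60, 90]
widths = [5, 5, 5, 5, 5, 5, 5, 5, 5, 15, 30, 60]
counts = [4180, 13687, 18618, 19634, 17981, 7190, 16369, 3212, 4122, 9200, 6461, 3435]

total = sum(counts)
for s, w, q in zip(starts, widths, counts):
    height_abs = q / w           # bar height for the absolute-number histogram
    height_prop = q / total / w  # bar height for the unit-area (density) histogram
    print(f"{s:3d}-{s + w:<3d}  Q/width={height_abs:7.1f}  Q/total/width={height_prop:.4f}")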
[edit] Activities and demonstrations
The SOCR resource pages contain a number of hands-on interactive activities demonstrating the
concept of a histogram, histogram construction and manipulation using Java applets and charts.
[edit] Mathematical definition

An ordinary and a cumulative histogram of the same data. The data shown is a random sample of
10,000 points from a normal distribution with a mean of 0 and a standard deviation of 1.

In a more general mathematical sense, a histogram is a function mi that counts the number of
observations that fall into each of the disjoint categories (known as bins), whereas the graph of a
histogram is merely one way to represent a histogram. Thus, if we let n be the total number of
observations and k be the total number of bins, the histogram mi meets the following condition:

n = m1 + m2 + ... + mk

i.e., the bin counts sum to the total number of observations.
[edit] Cumulative histogram
A cumulative histogram is a mapping that counts the cumulative number of observations in all of
the bins up to the specified bin. That is, the cumulative histogram Mi of a histogram mj is defined
as:

Mi = m1 + m2 + ... + mi

[edit] Number of bins and width


There is no "best" number of bins, and different bin sizes can reveal different features of the data.
Some theoreticians have attempted to determine an optimal number of bins, but these methods
generally make strong assumptions about the shape of the distribution. Depending on the actual
data distribution and the goals of the analysis, different bin widths may be appropriate, so
experimentation is usually needed to determine an appropriate width. There are, however,
various useful guidelines and rules of thumb.[6]
The number of bins k can be assigned directly or can be calculated from a suggested bin width h
as:

k = ⌈(max x − min x) / h⌉

The braces indicate the ceiling function.

Sturges' formula[7]

k = ⌈log2 n⌉ + 1

which implicitly bases the bin sizes on the range of the data, and can perform poorly if n < 30.

Scott's choice[8]

h = 3.5 σ / n^(1/3)

where σ is the sample standard deviation.

Square-root choice

k = ⌈√n⌉

which takes the square root of the number of data points in the sample (used by Excel histograms
and many others).

Freedman–Diaconis' choice[9]

h = 2 IQR(x) / n^(1/3)

which is based on the interquartile range.

Choice based on minimization of an estimated L2 risk function[10]

argmin over h of (2 m̄ − v) / h²

where m̄ and v are the mean and the biased variance of a histogram with bin width h:
m̄ = (1/k) Σ mi and v = (1/k) Σ (mi − m̄)².
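
A minimal sketch of these rules applied to a single simulated sample (the sample itself is an arbitrary assumption) might look as follows:

# Sketch: the bin-count / bin-width rules listed above, applied to one sample.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=0.0, scale=1.0, size=500)
n = x.size
data_range = x.max() - x.min()

k_sqrt    = int(np.ceil(np.sqrt(n)))                 # square-root choice
k_sturges = int(np.ceil(np.log2(n))) + 1             # Sturges' formula
h_scott   = 3.5 * x.std(ddof=1) / n ** (1 / 3)       # Scott's bin width
iqr       = np.subtract(*np.percentile(x, [75, 25])) # interquartile range
h_fd      = 2.0 * iqr / n ** (1 / 3)                 # Freedman-Diaconis bin width

# A suggested width h is converted to a bin count with the ceiling of range / h
k_scott = int(np.ceil(data_range / h_scott))
k_fd    = int(np.ceil(data_range / h_fd))

print(f"sqrt: {k_sqrt}  Sturges: {k_sturges}  Scott: {k_scott}  FD: {k_fd}")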

*Quality function deployment

Quality function deployment (QFD) is a “method to transform user demands into design
quality, to deploy the functions forming quality, and to deploy methods for achieving the design
quality into subsystems and component parts, and ultimately to specific elements of the
manufacturing process,”[1] as described by Dr. Yoji Akao, who originally developed QFD in
Japan in 1966 by combining his work in quality assurance and quality control points with
function deployment used in Value Engineering.
QFD is designed to help planners focus on characteristics of a new or existing product or service
from the viewpoints of market segments, company, or technology-development needs. The
technique yields graphs and matrices.
QFD helps transform customer needs (the voice of the customer [VOC]) into engineering
characteristics (and appropriate test methods) for a product or service, prioritizing each product
or service characteristic while simultaneously setting development targets for the product or service.
Areas of application
QFD House of Quality for Enterprise Product Development Processes

QFD is applied in a wide variety of services, consumer products, military needs (such as the F-35
Joint Strike Fighter[2]), and emerging technology products. The technique is also used to identify
and document competitive marketing strategies and tactics (see example QFD House of Quality
for Enterprise Product Development, at right). QFD is considered a key practice of Design for
Six Sigma (DFSS, as seen in the referenced roadmap).[3] It is also reflected in the ISO
9000:2000 standard, which focuses on customer satisfaction.
Results of QFD have been applied in Japan and elsewhere to deploy the high-impact
controllable factors in Strategic planning and Strategic management (also known as Hoshin
Kanri, Hoshin Planning,[4] or Policy Deployment).
Acquiring market needs by listening to the Voice of Customer (VOC), sorting the needs, and
numerically prioritizing them (using techniques such as the Analytic Hierarchy Process) are the
early tasks in QFD. Traditionally, going to the Gemba (the "real place" where value is created for
the customer) is where these customer needs are evidenced and compiled.
While many books and articles on "how to do QFD" are available, there is a relative paucity of
example matrices available. QFD matrices become highly proprietary due to the high density of
product or service information found therein
History
While originally developed for manufacturing industries, interest in the use of QFD-based ideas
in software development commenced with work by R. J. Thackeray and G. Van Treeck,[5] for
example in Object-oriented programming[6] and use case driven software development.[7]
[edit] Techniques and tools based on QFD
[edit] House of Quality
House of Quality appeared in 1972 in the design of an oil tanker by Mitsubishi Heavy
Industries.[8] Akao has reiterated numerous times that a House of Quality is not QFD; it is just an
example of one tool.[9]
A Flash tutorial exists showing the build process of the traditional QFD "House of Quality"
(HOQ).[10] (Although this example may violate QFD principles, the basic sequence of HOQ
building is illustrative.) There are also free QFD templates available that walk users through the
process of creating a House of Quality.[11]
Other tools extend the analysis beyond quality to cost, technology, reliability, function, parts,
manufacturing, and service deployments.
In addition, the same technique can extend the method into the constituent product subsystems,
configuration items, assemblies, and parts. From these detail level components, fabrication and
assembly process QFD charts can be developed to support statistical process control techniques.
[edit] Pugh concept selection
Pugh Concept Selection can be used in coordination with QFD to select a promising product or
service configuration from among listed alternatives.
[edit] Modular Function Deployment
Modular Function Deployment uses QFD to establish customer requirements and to identify
important design requirements with a special emphasis on modularity.
[edit] Relationship to other techniques
The QFD-associated "Hoshin Kanri" process somewhat resembles Management by objectives
(MBO), but adds a significant element in the goal setting process, called "catchball". Use of
these Hoshin techniques by U.S. companies such as Hewlett-Packard has been successful in
focusing and aligning company resources to follow stated strategic goals throughout an
organizational hierarchy.
Since the early introduction of QFD, the technique has been developed to shorten the time span
and reduce the required group efforts (such as Richard Zultner's Blitz QFD).

*Pareto chart

A Pareto chart, named after Vilfredo Pareto, is a type of chart that contains both bars and a line
graph, where individual values are represented in descending order by bars, and the cumulative
total is represented by the line.
Simple example of a Pareto chart using hypothetical data showing the relative frequency of
reasons for arriving late at work

One of the Seven Basic Tools of Quality
First described by: Joseph M. Juran
Purpose: To assess the most frequently occurring defects by category

The left vertical axis is the frequency of occurrence, but it can alternatively represent cost or
another important unit of measure. The right vertical axis is the cumulative percentage of the
total number of occurrences, total cost, or total of the particular unit of measure. Because the
reasons are in decreasing order, the cumulative function is a concave function. To take the
example above, in order to lower the number of late arrivals by 80%, it is sufficient to address the
first three issues.
The purpose of the Pareto chart is to highlight the most important among a (typically large) set of
factors. In quality control, it often represents the most common sources of defects, the highest
occurring type of defect, or the most frequent reasons for customer complaints, and so on.
These charts can be generated by simple spreadsheet programs, such as OpenOffice.org Calc and
Microsoft Excel, by specialized statistical software tools, and by online quality chart generators.
The Pareto chart is one of the seven basic tools of quality control.
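
A minimal sketch of how such a chart can be produced with a plotting library is shown below; the defect categories and counts are hypothetical:

# Sketch: a Pareto chart from hypothetical defect counts: bars sorted in descending
# order with a cumulative-percentage line on a secondary axis.
import matplotlib.pyplot as plt

defects = {"Scratches": 95, "Dents": 60, "Misalignment": 30, "Discoloration": 10, "Other": 5}
items = sorted(defects.items(), key=lambda kv: kv[1], reverse=True)
labels = [name for name, _ in items]
counts = [count for _, count in items]

total = sum(counts)
cumulative_pct, running = [], 0
for c in counts:
    running += c
    cumulative_pct.append(100.0 * running / total)

positions = range(len(labels))
fig, ax1 = plt.subplots()
ax1.bar(positions, counts)                      # frequency bars, largest first
ax1.set_xticks(list(positions))
ax1.set_xticklabels(labels, rotation=30, ha="right")
ax1.set_ylabel("Count")

ax2 = ax1.twinx()                               # secondary axis for the cumulative line
ax2.plot(list(positions), cumulative_pct, marker="o", color="black")
ax2.set_ylabel("Cumulative %")
ax2.set_ylim(0, 110)

plt.tight_layout()
plt.show()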

*Pick chart

A PICK chart is a Lean Six Sigma tool, developed by Lockheed Martin, for organizing process
improvement ideas and categorizing them during the Identify and Prioritize Opportunities Phase
of a Lean Six Sigma project

Use
When faced with multiple improvement ideas, a PICK chart may be used to determine the most
useful. There are four categories on a 2×2 matrix: the horizontal axis is the scale of payoff (or
benefits), and the vertical axis is ease of implementation. By deciding where an idea falls on the
PICK chart, four proposed project actions are provided: Possible, Implement, Challenge and Kill
(hence the name PICK).
Low Payoff, easy to do - Possible
High Payoff, easy to do - Implement
High Payoff, hard to do - Challenge
Low Payoff, hard to do - Kill
The vertical axis, representing ease of implementation, would typically include some assessment
of cost to implement as well. More expensive actions can be said to be more difficult to
implement.
[edit] Sample PICK chart
              Payoff: Low    Payoff: High
easy to do    Possible       Implement
hard to do    Kill           Challenge
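
A minimal sketch of this categorization is shown below; the ideas, the 1–5 scoring scale and the midpoint threshold are illustrative assumptions rather than part of the method:

# Sketch: placing improvement ideas on a PICK chart. The scores and the midpoint
# threshold (3 on a 1-5 scale) are illustrative assumptions, not part of the method.

ideas = [
    # (idea, payoff 1-5, ease of implementation 1-5)
    ("Reorganize tool crib",       2, 5),
    ("Automate inspection step",   5, 4),
    ("Replace production line",    5, 1),
    ("Re-badge existing reports",  1, 1),
]

def pick_category(payoff, ease, threshold=3):
    high_payoff = payoff >= threshold
    easy = ease >= threshold
    if easy:
        return "Implement" if high_payoff else "Possible"
    return "Challenge" if high_payoff else "Kill"

for idea, payoff, ease in ideas:
    print(f"{pick_category(payoff, ease):10s} {idea}")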

*Process capability

A process is a unique combination of tools, materials, methods, and people engaged in producing
a measurable output; for example, a manufacturing line for machine parts. All processes have
inherent statistical variability which can be evaluated by statistical methods.
The process capability is a measurable property of a process relative to its specification, expressed as a
process capability index (e.g., Cpk or Cpm) or as a process performance index (e.g., Ppk or Ppm).
The output of this measurement is usually illustrated by a histogram and calculations that predict
how many parts will be produced out of specification (OOS).
Process capability is also defined as the capability of a process to meet its purpose as managed
by an organization's management and process definition structures, as described in ISO 15504.
Two parts of process capability are: 1) Measure the variability of the output of a process, and 2)
Compare that variability with a proposed specification or product tolerance
Measure the process
The input of a process usually has at least one or more measurable characteristics that are used to
specify outputs. These can be analyzed statistically; where the output data shows a normal
distribution the process can be described by the process mean (average) and the standard
deviation.
A process needs to be established with appropriate process controls in place. A control chart
analysis is used to determine whether the process is "in statistical control". If the process is not in
statistical control then capability has no meaning. Therefore the process capability involves only
common cause variation and not special cause variation.
A batch of data needs to be obtained from the measured output of the process. The more data that
is included, the more precise the result; however, an estimate can be achieved with as few as 17
data points. This should include the normal variety of production conditions, materials, and
people in the process. With a manufactured product, it is common to include at least three
different production runs, including start-ups.
The process mean (average) and standard deviation are calculated. With a normal distribution,
the "tails" can extend well beyond plus and minus three standard deviations, but this interval
should contain about 99.73% of production output. Therefore for a normal distribution of data
the process capability is often described as the relationship between six standard deviations and
the required specification.
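
As a minimal sketch, assuming the commonly used index definitions Cp = (USL − LSL) / 6σ and Cpk = min(USL − mean, mean − LSL) / 3σ, the capability of a simulated, normally distributed characteristic could be estimated as follows (the specification limits and data are invented):

# Sketch: estimating Cp and Cpk for a normally distributed characteristic.
# Specification limits and the simulated data are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)
measurements = rng.normal(loc=10.02, scale=0.05, size=100)  # stand-in for process output

lsl, usl = 9.85, 10.15          # lower / upper specification limits (assumed)
mean = measurements.mean()
sigma = measurements.std(ddof=1)

cp  = (usl - lsl) / (6 * sigma)                  # potential capability (ignores centering)
cpk = min(usl - mean, mean - lsl) / (3 * sigma)  # actual capability (penalizes off-center mean)

print(f"mean={mean:.3f}  sigma={sigma:.4f}  Cp={cp:.2f}  Cpk={cpk:.2f}")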
[edit] Capability study
The output of a process is expected to meet customer requirements, specifications, or product
tolerances. Engineering can conduct a process capability study to determine the extent to which
the process can meet these expectations.
The ability of a process to meet specifications can be expressed as a single number using a
process capability index or it can be assessed using control charts. Either case requires running
the process to obtain enough measurable output so that engineering is confident that the process
is stable and so that the process mean and variability can be reliably estimated. Statistical process
control defines techniques to properly differentiate between stable processes, processes that are
drifting (experiencing a long-term change in the mean of the output), and processes that are
growing more variable. Process capability indices are only meaningful for processes that are
stable (in a state of statistical control).
For information technology, ISO 15504 specifies a process capability measurement framework
for assessing process capability. This measurement framework defines six levels of process
capability, from none (Capability Level 0) to optimizing processes (Capability Level 5). The
measurement framework has been generalized so that it can be applied to non-IT processes.
There are currently two process reference models covering software and systems. The Capability
Maturity Model in its latest version (CMMI continuous) also follows this approach

*Quantitative marketing research

Quantitative marketing research is the application of quantitative research techniques to the


field of marketing. It has roots in both the positivist view of the world, and the modern marketing
viewpoint that marketing is an interactive process in which both the buyer and seller reach a
satisfying agreement on the "four Ps" of marketing: Product, Price, Place (location) and
Promotion.
As a social research method, it typically involves the construction of questionnaires and scales.
People who respond (respondents) are asked to complete the survey. Marketers use the
information so obtained to understand the needs of individuals in the marketplace, and to create
strategies and marketing plans.
Typical general procedure
Briefly, there are five major steps involved in the research process:
1. Defining the Problem.
2. Research Design.
3. Data Collection.
4. Analysis.
5. Report Writing & presentation.
These steps can be expanded as follows:
1. Problem audit and problem definition - What is the problem? What are the various
aspects of the problem? What information is needed?
2. Conceptualization and operationalization - How exactly do we define the concepts
involved? How do we translate these concepts into observable and measurable
behaviours?
3. Hypothesis specification - What claim(s) do we want to test?
4. Research design specification - What type of methodology to use? - examples:
questionnaire, survey
5. Question specification - What questions to ask? In what order?
6. Scale specification - How will preferences be rated?
7. Sampling design specification - What is the total population? What sample size is
necessary for this population? What sampling method to use? Examples: probability sampling
(cluster sampling, stratified sampling, simple random sampling, multistage sampling,
systematic sampling) and nonprobability sampling (convenience sampling, judgement
sampling, purposive sampling, quota sampling, snowball sampling, etc.)
8. Data collection - Use mail, telephone, internet, mall intercepts
9. Codification and re-specification - Make adjustments to the raw data so it is compatible
with statistical techniques and with the objectives of the research - examples: assigning
numbers, consistency checks, substitutions, deletions, weighting, dummy variables, scale
transformations, scale standardization
10. Statistical analysis - Perform various descriptive and inferential techniques (see below)
on the raw data. Make inferences from the sample to the whole population. Test the
results for statistical significance.
11. Interpret and integrate findings - What do the results mean? What conclusions can be
drawn? How do these findings relate to similar research?
12. Write the research report - Report usually has headings such as: 1) executive summary; 2)
objectives; 3) methodology; 4) main findings; 5) detailed charts and diagrams. Present
the report to the client in a 10 minute presentation. Be prepared for questions.
The design step may involve a pilot study in order to discover any hidden issues. The
codification and analysis steps are typically performed by computer, using software such as DAP
or PSPP. The data collection steps can in some instances be automated, but often require
significant manpower to undertake. Interpretation is a skill mastered only by experience.
[edit] Statistical analysis
The data acquired for quantitative marketing research can be analysed by almost any of the range
of techniques of statistical analysis, which can be broadly divided into descriptive statistics and
statistical inference. An important set of techniques is that related to statistical surveys. In any
instance, an appropriate type of statistical analysis should take account of the various types of
error that may arise, as outlined below.
[edit] Reliability and validity
Research should be tested for reliability, generalizability, and validity. Generalizability is the
ability to make inferences from a sample to the population.
Reliability is the extent to which a measure will produce consistent results. Test-retest reliability
checks how similar the results are if the research is repeated under similar circumstances.
Stability over repeated measures is assessed with the Pearson coefficient. Alternative forms
reliability checks how similar the results are if the research is repeated using different forms.
Internal consistency reliability checks how well the individual measures included in the research
are converted into a composite measure. Internal consistency may be assessed by correlating
performance on two halves of a test (split-half reliability). The value of the Pearson product-
moment correlation coefficient is adjusted with the Spearman-Brown prediction formula to
correspond to the correlation between two full-length tests. A commonly used measure is
Cronbach's α, which is equivalent to the mean of all possible split-half coefficients. Reliability
may be improved by increasing the sample size.
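
As a minimal sketch, Cronbach's α can be computed from a respondent-by-item score matrix as follows; the response matrix here is invented purely to make the example runnable:

# Sketch: Cronbach's alpha for a set of scale items (rows = respondents, columns = items).
# The response matrix is invented for illustration only.
import numpy as np

scores = np.array([
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 3],
    [4, 4, 3, 4],
], dtype=float)

k = scores.shape[1]                              # number of items
item_variances = scores.var(axis=0, ddof=1)      # variance of each individual item
total_variance = scores.sum(axis=1).var(ddof=1)  # variance of the summed scale score

alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
print(f"Cronbach's alpha = {alpha:.3f}")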
Validity asks whether the research measured what it intended to. Content validation (also called
face validity) checks how well the content of the research relates to the variables to be
studied: are the research questions representative of the variables being researched? It is a
demonstration that the items of a test are drawn from the domain being measured. Criterion
validation checks how meaningful the research criteria are relative to other possible criteria.
When the criterion is collected later the goal is to establish predictive validity. Construct
validation checks what underlying construct is being measured. There are three variants of
construct validity. They are convergent validity (how well the research relates to other measures
of the same construct), discriminant validity (how poorly the research relates to measures of
opposing constructs), and nomological validity (how well the research relates to other variables
as required by theory).
Internal validation, used primarily in experimental research designs, checks the relation between
the dependent and independent variables. Did the experimental manipulation of the independent
variable actually cause the observed results? External validation checks whether the
experimental results can be generalized.
Validity implies reliability: a valid measure must be reliable. But reliability does not necessarily
imply validity: a reliable measure need not be valid.
[edit] Types of errors
Random sampling errors:
• sample too small
• sample not representative
• inappropriate sampling method used
• random errors
Research design errors:
• bias introduced
• measurement error
• data analysis error
• sampling frame error
• population definition error
• scaling error
• question construction error
Interviewer errors:
• recording errors
• cheating errors
• questioning errors
• respondent selection error
Respondent errors:
• non-response error
• inability error
• falsification error
Hypothesis errors:
• type I error (also called alpha error)
○ the study results lead to the rejection of the null hypothesis even though it is
actually true
• type II error (also called beta error)
○ the study results lead to the acceptance (non-rejection) of the null hypothesis even
though it is actually false

*Regression analysis

In statistics, regression analysis includes any techniques for modeling and analyzing several
variables, when the focus is on the relationship between a dependent variable and one or more
independent variables. More specifically, regression analysis helps us understand how the typical
value of the dependent variable changes when any one of the independent variables is varied,
while the other independent variables are held fixed. Most commonly, regression analysis
estimates the conditional expectation of the dependent variable given the independent variables
— that is, the average value of the dependent variable when the independent variables are held
fixed. Less commonly, the focus is on a quantile, or other location parameter of the conditional
distribution of the dependent variable given the independent variables. In all cases, the
estimation target is a function of the independent variables called the regression function. In
regression analysis, it is also of interest to characterize the variation of the dependent variable
around the regression function, which can be described by a probability distribution.
Regression analysis is widely used for prediction and forecasting, where its use has substantial
overlap with the field of machine learning. Regression analysis is also used to understand which
among the independent variables are related to the dependent variable, and to explore the forms
of these relationships. In restricted circumstances, regression analysis can be used to infer causal
relationships between the independent and dependent variables.
A large body of techniques for carrying out regression analysis has been developed. Familiar
methods such as linear regression and ordinary least squares regression are parametric, in that the
regression function is defined in terms of a finite number of unknown parameters that are
estimated from the data. Nonparametric regression refers to techniques that allow the regression
function to lie in a specified set of functions, which may be infinite-dimensional.
The performance of regression analysis methods in practice depends on the form of the data-
generating process, and how it relates to the regression approach being used. Since the true form
of the data-generating process is not known, regression analysis depends to some extent on
making assumptions about this process. These assumptions are sometimes (but not always)
testable if a large amount of data is available. Regression models for prediction are often useful
even when the assumptions are moderately violated, although they may not perform optimally.
However, in many applications, especially with small effects or questions of causality based on
observational data, regression methods can give misleading results.
History
The earliest form of regression was the method of least squares (French: méthode des moindres
carrés), which was published by Legendre in 1805,[3] and by Gauss in 1809.[4] Legendre and
Gauss both applied the method to the problem of determining, from astronomical observations,
the orbits of bodies about the Sun. Gauss published a further development of the theory of least
squares in 1821,[5] including a version of the Gauss–Markov theorem.
The term "regression" was coined by Francis Galton in the nineteenth century to describe a
biological phenomenon. The phenomenon was that the heights of descendants of tall ancestors
tend to regress down towards a normal average (a phenomenon also known as regression toward
the mean).[6][7] For Galton, regression had only this biological meaning,[8][9] but his work was later
extended by Udny Yule and Karl Pearson to a more general statistical context.[10][11] In the work
of Yule and Pearson, the joint distribution of the response and explanatory variables is assumed
to be Gaussian. This assumption was weakened by R.A. Fisher in his works of 1922 and
1925.[12][13][14] Fisher assumed that the conditional distribution of the response variable is
Gaussian, but the joint distribution need not be. In this respect, Fisher's assumption is closer to
Gauss's formulation of 1821.
Regression methods continue to be an area of active research. In recent decades, new methods
have been developed for robust regression, regression involving correlated responses such as
time series and growth curves, regression in which the predictor or response variables are curves,
images, graphs, or other complex data objects, regression methods accommodating various types
of missing data, nonparametric regression, Bayesian methods for regression, regression in which
the predictor variables are measured with error, regression with more predictor variables than
observations, and causal inference with regression.
[edit] Regression models
Regression models involve the following variables:
• The unknown parameters denoted as β; this may be a scalar or a vector.
• The independent variables, X.
• The dependent variable, Y.
In various fields of application, different terminologies are used in place of dependent and
independent variables.
A regression model relates Y to a function of X and β:

Y ≈ f(X, β)

The approximation is usually formalized as E(Y | X) = f(X, β). To carry out regression analysis,
the form of the function f must be specified. Sometimes the form of this function is based on
knowledge about the relationship between Y and X that does not rely on the data. If no such
knowledge is available, a flexible or convenient form for f is chosen.
Assume now that the vector of unknown parameters β is of length k. In order to perform a
regression analysis the user must provide information about the dependent variable Y:
• If N data points of the form (Y,X) are observed, where N < k, most classical approaches to
regression analysis cannot be performed: since the system of equations defining the
regression model is underdetermined, there is not enough data to recover β.
• If exactly N = k data points are observed, and the function f is linear, the equations
Y = f(X, β) can be solved exactly rather than approximately. This reduces to solving a set
of N equations with N unknowns (the elements of β), which has a unique solution as long
as the X are linearly independent. If f is nonlinear, a solution may not exist, or many
solutions may exist.
• The most common situation is where N > k data points are observed. In this case, there is
enough information in the data to estimate a unique value for β that best fits the data in
some sense, and the regression model when applied to the data can be viewed as an
overdetermined system in β.
In the last case, the regression analysis provides the tools for:
1. Finding a solution for unknown parameters β that will, for example, minimize the
distance between the measured and predicted values of the dependent variable Y (also
known as method of least squares).
2. Under certain statistical assumptions, the regression analysis uses the surplus of
information to provide statistical information about the unknown parameters β and
predicted values of the dependent variable Y.
[edit] Necessary number of independent measurements
Consider a regression model which has three unknown parameters, β0, β1, and β2. Suppose an
experimenter performs 10 measurements all at exactly the same value of independent variable
vector X (which contains the independent variables X1, X2, and X3). In this case, regression
analysis fails to give a unique set of estimated values for the three unknown parameters; the
experimenter did not provide enough information. The best one can do is to estimate the average
value and the standard deviation of the dependent variable Y. Similarly, measuring at two
different values of X would give enough data for a regression with two unknowns, but not for
three or more unknowns.
If the experimenter had performed measurements at three different values of the independent
variable vector X, then regression analysis would provide a unique set of estimates for the three
unknown parameters in β.
In the case of general linear regression, the above statement is equivalent to the requirement that
the matrix XᵀX is invertible.
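
A minimal numerical sketch of this requirement: with all measurements taken at a single setting of the independent variables, XᵀX is rank-deficient (singular), whereas measurements at enough distinct settings give it full rank. The particular settings below are arbitrary:

# Sketch: rank of X'X when measurements are taken at one setting versus several settings.
import numpy as np

x_setting = np.array([1.0, 2.0, -0.5])            # one fixed value of the independent variables
X_repeated = np.tile(x_setting, (10, 1))          # 10 measurements, all at the same setting
X_varied   = np.vstack([x_setting,
                        [0.5, 1.0,  2.0],
                        [2.0, 0.0, -1.0]])        # 3 measurements at 3 different settings

for name, X in [("repeated", X_repeated), ("varied", X_varied)]:
    xtx = X.T @ X
    rank = np.linalg.matrix_rank(xtx)             # full rank is needed for a unique estimate
    print(f"{name:8s}  rank(X'X) = {rank} of {xtx.shape[0]}")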
[edit] Statistical assumptions
When the number of measurements, N, is larger than the number of unknown parameters, k, and
the measurement errors εi are normally distributed then the excess of information contained in (N
- k) measurements is used to make statistical predictions about the unknown parameters. This
excess of information is referred to as the degrees of freedom of the regression.
[edit] Underlying assumptions
Classical assumptions for regression analysis include:
• The sample is representative of the population for the inference prediction.
• The error is a random variable with a mean of zero conditional on the explanatory
variables.
• The independent variables are measured with no error. (Note: If this is not so, modeling
may be done instead using errors-in-variables model techniques).
• The predictors are linearly independent, i.e. it is not possible to express any predictor as a
linear combination of the others. See Multicollinearity.
• The errors are uncorrelated, that is, the variance-covariance matrix of the errors is
diagonal and each non-zero element is the variance of the error.
• The variance of the error is constant across observations (homoscedasticity). (Note: If
not, weighted least squares or other methods might instead be used).
These are sufficient conditions for the least-squares estimator to possess desirable properties, in
particular, these assumptions imply that the parameter estimates will be unbiased, consistent, and
efficient in the class of linear unbiased estimators. It is important to note that actual data rarely
satisfies the assumptions. That is, the method is used even though the assumptions are not true.
Variation from the assumptions can sometimes be used as a measure of how far the model is
from being useful. Many of these assumptions may be relaxed in more advanced treatments.
Reports of statistical analyses usually include analyses of tests on the sample data and
methodology for the fit and usefulness of the model.
Assumptions include the geometrical support of the variables[clarification needed] (Cressie, 1996).
Independent and dependent variables often refer to values measured at point locations. There
may be spatial trends and spatial autocorrelation in the variables that violates statistical
assumptions of regression. Geographic weighted regression is one technique to deal with such
data (Fotheringham et al., 2002). Also, variables may include values aggregated by areas. With
aggregated data the Modifiable Areal Unit Problem can cause extreme variation in regression
parameters (Fotheringham and Wong, 1991). When analyzing data aggregated by political
boundaries, postal codes or census areas results may be very different with a different choice of
units.
[edit] Linear regression
Main article: Linear regression

In linear regression, the model specification is that the dependent variable, yi, is a linear
combination of the parameters (but need not be linear in the independent variables). For
example, in simple linear regression for modeling n data points there is one independent
variable, xi, and two parameters, β0 and β1:

straight line: yi = β0 + β1xi + εi,   i = 1, ..., n

In multiple linear regression, there are several independent variables or functions of independent
variables. For example, adding a term in xi² to the preceding regression gives:

parabola: yi = β0 + β1xi + β2xi² + εi,   i = 1, ..., n

This is still linear regression; although the expression on the right hand side is quadratic in the
independent variable xi, it is linear in the parameters β0, β1 and β2.

In both cases, εi is an error term and the subscript i indexes a particular observation. Given a
random sample from the population, we estimate the population parameters and obtain the
sample linear regression model:

ŷi = β̂0 + β̂1xi

The residual, ei = yi − ŷi, is the difference between the value of the dependent variable predicted
by the model, ŷi, and the true value of the dependent variable, yi. One method of estimation is
ordinary least squares. This method obtains parameter estimates that minimize the sum of
squared residuals, SSE,[15][16] also sometimes denoted RSS:

SSE = Σ ei²  (summed over i = 1, ..., n)

Minimization of this function results in a set of normal equations, a set of simultaneous linear
equations in the parameters, which are solved to yield the parameter estimators β̂0 and β̂1.

Illustration of linear regression on a data set.

In the case of simple regression, the formulas for the least squares estimates are

β̂1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²   and   β̂0 = ȳ − β̂1x̄

where x̄ is the mean (average) of the x values and ȳ is the mean of the y values. See simple linear
regression for a derivation of these formulas and a numerical example. Under the assumption
that the population error term has a constant variance, the estimate of that variance is given by:

σ̂ε² = SSE / (n − 2)

This is called the mean square error (MSE) of the regression. The standard errors of the
parameter estimates are given by

SE(β̂0) = σ̂ε √(1/n + x̄² / Σ(xi − x̄)²)
SE(β̂1) = σ̂ε / √(Σ(xi − x̄)²)

Under the further assumption that the population error term is normally distributed, the
researcher can use these estimated standard errors to create confidence intervals and conduct
hypothesis tests about the population parameters.
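
A minimal sketch of these formulas on simulated data (the true line and noise level are arbitrary assumptions chosen only to make the example self-contained) is:

# Sketch: ordinary least squares for simple linear regression, using the formulas above.
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0, 10, 30)
y = 2.0 + 0.5 * x + rng.normal(scale=0.4, size=x.size)   # true line: y = 2 + 0.5x + noise

n = x.size
x_bar, y_bar = x.mean(), y.mean()

b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)   # slope estimate
b0 = y_bar - b1 * x_bar                                             # intercept estimate

residuals = y - (b0 + b1 * x)
sse = np.sum(residuals ** 2)                 # sum of squared residuals
mse = sse / (n - 2)                          # estimate of the error variance

se_b1 = np.sqrt(mse / np.sum((x - x_bar) ** 2))
se_b0 = np.sqrt(mse * (1 / n + x_bar ** 2 / np.sum((x - x_bar) ** 2)))

print(f"b0={b0:.3f} (SE {se_b0:.3f})   b1={b1:.3f} (SE {se_b1:.3f})   MSE={mse:.4f}")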
[edit] General linear model
In the more general multiple regression model, there are p independent variables:

yi = β1xi1 + β2xi2 + ... + βpxip + εi

where xij is the ith observation on the jth independent variable. The least square parameter
estimates are obtained from p normal equations. The residual can be written as

ei = yi − β̂1xi1 − β̂2xi2 − ... − β̂pxip

The normal equations are

Σi Σk xij xik β̂k = Σi xij yi,   for j = 1, ..., p

Note that the normal equations depicted above contain no separate intercept term β0; if an
intercept is required, it is included by setting xi1 = 1 for all i, so that β1 plays the role of the
intercept. Thus in what follows the intercept, where present, is simply one of the p parameters.

In matrix notation, the normal equations for k responses (usually k = 1) are written as

(XᵀX) B̂ = Xᵀ Y

with generalized inverse (⁻) solution, subscripts showing matrix dimensions:

B̂ (p×k) = (XᵀX)⁻ Xᵀ Y,   where X is n×p and Y is n×k.

For more detailed derivation, see linear least squares, and for a numerical example, see linear
regression (example).
[edit] Regression diagnostics
Once a regression model has been constructed, it may be important to confirm the goodness of fit
of the model and the statistical significance of the estimated parameters. Commonly used checks
of goodness of fit include the R-squared, analyses of the pattern of residuals and hypothesis
testing. Statistical significance can be checked by an F-test of the overall fit, followed by t-tests
of individual parameters.
Interpretations of these diagnostic tests rest heavily on the model assumptions. Although
examination of the residuals can be used to invalidate a model, the results of a t-test or F-test are
sometimes more difficult to interpret if the model's assumptions are violated. For example, if the
error term does not have a normal distribution, in small samples the estimated parameters will
not follow normal distributions, which complicates inference. With relatively large samples,
however, a central limit theorem can be invoked such that hypothesis testing may proceed using
asymptotic approximations.
[edit] Regression with "limited dependent" variables
The phrase "limited dependent" is used in econometric statistics for categorical and constrained
variables.
The response variable may be non-continuous ("limited" to lie on some subset of the real line).
For binary (zero or one) variables, if analysis proceeds with least-squares linear regression, the
model is called the linear probability model. Nonlinear models for binary dependent variables
include the probit and logit model. The multivariate probit model is a standard method of
estimating a joint relationship between several binary dependent variables and some independent
variables. For categorical variables with more than two values there is the multinomial logit. For
ordinal variables with more than two values, there are the ordered logit and ordered probit
models. Censored regression models may be used when the dependent variable is only
sometimes observed, and Heckman correction type models may be used when the sample is not
randomly selected from the population of interest. An alternative to such procedures is linear
regression based on polychoric correlation (or polyserial correlations) between the categorical
variables. Such procedures differ in the assumptions made about the distribution of the variables
in the population. If the variable is positive with low values and represents the repetition of the
occurrence of an event, then count models like the Poisson regression or the negative binomial
model may be used.
[edit] Interpolation and extrapolation
Regression models predict a value of the Y variable given known values of the X variables.
Prediction within the range of values in the dataset used for model-fitting is known informally as
interpolation. Prediction outside this range of the data is known as extrapolation. Performing
extrapolation relies strongly on the regression assumptions. The further the extrapolation goes
outside the data, the more room there is for the model to fail due to differences between the
assumptions and the sample data or the true values. For example, a regression model failing the
homoscedasticity assumption will make bigger mistakes the further out the extrapolation
is.[citation needed]
It is generally advised[citation needed] that when performing extrapolation, one should accompany the
estimated value of the dependent variable with a prediction interval that represents the
uncertainty. Such intervals tend to expand rapidly as the values of the independent variable(s)
move outside the range covered by the observed data.
For such reasons and others, some tend to say that it might be unwise to undertake
extrapolation.[17]

However, this does not cover the full set of modelling errors that may be being made: in
particular, the assumption of a particular form for the relation between Y and X. A properly
conducted regression analysis will include an assessment of how well the assumed form is
matched by the observed data, but it can only do so within the range of values of the independent
variables actually available. This means that any extrapolation is particularly reliant on the
assumptions being made about the structural form of the regression relationship. Best-practice
advice here is that a linear-in-variables and linear-in-parameters relationship should not be
chosen simply for computational convenience, but that all available knowledge should be
deployed in constructing a regression model. If this knowledge includes the fact that the
dependent variable cannot go outside a certain range of values, this can be made use of in
selecting the model — even if the observed dataset has no values particularly near such bounds.
The implications of this step of choosing an appropriate functional form for the regression can be
great when extrapolation is considered. At a minimum, it can ensure that any extrapolation
arising from a fitted model is "realistic" (or in accord with what is known).
[edit] Nonlinear regression
Main article: Nonlinear regression

When the model function is not linear in the parameters, the sum of squares must be minimized
by an iterative procedure. This introduces many complications which are summarized in
Differences between linear and non-linear least squares
[edit] Power and sample size calculations
There are no generally agreed methods for relating the number of observations versus the
number of independent variables in the model. One rule of thumb suggested by Good and Hardin
is N = m^n, where N is the sample size, n is the number of independent variables and m is the
number of observations needed to reach the desired precision if the model had only one
independent variable.[18] For example, a researcher is building a linear regression model using a
dataset that contains 1000 patients (N). If he decides that five observations are needed to
precisely define a straight line (m), then the maximum number of independent variables the
model can support is 4, because log(1000) / log(5) = 4.29.
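
The arithmetic in this example can be checked directly; a tiny sketch:

# Sketch: the Good and Hardin rule of thumb N = m**n, solved for the number of
# independent variables n given a sample size N and m observations per variable.
import math

N = 1000   # patients in the dataset
m = 5      # observations judged necessary per independent variable

n_max = math.floor(math.log(N) / math.log(m))   # log(1000)/log(5) = 4.29 -> 4 variables
print(n_max)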


[edit] Other methods
Although the parameters of a regression model are usually estimated using the method of least
squares, other methods which have been used include:
• Bayesian methods, e.g. Bayesian linear regression
• Least absolute deviations, which is more robust in the presence of outliers and leads to quantile regression
• Nonparametric regression, which requires a large number of observations and is computationally intensive
• Distance metric learning, which is learned by searching for a meaningful distance metric in a given input space.[19]
Software
All major statistical software packages perform least squares regression analysis and inference.
Simple linear regression and multiple regression using least squares can be done in some
spreadsheet applications and on some calculators. While many statistical software packages can
perform various types of nonparametric and robust regression, these methods are less
standardized; different software packages implement different methods, and a method with a
given name may be implemented differently in different packages. Specialized regression
software has been developed for use in fields such as survey analysis and neuroimaging.

*Root cause analysis

Root cause analysis (RCA) is a class of problem solving methods aimed at identifying the root
causes of problems or incidents. The practice of RCA is predicated on the belief that problems
are best solved by attempting to correct or eliminate root causes, as opposed to merely
addressing the immediately obvious symptoms. By directing corrective measures at root causes,
it is hoped that the likelihood of problem recurrence will be minimized. However, it is
recognized that complete prevention of recurrence by a single intervention is not always
possible. Thus, RCA is often considered to be an iterative process, and is frequently viewed as a
tool of continuous improvement.
RCA is initially a reactive method of problem detection and solving, meaning that the analysis is done after an incident has occurred. With accumulated expertise, it can become a proactive method, able to forecast the possibility of an incident before it occurs. Although one often follows the other, RCA is a process completely separate from Incident Management.
Root cause analysis is not a single, sharply defined methodology; there are many different tools,
processes, and philosophies of RCA in existence. However, most of these can be classed into five very broadly defined "schools", named here by their basic fields of origin: safety-based, production-based, process-based, failure-based, and systems-based.
• Safety-based RCA descends from the fields of accident analysis and occupational safety
and health.
• Production-based RCA has its origins in the field of quality control for industrial
manufacturing.
• Process-based RCA is basically a follow-on to production-based RCA, but with a scope
that has been expanded to include business processes.
• Failure-based RCA is rooted in the practice of failure analysis as employed in
engineering and maintenance.
• Systems-based RCA has emerged as an amalgamation of the preceding schools, along
with ideas taken from fields such as change management, risk management, and systems
analysis.
Despite the seeming disparity in purpose and definition among the various schools of root cause
analysis, there are some general principles that could be considered as universal. Similarly, it is
possible to define a general process for performing RCA.
General principles of root cause analysis
1. The primary aim of RCA is to identify the root cause of a problem in order to create effective corrective actions that will prevent that problem from ever recurring, otherwise known as the '100 year fix'.
2. To be effective, RCA must be performed systematically as an investigation, with conclusions and the root cause backed up by documented evidence.
3. There is always one true root cause for any given problem; the difficult part is having the stamina to reach it.
4. To be effective the analysis must establish a sequence of events or timeline to understand
the relationships between contributory factors, the root cause and the defined problem.
5. Root cause analysis can help to transform an old culture that reacts to problems into a new culture that solves problems before they escalate and, more importantly, reduces the instances of problems occurring over time within the environment where the RCA process is operated.
General process for performing and documenting an RCA-based Corrective Action
Notice that RCA (in steps 3, 4 and 5) forms the most critical part of successful corrective action,
because it directs the corrective action at the true root cause of the problem. The root cause is
secondary to the goal of prevention, but without knowing the root cause, we cannot determine
what an effective corrective action for the defined problem will be.
1. Define the problem.
2. Gather data/evidence.
3. Ask why and identify the true root cause associated with the defined problem.
4. Identify corrective action(s) that will prevent recurrence of the problem (your 100 year
fix).
5. Identify effective solutions that prevent recurrence, are within your control, meet your
goals and objectives and do not cause other problems.
6. Implement the recommendations.
7. Observe the recommended solutions to ensure effectiveness.
8. Apply variability reduction methodology for problem solving and problem avoidance.
Root cause analysis techniques
• Barrier analysis - a technique used particularly in process industries. It is based on tracing energy flows, with a focus on barriers to those flows, to identify how and why the barriers did not prevent the energy flows from causing harm.
• Bayesian inference
• Causal factor tree analysis - a technique based on displaying causal factors in a tree-
structure such that cause-effect dependencies are clearly identified.
• Change analysis - an investigation technique often used for problems or accidents. It is
based on comparing a situation that does not exhibit the problem to one that does, in
order to identify the changes or differences that might explain why the problem occurred.
• Current Reality Tree - a method developed by Eliyahu M. Goldratt in his theory of constraints that guides an investigator to identify and relate all root causes using a cause-effect tree whose elements are bound by rules of logic (Categories of Legitimate Reservation). The CRT begins with a brief list of the undesirable things we see around us, and then guides us towards one or more root causes. This method is particularly powerful when the system is complex, there is no obvious link between the observed undesirable things, and a deep understanding of the root cause(s) is desired.
• Failure mode and effects analysis
• Fault tree analysis
• 5 Whys
• Ishikawa diagram, also known as the fishbone diagram or cause-and-effect diagram. The Ishikawa diagram is a method preferred by many project managers for conducting RCA, mainly due to its simplicity compared with the complexity of the other methods.[1]
• Pareto analysis (a small computational sketch follows this list)
• RPR Problem Diagnosis - An ITIL-aligned method for diagnosing IT problems.
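Of the techniques above, Pareto analysis lends itself to the computational sketch promised earlier. The defect counts below are purely hypothetical; the point is simply to rank causes and accumulate their share of the total so the "vital few" stand out.

# Minimal Pareto-analysis sketch with hypothetical defect counts per cause.
defect_counts = {"solder bridge": 120, "missing part": 45, "scratch": 30,
                 "misalignment": 15, "other": 10}

total = sum(defect_counts.values())
cumulative = 0.0
for cause, count in sorted(defect_counts.items(), key=lambda kv: kv[1], reverse=True):
    cumulative += 100.0 * count / total
    print(f"{cause:15s} {count:4d}  cumulative {cumulative:5.1f}%")
# The causes that together account for roughly 80% of defects are the "vital few".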
Common cause analysis (CCA) and common modes analysis (CMA) are evolving engineering techniques for complex technical systems. They aim to determine whether common root causes in hardware, software or highly integrated systems interaction may contribute to human error or improper operation of a system. Systems are analyzed for root causes and causal factors to determine the probability of failure modes, fault modes, or common-mode software faults due to escaped requirements. Complete testing and verification are also used to ensure that complex systems are designed with no common causes that lead to severe hazards. Common cause analysis is sometimes required as part of the safety engineering tasks for theme parks, commercial/military aircraft, spacecraft, complex control systems, large electrical utility grids, nuclear power plants, automated industrial controls, medical devices and other safety-critical systems with complex functionality.
Basic elements of root cause
• Materials
○ Defective raw material
○ Wrong type for job
○ Lack of raw material
• Manpower
○ Inadequate capability
○ Lack of Knowledge
○ Lack of skill
○ Stress
○ Improper motivation
• Machine / Equipment
○ Incorrect tool selection
○ Poor maintenance or design
○ Poor equipment or tool placement
○ Defective equipment or tool
• Environment
○ Orderly workplace
○ Job design or layout of work
○ Surfaces poorly maintained
○ Physical demands of the task
○ Forces of nature
• Management
○ No or poor management involvement
○ Inattention to task
○ Task hazards not guarded properly
○ Other (horseplay, inattention....)
○ Stress demands
○ Lack of Process
○ Lack of Communication
• Methods
○ No or poor procedures
○ Practices are not the same as written procedures
○ Poor communication
• Management system
○ Training or education lacking
○ Poor employee involvement
○ Poor recognition of hazard
○ Previously identified hazards were not eliminated
*Run chart

A simple run chart showing data collected over time. The median of the observed data (73) is
also shown on the chart.

A run chart, also known as a run-sequence plot, is a graph that displays observed data in a time sequence. Often, the data displayed represent some aspect of the output or performance of a manufacturing or other business process.
Overview
Run sequence plots[1] are an easy way to graphically summarize a univariate data set. A common assumption about univariate data sets is that they behave like:[2]
• random drawings;
• from a fixed distribution;
• with a common location; and
• with a common scale.
With run sequence plots, shifts in location and scale are typically quite evident. Also, outliers can
easily be detected.
Run chart of eight random walks in one dimension starting at 0. The plot shows the current
position on the line (vertical axis) versus the time steps (horizontal axis).

Examples could include measurements of the fill level of bottles filled at a bottling plant or the
water temperature of a dishwashing machine each time it is run. Time is generally represented on
the horizontal (x) axis and the property under observation on the vertical (y) axis. Often, some
measure of central tendency (mean or median) of the data is indicated by a horizontal reference
line.
Run charts are analyzed to find anomalies in data that suggest shifts in a process over time or
special factors that may be influencing the variability of a process. Typical factors considered
include unusually long "runs" of data points above or below the average line, the total number of
such runs in the data set, and unusually long series of consecutive increases or decreases.[3]
Run charts are similar in some regards to the control charts used in statistical process control, but do not show the control limits of the process. They are therefore simpler to produce, but do not allow for the full range of analytic techniques supported by control charts.
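A minimal sketch of such a chart, assuming numpy and matplotlib are available and using a small hypothetical data set: it plots the observations in time order, draws the median as a reference line, and counts the runs of consecutive points on the same side of the median.

# Minimal run chart sketch: time-ordered points, a median line, and a run count.
import numpy as np
import matplotlib.pyplot as plt

data = np.array([71, 74, 73, 75, 72, 70, 76, 73, 74, 72, 69, 75])  # hypothetical
median = np.median(data)

plt.plot(range(1, len(data) + 1), data, marker="o")
plt.axhline(median, linestyle="--", label=f"median = {median}")
plt.xlabel("observation order")
plt.ylabel("measured value")
plt.legend()
plt.show()

# Count runs: maximal stretches of consecutive points on the same side of the median.
signs = np.sign(data - median)
signs = signs[signs != 0]                       # points exactly on the median are skipped
runs = 1 + int(np.sum(signs[1:] != signs[:-1]))
print("number of runs about the median:", runs)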

*SIPOC

SIPOC is a Six Sigma tool. The acronym SIPOC stands for Suppliers, Inputs, Process, Outputs, Customers. A SIPOC is completed most easily by starting from the right ("Customers") and working towards the left.
For example:
1. Suppliers - grocers and vendors
2. Inputs - ingredients for recipes
3. Process - cooking at a restaurant kitchen
4. Outputs - meals served
5. Customers - diners at a restaurant
A number of Six Sigma SIPOC references are available, including isixsigma.com [1] and opensourcesixsigma.com.
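As a small illustration only, the restaurant example above can be captured in a simple mapping, filled in from right (Customers) to left (Suppliers) as the text suggests:

# Minimal sketch: the restaurant SIPOC represented as a plain mapping.
sipoc = {
    "Customers": ["diners at a restaurant"],
    "Outputs":   ["meals served"],
    "Process":   ["cooking at a restaurant kitchen"],
    "Inputs":    ["ingredients for recipes"],
    "Suppliers": ["grocers and vendors"],
}
for element in ["Suppliers", "Inputs", "Process", "Outputs", "Customers"]:
    print(f"{element}: {', '.join(sipoc[element])}")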

*Taguchi methods

Taguchi methods are statistical methods developed by Genichi Taguchi to improve the quality of manufactured goods, and more recently also applied to engineering,[1] biotechnology,[2][3] marketing and advertising.[4] Professional statisticians have welcomed the goals and
improvements brought about by Taguchi methods, particularly by Taguchi's development of
designs for studying variation, but have criticized the inefficiency of some of Taguchi's
proposals.[5]
Taguchi's work includes three principal contributions to statistics:
• A specific loss function — see Taguchi loss function;
• The philosophy of off-line quality control; and
• Innovations in the design of experiments.
Loss functions
Loss functions in statistical theory
Traditionally, statistical methods have relied on mean-unbiased estimators of treatment effects:
Under the conditions of the Gauss-Markov theorem, least squares estimators have minimum
variance among all mean-unbiased estimators. The emphasis on comparisons of means also
draws (limiting) comfort from the law of large numbers, according to which the sample means
converge to the true mean. Fisher's textbook on the design of experiments emphasized
comparisons of treatment means.
Gauss proved that the sample mean minimizes the expected squared-error loss function (while Laplace proved that a median-unbiased estimator minimizes the absolute-error loss function). In
statistical theory, the central role of the loss function was renewed by the statistical decision
theory of Abraham Wald.
However, loss functions were avoided by Ronald A. Fisher.[6]
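A minimal numeric illustration of the Gauss and Laplace results above, assuming numpy is available: over a grid of candidate estimates, squared-error loss is smallest at the sample mean and absolute-error loss is smallest at the sample median.

# Minimal sketch: the mean minimizes squared loss, the median minimizes absolute loss.
import numpy as np

sample = np.array([2.0, 3.0, 3.5, 4.0, 10.0])          # hypothetical data
candidates = np.linspace(0, 12, 1201)

squared_loss = [np.sum((sample - c) ** 2) for c in candidates]
absolute_loss = [np.sum(np.abs(sample - c)) for c in candidates]

print("minimizer of squared loss :", candidates[np.argmin(squared_loss)], "mean =", sample.mean())
print("minimizer of absolute loss:", candidates[np.argmin(absolute_loss)], "median =", np.median(sample))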
Taguchi's use of loss functions
Taguchi knew statistical theory mainly from the followers of Ronald A. Fisher, who also avoided
loss functions. Reacting to Fisher's methods in the design of experiments, Taguchi interpreted
Fisher's methods as being adapted for seeking to improve the mean outcome of a process.
Indeed, Fisher's work had been largely motivated by programmes to compare agricultural yields
under different treatments and blocks, and such experiments were done as part of a long-term
programme to improve harvests.
However, Taguchi realised that in much industrial production, there is a need to produce an
outcome on target, for example, to machine a hole to a specified diameter, or to manufacture a
cell to produce a given voltage. He also realised, as had Walter A. Shewhart and others before
him, that excessive variation lay at the root of poor manufactured quality and that reacting to
individual items inside and outside specification was counterproductive.
He therefore argued that quality engineering should start with an understanding of quality costs
in various situations. In much conventional industrial engineering, the quality costs are simply
represented by the number of items outside specification multiplied by the cost of rework or
scrap. However, Taguchi insisted that manufacturers broaden their horizons to consider cost to
society. Though the short-term costs may simply be those of non-conformance, any item
manufactured away from nominal would result in some loss to the customer or the wider
community through early wear-out; difficulties in interfacing with other parts, themselves
probably wide of nominal; or the need to build in safety margins. These losses are externalities
and are usually ignored by manufacturers, which are more interested in their private costs than
social costs. Such externalities prevent markets from operating efficiently, according to analyses
of public economics. Taguchi argued that such losses would inevitably find their way back to the
originating corporation (in an effect similar to the tragedy of the commons), and that by working
to minimise them, manufacturers would enhance brand reputation, win markets and generate
profits.
Such losses are, of course, very small when an item is near to nominal. Donald J. Wheeler
characterised the region within specification limits as where we deny that losses exist. As we
diverge from nominal, losses grow until the point where losses are too great to deny and the
specification limit is drawn. All these losses are, as W. Edwards Deming would describe them,
unknown and unknowable, but Taguchi wanted to find a useful way of representing them
statistically. Taguchi specified three situations:
1. Larger the better (for example, agricultural yield);
2. Smaller the better (for example, carbon dioxide emissions); and
3. On-target, minimum-variation (for example, a mating part in an assembly).
The first two cases are represented by simple monotonic loss functions. In the third case, Taguchi adopted a squared-error loss function for several reasons (a small numeric sketch follows this list):
• It is the first "symmetric" term in the Taylor series expansion of real analytic loss-
functions.
• Total loss is measured by the variance. For uncorrelated random variables, variance is additive, so the total loss is an additive measure of cost.
• The squared-error loss function is widely used in statistics, following Gauss's use of the
squared-error loss function in justifying the method of least squares.
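The numeric sketch promised above follows. It evaluates the on-target quadratic loss L(y) = k(y − T)^2, with k calibrated, purely as an assumption for illustration, so that an item exactly at the specification limit carries the full scrap or rework cost.

# Minimal sketch of the on-target quadratic loss L(y) = k * (y - T)**2.
target = 10.00        # nominal value T, e.g. a shaft diameter in mm
spec_half_width = 0.05
cost_at_limit = 4.00  # hypothetical scrap/rework cost per item

k = cost_at_limit / spec_half_width ** 2

def taguchi_loss(y):
    return k * (y - target) ** 2

for y in (10.00, 10.01, 10.03, 10.05):
    print(f"y = {y:.2f}  loss = {taguchi_loss(y):.2f}")
# Loss grows smoothly with deviation from nominal instead of jumping at the limit.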
Reception of Taguchi's ideas by statisticians
Though many of Taguchi's concerns and conclusions are welcomed by statisticians and
economists, some ideas have been especially criticized. For example, Taguchi's recommendation
that industrial experiments maximise some signal-to-noise ratio (representing the magnitude of
the mean of a process compared to its variation) has been criticized widely.[citation needed]
Off-line quality control
Taguchi's rule for manufacturing
Taguchi realized that the best opportunity to eliminate variation is during the design of a product
and its manufacturing process. Consequently, he developed a strategy for quality engineering
that can be used in both contexts. The process has three stages:
1. System design
2. Parameter design
3. Tolerance design
System design
This is design at the conceptual level, involving creativity and innovation.
Parameter design
Once the concept is established, the nominal values of the various dimensions and design parameters need to be set; this is the detail design phase of conventional engineering. Taguchi's radical
insight was that the exact choice of values required is under-specified by the performance
requirements of the system. In many circumstances, this allows the parameters to be chosen so as
to minimise the effects on performance arising from variation in manufacture, environment and
cumulative damage. This is sometimes called robustification.
Tolerance design
With a successfully completed parameter design, and an understanding of the effect that the
various parameters have on performance, resources can be focused on reducing and controlling
variation in the critical few dimensions (see Pareto principle).
Design of experiments
Taguchi developed his experimental theories independently. Taguchi read works following R. A.
Fisher only in 1954. Taguchi's framework for design of experiments is idiosyncratic and often
flawed, but contains much that is of enormous value. He made a number of innovations.
Outer arrays
Taguchi's designs aimed to allow greater understanding of variation than did many of the
traditional designs from the analysis of variance (following Fisher). Taguchi contended that
conventional sampling is inadequate here as there is no way of obtaining a random sample of
future conditions.[7] In Fisher's design of experiments and analysis of variance, experiments aim
to reduce the influence of nuisance factors to allow comparisons of the mean treatment-effects.
Variation becomes even more central in Taguchi's thinking.
Taguchi proposed extending each experiment with an "outer array" (possibly an orthogonal
array); the "outer array" should simulate the random environment in which the product would
function. This is an example of judgmental sampling. Many quality specialists have been using
"outer arrays".
Later innovations in outer arrays resulted in "compounded noise." This involves combining a few
noise factors to create two levels in the outer array: First, noise factors that drive output lower,
and second, noise factors that drive output higher. "Compounded noise" simulates the extremes
of noise variation but uses fewer experimental runs than would previous Taguchi designs.
Management of interactions
Interactions, as treated by Taguchi
Many of the orthogonal arrays that Taguchi has advocated are saturated arrays, allowing no
scope for estimation of interactions. This is a continuing topic of controversy. However, this is
only true for "control factors" or factors in the "inner array". By combining an inner array of
control factors with an outer array of "noise factors", Taguchi's approach provides "full
information" on control-by-noise interactions, it is claimed. Taguchi argues that such interactions
have the greatest importance in achieving a design that is robust to noise factor variation. The
Taguchi approach provides more complete interaction information than typical fractional
factorial designs, its adherents claim.
• Followers of Taguchi argue that the designs offer rapid results and that interactions can
be eliminated by proper choice of quality characteristics. That notwithstanding, a
"confirmation experiment" offers protection against any residual interactions. If the
quality characteristic represents the energy transformation of the system, then the
"likelihood" of control factor-by-control factor interactions is greatly reduced, since
"energy" is "additive".
Inefficiencies of Taguchi's designs
• Interactions are part of the real world. In Taguchi's arrays, interactions are confounded
and difficult to resolve.
Statisticians in response surface methodology (RSM) advocate the "sequential assembly" of
designs: In the RSM approach, a screening design is followed by a "follow-up design" that
resolves only the confounded interactions that are judged to merit resolution. A second follow-up
design may be added, time and resources allowing, to explore possible high-order univariate
effects of the remaining variables, as high-order univariate effects are less likely in variables
already eliminated for having no linear effect. With the economy of screening designs and the
flexibility of follow-up designs, sequential designs have great statistical efficiency. The
sequential designs of response surface methodology require far fewer experimental runs than
would a sequence of Taguchi's designs.[8]
Analysis of experiments
Taguchi introduced many methods for analysing experimental results including novel
applications of the analysis of variance and minute analysis.
Assessment
Genichi Taguchi has made valuable contributions to statistics and engineering. His emphasis on
loss to society, techniques for investigating variation in experiments, and his overall strategy of
system, parameter and tolerance design have been influential in improving manufactured quality
worldwide.

*Taguchi loss function

The Taguchi Loss Function is a graphical depiction of loss developed by the Japanese business
statistician Genichi Taguchi to describe a phenomenon affecting the value of products produced
by a company. Praised by Dr. W. Edwards Deming (the business guru of the 1980s American
quality movement),[1] it made clear the concept that quality does not suddenly plummet when, for
instance, a machinist exceeds a rigid blueprint tolerance. Instead "loss" in value progressively
increases as variation increases from the intended condition. This was considered a breakthrough
in describing quality, and helped fuel the continuous improvement movement that since has
become known as lean manufacturing.
Overview
The Taguchi Loss Function is important for a number of reasons. Primarily, it helps engineers better understand the importance of designing for variation. It was important to the Six Sigma movement by driving an improved understanding of the importance of Variation Management (a concept described in the Shingo Prize winning book, Breaking the Cost Barrier[2]). Finally, it was important in describing the effects of changing variation on a system, which is a central characteristic of Lean Dynamics, a business management discipline focused on better understanding the impact of dynamic business conditions (such as the sudden changes in demand seen during the 2008-2009 economic downturn) on loss, and thus on creating value.

*TRIZ

TRIZ (pronounced /ˈtriːz/; Russian: Теория решения изобретательских задач, Teoriya Resheniya Izobretatelskikh Zadatch) is "a problem-solving, analysis and forecasting tool derived from the study of patterns of invention in the global patent literature".[1] It was developed by Soviet engineer and researcher Genrich Altshuller and his colleagues, beginning in 1946. In English the name is typically rendered as "the Theory of Inventive Problem Solving",[2][3] and
occasionally goes by the English acronym TIPS. The approach involves identifying
generalisable problems and borrowing solutions from other fields. TRIZ practitioners aim to
create an algorithmic approach to the invention of new systems, and the refinement of old
systems.
TRIZ is variously described as a methodology, tool set, knowledge base, and model-based
technology for generating new ideas and solutions for problem solving. It is intended for
application in problem formulation, system analysis, failure analysis, and patterns of system
evolution. Splits have occurred within TRIZ advocacy, and interpretation of its findings and
applications are disputed.
History
The development of TRIZ began in 1946 with the mechanical engineer Genrich Altshuller
studying patents on behalf of the Russian navy.[4] Altshuller's job was to inspect invention
proposals, help document them, and help others to invent. By 1969 he had reviewed about
40,000 patent abstracts in order to find out in what way the innovation had taken place. By
examining a large database of inventions, Altshuller concluded that only one per cent of
inventions were genuinely original; the rest represented the novel application of previous ideas.[5]
Altshuller argued that "An invention is the removal of technical contradictions"[6]. Along these
lines, he said that to develop a method for inventing, one must scan a large number of inventions,
identify the contradictions underlying them, and formulate the principle used by the inventor for
their removal. Over the next years, he developed 40 Principles of Invention, several Laws of
Technical Systems Evolution, the concepts of technical and physical contradictions that creative
inventions resolve, the concept of Ideality of a system and numerous other theoretical and
practical approaches.
Concerned at the state of the Soviet Union following World War II, in 1949 Altshuller and his
colleague Rafael Shapiro wrote to Josef Stalin, encouraging the use of TRIZ to rebuild the
economy. However, the letter also contained sharp criticism of the situation for innovation in the
USSR at the time.[7] Altshuller was arrested and sentenced to 25 years hard labour, although he
was freed following Stalin's death in 1953.[6] Altshuller and Shapiro published their first paper in
1956 in Voprosy Psikhologii (Questions of Psychology) entitled "On the psychology of
inventive creation".[8]
In 1971, after returning to Baku some years earlier, Altshuller founded the Azerbaijan Public
Institute for Inventive Creation. This institute incubated the TRIZ movement, producing a
generation of Altshuller proteges. Altshuller himself gave workshops across the Soviet Union. In
1989 the TRIZ Association was formed, with Altshuller chosen as President.
Following the end of the cold war, the waves of emigrants from the former Soviet Union brought
TRIZ to other countries and drew attention to it overseas.[9] In 1995 the Altshuller Institute for
TRIZ Studies was established in Boston, USA.
Basic principles of TRIZ

TRIZ process for creative problem solving

The TRIZ process presents an algorithm for the analysis of problems in a technological system.
The fundamental view is that almost all "inventions" are reiterations of previous discoveries
already made in the same or other fields, and that problems can be reduced to contradictions
between two elements. The goal of TRIZ analysis is to achieve a better solution than a mere
trade-off between the two elements, and the belief is that the solution almost certainly already
exists somewhere in the patent literature.
A problem is first defined in terms of the ideal solution. The problem is analysed into its basic,
abstract constituents according to a list of 39 items (for example, the weight of a stationary object, the use of energy by a moving object, the ease of repair, etc.), and reframed as a
contradiction between two of these constituents. Using a contradiction matrix based upon large-
scale analysis of patents, a series of suggested abstract solutions (for example "move from
straight lines to curved", or "make the object porous") is offered, helping the analyst find creative
practical solutions. Altshuller however abandoned this method of defining and solving "technical"
contradictions in the mid 1980's and instead used Su-field modeling and the 76 inventive
standards and a number of other tools included in the algorithm for solving inventive problems,
ARIZ.

ARIZ consists of a program (a sequence of actions) for exposing and resolving contradictions, i.e. for solving problems. ARIZ includes the program itself, informational support supplied by the knowledge base, and methods for the control of psychological factors, which are a component part of the methods for developing a creative imagination. Furthermore, sections of ARIZ are dedicated to the selection of problems and the evaluation of the obtained solution.
Classification of a system of standard solutions for inventive problems, as well as the standards
themselves, is built on the basis of Su-Field Analysis of technological systems. Su-Field Analysis
is also a component part of the program ARIZ.
A system of laws for the development of technology, a system of standards for the solution of
inventive problems, and Su-Field Analysis are used to forecast the development of technology,
to search for and select problems, and to evaluate the received solution. For the development of a
creative imagination, all elements of TRIZ can be used, although particular stress is given to
methods for developing a creative imagination. The solution of inventive problems is realized
with the help of laws for the development of technological systems, the knowledge base, Su-
Field Analysis, ARIZ, and, in part, with the help of methods for the development of a creative
imagination.
By means of TRIZ, both known and unknown types of problems can be solved. Known (standard) types of inventive problems are solved with the use of the knowledge base, and unknown (nonstandard) types with the use of ARIZ. As experience grows, the solutions for known types of problems accumulate and become structured.
At the present time, computer programs have been developed on the basis of TRIZ that try to
provide intellectual assistance to engineers and inventors during the solution of technological
problems. These programs also try to reveal and forecast emergency situations and undesirable
occurrences.
Essentials
Basic terms
• Ideal Final Result (IFR) - the ultimate idealistic solution of a problem when the desired
result is achieved by itself;
• Administrative Contradiction - contradiction between the needs and abilities;
• Technical Contradiction - an inverse dependence between parameters/characteristics of a
machine or technology;
• Physical Contradiction - opposite/contradictory physical requirements to an object;
• Separation principle - a method of resolving physical contradictions by separating
contradictory requirements;
• VePol or SuField - a minimal technical system consisting of two material objects
(substances) and a "field". "Field" is the source of energy whereas one of the substances
is "transmission" and the other one is the "tool";
• FePol - a sort of VePol where "substances" are ferromagnetic objects;
• Level of Invention;
• Standard - a standard inventive solution of a higher level;
• Law of Technical Systems Evolution;
• ARIZ - Algorithm of Inventive Problems Solving, which combines various specialized
methods of TRIZ into one universal tool;
Identifying a problem: contradictions
Altshuller believed that inventive problems stem from contradictions (one of the basic TRIZ
concepts) between two or more elements, such as, "If we want more acceleration, we need a
larger engine; but that will increase the cost of the car," that is, more of something desirable also
brings more of something less desirable, or less of something else also desirable.
These are called Technical Contradictions by Altshuller. He also defined so-called physical or
inherent contradictions: More of one thing and less of the same thing may both be desired in the
same system. For instance, a higher temperature may be needed to melt a compound more
rapidly, but a lower temperature may be needed to achieve a homogeneous mixture.
An "inventive situation" which challenges us to be inventive, might involve several such
contradictions. Conventional solutions typically "trade" one contradictory parameter for another;
no special inventiveness is needed for that. Rather, the inventor would develop a creative
approach for resolving the contradiction, such as inventing an engine that produces more
acceleration without increasing the cost of the engine.
Inventive principles and the matrix of contradictions
Altshuller screened patents in order to find out what kind of contradictions were resolved or
dissolved by the invention and the way this had been achieved. From this he developed a set of
40 inventive principles and later a Matrix of Contradictions. Rows of the matrix indicate the 39
system features that one typically wants to improve, such as speed, weight, accuracy of
measurement and so on. Columns refer to typical undesired results. Each matrix cell points to
principles that have been most frequently used in patents in order to resolve the contradiction.
For instance, Dolgashev mentions the following contradiction: Increasing accuracy of
measurement of machined balls while avoiding the use of expensive microscopes and elaborate
control equipment. The matrix cell in row "accuracy of measurement" and column "complexity
of control" points to several principles, among them the Copying Principle, which states, "Use a
simple and inexpensive optical copy with a suitable scale instead of an object that is complex,
expensive, fragile or inconvenient to operate." From this general invention principle, the
following idea might solve the problem: Taking a high-resolution image of the machined ball. A
screen with a grid might provide the required measurement. As mentioned above, Altshuller abandoned this method of defining and solving "technical" contradictions in the mid 1980's and instead used Su-field modeling, the 76 inventive standards and a number of other tools included in the algorithm for solving inventive problems, ARIZ.
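As a structural illustration only, the matrix lookup can be thought of as a mapping from a pair (feature to improve, feature that worsens) to the suggested principles. The entries below are illustrative, taken loosely from the example in the text plus one invented pair, and are not the actual contents of the 39-by-39 matrix.

# Minimal sketch of a contradiction-matrix lookup (hypothetical entries).
contradiction_matrix = {
    ("accuracy of measurement", "complexity of control"): ["Copying"],
    ("speed", "weight of moving object"): ["Segmentation", "Dynamics"],  # invented
}

def suggest_principles(improving, worsening):
    return contradiction_matrix.get((improving, worsening), [])

print(suggest_principles("accuracy of measurement", "complexity of control"))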
Laws of technical system evolution
Altshuller also studied the way technical systems have been developed and improved over time. From this, he identified several trends (the so-called Laws of Technical Systems Evolution) that help engineers predict the most likely improvements that can be made to a given product. The most important of these laws involves the ideality of a system.
Substance-field analysis
One more technique that is frequently used by inventors involves the analysis of substances,
fields and other resources that are currently not being used and that can be found within the
system or nearby. TRIZ uses non-standard definitions for substances and fields. Altshuller
developed methods to analyze resources; several of his invention principles involve the use of
different substances and fields that help resolve contradictions and increase ideality of a
technical system. For instance, videotext systems used television signals to transfer data, by
taking advantage of the small time segments between TV frames in the signals.
Su-Field Analysis (structural substance-field analysis) produces a structural model of the initial
technological system, exposes its characteristics, and with the help of special laws, transforms
the model of the problem. Through this transformation the structure of the solution that
eliminates the shortcomings of the initial problem is revealed. Su-Field Analysis is a special
language of formulas with which it is possible to easily describe any technological system in
terms of a specific (structural) model. A model produced in this manner is transformed according
to special laws and regularities, thereby revealing the structural solution of the problem.
ARIZ - algorithm of inventive problems solving
ARIZ (Russian acronym of Алгоритм решения изобретательских задач - АРИЗ) - Algorithm
of Inventive Problems Solving - is a list of about 85 step-by-step procedures to solve
complicated invention problems, where other tools of TRIZ alone (Su-field analysis, 40
inventive principles, etc.) are not sufficient.
Various TRIZ software (see Invention Machine, Ideation International...) is based on this
algorithm (or an improved one).
Some newer interactive applications, which start from an updated matrix of contradictions and add semantic analysis, subcategories of inventive principles and lists of scientific effects, are further attempts to simplify the problem formulation phase and the transition from a generic problem to a whole set of specific solutions.
Use of TRIZ methods in industry
It has been reported that the car companies Ford and Daimler-Chrysler, the aerospace organizations Boeing and NASA, the technology companies Hewlett-Packard, Motorola, General Electric, Xerox, IBM, LG and Samsung, and also Johnson & Johnson, Procter & Gamble and Kodak have used TRIZ methods in some projects.[6][10][11][12]
Approaches which are modifications/derivatives of TRIZ
1. SIT (Systematic Inventive Thinking)
2. ASIT (Advanced Systematic Inventive Thinking)
3. USIT (Unified Systematic Inventive Thinking)
4. JUSIT (Japanese version of Unified Systematic Inventive Thinking)
5. Southbeach notation
After this explanation of each of the quality management tools used in Six Sigma, I would like to present some more aspects of Six Sigma:

Implementation roles
One key innovation of Six Sigma involves the "professionalizing" of quality management
functions. Prior to Six Sigma, quality management in practice was largely relegated to the
production floor and to statisticians in a separate quality department. Formal Six Sigma programs
borrow martial arts ranking terminology to define a hierarchy (and career path) that cuts across
all business functions.
Six Sigma identifies several key roles for its successful implementation.[13]
• Executive Leadership includes the CEO and other members of top management. They are
responsible for setting up a vision for Six Sigma implementation. They also empower the
other role holders with the freedom and resources to explore new ideas for breakthrough
improvements.
• Champions take responsibility for Six Sigma implementation across the organization in
an integrated manner. The Executive Leadership draws them from upper management.
Champions also act as mentors to Black Belts.
• Master Black Belts, identified by champions, act as in-house coaches on Six Sigma. They
devote 100% of their time to Six Sigma. They assist champions and guide Black Belts
and Green Belts. Apart from statistical tasks, they spend their time on ensuring consistent
application of Six Sigma across various functions and departments.
• Black Belts operate under Master Black Belts to apply Six Sigma methodology to specific
projects. They devote 100% of their time to Six Sigma. They primarily focus on Six
Sigma project execution, whereas Champions and Master Black Belts focus on
identifying projects/functions for Six Sigma.
• Green Belts are the employees who take up Six Sigma implementation along with their
other job responsibilities, operating under the guidance of Black Belts.
Some organizations use additional belt colours, such as Yellow Belts, for employees that have
basic training in Six Sigma tools.
Certification
In the United States, Six Sigma certification for both Green and Black Belts is offered by the
Institute of Industrial Engineers[14] and by the American Society for Quality.[15]
In addition to these examples, many other organizations and companies offer certification. There is currently no central certification body, either in the United States or anywhere else in the world.
Origin and meaning of the term "six sigma process"
Graph of the normal distribution, which underlies the statistical assumptions of the Six Sigma
model. The Greek letter σ (sigma) marks the distance on the horizontal axis between the mean,
µ, and the curve's inflection point. The greater this distance, the greater is the spread of values
encountered. For the curve shown above, µ = 0 and σ = 1. The upper and lower specification
limits (USL, LSL) are at a distance of 6σ from the mean. Because of the properties of the normal
distribution, values lying that far away from the mean are extremely unlikely. Even if the mean
were to move right or left by 1.5σ at some point in the future (1.5 sigma shift), there is still a
good safety cushion. This is why Six Sigma aims to have processes where the mean is at least 6σ
away from the nearest specification limit.

The term "six sigma process" comes from the notion that if one has six standard deviations
between the process mean and the nearest specification limit, as shown in the graph, practically
no items will fail to meet specifications.[8] This is based on the calculation method employed in
process capability studies.
Capability studies measure the number of standard deviations between the process mean and the
nearest specification limit in sigma units. As process standard deviation goes up, or the mean of
the process moves away from the center of the tolerance, fewer standard deviations will fit
between the mean and the nearest specification limit, decreasing the sigma number and
increasing the likelihood of items outside specification.
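A minimal sketch of that capability calculation, with purely illustrative numbers: the sigma level is the number of process standard deviations between the mean and the nearest specification limit, and Cpk is that distance divided by three.

# Minimal sketch: short-term sigma level and Cpk from mean, spread and spec limits.
mean = 10.02
std = 0.01
lsl, usl = 9.97, 10.07   # lower and upper specification limits

sigma_level = min(usl - mean, mean - lsl) / std   # standard deviations to nearest limit
cpk = sigma_level / 3.0
print(f"sigma level = {sigma_level:.2f}, Cpk = {cpk:.2f}")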

Role of the 1.5 sigma shift


Experience has shown that processes usually do not perform as well in the long term as they do
in the short term.[8] As a result, the number of sigmas that will fit between the process mean and
the nearest specification limit may well drop over time, compared to an initial short-term study.[8]
To account for this real-life increase in process variation over time, an empirically-based 1.5
sigma shift is introduced into the calculation.[8][16] According to this idea, a process that fits six
sigmas between the process mean and the nearest specification limit in a short-term study will in
the long term only fit 4.5 sigmas – either because the process mean will move over time, or
because the long-term standard deviation of the process will be greater than that observed in the
short term, or both.[8]
Hence the widely accepted definition of a six sigma process as one that produces 3.4 defective
parts per million opportunities (DPMO). This is based on the fact that a process that is normally
distributed will have 3.4 parts per million beyond a point that is 4.5 standard deviations above or
below the mean (one-sided capability study).[8] So the 3.4 DPMO of a "Six Sigma" process in
fact corresponds to 4.5 sigmas, namely 6 sigmas minus the 1.5 sigma shift introduced to account
for long-term variation.[8] This takes account of special causes that may cause a deterioration in
process performance over time and is designed to prevent underestimation of the defect levels
likely to be encountered in real-life operation.[8]
Sigma levels

A control chart depicting a process that experienced a 1.5 sigma drift in the process mean toward
the upper specification limit starting at midnight. Control charts are used to maintain 6 sigma
quality by signaling when quality professionals should investigate a process to find and eliminate
special-cause variation.

The table[17][18] below gives long-term DPMO values corresponding to various short-term sigma
levels.
Note that these figures assume that the process mean will shift by 1.5 sigma toward the side with
the critical specification limit. In other words, they assume that after the initial study determining
the short-term sigma level, the long-term Cpk value will turn out to be 0.5 less than the short-term
Cpk value. So, for example, the DPMO figure given for 1 sigma assumes that the long-term
process mean will be 0.5 sigma beyond the specification limit (Cpk = –0.17), rather than 1 sigma
within it, as it was in the short-term study (Cpk = 0.33). Note that the defect percentages only
indicate defects exceeding the specification limit to which the process mean is nearest. Defects
beyond the far specification limit are not included in the percentages.
Sigma level   DPMO      Percent defective   Percentage yield   Short-term Cpk   Long-term Cpk
1             691,462   69%                 31%                0.33             –0.17
2             308,538   31%                 69%                0.67             0.17
3             66,807    6.7%                93.3%              1.00             0.5
4             6,210     0.62%               99.38%             1.33             0.83
5             233       0.023%              99.977%            1.67             1.17
6             3.4       0.00034%            99.99966%          2.00             1.5
7             0.019     0.0000019%          99.9999981%        2.33             1.83
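The long-term DPMO column can be reproduced with a short calculation, assuming scipy is available: take the one-sided normal tail area beyond (sigma level − 1.5) standard deviations and scale it to a million opportunities.

# Minimal sketch: long-term DPMO for each short-term sigma level, with the 1.5 sigma shift.
from scipy.stats import norm

for sigma_level in range(1, 8):
    dpmo = norm.sf(sigma_level - 1.5) * 1_000_000   # one-sided tail beyond the shifted limit
    print(f"{sigma_level} sigma: {dpmo:,.3f} DPMO")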

Six Sigma mostly finds application in large organizations.[19] An important factor in the spread of
Six Sigma was GE's 1998 announcement of $350 million in savings thanks to Six Sigma, a
figure that later grew to more than $1 billion.[19] According to industry consultants like Thomas
Pyzdek and John Kullmann, companies with fewer than 500 employees are less suited to Six
Sigma implementation, or need to adapt the standard approach to make it work for them.[19] This
is due both to the infrastructure of Black Belts that Six Sigma requires, and to the fact that large
organizations present more opportunities for the kinds of improvements Six Sigma is suited to
bringing about.[19]
Criticism
Lack of originality
Noted quality expert Joseph M. Juran has described Six Sigma as "a basic version of quality
improvement", stating that "there is nothing new there. It includes what we used to call
facilitators. They've adopted more flamboyant terms, like belts with different colors. I think that
concept has merit to set apart, to create specialists who can be very helpful. Again, that's not a
new idea. The American Society for Quality long ago established certificates, such as for
reliability engineers."[20]
Role of consultants
The use of "Black Belts" as itinerant change agents has (controversially) fostered an industry of
training and certification. Critics argue there is overselling of Six Sigma by too great a number of
consulting firms, many of which claim expertise in Six Sigma when they only have a
rudimentary understanding of the tools and techniques involved.[3]
Potential negative effects
A Fortune article stated that "of 58 large companies that have announced Six Sigma programs,
91 percent have trailed the S&P 500 since". The statement is attributed to "an analysis by Charles
Holland of consulting firm Qualpro (which espouses a competing quality-improvement
process)."[21] The summary of the article is that Six Sigma is effective at what it is intended to do,
but that it is "narrowly designed to fix an existing process" and does not help in "coming up with
new products or disruptive technologies." Advocates of Six Sigma have argued that many of
these claims are in error or ill-informed.[22][23]
A BusinessWeek article says that James McNerney's introduction of Six Sigma at 3M may have
had the effect of stifling creativity. It cites two Wharton School professors who say that Six
Sigma leads to incremental innovation at the expense of blue-sky work.[24] This phenomenon is
further explored in the book, Going Lean, which describes a related approach known as lean
dynamics and provides data to show that Ford's "6 Sigma" program did little to change its
fortunes.[25]
Based on arbitrary standards
While 3.4 defects per million opportunities might work well for certain products/processes, it
might not operate optimally or cost effectively for others. A pacemaker process might need
higher standards, for example, whereas a direct mail advertising campaign might need lower
standards. The basis and justification for choosing 6 (as opposed to 5 or 7, for example) as the
number of standard deviations is not clearly explained. In addition, the Six Sigma model assumes
that the process data always conform to the normal distribution. The calculation of defect rates
for situations where the normal distribution model does not apply is not properly addressed in the
current Six Sigma literature.[3]
Criticism of the 1.5 sigma shift
The statistician Donald J. Wheeler has dismissed the 1.5 sigma shift as "goofy" because of its
arbitrary nature.[26] Its universal applicability is seen as doubtful.[3]
The 1.5 sigma shift has also become contentious because it results in stated "sigma levels" that
reflect short-term rather than long-term performance: a process that has long-term defect levels
corresponding to 4.5 sigma performance is, by Six Sigma convention, described as a "6 sigma
process."[8][27] The accepted Six Sigma scoring system thus cannot be equated to actual normal
distribution probabilities for the stated number of standard deviations, and this has been a key
bone of contention about how Six Sigma measures are defined.[27] The fact that it is rarely
explained that a "6 sigma" process will have long-term defect rates corresponding to 4.5 sigma
performance rather than actual 6 sigma performance has led several commentators to express the
opinion that Six Sigma is a confidence trick.
This is all about Six Sigma and its applications. I hope you will find it useful.
