❖ Y = f(X),
❖ where Y is Dependent variable or the result
(output)
❖ X is Independent variable, input or the
controllable variable
Column 1 Column 2
Column 1 1
Column 2 0.879350768 1
Correlation Coefficient
❖ Correlation
❖ Measures the strength of linear
relationship between Y and X
❖ Pearson Correlation Coefficient, r (r
varies between -1 and +1)
❖ Perfect positive relationship: r = 1
❖ No relationship: r = 0
❖ Perfect negative relationship: r = -1
Correlation vs Causation
❖ Correlation does not imply causation
❖ a correlation between two variables does
not imply that one causes the other
Correlation – Confidence Interval
❖ Population correlation (ρ) – usually
unknown
❖ Sample correlation (r)
Correlation – Confidence Interval
❖ Since r is not normally distributed,
finding the confidence interval takes
three steps:
❖ Convert r to z’ (Fisher’s Transformation)
❖ Calculate confidence interval in terms of z’
❖ Convert confidence interval back to r
❖ z’ = .5[ln(1+r) – ln(1-r)]
❖ Variance = 1/(N-3)
Correlation – Confidence Interval
❖ N=10, r=0.88 find confidence interval
❖ Step 1.
❖ Convert r to z’
❖ z’ = .5[ln(1+r) – ln(1-r)]
❖ z’ = .5[ln(1+0.88) – ln(1-0.88)]
❖ z’ = .5[0.63 – (-2.12)] = 1.375
Correlation – Confidence Interval
❖ N=10, r=0.88 find confidence interval
❖ Step 2. Confidence interval for z’
❖ Variance = 1/(N-3) = 1/7 = 0.1428
❖ Standard error = Sqrt (0.1428) = 0.378
❖ 95% confidence Z = 1.96
❖ CI = 1.375 +/- (1.96)(0.378)
❖ Lower Limit = 0.635
❖ Upper Limit = 2.11
Correlation – Confidence Interval
❖ N=10, r=0.88 find confidence interval
❖ Step 3. Convert back to r
❖ z’ Lower Limit = 0.635
❖ z’ Upper Limit = 2.11
❖ z’ = .5[ln(1+r) – ln(1-r)], so r = (e^(2z’) – 1)/(e^(2z’) + 1)
❖ r Lower Limit = 0.56
❖ r Upper Limit = 0.97
❖ r = 0.88, r² = 0.77
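The three steps for N = 10 and r = 0.88 can be checked with a short Python sketch. It uses only the standard library; the function name `fisher_ci` is an illustrative choice, and `math.tanh` is used because it is the inverse of the z’ transformation. With these inputs the interval for r comes out at roughly 0.56 to 0.97.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Confidence interval for a correlation via Fisher's z' transformation."""
    # Step 1: convert r to z'
    z = 0.5 * (math.log(1 + r) - math.log(1 - r))
    # Step 2: confidence interval in z' units, variance = 1/(N-3)
    se = math.sqrt(1.0 / (n - 3))
    z_lo, z_hi = z - z_crit * se, z + z_crit * se
    # Step 3: convert the limits back to r (tanh inverts the transformation)
    return math.tanh(z_lo), math.tanh(z_hi)

r_lo, r_hi = fisher_ci(r=0.88, n=10)
print(f"95% CI for r: {r_lo:.2f} to {r_hi:.2f}")
```

The interval is not symmetric around 0.88, which is why the calculation is done in z’ units first.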
Regression Analysis
❖ Quantifies the relationship between Y
and X (Y = a + bX)
Hours Studied (X) Test Score % (Y) XY X2 Y2
20 40 800 400 1600
24 55 1320 576 3025
46 69 3174 2116 4761
62 83 5146 3844 6889
22 27 594 484 729
37 44 1628 1369 1936
45 61 2745 2025 3721
27 33 891 729 1089
65 71 4615 4225 5041
23 37 851 529 1369
SUM 371 520 21764 16297 30160
Regression Analysis
❖ Quantifies the relationship between Y
and X (Y = 15.79 + 0.97X)
Regression Analysis
❖ For a student studying 50 hrs what is the
expected test score %?
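As a rough check, the intercept and slope can be recomputed from the column sums of the hours/score table, and the fitted line used to answer the 50-hour question. This is a minimal sketch using the usual least-squares formulas; the variable and function names are illustrative. With unrounded coefficients the prediction is about 64.6%, while plugging 50 into the rounded equation Y = 15.79 + 0.97X gives about 64.3%.

```python
# Column sums taken from the hours/score table (n = 10 students).
n = 10
sum_x, sum_y = 371, 520
sum_xy, sum_x2 = 21764, 16297

# Least-squares slope (b) and intercept (a) for Y = a + bX.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

def predict(hours):
    """Expected test score (%) for a given number of hours studied."""
    return a + b * hours

print(f"a = {a:.2f}, b = {b:.3f}")
print(f"expected score at 50 hours = {predict(50):.1f}%")
```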
Residual Analysis
❖ Y = 15.79 + 0.97X
Residual Analysis – No pattern
[Figure: residuals (about -15 to 20) plotted against X (0 to 70), showing no pattern]
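A residual is the observed Y minus the Y predicted by the fitted line. The sketch below computes the residuals for the ten students using the rounded coefficients 15.79 and 0.97, then counts positive and negative values as a crude check for an obvious pattern. The check is an illustrative assumption, not a formal test; a residual plot is the usual tool.

```python
hours = [20, 24, 46, 62, 22, 37, 45, 27, 65, 23]
scores = [40, 55, 69, 83, 27, 44, 61, 33, 71, 37]
a, b = 15.79, 0.97  # rounded coefficients of the fitted line

# Residual = observed score minus predicted score.
residuals = [y - (a + b * x) for x, y in zip(hours, scores)]

# List the residuals in order of X to look for trends or funnels.
for x, e in sorted(zip(hours, residuals)):
    print(f"X = {x:2d}  residual = {e:+6.2f}")

n_positive = sum(e > 0 for e in residuals)
n_negative = sum(e < 0 for e in residuals)
print(f"{n_positive} positive, {n_negative} negative")
```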
Confidence Interval - Slope
❖ Confidence interval
❖ 95% confidence interval, representing a
range of likely values for the mean
response.
❖ Prediction interval
❖ 95% prediction interval, represents a range
of likely values for a single new
observation.
Multivariate Tools
❖ Simple Linear Relation
❖ Y = a + bX
❖ Multiple Linear Regression
❖ Y = a + b1X1 + b2X2 + … + bnXn
❖ Multicollinearity
❖ When two input variables (predictor
variables - Xs) are correlated.
❖ Multivariate
❖ Two or more dependent variables (Ys)
Multivariate Tools
❖ Factor analysis / Principal Component
Analysis
❖ Discriminant analysis
❖ Multiple analysis of variance (MANOVA)
Multivariate
❖ Application
❖ Climate: Min temp, max temp, humidity,
precipitation – for a day
❖ Medical – Systolic BP, Diastolic BP, Pulse
rate, Age – of a patient
Multivariate
❖ Application
❖ Classification of individuals – easy when
there is a limited number of variables.
❖ Cause-effect relationship
Multivariate
❖ Tools
❖ Classification of individuals
❖ Discriminant Analysis
❖ Dimension reduction
❖ Principal Component Analysis/ Factor
Analysis
❖ Cause-effect relationship
❖ MANOVA
Discriminant Analysis
❖ Explains how clusters are different
PCA / Factor Analysis
❖ Principal Component Analysis/ Factor
Analysis
❖ To reduce number of variables
❖ By grouping highly correlated variables
together
MANOVA
❖ The MANOVA (multivariate analysis of
variance)
❖ To analyze data that involves more than
one dependent variable at a time.
❖ Tests the effect of one or more
independent variables on two or more
dependent variables.
❖ MANOVA is simply an ANOVA with
several dependent variables.
Errors of Statistical Tests
                          True State of Nature
Conclusion                H0 is true            Ha is true
Support H0 / Reject Ha    Correct Conclusion    Type II Error
Support Ha / Reject H0    Type I Error          Correct Conclusion (Power)
Errors of Statistical Tests
Type I error (alpha)
❖ Name: Producer’s risk / Significance level
❖ 1 minus error is called: Confidence level
❖ Example of fire alarm: False fire alarm leading to inconvenience
❖ Effects on process: Unnecessary cost increase due to frequent changes
❖ Control method: Usually fixed at a pre-determined level, 1%, 5% or 10%
❖ Simple definition: Innocent declared as guilty
Type II error (beta)
❖ Name: Consumer’s risk
❖ 1 minus error is called: Power of the test
❖ Example of fire alarm: Missed fire leading to disaster
❖ Effects on process: Defects may be produced
❖ Control method: Usually controlled to < 10% by appropriate sample size
❖ Simple definition: Guilty declared as innocent
Significance Level
Level of Confidence / Confidence Interval:
C = 0.90, 0.95, 0.99 (90%, 95%, 99%)
Level of Significance:
α = 1 – C (0.10, 0.05, 0.01)
Power
❖ Power = 1 – β (or 1 - type II error)
❖ Type II Error: Failing to reject null
hypothesis when null hypothesis is false.
❖ Power: Likelihood of rejecting null
hypothesis when null hypothesis is false.
Point Estimates
❖ Point estimate:
❖ Summarize the sample by a single number
that is an estimate of the population
parameter.
❖ The sample mean x̄ is a point estimate of
the population mean μ. The sample
proportion p is a point estimate of the
population proportion P.
Point vs Interval Estimates
❖ Interval estimate:
❖ A range of values within which, we believe,
the true parameter lies with high
probability.
❖ For example, a < μ < b is an interval
estimate of the population mean μ. It
indicates that the population mean is
greater than a but less than b.
Confidence Interval
❖ Factors affecting the width of
confidence interval
❖ sample size
❖ standard deviation
❖ confidence level
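The effect of the three factors can be seen from the half-width of a z-based interval for a mean, z x sd / sqrt(n). The sketch below is illustrative only: the values sd = 2.0 and n = 25 are made-up inputs, and `ci_half_width` is a hypothetical helper name. It uses `statistics.NormalDist` from the Python standard library for the critical value.

```python
import math
from statistics import NormalDist

def ci_half_width(sd, n, confidence=0.95):
    """Half-width z * sd / sqrt(n) of a z-based confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return z * sd / math.sqrt(n)

base = ci_half_width(sd=2.0, n=25)                           # reference case
larger_sd = ci_half_width(sd=4.0, n=25)                      # more spread: wider
larger_n = ci_half_width(sd=2.0, n=100)                      # more data: narrower
higher_conf = ci_half_width(sd=2.0, n=25, confidence=0.99)   # more confidence: wider

print(f"base {base:.3f}, larger sd {larger_sd:.3f}, "
      f"larger n {larger_n:.3f}, 99% level {higher_conf:.3f}")
```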
Confidence Interval
❖ When population standard deviation is
known/ Sample size is >=30
❖ H0: μ2 – μ1 <= 0.3
❖ Ha: μ2 – μ1 > 0.3
Two Sample z Test
❖ Example: From two machines 100
samples each were drawn.
❖ Machine 1: Mean = 151.2 / sd = 2.1
❖ Machine 2: Mean = 151.9 / sd = 2.2
❖ Is there a difference of more than 0.3 cc
between these two machines? Check at the 95%
confidence level.
❖ Zcal = [(151.9 – 151.2) – 0.3]/0.304
❖ = 0.4 / 0.304 = 1.316
❖ Zcritical = 1.64
❖ Fail to reject Null Hypothesis.
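The calculation for the two machines can be sketched in Python as follows. The statistic is written in the μ2 – μ1 direction so that it matches the hypotheses; reversing the order of subtraction only flips the sign. The function name is illustrative, and 1.645 is the usual one-sided 95% critical value (1.64 when rounded down). The result is about 1.32; small differences from 1.316 come from rounding the standard error to 0.304.

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2, delta0):
    """z statistic for H0: mu2 - mu1 <= delta0 against Ha: mu2 - mu1 > delta0."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # standard error of the difference
    return ((mean2 - mean1) - delta0) / se

z_cal = two_sample_z(151.2, 2.1, 100, 151.9, 2.2, 100, delta0=0.3)
z_critical = 1.645  # one-sided test at 95% confidence
reject_h0 = z_cal > z_critical

print(f"z calculated = {z_cal:.3f}, reject H0: {reject_h0}")
```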
Two Sample t Test
❖ First decide whether the two sets of data are
independent or dependent.
❖ If the values in one sample reveal no
information about those of the other
sample, then the samples are independent.
❖ Example: Blood pressure of male/female
Test Information
H0: Mean Difference = 0
Ha: Mean Difference Not Equal To 0
Assume Unequal Variance
Results: A C
Count 5 5
Mean 151.80 154.60
Standard Deviation 1.483 15.027
Two Sample t Test
tcritical = 2.776
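With unequal variances assumed, the statistic for samples A and C can be sketched from the summary values alone. This assumes the Welch form of the t test with the Welch-Satterthwaite approximation for the degrees of freedom; that is the common choice for "assume unequal variance", but the slides do not name the exact method. The critical value 2.776 is the one quoted above.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

t_cal, df = welch_t(151.80, 1.483, 5, 154.60, 15.027, 5)
t_critical = 2.776  # two-sided, 95% confidence, 4 degrees of freedom
reject_h0 = abs(t_cal) > t_critical

print(f"t = {t_cal:.3f}, df = {df:.2f}, reject H0: {reject_h0}")
```

The large standard deviation of sample C dominates the standard error, so the 2.8 difference in means is far from significant.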
Two Sample t Test
❖ Minitab 17 output:
Two-sample T for A vs C
Two Sample Variance – F Test
❖ Example: We took 8 samples from
machine A and the standard deviation
was 1.1. For machine B we took 5
samples and the variance was 11. Is
there a difference in variance at 90%
confidence level?
❖ n1 = 8, s1 = 1.1, s1² = 1.21, df = 7 (denominator)
❖ n2 = 5, s2² = 11, df = 4 (numerator)
❖ F calculated = 11/1.21 = 9.09 (higher value at top)
❖ F critical = 4.1203
❖ Reject H0
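The same comparison as a small Python sketch. The critical value is copied from the example rather than computed, and `f_ratio` is an illustrative helper that puts the larger variance on top.

```python
def f_ratio(var_a, var_b):
    """Ratio of two sample variances with the larger one in the numerator."""
    return max(var_a, var_b) / min(var_a, var_b)

var_machine_a = 1.1 ** 2   # standard deviation 1.1, n = 8, df = 7
var_machine_b = 11.0       # variance given directly, n = 5, df = 4

f_cal = f_ratio(var_machine_a, var_machine_b)
f_critical = 4.1203        # F(4, 7) right-tail value quoted in the example
reject_h0 = f_cal > f_critical

print(f"F calculated = {f_cal:.2f}, reject H0: {reject_h0}")
```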
Two Sample Variance – F Test
❖ Right tail F critical = 4.1203
❖ Left tail F critical =?
❖ Reverse degrees of freedom and then
take inverse.
❖ F (4,7) = 4.1203
❖ F (7,4) = 6.0942
❖ The inverse of this, 1/6.0942, gives F = 0.164
Tests for Variance
❖ F-test
❖ for testing equality of two variances from
different populations
❖ for testing equality of several means with
technique of ANOVA.
❖ Chi-square test
❖ For testing the population variance against
a specified value
❖ testing goodness of fit of some probability
distribution
❖ χ² = (24 x 5) / 4 = 30
❖ Fail to reject H0
One Sample Chi Square
❖ SigmaXL Output
ANOVA
❖ Pairwise comparisons among 4 samples:
1 vs 2, 1 vs 3, 1 vs 4
2 vs 3, 2 vs 4
3 vs 4
ANOVA
❖ Why ANOVA?
❖ How many t tests do we need to conduct if we
have to compare 4 samples? … 6
❖ Each test is done with alpha = 0.05 or 95%
confidence.
❖ 6 tests will result in confidence level of
0.95x0.95x0.95x0.95x0.95x0.95 = 0.735
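The count of pairwise tests and the resulting overall confidence can be reproduced in a few lines. Multiplying the individual confidence levels treats the six tests as independent, which is a simplifying assumption.

```python
from math import comb

groups = 4
alpha = 0.05

n_tests = comb(groups, 2)                     # number of pairwise t tests
overall_confidence = (1 - alpha) ** n_tests   # 0.95 multiplied n_tests times
overall_error = 1 - overall_confidence        # chance of at least one Type I error

print(f"{n_tests} tests, overall confidence = {overall_confidence:.3f}, "
      f"overall Type I risk = {overall_error:.3f}")
```

A single ANOVA avoids this inflation by testing all means at once at the chosen alpha.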
ANOVA
❖ Comparing three machines:
Machine 1 Machine 2 Machine 3
150 153 156
151 152 154
152 148 155
152 151 156
151 149 157
150 152 155
x̄1 = 151 x̄2 = 150.83 x̄3 = 155.50
ANOVA
❖ Comparing three machines:
[Figure: box plot of Machine 1, Machine 2 and Machine 3 on a scale from 146 to 158, marking the median, mean, 25th and 75th percentiles]
x̄1 = 151.00 x̄2 = 150.83 x̄3 = 155.50
ANOVA
❖ Comparing three machines:
❖ Ratio:
MS between(or treatment) / MS within(or error), where MS = SS / df
ANOVA
Machine 1 x1 - x̄1 Sqr(x1 - x̄1) Machine 2 x2 - x̄2 Sqr(x2 - x̄2) Machine 3 x3 - x̄3 Sqr(x3 - x̄3)
150.00 -1.00 1.00 153.00 2.17 4.69 156.00 0.50 0.25
151.00 0.00 0.00 152.00 1.17 1.36 154.00 -1.50 2.25
152.00 1.00 1.00 148.00 -2.83 8.03 155.00 -0.50 0.25
152.00 1.00 1.00 151.00 0.17 0.03 156.00 0.50 0.25
151.00 0.00 0.00 149.00 -1.83 3.36 157.00 1.50 2.25
150.00 -1.00 1.00 152.00 1.17 1.36 155.00 -0.50 0.25
Mean 151.00 150.83 155.50 (grand mean 152.44)
Sum of squares 4.00 18.83 5.50
❖ SS within = 4.00+18.83+5.50 = 28.33
ANOVA
❖ SS between = 6 x [(151.00 - 152.44)² + (150.83 - 152.44)² + (155.50 - 152.44)²] = 84.11 (using unrounded means)
ANOVA
❖ Degrees of freedom
❖ Total df = df treatment + df error
❖ (N-1) = (C-1) + (N-C)
❖ df treatment = 3-1=2, df error = 18-3=15
❖ df total = 17
ANOVA
❖ MS between = SS between (or treatment) / df treatment
❖ DEMONSTRATE MS Excel
ANOVA
ANOVA Table
Source SS DF MS F P-Value
Between 84.111 2 42.056 22.265 0.0000
Within 28.333 15 1.889
Total 112.44 17
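The SS, DF, MS and F columns of the table can be reproduced from the three machines' readings with plain Python. This is a sketch of the one-way ANOVA arithmetic only; the p-value is left out because it needs an F distribution function that the standard library does not provide.

```python
machines = {
    "Machine 1": [150, 151, 152, 152, 151, 150],
    "Machine 2": [153, 152, 148, 151, 149, 152],
    "Machine 3": [156, 154, 155, 156, 157, 155],
}

def mean(values):
    return sum(values) / len(values)

all_values = [x for group in machines.values() for x in group]
grand_mean = mean(all_values)

# Within: squared deviations of each reading from its own machine mean.
ss_within = sum((x - mean(g)) ** 2 for g in machines.values() for x in g)
# Between: squared deviations of machine means from the grand mean,
# weighted by the number of readings per machine.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in machines.values())

df_between = len(machines) - 1                 # C - 1
df_within = len(all_values) - len(machines)    # N - C
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(f"Between: SS={ss_between:.3f} DF={df_between} MS={ms_between:.3f} F={f_stat:.3f}")
print(f"Within:  SS={ss_within:.3f} DF={df_within} MS={ms_within:.3f}")
print(f"Total:   SS={ss_between + ss_within:.3f} DF={df_between + df_within}")
```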
ANOVA
❖ Practice Exercise: Fill in the values for ?A
to ?E in this ANOVA Table:
ANOVA Table
Source SS DF MS F
Between 84.111 ?C ?D ?E
Within ?A 15 1.889
Total ?B 17
Goodness of Fit Test (Chi Square)
❖ A coin is flipped 100 times. The number of
heads is noted. Is this coin biased?
❖ χ² calculated = 25.8
❖ χ² critical (df = 4, 0.95) = 9.49
Contingency Tables
❖ Calculate Chi square statistic.
EXPECTED
Operator 1 Operator 2 Operator 3
Shift 1 122x71/347 112x71/347 115x71/347 71
Shift 2 122x116/347 112x116/347 115x116/347 116
Shift 3 122x160/347 112x160/347 115x160/347 160
122 112 115 347
EXPECTED
Operator 1 Operator 2 Operator 3
Shift 1 24.96 22.91 23.53 71
Shift 2 40.78 37.44 38.44 116
Shift 3 56.25 51.64 53.02 160
122 112 115 347
Contingency Tables
❖ Calculate Chi square statistic.
OBSERVED
Operator 1 Operator 2 Operator 3
Shift 1 22 26 23 71
Shift 2 28 62 26 116
Shift 3 72 22 66 160
122 112 115 347
(O-E)²/E
Operator 1 Operator 2 Operator 3
Shift 1 (22-24.96)²/24.96 = 0.35 0.42 0.01
Shift 2 (28-40.78)²/40.78 = 4.00 16.11 4.03
Shift 3 (72-56.25)²/56.25 = 4.41 17.01 3.18
χ² = 49.52
Contingency Tables
❖ Calculate Chi square statistic = 49.52
❖ Degrees of freedom = (r-1)(c-1) = 4
❖ Chi square critical = 9.49
❖ Reject null hypothesis
❖ There is a relationship between the shift
and the operator.
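The contingency-table calculation can be sketched directly from the observed counts. One caveat: the observed Operator 2 counts (26, 62, 22) add up to 110, not the 112 printed as the column total. When the totals are recomputed from the cells, the statistic comes out near 50.1 rather than 49.52. The conclusion is the same either way. The critical value 9.49 is the one quoted above.

```python
observed = [
    [22, 26, 23],   # Shift 1: Operator 1, Operator 2, Operator 3
    [28, 62, 26],   # Shift 2
    [72, 22, 66],   # Shift 3
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell = row total x column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic = sum over all cells of (O - E)^2 / E.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
dof = (len(row_totals) - 1) * (len(col_totals) - 1)
chi_critical = 9.49   # df = 4, 95% confidence
reject_h0 = chi_square > chi_critical

print(f"chi-square = {chi_square:.2f}, df = {dof}, reject H0: {reject_h0}")
```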
Contingency Tables
❖ Practice Exercise:
❖ Calculate the Expected value for Non
Smoker Male?
❖ What will be the degrees of freedom in
this example?
        Smoker  Non Smoker  Total
Male    60      40          100
Female  35      40          75
Total   95      80          175
Contingency Tables
❖ Practice Exercise:
❖ Calculate the Expected value for Non
Smoker Male? = 80x100/175 = 45.71
❖ What will be the degrees of freedom in
this example? (2-1)(2-1)=1
        Smoker  Non Smoker  Total
Male    60      40          100
Female  35      40          75
Total   95      80          175
Parametric vs Non Parametric
❖ Parametric
❖ Makes assumptions about the population from
which the sample has been drawn (e.g. normally
distributed)
❖ Data is ratio or interval level
❖ Non Parametric
❖ Makes no assumption about the population
from which the sample has been drawn
❖ Suited to non-normal or small-size data. No
minimum sample size.
❖ Data is ratio, interval, nominal or ordinal level
❖ Less power (More likely to make Type II error)
[Figure: types of FMEA. Design FMEA and Process FMEA (Production FMEA, Assembly FMEA), each applied at System, Subsystem and Component level]
FMEA
❖ Failure Mode and Effect Analysis:
❖ It is a proactive tool (used before the problem
happens, not an after-the-fact analysis)
❖ It is a living document
FMEA
Process / Requirement: Perfume Making (steps: Receiving, Mixing)
Failure Mode: Wrong ingredients
Failure Effect: Inconsistent quality
Severity (1-10): 8
Cause(s) of failure mode (KPIVs) / Occurrence (1-10) / Current Controls / Detection (1-10) / RPN / Recommended actions
❖ Unclear specification / 3 / Review and approve specification by design / 4 / 96
❖ Substandard material supplied by supplier / 6 / Third party certification, in house test lab / 4 / 192
FMEA
❖ Risk Priority Number (RPN)
❖ Severity (1-10) x Occurrence (1-10) x
Detection (1-10)
❖ Severity
❖ Severity 1 – No effect/ client might not
even notice it
❖ Severity 10 – Serious safety hazard
without warning
FMEA
❖ Occurrence
❖ Occurrence 1 – Rare event, no data of such
type of failure in past
❖ Occurrence 10 – Failure almost inevitable
❖ Detection
❖ Detection 1 – Current system will almost
certainly detect the problem (automation)
❖ Detection 10 – Current system cannot
detect the problem
FMEA
❖ Identify key process steps
❖ Identify failure mode
❖ Identify failure effects/severity
❖ Identify causes/occurrence
❖ Identify controls /detection
❖ Calculate Risk Priority Number (RPN)
❖ Prioritize by RPN – Higher RPN first
❖ Determine action plan
❖ Recalculate RPN
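The calculate and prioritize steps can be sketched in Python. The two causes and their ratings are the ones from the Perfume Making example; the class name `FailureCause` and its fields are illustrative and not part of any FMEA standard.

```python
from dataclasses import dataclass

@dataclass
class FailureCause:
    cause: str
    severity: int     # 1-10
    occurrence: int   # 1-10
    detection: int    # 1-10

    @property
    def rpn(self) -> int:
        """Risk Priority Number = Severity x Occurrence x Detection."""
        return self.severity * self.occurrence * self.detection

causes = [
    FailureCause("Unclear specification", severity=8, occurrence=3, detection=4),
    FailureCause("Substandard material supplied by supplier",
                 severity=8, occurrence=6, detection=4),
]

# Prioritize: highest RPN first.
ranked = sorted(causes, key=lambda c: c.rpn, reverse=True)
for c in ranked:
    print(f"RPN {c.rpn:4d}  {c.cause}")
```

After the action plan is carried out, the same ratings are re-scored and the RPN recalculated to confirm the risk has dropped.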
FMEA
❖ Update the FMEA when there is a planned
or actual change of:
❖ Design
❖ Application
❖ Material
❖ Process
Gap Analysis
[Figure: the Gap between the Current State and the Desired State]
❖ Defining Current State
❖ Internal Measurements: Strength, Weakness
❖ Political: Government intervention
❖ Social: Cultural aspects
❖ Economic: How business operates
❖ Technological: Automation and innovation
Gap Analysis
❖ Defining Future State
❖ Benchmarking
Gap Analysis
❖ Bridging the gap
❖ Prioritization
❖ Hoshin Kanri (X Matrix) for Strategy
deployment
Commonly Used Gap Analysis
❖ Implementing ISO 9001 or other
Management Systems
❖ MBNQA, EFQM Excellence Model,
Deming Prize
Root Cause Analysis (RCA)
❖ RCA is a structured process to identify
root causes of an event that resulted in
an undesired outcome and develop
corrective actions.
Root Cause Analysis (RCA)
❖ 1. Identify the event to be investigated
and gather preliminary information (D)
❖ 2. Charter and select the team (D)
❖ 3. Describe what happened (M)
❖ 4. Identify the contributing factors (A)
❖ 5. Identify the root causes (A)
❖ 6. Identify and implement changes to
eliminate the root causes (I)
❖ 7. Measure and monitor the success (C)
Five Whys
❖ Oil spill on floor
❖ Why? Leakage from pump
❖ Why? Gasket damaged
❖ Why? Sub standard gasket
❖ Why? Policy of ordering to lowest bidder
❖ Other branches shown: Poor housekeeping, Pump too old
Pareto Chart
[Figure: Pareto chart of complaints, with frequency (0 to 10) and cumulative percentage (0% to 100%) axes. Categories: Application opportunity, Concepts clear, Engaging instructor, Meeting expectations, Instructor knowledgeable, Learning valuable]
Fault Tree Analysis
[Figure: fault tree diagram. Image from Wikipedia]
Cause and Effect Diagrams
Types of Wastes
Philosophy
• Waste exists in all processes at all levels in
the organization.
• Waste elimination is the key to successful
implementation of lean.
• Waste reduction is an effective way to
increase profitability.
Muda, Mura, Muri
Muda
an activity that is wasteful and
doesn't add value or is unproductive
Mura
Any variation leading to
unbalanced situations.
Muri
Any activity asking unreasonable
stress or effort from personnel,
material or equipment.
Muda
• Muda is a traditional Japanese term for
an activity that is wasteful and doesn't
add value or is unproductive
• Type I Muda: (Incidental Work)
• Non-value-added tasks which seem to be
essential. Business conditions need to be
changed to eliminate this type of waste.
• Type II Muda: (Non-Value-Added Work)
• Non-value-added tasks which can be
eliminated immediately.
Mura
• MURA: Any variation leading to
unbalanced situations.
• Mura exists when
• workflow is out of balance
• workload is inconsistent
• not in compliance with the standard.
Muri
• MURI: Any activity asking unreasonable
stress or effort from personnel, material
or equipment.
• For people, Muri means too heavy a mental
or physical burden.
• For machinery, Muri means expecting a
machine to do more than it is capable of or
has been designed to do.
Eight Types of Wastes
[Figure: wheel of the 8 types of wastes, with callouts for Transportation, Over Processing, Motion, Defects and Under-utilized staff]
1. Transportation
• Unnecessary movement of people or parts between processes.
2. Inventory
• Materials parked and not having value added to them.
3. Motion
• Unnecessary movement of people or parts within a process.
4. Waiting time
• People or parts waiting for a work cycle to finish.
5. Over processing
• Processing beyond the demand from the customers.
6. Overproduction
• Producing too much, too early and/or too fast.
7. Defects
• Sorting, repetition or making scrap
8. Unexploited knowledge
• Failure when it comes to exploiting the knowledge and talent of the employees.