Let $F(\cdot)$ be the distribution function of the quality characteristic $X$. Define a class of intervals $I$ as
$$I = \{(x, y) : F(y) - F(x) = 0.9973\} \qquad (2)$$
Nonparametric Capability Indices 33
and corresponding distances by
$$D = \{d_{xy} : d_{xy} = y - x,\ (x, y) \in I\} \qquad (3)$$
Let $d^*$ be the smallest distance in $D$, and define
$$C_p^* = \frac{U - L}{d^*} \qquad (4)$$
where $U$ and $L$ are as specified earlier. We note that the definition of $C_p^*$ does not assume any particular shape for the density $f(\cdot)$ of $X$. Also, when $F(\cdot)$ is normal, $d^*$ is $6\sigma$; hence $C_p^*$ agrees with the usual definition of $C_p$. We use 0.9973 as an aid for illustration. It may be replaced by any other fraction in (0, 1).
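The claim that the shortest 99.73% interval of a normal distribution has length $6\sigma$ can be checked numerically. The following is a minimal sketch, not the authors' program: the function names and the grid search over the lower-tail probability are assumptions made for illustration, and the specification limits $-4$ and $4$ are invented.

```python
from statistics import NormalDist

def shortest_interval(inv_cdf, coverage=0.9973, steps=2000):
    """Return (x, y) minimising y - x subject to F(y) - F(x) = coverage.

    inv_cdf is the quantile function of F. The search is a plain grid over
    the lower-tail probability a = F(x), so the result is approximate.
    """
    tail = 1.0 - coverage
    best = None
    for i in range(1, steps):            # a stays strictly inside (0, 1 - coverage)
        a = tail * i / steps
        x, y = inv_cdf(a), inv_cdf(a + coverage)
        if best is None or y - x < best[1] - best[0]:
            best = (x, y)
    return best

def cp_star(lower_spec, upper_spec, d_star):
    """(U - L) / d*, the index of equation (4)."""
    return (upper_spec - lower_spec) / d_star

# Standard normal F (sigma = 1): d* should be close to 6.
x_star, y_star = shortest_interval(NormalDist(0.0, 1.0).inv_cdf)
d_star = y_star - x_star
print(round(d_star, 3), round(cp_star(-4.0, 4.0, d_star), 3))
```

Any other quantile function can be passed in place of the normal one, which is how the same routine would be applied to a skewed distribution.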
2.2 Capability Index $C_{pk}^*$
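Let us assume that $f(\cdot)$ is unimodal with mode $m$. Then $x < m < y$ for any $(x, y) \in I$. Let $(x^*, y^*) \in I$ be such that $y^* - x^* = d^*$, where $d^*$ is the distance used in $C_p^*$. We define
$$C_{pk}^* = \min\left\{\frac{U - m}{y^* - m},\ \frac{m - L}{m - x^*}\right\} \qquad (5)$$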
It should be noted that $C_{pk}^* = C_{pk}$ when $F(\cdot)$ is the distribution function of a normal distribution.
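The agreement under normality can be verified in one line. Write $C_{pk}^*$ for the index defined in (5), and assume the usual definition $C_{pk} = \min\{(U-\mu)/3\sigma,\ (\mu-L)/3\sigma\}$, which is not restated in this section. For a normal density with mean $\mu$ and standard deviation $\sigma$, treating 0.9973 as the exact $\pm 3\sigma$ coverage:

```latex
m = \mu, \qquad x^* = \mu - 3\sigma, \qquad y^* = \mu + 3\sigma
\;\Longrightarrow\;
C_{pk}^* = \min\left\{ \frac{U - \mu}{(\mu + 3\sigma) - \mu},\ \frac{\mu - L}{\mu - (\mu - 3\sigma)} \right\}
         = \min\left\{ \frac{U - \mu}{3\sigma},\ \frac{\mu - L}{3\sigma} \right\} = C_{pk}.
```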
In the capability index $C_p$, the expression $6\sigma$ is used as a measure of process spread, while $d^* = y^* - x^*$ is used in $C_p^*$. The index $C_{pk}$ involves a measure of location, the process mean, and the measure of spread, $6\sigma$. When the process is skewed, the mode is a better measure of location than the mean, and hence $C_{pk}^*$ uses the mode as a measure of location and a function based on $y^*$ and $x^*$ as a measure of spread.
When the distribution is positively or negatively skewed, $6\sigma$ does not support 99.73% of the area under the density function, unlike for the normal distribution. The same is true even of some symmetric distributions, such as Student's t or the logistic. However, $y^*$ and $x^*$ are defined so that the interval $(x^*, y^*)$ does.

Values of $V = 6\sigma/(y^* - x^*)$ for different shape parameters:

Shape   V       Shape   V
0.2     1.57    0.05    0.68
0.6     0.82    0.20    0.82
0.8     0.93    0.70    0.98
1.0     1.01    1.00    1.01
1.7     1.13    2.00    1.04
2.5     1.13    3.00    1.04
3.5     1.06
4.0     1.01
4.5     0.96

If
$$V = \frac{6\sigma}{y^* - x^*} = \frac{C_p^*}{C_p} > 1$$
then $6\sigma$ supports more than 99.73% of the area and $C_p$ may indicate that the process is not capable when it really is.
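As a small worked case of the ratio $V = 6\sigma/(y^* - x^*)$, consider an exponential distribution with unit rate. This distribution is an assumption chosen because everything is available in closed form; the function name is hypothetical. The density is decreasing, so the shortest interval holding 99.73% of the probability starts at zero.

```python
import math

COVERAGE = 0.9973

def v_ratio_exponential(rate=1.0):
    """V = 6*sigma / (y* - x*) for an exponential distribution.

    The density is decreasing on [0, inf), so the shortest interval with
    the required coverage is [0, q], where q is the COVERAGE quantile.
    """
    sigma = 1.0 / rate
    x_star = 0.0
    y_star = -math.log(1.0 - COVERAGE) / rate
    return 6.0 * sigma / (y_star - x_star)

v = v_ratio_exponential()
print(round(v, 2))
```

Here $V$ comes out slightly above 1, so $6\sigma$ is slightly longer than the shortest 99.73% interval and the traditional index is slightly pessimistic about spread.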
On the other hand, if $V < 1$, the index $C_p$ may declare a process to be capable when it is not. Finally, when $V \approx 1$, both indices $C_p$ and $C_p^*$ suggest the process to be capable; it should be noted, however, that the process mode may not be on target. Hence $C_{pk}^*$ is recommended as a measure of process capability. It should also be noted that when the process is normal and the mean is on the target, $C_p$ can be used to determine the number of defective parts per million (DPPM), using the probability of nonconformance $2\Phi(-3C_p)$ ($C_p = 1.0$ implies 2700 DPPM, $C_p = 1.33$ indicates 63 DPPM and $C_p = 2.0$ implies 0.002 DPPM). For non-normal $F(\cdot)$, the probability of nonconformance is given by
$$1 - F(\mu + \alpha d^* C_p^*) + F(\mu - (1 - \alpha) d^* C_p^*) \qquad (6)$$
where $0 < \alpha < 1$ and $\mu = \alpha L + (1 - \alpha)U$. Since $d^* C_p^* = U - L$, the two arguments are $U$ and $L$, so (6) is the probability mass outside the specification limits. Further, this can be used to make statements about DPPM.
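The DPPM figures quoted for the normal, on-target case can be reproduced with the standard library. This is a sketch under stated assumptions: the function names are illustrative, $C_p = 1.33$ is taken as $4/3$, and the general nonconformance probability is taken as $1 - F(U) + F(L)$, the mass outside the specification limits.

```python
import math

def dppm_normal_centered(cp):
    """DPPM for a normal process with the mean on target: 1e6 * 2*Phi(-3*Cp)."""
    # 2 * Phi(-t) equals erfc(t / sqrt(2)); erfc stays accurate in the far tail.
    return 1e6 * math.erfc(3.0 * cp / math.sqrt(2.0))

def nonconformance(cdf, lower_spec, upper_spec):
    """Probability outside [L, U] for any distribution function F: 1 - F(U) + F(L)."""
    return 1.0 - cdf(upper_spec) + cdf(lower_spec)

for cp in (1.0, 4.0 / 3.0, 2.0):
    print(f"Cp = {cp:.2f}: {dppm_normal_centered(cp):.3g} DPPM")
```

For a non-normal process, `nonconformance` would be called with an estimated or fitted distribution function and the result multiplied by $10^6$.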
3 Estimation of $C_p^*$ and $C_{pk}^*$
Here we propose nonparametric estimators $\hat C_p^*$ and $\hat C_{pk}^*$ of $C_p^*$ and $C_{pk}^*$, respectively, based on an independent sample $(X_1, \ldots, X_n)$ for $X$.
Initially we estimate the unknown density $f$ by $\hat f_n$ using a kernel estimator of the form:
$$\hat f_n(x) = \frac{1}{n h_n} \sum_{i=1}^{n} k\left(\frac{x - X_i}{h_n}\right) \qquad (7)$$
where $h_n$ is the bandwidth and $k$ an appropriate kernel. The details regarding the optimum choice of $h_n$, the choice of the kernel $k$ and the asymptotic properties of the estimator [...] $\hat f_n(x)$. That is, $m_n$ is an estimator of the mode $m$, based on a given sample. Note that $\hat f_n(m_n) > 0$ and for $c \in (0, \hat f_n(m_n))$, there exist at least two points $x_n$, $y_n$ with $x_n < y_n$, such that $\hat f_n(x_n) = \hat f_n(y_n) = c$. If there are more than two points which solve the equation $\hat f_n(\cdot) = c$, then let $x_n$ be the smallest and $y_n$ be the largest.
Now let
$$\hat F_n(x) = \int_{-\infty}^{x} \hat f_n(z)\,dz \qquad (8)$$
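The kernel density estimate, its integral and the sample mode can be sketched in a few lines. The Gaussian kernel, the hand-picked bandwidth, the grid search for the mode and the data values below are all assumptions made for illustration; the text leaves the kernel and bandwidth open.

```python
import math

def kde_pdf(sample, h):
    """Kernel density estimate of the form (7), here with a Gaussian kernel k."""
    n = len(sample)
    norm = n * h * math.sqrt(2.0 * math.pi)
    def f_hat(x):
        return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample) / norm
    return f_hat

def kde_cdf(sample, h):
    """Integral of the estimate as in (8); closed form for a Gaussian kernel."""
    n = len(sample)
    def F_hat(x):
        # Phi(t) = 0.5 * erfc(-t / sqrt(2)), averaged over the sample points
        return sum(0.5 * math.erfc(-(x - xi) / (h * math.sqrt(2.0))) for xi in sample) / n
    return F_hat

def kde_mode(f_hat, lo, hi, steps=2000):
    """Grid-search maximiser of f_hat on [lo, hi]; a crude estimate of the mode."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=f_hat)

# Hypothetical, made-up observations used only to exercise the functions.
data = [0.2, 0.4, 0.5, 0.5, 0.6, 0.7, 0.9, 1.2, 1.6, 2.3]
h = 0.3                                   # bandwidth picked by hand for the sketch
f_hat, F_hat = kde_pdf(data, h), kde_cdf(data, h)
m_n = kde_mode(f_hat, min(data) - 3 * h, max(data) + 3 * h)
print(round(m_n, 2), round(F_hat(max(data) + 10 * h), 4))
```

With a Gaussian kernel the integral has a closed form, which avoids numerical quadrature; other kernels would need their own integrated form.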
For a given $p$ $(0 < p < 1)$, let $x_n$ and $y_n$ be such that $\hat f_n(x_n) = \hat f_n(y_n)$ and $\hat F_n(y_n) - \hat F_n(x_n) = p$. Using the pair $(x_n, y_n)$, we define $d_n = y_n - x_n$,
$$\hat C_p^* = \frac{U - L}{d_n} \qquad (9)$$
and
$$\hat C_{pk}^* = \min\left\{\frac{m_n - L}{m_n - x_n},\ \frac{U - m_n}{y_n - m_n}\right\} \qquad (10)$$
as the estimators of $d^*$, $C_p^*$ and $C_{pk}^*$, respectively. The asymptotic properties of the estimators are given in the Appendix.
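One way to locate the pair $(x_n, y_n)$ and evaluate (9) and (10) is a nested bisection: an outer search on the density level $c$ and an inner search for the two points where the density equals $c$. This is a sketch, not the authors' program. It assumes a unimodal density estimate that is monotone on each side of the mode, and a search range `[lo, hi]` that holds more than $p$ of the mass. For a self-contained check, an exact normal density and distribution function stand in for $\hat f_n$ and $\hat F_n$; all names and numbers are hypothetical.

```python
from statistics import NormalDist

def level_interval(pdf, mode, lo, hi, c, iters=60):
    """Points x < mode < y with pdf(x) = pdf(y) = c, found by bisection.

    Assumes pdf is below c at lo and hi, at least c at the mode, and
    monotone on each side of the mode.
    """
    def cross(a, b):                      # invariant: pdf(a) < c <= pdf(b)
        for _ in range(iters):
            mid = 0.5 * (a + b)
            if pdf(mid) < c:
                a = mid
            else:
                b = mid
        return 0.5 * (a + b)
    return cross(lo, mode), cross(hi, mode)

def star_indices(pdf, cdf, mode, lo, hi, L, U, p=0.9973, iters=60):
    """Find (x_n, y_n) with equal density and coverage p, then apply (9) and (10)."""
    c_lo, c_hi = max(pdf(lo), pdf(hi)), pdf(mode)
    for _ in range(iters):                # coverage shrinks as the level c rises
        c = 0.5 * (c_lo + c_hi)
        x, y = level_interval(pdf, mode, lo, hi, c)
        if cdf(y) - cdf(x) > p:
            c_lo = c                      # interval too wide: raise the level
        else:
            c_hi = c                      # interval too narrow: lower the level
    cp = (U - L) / (y - x)
    cpk = min((mode - L) / (mode - x), (U - mode) / (y - mode))
    return x, y, cp, cpk

# Stand-in for the kernel estimates: a normal density with mean 1, sd 0.5.
nd = NormalDist(1.0, 0.5)
x_n, y_n, cp_hat, cpk_hat = star_indices(nd.pdf, nd.cdf, 1.0, -3.0, 5.0, L=-1.0, U=3.0)
print(round(x_n, 3), round(y_n, 3), round(cp_hat, 3), round(cpk_hat, 3))
```

In practice `pdf`, `cdf` and `mode` would come from the kernel estimate of the sample. A multimodal estimate would need the smallest and largest crossing points rather than a single bisection on each side.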
The computations of $\hat C_p^*$ and $\hat C_{pk}^*$ are not as easy as those of $\hat C_p$ and $\hat C_{pk}$. However, these can be programmed relatively easily.
4 Examples
We consider two examples to illustrate the importance of the modified capability indices.
4.1 Example 1: Finish bolt hole boring in a XL0-2 spindle machine
A connecting rod (connecting the piston and the crankshaft) is a vital part in an
automobile engine. A defective rod can reduce the power of an engine and the failure
36 T. V. Ramanathan, A.D. Dharmadhikari and Bovas Abraham
of the rod leads to failure of the engine. This study dealt with the capability of the
manufacturing of such rods and in particular with the finish bolt hole boring operation.
The objective was to reduce the variation in the diameter of the bolt hole. An air gauge
was used to record the deviations from the targeted diameter (measured in thousandths of
an inch). However, the data are coded to protect confidentiality. The specifications
are L = 0, U = 3 (coded units).
A sample of size $n = 95$ was selected from a day's production to evaluate the capability
of the process. Normality tests and a normal probability plot indicated that the data were
non-normal. The following statistics were calculated for these data.
The sample mean $\bar x = 1.038$, sample mode $m_n = 0.587$, and the sample standard deviation
$s = 0.579$.
These lead to $\hat C_p = 0.86$ and $\hat C_{pk} = 0.60$, which seems to indicate that the process is not capable. However, we also calculated $x_n = 0.011$ and $y_n$ [...] ([...] $n^{-1/5}$, where $s$ is the sample standard deviation and $n$ is the sample size. See Silverman (1986) for more details on such a choice.) Hence we obtain $\hat C_p^* = 1.03$ and $\hat C_{pk}^* = 1.02$. We note that $\hat C_p$ is less than one and it is less than $\hat C_p^*$. The difference between $\hat C_{pk}$ and $\hat C_{pk}^*$ is even larger. This is because the mode is a better measure of location than the mean in this context. In situations like this, $\hat C_p^*$ and $\hat C_{pk}^*$ are better indicators of the process performance than the traditional ones.
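The reported values for Example 1 can be cross-checked from the summary statistics alone. The sketch below assumes the usual definitions of the traditional indices, $(U - L)/6s$ and $\min(U - \bar x,\ \bar x - L)/3s$, which are not restated in this excerpt, and evaluates only the lower-side term $(m_n - L)/(m_n - x_n)$ of the mode-based index, since only the lower point 0.011 is available here. Variable names are illustrative.

```python
# Summary statistics reported for Example 1 (coded units).
L_spec, U_spec = 0.0, 3.0
mean, mode, s = 1.038, 0.587, 0.579
x_n = 0.011

cp_usual = (U_spec - L_spec) / (6.0 * s)
cpk_usual = min(U_spec - mean, mean - L_spec) / (3.0 * s)
lower_term = (mode - L_spec) / (mode - x_n)   # lower-side term of the mode-based index

print(round(cp_usual, 2), round(cpk_usual, 2), round(lower_term, 2))
```

The lower-side term alone reproduces the reported value 1.02, which is consistent with the minimum being attained on the lower side.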
4.2 Example 2: Pathalo Green
Pathalo green is a fine powder used as a coloring substance in paints. The manufacturing
process for the pigments involves two kettles, which are heated by oil. The material inside
the kettle is stirred at a predetermined velocity. The strength of the pigment is the quality
characteristic of interest and it is expected to be the same for fixed temperatures, velocity
and cycle time. However, there was considerable variation over 100 batches and there was
interest in reducing this variation.
Initially, the capability of the process was to be measured. One sample was taken from
each of the 100 batches and the strength was measured. As in the previous example,
normality tests and normal probability plots of the coded measurements revealed that
the assumption of normality is questionable. Therefore, the capability indices based on
the normality assumption may be misleading. The specified tolerances in coded units are
L = 0 and U = 4.
For these data $\bar x = 0.917$, $m_n = 0.410$, and $s = 0.663$. Thus, $\hat C_p = 1.01$ and $\hat C_{pk} = 0.46$. In this case $x_n = 0.00$ and $y_n$ [...] $\hat C_p^* = 1.57$ and $\hat C_{pk}^* = 1.00$. (The same procedures explained in Example 1 are adopted.) The new capability measures give the same indication, namely that the process is capable; the usual indices $C_p$ and $C_{pk}$ give mixed signals. As mentioned earlier, the mode is a better measure of location than the mean in the present context.
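The arithmetic behind the Example 2 figures can be written out explicitly. The traditional indices are taken in their usual form, an assumption since they are not restated in this excerpt, and only the lower-side term of the mode-based index is shown because it is the only one computable from the values quoted:

```latex
\hat C_p = \frac{U - L}{6s} = \frac{4 - 0}{6 \times 0.663} \approx 1.01,
\qquad
\hat C_{pk} = \frac{\min(U - \bar x,\ \bar x - L)}{3s} = \frac{0.917}{3 \times 0.663} \approx 0.46,
\qquad
\frac{m_n - L}{m_n - x_n} = \frac{0.410 - 0}{0.410 - 0.00} = 1.00 .
```

The last ratio matches the reported mode-based value of 1.00, which suggests the minimum is attained on the lower side.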
These examples indicate that, when the data are not normal, new measures such as the
ones introduced here may be more appropriate than the traditional ones.
5 Concluding Remarks
In this paper, we introduce capability indices which do not depend on any distributional assumptions. It is shown that the usual $C_p$ and $C_{pk}$ can be misleading when the process distribution is non-normal. Examples illustrate that the new indices are better indicators of the actual process capability.
We acknowledge that the new indices are harder to compute. However, in the current computing environment, these can be easily programmed and made available to a user. In fact, the authors have written a program to compute these indices given the observations on the quality characteristic.
When the sample size is adequate to test the goodness-of-fit of a distribution from a specified parametric family, that exercise is recommended and the estimated percentiles may be plugged in to evaluate $\hat C_p^*$ and $\hat C_{pk}^*$. When no standard distribution is found suitable for the quality characteristic, the method proposed in this paper may be employed to evaluate $\hat C_p^*$ and $\hat C_{pk}^*$.
The distributional properties of $\hat C_p^*$ and $\hat C_{pk}^*$ for finite sample sizes can be studied using bootstrap procedures as in Franklin and Wasserman (1991, 1992).
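A bootstrap study of this kind can be sketched generically. The code below is a plain percentile bootstrap and is not claimed to be the specific procedure of the cited papers. To stay short it uses the traditional index $(U - L)/6s$ as a stand-in statistic; any function of the sample, including the kernel-based estimators, could be passed instead. The data, specification limits and function names are invented for illustration.

```python
import random
import statistics

def bootstrap_percentile_interval(sample, statistic, reps=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for statistic(sample)."""
    rng = random.Random(seed)             # fixed seed so the run is repeatable
    n = len(sample)
    values = sorted(
        statistic([sample[rng.randrange(n)] for _ in range(n)])
        for _ in range(reps)
    )
    k_lo = int((1.0 - level) / 2.0 * (reps - 1))
    k_hi = int((1.0 + level) / 2.0 * (reps - 1))
    return values[k_lo], values[k_hi]

def cp_usual(sample, L=0.0, U=3.0):
    """Stand-in statistic: the traditional (U - L) / (6 s)."""
    return (U - L) / (6.0 * statistics.stdev(sample))

# Hypothetical observations, invented for the sketch.
data = [0.3, 0.5, 0.6, 0.6, 0.8, 0.9, 1.1, 1.4, 1.9, 2.4, 0.7, 1.0]
low, high = bootstrap_percentile_interval(data, cp_usual)
print(low, high)
```

The width of the resulting interval gives a rough picture of the finite-sample variability of whichever estimator is supplied.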
It is possible to define further indices $C_{pm}^*$ and $C_{pmk}^*$ along the same lines as $C_{pm}$ and $C_{pmk}$. This may be done when (y [...]
[...] $\int e^{iuy} k(y)\,dy$ is absolutely integrable over $\mathbb{R}$.
A4 The sequence $\{h_n\}$ is such that $h_n \to 0$ and $n h_n \to \infty$ as $n \to \infty$.
We note that under the assumptions A1 to A4, it can be proved that
$$\sup_x |\hat f_n(x) - f(x)| \to 0 \text{ in probability} \qquad (11)$$
$$m_n \to m \text{ in probability} \qquad (12)$$
where $m_n$ and $m$ are the sample and population modes, respectively (see Dharmadhikari and Joag-Dev (1988, p. 198)).
The following lemma and theorem are useful to establish the consistency of the estimators $\hat C_p^*$ and $\hat C_{pk}^*$.
Lemma:
Let $f(\cdot)$ be a unimodal density satisfying A1, $p$ be a number with $0 < p < 1$, $k$ and $h_n$ be the kernel and bandwidth, respectively, satisfying A2 - A4 and $\hat f_n$ be the estimator of $f$ defined as in (7). Then
$$\hat f_n(x_n) \to f(x^*) \quad\text{and}\quad \hat f_n(y_n) \to f(y^*) \text{ in probability} \qquad (13)$$
where $\{(x_n, y_n),\ n \ge 1\}$ are as defined in Section 3.
Proof:
We know that $F(y^*) - F(x^*)$ [...]
$$x^* = \sup\{z : (z, w) \in B(p, c)\} \quad\text{and}\quad y^* = \inf\{w : (z, w) \in B(p, c)\} \qquad (15)$$
Let $D_n(c)$ and $B_n(p, c)$ be the sample versions of $D(c)$ and $B(p, c)$, respectively, based on a sample of size $n$, i.e.,
$$D_n(c) = \{(z, w) : \hat f_n(z) = \hat f_n(w) = c,\ z \le w\} \qquad (16)$$
$$B_n(p, c) = \{(z, w) : (z, w) \in D_n(c),\ \hat F_n(w) - \hat F_n(z) \ge p\} \qquad (17)$$
Define
$$x_n = \sup\{z : (z, w) \in B_n(p, c)\} \quad\text{and}\quad y_n = \inf\{w : (z, w) \in B_n(p, c)\} \qquad (18)$$
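Now
$$|\hat f_n(x_n) - f(x^*)| \le \left|\sup_{z \in B_n(p,c)} \hat f_n(z) - \sup_{z \in B(p,c)} f(z)\right| \le \left|\max\left\{\sup_{z \in A} \hat f_n(z),\ \sup_{z \in A'} \hat f_n(z)\right\} - \min\left\{\sup_{z \in A} f(z),\ \sup_{z \in A''} f(z)\right\}\right| \qquad (19)$$
where $A = B_n(p, c) \cap B(p, c)$, $A' = B_n(p, c) \cap B^c(p, c)$, $A'' = B(p, c) \cap B_n^c(p, c)$, and $B^c$ and $B_n^c$ are the complements of $B$ and $B_n$, respectively.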
As a consequence of A1, we obtain
$$B(p, c) \cap B_n^c(p, c) \to \emptyset \quad\text{and}\quad B^c(p, c) \cap B_n(p, c) \to \emptyset \quad\text{as } n \to \infty \qquad (20)$$
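Hence
$$\limsup_{n \to \infty} |\hat f_n(x_n) - f(x^*)| \le \lim_{n \to \infty} \sup_{z \in A} |\hat f_n(z) - f(z)| \le \lim_{n \to \infty} \sup_{z \in \mathbb{R}} |\hat f_n(z) - f(z)|$$
which converges to 0 in probability by (11). The proof for $\hat f_n(y_n) \to f(y^*)$ is analogous.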
Theorem:
Under the assumptions of the above lemma, $x_n \to x^*$ and $y_n \to y^*$ [...]
$$|f(x_n) - f(x^*)| \le |\hat f_n(x_n) - f(x_n)| + |\hat f_n(x_n) - f(x^*)| \qquad (23)$$
The right hand side converges in probability to 0 because of (11) and the above lemma.
Since $f(\cdot)$ is uniformly continuous, the result follows.
As a consequence of the above theorem and the definitions of $\hat C_p^*$ and $\hat C_{pk}^*$,
$$\hat C_p^* \to C_p^* \quad\text{and}\quad \hat C_{pk}^* \to C_{pk}^* \text{ in probability} \qquad (24)$$
However, since $\hat C_p^*$ and $\hat C_{pk}^*$ are based on kernel estimators of the probability density function, the convergence could be slow (Silverman 1986, Section 3.7.2).
Note: The asymptotic distribution of the estimators of the suggested capability indices may
possibly be investigated by writing them as functions of quantile processes and then
establishing the weak convergence of such processes.
T.V. Ramanathan
Department of Statistics
University of Botswana
P. B. No. 0022, Gaborone
Botswana
Ramanathan@mopipi.ub.bw

A.D. Dharmadhikari
Department of Statistics
University of Pune
Ganeshkhind, Pune
India, 411 007
adhar@stats.unipune.ernet.in

Bovas Abraham
Department of Statistics and Actuarial Science
University of Waterloo
Waterloo, Ontario
Canada, N2L 3G1
babraham@math.uwaterloo.ca