Nonparametric Capability Indices - Ramanathan

© Heldermann Verlag Economic Quality Control
ISSN 0940-5151 Vol 18 (2003), No. 1, 31-41

Nonparametric Capability Indices
T. V. Ramanathan, A.D. Dharmadhikari and Bovas Abraham
Abstract: Process capability indices $C_p$ and $C_{pk}$ are widely used in statistical quality control
to assess the capability of a process. These indices are defined based on the assumption that
the quality characteristic follows a normal distribution. In this paper two new capability indices
are considered, which do not depend on any distributional assumptions. The estimation and
asymptotic properties of the estimators of these indices are investigated. Two examples are
given to illustrate the importance of the given indices and their computational procedures.
Key Words: Capability, Estimation, Non-normal
1 Introduction
Many organizations, as part of their quality improvement programs, use capability
indices as a measure of their process performance. These are typically unitless quantities
comparing the allowable process spread with the actual process spread. Several indices are
available in the literature (see Kotz and Johnson (1993) and Kotz and Lovelace (1998)).
Kotz and Johnson (2002) provide a compact survey with interpretations and comments
on some 170 publications on process capability indices that appeared during 1992-2000.
Two well-known indices are
$$ \text{(i)}\ C_p = \frac{U - L}{6\sigma} \quad \text{and} \quad \text{(ii)}\ C_{pk} = \min\left\{ \frac{U - \mu}{3\sigma},\ \frac{\mu - L}{3\sigma} \right\} \tag{1} $$
where $U$ = upper specification limit, $L$ = lower specification limit, $\sigma$ = process standard
deviation, and $\mu$ = process mean.
These definitions are motivated by the fact that for a normally distributed process, $6\sigma$ is
the actual process spread covering 99.73% of the parts; if the specified process tolerance
$U - L = 6\sigma$, then $C_p = 1$ and the process is said to be just capable. Hence efforts are
usually made to keep $C_p$ higher than one. However, $C_p$ can be misleading if the process
is not on target, even when it is normally distributed. In such cases, the index $C_{pk}$ is
recommended.
In practice, the true values of $C_p$ and $C_{pk}$ are unknown and one uses their estimates based
on the estimates of $\mu$ and $\sigma$ from a sample of size $n$. This estimation requires appropriate
sampling plans (the number of observations, the sources of variation included, etc.) so
that the variability of the resulting estimates is small.
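As a minimal sketch of the plug-in estimates just described, the following Python function replaces $\mu$ and $\sigma$ in (1) by the sample mean and the sample standard deviation. The function name, the data and the specification limits are hypothetical and serve only as an illustration.

```python
import statistics

def classical_indices(data, lower, upper):
    """Return plug-in estimates (Cp, Cpk) from the sample mean and standard deviation."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    cp = (upper - lower) / (6 * sd)
    cpk = min((upper - mean) / (3 * sd), (mean - lower) / (3 * sd))
    return cp, cpk

# Hypothetical measurements and specification limits, for illustration only.
sample = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
cp, cpk = classical_indices(sample, lower=9.0, upper=11.0)
print(cp, cpk)
```

Because this hypothetical sample is centred midway between the limits, the two estimates coincide; shifting the data away from the centre lowers the $C_{pk}$ estimate while leaving the $C_p$ estimate unchanged.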
As indicated above, $C_p$ and $C_{pk}$ are motivated by the normal distribution. However,
with the increased complexity of shape of the parts and introduction of computerized
manufacturing and measurement systems, there are situations where this assumption
may not be appropriate. For instance, measurements such as out of round, surface finish
(roughness), positional deviations (co-axiality and symmetry), and directional deviations
may not be normal (see for example Munechika (1986)). Thus, there is a need to look
into the aspects of capability indices when the quality characteristic is non-normal.
This is not the first time that the problem of non-normality relating to capability indices
is being looked into. Both books, Kotz and Johnson (1993) and Kotz and Lovelace (1998),
devote a chapter to reviewing such problems. The recent review paper of Kotz and Johnson
(2002) also discusses the problems and consequences of non-normality. The most commonly
used techniques to handle non-normal data are transformation and quantile estimation.
Many practitioners are not comfortable with transformed data and may have difficulty in
translating the results back to the original scale. Often it is also difficult to identify the
correct transformation.
Clements (1989) suggested a procedure to modify the capability indices $C_p$ and $C_{pk}$ using
quantile estimates. This method consists of replacing $6\sigma$ with the difference of two
quantiles, $X_{0.99865}$ and $X_{0.00135}$, where $P(X \le X_{0.99865}) = 0.99865$
and $P(X \le X_{0.00135}) = 0.00135$. If the distributional form of the quality characteristic
is not known, it is often approximated by a member of the Pearson family. Subsequently,
the approximations for $X_{0.99865}$ and $X_{0.00135}$ are determined. Several authors such
as Gilchrist (1993), Chang and Lu (1994), Pearn and Kotz (1994) and Sundaraiyer (1996)
have extended Clements' method to incorporate various related situations. In most of the
earlier work, the authors have tried to fit an appropriate probability distribution of the
process from available data and define indices based on the estimated distribution. Such
an approach would require large amounts of data to have a clear understanding of the
shape of the distribution, and the analysis can also be very sensitive to departures from
that distribution.
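The quantile idea can be sketched in a few lines of Python. Note that this sketch substitutes plain empirical sample quantiles for the Pearson-family approximation used in Clements' method, so it illustrates only the replacement of $6\sigma$ by $X_{0.99865} - X_{0.00135}$; the function name and the simulated data are assumptions. It also shows why such approaches need large samples: the two quantiles lie far out in the tails.

```python
import numpy as np

def quantile_cp(data, lower, upper):
    """C_p with 6*sigma replaced by the spread between the 0.00135 and 0.99865 quantiles."""
    q_low, q_high = np.quantile(data, [0.00135, 0.99865])
    return (upper - lower) / (q_high - q_low)

# Hypothetical skewed process, simulated only for illustration.
rng = np.random.default_rng(0)
skewed = rng.gamma(shape=2.0, scale=1.0, size=100_000)
print(quantile_cp(skewed, lower=0.0, upper=12.0))
```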
An alternative consists of proposing modified capability indices, which can be applied
without reference to a particular distribution. In this paper, we introduce such indices.
The indices are defined in Section 2 and an estimating procedure is discussed in Section
3. Two examples are considered in Section 4 and some concluding remarks are given in
Section 5. Asymptotic properties of the estimators are established in the Appendix.
2 Modified Capability Indices
2.1 Capability Index $C^*_p$
Let $F(\cdot)$ be the distribution function of the quality characteristic $X$. Define a class of intervals
$I$ as
$$ I = \{ (x, y) : F(y) - F(x) = 0.9973 \} \tag{2} $$
and the corresponding distances by
$$ D = \{ d_{xy} : d_{xy} = y - x,\ (x, y) \in I \} \tag{3} $$
Let $d^* = \inf D$. Then we define a new capability index
$$ C^*_p = \frac{U - L}{d^*} \tag{4} $$
where $U$ and $L$ are as specified earlier. We note that the definition of $C^*_p$ does not assume
any particular shape for the density $f(\cdot)$ of $X$. Also, when $F(\cdot)$ is normal, $d^*$ is $6\sigma$; hence
$C^*_p$ agrees with the usual definition of $C_p$. We use 0.9973 as an aid for illustration. It
may be replaced by any other fraction in (0, 1).
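The agreement with $C_p$ in the normal case can be checked with a short calculation. The following is a sketch, writing $d^*$ for the infimum of the interval lengths in (3) and $\Phi$ for the standard normal distribution function; it relies on the standard fact that, for a symmetric unimodal density, the shortest interval with a given probability content is centred at the mode.

```latex
% Normal case: X ~ N(\mu, \sigma^2).
% The shortest interval with content 0.9973 has the form (\mu - z\sigma, \mu + z\sigma), where
\begin{align*}
F(\mu + z\sigma) - F(\mu - z\sigma) &= \Phi(z) - \Phi(-z) = 2\Phi(z) - 1 = 0.9973 \\
\Longrightarrow \quad \Phi(z) &= 0.99865 \quad \Longrightarrow \quad z \approx 3 .
\end{align*}
% Hence
\[
d^* = (\mu + 3\sigma) - (\mu - 3\sigma) = 6\sigma ,
\qquad
\frac{U - L}{d^*} = \frac{U - L}{6\sigma} = C_p .
\]
```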
2.2 Capability Index $C^*_{pk}$
Let us assume that $f(\cdot)$ is unimodal with mode $m$. Then $x < m < y$ for any $(x, y) \in I$.
Let $(x^*, y^*)$ be the interval in $I$ such that $d^* = y^* - x^*$, where $d^*$ is as given in the definition
of $C^*_p$. We define
$$ C^*_{pk} = \min\left\{ \frac{U - m}{y^* - m},\ \frac{m - L}{m - x^*} \right\} \tag{5} $$
It should be noted that $C^*_{pk} = C_{pk}$ when $F(\cdot)$ is the distribution function of a normal
distribution.
In the capability index $C_p$, the expression $6\sigma$ is used as a measure of process spread, while
$d^* = y^* - x^*$ is used in $C^*_p$. The index $C_{pk}$ involves a measure of location, the process
mean $\mu$, and the measure of spread, $6\sigma$. When the process is skewed, the mode is a better
measure of location than the mean, and hence $C^*_{pk}$ uses the mode as a measure of location
and a function based on $y^*$ and $x^*$ as a measure of spread.
When the distribution is positively or negatively skewed, $6\sigma$ does not support 99.73%
of the area under the density function, unlike for the normal distribution. The same is the case
with even some of the symmetric distributions such as Student's t or logistic. However, $y^*$
and $x^*$ are defined such that $(x^*, y^*)$ is the shortest interval supporting the area of 99.73%.
We demonstrate this by computing $V = 6\sigma / (y^* - x^*)$ for Weibull and Gamma distributions
with varying shape parameters. In Table 2.1 some numerical results are presented for
illustration.
It can be seen that when the shape parameter of the Weibull is in the range (0.8, 4.5),
the ratio $V$ ranges between 0.93 and 1.13. In other cases, the ratio is farther from 1. In
the case of the Gamma distribution, when the shape parameter is in the range (0.05, 1.0), the
ratio $V$ ranges from 0.68 to 1.01.
Table 2.1: Values of $V = 6\sigma / (y^* - x^*)$

Weibull with Scale Parameter 1      Gamma with Scale Parameter 1
Shape     V                         Shape     V
0.2       1.57                      0.05      0.68
0.6       0.82                      0.20      0.82
0.8       0.93                      0.70      0.98
1.0       1.01                      1.00      1.01
1.7       1.13                      2.00      1.04
2.5       1.13                      3.00      1.04
3.5       1.06
4.0       1.01
4.5       0.96
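One row of the table can be checked by hand. With shape parameter 1 and scale 1, both the Weibull and the Gamma distribution reduce to the unit exponential, whose density is decreasing on $[0, \infty)$, so the shortest interval with content 0.9973 starts at 0. The following sketch (variable names are illustrative) computes the ratio $V$ of $6\sigma$ to the length of that interval for this case.

```python
import math

p = 0.9973
sigma = 1.0                  # standard deviation of the unit exponential
x_star = 0.0                 # the density is decreasing, so the shortest interval starts at 0
y_star = -math.log(1.0 - p)  # solves F(y) - F(0) = 1 - exp(-y) = p
v = 6.0 * sigma / (y_star - x_star)
print(round(v, 2))  # prints 1.01
```

This agrees with the entries for shape 1.0 in both halves of the table.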
If
$$ V = \frac{6\sigma}{y^* - x^*} = \frac{C^*_p}{C_p} > 1 $$
then $6\sigma$ supports more than 99.73% of the area and $C_p$ may indicate that the process is
not capable when it really is.
On the other hand, if $V < 1$, the index $C_p$ may declare a process to be capable when it
is not. Finally, when $V \approx 1$, both indices $C_p$ and $C^*_p$ suggest the process to be capable;
it should be noted, however, that the process mode may not be on target. Hence $C^*_{pk}$
is recommended as a measure of process capability. It should also be noted that when
the process is normal and the mean is on the target, $C_p$ can be used to determine the
number of defective parts per million (DPPM), using the probability of nonconformance
$2\Phi(-3C_p)$ ($C_p = 1.0$ implies 2700 DPPM, $C_p = 1.33$ indicates 63 DPPM and $C_p = 2.0$
implies 0.002 DPPM). For non-normal $F(\cdot)$, the probability of nonconformance is given
by
$$ 1 - F(\theta + \alpha d_p) + F(\theta - (1 - \alpha) d_p) \tag{6} $$
where $0 < \alpha < 1$ and $\theta = \alpha L + (1 - \alpha) U$ (for $d_p = U - L$ this expression reduces to
$1 - F(U) + F(L)$). Further, this can be used to make statements about DPPM.
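The DPPM figures quoted for the normal, on-target case follow directly from the nonconformance probability $2\Phi(-3C_p)$. The sketch below evaluates it with the standard library only; the helper names are illustrative. The value quoted for $C_p = 1.33$ corresponds to $C_p = 4/3$, which is the value used here.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def dppm_normal_centered(cp):
    """Defective parts per million for a normal process whose mean is on target."""
    return 2.0 * normal_cdf(-3.0 * cp) * 1e6

for cp in (1.0, 4.0 / 3.0, 2.0):
    print(round(cp, 2), dppm_normal_centered(cp))
```

The three values come out at roughly 2700, 63 and 0.002 DPPM, in line with the figures in the text.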
3 Estimation of $C^*_p$ and $C^*_{pk}$
Here we propose nonparametric estimators $\hat{C}^*_p$ and $\hat{C}^*_{pk}$ of $C^*_p$ and $C^*_{pk}$, respectively, based
on an independent sample $(X_1, \ldots, X_n)$ for $X$.
Initially we estimate the unknown density $f$ by $\hat{f}_n$ using a kernel estimator of the form
$$ \hat{f}_n(x) = \frac{1}{n h_n} \sum_{i=1}^{n} k\left( \frac{x - X_i}{h_n} \right) \tag{7} $$
where $h_n$ is the bandwidth and $k$ an appropriate kernel. The details regarding the
optimum choice of $h_n$, the choice of the kernel $k$ and the asymptotic properties of the estimator
$\hat{f}_n$ can be found in Rao (1983) or Silverman (1986).
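A kernel estimator of this form is straightforward to evaluate numerically. The following is a minimal sketch using NumPy, with the Epanechnikov kernel and the bandwidth $1.06\, s\, n^{-1/5}$ that the paper reports using in Example 1; the function names, the simulated sample and the evaluation grid are assumptions made for illustration.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def kde(x, sample, h):
    """Evaluate the kernel density estimate at the points x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    sample = np.asarray(sample, dtype=float)
    u = (x[:, None] - sample[None, :]) / h   # one row per evaluation point
    return epanechnikov(u).sum(axis=1) / (sample.size * h)

# Hypothetical skewed sample, simulated only for illustration.
rng = np.random.default_rng(0)
sample = rng.gamma(shape=2.0, scale=0.5, size=200)
h = 1.06 * sample.std(ddof=1) * sample.size ** (-1 / 5)
grid = np.linspace(sample.min() - h, sample.max() + h, 2001)
density = kde(grid, sample, h)
print(grid[np.argmax(density)])  # grid point where the estimated density is largest
```

Since the Epanechnikov kernel has support $[-1, 1]$, the estimate vanishes outside the sample range widened by one bandwidth, which is why the grid above extends exactly that far.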

Let $m_n$ be such that $\hat{f}_n(m_n) = \sup_x \hat{f}_n(x)$. That is, $m_n$ is an estimator of the mode $m$,
based on a given sample. Note that $\hat{f}_n(m_n) > 0$ and for $c \in \left( 0, \hat{f}_n(m_n) \right)$, there exist at
least two points $x_n$, $y_n$ with $x_n < y_n$, such that $\hat{f}_n(x_n) = \hat{f}_n(y_n) = c$. If there are more than
two points which solve the equation $\hat{f}_n(\cdot) = c$, then let $x_n$ be the smallest and $y_n$ be the
largest.
Now
$$ \hat{F}_n(x) = \int_{-\infty}^{x} \hat{f}_n(z)\, dz \tag{8} $$
For a given $p$ $(0 < p < 1)$, let $x_n$ and $y_n$ be such that $\hat{f}_n(x_n) = \hat{f}_n(y_n)$ and
$\hat{F}_n(y_n) - \hat{F}_n(x_n) = p$. The pair $(x_n, y_n)$ can be obtained using the well-known bisection
method.
We define
$$ d_n = y_n - x_n, \qquad \hat{C}^*_p = \frac{U - L}{d_n} \tag{9} $$
and
$$ \hat{C}^*_{pk} = \min\left\{ \frac{m_n - L}{m_n - x_n},\ \frac{U - m_n}{y_n - m_n} \right\} \tag{10} $$
as the estimators of $d^*$, $C^*_p$ and $C^*_{pk}$, respectively. The asymptotic properties of the
estimators are given in the Appendix.
The computations of $\hat{C}^*_p$ and $\hat{C}^*_{pk}$ are not as easy as those of $\hat{C}_p$ and $\hat{C}_{pk}$. However, these
can be programmed relatively easily.
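The estimation steps of this section can be put together as follows. This is a sketch under stated assumptions and not the authors' program: the density estimate and its integral are approximated on a finite grid, the Epanechnikov kernel and the bandwidth $1.06\, s\, n^{-1/5}$ are taken from Example 1, and the bisection is carried out over the density level $c$, which is one possible reading of the bisection step mentioned above. All function and variable names are hypothetical.

```python
import numpy as np

def epanechnikov(u):
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def nonparametric_indices(sample, lower, upper, p=0.9973, grid_size=4001, iters=60):
    """Grid-based sketch of the estimators d_n, C*_p-hat and C*_pk-hat."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    h = 1.06 * sample.std(ddof=1) * n ** (-1 / 5)        # bandwidth as in Example 1
    grid = np.linspace(sample.min() - h, sample.max() + h, grid_size)
    dx = grid[1] - grid[0]
    u = (grid[:, None] - sample[None, :]) / h
    f_hat = epanechnikov(u).sum(axis=1) / (n * h)         # kernel density estimate
    F_hat = np.cumsum(f_hat) * dx                         # Riemann-sum approximation of its integral
    m_n = grid[np.argmax(f_hat)]                          # estimated mode

    def endpoints(c):
        """Indices of the smallest and largest grid points where f_hat >= c."""
        idx = np.flatnonzero(f_hat >= c)
        return idx[0], idx[-1]

    # Bisection on the level c: the content of the interval shrinks as c grows.
    lo, hi = 0.0, f_hat.max()
    for _ in range(iters):
        c = 0.5 * (lo + hi)
        i, j = endpoints(c)
        if F_hat[j] - F_hat[i] > p:
            lo = c        # interval still holds more than p: raise the level
        else:
            hi = c
    i, j = endpoints(lo)
    x_n, y_n = grid[i], grid[j]
    d_n = y_n - x_n
    cp_star = (upper - lower) / d_n
    cpk_star = min((m_n - lower) / (m_n - x_n), (upper - m_n) / (y_n - m_n))
    return {"m_n": m_n, "x_n": x_n, "y_n": y_n, "d_n": d_n,
            "cp_star": cp_star, "cpk_star": cpk_star}

# Hypothetical skewed sample and specification limits, for illustration only.
rng = np.random.default_rng(1)
sample = rng.gamma(shape=2.0, scale=0.5, size=300)
result = nonparametric_indices(sample, lower=0.0, upper=6.0)
print(result)
```

Working on a grid keeps the sketch short; the accuracy of $x_n$ and $y_n$ is then limited by the grid spacing, so a finer grid or a root finder applied to the density estimate itself would be needed for more precise values.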
4 Examples
We consider two examples to illustrate the importance of the modified capability indices.
4.1 Example 1: Finish bolt hole boring in a XL0-2 spindle machine
A connecting rod (connecting the piston and the crankshaft) is a vital part in an
automobile engine. A defective rod can reduce the power of an engine and the failure
of the rod leads to failure of the engine. This study dealt with the capability of the
manufacturing of such rods and in particular with the finish bolt hole boring operation.
The objective was to reduce the variation in the diameter of the bolt hole. An air gauge
was used to record the deviations from the targeted diameter (measured in thousandths of
an inch). However, the data are coded to protect confidentiality. The specifications
are $L = 0$, $U = 3$ (coded units).
A sample of size $n = 95$ was selected from a day's production to evaluate the capability
of the process. Normality tests and a normal probability plot indicated that the data were
non-normal. The following statistics were calculated for these data:
the sample mean $\bar{x} = 1.038$, the sample mode $m_n = 0.587$, and the sample standard deviation
$s = 0.579$.
These lead to $\hat{C}_p = 0.86$ and $\hat{C}_{pk} = 0.60$, which seem to indicate that the process is
not capable. However, we also calculated $x_n = 0.011$ and $y_n = 2.890$. (To estimate the
density, we have used the Epanechnikov kernel and the bandwidth $1.06\, s\, n^{-1/5}$, where $s$ is
the sample standard deviation and $n$ is the sample size. See Silverman (1986) for more
details on such a choice.) Hence we obtain $\hat{C}^*_p = 1.03$ and $\hat{C}^*_{pk} = 1.02$. We note that
$\hat{C}_p$ is less than one and it is less than $\hat{C}^*_p$. The difference between $\hat{C}_{pk}$ and $\hat{C}^*_{pk}$ is even
larger. This is because the mode is a better measure of location than the mean
in this context. In situations like this, $\hat{C}^*_p$ and $\hat{C}^*_{pk}$ are better indicators of the process
performance than the traditional ones.
4.2 Example 2: Pathalo Green
Pathalo green is a fine powder used as a coloring substance in paints. The manufacturing
process for the pigments involves two kettles, which are heated by oil. The material inside
the kettle is stirred at a predetermined velocity. The strength of the pigment is the quality
characteristic of interest and it is expected to be the same for fixed temperatures, velocity
and cycle time. However, there was considerable variation over 100 batches and there was
interest in reducing this variation.
Initially, the capability of the process was to be measured. One sample was taken from
each of the 100 batches and the strength was measured. As in the previous example,
normality tests and normal probability plots of the coded measurements revealed that
the assumption of normality is questionable. Therefore, the capability indices based on
the normality assumption may be misleading. The specified tolerances in coded units are
$L = 0$ and $U = 4$.
For these data $\bar{x} = 0.917$, $m_n = 0.410$, and $s = 0.663$. Thus, $\hat{C}_p = 1.01$ and $\hat{C}_{pk} = 0.46$.
In this case $x_n = 0.00$ and $y_n = 2.54$, and hence $\hat{C}^*_p = 1.57$ and $\hat{C}^*_{pk} = 1.00$. (The same
procedures explained in Example 1 are adopted.) The new capability measures give the
same indication, namely that the process is capable; the usual indices $C_p$ and $C_{pk}$ give
mixed signals. As mentioned earlier, the mode is a better measure of location than the
mean in the present context.
These examples indicate that, when the data are not normal, new measures such as the
ones introduced here may be more appropriate than the traditional ones.
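The index values in both examples can be recomputed from the summary statistics quoted in the text. The sketch below does so with two small helper functions whose names are illustrative; the inputs are the rounded values printed above. With these rounded inputs every reported index is reproduced to two decimals except the nonparametric spread index in Example 1, which comes out at about 1.04 rather than the reported 1.03, presumably because the printed interval endpoints are rounded.

```python
def classical(mean, sd, lower, upper):
    """Usual indices from the sample mean and standard deviation."""
    cp = (upper - lower) / (6 * sd)
    cpk = min((upper - mean) / (3 * sd), (mean - lower) / (3 * sd))
    return cp, cpk

def nonparametric(mode, x, y, lower, upper):
    """Nonparametric indices from the estimated mode and interval endpoints (x, y)."""
    cp_star = (upper - lower) / (y - x)
    cpk_star = min((upper - mode) / (y - mode), (mode - lower) / (mode - x))
    return cp_star, cpk_star

# Example 1: L = 0, U = 3
ex1_classical = classical(mean=1.038, sd=0.579, lower=0.0, upper=3.0)
ex1_new = nonparametric(mode=0.587, x=0.011, y=2.890, lower=0.0, upper=3.0)

# Example 2: L = 0, U = 4
ex2_classical = classical(mean=0.917, sd=0.663, lower=0.0, upper=4.0)
ex2_new = nonparametric(mode=0.410, x=0.00, y=2.54, lower=0.0, upper=4.0)

for label, values in [("Example 1, usual", ex1_classical), ("Example 1, new", ex1_new),
                      ("Example 2, usual", ex2_classical), ("Example 2, new", ex2_new)]:
    print(label, [round(v, 3) for v in values])
```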
5 Concluding Remarks
In this paper, we introduce capability indices which do not depend on any distributional
assumptions. It is shown that the usual $C_p$ and $C_{pk}$ can be misleading when the process
distribution is non-normal. Examples illustrate that the new indices are better indicators
of the actual process capability.
We acknowledge that the new indices are harder to compute. However, in the current
computing environment, these can be easily programmed and made available to a user. In
fact, the authors have written a program to compute these indices given the observations
on the quality characteristic.
When the sample size is adequate to test the goodness-of-fit of a distribution from a
specified parametric family, that exercise is recommended and the estimated percentiles
may be plugged in to evaluate $\hat{C}^*_p$ and $\hat{C}^*_{pk}$. When no standard distribution is found
suitable for the quality characteristic, the method proposed in this paper may be employed
to evaluate $\hat{C}^*_p$ and $\hat{C}^*_{pk}$.
The distributional properties of $\hat{C}^*_p$ and $\hat{C}^*_{pk}$ for finite sample sizes can be studied using
bootstrap procedures as in Franklin and Wasserman (1991, 1992).
It is possible to define further indices $C^*_{pm}$ and $C^*_{pmk}$ along the same lines as $C_{pm}$ and $C_{pmk}$.
This may be done when $(y^* - x^*)$ is much smaller than $(U - L)$ and the mode is not on target.

Acknowledgements
Both Dharmadhikari and Abraham would like to acknowledge the research support from the
Natural Sciences and Engineering Research Council of Canada (NSERC) and the Manufacturing
Research Corporation of Ontario (MRCO).
References
[1] Chang, P. and Lu, K. (1994). PCI calculations with any shape distribution with
percentile. QWTS, 110-114.
[2] Clements, J. A. (1989). Process capability calculations for non-normal distributions.
Quality Progress, September, 95-100.
[3] Dharmadhikari, S. W. and Joag-Dev, K. (1988). Unimodality and Convexity.
Wiley, New York.
[4] Franklin, L. A. and Wasserman, G. (1991). Bootstrap confidence interval estimates
of $C_{pk}$: An introduction. Communications in Statistics: Simulations and Computations
20(1), 231-242.
[5] Franklin, L. A. and Wasserman, G. (1992). Bootstrap lower confidence limits estimates
for capability indices. Journal of Quality Technology 24(4), 158-172.
[6] Gilchrist, W. G. (1993). Capability of the customer-supplier chain. First Newcastle
Conference on Quality and its Applications. Penshaw Press, Newcastle-upon-Tyne,
United Kingdom, 587-591.
[7] Kotz, S. and Johnson, N. L. (1993). Process Capability Indices. Chapman & Hall,
New York.
[8] Kotz, S. and Johnson, N. L. (2002). Process Capability Indices - A Review, 1992-2000.
Journal of Quality Technology 34, No. 1, 2-53.
[9] Kotz, S. and Lovelace, C. R. (1998). Process Capability Indices in Theory and Practice.
Arnold, London.
[10] Munechika, M. (1986). Evaluation of process capability for skew distributions.
Proceedings of 30th EOQC Conference, Stockholm, 383-390.
[11] Pearn, W. L. and Kotz, S. (1994). Applications of Clements' method for calculating
second and third generation process capability indices for non-normal Pearsonian
populations. Quality Engineering 7(1), 139-145.
[12] Rao, B. L. S. P. (1983). Non Parametric Functional Estimation. Academic Press,
Orlando.
[13] Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis.
Chapman and Hall, London.
[14] Sundaraiyer, V. H. (1996). Estimation of a process capability index for inverse
Gaussian distributions. Communications in Statistics: Theory and Methods 26(10),
2381-2396.
Appendix
Asymptotic Properties
Throughout the following discussion we assume that
A1 The density function $f(\cdot)$ is uniformly continuous on $\mathbb{R}$.
A2 The kernel $k$ is such that
$$ \int_{\mathbb{R}} k(y)\, dy = 1 $$
A3 The Fourier transform
$$ \phi(u) = \int e^{iuy} k(y)\, dy $$
is absolutely integrable over $\mathbb{R}$.
A4 The sequence $\{h_n\}$ is such that $h_n \to 0$ and $n h_n \to \infty$ as $n \to \infty$.
We note that under the assumptions A1 to A4, it can be proved that
$$ \sup_x \left| \hat{f}_n(x) - f(x) \right| \to 0 \text{ in probability} \tag{11} $$
$$ m_n \to m \text{ in probability} \tag{12} $$
where $m_n$ and $m$ are the sample and population modes, respectively (see Dharmadhikari
and Joag-Dev (1988, p. 198)).
The following lemma and theorem are useful to establish the consistency of the estimators
$\hat{C}^*_p$ and $\hat{C}^*_{pk}$.
Lemma:
Let $f(\cdot)$ be a unimodal density satisfying A1, $p$ be a number with $0 < p < 1$, $k$ and $h_n$ be
the kernel and bandwidth, respectively, satisfying A2 - A4, and $\hat{f}_n$ be the estimator of $f$
defined as in (7). Then
$$ \hat{f}_n(x_n) \to f(x^*) \quad \text{and} \quad \hat{f}_n(y_n) \to f(y^*) \text{ in probability} \tag{13} $$
where $\{ (x_n, y_n),\ n \ge 1 \}$ are as defined in Section 3.
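Proof:
We know that $F(y^*) - F(x^*) = p$. Now, corresponding to each $c > 0$, we define the set
$$ D(c) = \{ (z, w) : f(w) = f(z) = c,\ z \le w \} $$
Note that $D(c)$ is empty for $c > f(m)$, and $D(c) = \{ (m, m) \}$ for $c = f(m)$.
Let
$$ B(p, c) = \{ (z, w) : (z, w) \in D(c),\ F(w) - F(z) \ge p \} \tag{14} $$
Then, for fixed $p$,
$$ x^* = \sup\{ z : (z, w) \in B(p, c) \} \quad \text{and} \quad y^* = \inf\{ w : (z, w) \in B(p, c) \} \tag{15} $$
Let $D_n(c)$, $B_n(p, c)$ respectively be the sample versions of $D(c)$, $B(p, c)$ based on a sample
of size $n$, i.e.,
$$ D_n(c) = \{ (z, w) : \hat{f}_n(z) = \hat{f}_n(w) = c,\ z \le w \} \tag{16} $$
$$ B_n(p, c) = \{ (z, w) : (z, w) \in D_n(c),\ \hat{F}_n(w) - \hat{F}_n(z) \ge p \} \tag{17} $$
Define
$$ x_n = \sup\{ z : (z, w) \in B_n(p, c) \} \quad \text{and} \quad y_n = \inf\{ w : (z, w) \in B_n(p, c) \} \tag{18} $$
Now
$$ \left| \hat{f}_n(x_n) - f(x^*) \right| = \left| \sup_{z \in B_n(p,c)} \hat{f}_n(z) - \sup_{z \in B(p,c)} f(z) \right| \le \left| \max\left\{ \sup_{z \in A} \hat{f}_n(z),\ \sup_{z \in A'} \hat{f}_n(z) \right\} - \min\left\{ \sup_{z \in A} f(z),\ \sup_{z \in A''} f(z) \right\} \right| \tag{19} $$
where $A = B_n(p, c) \cap B(p, c)$, $A' = B_n(p, c) \cap B^c(p, c)$, $A'' = B(p, c) \cap B_n^c(p, c)$, and $B^c$ and
$B_n^c$ are the complements of $B$ and $B_n$, respectively.
As a consequence of A1, we obtain
$$ B(p, c) \cap B_n^c(p, c) \to \emptyset \quad \text{and} \quad B^c(p, c) \cap B_n(p, c) \to \emptyset \quad \text{as } n \to \infty \tag{20} $$
Hence
$$ \limsup_{n \to \infty} \left| \hat{f}_n(x_n) - f(x^*) \right| \le \lim_{n \to \infty} \sup_{z \in A} \left| \hat{f}_n(z) - f(z) \right| \le \lim_{n \to \infty} \sup_{z \in \mathbb{R}} \left| \hat{f}_n(z) - f(z) \right| = 0 \text{ in probability (see (11))} \tag{21} $$
The proof of $\hat{f}_n(y_n) \to f(y^*)$ is analogous.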
Theorem:
Under the assumptions of the above lemma,
$$ x_n \to x^* \quad \text{and} \quad y_n \to y^* \text{ hold in probability.} \tag{22} $$
Proof:
Note that
$$ \left| f(x_n) - f(x^*) \right| \le \left| \hat{f}_n(x_n) - f(x_n) \right| + \left| \hat{f}_n(x_n) - f(x^*) \right| \tag{23} $$
The right hand side converges in probability to 0 because of (11) and the above lemma.
Since $f(\cdot)$ is uniformly continuous, the result follows.
As a consequence of the above theorem and the definitions of $\hat{C}^*_p$ and $\hat{C}^*_{pk}$,
$$ \hat{C}^*_p \to C^*_p \quad \text{and} \quad \hat{C}^*_{pk} \to C^*_{pk} \text{ in probability} \tag{24} $$
However, since $\hat{C}^*_p$ and $\hat{C}^*_{pk}$ are based on kernel estimators of the probability density
function, the convergence could be slow (Silverman 1986, Section 3.7.2).
Note: The asymptotic distribution of the estimators of the suggested capability indices may
possibly be investigated by writing them as functions of quantile processes and then
establishing the weak convergence of such processes.
T.V. Ramanathan
Department of Statistics
University of Botswana
P. B. No. 0022, Gaborone
Botswana
Ramanathan@mopipi.ub.bw

A.D. Dharmadhikari
Department of Statistics
University of Pune
Ganeshkhind, Pune
India, 411 007
adhar@stats.unipune.ernet.in

Bovas Abraham
Department of Statistics and Actuarial Science
University of Waterloo
Waterloo, Ontario
Canada, N2L 3G1
babraham@math.uwaterloo.ca
