
Expert Systems with Applications 36 (2009) 4862–4875

Contents lists available at ScienceDirect

Expert Systems with Applications


journal homepage: www.elsevier.com/locate/eswa

A novel technique for selecting mother wavelet function using an intelligent fault diagnosis system

J. Rafiee a,*, P.W. Tse b, A. Harifi c, M.H. Sadeghi d

a Department of Mechanical, Aerospace and Nuclear Engineering, Jonsson Engineering Center, Rm. 2049, 110 8th Street, Rensselaer Polytechnic Institute, Troy, NY 12180-3590, USA
b Smart Engineering Asset Management Laboratory, City University of Hong Kong, Hong Kong
c Faculty of Electrical and Computer Engineering, University of Tabriz, Iran
d Faculty of Mechanical Engineering, University of Tabriz, Iran

Article info

Keywords:
Condition monitoring
Signal processing
Vibration signal
Fault diagnosis
Neural network
Genetic algorithm
Mother wavelet
Gear
Piecewise cubic Hermite interpolation
Daubechies

Abstract
This paper presents an optimized gear fault identification system that uses a genetic algorithm (GA) and artificial neural networks (ANNs) to identify the type of gear failure in a complex gearbox system. The network has a well-designed structure suited to practical implementation because of its short training duration and high accuracy. For this purpose, slight-worn, medium-worn, and broken-tooth conditions of a spur gear of the gearbox system were selected as the faults. In simulating the faults, two very similar models of worn gear, differing only partially, were considered in order to evaluate the precision of the proposed algorithm. Moreover, processing the vibration signals was made considerably more difficult because the raw vibration signals were recorded from a complex, oil-filled gearbox system. The raw vibration signals were segmented into the signals recorded during one complete revolution of the input shaft using tachometer information and then synchronized using piecewise cubic Hermite interpolation to construct sample signals of the same length. Next, the standard deviation of the wavelet packet coefficients of the vibration signals was taken as the feature vector for training the ANN. To improve the algorithm, the GA was used to determine the best values for the mother wavelet function, the decomposition level of the signals in the wavelet analysis, and the number of neurons in the hidden layer, which resulted in a fast, accurate two-layer ANN with a small structure. This technique eliminates the drawbacks associated with choosing the type of mother wavelet function for fault classification, not only in machine condition monitoring but also in other related areas. The small size of the proposed network improves the stability and reliability of the system for practical purposes.
© 2008 Published by Elsevier Ltd.

1. Introduction
The application of a condition-based maintenance strategy for the detection and diagnosis of incipient faults can help to minimize avoidable costs and the impediments caused by the need to perform ad hoc repairs. Gearboxes, which consist of three major components (gears, bearings, and shafts), are considered to be among the essential systems on account of their influential applications in a wide range of industries (e.g. machine tools and vehicles). As more than half of the failures in gearboxes are due to gear defects (Yesilyurt, 2004), gear faults have been taken as the case study for the proposed technique.
In general, acoustic emission (Wu, Chiang, Chang, & Shiao, 2008) and vibration analysis (Tse, Gontarz, & Wang, 2007) are the two main approaches to fault detection and diagnosis in
machine condition monitoring. In this paper, vibration signals were used to record the required dataset because of the ease of measurement and their rich information content.

* Corresponding author. Tel.: +1 (518) 276 6351; fax: +1 (518) 276 6025.
E-mail address: krafiee81@gmail.com (J. Rafiee).
0957-4174/$ - see front matter © 2008 Published by Elsevier Ltd.
doi:10.1016/j.eswa.2008.05.052

To analyze the raw vibration signals, a simultaneous time–frequency analysis method, which maps one-dimensional signals into a two-dimensional space of time and frequency, was applied because of the non-stationarity of vibration signals. Wavelet analysis, the most popular method for analyzing non-stationary signals (Peng et al., 2005), overcomes the drawbacks of other techniques by means of analytical functions that are local in both time and frequency (Tse, Yang, & Tam, 2004). Therefore, in the pre-processing and feature extraction phase of the research, the wavelet transform (WT) (Tse, Peng, & Yam, 2001) has been used to extract suitable features. In support of this choice, in early 1990 Leducq applied the wavelet transform to analyze the hydraulic noise of a centrifugal pump, in what is probably the first paper on the application of the wavelet transform to machine diagnostics (Peng & Chu, 2004). Momoh and Dias (1996), moreover, used both the fast Fourier transform (FFT) and the wavelet transform (WT) to obtain processed vibration signals with which to identify and pinpoint the location of faults in power

Fig. 1. Low-order Daubechies wavelets (panels: DB4, DB5, DB6, DB7, DB8 and DB9).

distribution systems. Prior research has shown the wavelet transform to be the most reliable technique for gear condition monitoring, capable of recognizing incipient failures and of identifying different types of faults simultaneously.


The most important challenge in wavelet analysis is the selection of the mother wavelet function as well as the decomposition level of the signal. Because the dyadic discrete wavelet transform (DWT) is used, orthogonal wavelets have been applied in this research; among them, Daubechies (DB) wavelets have been widely implemented because they match the transient components in vibration signals. Another main issue in wavelet analysis is the order of the mother wavelet function, which in several papers was determined by trial and error based on intrinsic characteristics of the data (Wang & McFadden, 1995; Samanta & Al-Balushi, 2003; Tse et al., 2004; Liu, 2005; Kar & Mohanty, 2006; Saravanan et al., 2007; Rafiee, Arvani, Harifi, & Sadeghi, 2007). More specifically, orders in the range DB2 to DB20 have been used extensively in machine condition monitoring. Wang and McFadden (1995) used DB4 orthogonal wavelets to reveal abnormal transients generated by early gear damage in the gearbox vibration signal. Sung et al. and Gaborson used DB20 and DB4, respectively, as the mother wavelet in their work (Kar & Mohanty, 2006). Samanta and Al-Balushi (2003) and Rafiee et al. (2007) processed vibration signals through the discrete wavelet transform (DWT)

Fig. 2. (A) Experimental set-up, (B) slight-worn gear, (C) broken-tooth gear, (D) accelerometer location, (E) slight-worn model, (F) schematic of the gearbox and (G) load
mechanism.

Fig. 3. Raw vibration signals of four gearbox conditions recorded during one revolution of input shaft (panels: faultless, broken-teeth, medium-worn, slight-worn; horizontal axis: time in seconds, 0 to 0.045).

Fig. 4. Synchronized vibration signals of four gearbox conditions recorded during one revolution of input shaft (panels: faultless, broken-teeth, medium-worn, slight-worn; horizontal axis: time in seconds, 0 to 0.045).

Fig. 5. Approximations and details in wavelet packet analysis (tree: Signal; Level 1: A1, D1; Level 2: AA2, AD2, DD2, DA2; continuing to Level n).

using DB4 to obtain the wavelet coefficients. The most important issue, as shown in Fig. 1, is that comparing adjacent wavelet functions (e.g. DB6 and DB7) by inspection is not practical, as they do not differ greatly from each other.
In general, current fault detection and diagnosis techniques are predominantly based on intelligent systems, modeling using classical techniques in the time or frequency domain, and statistical analysis. Fuzzy-based (Wang & Hu, 2006), ANN-based (Wuxing, Tse, Guicai, & Tielin, 2004), and GA-based (He, Guo, & Chu, 2001) methods, as well as intelligent hybrid systems such as fuzzy-GA-based (Lei et al., MSSP 2007; Lo, Chan, Wong, Rad, & Cheung, 2007), neuro-fuzzy-based (Lei et al., ESWA 2007), and neuro-GA-based (Samanta, 2004) systems, can be categorized as intelligent fault diagnosis systems. In this category, ANN-based systems consisting of signal preprocessing, feature extraction and a recognition process have been presented as one of the most noteworthy methods in several papers. To name a few, Li, Chow, Tipsuwan, and Hung (2000) presented an approach for motor rolling bearing fault diagnosis that used both simulation and experiment to acquire vibration signals, from which a feature vector was obtained to adjust the ANN weights. Samanta and Al-Balushi (2003) introduced a technique for fault detection of rolling element bearings through ANNs in which the feature vector was obtained from statistical properties of time-domain vibration signals of rotating machinery with normal and defective bearings. In 2003, Samanta, Al-Balushi, and Al-Araimi (2003) also enhanced the procedure by using a GA to optimize the feature vector for gear fault detection using experimental vibration data from a gearbox system. Inspired by the innovations of Jack and Nandi (2002), this paper presents an optimized ANN-based system for identifying the gear faults of a gearbox system. It also presents a novel solution for finding the best mother wavelet function for fault classification, as well as the best level for decomposing the vibration signals by wavelet analysis, in machine condition monitoring and related areas. In addition, the small, optimized network structure improves the stability and reliability of the system in practical implementations.

2. Summary of the developed procedure
Since gears are the most challenging components in condition monitoring, a complex gearbox system has been considered as the case study. It is of paramount importance to synchronize the raw vibration signals, which usually do not have the same length over a complete revolution of the shaft. To overcome this impediment, piecewise cubic Hermite interpolation (Rafiee et al., 2007) was used to synchronize the signals. Afterwards, wavelet packet analysis (Liu, 2005) was used to preprocess the synchronized signals and produce a distinctive feature vector. Then the selection of the mother wavelet function and decomposition level, an obstacle in this regard, was handled by means of a genetic algorithm. To obtain a more accurate fault classification system, the number of neurons in the hidden layer was also optimized by the GA.

2.1. Experimental set-up
A proper data acquisition process can greatly improve the sensitivity of failure detection. Therefore, selecting an appropriate method for acquiring vibration signal data is important. The experimental set-up (Rafiee et al., 2007) used to collect the dataset in this research consists of a four-speed motorcycle gearbox, which contained oil during data acquisition, an electrical motor with a constant nominal rotation speed of 1420 RPM, a load mechanism, a multi-channel pulse analyzer system, a triaxial accelerometer, a tachometer and four shock absorbers under the bases of the test-bed. The vibration signals were recorded by mounting the accelerometer on the outer surface of the gearbox case near the input shaft of the gearbox. Ten different configurations were synthesized and tested. For each configuration, three different fault conditions were tested: slight-worn, medium-worn, and broken-tooth gear. To evaluate the precision of the technique, two very similar models of worn gear, differing only partially, were taken into account. The rotational speed of the system was measured by the tachometer and used to compensate for the fluctuations owing to uncertainties of the load mechanism, which applied a constant static load. Moreover, the signals were sampled at 16384 Hz for eight seconds. Fig. 2 shows all parts of the fault simulator and related components in detail.

Fig. 6a. Wavelet packet coefficients of a sample synchronized broken-tooth signal in four levels (panels: nodes (0,0), (1,0), (1,1), (2,0) to (2,3), (3,0) to (3,7), and (4,0) to (4,2)).

2.2. Preprocessing of raw vibration signals
It is important to pre-process any raw data before use since it contains redundant information. The pre-processing of the vibration signals involved synchronization of the signals and computation of the standard deviation of the wavelet packet coefficients, performed in the following two steps to extract the feature vector. Accurate pre-processing of the data fed to the neural network can make ANN training more efficient because it considerably reduces the dimensionality of the input data and therefore improves the network performance.
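Both pre-processing steps operate on one-revolution segments cut from the recorded signal with the tachometer information. The following minimal sketch illustrates that segmentation and the segment length it implies, assuming that the tachometer yields one pulse index per revolution and that the input shaft turns at the motor's nominal 1420 RPM. The function names and the toy pulse positions are hypothetical and are not taken from the original implementation.

```python
def segment_by_pulses(signal, pulse_indices):
    """Cut a record into one-revolution segments between consecutive tachometer pulses."""
    return [signal[a:b] for a, b in zip(pulse_indices, pulse_indices[1:])]


def nominal_samples_per_rev(sample_rate_hz, shaft_rpm):
    """Expected number of samples in one revolution at a steady shaft speed."""
    return sample_rate_hz * 60.0 / shaft_rpm


# Figures quoted in Section 2.1: 16384 Hz sampling, 1420 RPM nominal speed.
samples_per_rev = nominal_samples_per_rev(16384, 1420)  # roughly 692 samples
rev_duration_s = 60.0 / 1420                            # roughly 0.042 s

# Toy record with slightly uneven revolutions (hypothetical pulse positions).
toy_segments = segment_by_pulses(list(range(10)), [0, 4, 7, 10])
segment_lengths = [len(s) for s in toy_segments]        # unequal lengths
```

The unequal `segment_lengths` in the toy record mirror the situation described above: small speed fluctuations change the number of samples per revolution, which is why the segments are resampled to a common length in the next step.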
2.2.1. Synchronization using piecewise cubic Hermite interpolation
In the recorded signals, because vibration signals are non-synchronous, the number of data points per revolution may not be exactly equal from one revolution to another owing to shaft speed fluctuations, which can adversely affect the time-domain average. To overcome this drawback, piecewise cubic Hermite interpolation (P.C.H.I.) (Fritsch & Carlson, 1980) was used to resample the data in each revolution, as it outperforms the other methods (Rafiee et al., 2007). A danger of interpolation is that the fitting function may exhibit unwanted fluctuations, essentially following random errors in the data and oscillations of the data points. This problem is particularly persistent if the fitting function is a polynomial defined over the entire region of interest, and it becomes more critical as the number of data points increases. Consequently, P.C.H.I. was implemented as the interpolating scheme. With this method, the same sample span in each revolution can be guaranteed. In this paper, the average length of the 150 segmented sample signals was used to synchronize the whole signal set. Given that $h_k$ is the length of the $k$th subinterval,

$$h_k = x_{k+1} - x_k \qquad (1)$$

the first divided difference $\delta_k$ is calculated by the following equation:

$$\delta_k = \frac{y_{k+1} - y_k}{h_k} \qquad (2)$$

Let $d_k$ denote the slope of the interpolant at $x_k$:
Fig. 6b. Wavelet packet coefficients of a sample synchronized broken-tooth signal in four levels (panels: nodes (4,3) to (4,15)).

Fig. 7. Structure of the MLP network for fault classification (input layer: standard deviation of wavelet packet coefficients; hidden layer; output layer: slight-worn, medium-worn, broken-tooth, normal gearbox).

Fig. 8. Proposed chromosome for GA (a binary string with fields for DB order, number of neurons and decomposition level).

$$d_k = P'(x_k) \qquad (3)$$

For the piecewise linear interpolant, $d_k = \delta_{k-1}$ or $\delta_k$.
Consider the following function on the interval $x_k \le x \le x_{k+1}$, expressed in terms of the local variables $s = x - x_k$ and $h = h_k$:

$$P(x) = \frac{3hs^2 - 2s^3}{h^3}\, y_{k+1} + \frac{h^3 - 3hs^2 + 2s^3}{h^3}\, y_k + \frac{s^2 (s - h)}{h^2}\, d_{k+1} + \frac{s (s - h)^2}{h^2}\, d_k \qquad (4)$$

which is a cubic polynomial in $s$, and hence in $x$, that fulfils four interpolation conditions: two on function values and two on the possibly unknown derivative values. The raw and synchronized signals are depicted in Figs. 3 and 4, respectively. As shown, the signal lengths, which are not equal in the raw signals, are precisely synchronized after applying P.C.H.I.
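To make the interpolation step concrete, the sketch below evaluates Eq. (4) segment by segment and resamples one revolution to a prescribed number of points. It is a minimal illustration rather than the implementation used in this work. The interior slopes use the weighted harmonic mean commonly associated with shape-preserving P.C.H.I., the end slopes are simplified to one-sided differences, the samples are assumed to be equally spaced in time, and all function names are hypothetical.

```python
def pchip_slopes(x, y):
    """Slopes d_k at the knots: weighted harmonic mean of neighbouring differences."""
    n = len(x)
    h = [x[k + 1] - x[k] for k in range(n - 1)]                # Eq. (1)
    delta = [(y[k + 1] - y[k]) / h[k] for k in range(n - 1)]   # Eq. (2)
    d = [0.0] * n
    d[0], d[-1] = delta[0], delta[-1]  # assumption: one-sided end slopes
    for k in range(1, n - 1):
        if delta[k - 1] * delta[k] > 0:  # same sign, otherwise keep slope 0
            w1 = 2 * h[k] + h[k - 1]
            w2 = h[k] + 2 * h[k - 1]
            d[k] = (w1 + w2) / (w1 / delta[k - 1] + w2 / delta[k])
    return d


def hermite_segment(s, h, y0, y1, d0, d1):
    """Eq. (4) on one subinterval, in the local variables s and h."""
    return ((3 * h * s**2 - 2 * s**3) / h**3 * y1
            + (h**3 - 3 * h * s**2 + 2 * s**3) / h**3 * y0
            + s**2 * (s - h) / h**2 * d1
            + s * (s - h)**2 / h**2 * d0)


def resample_revolution(y, n_out):
    """Resample one revolution (at least two samples) to n_out >= 2 points."""
    x = [float(k) for k in range(len(y))]
    d = pchip_slopes(x, y)
    out = []
    for i in range(n_out):
        xq = i * (len(y) - 1) / (n_out - 1)   # query position on the old grid
        k = min(int(xq), len(y) - 2)          # index of the enclosing subinterval
        out.append(hermite_segment(xq - x[k], x[k + 1] - x[k],
                                   y[k], y[k + 1], d[k], d[k + 1]))
    return out
```

Calling `resample_revolution(segment, target_length)` on every segment, with `target_length` set to the average segment length, would give all revolutions the same number of samples, which is the synchronization described above.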
2.2.2. Wavelet packet analysis
Feature extraction, which aims to draw certain characteristics out of the original measured signals, plays a vital role in neural networks. In simple terms, badly implemented feature extraction or improper features lead to poor classification results even with capable classifiers. Accordingly, special attention has been paid to the selection of the feature vector, i.e. the input layer of the ANN.
Wavelet analysis, which is mainly categorized into continuous and discrete types, involves transforming a signal from the time domain into the time–frequency domain. The wavelet transform (WT) offers satisfactory time resolution and poor frequency resolution at high frequencies, and satisfactory frequency resolution and poor time resolution at low frequencies (Peng et al., 2005). It is used to detect transient components because it is able to convey time and frequency structure simultaneously.
The continuous wavelet transform (CWT), a function of two parameters called translation and scale, carries considerable data redundancy for the signal or function being analyzed. The discrete wavelet transform (DWT) reduces the volume of calculations by stepping through a smaller set of scales, and a specific number of translations at each scale, rather than adjusting the two parameters continuously (Soman & Ramachandran, 2004), while remaining as efficient as the CWT with identical accuracy. In the DWT, the signals are decomposed into a hierarchical structure of details and approximations at a limited number of levels as follows:

$$f(t) = \sum_{i=1}^{i=j} D_i(t) + A_j(t) \qquad (5)$$

where $D_i(t)$ denotes the wavelet detail and $A_j(t)$ stands for the wavelet approximation at the $j$th level.

Table 1
GA parameters
Parameter               Value
Number of generations   200
Population              30
Chromosome length       12
Selection operator      Roulette
Fitness normalization   Rank
Elitism                 1
Crossover               Pc = 0.8, two-point, uniform
Mutation                Pm = 0.1, Gaussian, mean = 0.0, std = 1.0

Table 2
GA variables
Variable name                        Range       Optimized value
Daubechies order                     DB2–DB20    DB11
Decomposition level                  3–6         4
Number of the hidden-layer neurons   10–35       14

Fig. 9. Intelligent fault identification system with optimized parameters (flowchart: gearbox with accelerometer and tachometer; raw vibration signals; signals synchronized by P.C.H.I.; selection of mother wavelet, decomposition level and number of neurons by GA; StD of WPC; training and testing of the ANN; fitness function from ANN performance and training time; stopping criterion).

Fig. 10. Decomposition tree of synchronized vibration signals (nodes from (0,0) at the root down to (4,0) to (4,15) at level 4).

Fig. 11. Wavelet packet coefficients in one sample broken-tooth signal in level 4 (decomposition tree and standard deviation of the coefficients in title). StD by node: (4,0) 14.58; (4,1) 9.99; (4,2) 10.72; (4,3) 14.26; (4,4) 50.04; (4,5) 73.18; (4,6) 13.98; (4,7) 58.92; (4,8) 3.22; (4,9) 4.36; (4,10) 12.47; (4,11) 7.23; (4,12) 54.42; (4,13) 63.47; (4,14) 15.04; (4,15) 29.57.

Wavelet packet analysis (Avci, Turkoglu, & Poyraz, 2005), a generalization of wavelet decomposition offering a richer range of possibilities for signal analysis, is based upon the DWT. Every wavelet detail component is further decomposed to obtain its own approximation and detail components, as shown in Fig. 5. A wavelet packet consists of a set of linearly combined ordinary wavelet functions. Therefore, wavelet packets inherit the attributes of their corresponding wavelet functions, such as orthonormality and time–frequency localization. In this research, after synchronizing the vibration signals, wavelet packets have been used to pre-process the signals. A wavelet packet is a function with three integer indices $i$, $j$ and $k$, which are the modulation, scale and translation parameters, respectively,

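$$\psi^{i}_{j,k}(t) = 2^{j/2}\, \psi^{i}(2^{j} t - k), \quad i = 1, 2, 3, \ldots \qquad (6)$$

The wavelet functions $\psi^{i}$ are determined from the following recursive equations:

$$\psi^{2i}(t) = \sqrt{2} \sum_{k=-\infty}^{\infty} h(k)\, \psi^{i}(2t - k) \qquad (7)$$

$$\psi^{2i+1}(t) = \sqrt{2} \sum_{k=-\infty}^{\infty} g(k)\, \psi^{i}(2t - k) \qquad (8)$$

The original signal $f(t)$ after $j$ levels of decomposition is defined as follows:

$$f(t) = \sum_{i=1}^{2^{j}} f^{i}_{j}(t) \qquad (9)$$

while the wavelet packet component signal $f^{i}_{j}(t)$ is expressed as a linear combination of wavelet packet functions $\psi^{i}_{j,k}(t)$ as follows:

$$f^{i}_{j}(t) = \sum_{k=-\infty}^{\infty} c^{i}_{j,k}(t)\, \psi^{i}_{j,k}(t) \qquad (10)$$

where the wavelet packet coefficients $c^{i}_{j,k}(t)$ are calculated by:

$$c^{i}_{j,k}(t) = \int_{-\infty}^{\infty} f(t)\, \psi^{i}_{j,k}(t)\, dt \qquad (11)$$

provided that the wavelet packet functions satisfy the orthogonality condition:

$$\psi^{m}_{j,k}(t)\, \psi^{n}_{j,k}(t) = 0 \quad \text{if } m \neq n \qquad (12)$$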

In this paper, the standard deviation of the wavelet packet coefficients (Rafiee et al., 2007) has been proposed as the feature vector for ANN training in order to identify the faults.
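The feature extraction described here can be sketched in a few lines. The example below splits a signal into the $2^j$ wavelet packet nodes of level $j$ and takes the standard deviation of each node's coefficients. For brevity, and to stay free of external libraries, it assumes the two-tap Haar filter pair instead of a higher-order Daubechies wavelet such as DB11, and it returns the nodes in natural (filter-bank) order. The function names are hypothetical.

```python
import math
import statistics


def haar_split(x):
    """One filter-bank step: orthonormal Haar approximation and detail, downsampled by 2."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_nodes(signal, level):
    """Split every node, approximation and detail alike, down to the given level."""
    nodes = [list(signal)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def std_features(signal, level):
    """Feature vector: standard deviation of the coefficients in each node."""
    return [statistics.pstdev(node) for node in wavelet_packet_nodes(signal, level)]
```

With `level=4`, `std_features` returns 16 values per synchronized segment, matching the 16 sub-signals (4,0) to (4,15) used in this paper. The signal length is assumed to be divisible by $2^{level}$.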
Wavelet-based techniques depend strongly on the mother wavelet function. Ingrid Daubechies invented what are called compactly supported orthonormal wavelets, thus making discrete wavelet analysis practical. In machine condition monitoring, Daubechies family functions are often selected for signal analysis and synthesis arbitrarily, by trial and error (Wang and McFadden, 1995; Samanta and Al-Balushi, 2003; Tse et al., 2004; Liu, 2005; Peng et al., 2005; Kar and Mohanty, 2006; Saravanan et al., 2007; Rafiee et al., 2007). In other words, there is no computational logic behind the selection of the Daubechies order, which is therefore chosen for optimization for fault classification in this research. It is of paramount significance to note that the calculation of

Fig. 12. Wavelet packet coefficients in one sample broken-tooth signal in level 4 (decomposition tree and standard deviation of the coefficients in title). StD by node: (4,0) 19.94; (4,1) 15.49; (4,2) 11.51; (4,3) 23.52; (4,4) 67.50; (4,5) 157.57; (4,6) 41.60; (4,7) 116.60; (4,8) 5.58; (4,9) 8.54; (4,10) 44.57; (4,11) 30.14; (4,12) 54.96; (4,13) 67.87; (4,14) 67.45; (4,15) 57.06.

the wavelet coefficients depends directly on the shape of the mother wavelet, since the wavelet coefficients are computed as the correlation between the mother wavelet and the signal. Consequently, the standard deviation of the wavelet coefficients (Rafiee et al., 2007) is among the features ideally suited to fault detection and classification. As demonstrated in Fig. 6, the time and frequency contents of the signals are not lost when the lengthy signals are down-sampled into 16 sub-signals.
In the Daubechies family, the difference between two wavelet functions of adjacent order (e.g. DB5 and DB6) is slight, and they resemble each other, as depicted in Fig. 1. Therefore, the research has focused on a novel technique that not only eliminates this shortcoming but also yields a more efficient ANN-based system by providing an accurate feature vector.
2.3. Artificial neural networks
Artificial neural networks (Haykin, 1998), or parallel distributed processing systems, are made of simple processing units called neurons, which are a rough imitation of the architecture of the biological nervous system. ANNs have been shown to be suited to modeling highly complex and nonlinear phenomena. ANNs are also useful in machine condition monitoring because they can learn the system's normal operating conditions and determine whether incoming signals are significantly different. They are convenient when plenty of training data is available, and their adaptability lets them cope with the imprecision of the condition monitoring process. Unlike classic digital-processing techniques, ANNs are able to process data in parallel, cope with noisy data and adapt to different circumstances.
The most popular neural network is the multi-layer perceptron (MLP), a feed-forward network that is frequently used in fault detection and diagnosis systems. It has found immense popularity in machine condition monitoring applications, where it constitutes more than 90% (Bartelmus, Zimroz, & Batra, 2003) of the ANNs currently in use in the field, and it has therefore been used in this study as the basis of the failure classifier. The feed-forward network first learns from existing data by iteratively changing its weights using the error back-propagation algorithm, which iteratively seeks to minimize the root-mean-square error (RMSE) using a gradient search technique (Kwak & Ha, 2004).
An important network design issue is the selection and implementation of the network configuration. In the proposed system, the number of nodes in the hidden layer, usually determined by trial and error in several papers, has been optimized using the GA to provide the maximum throughput. The hyperbolic tangent sigmoid transfer function was applied to all network layers, with the resilient backpropagation training algorithm (Riedmiller & Braun, 1993). A data matrix with 800 processed elements, comprising 5 × 4 × 40 sampled data for the four gear conditions

Fig. 13. Standard deviation of wavelet packet coefficients for 90 segmented sample signals in each condition (decomposition level: 4; panels: nodes (4,0) to (4,15); gearbox conditions: normal gearbox (NG), slight-worn (SW), medium-worn (MW), broken-tooth (BT)).

and five gearbox configurations, was applied to the network as the feature vector (Rafiee et al., 2007).
In addition, the initial weights were drawn randomly from the range (−10, 10). The error function was chosen to be the least mean square. In many applications, the network for a fault detection system is trained off-line (Chow & Sharpe, 1993). The signals obtained for the 10 configurations were divided into two groups, one used for training and the other for testing the proposed ANN. Fig. 7 depicts the structure of the ANN schematically.
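The resilient backpropagation rule mentioned above adapts a separate step size for each weight from the sign of the gradient alone. The sketch below shows that update on a generic gradient function. It is an illustration under stated assumptions, not the training code used here: it uses a common variant without weight backtracking, the widely quoted default constants (increase factor 1.2, decrease factor 0.5), and hypothetical function and parameter names.

```python
def rprop_minimize(grad, w0, n_iter=200, eta_plus=1.2, eta_minus=0.5,
                   step_init=0.1, step_min=1e-6, step_max=50.0):
    """Minimize a function with sign-based, per-weight adaptive steps."""
    w = list(w0)
    step = [step_init] * len(w)
    prev = [0.0] * len(w)
    for _ in range(n_iter):
        g = grad(w)  # must return a fresh list of partial derivatives
        for i in range(len(w)):
            if prev[i] * g[i] > 0:
                # Same sign as last time: accelerate.
                step[i] = min(step[i] * eta_plus, step_max)
            elif prev[i] * g[i] < 0:
                # Sign flipped, so a minimum was overshot: shrink and skip this update.
                step[i] = max(step[i] * eta_minus, step_min)
                g[i] = 0.0
            if g[i] > 0:
                w[i] -= step[i]
            elif g[i] < 0:
                w[i] += step[i]
            prev[i] = g[i]
    return w
```

For example, `rprop_minimize(lambda w: [2 * (w[0] - 3), 20 * (w[1] + 2)], [0.0, 0.0])` moves toward the minimum of a simple quadratic bowl at (3, −2). In a network, `grad` would be the back-propagated error gradient with respect to all weights.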
2.4. Genetic algorithm
Genetic algorithms became popular following Holland's work (Holland, 1975). GAs, which come in continuous and binary forms, are designed to search efficiently through huge, non-linear, discrete and poorly understood search spaces, where expert knowledge is scarce or difficult to model and traditional optimization techniques have failed (He et al., 2001). Typically, a simple GA consists of three operations: (1) parent selection, (2) crossover, and (3) mutation.
In this research, the roulette wheel scheme has been applied as the selection operator (Wong & Nandi, 2004). Two-point crossover was used, with each chromosome of a chromosome pair having a 50% chance of selection: the two parents selected for crossover exchange the information lying between two randomly generated points within the binary string.
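The two operators can be illustrated with a short sketch. It assumes non-negative fitness values (for instance after the rank normalization listed in Table 1) and chromosomes stored as strings of 0s and 1s. The function names and the example bit strings are hypothetical.

```python
import random


def roulette_select(fitnesses, r):
    """Return the index picked by a spin r in [0, 1) of a fitness-proportional wheel."""
    threshold = r * sum(fitnesses)
    running = 0.0
    for index, fit in enumerate(fitnesses):
        running += fit
        if running > threshold:
            return index
    return len(fitnesses) - 1


def two_point_crossover(parent_a, parent_b, i, j):
    """Swap the bits lying between cut points i and j (i <= j) of two bit strings."""
    child_a = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child_b = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child_a, child_b


rng = random.Random(0)                      # seeded for repeatability
fitnesses = [0.2, 0.5, 0.3]                 # hypothetical normalized fitness values
parents = [roulette_select(fitnesses, rng.random()) for _ in range(2)]
cuts = sorted(rng.sample(range(13), 2))     # two cut points in a 12-bit string
children = two_point_crossover("011010110111", "100101001000", *cuts)
```

Passing the random spin and the cut points in as arguments keeps both operators deterministic, which makes them easy to check by hand. In a full GA they would be drawn from the random generator for every mating.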
Besides the fact that most successful applications of GAs (Tang, Man, Kwong, & He, 1996) depend greatly on finding a suitable method for encoding the chromosome, the creation of a fitness function to rank the performance of a specific chromosome is also of paramount importance for the success of the training process. The genetic algorithm rates its own performance according to the fitness function; consequently, if the fitness function does not adequately take account of the desired performance features, the genetic algorithm cannot meet the requirements of the user.
The proposed chromosome includes five genes for the Daubechies mother wavelet function, two genes for the decomposition level of the signals, and five genes for the number of neurons in the hidden

Fig. 14. Standard deviation of WPC of 30 sample signals in each gearbox condition with different Daubechies order (panels: DB2, DB11, DB15, DB20; axis labels: "Standard Deviation of Wavelet Packet Coefficients", ticks 2 to 16, and "30 samples for each condition", ticks 20 to 120; gearbox conditions: normal gearbox (NG), slight-worn (SW), medium-worn (MW), broken-tooth (BT)).

layer which is depicted in Fig. 8. Furthermore, the following linear


equation was used as the tness function F.

F apbt

13

where a and b are the weights, and p and t denote the network performance and the training time, respectively. In general, the network performance, which is directly related to the accuracy of the network, outweighs the training-time constraint because of the significance of system accuracy and the off-line nature of network training in fault detection and diagnosis systems. Therefore, to optimize the system, network performance was given more impact on the fitness function than training time: a was set to 0.7 and b to 0.3. Because a longer training time has a more negative impact on the whole fault identification system, a negative sign was applied to b in the fitness function. In simple words, the fitness function increases with network performance and decreases with training time. The values of the constant coefficients a and b may be selected deliberately for different case studies. For example, the training time may not be very important for a fault identification system with an offline training process; accordingly, the weight of the training time can approach zero in that circumstance.
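The linear fitness function described above (performance weighted by a, training time weighted by b with a negative sign) can be illustrated with a minimal Python sketch. The function name and the assumption that p and t are both normalized to the range [0, 1] are ours; the paper does not state how the two quantities are scaled.

```python
def fitness(performance: float, training_time: float,
            a: float = 0.7, b: float = 0.3) -> float:
    """Linear fitness F = a*p - b*t.

    Assumption (not stated in the paper): `performance` and
    `training_time` are both normalized to [0, 1] so that the
    weights a and b are directly comparable.
    """
    return a * performance - b * training_time


# Two hypothetical candidates: the second is slightly less accurate
# but trains much faster.
slow_accurate = fitness(performance=0.98, training_time=0.90)
fast_decent = fitness(performance=0.95, training_time=0.20)

# With b = 0 the training time is ignored, as suggested for
# systems that are trained offline.
accuracy_only = fitness(performance=0.98, training_time=0.90, b=0.0)
```

With these weights, a large saving in training time can outweigh a small loss of accuracy, which is why the choice of a and b should follow the needs of the case study.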
To sum up, GA was used to search for the optimal Daubechies order, decomposition level of the signals, and number of neurons in the hidden layer. Fig. 9 depicts the whole scheme of the fault identification system. Tables 1 and 2 list, respectively, the parameters and variables of the GA used in this research.
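As a rough illustration of how a GA can search over the three parameters, the following Python sketch encodes them in a binary chromosome and applies selection, crossover, and mutation. Everything specific here is an assumption made for the example: the search ranges, the 11-bit layout, the GA settings, and `toy_fitness`, an arbitrary stand-in whose peak is placed at the reported optimum purely for demonstration. In a real run the fitness call would train the network and evaluate the fitness function F.

```python
import random

# Hypothetical search ranges, sized so each gene fills a whole number of bits.
DB_ORDERS = list(range(2, 18))   # DB2..DB17: 16 values, 4 bits
LEVELS = [2, 3, 4, 5]            # 4 values, 2 bits
NEURONS = list(range(5, 37))     # 5..36 neurons: 32 values, 5 bits
N_BITS = 11


def bits_to_int(bits):
    """Interpret a list of 0/1 values as an unsigned binary number."""
    value = 0
    for bit in bits:
        value = 2 * value + bit
    return value


def decode(chromosome):
    """Map an 11-bit chromosome to (DB order, level, hidden neurons)."""
    return (DB_ORDERS[bits_to_int(chromosome[0:4])],
            LEVELS[bits_to_int(chromosome[4:6])],
            NEURONS[bits_to_int(chromosome[6:11])])


def toy_fitness(chromosome):
    """Stand-in fitness (illustrative only); no network is trained here."""
    order, level, neurons = decode(chromosome)
    return -(abs(order - 11) + abs(level - 4) + 0.1 * abs(neurons - 14))


def run_ga(fitness, pop_size=20, generations=30,
           p_cross=0.8, p_mut=0.05, seed=0):
    """Binary GA with tournament selection, one-point crossover,
    bit-flip mutation, and elitism. All settings are hypothetical."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(N_BITS)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: keep the best individual
        while len(children) < pop_size:
            # Tournament selection: the fitter of two random individuals.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            if rng.random() < p_cross:
                cut = rng.randrange(1, N_BITS)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation, applied independently to each gene.
            child = [bit ^ 1 if rng.random() < p_mut else bit
                     for bit in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return decode(best), fitness(best)


params, score = run_ga(toy_fitness)
```

Because the best individual is always copied into the next generation, the best fitness found never decreases from one generation to the next.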
3. Results and discussion
In this paper, running the GA selected DB11, level 4, and 14 neurons as the best values for the Daubechies order, decomposition level, and number of nodes in the hidden layer, respectively. The decomposition tree of the wavelet packet is depicted in Fig. 10, and the related wavelet coefficients of two sample signals for the normal gearbox and the broken-tooth gear are illustrated in Figs. 11 and 12. Close scrutiny of the amplitude of the wavelet packet coefficients shows a considerable correlation between DB11 and the failure condition (broken-tooth gear). For the broken-tooth gear, the standard deviations of the wavelet packet coefficients in the 16 sub-signals are mostly higher than those of the normal gearbox. On the other hand, as shown in Fig. 13, the fluctuations of the standard deviations across the different segmented sample signals are considerable, and this is the most appropriate factor for training the neural networks. Stability and convergence of the network depend directly on a distinctive feature vector with enough samples over ample ranges. As a matter of fact, the range of the samples in the feature vector is more important than the number of samples in training, and


Table 3
Various combinations of the three optimized parameters and related results

| No. | Wavelet function order | Decomposition level | No. of hidden-layer neurons | Samples for testing | Slight-worn (%) | Medium-worn (%) | Broken-tooth (%) | Faultless (%) | Best net structure |
|---|---|---|---|---|---|---|---|---|---|
| 1 | DB8 | 3 | 13 | 5 × 4 × 75 | 94 | 90 | 98.06 | 96.66 | 8:10:4 |
| 2 | DB9 | 3 | 15 | 5 × 4 × 75 | 93.33 | 92.66 | 97.13 | 92.33 | 8:12:4 |
| 3 | DB10 | 3 | 13 | 5 × 4 × 75 | 95.66 | 93.33 | 97.33 | 95.66 | 8:10:4 |
| 4 | DB11 | 3 | 13 | 5 × 4 × 75 | 98.53 | 97.66 | 99.33 | 97.66 | 8:10:4 |
| 5 | DB12 | 3 | 15 | 5 × 4 × 75 | 97.33 | 96.06 | 98.2 | 96.73 | 8:12:4 |
| 6 | DB13 | 3 | 15 | 5 × 4 × 75 | 93.33 | 95.33 | 98.06 | 99 | 8:12:4 |
| 7 | DB8 | 4 | 17 | 5 × 4 × 75 | 94.33 | 100 | 99.66 | 96 | 16:14:4 |
| 8 | DB9 | 4 | 19 | 5 × 4 × 75 | 92 | 98.46 | 99.13 | 94.66 | 16:16:4 |
| 9 | DB10 | 4 | 17 | 5 × 4 × 75 | 100 | 98.66 | 100 | 93.33 | 16:14:4 |
| 10 | DB11 | 4 | 14 | 5 × 4 × 75 | 100 | 100 | 100 | 100 | 16:11:4 |
| 11 | DB12 | 4 | 15 | 5 × 4 × 75 | 100 | 100 | 100 | 100 | 16:13:4 |
| 12 | DB14 | 4 | 14 | 5 × 4 × 75 | 97.86 | 100 | 99 | 100 | 16:14:4 |
| 13 | DB8 | 5 | 24 | 5 × 4 × 75 | 97.33 | 96.86 | 100 | 98.66 | 32:21:4 |
| 14 | DB9 | 5 | 25 | 5 × 4 × 75 | 90.66 | 92 | 99.4 | 97.53 | 32:22:4 |
| 15 | DB10 | 5 | 25 | 5 × 4 × 75 | 96 | 100 | 100 | 98.66 | 32:22:4 |
| 16 | DB11 | 5 | 23 | 5 × 4 × 75 | 100 | 100 | 100 | 100 | 32:21:4 |
| 17 | DB12 | 5 | 23 | 5 × 4 × 75 | 100 | 96 | 100 | 100 | 32:23:4 |
| 18 | DB13 | 5 | 23 | 5 × 4 × 75 | 90.66 | 100 | 100 | 94.66 | 32:23:4 |

Table 4
Various arrangements of the three optimized parameters with a constant number of neurons and the related results

| No. | Wavelet function order | Decomposition level | No. of hidden-layer neurons | Samples for testing | Slight-worn (%) | Medium-worn (%) | Broken-tooth (%) | Faultless (%) | Net structure |
|---|---|---|---|---|---|---|---|---|---|
| 1 | DB8 | 4 | 14 | 5 × 4 × 75 | 97 | 93.33 | 100 | 98.66 | 16:11:4 |
| 2 | DB9 | 4 | 14 | 5 × 4 × 75 | 96.33 | 97.33 | 100 | 100 | 16:11:4 |
| 3 | DB10 | 4 | 14 | 5 × 4 × 75 | 98.66 | 100 | 100 | 100 | 16:11:4 |
| 4 | DB11 | 4 | 14 | 5 × 4 × 75 | 100 | 100 | 100 | 100 | 16:11:4 |
| 5 | DB12 | 4 | 14 | 5 × 4 × 75 | 100 | 100 | 100 | 100 | 16:11:4 |
| 6 | DB13 | 4 | 14 | 5 × 4 × 75 | 96 | 100 | 100 | 100 | 16:11:4 |
| 7 | DB8 | 5 | 14 | 5 × 4 × 75 | 89.66 | 91 | 96.66 | 95 | 32:11:4 |
| 8 | DB9 | 5 | 14 | 5 × 4 × 75 | 94.33 | 89.66 | 95.33 | 96.66 | 32:11:4 |
| 9 | DB10 | 5 | 14 | 5 × 4 × 75 | 95.06 | 97 | 98.66 | 100 | 32:11:4 |
| 10 | DB11 | 5 | 14 | 5 × 4 × 75 | 95 | 98.66 | 96.66 | 98 | 32:11:4 |
| 11 | DB12 | 5 | 14 | 5 × 4 × 75 | 95.66 | 94 | 96.33 | 96.66 | 32:11:4 |
| 12 | DB13 | 5 | 14 | 5 × 4 × 75 | 93.66 | 91.66 | 96.66 | 94 | 32:11:4 |

the standard deviation of wavelet coefficients has fulfilled the need. Fig. 14 clearly shows how the different Daubechies orders differ in discriminating the failures. As can be seen, the failures and the normal condition can be clearly discriminated in all of them, which is attributable to the appropriate feature vector. Nonetheless, the broken-tooth failure is not as clear as the others in DB2, even though the fault is severe enough. Besides, the uniformity among the sample signals in each condition is not as satisfactory as with DB11. DB15 has the same drawbacks as DB2 but gives more appropriate results. DB20 is not a worthy case for the neural network because in many levels there is no difference between the failures and the normal condition. Most significantly, its precision in discriminating the faults cannot be compared with that of DB11. In DB20, the slight- and medium-worn gears, which are more important to us than the broken-tooth one, are not distinguished as easily as with DB11, especially in the incipient stages of the fault. Consequently, the Daubechies wavelet function of order 11 was selected by the GA as the most proper function for fault classification. This means that, for different wavelet-based methods, selecting the mother wavelet function is mandatory to increase the performance of the system, although, as mentioned in this research, it may be possible to obtain sufficient system performance with a specific mother wavelet function.
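The feature extraction discussed here, one standard deviation per wavelet packet sub-signal, can be sketched in Python as follows. To keep the example dependency-free it uses the Haar wavelet (Daubechies order 1) rather than DB11, so it illustrates the structure of the feature vector and not the paper's exact coefficients; in practice a wavelet library such as PyWavelets would supply the higher-order Daubechies filters. The function names, the synthetic test signal, and the use of the population standard deviation are assumptions made for the example. A full wavelet packet tree at level j has 2^j terminal nodes, which is why levels 3, 4, and 5 give 8-, 16-, and 32-element network inputs.

```python
import math
import statistics


def haar_step(signal):
    """One Haar analysis step: (approximation, detail), each half length."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [s * (signal[i] + signal[i + 1]) for i in pairs]
    detail = [s * (signal[i] - signal[i + 1]) for i in pairs]
    return approx, detail


def wavelet_packet_nodes(signal, level):
    """Full wavelet packet tree: every node is split at every level,
    giving 2**level terminal nodes (in natural, not frequency, order)."""
    nodes = [list(signal)]
    for _ in range(level):
        nodes = [half for node in nodes for half in haar_step(node)]
    return nodes


def std_feature_vector(signal, level=4):
    """Standard deviation of the coefficients in each terminal node.
    The signal length is assumed to be a multiple of 2**level."""
    return [statistics.pstdev(node)
            for node in wavelet_packet_nodes(signal, level)]


# Synthetic stand-in for one revolution of a vibration signal:
# a sine wave with a single impulse (hypothetical data).
signal = [math.sin(2 * math.pi * 5 * n / 256) + (1.0 if n == 100 else 0.0)
          for n in range(256)]
features = std_feature_vector(signal, level=4)  # 16-element feature vector
```

Each call returns one feature vector, so a labelled set of such vectors, one per segmented sample signal, would form the training data for the network.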
To validate the estimated values, the performance of the neural network with the three above-mentioned parameters was also calculated manually through several trial-and-error runs. The related results, which match the GA estimate, are shown in Table 3. Although the network structure is compact at decomposition level 3 (because of the 8-element network input), no satisfactory results were obtained. Despite its higher performance and better results, decomposition level 5 requires a larger number of neurons in the hidden layer (32-element network input), which increases the complexity of the system. Therefore, the fourth decomposition level was determined to be the most appropriate for this case study because it gives better outcomes than the others and yields a well-formed network structure. The data in Table 3 were calculated for each case without the genetic algorithm. In addition, the optimal number of neurons in the hidden layer was determined through several trial-and-error runs for each case in Table 3.
Hence, the results obtained using the genetic algorithm not only demonstrate the high capability of this algorithm to find the optimal solution but also verify the soundness of the algorithm's parameter settings presented in Table 1.
Table 4 presents the results of a diagnostic system configured manually, without the GA, with 14 neurons in the hidden layer and different function orders and decomposition levels, to better clarify the aptness of the results of the previous case (DB11 with a decomposition level of 4).


[Fig. 15 flowchart (labels recovered from the diagram): Raw signal; Segmentation; Synchronization by P.C.H.I.; Genetic Algorithm (Generation, Selection, Crossover, Mutation) operating on binary chromosomes that encode DB order, Decomposition Level, and No. of Neurons; Fitness Function F; Stopping Criteria Fulfilled? (No / Yes); END.]
Fig. 15. Schematic of the entire proposed algorithm.

4. Conclusion
An intelligent gear fault identification system was developed and implemented to pinpoint gearbox faults. Three parameters were recognized as being of paramount significance and were optimized using GA, i.e. the Daubechies wavelet function order, the decomposition level, and the number of neurons in the hidden layer of the ANN, which play a key role in the size and performance of the network and the soundness of the feature vector. The feature vector was obtained from the standard deviation of the wavelet packet coefficients acquired with DB11 at the fourth level of decomposition, following piecewise cubic Hermite interpolation for synchronization. Eventually, an MLP network with a well-formed and optimized structure (16:14:4) and remarkable accuracy was presented, capable of identifying slight-worn, medium-worn, and broken-tooth gear faults perfectly in 100% of the five tested configurations of the gearbox. With the proposed system, which is depicted in detail in Fig. 15, the drawback of selecting the mother wavelet function through trial-and-error methods has also been mitigated for fault classification purposes.
Acknowledgements
The work described in this paper was partially supported by the Vibration and Modal Analysis Lab at the University of Tabriz and partially by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project no. CityU 120506). The authors dedicate this paper to the memory of two devoted mentors, Professor Vahhab Pirouzpanah and Professor Samad Nami Notash, who spent more than four decades enlightening a myriad of students. They would also like to thank the anonymous reviewers for spending their valuable time reviewing the current research.
References
Avci, E., Turkoglu, I., & Poyraz, M. (2005). Intelligent target recognition based on
wavelet packet neural network. Expert Systems with Applications, 29, 175182. S.
Bartelmus, W., Zimroz, R., & Batra, H. (2003). Gearbox vibration signal preprocessing and input values choice for neural network training. Articial
Intelligence Methods, Gliwice, Poland, November 57.
Chow, M. Y., & Sharpe, R. N. (1993). On the application and design of articial neural
networks for motor fault detection, Part 2. IEEE Transactions on Industrial
Electronics, 40(2).
Fritsch, F. N., & Carlson, R. E. (1980). Monotone piecewise cubic interpolation. SIAM
Journal of Numerical Analysis, 17, 238246.
Haykin, S. (1998). Neural networks: A comprehensive foundation (2nd ed.). Pearson
Education.
He, Y., Guo, D., & Chu, F. (2001). Using genetic algorithms and nite element
methods to detect shaft crack for rotor-bearing system. Mathematics and
Computers in Simulation, 57, 95108.
Holland, J. H. (1975). Adaption in natural and articial systems. Ann Arbor: University
of Michigan Press.
Jack, L. B., & Nandi, A. K. (2002). Fault detection using support vector machines and
articial neural networks augmented by genetic algorithms. Mechanical System
and Signal Processing, 16(23), 373390.
Kar, C., & Mohanty, A. R. (2006). Monitoring gear vibrations through motor current
signature analysis and wavelet transform. Mechanical Systems and Signal
Processing, 20(1), 158187.
Kwak, J. S., & Ha, M. K. (2004). Neural network approach for diagnosis of grinding
operation by acoustic emission and power signals. Journal of Materials Processing
Technology, 147, 6571.
Lei, Y., He, Z., & Zi, Y. (2007). A new approach to intelligent fault diagnosis of
rotating machinery. Expert Systems with Applications. doi:10.1016/j.eswa.2007.
08. 072.
Lei, Y., He, Z., Zi, Y., & Hu, Q. (2007). Fault diagnosis of rotating machinery based on
multiple ANFIS combination with GAs. Mechanical Systems and Signal Processing,
21, 22802294.
Li, B., Chow, M. Y., Tipsuwan, Y., & Hung, J. C. (2000). Neural-network-based motor
rolling bearing fault diagnosis. IEEE Transactions on Industrial Electronics, 47(5).

J. Raee et al. / Expert Systems with Applications 36 (2009) 48624875


Liu, B. (2005). Selection of wavelet packet basis for rotating machinery fault
diagnosis. Journal of Sound and Vibration, 284, 567582.
Lo, C. H., Chan, P. T., Wong, Y. K., Rad, A. B., & Cheung, K. L. (2007). Fuzzy-genetic
algorithm for automatic fault detection in HVAC systems. Applied Soft
Computing, 7, 554560.
Momoh, J. A., & Dias, L. G. (1996). Solar dynamic power system fault diagnostics. In
NASA conference Publication 10189.
Peng, Z. K., & Chu, F. L. (2004). Application of the wavelet transform in machine
condition monitoring and fault diagnostics. Mechanical Systems and Signal
Processing, 18, 199221.
Peng, Z. K., Tse, P. W., & Chu, F. L. (2005). An improved HilbertHuang transform and
its application in vibration signal analysis. Journal of Sound and Vibration, 286,
187205.
Peng, Z. K., Tse, P. W., & Chu, F. L. (2005). A comparison study of improved
HilbertHuang transform and wavelet transform: Application to fault
diagnosis for rolling bearing. Mechanical Systems and Signal Processing, 19,
974988.
Raee, J., Arvani, F., Hari, A., & Sadeghi, M. H. (2007). Intelligent condition
monitoring of a gearbox using articial neural network. Mechanical Systems and
Signal Processing, 21, 17461754.
Riedmiller, M., & Braun, H. (1993). A direct adaptive method for faster
backpropagation learning: The RPROP algorithm. In Proceedings of the IEEE
international conference on neural networks, San Francisco.
Samanta, B. (2004). Gear fault detection using articial neural networks and
support vector machines with genetic algorithms. Mechanical Systems and Signal
Processing, 18, 625644.
Samanta, B., & Al-Balushi, K. R. (2003). Articial neural network based fault
diagnostics of rolling element bearings using time-domain features. Mechanical
Systems and Signal Processing, 17(2), 317328.
Samanta, B., Al-Balushi, K. R., & Al-Araimi, S. A. (2003). Articial neural networks
and support vector machines with genetic algorithm for bearing fault detection.
Engineering Applications of Articial Intelligence, 16, 657665.

4875

Saravanan, N. et al. (2007). A comparative study on classication of features by SVM


and PSVM extracted using Morlet wavelet for fault diagnosis of spur bevel gear
box. Expert Systems with Applications. doi:10.1016/j.eswa.2007.08.026.
Soman, K. I., & Ramachandran (2004). Insight into wavelets from theory to practice.
Prentice-Hall of India.
Tang, K. S., Man, K. F., Kwong, S., & He, O. (1996). Genetic algorithms and their
applications. IEEE Signal Processing Magazine, 2237.
Tse, P. W., Gontarz, S., & Wang, X. J. (2007). Enhanced eigenvector algorithm for
recovering multiple sources of vibration signals in machine fault diagnosis.
Mechanical Systems and Signal Processing, 21, 27942813.
Tse, P. W., Peng, Y. H., & Yam, R. (2001). Wavelet analysis and envelope detection for
rolling element bearing fault diagnosis Their effectiveness and exibilities.
Journal of Vibration and Acoustics, 123, 303310.
Tse, P. W., Yang, W. X., & Tam, H. Y. (2004). Machine fault diagnosis through an
effective exact wavelet analysis. Journal of Sound and Vibration, 277, 1005
1024.
Wang, J., & Hu, H. (2006). Vibration-based fault diagnosis of pump using fuzzy
technique. Measurement, 39, 176185.
Wang, W. J., & McFadden, P. D. (1995). Application of orthogonal wavelets to early
gear damage detection. Mechanical System and Signal Processing, 9, 497507.
Wong, M. L. D., & Nandi, A. K. (2004). Automatic digital modulation recognition
using articial neural network and genetic algorithm. Signal Processing, 84,
351365.
Wu, J. D., Chiang, P. H., Chang, Y. W., & Shiao, Y. J. (2008). An expert system for fault
diagnosis in internal combustion engines using probability neural network.
Expert Systems with Applications, 34, 27042713.
Wuxing, L., Tse, P. W., Guicai, Z., & Tielin, S. (2004). Classication of gear faults using
cumulants and the radial basis function network. Mechanical Systems and Signal
Processing, 18, 381389.
Yesilyurt, I. (2004). The application of the conditional moments analysis to gearbox
fault detection-a comparative study using the spectrogram and scalogram.
NDT& E International, 37, 309320.
