A Novel Technique For Selecting Mother Wavelet Function Using An Intelligent Fault Diagnosis System

Expert Systems with Applications 36 (2009) 48624875
Contents lists available at ScienceDirect
Expert Systems with Applications

journal homepage: www.elsevier.com/locate/eswa
A novel technique for selecting mother wavelet function using an intelligent

fault diagnosis system
J. Raee a,*, P.W. Tse b, A. Hari c, M.H. Sadeghi d
a
Department of Mechanical, Aerospace and Nuclear Engineering, Jonsson Engineering Center, Rm. 2049, 110 8th Street, Rensselaer Polytechnic Institute, Troy, NY 12180-3590, USA
Smart Engineering Asset Management Laboratory, City University of Hong Kong, Hong Kong
c
Faculty of Electrical and Computer Engineering, University of Tabriz, Iran
d
Faculty of Mechanical Engineering, University of Tabriz, Iran
b
a r t i c l e
i n f o
Keywords:
Condition monitoring
Signal processing
Vibration signal
Fault diagnosis
Neural network
Genetic algorithm
Mother wavelet
Gear
Piecewise cubic hermite interpolation
Daubechies
a b s t r a c t
This paper presents an optimized gear fault identication system using genetic algorithm (GA) to investigate the type of gear failures of a complex gearbox system using articial neural networks (ANNs) with
a well-designed structure suited for practical implementations due to its short training duration and high
accuracy. For this purpose, slight-worn, medium-worn, and broken-tooth of a spur gear of the gearbox
system were selected as the faults. In fault simulating, two very similar models of worn gear have been
considered with partial difference for evaluating the preciseness of the proposed algorithm. Moreover,
the processing of vibration signals has become much more difcult because a full-of-oil complex gearbox
system has been considered to record raw vibration signals. Raw vibration signals were segmented into
the signals recorded during one complete revolution of the input shaft using tachometer information and
then synchronized using piecewise cubic hermite interpolation to construct the sample signals with the
same length. Next, standard deviation of wavelet packet coefcients of the vibration signals considered as
the feature vector for training purposes of the ANN. To ameliorate the algorithm, GA was exploited to
optimize the algorithm so as to determine the best values for mother wavelet function, decomposition
level of the signals by means of wavelet analysis, and number of neurons in hidden layer resulted in a
high-speed, meticulous two-layer ANN with a small-sized structure. This technique has been eliminated
the drawbacks of the type of mother function for fault classication purpose not only in machine condition monitoring, but also in other related areas. The small-sized proposed network has improved the stability and reliability of the system for practical purposes.
2008 Published by Elsevier Ltd.
1. Introduction
The application of a condition-based maintenance strategy used
for detection and diagnosis of incipient faults can help us to minimize avoidable costs and impediments caused by the need to perform ad hoc repairs. Gearboxes consisting of three major
components (gears, bearings, and shafts) are considered to be
one of the essential systems on account of its inuential applications in a wide domain of industries (e.g. machine tools and vehicles). As more than half of the failures in gearboxes are due to gear
defects (Yesilyurt, 2004), the gear faults have been considered as a
case study for the proposed technique.
In general, acoustic emission (Wu, Chiang, Chang, & Shiao,
2008) and vibration analysis (Tse, Gontarz, & Wang, 2007) are
two main centers of fault detection and diagnosis systems in
machine condition monitoring. In this paper, vibration signals
* Corresponding author. Tel.: +1 (518) 276 6351; fax: +1 (518) 276 6025.
E-mail address: kraee81@gmail.com (J. Raee).
0957-4174/$ - see front matter 2008 Published by Elsevier Ltd.
doi:10.1016/j.eswa.2008.05.052
established as the way of recording required dataset because of

the ease of measurement and the rich contents of information.
To analyze raw vibration signals, a simultaneous timefrequency
analysis method which maps one-dimensional signals into a twodimensional space of time and frequency was applied because of
non-stationarity of vibration signals. Wavelet analysis, the most
popular one for analyzing the non-stationary signals (Peng et al.,
2005), overcomes the drawbacks of other techniques by means of
analytical functions that are local in both time and frequency (Tse,
Yang, & Tam, 2004). Therefore, in pre-processing and feature extraction phase of the research, wavelet transform (WT) (Tse, Peng, &
Yam, 2001) has been used to take out proper features. To bolster this
recommendation, in early 1990, Leducq applied wavelet transform
to analyze the hydraulic noise of the centrifugal pump which is
probably the rst paper in application of wavelet transform in
machine diagnostics (Peng & Chu, 2004). Momoh and Dias (Momoh
& Dias, 1996), moreover, used both fast Fourier transform (FFT)
and wavelet transform (WT) to obtain processed vibration
signals to identify and pinpoint the location of the faults in power
J. Raee et al. / Expert Systems with Applications 36 (2009) 48624875
DB 4
DB 5
DB 6
DB 7
DB 8
DB 9
Fig. 1. Low-order Daubechies.
distribution systems. The prior research demonstrates the wavelet

transform as the most reliable technique for gear condition monitoring and a capable technique to recognize the incipient failures
and to identify different types of faults simultaneously.
4863
The most indispensable challenge in wavelet analysis is the

selection of the mother wavelet function as well as the decomposition level of signal. As a result of the use of dyadic discrete wavelet transform (DWT), orthogonal wavelets have been applied in this
research; among them Daubechies (DB) wavelets have been widely
implemented as it matches the transient components in vibration
signals. Another main issue in wavelet analysis is the order of the
mother wavelet function, which was previously determined by
trial-and-error methods based on intrinsic characteristics of the
data in several papers (Wang & McFadden, 1995; Samanta & AlBalushi, 2003; Tse et al., 2004; Liu, 2005; Kar & Mohanty, 2006;
Saravanan et al., 2007; Raee, Arvani, Hari, & Sadeghi, 2007). To
more explanations, the range of DB2 and Db20 has been extensively used in machine condition monitoring. Wang and McFadden
(1995) used DB4 orthogonal wavelets to disclose abnormal transients generated by early gear damage from gearbox vibration signal. Sung et al. and also Gaborson used DB20 and DB4, as the
mother wavelet, in their works, respectively (Kar & Mohanty,
2006). Samanta and Al-Balushi (2003) and Raee et al. (2007) processed vibration signals through discrete wavelet transform (DWT)
Fig. 2. (A) Experimental set-up, (B) slight-worn gear, (C) broken-tooth gear, (D) accelerometer location, (E) slight-worn model, (F) schematic of the gearbox and (G) load
mechanism.
4864
Faultless
100
0
-100
0.005
0.01
0.015
0.02
0.025
0.03
0.035
0.04
0.045
0.03
0.035
0.04
0.045
0.03
0.035
0.04
0.045
0.03
0.035
0.04
0.045
Broken-teeth
400
200
0
-200
-400
0.005
0.01
0.015
0.02
0.025
Medium-worn
100
0
-100
0.005
0.01
0.015
0.02
0.025
Slight-worn
100
0
-100
0.005
0.01
0.015
0.02
0.025
Time (Sec)
Fig. 3. Raw vibration signals of four gearbox conditions recorded during one revolution of input shaft.
Faultless
100
0
-100
0.005
0.01
0.015
0.02
0.025
0.03
0.035
0.04
0.045
0.03
0.035
0.04
0.045
0.03
0.035
0.04
0.045
0.03
0.035
0.04
0.045
Broken-teeth
400
200
0
-200
-400
0.005
0.01
0.015
0.02
0.025
Medium-worn
100
0
-100
0.005
0.01
0.015
0.02
0.025
Slight-worn
100
0
-100
0.005
0.01
0.015
0.02
0.025
Time (Sec)
Fig. 4. Synchronized vibration signals of four gearbox conditions recorded during one revolution of input shaft.
4865
Signal
A1
D1
Level 1
AA2
AD2
DD2
DA2
Level 2
Level n
Fig. 5. Approximations and details in wavelet packet analysis.
using DB4 to obtain the wavelet coefcients. The most important

issue as it has been shown in Fig. 1, is that it can not be functional
if the adjacent wavelet functions (e.g. DB6 and DB7) are compared
as they do not differ from each other tremendously.
In general, the current fault detection and diagnosis techniques
are predominantly based on intelligent systems, modeling using
classical techniques in time or frequency domain, and statistical
analysis. Fuzzy-based (Wang & Hu, 2006), ANN-based (Wuxing,
Tse, Guicai, & Tielin, 2004), GA-based (He, Guo, & Chu, 2001) methods as well as intelligent hybrid systems such as fuzzy-GA-based
(Lei et al., MSSP 2007), (Lo, Chan, Wong, Rad, & Cheung, 2007),
Nero-fuzzy-based (Lei et al., ESWA 2007), Nero-GA-based (Samanta, 2004) systems can be categorized as the intelligent fault diagnosis systems. In this category, ANN-based systems consisting of
signal preprocessing, feature extraction and recognition process
have been introduced as one of the most noteworthy methods in
several papers. To name a few, Li, Chow, Tipsuwan, and Hung
(2000) presented an approach for motor rolling bearing fault diagnosis using both simulation and experiment to acquire vibration
signals which was used to obtain a feature vector to adjust ANN
weights. Samanta and Al-Balushi (2003) introduced a new technique for fault detection of rolling bearing elements through ANNs
in which feature vector was obtained from statistical properties of
time-domain vibration signals of the rotating machinery with normal and defective bearings. In 2003, Samanta, Al-Balushi, and AlAraimi (2003) also enhanced the proposed procedureusing GA to
meticulously optimize the mentioned feature vector for gear fault
detection using the experimental vibration data of a gearbox system. Inspired by Jack and Nandi (2002) innovations, the paper presents an optimized ANN-based system for identifying the gear
faults of a gearbox system. It also presents a novel solution to nd
the best mother wavelet function for fault classication purpose as
well as the best level of decomposing the vibration signals by
wavelet analysis in machine condition monitoring and related
areas. In addition, the small structure optimized network has improved the stability and reliability of the system in practical
implementations.
raw vibration signals which usually dont have the same length
in a complete revolution of the shaft. To overtake the impediment,
piecewise cubic hermite interpolation (Raee et al., 2007) was suggested to synchronize the signals. Afterwards, wavelet packet analysis (Liu, 2005) has been considered to preprocess the
synchronized signals for presenting a distinguished feature vector.
Then, selecting of the mother wavelet function and decomposition
level has been reformed by means of genetic algorithm as an obstacle in this regard. To present a more accurate fault classication
system, number of neurons in hidden layer has been also chosen
to be optimized by GA.
2.1. Experimental set-up
A proper data acquisition process can extensively improve the
sensibility for the detection of failures. Therefore, selecting an
appropriate method is important for acquiring vibration signal
data. The experimental set-up (Raee et al., 2007) to collect dataset
in this research consists of a four-speed motorcycle gearbox containing the oil during acquiring dataset, an electrical motor with
a constant nominal rotation speed of 1420 RPM, a load mechanism,
multi-channel pulse analyzer system, a triaxial accelerometer,
tachometer and four shock absorbers under the bases of test-bed.
The vibration signals were recorded by mounting the accelerometer on the outer surface of the gearboxs case near input shaft of the
gearbox. Ten different congurations were synthesized and tested.
For each conguration, three different fault conditions were tested
that were slight-worn, medium-worn, broken-tooth of gear. For
evaluating the meticulousness of the technique, two very similar
models of worn gear have been taken into account with partial difference. The rotational speed of the system was measured by
tachometer used as a measure to compensate the uctuations owing to uncertainties of the load mechanism which was a constant
static load. Moreover, the signals were also sampled at 16384 Hz
lasting eight seconds. Fig. 2 clearly shows all parts of fault simulator and related components in detail.
2.2. Preprocessing of raw vibration signals
2. Summary of the developed procedure

Since gears are the most challenging components in condition
monitoring, a complex gearbox system has been considered as
the case study. Of paramount importance is to synchronize the
It is important to pre-process any raw data before use since it

contains redundant information. The pre-processing of vibration
signals was involved in synchronization of the signals and computation of standard deviation of wavelet packet coefcients per-
4866
400 (0,0)
200
200
200 (1,0)
(1,1)
-200
-200
-200
200
400
600
100
400
50
200
100
300
-50
300
(2,2)
(2,1)
50
200
0
200
-200
(2,0)
-50
-400
50
100
150
50
100
50
150
100
40 (3,0)
400 (2,3)
100
150
(3,1)
20
200
-20
-100
50
100
20
150
(3,2)
200
40
60
80
100
20
40
60
80
100
20
(3,3)
200
0
-200
-20
0
-200
20
40
60
80
100
20
40
60
80
100
200
50
(3,5)
20
40
60
80
20
0
-100
20
40
60
80
100
60
20
150
40
10
20
-10
-20 (4,0)
100
40
60
80
100
40
60
80
100
(4,2)
50
0
(4,1)
-20
20
60
(3,7)
100
-100
100
40
200
(3,6)
100
-50
(3,4)
20
-50
40
60
20
40
60
Fig. 6a. Wavelet packet coefcients of a sample synchronized broken-tooth signal in four levels.
formed in the following two steps to extract the feature vector.

Accurate pre-processing of the data to feed neural network can
make ANN training more efcient because of a considerable reduction of the dimensionality of the input data and therefore improving the network performance.
2.2.1. Synchronization using piecewise cubic hermite interpolation
In the recorded signals, as a consequence of non-synchronous
attribute of vibration signals, the number of data-points per each
revolution may not be exactly equal from one revolution to another one, due to shaft speed uctuations which can inuence
the time-domain average adversely. To overcome the drawback,
piecewise cubic hermite interpolation (P.C.H.I.) (Fritsch & Carlson,
1980) was exploited to resample the obtained data in each revolution as it outperforms the other methods (Raee et al., 2007). A
menace of interpolation is that the tting function may tend to
demonstrate unsolicited uctuations, essentially following random

errors in the data and oscillations of the data points. This problem
is particularly persisting if the tting function is a polynomial dened over the entire region of interest, and becomes more critical
as the number of data points increases. Consequently, P.C.H.I. was
implemented as the interpolating scheme. In this method, the
same sample spans in each revolution could be guaranteed. In this
paper, the average length of the 150 segmented sample signals was
applied to synchronize the whole signal set. Given that hk is the
length of the kth subinterval,
hk xk1 xk
Then the rst difference, dk is calculated by the following equation,
dk yk1 yk =hk
Assume that dk designate the slope of the interpolant at xk
4867
50
(4,3)
100
-50
-100
20
40
(4,4)
200
0
-200
60
20
50
40
60
40
200
0
-50
20
-200
(4,7)
-400
(4,6)
-100
20
40
60
40
100
(4,10)
(4,9)
-50
20
40
60
20
40
60
20
40
60
20
40
60
(4,11)
-20
40
(4,8)
60
20
20
-20
20
50
(4,5)
-100
60
20
40
60
200
200
(4,12)
(4,13)
100
100
(4,14)
0
0
0
-100
- 200
20
40
-100
60
200
20
40
60
20
40
60
(4,15)
100
0
-100
Fig. 6b. Wavelet packet coefcients of a sample synchronized broken-tooth signal in four levels.
DB
order
Slight-worn
Standard
Deviation
of Wavelet
Packet
Coefficients
Medium-worn
0 1 1 0 1 0 1 1 10 11 0 1
Broken-tooth
Fig. 8. Proposed chromosome for GA.
Normal Gearbox
Output layer
Input layer
Hidden layer
Fig. 7. Structure of the MLP network for fault classication.
dk P0 xk
for the piecewise linear interpolant, dk = dk1 or dk

Suppose the following function on the interval xk 6 x 6 xk+1,
expressed in term of local variables s = x- xk and h = hk,
2
Px
3hs 2s3
3
h
ss h2
h
Number of Decomposition
Neurons
Level
yk1
dk
h 3hs 2s3
h
yk
s2 s h
2
dk1
4
which is a cubic polynomial in s, and, hence in x, that fullled four

interpolation conditions, two of those on function values and the
rest two on the possibly unknown derivative values. The raw and
synchronized signals have been depicted in Figs. 3 and 4, respectively. As shown, the signal lengths which are not equal in raw ones
have been precisely synchronized after implementing of P.C.H.I.
2.2.2. Wavelet packet analysis
Feature extraction, aimed to take out certain characteristics
from original measured signals, plays the vital role in neural networks. In simple words, badly implemented feature extraction or
improper features will be led to poor classication results even
using the potential classiers. Accordingly, special attention has
been paid to the selection of the feature vector or input layer in
ANNs.
Wavelet analysis, which is mainly categorized into continuous
and discrete types, includes transforming a signal from the time
4868

Table 1
GA parameters
Gearbox
Accelerometer
Tachometer
Raw vibration signals
Number of generation
200
Population
Chromosome length
Selection operator
Fitness normalization
Elitism
Crossover
Mutation
30
12
Roulette
Rank
1
Pc = 0.8, two-point, uniform
Pm = 0.1, Gaussian, mean = 0.0, std = 1.0
Synchronized Signals by P.C.H.I

Table 2
GA variables
Decomposition
Level
Yes
Stopping
Number of
Neurons
Criterion
Mother
Wavelet
Variable name
Range
Optimized value
Daubechies order
Decomposition level
Number of the hidden-layer neurons
DB2DB20
36
1035
DB11
4
14
No
components because it is able to simultaneously impart time and
frequency structures.
Continuous wavelet transform (CWT), a function of two parameters entitled translation and scale, bears a remarkable data
redundancy for the signal or function to be analyzed. By skipping
several steps rather than continuously adjusting two parameters,
in other words, by reducing the volume of calculations, discrete
wavelet transform (DWT) has been dened to analyze the signals
with a smaller set of scales and specic number of translations at
each scale (Soman & Ramachandran, 2004), as efcient as CWT
with identical accuracy. In DWT, the signals are decomposed into
a hierarchical structure of detail and approximations at limited levels as follows:
Selection of parameters by GA
StD of WPC
Training ANN
Testing ANN
Training Time
ANN Performance
Fitness
Function
Intelligent Fault Identification

System with Optimized Parameters
f t
ij
X
Di t Aj t
i1
where Di(t) denotes the wavelet detail and Aj(t) stands for the wavelet approximation at the jth level.
Wavelet packet analysis (Avci, Turkoglu, & Poyraz, 2005), a
generalization of wavelet decomposition offering a richer range
of possibilities for signal analysis, is based upon DWT. Every wavelet detail component is further decomposed to get their own
Fig. 9. Intelligent fault identication system with optimized parameters.
domain into timefrequency domain. Wavelet transform (WT) offers a satisfactory time and poor frequency resolution at high frequencies and a satisfactory frequency and poor time resolution at
low frequencies (Peng et al., 2005). It is used to detect transient
0,0
Synchronized Signal
Level 1
1,0
Level 2
2,0
Level 3
Level 4
1,1
2,1
3,0
4,0
4,1
3,1
4,2
4,3
3,2
4,4
2,2
3,3
4,5
4,6
4,7
3,4
4,8
2,3
3,5
4,9
3,6
3,7
4,10 4,11 4,12 4,13 4,14 4,15
Fig. 10. Decomposition tree of synchronized vibration signals.
4869
(4,0), StD=14.58
(4,1), StD=9.99
20
20
-20
-20
(4,4), StD=50.04
(4,2), StD=10.72
20
20
-20
-20
(4,5), StD=73.18
100
200
-100
- 200
(4,6), StD=13.98
(4,7), StD=58.92
50
100
0
- 100
-50
(4, 10), StD=12.47
(4,9), StD=4.36
(4,8), StD=3.22
(4,3), StD=14.26
(4, 11), StD=7.23
50
10
10
-10
-10
20
0
0
-20
-50
(4, 12), StD=54.42
(4, 15), StD=29.57
(4, 14), StD=15.04
(4, 13), StD=63.47

50
100
100
-100
50
0
100
-50
-50
Fig. 11. Wavelet packet coefcients in one sample broken-tooth signal in level 4 (decomposition tree and standard deviation of the coefcients in title).
approximation and detail components as shown in Fig. 5. Wavelet

packet includes a set of linearly combined usual wavelet functions.
Therefore, wavelet packet inherits the attributes of their corresponding wavelet functions such as orthonormality and timefrequency localization. In this research, after synchronizing the
vibration signals, wavelet packet has been used to pre-process
the signals. A wavelet packet is a function with three indices of
integers i, j and k which are the modulation, scale and translation
parameters, respectively,
wij;k t 2j=2 wj 2j t k; i 1; 2; 3; . . .
The wavelet functions wj are determined from the following recursive equations:
w2j t
1
p X
2
hkwi 2t k
w2j1 t
1
1
p X
gkwi 2t k
7
8
1
The original signal f(t) after j level of decomposition are dened as

follows:
f t
2j
X
fji t
i1
while the wavelet packet component signal fji t are stated by a linear combination of wavelet packet functions wij;k t as follows:
1
X
fji t
cij;k twij;k t
10
k1
where the wavelet packet coefcients cij;k t are calculated by:
cij;k t
1
f twij;k tdt
providing that
orthogonality:
the
n
wm
j;k twj;k t 0 if mn
11
wavelet
packet
functions
satisfy
the
12
In this paper, standard deviation of wavelet packet coefcients

(Raee et al., 2007) has been proposed to identify the faults as the
feature vector for ANN training.
The wavelet-based techniques are absolutely dependent upon
the mother wavelet function. Ingrid Daubechies invented what is
called compactly supported orthonormal wavelets thus making
discrete wavelet analysis practical. In machine condition monitoring, Daubechies family functions are often selected for signal analysis and synthesis arbitrarily by trial and error (Wang and
McFadden, 1995; Samanta and Al-Balushi, 2003; Tse et al., 2004;
Liu, 2005; Peng et al., 2005; Kar and Mohanty, 2006; Saravanan
et al., 2007; Raee et al., 2007. In other words, there is no computational logic behind the selection of Daubechies order which is
therefore considered to be optimized for fault classication in this
research. It is of paramount signicance to note that calculating of
4870
(4,0), StD=19.94
(4,1), StD=15.49
(4,2), SD=11.51t
100
(4,3), StD=23.52
100
50
20
50
0
-20
-50
-100
-50
(4,4), StD=67.50
(4,5), StD=157.57
(4,6), StD=41.60
200
500
200
-200
-500
500
200
(4,9), StD=8.54
(4,8), StD=5.58
-500
(4, 10), StD=44.57

200
20
20
(4,7), StD=116.60
(4, 11), StD=30.14

100
100
0
0
-20
-20
-100
(4, 12), StD=54.96
(4, 13), StD=67.87
0
-200
(4, 15), StD=57.06
(4, 14), StD=67.45
200
200
-100
200
100
-200
-100
200
Fig. 12. Wavelet packet coefcients in one sample broken-tooth signal in level 4 (decomposition tree and standard deviation of the coefcients in title).
the wavelet coefcients is directly dependent on the shape of the

mother wavelet such that the correlation between mother wavelet
and signal is calculated as wavelet coefcients. Consequently, the
standard deviation of wavelet coefcients (Raee et al., 2007) is
among ideally suited features for fault detection and classication.
As demonstrated in Fig. 6, the time and frequency contents of the
signals have not been lost with down-sampling of the lengthy signals into 16 sub-signals.
In Daubechies family functions, the difference between two
wavelet functions with adjacent order (e.g. DB5 and DB6) is slight
and they are similar to each other as depicted in Fig. 1. Therefore,
the research has been focused more on a novel technique not only
to eliminate the mentioned shortcoming but also to acquire the
more efcient ANN-based system by proposing an accurate feature
vector.
2.3. Articial neural networks
Articial neural networks (Haykin, 1998) or parallel-distributed-processing systems are made of simple processing units,
called neurons which are a rough imitation of architecture of biological nervous system. ANNs have been shown to be suited to
model highly complex and nonlinear phenomena. ANNs are also
useful in machine condition monitoring since they can learn the
systems normal operating conditions and determine if incoming
signals are signicantly different as it is handy when there is lots
of training data available and can manage the imprecision of condition monitoring process thanks to adaptability dynamism. Unlike
the classic digital-processing techniques, ANNs possess the ability
to perform parallel processing of data, cope with noisy data and
adapt to different circumstances.
The most popular neural network is the multi-layer perceptron,
which is a feed-forward network and frequently exploited in fault
detection and diagnosis systems, which has found an immense
popularity in machine condition monitoring applications as it constitutes more than 90% (Bartelmus, Zimroz, & Batra, 2003) of the
current ANNs in the eld, and therefore, in this study, has been
used as the base of failure classier system. The feed-forward network rst learns through already existing data by iteratively
changing its weights using the back propagation of error algorithm which is performed by iteratively seeking to minimize the
root-mean-square-error (RMSE) using the gradient search technique (Kwak & Ha, 2004).
An important network design issue is the selection and implementation of the network conguration. In the proposed system,
the number of nodes in hidden layer, usually determined by
trial-and-error methods in several papers, has been optimized
using GA to provide the maximum throughput. Hyperbolic tangent
sigmoid transfer function was applied to all net layers with Resilient Backpropagation training algorithm (Riedmiller & Braun,
1993). A matrix of a data set with 800 processed elements, comprising 5 4 40 sampled data for every four gear conditions
4871
(4,0)
40
30
(4,2)
(4,1)
25
40
20
30
MW
45
NG
MW
NG
15
20
10
10
SW
NG
MW
SW
SW
10
SW
BT
(4,4)
NG
MW
100
BT
NG
SW
40
BT
SW MW
NG
50
20
BT
BT
(4,10)
(4,11)
20
60
SW
NG
10
45
SW
MW
40
NG
SW
30
NG
MW
20
5
MW
(4,7)
NG
MW
50
15
10
BT
100
(4,9)
SW MW
20
SW
60
(4,8)
BT
15
(4,6)
150
SW
25
MW BT
BT
(4,5)
50
15
30
NG
20
75
(4,3)
MW
15
BT
BT
BT
(4,13)
(4,12)
NG
(4,14)
(4,15)
75
SW
90
60
SW
50
MW
60
20
BT
30
75
SW
40
NG
25
NG
NG
MW BT
NG
SW
MW
50
MW
BT
25
BT
Fig. 13. Standard deviation of wavelet packet coefcients for 90 segmented sample signals in each condition (decomposition level: 4, gearbox conditions: normal gearbox,
slight-worn, medium-worn, broken-tooth).
and ve gearbox congurations, was applied to the network as the

feature vector (Raee et al., 2007).
The initial weights were, in addition, obtained randomly in a
range of (10, 10). Error function was chosen to be least mean
square. In many applications, training the network to perform fault
detection system is performed off-line (Chow & Sharpe, 1993). The
obtained signals for 10 congurations were divided into two
groups; each used for training and testing of the proposed ANN.
Fig. 7 depicts schematically the utilized ANN structure.
2.4. Genetic algorithm
Genetic algorithms have become popular following Hollands
work (Holland, 1975). GAs consisting of continuous and binary
forms are designed to efciently search huge, non-linear, discrete
and poorly understood search spaces, where expert knowledge is
scarce or difcult to model and traditional optimization techniques has failed (He et al., 2001). Typically, a simple GA consists
of three operations: (1) parent selection, (2) crossover, and (3)
mutation.
In this research, Roulette wheel selection scheme has been

applied among the selection operators (Wong & Nandi, 2004).
Two-point crossover was used for each chromosome of the
chromosome-pair having a 50% chance of selection, the two
parents selected for crossover exchange information lying between two randomly generated points within the binary
string.
In addition that most of the successful applications of GAs
(Tang, Man, Kwong, & He, 1996) is greatly dependent on nding
a suitable method for encoding the chromosome, the creation of
a tness function to rank the performance of a specic chromosome is also of paramount importance for the success of the training process. The genetic algorithm rates its own performance
around that of the tness function; consequently, if the tness
function does not adequately take account of the desired performance features, the genetic algorithm is unable to meet the deserved requirements of the user.
The proposed chromosome includes ve genes for Daubechies
mother wavelet function, two genes for decomposition level of signals, and ve genes pertain to the number of neurons of hidden
4872
Mother wavelet = DB11
120
BT
100
80
MW
60
SW
40
20
30 smples for each condition
120
BT
100
80
MW
60
SW
40
20
NG
NG
2
10
12
14
16
Standard Deviation of Wavelet Packet Coefficients
10
12
14
16

120
BT
100
80
MW
60
SW
40
20
NG
2
10
12
14
16
120
BT
100
80
MW
60
SW
40
20
NG
2
10
12
14
16
Fig. 14. Standard deviation of WPC of 30 sample signals in each gearbox condition with different Daubechies order (gearbox conditions: normal gearbox, slight-worn,
medium-worn, broken-tooth).
layer which is depicted in Fig. 8. Furthermore, the following linear

equation was used as the tness function F.
F apbt
13
where a and b are the weights, p and t denote network performance and training time, respectively. In general, the network
performance which is directly related to the accuracy of the network outweighs the training time constraint due to the signicance of the system accuracy and off-line attribute of the
network in the fault detection and diagnosis systems. Therefore,
to optimize the system, network performance has more impact
on tness function than training time; the weights were applied
where a was assumed to be 0.7 vs. 0.3 which was assigned to b.
As the more training time has the more negative impact on the
whole fault identication system negative sign has been considered for b in tness function. In simple words, the network performance and training time have inverse proportion on the tness
function. Depending on different case studies, these values for
constant coefcients a and b may deliberately be selected. For
example, perhaps the training time may not be very important
for a fault identication system having ofine training process.
Accordingly, the weight of training time can verge to be zero in
that circumstance.
To sum up, GA was used to optimally search Daubechies order,
decomposition level of the signals, and the number of neurons in
hidden layer. Fig. 9 depicts the whole scheme of the fault identication system. Tables 1 and 2 explore, respectively, the parameters
and variables of the GA exploited in the research.
3. Results and discussion
In this paper, by running GA, DB11, level 4 and 14 neurons
have been selected as the best values for Daubechies order,
decomposition level, and the number of nodes in hidden layer,
respectively. Decomposition tree of wavelet packet has been depicted in Fig. 10 and related wavelet coefcients of two sample
signals for normal gearbox and broken-tooth gear have been
illustrated in Figs. 11 and 12. With close scrutiny to the amplitude of the wavelet packet coefcients, it has been shown that
there is a considerable correlation between DB11 and the failure
condition (broken-tooth gear). The values of standard deviation
of wavelet packet coefcients in 16 sub-signals are mostly higher than those of the normal one. On the other hand, as shown in
Fig. 13, the uctuations of the standard deviations in different
sample segmented signals are considerable and this is the most
appropriate factor for training the neural networks. Stability and
convergence of the network are directly dependent on the distinguished feature vector with enough samples in ample ranges. As
a matter of fact, the range of the samples in feature vector is
more important than the number of samples in training that
4873

Table 3
Various combinations of the three optimized parameters and related results
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
Wavelet function
order
Decomposition
level
No. of hidden-layer
neurons
Samples for
testing
Slightworn
Mediumworn
Brokentooth
Faultless
Best net
structure
DB
DB
DB
DB
DB
DB
DB
DB
DB
DB
DB
DB
DB
DB
DB
DB
DB
DB
3
3
3
3
3
3
4
4
4
4
4
4
5
5
5
5
5
5
13
15
13
13
15
15
17
19
17
14
15
14
24
25
25
23
23
23
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
94
93.33
95.66
98.53
97.33
93.33
94.33
92
100
100
100
97.86
97.33
90.66
96
100
100
90.66
90
92.66
93.33
97.66
96.06
95.33
100
98.46
98.66
100
100
100
96.86
92
100
100
96
100
98.06
97.13
97.33
99.33
98.2
98.06
99.66
99.13
100
100
100
99
100
99.4
100
100
100
100
96.66
92.33
95.66
97.66
96.73
99
96
94.66
93.33
100
100
100
98.66
97.53
98.66
100
100
94.66
8:10:4
8:12:4
8:10:4
8:10:4
8:12:4
8:12:4
16:14:4
16:16:4
16:14:4
16:11:4
16:13:4
16:14:4
32:21:4
32:22:4
32:22:4
32:21:4
32:23:4
32:23:4
8
9
10
11
12
13
8
9
10
11
12
14
8
9
10
11
12
13
Table 4
Various arrangements of the three optimized parameters with a constant number of neurons and the related results
1
2
3
4
5
6
7
8
9
10
11
12
Wavelet function
order
Decomposition
level
No. of hidden-layer
neurons
Samples for
testing
Slightworn
Mediumworn
Brokentooth
Faultless
Net
structure
DB
DB
DB
DB
DB
DB
DB
DB
DB
DB
DB
DB
4
4
4
4
4
4
5
5
5
5
5
5
14
14
14
14
14
14
14
14
14
14
14
14
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
5 4 75
97
96.33
98.66
100
100
96
89.66
94.33
95.06
95
95.66
93.66
93.33
97.33
100
100
100
100
91
89.66
97
98.66
94
91.66
100
100
100
100
100
100
96.66
95.33
98.66
96.66
96.33
96.66
98.66
100
100
100
100
100
95
96.66
100
98
96.66
94
16:11:4
16:11:4
16:11:4
16:11:4
16:11:4
16:11:4
32:11:4
32:11:4
32:11:4
32:11:4
32:11:4
32:11:4
8
9
10
11
12
13
8
9
10
11
12
13
the standard deviation of wavelet coefcients has fullled the

need. Fig. 14 clearly shows the difference between different orders of Daubechies to discriminate the failures. As it can be seen,
the failures and normal condition can be clearly discriminate in
all of them, and it is related to the appropriate feature vector.
Nonetheless, the broken-tooth failure is not as clear as the others in DB2 even though there is enough severity in the fault. Besides, the uniformity among the sample signals in each condition
is not as satisfactory as DB11. In DB15, the drawbacks are the
same as DB2 but more appropriate results. DB20 is not a worthy
case for neural network because in many level there is no differences among the failures and normal condition. Most signicantly, the preciseness of discriminating the faults can not be
compared with DB11. In DB20, the slight- and medium-worn
gear, which is more important for us than the broken-tooth
one are not distinguished as easily as DB11 especially in incipient stages of the fault. Consequently, Daubechies wavelet function with the order of 11 was selected as the most proper
function by GA for fault classication. It means that for different
wavelet-based methods, the selection of mother wavelet function is mandatory to increase the performance of the system
although as mentioned in this research it may be possible to
have enough system performance with a specic mother wavelet
function.
To validate the estimated values, the performance of the neural network with three above-mentioned parameters has been
also calculated manually with several trials and errors. Related

results which are matched with the estimation of GA have been
shown in Table 3. Although, the network structure is compact in
decomposition level of 3 (due to 8-element network input), obviously no satisfactory results were obtained. In spite of the higher
performance and better results, the decomposition level of 5
leads to a larger number of neurons of the hidden layer (32-element network input) which leads to the complexity of the system. Therefore, fourth decomposition level was determined to
be the most appropriate level for this case study because of
the better outcomes than those of the others and the introduction of a well-formed network structure. The data in Table 3
has been calculated for each case without genetic algorithm. In
addition, the optimal number of the neurons of the hidden layer
was determined using several trials and errors runs for each case
in Table 3.
Hence, the end results obtained using genetic algorithm not
only does the high capabilities of this algorithm prove in order
to nd the optimal solution but also veries the soundness of
the parameters for the adjustment of the algorithm presented
in Table 1.
Table 4 puts forward the results of a diagnostics system without
using GA and manually with 14 neurons in hidden layer with different function orders and decomposition levels to better clarify
the aptness of the results of the previous case (DB11 with a decomposition level of 4).
4874
Segmentation
Raw signal
Synchronization by P.C.H.I.
Selection
Crossover
Generation
Mutation
0
1
1
0
1
0
1
1
0
1
0
1
0
1
1
0
1
0
1
1
0
1
0
1
DB order
No. of
Neurons
F =. p+ . t
Decomposition
Level
Genetic Algorithm
Fitness Function
No
Stopping
Criteria Fulfilled
Yes
END
Fig. 15. Schematic of the entire proposed algorithm.
4. Conclusion
An intelligent gear fault identication system was developed
and implemented to pinpoint the gearbox faults; Three parameters
were recognized to be of paramount signicance to be optimized
using GA i.e. Daubechies wavelet function order, decomposition level, and number of neurons in hidden layer of ANN, which play a
key role in size and performance of the network and the soundness
of the feature vector. The Feature vector was also obtained from
optimized standard deviation of wavelet packet coefcients acquired from DB11 at fourth level of decomposition following a
piecewise cubic hermite interpolation for synchronizing. Eventually, a MLP network with well-formed and optimized structure
(16:14:4) and remarkable accuracy was presented providing the
capability to identify slight-worn, medium-worn and broken-tooth
of gears faults perfectly in 100% of the ve tested congurations of
gearbox. By the proposed system which has been elaborately depicted in Fig. 15, the drawback of implementing mother wavelet
function through trial-and-error based methods has been also improved for fault classication purposes.
Acknowledgements
This work described in this paper was partially supported by
Vibration and Modal Analysis Lab at University of Tabriz and partially supported by a grant from the Research Grants Council of
Hong Kong Special Administrative Region, China (Project no. CityU
120506). The authors would like to write in memoriam of two dedicated mentors, Professor Vahhab Pirouzpanah and Professor Samad Nami Notash who devoted more than four decades to
enlighten a myriad of students. They would also like to thank the
anonymous reviewers for spending their valuable time to review

the current research.
References
Avci, E., Turkoglu, I., & Poyraz, M. (2005). Intelligent target recognition based on
wavelet packet neural network. Expert Systems with Applications, 29, 175182. S.
Bartelmus, W., Zimroz, R., & Batra, H. (2003). Gearbox vibration signal preprocessing and input values choice for neural network training. Articial
Intelligence Methods, Gliwice, Poland, November 57.
Chow, M. Y., & Sharpe, R. N. (1993). On the application and design of articial neural
networks for motor fault detection, Part 2. IEEE Transactions on Industrial
Electronics, 40(2).
Fritsch, F. N., & Carlson, R. E. (1980). Monotone piecewise cubic interpolation. SIAM
Journal of Numerical Analysis, 17, 238246.
Haykin, S. (1998). Neural networks: A comprehensive foundation (2nd ed.). Pearson
Education.
He, Y., Guo, D., & Chu, F. (2001). Using genetic algorithms and nite element
methods to detect shaft crack for rotor-bearing system. Mathematics and
Computers in Simulation, 57, 95108.
Holland, J. H. (1975). Adaption in natural and articial systems. Ann Arbor: University
of Michigan Press.
Jack, L. B., & Nandi, A. K. (2002). Fault detection using support vector machines and
articial neural networks augmented by genetic algorithms. Mechanical System
and Signal Processing, 16(23), 373390.
Kar, C., & Mohanty, A. R. (2006). Monitoring gear vibrations through motor current
signature analysis and wavelet transform. Mechanical Systems and Signal
Processing, 20(1), 158187.
Kwak, J. S., & Ha, M. K. (2004). Neural network approach for diagnosis of grinding
operation by acoustic emission and power signals. Journal of Materials Processing
Technology, 147, 6571.
Lei, Y., He, Z., & Zi, Y. (2007). A new approach to intelligent fault diagnosis of
rotating machinery. Expert Systems with Applications. doi:10.1016/j.eswa.2007.
08. 072.
Lei, Y., He, Z., Zi, Y., & Hu, Q. (2007). Fault diagnosis of rotating machinery based on
multiple ANFIS combination with GAs. Mechanical Systems and Signal Processing,
21, 22802294.
Li, B., Chow, M. Y., Tipsuwan, Y., & Hung, J. C. (2000). Neural-network-based motor
rolling bearing fault diagnosis. IEEE Transactions on Industrial Electronics, 47(5).

Liu, B. (2005). Selection of wavelet packet basis for rotating machinery fault
diagnosis. Journal of Sound and Vibration, 284, 567582.
Lo, C. H., Chan, P. T., Wong, Y. K., Rad, A. B., & Cheung, K. L. (2007). Fuzzy-genetic
algorithm for automatic fault detection in HVAC systems. Applied Soft
Computing, 7, 554560.
Momoh, J. A., & Dias, L. G. (1996). Solar dynamic power system fault diagnostics. In
NASA conference Publication 10189.
Peng, Z. K., & Chu, F. L. (2004). Application of the wavelet transform in machine
condition monitoring and fault diagnostics. Mechanical Systems and Signal
Processing, 18, 199221.
Peng, Z. K., Tse, P. W., & Chu, F. L. (2005). An improved HilbertHuang transform and
its application in vibration signal analysis. Journal of Sound and Vibration, 286,
187205.
Peng, Z. K., Tse, P. W., & Chu, F. L. (2005). A comparison study of improved
HilbertHuang transform and wavelet transform: Application to fault
diagnosis for rolling bearing. Mechanical Systems and Signal Processing, 19,
974988.
Raee, J., Arvani, F., Hari, A., & Sadeghi, M. H. (2007). Intelligent condition
monitoring of a gearbox using articial neural network. Mechanical Systems and
Signal Processing, 21, 17461754.
Riedmiller, M., & Braun, H. (1993). A direct adaptive method for faster
backpropagation learning: The RPROP algorithm. In Proceedings of the IEEE
international conference on neural networks, San Francisco.
Samanta, B. (2004). Gear fault detection using articial neural networks and
support vector machines with genetic algorithms. Mechanical Systems and Signal
Samanta, B., & Al-Balushi, K. R. (2003). Articial neural network based fault
diagnostics of rolling element bearings using time-domain features. Mechanical
Systems and Signal Processing, 17(2), 317328.
Samanta, B., Al-Balushi, K. R., & Al-Araimi, S. A. (2003). Articial neural networks
and support vector machines with genetic algorithm for bearing fault detection.
Engineering Applications of Articial Intelligence, 16, 657665.
4875
Saravanan, N. et al. (2007). A comparative study on classication of features by SVM

and PSVM extracted using Morlet wavelet for fault diagnosis of spur bevel gear
box. Expert Systems with Applications. doi:10.1016/j.eswa.2007.08.026.
Soman, K. I., & Ramachandran (2004). Insight into wavelets from theory to practice.
Prentice-Hall of India.
Tang, K. S., Man, K. F., Kwong, S., & He, O. (1996). Genetic algorithms and their
applications. IEEE Signal Processing Magazine, 2237.
Tse, P. W., Gontarz, S., & Wang, X. J. (2007). Enhanced eigenvector algorithm for
recovering multiple sources of vibration signals in machine fault diagnosis.
Mechanical Systems and Signal Processing, 21, 27942813.
Tse, P. W., Peng, Y. H., & Yam, R. (2001). Wavelet analysis and envelope detection for
rolling element bearing fault diagnosis Their effectiveness and exibilities.
Journal of Vibration and Acoustics, 123, 303310.
Tse, P. W., Yang, W. X., & Tam, H. Y. (2004). Machine fault diagnosis through an
effective exact wavelet analysis. Journal of Sound and Vibration, 277, 1005
1024.
Wang, J., & Hu, H. (2006). Vibration-based fault diagnosis of pump using fuzzy
technique. Measurement, 39, 176185.
Wang, W. J., & McFadden, P. D. (1995). Application of orthogonal wavelets to early
gear damage detection. Mechanical System and Signal Processing, 9, 497507.
Wong, M. L. D., & Nandi, A. K. (2004). Automatic digital modulation recognition
using articial neural network and genetic algorithm. Signal Processing, 84,
351365.
Wu, J. D., Chiang, P. H., Chang, Y. W., & Shiao, Y. J. (2008). An expert system for fault
diagnosis in internal combustion engines using probability neural network.
Expert Systems with Applications, 34, 27042713.
Wuxing, L., Tse, P. W., Guicai, Z., & Tielin, S. (2004). Classication of gear faults using
cumulants and the radial basis function network. Mechanical Systems and Signal
Yesilyurt, I. (2004). The application of the conditional moments analysis to gearbox
fault detection-a comparative study using the spectrogram and scalogram.
NDT& E International, 37, 309320.

A Novel Technique For Selecting Mother Wavelet Function Using An Intelligent Fault Diagnosis System

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

A Novel Technique For Selecting Mother Wavelet Function Using An Intelligent Fault Diagnosis System

Uploaded by

Copyright:

Available Formats

Expert Systems with Applications 36 (2009) 48624875

Contents lists available at ScienceDirect

Expert Systems with Applications

A novel technique for selecting mother wavelet function using an intelligent

established as the way of recording required dataset because of

J. Raee et al. / Expert Systems with Applications 36 (2009) 48624875

Fig. 1. Low-order Daubechies.

distribution systems. The prior research demonstrates the wavelet

The most indispensable challenge in wavelet analysis is the

J. Raee et al. / Expert Systems with Applications 36 (2009) 48624875

J. Raee et al. / Expert Systems with Applications 36 (2009) 48624875

using DB4 to obtain the wavelet coefcients. The most important

2. Summary of the developed procedure

It is important to pre-process any raw data before use since it

J. Raee et al. / Expert Systems with Applications 36 (2009) 48624875

formed in the following two steps to extract the feature vector.

demonstrate unsolicited uctuations, essentially following random

Then the rst difference, dk is calculated by the following equation,

J. Raee et al. / Expert Systems with Applications 36 (2009) 48624875

Fig. 8. Proposed chromosome for GA.

Fig. 7. Structure of the MLP network for fault classication.

for the piecewise linear interpolant, dk = dk1 or dk

which is a cubic polynomial in s, and, hence in x, that fullled four

J. Raee et al. / Expert Systems with Applications 36 (2009) 48624875

Raw vibration signals

Synchronized Signals by P.C.H.I

Intelligent Fault Identification

Fig. 9. Intelligent fault identication system with optimized parameters.

4,10 4,11 4,12 4,13 4,14 4,15

Fig. 10. Decomposition tree of synchronized vibration signals.

J. Raee et al. / Expert Systems with Applications 36 (2009) 48624875

(4, 10), StD=12.47

(4, 11), StD=7.23

(4, 12), StD=54.42

(4, 15), StD=29.57

(4, 14), StD=15.04

(4, 13), StD=63.47

approximation and detail components as shown in Fig. 5. Wavelet

The original signal f(t) after j level of decomposition are dened as

where the wavelet packet coefcients cij;k t are calculated by:

In this paper, standard deviation of wavelet packet coefcients

J. Raee et al. / Expert Systems with Applications 36 (2009) 48624875

(4, 10), StD=44.57

(4, 11), StD=30.14

(4, 13), StD=67.87

(4, 15), StD=57.06

(4, 14), StD=67.45

the wavelet coefcients is directly dependent on the shape of the

J. Raee et al. / Expert Systems with Applications 36 (2009) 48624875

and ve gearbox congurations, was applied to the network as the

In this research, Roulette wheel selection scheme has been

J. Raee et al. / Expert Systems with Applications 36 (2009) 48624875

Mother wavelet = DB11

Mother wavelet = DB2

30 smples for each condition

30 smples for each condition

Standard Deviation of Wavelet Packet Coefficients

Mother wavelet = DB15

Mother wavelet = DB20

30 smples for each condition

30 smples for each condition

Standard Deviation of Wavelet Packet Coefficients

Standard Deviation of Wavelet Packet Coefficients