Neurocomputing
journal homepage: www.elsevier.com/locate/neucom

Article history:
Received 17 September 2014
Received in revised form 13 April 2015
Accepted 14 April 2015
Available online 19 August 2015

Keywords:
Extreme learning machine
Principal component analysis
Blast furnace
Hot metal temperature
Soft measurement model

Abstract

This note proposes a modified ELM algorithm, named P-ELM, which uses the PCA technique to remove the multicollinearity problem from the output-weight calculation. By reducing the dimension of the hidden layer output matrix (H) without losing major information, the proposed P-ELM algorithm not only ensures the full column rank of the newly generated hidden layer output matrix (H′), but also improves the training speed. To verify the effectiveness of the P-ELM algorithm, this paper establishes a soft measurement model for hot metal temperature in the blast furnace (BF). Comparative simulations against other well-known feedforward neural networks and the ordinary ELM algorithm with its variants illustrate the better generalization performance and stability of the proposed P-ELM algorithm.

© 2015 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.neucom.2015.04.106
H. Zhang et al. / Neurocomputing 174 (2016) 232–237
indexes, and the proposed P-ELM algorithm is employed to train on the sampled data and build a reliable feedforward neural network model.

In this paper, a modified ELM algorithm named P-ELM is proposed based on the PCA technique. Compared with previous methods, the proposed P-ELM algorithm can deal with the multicollinearity problem while keeping the estimate of the output weights unbiased. The improved P-ELM algorithm performs better on noise-corrupted data obtained from complex industrial processes. In order to verify the effectiveness of the proposed P-ELM algorithm in a real industrial application, a soft measurement model for the hot metal temperature in the BF is presented. Finally, simulation results on real industrial data show that the P-ELM algorithm has better stability and generalization performance than other well-known feedforward neural networks. It is worth mentioning that [13] also applied PCA theory to the ordinary ELM algorithm; in this paper, however, the PCA technique is applied to the hidden layer output matrix rather than to the original sampled data as in [13].

The following sections are organized as follows: Section 2 is the main part, describing the improved P-ELM algorithm. Some basic theory about the BF is presented in Section 3, and simulation results are given in Section 4. Section 5 summarizes the conclusions of this paper.

2. The improved extreme learning machine algorithm

In the last few years, the ELM algorithm has seen a very wide range of applications and development because of its fast training speed and good generalization performance. In this section, a review of the ordinary ELM algorithm is presented and some properties of its solution are discussed. Then the improved P-ELM algorithm is proposed to overcome the restrictions.

2.1. The theory of the ordinary ELM algorithm

Suppose there are N arbitrary samples (x_i, t_i), where x_i = (x_{i1}, x_{i2}, …, x_{in})^T ∈ R^n denotes the n-dimensional feature vector of the ith sample and t_i = (t_{i1}, t_{i2}, …, t_{im})^T ∈ R^m denotes the target vector. The mathematical model of SLFNs with Ñ hidden nodes is as follows:

\sum_{i=1}^{\tilde{N}} \beta_i g_i(x_k) = \sum_{i=1}^{\tilde{N}} \beta_i g(w_i, b_i, x_k) = t_k, \quad k = 1, 2, \ldots, N    (1)

where w_i and b_i are the learning parameters, which are determined randomly, β_i is the output weight vector connecting the ith hidden node and the output nodes, and g(w_i, b_i, x_k) is a nonlinear piecewise continuous function which satisfies the ELM universal approximation capability theorems.

The above N equations can be written in matrix form as

H\beta = T    (2)

where

H(W, B, X) = \begin{pmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_1 + b_{\tilde{N}}) \\ \vdots & & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_N + b_{\tilde{N}}) \end{pmatrix}_{N \times \tilde{N}}

is called the hidden layer output matrix. According to least-squares theory, the output weights can be estimated as

\hat{\beta} = H^{\dagger} T    (3)

where H^† is the Moore–Penrose generalized inverse of H [19]. There are several methods to calculate the Moore–Penrose generalized inverse; the singular value decomposition (SVD) method is widely used, where H^† = (H^T H)^{-1} H^T. So

\hat{\beta} = (H^T H)^{-1} H^T T    (4)

It is apparent from (4) that one cannot obtain a correct and satisfactory result if the matrix H^T H is singular, even though mathematical software such as Matlab has routines for the inverse of a singular matrix. Next, some properties of the ELM solution will be discussed. In a complex industrial production environment, the data will often be corrupted by external noise, so model (2) should be modified to

H\beta = T + \varepsilon    (5)

where ε is white noise, ε ∼ N(0, σ²). After adding the interference of external noise, the solution of ELM is correspondingly modified to

\hat{\beta} = (H^T H)^{-1} H^T (T + \varepsilon) = \frac{\sum_{i=1}^{\tilde{N}} H_i^T H_i \beta_i}{\sum_{i=1}^{\tilde{N}} H_i^T H_i} + \frac{\sum_{i=1}^{\tilde{N}} H_i^T \varepsilon_i}{\sum_{i=1}^{\tilde{N}} H_i^T H_i}    (6)

Next we analyze this result from three aspects: expectation (E), variance (V) and mean square error (MSE). One can get

E(\hat{\beta}) = \beta, \qquad V(\hat{\beta}) = E\big[(\hat{\beta} - E\hat{\beta})(\hat{\beta} - E\hat{\beta})^T\big] = \sigma^2 (H^T H)^{-1}, \qquad MSE(\hat{\beta}) = \frac{1}{\tilde{N}} E\big[(\hat{\beta} - \beta)^T (\hat{\beta} - \beta)\big] = \frac{\sigma^2}{\tilde{N}} \sum_{i=1}^{\tilde{N}} \frac{1}{\lambda_i}    (7)

where λ_i is the ith eigenvalue of H^T H [8].

As is well known, the learning parameters of the hidden nodes in the ELM model are generated randomly, without human tuning. In most cases the number of hidden nodes is far less than the number of samples (Ñ ≪ N), so H is usually not a square matrix and H^T H may not always be nonsingular. That is to say, when H^T H is multicollinear, some eigenvalues tend to zero while V(β̂) and MSE(β̂) become large, and the solution is no longer reliable.

2.2. The introduction of the improved P-ELM algorithm

This subsection is the main part of the note. Here the improved P-ELM algorithm is presented to overcome the multicollinearity problem using the principal component analysis (PCA) technique. PCA is a useful statistical technique that has found application in fields such as face recognition and image compression, and is a common technique for finding patterns in high-dimensional data [14,16]. For high-dimensional data, PCA is a powerful tool for identifying patterns and expressing the data in such a way as to highlight their similarities and differences while reducing the number of dimensions [15].

Fig. 1 presents the distribution of sample data in two dimensions, where one can see that most of the data are spread along the direction of w1. Here w1 is called the first principal component direction, which characterizes the main information of the data distribution. In addition, w2 stands for the direction of the second principal component, which carries less important information such as external disturbances. The main purpose of PCA is to represent the sample data using the main principal components. In mathematical statistics, the first principal component direction is the one with the maximum variance of the data distribution [16], so principal components are sorted by the variances of the sample data along the different directions. Next, a brief theoretical derivation of PCA is presented [17].
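The multicollinearity problem that motivates P-ELM is easy to reproduce numerically: by Eq. (7), the MSE of the output-weight estimate scales with the sum of the reciprocal eigenvalues of H^T H, so a single near-collinear pair of hidden-node output columns blows the error up. The sketch below (an illustration with arbitrary matrix sizes, not the authors' code) compares that factor for a well-conditioned and a nearly collinear hidden layer output matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

def inverse_eigen_sum(H):
    """Sum of 1/lambda_i over the eigenvalues of H^T H, i.e. the factor
    that drives MSE(beta_hat) in Eq. (7)."""
    lam = np.linalg.eigvalsh(H.T @ H)
    return float(np.sum(1.0 / lam))

N, n_hidden = 200, 20
H_good = rng.normal(size=(N, n_hidden))                  # well-conditioned H

H_bad = H_good.copy()
H_bad[:, -1] = H_bad[:, 0] + 1e-6 * rng.normal(size=N)   # two nearly collinear columns

print(inverse_eigen_sum(H_good))   # modest
print(inverse_eigen_sum(H_bad))    # explodes as one eigenvalue approaches zero
```

Making the last column an almost exact copy of the first sends one eigenvalue of H^T H toward zero, and the 1/λ_i term dominates the sum, exactly the instability discussed above.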
Fig. 1. The distribution of sample data (axes x1, x2; principal directions w1 and w2).
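The geometric picture in Fig. 1 — the first principal direction w1 aligned with the maximum-variance axis of the data cloud — can be checked on synthetic two-dimensional data (the data below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic 2-D samples stretched along the 45-degree direction,
# mimicking the cloud sketched in Fig. 1.
x1 = rng.normal(scale=3.0, size=1000)
x2 = x1 + rng.normal(scale=0.5, size=1000)   # strongly correlated with x1
X = np.column_stack([x1, x2])
X -= X.mean(axis=0)                          # PCA works on centred data

# Principal directions = eigenvectors of the sample covariance matrix,
# ordered by eigenvalue (variance captured along that direction).
cov = np.cov(X, rowvar=False)
eigval, eigvec = np.linalg.eigh(cov)         # ascending eigenvalue order
w1, w2 = eigvec[:, -1], eigvec[:, 0]

print(w1)                                    # close to (1, 1)/sqrt(2), up to sign
print(eigval[-1] / eigval.sum())             # fraction of variance along w1
```

Almost all of the variance lies along w1, while w2 mostly captures the small added noise — the "less important information" the text associates with external disturbance.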
Fig. 4. The distribution of sampling data. (From top to bottom: (a) Furnace top pressure. (b) Blast pressure. (c) Blast volume. (d) Blast temperature. (e) Oxygen enrichment.)
gradually rises until about 1500 °C. One can see that the proposed P-ELM algorithm tracks the variation tendency of the hot metal temperature well. In addition, the performance of the proposed P-ELM algorithm with the sigmoid function is more stable than with RBF hidden nodes.

Fig. 5. The regression results for hot metal temperature. (The panel marks the 1#, 2# and 3# iron notches and the first and second temperature measurements, and compares RBF and sigmoid hidden nodes.)

For the continuous production process in the BF, it is essential to realize online condition monitoring, which can provide real-time operating condition information to the operators. The proposed P-ELM algorithm is applied to continuous measurement of the hot metal temperature through batch learning. We employ the P-ELM algorithm on 10 batches of data. Fig. 6 shows the simulation results in terms of batch error using the corresponding ELM algorithms with sigmoidal hidden nodes, where one can see that the P-ELM algorithm has a smaller error than other ELM variants such as OP-ELM and R-ELM. The P-ELM algorithm restrains the multicollinearity problem well and gives more accurate predictions of the hot metal temperature.

Fig. 6. Batch errors (temperature error) of ELM, OP-ELM, R-ELM and P-ELM.
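The batch-learning evaluation described above can be organized as a simple loop. Everything in this sketch is hypothetical scaffolding (the data shapes, the `train_elm` and `predict` helpers, and the per-batch refit), not the authors' implementation; it only shows how per-batch errors like those in Fig. 6 would be collected:

```python
import numpy as np

rng = np.random.default_rng(2)

def train_elm(X, T, n_hidden=50):
    """Ordinary ELM fit: random hidden layer + least-squares output weights."""
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))       # sigmoid hidden outputs
    beta, *_ = np.linalg.lstsq(H, T, rcond=None)
    return W, b, beta

def predict(model, X):
    W, b, beta = model
    return 1.0 / (1.0 + np.exp(-(X @ W + b))) @ beta

# Hypothetical stand-in for 10 batches of furnace measurements.
batches = [(rng.normal(size=(40, 5)), rng.normal(size=40)) for _ in range(10)]

errors = []
for X, t in batches:                              # one batch at a time
    model = train_elm(X, t)
    errors.append(float(np.mean((predict(model, X) - t) ** 2)))
print(errors)
```

In practice each batch would hold real process variables (blast pressure, blast volume, etc.) and the measured hot metal temperature as the target.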
5. Conclusions

In practical applications, the ELM algorithm encounters an important constraint, the multicollinearity problem in the calculation of the output weight matrix. In this paper, an improved ELM algorithm named P-ELM is proposed based on PCA theory. By reducing the dimension of the hidden layer output matrix (H) without loss of statistical information, the newly generated hidden layer output matrix (H′) has full column rank. This approach not only solves the multicollinearity problem but also improves the training speed compared with other ELM algorithms. Finally, we establish a soft measurement model for the hot metal temperature in the iron-making process of the BF in order to verify the effectiveness of the P-ELM algorithm. The experimental results show that the P-ELM algorithm is more stable in handling contaminated industrial data, and its learning speed can meet the on-line implementation requirements of the industrial field.
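The overall procedure summarized here can be sketched in a few lines. This is my reading of the method under stated assumptions — the hidden-layer sizes, the tanh activation, and the 95% variance-retention rule for choosing the number of components are illustrative choices, not the paper's reported settings:

```python
import numpy as np

rng = np.random.default_rng(3)

def pelm_fit(X, T, n_hidden=80, var_keep=0.95):
    # Step 1: ordinary ELM hidden layer with random parameters.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)

    # Step 2: PCA on the columns of H. Keeping only components with
    # non-negligible variance gives a reduced matrix H' of full column rank.
    mu = H.mean(axis=0)
    U, s, Vt = np.linalg.svd(H - mu, full_matrices=False)
    var = s ** 2 / np.sum(s ** 2)
    k = int(np.searchsorted(np.cumsum(var), var_keep)) + 1
    P = Vt[:k].T                 # projection onto the first k components
    H_prime = (H - mu) @ P       # reduced hidden layer output matrix H'

    # Step 3: least-squares output weights on the reduced matrix.
    beta, *_ = np.linalg.lstsq(H_prime, T, rcond=None)
    return W, b, mu, P, beta

def pelm_predict(model, X):
    W, b, mu, P, beta = model
    return (np.tanh(X @ W + b) - mu) @ P @ beta

# Toy regression check with invented data.
X = rng.normal(size=(300, 4))
t = np.sin(X[:, 0]) + 0.1 * rng.normal(size=300)
model = pelm_fit(X, t)
resid = pelm_predict(model, X) - t
print(np.mean(resid ** 2))
```

Because the retained components correspond to the non-negligible singular values of H, the least-squares step in (3)–(4) operates on a well-conditioned matrix, which is the point of the method.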
Table 2
Simulation results.

Algorithm  Nodes   MSE (training)  MSE (testing)  Training time  Testing time  SD      CV
BP         80      0.1851          0.1239         5.1821         0.0310        0.0354  0.0501
SVM        -       0.0998          0.0979         0.3940         0.0198        0.0305  0.0436
ELM        80      0.1015          0.0998         0.0055         0.0011        0.0344  0.0491
OP-ELM     65      0.0995          0.1003         0.0804         0.0030        0.0213  0.0304
P-ELM      80-58   0.0990          0.0967         0.0122         0.0011        0.0201  0.0287

Table 3
The distribution of simulation data.

Dataset  #Attributes  #Classes  #Training data  #Testing data
Abalone  8            -         3000            1177
Wine     13           3         120             58

Table 4
Simulation results (sigmoid hidden nodes).

Dataset  Algorithm  #Nodes  Training time (s)  RMSE/accuracy (training)  RMSE/accuracy (testing)
Abalone  ELM        50      0.0978             0.0767                    0.0723
Abalone  OP-ELM     50      0.1532             0.0699                    0.0711
Abalone  R-ELM      50      0.1329             0.0721                    0.0719
Abalone  P-ELM      50      0.1321             0.0701                    0.0700
Wine     ELM        30      0.0022             99.98%                    95.70%
Wine     OP-ELM     30      0.0030             99.95%                    96.01%
Wine     R-ELM      30      0.0031             99.94%                    95.29%
Wine     P-ELM      30      0.0025             99.96%                    97.02%

References

[3] C. Pan, D.S. Park, Y. Yang, H.M. Yoo, Leukocyte image segmentation by visual attention and extreme learning machine, Neural Comput. Appl. 21 (6) (2012) 1217–1227.
[4] P.K. Wang, Z.X. Yang, C.M. Vong, J.H. Zhong, Real-time fault diagnosis for gas turbine generator systems using extreme learning machine, Neurocomputing 128 (2014) 249–257.
[5] X.F. Hu, Z. Zhao, S. Wang, F.L. Wang, D.K. He, S.K. Wu, Multi-stage extreme learning machine for fault diagnosis on hydraulic tube tester, Neural Comput. Appl. 17 (4) (2008) 399–403.
[6] H.G. Zhang, S. Zhang, Y.X. Yin, An improved ELM algorithm based on EM-ELM and ridge regression, Lecture Notes in Computer Science 8261 (2013) 756–763.
[7] G.B. Huang, H. Zhou, X. Ding, R. Zhang, Extreme learning machine for regression and multiclass classification, IEEE Trans. Syst. Man Cybern. Part B Cybern. 42 (2) (2012) 513–529.
[8] H.G. Zhang, S. Zhang, Y.X. Yin, A novel improved ELM algorithm for a real industrial application, Math. Probl. Eng. (2014), http://dx.doi.org/10.1155/2014/824765.
[9] V.R. Radhakrishnan, K.M. Ram, Mathematical model for predictive control of the bell-less top charging system of a blast furnace, J. Process Control 11 (5) (2001) 565–586.
[10] V.R. Radhakrishnan, A.R. Mohamed, Neural networks for the identification and control of blast furnace hot metal quality, J. Process Control 10 (6) (2000) 509–524.
[11] G.M. Cui, J. Li, Y. Zhang, Z.D. Li, X. Ma, Prediction modeling study for blast furnace hot metal temperature based on T-S fuzzy neural network model, Iron Steel 48 (1) (2013) 11–15.
[12] P.D. Burk, J.M. Burgess, A coupled gas and solid flow heat transfer & chemical reaction model, in: Iron Making Conference, Chicago, 2006, pp. 773–781.
[13] A. Castaño, F. Fernández-Navarro, C. Hervás-Martínez, PCA-ELM: a robust and pruned extreme learning machine approach based on principal component analysis, Neural Process. Lett. 37 (3) (2013) 377–392.
[14] M. Kirby, L. Sirovich, Application of the Karhunen-Loeve procedure for the characterization of human faces, IEEE Trans. Pattern Anal. Mach. Intell. 12 (1) (1990) 103–108.
[15] R.P. Good, D. Kost, G.A. Cherry, Introducing a unified PCA algorithm for model size reduction, IEEE Trans. Semicond. Manuf. 23 (2) (2010) 201–209.
[16] L. Yan, A PCA-based PCM data analyzing method for diagnosing process failures, IEEE Trans. Semicond. Manuf. 19 (4) (2006) 404–410.
[17] I.T. Jolliffe, Principal Component Analysis, second ed., Springer-Verlag, New York, 2002, ISBN: 978-0-387-22440-4.
[18] X.L. Tang, L. Zhang, X.D. Hu, The support vector regression based on the chaos particle swarm optimization algorithm for the prediction of silicon content in hot metal, Control Theory Appl. 26 (8) (2009) 838–842.
[19] D. Serre, Matrices: Theory and Applications, Springer, New York, 2002.
Acknowledgments

This work has been supported by the National Natural Science Foundation of China (NSFC Grant no. 61333002) and Beijing Natural Science Foundation (Grant no. 4132065).

Haigang Zhang received the B.S. and M.S. degrees in Electrical Engineering from the University of Science and Technology Beijing in 2009 and 2011, respectively. He is now a Ph.D. candidate in control systems. His research interests include neural networks and their applications to control.
Appendix