
ARTIFICIAL NEURAL NETWORK MODELS IN RAINFALL-RUNOFF MODELLING OF TURKISH RIVERS

H. Kerem CİĞİZOĞLU1*, Pınar AKIN1, Ahmet ÖZTÜRK2, Atilla GÜRBÜZ3, Ömer AYHAN3, Mehmet YILDIZ3, İsmail UÇAR2
1 İstanbul Technical University, Civil Engineering Faculty, Division of Hydraulics, Maslak, 34469, Istanbul-Turkey
2 İstanbul Technical University, Electrical and Electronic Faculty, Electrical Engineering Department, 34469, Istanbul-Turkey
3 General Directorate of Electrical Power Resources Survey and Development Administration, Eskişehir yolu 7. km No:166, 06520 Çankaya, Ankara-Turkey
* Corresponding author: Associate Prof., E-Mail: cigiz@itu.edu.tr

ABSTRACT

Three neural network methods, feed forward back propagation (FFBP), radial basis function (RBF) and generalized regression neural network (GRNN), were employed for rainfall-runoff modelling of Turkish hydrometeorologic data. The daily rainfall and daily mean flow data were coupled to form the basis of rainfall-flow modelling using different ANN configurations. All three ANN algorithms compared favourably with the conventional multiple linear regression (MLR) technique. The peak flows of the observed hydrographs were approximated more closely by the FFBP and RBF algorithms. GRNN was the only technique that did not provide negative flow estimates for any of the observations. The rainfall-flow correlogram was successfully used to determine the input layer structure of the ANN configurations.

Keywords: neural network, rainfall-flow modelling

RIVER BASIN FLOOD MANAGEMENT

1. INTRODUCTION

The rainfall-runoff relationship is one of the most complex hydrologic phenomena to comprehend due to the tremendous spatial and temporal variability of watershed characteristics and precipitation patterns, and the number of variables involved in the modelling of the physical processes. Numerous rainfall-runoff (R-R) models have been developed to forecast streamflow within the last 70 years. Conceptual models provide daily, monthly, or seasonal estimates of streamflow for long-term forecasting on a continuous basis. The entire physical process in the hydrologic cycle is mathematically formulated in conceptual models that are composed of a large number of parameters. Because there are many model parameters and their interactions are highly complicated, the optimization of model parameters is usually accomplished by a trial-and-error procedure. The accuracy of model predictions is very subjective and highly dependent on the user's ability, knowledge, and understanding of the model and of the watershed characteristics.

The artificial neural network (ANN) approach, which is a non-linear black box model, is extensively used in the water resources literature. The major ANN application domains can be summarized as river flow forecasting (Cigizoglu, 2003a; Cigizoglu, 2003b; Cigizoglu and Kisi, 2005), various groundwater problems (Ranjithan et al., 1993), and estimation of suspended sediment (Cigizoglu, 2004; Cigizoglu and Kisi, 2006). There are also ANN studies modelling the rainfall-runoff relationship (Minns and Hall, 1996; Tokar and Johnson, 1999). In these studies either the conventional feed forward error back propagation method (FFBP) or the radial basis function (RBF) method was employed to train the neural networks. As is well known, the FFBP algorithm has some drawbacks. It is very sensitive to the selected initial weight values and may provide performances that differ significantly from each other. Another problem faced during the application of FFBP is the local minima issue. The Levenberg-Marquardt algorithm was employed for the FFBP applications in the presented study to address this problem. The performance of another ANN algorithm, the Generalized Regression Neural Network (GRNN), in water resources problems has not been extensively analyzed. Cigizoglu (2005a, 2005b) and Cigizoglu and Alp (2006) found GRNN quite successful in hydrological forecasting.

The presented study covers the employment of three ANN methods, i.e. FFBP, RBF and GRNN, in rainfall-flow modelling of Turkish hydrological basins. The performance of GRNN in flow hydrograph estimation is investigated extensively throughout the study. Another concern of the study was the initial analysis of different statistics in forming the ANN input layer structure.


INTERNATIONAL CONGRESS ON RIVER BASIN MANAGEMENT

2. NEURAL NETWORK METHODS IN THE STUDY

2.1. The Feed Forward Back Propagation (FFBP) Method

A FFBP network distinguishes itself by the presence of one or more hidden layers, whose computation nodes are correspondingly called hidden neurons or hidden units. The function of hidden neurons is to intervene between the external input and the network output in some useful manner. By adding one or more hidden layers, the network is able to extract higher order statistics. In a rather loose sense, the network acquires a global perspective despite its local connectivity due to the extra set of synaptic connections and the extra dimension of the network interconnections (Haykin, 1994). The ability of hidden neurons to extract higher order statistics is particularly valuable when the size of the input layer is large. The source nodes in the input layer of the network supply respective elements of the activation pattern (input vector), which constitute the input signals applied to the neurons (computation nodes) in the second layer (i.e., the first hidden layer). The output signals of the second layer are used as inputs to the third layer, and so on for the rest of the network. Typically, the neurons in each layer of the network have as their inputs the output signals of the preceding layer only. The set of the output signals of the neurons in the output layer of the network constitutes the overall response of the network to the activation patterns applied by the source nodes in the input (first) layer. The FFBP networks were trained using the Levenberg-Marquardt optimization technique. Throughout all FFBP simulations, the learning rate and the momentum rate parameters were set adaptively.

2.2. The Radial Basis Function-Based Neural Networks (RBF)

RBF networks were introduced into the neural network literature by Broomhead and Lowe (1988). The RBF network consists of two layers whose output nodes form a linear combination of the basis functions. The basis functions in the hidden layer produce a significant non-zero response to an input stimulus only when the input falls within a small localized region of the input space. Hence, this paradigm is also known as a localized receptive field network. The relation between inputs and outputs is illustrated in Fig. 1. Transformation of the inputs is essential for fighting the curse of dimensionality in empirical modeling. The type of input transformation of the RBF is the local nonlinear projection using a radial fixed shape basis function. After nonlinearly squashing the multi-dimensional inputs without considering the output space, the radial basis functions play a role as regressors. Since the output layer implements a linear regressor, the only adjustable parameters are the weights of this regressor. These parameters can therefore be determined using the linear least squares method, which gives an important advantage for convergence.

2.3. The Generalized Regression Neural Networks (GRNN)

The basics of the GRNN can be found in the literature (Specht, 1991). The GRNN consists of four layers: input layer, pattern layer, summation layer, and output layer. The number of input units in the first layer is equal to the total number of parameters. The first layer is fully connected to the second, pattern layer, where each unit represents a training pattern and its output is a measure of the distance of the input from the stored patterns. Each pattern layer unit is connected to the two neurons in the summation layer: the S-summation neuron and the D-summation neuron. The S-summation neuron computes the sum of the weighted outputs of the pattern layer, while the D-summation neuron computes the sum of the unweighted outputs of the pattern neurons. The connection weight between the ith neuron in the pattern layer and the S-summation neuron is yi, the target output value corresponding to the ith input pattern. For the D-summation neuron, the connection weight is unity. The output layer merely divides the output of the S-summation neuron by that of the D-summation neuron, yielding the predicted value for an unknown input vector.
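
The four-layer GRNN computation described in Section 2.3 can be sketched in a few lines of Python. This is only an illustrative sketch and not the MATLAB code used in the study: it assumes a Gaussian kernel on the squared Euclidean distance with a single spread parameter, in the spirit of Specht (1991), and the function name `grnn_predict` and the toy numbers are hypothetical.

```python
import math


def grnn_predict(train_x, train_y, x, spread):
    """Estimate the output for input vector x from the stored training patterns.

    Assumption: each pattern unit uses a Gaussian kernel of the squared
    Euclidean distance between x and its stored pattern.
    """
    s_sum = 0.0  # S-summation neuron: pattern outputs weighted by the targets y_i
    d_sum = 0.0  # D-summation neuron: unweighted sum of the pattern outputs
    for xi, yi in zip(train_x, train_y):
        dist2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        k = math.exp(-dist2 / (2.0 * spread ** 2))
        s_sum += yi * k
        d_sum += k
    if d_sum == 0.0:
        raise ValueError("input is too far from every stored pattern for this spread")
    # Output layer: ratio of the two summation neurons.
    return s_sum / d_sum


# Hypothetical toy patterns: one input variable, three stored patterns.
patterns = [[0.0], [1.0], [2.0]]
targets = [0.0, 10.0, 4.0]
estimate = grnn_predict(patterns, targets, [1.2], spread=1.0)
```

Because the estimate is a weighted average of the stored target values with non-negative weights, it can never fall below the smallest or rise above the largest training target. With non-negative observed flows this rules out negative estimates, and it also means the sketch cannot extrapolate beyond the largest flow seen in training.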
3. DATA ANALYSIS

The rainfall and river flow data belong to the 26 hydrological regions of Turkey. The daily mean flow data and the rainfall data were provided by the Turkish General Directorate of Electrical Power Resources Survey and Development Administration (EİE) and the Turkish State Meteorological Service (DMİ), respectively. In the presented study only the rainfall and river flow measurements for hydrologic basin No.1 are employed as a representative example for the whole study due to the limited paper space.

3.1. The rainfall data

The Turkish State Meteorological Service (DMİ) provided three daily total rainfall measurements for each day at each of the 261 meteorological stations across Turkey. The unit of the data is mm. It was seen that 228 stations had records longer than 10 years. The rainfall data for these stations cover the time period between 1973 and 2002. The areal distribution of these stations over the whole country is shown in Fig. 1. The data consist of the values measured on wet days. For the remaining dry days zero values are attributed. A continuous rainfall time series of 30 years (1973-2002) is thus obtained for each considered meteorological station.


Fig. 1: The areal distribution of the meteorological stations considered within the study.

3.2. The flow data

In the study daily mean flow data belonging to 116 stations have been employed. ANN forecasting studies have been accomplished using daily mean flow series. The flow station series having missing flows for more than two days have not been considered. Statistics for daily mean flow series are presented in Table 1 for some of the flow stations. The table covers statistics for annual maximum flows and daily mean flows for all the flow stations in hydrologic basin No.1. The statistics are the mean ($\bar{x}$), standard deviation ($s_x$), skewness coefficient ($c_{sx}$), maximum ($x_{max}$) and minimum ($x_{min}$).

Table 1a. The statistics for the annual maximum flows in hydrological basin No.1.

Station No.   Catchment area (km2)   Standard deviation (m3/s)   Mean (m3/s)   Maximum (m3/s)
105           10194,8                594,15                      628,13        2190
106           1381,2                 159,64                      163,77        655

The standard deviation, mean and maximum refer to the annual maximum flows.

Table 1b. The statistics for the daily mean flows in hydrological basin No.1.

Station No.   Mean (m3/s)   Standard deviation (m3/s)   Skewness coefficient   Maximum and minimum (m3/s)
105           21,20         62,30                       13,19                  1690-0
106           4,06          14,09                       16,92                  614-0,01


3.3. Coupling the flow and rainfall data

The rainfall measurements of DMİ and the river flow measurements of EİE are collected independently of each other. Therefore a MATLAB code was written to couple the rainfall and flow stations having coordinates close to each other. For each flow station several rainfall stations having close coordinates were determined initially. Further elimination took place considering the results of the correlation analysis. The two rainfall stations having the highest cross correlation coefficients were selected for each flow station. Finally the maps showing the catchment area boundaries were analyzed carefully. The rainfall stations which are not situated in the same basin as the considered flow station were eliminated. The coupling results for hydrologic basin No.1 are presented in Table 2. On each line of this table all of the rainfall stations having close coordinates with the corresponding flow station are listed. The two rainfall stations found after the correlation analysis are shown in bold font (Table 2).
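
The correlation-based elimination step of Section 3.3 can be illustrated with a short Python sketch. It is not the MATLAB code of the study. It assumes that the "cross correlation coefficient" is the ordinary Pearson correlation between the concurrent daily rainfall and flow series, and the names `pearson` and `pick_rainfall_stations`, the station labels and the numbers are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # a constant series carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def pick_rainfall_stations(flow, candidates, keep=2):
    """Return the ids of the `keep` candidate stations best correlated with the flow."""
    scored = sorted(
        ((pearson(rain, flow), station_id) for station_id, rain in candidates.items()),
        reverse=True,
    )
    return [station_id for _, station_id in scored[:keep]]


# Hypothetical toy series for one flow station and three nearby rainfall stations.
flow = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
nearby = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0, 7.0],
    "B": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
    "C": [1.0, 3.0, 2.0, 5.0, 4.0, 6.0],
}
selected = pick_rainfall_stations(flow, nearby)
```

The final check against the catchment boundary maps is a manual step in the paper and is not represented here.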
Table 2. Coupling the rainfall and the flow stations in hydrologic basin No.1.

FLOW STATIONS   RAINFALL STATIONS
104             17052   17600   17056   17634   17050   17608   17632
105             17052   17600   17056   17634   17050   17608   17632

4. ANN ESTIMATION RESULTS FOR HYDROLOGICAL BASIN NO.1

Three different MATLAB codes were written for the three ANN methods employed in the study. The data were divided into two groups, training and testing. The same data were then also used for the multiple linear regression (MLR) method for the purpose of comparison. The comparison criteria were the hydrographs, the mean square error (MSE) and the coefficient of determination (R2), all for the testing period. The ANN and MLR methods were applied to the daily mean flow data of all 98 flow stations in the entire country. The input data for each flow station included the past and present daily total rainfall values of two rainfall stations together with the past daily mean flow measurements. The number of daily rainfall values for each rainfall station was determined using the correlation analysis as explained in the previous section. The analysis of the cross correlations between flow and rainfall data and of the auto correlations within the flow data therefore helped significantly in establishing the ANN input layer structure. The calibration period for the MLR method was the same as the training period of the ANNs, and the same testing period was employed. The independent variables for the MLR method were the input layer variables of the ANN configuration.

The parameters of the three ANN methods were found separately for each flow station after trying various values for each parameter. These parameters are the number of hidden layers and the number of hidden layer nodes for FFBP, and the spread parameter value for GRNN and RBF. In general one hidden layer was found adequate for the FFBP simulations. In the presented paper the ANN results are presented only for two hydrologic basins due to the paper page limitation. The ANN configurations together with the model parameters and the testing stage performance criteria for the four flow stations in hydrological basin No.1 are presented in Table 3. In this table the FFBP configuration FFBP(18,6,1) represents a structure with 18 input nodes, 6 hidden layer nodes and a single output node. Similarly, the GRNN structure GRNN(18,0.10,1) corresponds to 18 input nodes, a spread value equal to 0.10 and a single output node. The hydrographs and the scatter plots for the testing period are plotted in Figs. 2-3. It can be seen that the hydrographs estimated by the three ANN methods capture the general behaviour of the observed ones better than the MLR curves. The local maximum flows are estimated more closely by the FFBP and RBF methods. The GRNN performance criteria were quite similar to the results of the other two ANN methods. Although the GRNN method did not provide negative estimates for any of the low flows, it overestimated some of the low flow parts of the observed hydrographs.

The overall performance of the three ANN methods and the MLR method throughout the country is presented in Table 4. In all of the 98 stations the ANN methods provided the best performance criteria. For only 2 stations did the MLR method have results close to the best ANN method. FFBP, GRNN and RBF had the best estimation performances for 65, 33 and 31 stations, respectively. It was seen that the FFBP, RBF and MLR methods provided estimates having negative values for some of the low flows. For the GRNN method, on the other hand, this problem is not seen.
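
The input layer structure and the two numerical comparison criteria described in Section 4 can be sketched in Python as follows. This is a hedged illustration, not the code of the study. The helper names `build_patterns`, `mse` and `r_squared` are hypothetical, the series are invented, and R2 is assumed here to be the squared Pearson correlation between observed and estimated flows (the quantity a scatter-plot trend line reports). The lag counts 3, 4 and 8 mirror the 15-input configuration listed for station 106 in Table 3.

```python
def build_patterns(rain1, rain2, flow, n_r1, n_r2, n_q):
    """Assemble input rows [R1(t)..R1(t-n_r1+1), R2(t)..R2(t-n_r2+1), Q(t-1)..Q(t-n_q)]
    and the matching targets Q(t) from three aligned daily series."""
    start = max(n_r1 - 1, n_r2 - 1, n_q)  # first day with a full history
    inputs, targets = [], []
    for t in range(start, len(flow)):
        row = [rain1[t - k] for k in range(n_r1)]
        row += [rain2[t - k] for k in range(n_r2)]
        row += [flow[t - k] for k in range(1, n_q + 1)]
        inputs.append(row)
        targets.append(flow[t])
    return inputs, targets


def mse(obs, est):
    """Mean square error between observed and estimated flows."""
    return sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs)


def r_squared(obs, est):
    """Squared Pearson correlation between observed and estimated flows (assumed R2)."""
    n = len(obs)
    mo = sum(obs) / n
    me = sum(est) / n
    cov = sum((o - mo) * (e - me) for o, e in zip(obs, est))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_e = sum((e - me) ** 2 for e in est)
    if var_o == 0.0 or var_e == 0.0:
        return 0.0
    return cov * cov / (var_o * var_e)


# Invented 12-day series, only to show the shapes involved.
rain1 = [float(i) for i in range(12)]
rain2 = [10.0 * i for i in range(12)]
flow = [100.0 + i for i in range(12)]
inputs, targets = build_patterns(rain1, rain2, flow, n_r1=3, n_r2=4, n_q=8)
```

In a setting like the one described in the paper, the first part of the resulting rows (for example 1500 days) would be used for training or calibration and the remainder for testing, with the same rows supplied to the MLR model as independent variables.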

Table 3. Flow estimation results and general information related to the ANN and MLR models in hydrological basin No.1 (performance criteria are obtained for the testing period).

Station No. 105
Data period: 07.09.1979 - 07.07.1994 (training/calibration: 1500 days; testing: 1351 days)
Inputs:
  Rainfall 1 (Sta:17632): R(t), R(t-1), R(t-2)
  Rainfall 2 (Sta:17608): R(t), R(t-1), R(t-2), R(t-3), R(t-4)
  Flow: Q(t-1), Q(t-2), Q(t-3), Q(t-4), Q(t-5), Q(t-6), Q(t-7), Q(t-8),

Model   Architecture       MSE (m6/s2)   R2
MLR     -                  249.59        0.29
FFBP    FFBP(18,6,1)       61            0.74
GRNN    GRNN(18,0.10,1)    107.55        0.61
RBF     RBF(18,0.85,1)     88.10         0.70

Station No. 106
Data period: 26.09.1974 - 11.06.1994 (training/calibration: 1500 days; testing: 1796 days)
Inputs:
  Rainfall 1 (Sta:17634): R(t), R(t-1), R(t-2)
  Rainfall 2 (Sta:17632): R(t), R(t-1), R(t-2), R(t-3)
  Flow: Q(t-1), Q(t-2), Q(t-3), Q(t-4), Q(t-5), Q(t-6), Q(t-7), Q(t-8),

Model   Architecture       MSE (m6/s2)   R2
MLR     -                  18.43         0.10
FFBP    FFBP(15,6,1)       6.69          0.54
GRNN    GRNN(15,0.10,1)    2.19          0.26
RBF     RBF(15,0.85,1)     5.26          0.2294


Fig. 2. ANN and MLR flow estimation results for station 105 (for the testing period). Each panel pairs a hydrograph (estimated and observed daily mean flow in m3/s against time in days) with a scatter plot of estimated against observed flow (m3/s) and its fitted line: (a) FFBP, y = 0.9783x + 0.7434, R2 = 0.743; (b) GRNN, y = 0.547x + 9.9253, R2 = 0.6051; (c) RBF, y = 1.0654x - 0.5891, R2 = 0.7044; (d) MLR, y = 0.7105x - 0.5495, R2 = 0.2919.


Fig. 3. ANN and MLR flow estimation results for station 106 (for the testing period). Each panel pairs a hydrograph (estimated and observed daily mean flow in m3/s against time in days) with a scatter plot of estimated against observed flow (m3/s) and its fitted line: (a) FFBP, y = 1.0422x + 0.1172, R2 = 0.5359; (b) GRNN, y = 0.6128x + 2.2794, R2 = 0.2635; (c) RBF, y = 0.79x + 0.743, R2 = 0.2294; (d) MLR, y = 0.9654x + 1.3854, R2 = 0.1078.


Table 4. General evaluation of the model estimation results for all of the Turkish flow stations considered in the study.

FFBP   GRNN   RBF   MLR   Number of flow stations
65     33     31    2     98

5. CONCLUSIONS

The results of this study can be summarized as follows:

1- In this study the daily rainfall-runoff data sets were prepared for all of the 26 hydrologic basins of Turkey using rainfall and river flow data sets which are measured independently of each other.

2- The employment of three different ANN methods in the rainfall-river flow modelling of 98 flow stations in the 26 hydrologic basins of Turkey has been accomplished successfully. It is seen that in nearly all of the stations the ANN methods performed better than the conventional multiple linear regression method.

3- The applicability of the GRNN model was tested for rainfall-runoff modelling over a wide range of data sets. It is seen that, unlike the RBF and FFBP methods, the GRNN model does not provide negative values for low flows. For a significant number of flow stations GRNN provided the best performance criteria.

4- For the high flows, the FFBP and RBF methods provided estimates closer to the observed measurements than the GRNN method. This can be explained by the extrapolation ability of both methods.

5- The training time of a single FFBP application is relatively shorter than that of GRNN. This handicap of GRNN can be easily overcome by including clustering, which was not employed in this study (Parzen 1962, Miller et al. 1990). Besides, it was seen that multiple FFBP simulations were required before satisfactory performance criteria were obtained. The total duration of these FFBP simulations was longer than that of the single GRNN application.

6- Another important result of the study is the positive contribution of the initial statistical analysis in determining the number of ANN input layer nodes. The rainfall-river flow correlograms and the flow auto correlograms point to the number of rainfall and past river flow values that are significant in flow estimation. This can be a time saving feature, since the input layer nodes are in general found by trial and error.
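
The correlogram-based choice of input lags mentioned in the last conclusion can be illustrated with the following Python sketch. It is an assumption-laden example and not the procedure coded by the authors: the paper does not state how a lag was judged significant, so the sketch uses the common approximate 95% bound of 1.96/sqrt(n) for an uncorrelated series. The function names and the toy series are hypothetical.

```python
import math


def lagged_correlation(x, y, lag):
    """Pearson correlation between y(t) and x(t - lag), for lag >= 0."""
    xs = x[:len(x) - lag]
    ys = y[lag:]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def correlogram(x, y, max_lag):
    """Correlation of y(t) with x(t - k) for k = 0..max_lag."""
    return [lagged_correlation(x, y, k) for k in range(max_lag + 1)]


def significant_lags(x, y, max_lag):
    """Lags whose correlation exceeds the assumed bound 1.96 / sqrt(n)."""
    bound = 1.96 / math.sqrt(len(y))
    return [k for k, r in enumerate(correlogram(x, y, max_lag)) if abs(r) > bound]


# Invented example in which the flow simply repeats the rainfall one day later.
rain = [0.0, 3.0, 0.0, 0.0, 5.0, 0.0, 1.0, 0.0, 0.0, 4.0, 0.0, 0.0, 2.0, 0.0, 6.0, 0.0]
flow = [0.0] + rain[:-1]
cross = correlogram(rain, flow, max_lag=3)    # rainfall-flow correlogram
auto = correlogram(flow, flow, max_lag=3)     # flow auto correlogram
```

Passing the same series as both arguments gives the auto correlogram, so one helper covers both the rainfall lags R(t-k) and the past flow lags Q(t-k) of the input layer.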

ACKNOWLEDGEMENT

This work has been supported by the Scientific and Technical Research Council of Turkey (TÜBİTAK) under grant TAG I841. The data were provided by the General Directorate of Electrical Power Resources Survey and Development Administration of Turkey (EİE) and the Turkish State Meteorological Service (DMİ).

REFERENCES

Broomhead D, Lowe D. Multivariable functional interpolation and adaptive networks. 1988. Complex Syst. 2, 321-355.
Cigizoglu HK. Incorporation of ARMA models into flow forecasting by artificial neural networks. 2003a. Environmetrics. 14(4), 417-427.
Cigizoglu HK. Estimation, forecasting and extrapolation of flow data by artificial neural networks. 2003b. Hydrological Sciences Journal. 48(3), 349-361.
Cigizoglu HK. Estimation and forecasting of daily suspended sediment data by multi layer perceptrons. 2004. Advances in Water Resources. 27, 185-195.
Cigizoglu HK. Application of the generalized regression neural networks to intermittent flow forecasting and estimation. 2005a. Journal of Hydrologic Engineering, ASCE. 10(4), 336-341.
Cigizoglu HK. Generalized regression neural networks in monthly flow forecasting. 2005b. Civil Engineering and Environmental Systems. 22(2), 71-84.
Cigizoglu HK, Alp M. Generalized regression neural network in modelling river sediment yield. 2006. Advances in Engineering Software. 37(2), 63-68.
Cigizoglu HK, Kisi O. Flow prediction by three back propagation techniques using k-fold partitioning of neural network training data. 2005. Nordic Hydrology. 36(1), 1-16.
Cigizoglu HK, Kisi O. Methods to improve the neural network performance in suspended sediment estimation. 2006. Journal of Hydrology. 317(3-4), 221-238.
Haykin S. Neural networks: a comprehensive foundation. 1994. IEEE Press.
Minns AW, Hall MJ. Artificial neural networks as rainfall runoff models. 1996. Hydrological Sciences Journal. 41(3), 399-417.
Ranjithan S, Eheart JW, Garrett JH. Neural network-based screening for groundwater reclamation under uncertainty. 1993. Water Resources Research. 29(3), 563-574.
Specht DF. A general regression neural network. 1991. IEEE Transactions on Neural Networks. 2(6), 568-576.
Tokar AS, Johnson PA. Rainfall-runoff modelling using artificial neural networks. 1999. Journal of Hydrologic Engineering. 4(3), 232-239.
