You are on page 1of 8

Hydrol. Earth Syst. Sci.

, 20, 1405–1412, 2016


www.hydrol-earth-syst-sci.net/20/1405/2016/
doi:10.5194/hess-20-1405-2016
© Author(s) 2016. CC Attribution 3.0 License.

Technical note: Application of artificial neural networks


in groundwater table forecasting – a case study in a
Singapore swamp forest
Yabin Sun1 , Dadiyorto Wendi1 , Dong Eon Kim1 , and Shie-Yui Liong1,2,3
1 Tropical Marine Science Institute, National University of Singapore, Singapore
2 WillisResearch Network, Willis Re Inc., London, UK
3 Center for Environmental Modeling and Sensing, SMART, Singapore

Correspondence to: Yabin Sun (tmssy@nus.edu.sg)

Received: 9 July 2015 – Published in Hydrol. Earth Syst. Sci. Discuss.: 10 September 2015
Revised: 19 March 2016 – Accepted: 23 March 2016 – Published: 14 April 2016

Abstract. Accurate prediction of groundwater table is im- teristics, initial conditions, system forcings, etc. To develop
portant for the efficient management of groundwater re- a groundwater numerical model, essential data include: to-
sources. Despite being the most widely used tools for de- pography, geological coverage, soil properties, land use map,
picting the hydrological regime, numerical models suffer vegetation distribution, evapotranspiration information, hy-
from formidable constraints, such as extensive data demand- drologic and climatic data, etc. Extensive data demanding
ing, high computational cost, and inevitable parameter un- makes numerical models highly data dependent and data sen-
certainty. Artificial neural networks (ANNs), in contrast, can sitive. Fitting a physical model is not possible when data are
make predictions on the basis of more easily accessible vari- not sufficient; the accuracy of the numerical model to a great
ables, rather than requiring explicit characterization of the extent depends on how accurate the model inputs are. Nu-
physical systems and prior knowledge of the physical param- merical models are also less competent in forecasting as most
eters. This study applies ANN to predict the groundwater ta- of the system forcings (e.g. evapotranspiration, rainfall) are
ble in a freshwater swamp forest of Singapore. The inputs to less predictable. As a result of aforementioned constraints,
the network are solely the surrounding reservoir levels and numerical models tend to produce imperfect results in spite
rainfall. The results reveal that ANN is able to produce an of the perfect knowledge of the governing laws (Sun et al.,
accurate forecast with a leading time of 1 day, whereas the 2010).
performance decreases when leading time increases to 3 and To combat the deficiencies of the numerical models, ar-
7 days. tificial neural networks (ANNs) have emerged as an alter-
native modelling and forecasting approach with a variety of
applications in hydrology research (e.g. French et al., 1992;
Maier and Dandy, 2000). Unlike the traditional physical-
1 Introduction based models, the ANN-based approach does not require ex-
plicit characterization of the physical properties, or accurate
Physical-based numerical models are widely used in ground- representation of the physical parameters, but rather simply
water table simulation. Different numerical models have determines the system patterns based on the relationships be-
been developed for different regions with different objec- tween inputs and outputs mapped in the training process.
tives, such as to describe regional groundwater flow patterns, ANNs typically use input variables that are more accessi-
and to understand local hydrological processes. (e.g. Matej et ble to make predictions, and therefore circumvent the data
al., 2007; Pool et al., 2011; Yao et al., 2015). Numerical mod- reliance inherent to the numerical models. As compared to
els solve the deterministic equations to simulate the ground- classical regression techniques, e.g. linear regression mod-
water systems based on the knowledge of the system charac-

Published by Copernicus Publications on behalf of the European Geosciences Union.


1406 Y. Sun et al.: Technical note: Application of artificial neural networks

elling, ANNs are capable of simulating the nonlinear dynam- works (FNNs) and recurrent neural networks (RNNs; Graves
ics of the hydrological processes and hence result in superior et al., 2009). In FNNs information flows from inputs to out-
modelling and forecasting performance. puts in only one direction, whereas in RNNs some of the
ANNs in recent years have also been successfully applied information can flow not only in one direction from inputs to
in groundwater table modelling. Yang et al. (1997) utilized outputs but also in the opposite direction.
ANN to predict groundwater table variations in subsurface- There are many algorithms for training neural network
drained farmland. Coulibaly et al. (2001) calibrated three models, most of which employ some form of gradient de-
different ANN models using groundwater recordings and scent using backpropagation to compute the actual gradients
other hydrometeorological data to simulate groundwater ta- (Werbos, 1974). The backpropagation algorithm is imple-
ble fluctuations. Lallahem et al. (2005) showed the feasibility mented by taking the derivatives of the cost function with re-
of using ANN to estimate groundwater level in an uncon- spect to the synaptic weights and then changing the weights
fined chalky aquifer. Daliakopoulous et al. (2005) examined in a gradient-related direction (Sexton and Dorsey, 2000;
the performance of different ANN architectures and train- Mandischer, 2002).
ing algorithms in groundwater table forecasting. Taormina This study opts for a standard FNN and a quasi-Newton
et al. (2012) developed a two-step ANN model to simulate training algorithm, more specifically a multilayer perceptron
the groundwater fluctuations in a coastal aquifer using past (MLP) trained with the Levenberg–Marquardt (LM) algo-
observed groundwater levels and external inputs, i.e. evap- rithm, attributing to its superior accuracy in groundwater ta-
otranspiration and rainfall. Most of above studies, however, ble forecasting (Daliakopoulous et al., 2005).
focus on applying ANN in large-scale semiarid or arid wa-
tersheds, where groundwater table is less variable and long- 2.2 Multilayer perceptron
term groundwater table variation (e.g. monthly, annually) is
of more concern. In addition, these studies use historical Multilayer perceptron (MLP) was developed for pattern clas-
groundwater tables as inputs to the network, requiring con- sification by Rosenblatt (1958). The architecture of a typical
tinuously long groundwater table recordings which can be a MLP consists of an input layer, one hidden layer and an out-
luxury for many regions. put layer. In mathematical terms, a computational neuron in
This study, for the first time, applies ANN to forecast the hidden or output layers can be described by following
the groundwater table in a tropical wetland – the Nee Soon pair of equations:
Swamp Forest (NSSF) in Singapore. Being nourished with
n
water supply from reservoirs and precipitation, the ground- X
u= wi x i (1)
water table in the NSSF is close to the ground level and ex- i=1
tremely sensitive to the changes in hydrometeorological con-
ditions. This study selects surrounding reservoir levels and and
rainfall as inputs to the network, avoiding the requirement on
continuously long groundwater table recordings. The fore- y = ϕ(u + b), (2)
cast is made with 3 leading times, i.e. 1 day, 3 days, and 7
days, which provide sufficient reaction time for human inter- where x1 , x2 , . . . , xn are the input signals to the neuron, w1 ,
vention to maintain favourable hydrological conditions for w2 , . . . , wn are the synaptic weights, u is the linear combiner
conserving local ecosystems. The methodology, application, of the input signals, b is the bias, and y is the output signal
results, and conclusions are elaborated in the following sec- of the neuron, whereas φ(·) is the activation function to limit
tions. the amplitude of the output signal and to create a mapping
between the input and output signals.
The universal approximation theorem states that every
2 Methodology continuous function defined on a closed and bounded set can
be approximated arbitrarily closely by an MLP provided that
2.1 Overview the number of neurons in the hidden layers is sufficiently
high and that their activation functions belong to a restricted
As defined by Haykin (1999), artificial neural networks class of functions with particular properties (Hornik et al.,
(ANNs) are massively parallel distributed processors made 1989).
up of simple processing units, known as neurons, which have
a natural propensity for storing experiential knowledge and 2.3 Levenberg–Marquardt algorithm
making it available for use. ANNs are inspired by biological
neural networks to emulate the way in which human brains The Levenberg–Marquardt (LM) algorithm, independently
function. The fact that neurons can be interconnected in nu- developed by Levenberg (1944) and Marquardt (1963), pro-
merous ways results in numerous possible topologies that can vides a numerical solution to the problem of minimizing a
be divided into two basic classes, i.e. feedforward neural net- nonlinear function. The update rule of the LM algorithm can

Hydrol. Earth Syst. Sci., 20, 1405–1412, 2016 www.hydrol-earth-syst-sci.net/20/1405/2016/


Y. Sun et al.: Technical note: Application of artificial neural networks 1407

Figure 1. Geographical location of the Nee Soon Swamp Forest in Singapore.

be presented as follows: NSSF is located in the northern part of the Singapore central
 −1 catchment nature reserve bounded by the Upper Seletar, Up-
wk = wk − JTk Jk + µk I Jk ek , (3) per Peirce, and Lower Peirce reservoirs. As the only substan-
tial freshwater swamp forest remaining on the main island of
where k is the iteration index, J is the Jacobian matrix, µ is Singapore, the NSSF houses a diversity of flora and fauna,
the combination coefficient, I is the identity matrix, and e is some of which are found nowhere else in Singapore or the
the error vector. world (Karunasingha et al., 2013).
The LM algorithm essentially blends the steepest descent With an estimated area of about 750 ha, the NSSF covers
method and the Gauss–Newton algorithm. The optimization the lower area of shallow valleys with slow-flowing streams
process is guided by the combination coefficient µ. Around and a few higher grounds with dryland forests. The eleva-
the error surface with complex curvature, the LM algorithm tion of NSSF ranges between 1 and 80 m above mean sea
switches to the steepest descent algorithm with a bigger level (a.m.s.l.). The aquifer depth in the NSSF is 20–40 m,
µ, whereas if the local curvature is appropriate to make a and the major soil type features silty sand with a hydraulic
quadratic approximation, µ can be decreased, giving the LM conductivity of 4.05 × 10−5 m s−1 . Figure 1 also depicts the
algorithm a step closer to the Gauss–Newton algorithm. The locations of the 4 piezometers installed for groundwater table
LM algorithm is faster, more stable, and less easily trapped monitoring. The piezometers are deployed near the streams,
in local minima than other algorithms (Toth et al., 2000). where the observed groundwater tables vary 0–1 m below the
ground level.

3 Application
3.2 ANN setup
3.1 Study case
The surrounding reservoirs serve as important freshwater
Figure 1 shows the geographical location of the study area storage for Singapore, with reservoir levels being kept at rel-
– the Nee Soon Swamp Forecast (NSSF) in Singapore. The atively high levels ranging from 10 to 40 m a.m.s.l. Singapore

www.hydrol-earth-syst-sci.net/20/1405/2016/ Hydrol. Earth Syst. Sci., 20, 1405–1412, 2016


1408 Y. Sun et al.: Technical note: Application of artificial neural networks

Figure 2. Observed vs. MLR- and ANN-forecasted groundwater tables (P1).

has a typical tropical rainforest climate with abundant rain- trial and error), and an output layer with 4 output neurons
fall; the annual rainfall at the NSSF region can be as high as (future observed groundwater tables at the 4 piezometers).
3000 mm. Despite being another important influential factor The logistic function and threshold function are respectively
for the groundwater, observed evapotranspiration is not avail- adopted as the activation functions for the hidden layer and
able due to the constraints imposed from setting up monitor- the output layer.
ing stations in the protected forest, and hence it is excluded Daily observed data, i.e. reservoir levels, rainfall and
in the ANN setup. Reservoir levels and rainfall, as the ma- groundwater tables, are available in 2012 and 2013. The data
jor water source and driving force, are fed to the networks as set is divided into three subsets as follows:
inputs, while the output is the observed groundwater tables
with a leading time of 1, 3, and 7 days (i.e. future observed – Training data (January 2012–December 2012)
groundwater tables after 1, 3, and 7 days). Training data are used for adjusting the synaptic weights
A multiple-input multiple-output (MIMO) network is se- in the network. An entire year’s data are selected as the
lected over 4 multiple-input single-output (MISO) ANNs for training data, so as to expose the network to a complete
two reasons: (1) it is easier to implement; and (2) cross cor- annual cycle for a robust training.
relation exists in the observed groundwater tables, e.g. the
synchronous response to dry and wet conditions; targeting – Cross-validation data (January 2013–June 2013)
the groundwater table measurements at 4 locations simulta- Cross-validation data are used for avoiding overfitting.
neously, the cross-correlation impact can be captured in the When the errors between the predicted values and de-
synaptic weights of the trained ANN and hence a better per- sired values in the cross-validation data begin to in-
formance is expected. The MIMO network is composed of an crease, the training stops and this is considered to be the
input layer with 4 input neurons (including 3 reservoir levels point of best generalization. One half of a year’s data
and one rainfall), a hidden layer with 10 neurons (inspired are selected as the cross-validation data.
by the universal approximation theorem and determined by

Hydrol. Earth Syst. Sci., 20, 1405–1412, 2016 www.hydrol-earth-syst-sci.net/20/1405/2016/


Y. Sun et al.: Technical note: Application of artificial neural networks 1409

Figure 3. Scatter plots of observed and ANN-forecasted groundwater tables (P1).

– Testing data (July 2013–December 2013) ical processes, the relationship between the input (reservoir
level, rainfall) and the output (groundwater table) is highly
Testing data are used for evaluating the performance of
nonlinear. Therefore, the MLR model is not suitable to serve
the network. Once the network is trained, the weights
our study purpose and produces inferior forecasting results,
are frozen; the testing set is fed into the network and
especially at the extreme values. In contrast, the ANN fore-
the network output is then compared with the desired
cast successfully resolves the rising and falling tendencies
output. The remaining half of a year’s data are selected
of the groundwater tables, resulting in a rather reasonable
as the testing data.
groundwater table forecast. The scatter plots of the observed
groundwater tables and the ANN forecast are presented in
Fig. 3. The response of the groundwater tables to the sys-
4 Results and discussion tem forcings, for such a confined and wet catchment, is rapid
and sensitive. The correlation fades out between the inputs
Figure 2 illustrates examples at P1 of the observed ground- and outputs when the leading time progresses; this leads to
water tables, the forecasted groundwater tables from a multi- the model performance deterioration at 3 and 7 days. The
ple linear regression (MLR) model and the ANN model. Due groundwater tables experience a drastic drop in July and Au-
to the complicated geological characteristics and hydrolog-

www.hydrol-earth-syst-sci.net/20/1405/2016/ Hydrol. Earth Syst. Sci., 20, 1405–1412, 2016


1410 Y. Sun et al.: Technical note: Application of artificial neural networks

Figure 4. Observed vs. ANN-forecasted groundwater tables (P4).

Table 1. Evaluation statistics of the ANN forecast.

P1 P2 P3 P4
RMSE r RMSE r RMSE r RMSE r
(cm) (cm) (cm) (cm)
1 day 5.4 0.88 6.4 0.78 5.2 0.77 12.2 0.69
3 day 8.2 0.76 7.1 0.76 6.6 0.71 13.3 0.68
7 day 9.9 0.64 9.2 0.72 8.6 0.67 15.8 0.65
Average 7.8 0.76 7.6 0.75 6.8 0.72 13.8 0.67

gust 2013, caused by a continuous 2-month drought. As such Table 1 summarizes the ANN forecast efficiency through
an extreme drought condition does not exist in the training evaluating the root mean square error (RMSE) and the corre-
data, the ANN tends to overpredict the groundwater tables lation coefficient (r) based on the testing data. The forecast
for that period. accuracy decreases slightly when the leading time increases
Figures 4 and 5, respectively, present the groundwater ta- due to the rapid and sensitive response of the groundwater
ble time series and scatter plots at P4. P4 is located near tables to the system forcings. The RMSE is generally within
the Upper Seletar Reservoir, and the groundwater table is af- 10 cm with the exception at P4 caused by the absence of the
fected by the spillway discharge released from the reservoir. spillway information. Averaged over the 3 leading times, at
Failing to include the spillway information makes the ANN P1–P3 the RMSE is less than 8.0 cm with correlation coef-
less competent in capturing the groundwater table extreme ficient r higher than 0.7, whereas at P4 the averaged RMSE
values caused by the spillway discharge, and hence results in and correlation coefficient r are, respectively, 13.8 cm and
the lower forecast accuracy at P4. 0.67.

Hydrol. Earth Syst. Sci., 20, 1405–1412, 2016 www.hydrol-earth-syst-sci.net/20/1405/2016/


Y. Sun et al.: Technical note: Application of artificial neural networks 1411

Figure 5. Scatter plots of observed and ANN-forecasted groundwater tables (P4).

5 Conclusions ical models. However, improvements are expected if more


variables can be involved in the training, cross-validation,
This study, for the first time, applies artificial neural networks and testing process; such variables, for example, are spill-
(ANNs) to predict the groundwater table variations in a trop- way discharge, evapotranspiration, soil properties, and water
ical wetland – the Nee Soon Swamp Forest (NSSF) in Sin- level measurements. Less data demanding, lower computa-
gapore. The ANN model solely utilizes the easily accessi- tional cost and higher site-specific forecast accuracy are the
ble surrounding reservoir levels and rainfall as inputs to fore- advantages of the ANN-based approach over the physical-
cast the groundwater tables, without requiring any other prior based numerical models. Numerical models, however, can be
knowledge of the system’s physical properties. The ANN applied to describe the spatiotemporal variations of the sys-
forecast shows generally promising accuracy, while its per- tem process over the entire model domain provided with suf-
formance decreases when the leading time progresses due to ficient information of the model inputs. Therefore, the ANN
the fading correlation between the network inputs and out- and numerical model can act as natural complements in such
puts. a way that ANN is more suitable for site-specific forecast
In this study, surrounding reservoir levels and rainfall are while the numerical model provides a better spatial cover-
selected as ANN inputs. The limited number of inputs elimi- age.
nates the data-demanding restrictions inherent in the numer-

www.hydrol-earth-syst-sci.net/20/1405/2016/ Hydrol. Earth Syst. Sci., 20, 1405–1412, 2016


1412 Y. Sun et al.: Technical note: Application of artificial neural networks

Acknowledgements. This study forms part of the research project Maier, H. R. and Dandy, G. C.: Neural networks for the prediction
“Nee Soon Swamp Forest Biodiversity and Hydrology Baseline and forecasting of water resources variables: a review of mod-
Studies – Phase 2” funded by National Parks Board (NParks), eling issues and applications, Environ. Modell. Softw., 15, 101–
Singapore. The authors are also grateful for the data support from 124, 2000.
Public Utilities Board (PUB), Singapore for making this study Mandischer, M.: A comparison of evolution strategies and back-
possible. propagation for neural network training, Neurocomputing, 42,
87–117, 2002.
Edited by: D. Solomatine Marquardt, D.: An algorithm for least-squares estimation of nonlin-
ear parameters, SIAM J. Appl. Math., 11, 431–441, 1963.
Matej, G., Isabelle, W., and Jan, M.: Regional groundwater model
of north-east Belgium, J. Hydrol., 335, 133–139, 2007.
References Pool, D. R., Blasch, K. W., Callegary, J. B., Leake, S. A.,
and Graser, L. F.: Regional Groundwater-Flow Model of the
Coulibaly, P., Anctil, F., Aravena, R., and Bobee, B.: Artificial neu- Redwall-Muav, Coconino, and Alluvial Basin Aquifer Systems
ral network modeling of water table depth fluctuations, Water of Northern and Central Arizona: USGS Scientific Investigation
Resour. Res., 37, 885–896, 2001. Report 2010-5180, v. 1.1, 101, 2011.
Daliakopoulosa, I. N., Coulibaly, P., and Tsanis, I. K.: Groundwater Rosenblatt, F.: The perceptron: a probabilistic model for informa-
level forecasting using artificial neural networks, J. Hydrol., 309, tion storage and organization in the brain, Psychol. Rev., 65, 386–
229–240, 2005. 408, 1958.
French, M. N., Krajewski, W. F., and Cuykendall, R. R.: Rainfall Sexton, R. S. and Dorsey, E. D.: Reliable classification using neural
forecasting in space and time using a neural network, J. Hydrol., networks: a Genetic Algorithm and backpropagation compari-
137, 1–31, 1992. son, Decis. Support Syst., 30, 11–22, 2000.
Graves, A., Liwicki, M., Fernández, S., Bertolami, R., Bunke, H., Sun, Y., Babovic, V., and Chan, E. S.: Multi-step-ahead model er-
and Schmidhuber, J.: A novel connectionist system for uncon- ror prediction using time-delay neural networks combined with
strained handwriting recognition, IEEE T. Pattern Anal., 31, chaos theory, J. Hydrol., 395, 109–116, 2010.
855–868, 2009. Taormina, R., Chau, K.-W., and Sethi, R.: Artificial neural network
Haykin, S.: Neural Networks: A Comprehensive Foundation, Pren- simulation of hourly groundwater levels in a coastal aquifer sys-
tice Hall, New Jersey, 1999. tem of the Venice lagoon, Eng. Appl. Artif. Intel., 25, 1670–
Hornik, K., Stinchcombe, M., and White, M.: Multilayer feedfor- 1676, 2012.
ward networks are universal approximators, Neural Networks 2, Toth, E., Brath, A., and Montanari, A.: Comparison of short-term
359–366, 1989. rainfall prediction models for real-time flood forecasting, J. Hy-
Karunasingha, D. S. K., Chui, T. F. M., and Liong, S. Y.: An ap- drol., 239, 132–147, 2000.
proach for modelling the effects of changes in hydrological envi- Werbos, P. J.: Beyond Regression: New Tools for Prediction and
ronmental variables on tropical primary forest vegetation, J. Hy- Analysis in the Behavioral Sciences, PhD Thesis, Harvard Uni-
drol., 505, 102–112, 2013. versity, Cambridge, MA, 1974.
Lallahem, S., Mania, J., Hani, A., and Najjar, Y.: On the use of neu- Yang, C. C., Prasher, S. O., Lacroix, R., Sreekanth, S., Patni, N. K.,
ral networks to evaluate groundwater levels in fractured media, and Masse, L.: Artificial neural network model for subsurface-
J. Hydrol., 307, 92–111, 2005. drained farmlands, J. Irrig. Drain. E.-ASCE, 123, 285–292, 1997.
Levenberg, K.: A method for the solution of certain problems in Yao, Y., Zheng, C., Liu, J., Cao, G., Xiao, H., Li, H., and Li, W.:
least squares, Q. Appl. Math., 5, 164–168, 1944. Conceptual and numerical models for groundwater flow in an
arid inland river basin, Hydrol. Proc., 29, 1480–1492, 2015.

Hydrol. Earth Syst. Sci., 20, 1405–1412, 2016 www.hydrol-earth-syst-sci.net/20/1405/2016/

You might also like