Downloaded by [NWFP University of Engineering & Technology - Peshawar] at 00:35 20 June 2014
Int. J. Remote Sensing, 1997, vol. 18, no. 4, 699–709
Introduction
Abstract. Over the past decade there have been considerable increases in both the quantity of remotely sensed data available and the use of neural networks. These increases have largely taken place in parallel, and it is only recently that several researchers have begun to apply neural networks to remotely sensed data. This paper introduces this special issue, which is concerned specifically with the use of neural networks in remote sensing. The feed-forward back-propagation multi-layer perceptron (MLP) is the type of neural network most commonly encountered in remote sensing and is used in many of the papers in this special issue. The basic structure of the MLP algorithm is described in some detail, and some other types of neural network are mentioned. The most common applications of neural networks in remote sensing are considered, particularly those concerned with the classification of land and clouds, and recent developments in these areas are described. Finally, the application of neural networks to multi-source data and fuzzy classification is considered.
1. Background
The current generation of Earth observation sensors is producing data with great potential for use in scientific and technological investigations, in very large and ever-increasing quantities. Whilst such data provide a considerable resource with which to address many fundamental environmental issues, they also present new challenges of data processing and data interpretation. These challenges must be tackled if the full potential of the data is to be realised. Not only is this necessary for efficient use of the present data, it also provides an important constraint on the need for, and an influence on the design of, instruments proposed for future sensor platforms. It is in the context of these requirements that artificial neural networks (ANNs) are currently being applied in a wide variety of remote sensing applications. Good introductions to neural networks are provided in texts such as Kohonen (1988), Beale and Jackson (1990), Simpson (1990), Bishop (1995) and Aleksander and Morton (1991).
The use of artificial neural networks for remote sensing data interpretation has been motivated by the realisation that the human brain is very efficient at processing vast quantities of data from a variety of different sources. Neurons in the human brain receive inputs from other neurons and produce an output (if the sum of the inputs is above a certain threshold) which is then passed to other neurons. For some time it has been recognized that a mathematical approach based on the actions of
0143–1161/97 $12.00 © 1997 Taylor & Francis Ltd
700 P. M. Atkinson and A. R. L. Tatnall
biological neurons may be implemented to process and interpret many different types of digital data. While it is not possible or desirable to reproduce the complexity of the human brain on a computer, artificial neural networks that are based on an architecture of simple processing elements like neurons are proving successful for a wide range of applications, including processing and interpreting remotely sensed data.
In the above sense, neural networks are an artificial intelligence (AI) technique and, therefore, come from the same family as expert systems and knowledge-based approaches to learning (Key et al. 1989). However, whereas expert systems are based on symbolic representation and, therefore, incorporate qualitative data into the estimation through prior programming of the learning algorithm, neural networks employ a connectionist approach in which computer code is required only to run the network (Hepner et al. 1990).
Neural networks, in the simplest sense, may be seen as data transformers (Pao 1989), where the objective is to associate the elements in one set of data with the elements in a second set. When applied to classification, for example, they are concerned with the transformation of data from feature space to class space. Neural networks, therefore, belong to the same class of techniques as automated pattern recognition (Ritter et al. 1988), regression, and spectral (and textural) classification. Given the importance of these techniques, it is not surprising that neural networks are finding increasing application in remote sensing.
The rapid uptake of neural approaches in remote sensing is due mainly to their widely demonstrated ability to:
(i) perform more accurately than other techniques such as statistical classifiers, particularly when the feature space is complex and the source data have different statistical distributions (Benediktsson et al. 1990, 1993, Schalkoff 1992);
(ii) perform more rapidly than other techniques such as statistical classifiers (Bankert 1994, Côté and Tatnall 1995);
(iii) incorporate a priori knowledge and realistic physical constraints into the analysis (Brown and Harris 1994, Foody 1995a, b);
(iv) incorporate different types of data (including those from different sensors) into the analysis, thus facilitating synergistic studies (Benediktsson et al. 1993, Benediktsson and Sveinsson 1997).
Given this list of benefits, it is clear that one of the main opportunities offered by neural networks is to allow the efficient handling of the large quantities of remotely sensed data which are currently being produced.
In the remainder of this paper, one of the main differences between statistical and neural approaches is discussed, a common type of neural network employed in remote sensing is introduced, and several applications of neural networks in remote sensing are considered. The objective is not to review neural networks in remote sensing, but rather to provide a context for the papers in this special issue. A useful review may be found in Paola and Schowengerdt (1995), while some general guidance is provided in Kanellopoulos and Wilkinson (1997).
classification system (for example, into forest). Consequently, statistical approaches may be seen as restrictive because of the underlying assumptions of the model. A further problem with statistical approaches is that they require non-singular (invertible) class-specific covariance matrices (Benediktsson et al. 1993).
Neural networks applied for supervised classification are similar to the K-nearest neighbour algorithms, although neural networks are more efficient and require less data for training (Lee et al. 1990). One of the main advantages of neural networks for classification is that they are distribution-free; that is, no underlying model is assumed for the multivariate distribution of the class-specific data in feature space. It is, therefore, possible for a single class to be represented in feature space as a series of clusters (rather than a single cluster). A fundamental difference between statistical and neural approaches to classification is, therefore, that statistical approaches depend on an assumed model, whereas neural approaches depend on data. It is for this reason that neural networks are suitable for integrating data from different sources. More recent work has concentrated on the incorporation of additional knowledge into the neural network (for example, Foody 1995b).
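The covariance requirement mentioned above can be seen in a small numerical sketch. The two-band samples here are invented purely for illustration: when one band duplicates another, the class-specific covariance matrix is singular, so a Gaussian maximum-likelihood classifier cannot evaluate its density, whereas a distribution-free approach imposes no such condition.

```python
import numpy as np

# Hypothetical two-band training samples in which band 2 duplicates band 1,
# e.g. two perfectly correlated spectral channels for one land-cover class.
band1 = np.array([0.30, 0.32, 0.35, 0.31, 0.33])
samples = np.stack([band1, band1], axis=1)   # band 2 == band 1

cov = np.cov(samples, rowvar=False)          # class-specific covariance matrix
det = np.linalg.det(cov)

# A Gaussian maximum-likelihood classifier needs cov to be invertible;
# a zero determinant (singular matrix) leaves the density undefined.
singular = np.isclose(det, 0.0)
print(f"determinant = {det:.2e}, singular = {singular}")
```

In practice the same failure occurs whenever two inputs are (near-)linear combinations of each other, which is common with highly correlated spectral bands.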
associated with the connection. The receiving node sums the weighted signals from all nodes to which it is connected in the preceding layer. Formally, the input that a single node receives is weighted according to:

net_j = \sum_i v_{ji} o_i    (1)

where v_{ji} represents the weight between node i and node j, and o_i is the output from node i. The output from a given node j is then computed from:

o_j = f(net_j)    (2)

The function f is usually a non-linear sigmoid function that is applied to the weighted sum of inputs before the signal passes to the next layer. When the signal reaches the output layer it forms the network output. In traditional hard classification (where pixels are assigned to a single class only), the output of one node (that of the chosen class) is set to one, while all other nodes in the output layer are set to zero.
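Equations (1) and (2) amount to a matrix-vector product followed by an element-wise non-linearity, and can be sketched in a few lines. The layer sizes, random weights, and input vector below are arbitrary illustrations, not values from any paper in this issue.

```python
import numpy as np

def sigmoid(x):
    # A common choice for the non-linear function f in equation (2)
    return 1.0 / (1.0 + np.exp(-x))

def feed_forward(o_prev, v):
    """One layer of an MLP: weight, sum, and transform the incoming signals.

    o_prev : outputs o_i of the preceding layer, shape (n_in,)
    v      : weights v_ji between the layers, shape (n_out, n_in)
    """
    net = v @ o_prev         # net_j = sum_i v_ji * o_i   -- equation (1)
    return sigmoid(net)      # o_j = f(net_j)             -- equation (2)

# Illustrative pass: 3 input features -> 4 hidden nodes -> 2 output nodes
rng = np.random.default_rng(0)
v_hidden = rng.normal(size=(4, 3))
v_output = rng.normal(size=(2, 4))
o_input = np.array([0.2, 0.7, 0.1])
o_hidden = feed_forward(o_input, v_hidden)
o_out = feed_forward(o_hidden, v_output)
```

Because the sigmoid output lies in (0, 1), the final activations can be hardened by picking the largest, as in the hard classification described above.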
3.4. Training
The aim of network training is to build a model of the data-generating process so that the network can generalize and predict outputs from inputs that it has not seen before. For the MLP a training pattern is presented to the network and the signals are fed forwards as described above. Then, the network output is compared with the desired output (a set of training data, for example, of known classes) and the error computed. This error is then back-propagated through the network and, generally, the weights of the connections (which are usually set randomly at the start) are altered according to what is known as the generalized delta rule (Rumelhart et al. 1986):

\Delta v_{ji}(n + 1) = \eta (\delta_j o_i) + \alpha \Delta v_{ji}(n)    (3)

where \eta is the learning rate parameter, \delta_j is an index of the rate of change of the error, and \alpha is the momentum parameter. This process of feeding forward signals and back-propagating the error is repeated iteratively until the error of the network as a whole is minimized or reaches an acceptable magnitude. It is through the successive modification of the (adaptive) weights that the neural network is able to learn.
3.5. Generalizing
Several factors affect the capabilities of the neural network to generalize, that is, the ability of the neural network to interpolate and extrapolate to data that it has not seen before. These include:
(i) Number of nodes and architecture
If a large number of simple processing elements is used, the mathematical structure can be made very flexible and the neural network can be used for a wide range of applications. This may not be necessary for all applications. For example, very simple topologies using a small number of data points have been investigated (Yahn and Simpson 1995). In general terms, the larger the number of nodes in the hidden layer(s), the better the neural network is able to represent the training data, but at the expense of the ability to generalize.
(ii) Size of training set
The data set used must be representative of the entire distribution of values likely to be associated with a particular class. If the extent of the distribution of the data
in feature space is not covered adequately, the network may fail to classify new data accurately. A consequence of this for the MLP algorithm is that large quantities of data are often required for training, and researchers are often concerned with finding the minimum size of data set necessary (for example, Hepner et al. 1990).
The requirement for large training data sets also means that training times may be long. To speed up the training process, several modifications to the MLP algorithm have been introduced, including the momentum term (see above), the delta-bar-delta rule, and optimization procedures (Benediktsson et al. 1993). Paola and
4.1. Land
4.2. Clouds
A similar approach has been taken to evaluate neural networks for identifying and classifying clouds. For example, Lee et al. (1990) used a feed-forward back-propagation neural network to classify different categories of clouds, including cirrus, stratocumulus, and cumulus, from Landsat MSS data. The results showed a significant improvement over classical methods and gave an overall accuracy of 93 per cent. Several different types of neural network were compared by Welch et al. (1992) for classifying cloud data. They found the feed-forward back-propagation neural network to be accurate, but slow to train compared to other types of network, particularly the probabilistic neural network. This network makes use of Bayesian classification to assign each pixel to the class with the largest value of the posterior class probability density function. The probabilistic type of network was successfully used by Bankert (1994) to classify clouds in maritime regions. In these studies, one goal is to automate the handling of satellite sensor imagery, and there is growing recognition that use must be made of cloud shape, size, texture, and context (Pankiewicz 1995, Lewis et al. 1997). Another goal is that of cloud screening to assist other work (Yahn and Simpson 1995, Logar et al. 1997).
distinct (or hard). In neurofuzzy techniques (Brown and Harris 1994), the power of neural networks is combined with fuzzy logic to enable fuzzy rules to be incorporated into the classification and to enable the intrinsic uncertainty in classification to be represented and minimized. A common problem with classification in remote sensing is that many observed pixels represent a mixture of classes. Methods of fuzzy classification for dealing with sub-pixel mixing, including the use of neural networks, are described and compared in Atkinson et al. (1997) and Foody et al. (1997).
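The basic idea of soft (fuzzy) class membership can be sketched by normalizing a network's output activations so that a mixed pixel receives a graded membership in every class, rather than a single hard label. The activation values and class names below are invented for illustration, and the mixture-modelling methods of the papers cited above differ in detail.

```python
import numpy as np

def softmax(z):
    # Normalize activations so the memberships are positive and sum to one
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical output-layer activations for one mixed pixel and three
# illustrative classes (e.g. forest, water, urban)
activations = np.array([2.0, 1.5, 0.2])
memberships = softmax(activations)

# Traditional hard classification keeps only the strongest response;
# the fuzzy reading keeps the whole membership vector
hard_class = int(np.argmax(memberships))
```

Read this way, the membership vector conveys the ambiguity of a mixed pixel that hardening to a single class discards.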
the incorporation of a priori knowledge and data from different sources into the estimation. Neural networks are finding use in a wide range of applications in remote sensing, and new applications are being proposed frequently. In this special issue, some of the applications identified above are investigated further, and several new applications are reported.
References
Fiset, R. and Cavayas, F., 1997, Automatic comparison of a topographic map with remotely sensed images in a map updating perspective: the road network case. International Journal of Remote Sensing, 18, 991–1006 (this issue).
Fisher, P. F. and Pathirana, S., 1990, The evaluation of fuzzy membership of land cover classes in the suburban zone. Remote Sensing of Environment, 34, 121–132.
Foody, G. M., 1995a, Using prior knowledge in artificial neural network classification with a minimal training set. International Journal of Remote Sensing, 16, 301–312.
Foody, G. M., 1995b, Land cover classification using an artificial neural network with ancillary information. International Journal of Geographical Information Systems, 9, 527–542.
Foody, G. M. and Arora, M. K., 1997, An evaluation of some factors affecting the accuracy of classification by an artificial neural network. International Journal of Remote Sensing, 18, 799–810 (this issue).
Foody, G. M., Lucas, R. M., Curran, P. J. and Honzak, M., 1997, Non-linear mixture modelling without end-members using an artificial neural network. International Journal of Remote Sensing, 18, 937–953 (this issue).
Gopal, S. and Woodcock, C., 1994, Theory and methods for accuracy assessment of thematic maps using fuzzy sets. Photogrammetric Engineering and Remote Sensing, 60, 181–188.
Heerman, P. D. and Khazenie, N., 1992, Classification of multi-spectral remote sensing data using a back-propagation neural network. I.E.E.E. Transactions on Geoscience and Remote Sensing, 30, 81–88.
Hepner, G. F., Logan, T., Ritter, N. and Bryant, N., 1990, Artificial neural network classification using a minimal training set: comparison to conventional supervised classification. Photogrammetric Engineering and Remote Sensing, 56, 469–473.
Hopfield, J. J. and Tank, D. W., 1985, Neural computation of decisions in optimization problems. Biological Cybernetics, 52, 141–152.
Howald, K. J., 1989, Neural network image classification. Proceedings of the ASPRS-ACSM Fall Convention (Falls Church, VA: American Society for Photogrammetry and Remote Sensing), pp. 207–215.
Ito, Y. and Omatu, S., 1997, Category classification using a self-organizing neural network. International Journal of Remote Sensing, 18, 829–845 (this issue).
Jin, Y.-Q. and Liu, C., 1997, Biomass retrieval from high-dimensional active/passive remote sensing data by using an artificial neural network. International Journal of Remote Sensing, 18, 971–979 (this issue).
Kaminsky, E. J., Barad, H. and Brown, W., 1997, Textural neural network and version space classifiers for remote sensing. International Journal of Remote Sensing, 18, 741–762 (this issue).
Kanellopoulos, I. and Wilkinson, G. G., 1997, Strategies and best practice for neural network image classification. International Journal of Remote Sensing, 18, 711–725 (this issue).
Kanellopoulos, I., Varfis, A., Wilkinson, G. G. and Mégier, J., 1992, Land-cover discrimination in SPOT HRV imagery using an artificial neural network: a 20-class experiment. International Journal of Remote Sensing, 13, 917–924.
Key, J., Maslanik, J. A. and Schweiger, A. J., 1989, Classification of merged AVHRR and SMMR Arctic data with neural networks. Photogrammetric Engineering and Remote Sensing, 55, 1331–1338.
Kohonen, T., 1984, Self Organization and Associative Memory (Berlin: Springer-Verlag).
Kohonen, T., 1988, An introduction to neural computing. Neural Networks, 1, 3–16.
Lee, J., Weger, R. C., Sengupta, S. K. and Welch, R. M., 1990, A neural network approach to cloud classification. I.E.E.E. Transactions on Geoscience and Remote Sensing, 28, 846–855.
Lee, J. J., Shim, J. C. and Ha, Y. H., 1994, Stereo correspondence using the Hopfield neural network of a new energy function. Pattern Recognition, 27, 1513–1522.
Lewis, H. G., Côté, S. and Tatnall, A. R. L., 1997, Determination of spatial and temporal characteristics as an aid to neural network cloud classification. International Journal of Remote Sensing, 18, 899–915 (this issue).
Lippmann, R. P., 1987, An introduction to computing with neural nets. I.E.E.E. ASSP Magazine, 2, 4–22.
Logar, A., Corwin, E., Alexander, J., Lloyd, D., Berendes, T. and Welch, R., 1997, A
Nasrabadi, N. H. and Choo, C. Y., 1992, Hopfield network for stereo vision correspondence. I.E.E.E. Transactions on Neural Networks, 3, 5–13.
Pankiewicz, G., 1995, Pattern recognition techniques for the identification of cloud and cloud systems. Journal of Applied Meteorology, 2, 257–271.
Pankiewicz, G., 1997, Neural network classification of convective air masses for a flood forecasting system. International Journal of Remote Sensing, 18, 887–898 (this issue).
Pao, Y.-H., 1989, Adaptive Pattern Recognition and Neural Networks (Reading, MA: Addison-Wesley).
Paola, J. D. and Schowengerdt, R. A., 1995, A detailed comparison of back-propagation neural network and maximum-likelihood classifiers for urban and land use classification. I.E.E.E. Transactions on Geoscience and Remote Sensing, 33, 981–996.
Peddle, D. R., Foody, G. M., Zhang, A., Franklin, S. E. and Ledrew, E. F., 1994, Multisource image classification II: an empirical comparison of evidential reasoning, linear discriminant analysis, and maximum likelihood algorithms for alpine land cover classification. Canadian Journal of Remote Sensing, 20, 397–408.
Ritter, N. D., Logan, T. L. and Bryant, N. A., 1988, Integration of neural network technologies with geographic information systems. Proceedings of the GIS Symposium: Integrating Technology and Geoscience Applications, Denver, Colorado, pp. 102–103.
Rosenblatt, F., 1958, The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review, 65, 386–408.
Rumelhart, D. E., Hinton, G. E. and Williams, R. J., 1986, Learning internal representations by error propagation. In Parallel Distributed Processing: Explorations in the Microstructures of Cognition, vol. 1, edited by D. E. Rumelhart and J. L. McClelland (Cambridge, MA: MIT Press), pp. 318–362.
Schalkoff, R., 1992, Pattern Recognition: Statistical, Structural, and Neural Approaches (New York: Wiley).
Schweiger, A. and Key, J., 1997, Estimating surface radiation fluxes in the Arctic from TOVS HIRS and MSU brightness temperatures. International Journal of Remote Sensing, 18, 955–970 (this issue).
Simpson, P. K., 1990, Artificial Neural Systems (Oxford: Pergamon Press).
Thomas, I. L., Benning, V. M. and Ching, N. P., 1987, Classification of Remotely Sensed Images (Bristol: Adam Hilger).
Wang, Y. and Dong, D., 1997, Retrieving forest stand parameters from SAR backscatter data using a neural network trained by a canopy backscatter model. International Journal of Remote Sensing, 18, 981–989 (this issue).
Welch, R. M., Sengupta, S. K., Goroch, A. K., Rabindra, P., Rangaraj, N. and Navar, M. S., 1992, Polar cloud and surface classification using AVHRR imagery: an intercomparison of methods. Journal of Applied Meteorology, 31, 405–420.
Widrow, B. and Hoff, M. E., 1960, Adaptive switching circuits. IRE WESCON Convention Record, 4, 96–104.
Wilkinson, G. G., 1993, The generalization of satellite-derived thematic maps for GIS input. Geo-Informations-Systeme, 6, 24–29.
Wilkinson, G. G., Fierens, F. and Kanellopoulos, I., 1995, Integration of neural and statistical approaches in spatial data classification. Geographical Systems, 2, 1–20.
Yahn, R. S. and Simpson, J. J., 1995, Applications of neural networks to cloud segmentation. I.E.E.E. Transactions on Geoscience and Remote Sensing, 33, 590–603.