
Elliott Wave Theory and neuro-fuzzy systems, in stock market prediction:

The WASP system


George S. Atsalakis, Emmanouil M. Dimitrakakis, Constantinos D. Zopounidis


Department of Production Engineering and Management, Technical University of Crete, Chania 73100, Greece
Article info
Keywords:
Neurofuzzy forecasting
Stock market forecasting
Price trend forecasting
Elliott waves
Technical analysis
ANFIS
Abstract
This paper presents the WASP (Wave Analysis Stock Prediction) system, a system based on the neuro-
fuzzy architecture, which utilizes aspects from the Elliott Wave Theory, presented by Ralph Nelson Elliott.
This theory has been found to be extremely useful and accurate, particularly in problems of forecasting. A
neuro-fuzzy logic technique has been used to forecast the trend of the stock prices and the results derived
are very encouraging.
© 2011 Elsevier Ltd. All rights reserved.
1. Introduction
Finding a reliable forecasting method has always been the goal
of many investors. The possibility of easy profits from forecasting the
market has been the underlying force that motivates many
researchers to invent new methodologies and models. Forecasts
have traditionally been made with the help of statistical models,
technical analysis, econometric methods, and others. Recently, artificial
intelligence has been found to provide significant results for such
problems.
2. Neurofuzzy systems and stock forecasting
Considerable effort has gone into applying neuro-fuzzy systems
to stock forecasting. Abraham (2004) presented various
neuro-fuzzy modeling techniques. Wong, Wang, Goh, and Quek
(1992) proposed a neuro-fuzzy system to forecast annual returns.
The inputs they used were the beta coefficient, a three-year moving
average, Tobin's q, and others. The results were satisfactory.
Nishina and Hagiwara (1997) proposed a model that aimed to forecast
the exact return of a stock. Their model achieved better results
than a neural network. Rast (1999) compared a neuro-fuzzy system
with a neural network for the years 1987 and 1988. His system
was trained with the DAX index. He found that the neuro-fuzzy
system outperformed the neural network. Siekmann, Kruse,
Gebhardt, Van Overbeek, and Cooke (2001) proposed a neuro-fuzzy
model to forecast the DAX index. Their model outperformed a linear
model, using the hit rate as a measure of success. Abraham, Nath,
and Mahanti (2001) combined a neural network with a neuro-fuzzy
system to forecast the next day's return of the NASDAQ
index. They used the neuro-fuzzy system to evaluate the neural
network's result. Their overall results were satisfactory. Wu, Fung, and
Flitman (2001) proposed a Feed Forward Neuro Fuzzy (FFNF) system
to forecast the monthly trend of the S&P500 index. As inputs
they used previous returns, as well as various economic indices such as
the unemployment rate, lending rates and others. Atsalakis and
Valavanis (2009) proposed a system based on an inverted neuro-fuzzy
controller. As inputs they used returns. The model achieved
one of the highest percentages of correct forecasts over various
periods for various stocks. Ghandar, Schmidt, To, and Zurbruegg
(2007) used a neuro-fuzzy technique to select stocks, based on
buy and sell signals of various technical analysis indexes, such as the
moving average, the double moving average and the On Balance
Volume index. The model assigns every stock a score. According
to this score, the system chooses the stocks that form the portfolio.
The results were very interesting. Bekiros (2007) presents the
results of an ANFIS system that forecasts the next day's rate of change
of the NIKKEI index. As inputs, both returns and their
lags were used. The system was compared with an ARMA model
and a neural network. The neuro-fuzzy system outperformed both in
bear markets and in bull markets. Pokropinska and Scherer
(2008) proposed a neuro-fuzzy system based on the Mamdani
architecture to forecast buy and sell signals for the Warsaw stock
market. The inputs used were closing prices, opening prices,
maximum and minimum, the difference between maximum and
minimum, volume, as well as a combination of moving averages
and other technical analysis indexes and oscillators. The system
gave correct signals for the testing period. More stock market
forecasting techniques have been presented by Atsalakis et al. (2001),
Atsalakis and Ucenic (2005), Atsalakis, Skiadas, and Braimis (2007),
Atsalakis and Nezis (2008), Atsalakis and Valavanis (2008, 2010a,
2010b), and Atsalakis and Zopounidis (2009).
0957-4174/$ - see front matter © 2011 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2011.01.068

Corresponding author.
E-mail addresses: atsalak@otenet.gr (G.S. Atsalakis), mdimitrakakis@gmail.com
(E.M. Dimitrakakis), kostas@dpem.tuc.gr (C.D. Zopounidis).
Expert Systems with Applications 38 (2011) 9196–9206
2.1. Elliott Wave Theory
The Elliott Wave Theory was introduced by Ralph Nelson Elliott
during the 1930s. Elliott believed that stock trends follow a
repeating pattern, which can be forecast in both the long and the
short term. He published his ideas in his book The Wave
Principle in 1938. Using stock data, he concluded that what
seems to be a chaotic movement actually outlines a harmony
found in nature. Elliott's discovery was based entirely on observation,
but he tried to explain his findings using psychological reasons.
The main principle of his theory is that a pattern consists of
eight waves, as can be seen in Diagram 1.
Waves 1, 3 and 5 follow the overall trend, while waves 2 and 4
correct it. Waves a, b and c together correct the overall trend:
waves a and c follow the correction, and wave b resists it.
Elliott observed that each wave consists of
smaller waves that follow exactly the same pattern, as shown
in Diagram 2, thereby forming a super-cycle.
The numbers in the diagram represent the number of waves
when counted at different scales. For example, the whole diagram
represents two big waves, the impulse and the correction. The
impulse consists of 5 waves, and the correction of 3. The 5 waves of
the impulse consist of 21 sub-waves which, in turn, consist of 89
smaller waves, while the correction wave consists of 13 sub-waves
which, in turn, consist of 55 even smaller waves. As can be
observed, all the above numbers are part of the Fibonacci series, a
series which can be found in many different areas in nature.
According to Prechter and Frost (1998),
when Elliott first expressed his theory he was not aware of the
Fibonacci series. What is even more remarkable is the fact that
some ratios related to the Fibonacci series can be observed
in many stock movements and charts, as will be presented
later.
Elliott believed that there are nine cycles of different durations,
each larger cycle being formed by the smaller ones. From the largest to
the smallest, these are: (1) Grand Super-Cycle, (2) Super-Cycle,
(3) Cycle, (4) Primary, (5) Intermediate, (6) Minor, (7)
Minute, (8) Minuette and (9) Subminuette.
The duration of the cycles varies from minutes to decades. Each
pattern (cycle) is outlined by the following rules:
(1) The second wave cannot be longer than the first wave, and
cannot return to a lower price than that set at the beginning
of the first wave.
(2) The third wave is never the smallest wave compared to the
first and the fifth.
(3) The fourth wave does not return to a lower price than the
price found at the end of the first wave. The same applies for
wave A.
(4) Usually, the third wave shows a greater dynamic, except in
some cases where the fifth wave is extended (the case when
the fifth wave is made up of five smaller waves).
(5) The fifth wave usually leads to a higher point than the third.
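Rules (1) to (3) above are strict enough to be checked mechanically once the wave boundaries are known. The following is a minimal sketch for an upward impulse, assuming the six turning points have already been identified; the function name and the input format are hypothetical and are not part of the original text.

```python
def check_impulse_rules(pivots):
    """Check rules (1)-(3) for an upward five-wave impulse.

    pivots: six prices [p0, p1, p2, p3, p4, p5], where p0 is the start of
    wave 1 and p1..p5 are the ends of waves 1..5 (assumed input format).
    """
    p0, p1, p2, p3, p4, p5 = pivots
    wave1 = p1 - p0   # length of wave 1 (up)
    wave2 = p1 - p2   # length of wave 2 (down)
    wave3 = p3 - p2   # length of wave 3 (up)
    wave5 = p5 - p4   # length of wave 5 (up)
    return {
        # Rule (1): wave 2 is not longer than wave 1 and stays above the start of wave 1.
        "rule1": wave2 <= wave1 and p2 >= p0,
        # Rule (2): wave 3 is not the smallest of waves 1, 3 and 5.
        "rule2": not (wave3 < wave1 and wave3 < wave5),
        # Rule (3): wave 4 does not fall below the end of wave 1.
        "rule3": p4 >= p1,
    }


valid = check_impulse_rules([10, 14, 12, 20, 16, 22])
overlap = check_impulse_rules([10, 14, 12, 20, 13, 22])
```

Rules (4) and (5) are stated as tendencies ("usually") rather than constraints, so they are left out of the check.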
2.1.1. Explaining the wave behavior
The first wave is the beginning of a new impulse. It is difficult
to differentiate it from a correction of a previous downtrend,
and therefore it is not a powerful wave. Most investors prefer to
wait for better timing. The force behind the wave pattern is the
number of investors that decide to enter and exit the market at a
given time. After some initial winnings, investors decide to exit
the market as the price becomes higher and the stock becomes
overpriced for these few investors. This behavior explains the second
wave. As the price begins falling, the stock becomes more
attractive to a greater number of investors, who regret having
missed the first wave.
Demand begins rising, which pushes the price up. More and more
investors are determined to enter the market, creating a powerful,
fast-paced wave (the third wave), which in turn attracts even more
investors to enter the market at a higher price. Those who entered at
the beginning of the wave are satisfied with their winnings and have
most likely exited the market. Investors realize that the price has
reached a level that makes it difficult to attract any further investors.
Demand begins falling, which leads to the fourth wave. Major
investors are out of the market, waiting for the end of the fourth
wave to enter again and reap the profits of the fifth wave. It is
important to note that the fourth and fifth waves are the easiest
ones to follow, as they come after the third wave, which is the easiest
to spot due to its length, power, and speed. Major investors
have bought stocks at lower prices from investors who had bought
them towards the end of the third wave and who feared the price might
go lower. However, as the major investors enter the market again,
they create a small hype, the fifth wave, smaller than the third
wave, which usually reaches the peak of the third wave and sometimes
goes even higher. Investors who know the market know that the
price is extremely overrated and have therefore exited the market.
Wave A is a corrective wave, which is often mistaken for a second
wave. This explains wave B. Smaller investors think that wave A
corrected the price enough for it to lead to an upward trend.
Unfortunately, this is the wave where most smaller and occasional
investors lose huge amounts of money, as wave C starts, pushing
the price lower until it becomes underrated again and a new pattern
can start.
The above explanation is by no means a statistical explanation
of the wave behavior, but it illustrates the difference between major
and occasional investors and their knowledge of the market. It is
very important to know the exact wave patterns, otherwise it is
very easy to misinterpret signs. It is important to note that the
above explanation concerns an overall impulse trend. The opposite
would happen in the case of an overall correction.
2.1.2. The Fibonacci series
As previously mentioned, the Elliott wave principle is accidentally
(according to Elliott) connected with the Fibonacci sequence.
The Fibonacci sequence is a sequence of numbers in which each number
is the sum of the previous two (1, 1, 2, 3, 5, 8, 13, 21, 34, 55,
89, 144, ...). One can observe that all these numbers are also the
numbers of waves, depending on the size of the Elliott wave patterns.
What is remarkable about this sequence is that dividing a number
of the sequence by its predecessor gives, apart from the first few
terms, approximately the same result, the number
1.618, which is called Phi (the Greek letter φ). This number is also
called the golden ratio, or the golden number. Dividing any
number of the sequence by its successor gives approximately
0.618, another important number, as will be seen later. There is
historical evidence that this number was known to ancient
civilizations, as such ratios are found in the Egyptian Pyramids and in
Ancient Greek architecture (the Parthenon). This number is also found in
nature, microcosmic and macrocosmic. It can be found in human
DNA architecture, in spiral galaxies, and in planet movements. It
is not surprising that such ratios are also found in stock market
movements. In a later book, Nature's Law, Elliott states that
"All human activities have three distinctive features, pattern, time and
ratio, all of which observe the Fibonacci summation series".
Diagram 1. Basic Elliott wave pattern. Source: Prechter and Frost (1998).
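The convergence of the two ratios is easy to verify numerically. The short sketch below is illustrative only (the helper name `fibonacci` is an assumption); it builds the sequence quoted above and computes both ratios for consecutive terms.

```python
def fibonacci(n):
    """Return the first n Fibonacci numbers, starting 1, 1, 2, 3, ..."""
    seq = [1, 1]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq[:n]


seq = fibonacci(20)
# Each number divided by its predecessor: tends towards about 1.618.
ratios_up = [seq[i + 1] / seq[i] for i in range(len(seq) - 1)]
# Each number divided by its successor: tends towards about 0.618.
ratios_down = [seq[i] / seq[i + 1] for i in range(len(seq) - 1)]
```

The first few ratios (1, 2, 1.5, ...) are far from the limit, which is why the text excludes the first terms of the sequence.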
Elliott observed that many ratios in his patterns are derived
from the Fibonacci sequence. Wave 2 usually corrects up to 50%
or 62% of wave 1. Wave 3 is usually 1.62, 2.62 or 4.25 times the
length of wave 1. Wave 4 corrects up to 24% or 28% of wave 3.
Wave 5 is somewhat more complicated, and depends either on
wave 1 (1.62 or 2.62 times wave 1), or on the length from the
beginning of wave 1 to the end of wave 3 (being 0.62, 1, or 1.62
times this length). It is important to note that these ratios are
not to be used for forecasting the market, but rather for explaining
the market and spotting the waves. Elliott's wave theory cannot
always explain the market perfectly, but it can give fuzzy
estimates of the market behavior. Of course, other factors affect the
market, but as the results of the suggested system show, the theory
is capable of improving stock market forecasting.
2.2. Fuzzy logic
Fuzzy logic was proposed by Zadeh (1965) as an alternative
way to express data. One of the most typical examples used to
explain fuzzy logic is the expression of age. Classical reasoning
measures age in years. Fuzzy logic can be used to express age
in three categories: "young", "middle-aged" and "old". This is done
with the help of a membership function, which converts age in
years into an appropriate category. A membership function takes
on values in the interval [0, 1], according to how much the data
belong to the corresponding category. This is the main difference
between fuzzy logic and classical reasoning. In fuzzy logic, data
can belong to more than one category, while in classical
reasoning something is either true or false. The above reasoning
is depicted in Diagram 3.
According to the diagram, the age of 30 can be categorized both
as "young" and as "middle-aged", with a membership grade of 0.5 in
each, while the age of 80 belongs with a grade of 1 to the fuzzy set
"old". Classical reasoning would use a threshold stating, for example, that
people older than 30 are middle-aged, which would mean that a
person of 29 years of age would be young. One of the biggest
advantages of fuzzy logic is this use of verbal variables, which
are easily understood by humans. It is important to note that
classical reasoning can be seen as a subset of fuzzy logic. In the above
example, one could use one fuzzy set for each year, or even a fuzzy
set for every 6 months.
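The age example can be sketched with piecewise-linear membership functions. The breakpoints used here (20, 40, 50 and 70 years) are assumptions chosen only so that the grades quoted in the text are reproduced; the actual shapes in Diagram 3 may differ.

```python
def falling(x, lo, hi):
    """Grade 1 below lo, grade 0 above hi, linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def rising(x, lo, hi):
    """Grade 0 below lo, grade 1 above hi, linear in between."""
    return 1.0 - falling(x, lo, hi)


def age_memberships(age):
    """Membership grades of an age in three fuzzy sets (assumed breakpoints)."""
    return {
        "young": falling(age, 20, 40),
        "middle-aged": min(rising(age, 20, 40), falling(age, 50, 70)),
        "old": rising(age, 50, 70),
    }


grades_30 = age_memberships(30)
grades_80 = age_memberships(80)
```

With these breakpoints, a 30-year-old belongs to both "young" and "middle-aged" with grade 0.5, whereas a crisp threshold at 30 would force a single category.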
Fuzzy logic can be used in many problems where information
does not have to be precise. This is the main reason why it was
chosen for stock market forecasting, a problem where information
is not noise free, as there are different factors affecting the
output.
There are various membership functions, which work better for
different problems, the most common of which are the triangular,
trapezoidal, Gaussian, and generalized bell membership functions
shown in Diagram 4.
The triangular membership function requires three parameters
(a, b, c) according to the following relation, proposed by Jang and
Sun (1997):

$$\mathrm{triangle}(x; a, b, c) = \max\left(\min\left(\frac{x - a}{b - a}, \frac{c - x}{c - b}\right), 0\right)$$

The trapezoid membership function needs four parameters (a, b, c, d)
according to the following relation:

$$\mathrm{trapezoid}(x; a, b, c, d) = \max\left(\min\left(\frac{x - a}{b - a}, 1, \frac{d - x}{d - c}\right), 0\right)$$

The Gaussian function utilizes two parameters, c and σ, with c being the
center of the curve and σ indicating the width:

$$\mathrm{gaussian}(x; c, \sigma) = e^{-\frac{1}{2}\left(\frac{x - c}{\sigma}\right)^{2}}$$

Finally, the generalized bell curve uses three parameters (a, b, c):

$$\mathrm{bell}(x; a, b, c) = \frac{1}{1 + \left|\frac{x - c}{a}\right|^{2b}}$$
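The four relations translate directly into code. The sketch below is a plain transcription of the formulas, assuming a < b < c (< d), σ > 0 and a ≠ 0 so that no denominator vanishes; the function names simply mirror the formula names.

```python
import math


def triangle(x, a, b, c):
    """Triangular MF with feet at a and c and peak at b."""
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)


def trapezoid(x, a, b, c, d):
    """Trapezoidal MF with feet at a and d and plateau on [b, c]."""
    return max(min((x - a) / (b - a), 1.0, (d - x) / (d - c)), 0.0)


def gaussian(x, c, sigma):
    """Gaussian MF centred at c with width sigma."""
    return math.exp(-0.5 * ((x - c) / sigma) ** 2)


def bell(x, a, b, c):
    """Generalized bell MF centred at c; a sets the width, b the slope."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))
```

For the bell function, the grade is exactly 0.5 at x = c ± a regardless of b, which makes a convenient as a width parameter.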
Diagram 2. Elliott wave super-cycle. Source: Prechter and Frost (1998).
2.3. Fuzzy systems
According to Jang and Sun (1997), a fuzzy system consists of
three parts:
the rule base;
the database, or dictionary that includes the membership
functions;
the reasoning mechanism.
There are three main types of fuzzy systems:
(1) the Mamdani system;
(2) the Sugeno system;
(3) the Tsukamoto system.
2.3.1. Mamdani system
The three systems differ in the way the inputs interact
and lead to an output. A Mamdani system produces a fuzzy output,
which has to be defuzzified.
Diagram 5 depicts a Mamdani system, where the value x is
part of both the $A_1$ and $A_2$ fuzzy sets, the value y belongs to the
$B_1$ and $B_2$ fuzzy sets, and the output Z is expressed by two fuzzy
sets, $C_1$ and $C_2$. This system basically has two rules:
If x is $A_1$ and y is $B_1$ then Z is $C_1$
If x is $A_2$ and y is $B_2$ then Z is $C_2$
The final result $C'$ is obtained by using the Max operator. A different
operator can also be used. As mentioned before, the Mamdani system
produces a fuzzy output, which needs to be defuzzified. There
are five different methods that can be used: (a) the smallest of the max;
(b) the largest of the max; (c) the centroid of the area; (d) the bisector
of the area and (e) the mean of the max. The above methods can
be seen in Diagram 6.
Diagram 3. Fuzzy logic example. Source: Jang and Sun (1997).
Diagram 4. Membership functions. Source: Jang and Sun (1997).
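The five defuzzification methods can be illustrated on a sampled output universe. This is a discrete approximation under stated assumptions: the sample points, the two clipped consequents and the function name `defuzzify` are all hypothetical, and the bisector is taken as the first sample at which the accumulated membership reaches half of the total.

```python
def defuzzify(zs, mu, method="centroid"):
    """zs: ascending sample points; mu: aggregated membership at each point."""
    peak = max(mu)
    maxima = [z for z, m in zip(zs, mu) if m == peak]
    if method == "som":       # (a) smallest of the max
        return maxima[0]
    if method == "lom":       # (b) largest of the max
        return maxima[-1]
    if method == "mom":       # (e) mean of the max
        return sum(maxima) / len(maxima)
    total = sum(mu)
    if method == "centroid":  # (c) centroid of the area
        return sum(z * m for z, m in zip(zs, mu)) / total
    if method == "bisector":  # (d) bisector of the area (discrete approximation)
        running = 0.0
        for z, m in zip(zs, mu):
            running += m
            if running >= total / 2:
                return z
    raise ValueError(f"unknown method: {method}")


zs = list(range(11))
# Hypothetical clipped consequents of the two rules (C1 fired at 0.3, C2 at 0.7).
c1 = [0, 0.3, 0.3, 0.3, 0, 0, 0, 0, 0, 0, 0]
c2 = [0, 0, 0, 0, 0, 0, 0.7, 0.7, 0.7, 0.7, 0]
# Aggregation with the Max operator gives the final fuzzy output C'.
c_final = [max(u, v) for u, v in zip(c1, c2)]
results = {m: defuzzify(zs, c_final, m) for m in ("som", "lom", "mom", "centroid", "bisector")}
```

The methods based on the maximum ignore the weaker rule entirely, while the centroid and the bisector are pulled towards it.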
2.3.2. Sugeno system
Sugeno-type fuzzy systems differ from Mamdani
ones in that they use a function as an output, and therefore
the output is already a crisp number and not a fuzzy set. The
same example as before is used in Diagram 7.
This time the rules have the following structure:
If x is $A_1$ and y is $B_1$ then $z_1 = p_1 x + q_1 y + r_1$
If x is $A_2$ and y is $B_2$ then $z_2 = p_2 x + q_2 y + r_2$
The final output Z is calculated as a weighted average of $z_1$
and $z_2$.
Sugeno-type systems are less demanding in computational
power, but not as flexible as Mamdani systems. The WASP system
presented later is based on the ANFIS neuro-fuzzy architecture,
which uses a Sugeno-type fuzzy system.
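A two-rule Sugeno output can be sketched in a few lines. The weights w1 and w2 stand for the firing strengths of the two rules and are passed in directly here; how they are derived from the membership grades is an assumption left outside this sketch, as are the function and parameter names.

```python
def sugeno_output(x, y, w1, w2, params1, params2):
    """Weighted average of two first-order Sugeno rule outputs.

    w1, w2: firing strengths of the rules (assumed given).
    params1, params2: (p, q, r) coefficients of each rule's linear output.
    """
    p1, q1, r1 = params1
    p2, q2, r2 = params2
    z1 = p1 * x + q1 * y + r1   # crisp output of rule 1
    z2 = p2 * x + q2 * y + r2   # crisp output of rule 2
    return (w1 * z1 + w2 * z2) / (w1 + w2)


z = sugeno_output(x=1.0, y=2.0, w1=0.25, w2=0.75,
                  params1=(1.0, 1.0, 0.0), params2=(2.0, 0.0, 5.0))
```

No defuzzification step is needed, which is the source of the lower computational cost mentioned above.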
2.3.3. Tsukamoto system
Tsukamoto systems are used less often than Mamdani and
Sugeno-type systems. The difference lies again in the way the output
is calculated. Tsukamoto systems use a monotonic function, as
illustrated in Diagram 8.
The main disadvantage of fuzzy systems is that knowledge must
be predefined, meaning that the membership functions must be
set in advance.
2.4. Neural networks
An artificial neural network (ANN), or neural network (NN), is a
computational method used to model data, derived from the field
of artificial intelligence. Neural networks try to imitate the
architecture of the human brain. Diagram 9 shows a simple
neural network with one neuron and n inputs. Every input is
multiplied by a different parameter. All inputs are added in the neuron,
giving the final result. There is no limit on the number of neurons
used, although a very large number can make the network
extremely demanding in computational power.
Diagram 5. Mamdani architecture.
Diagram 6. Defuzzification methods.
Diagram 7. Sugeno architecture (Jang & Sun, 1997).
Diagram 8. Tsukamoto architecture.
Diagram 9. Simple neural network (single-layer perceptron: inputs $x_1(n), \ldots, x_p(n)$ with weights $w_1, \ldots, w_p$ and a bias, summed into $v(n)$ to give the output $y(n)$).
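The single neuron described above amounts to a weighted sum. A minimal sketch follows, with the bias term taken from the labels of Diagram 9; no activation function is applied because the text does not specify one, and the function name is an assumption.

```python
def neuron_sum(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias term, i.e. v(n) in Diagram 9."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


v = neuron_sum(inputs=[1.0, 2.0, 3.0], weights=[0.5, -1.0, 2.0], bias=0.5)
```

Training consists of adjusting the weights and the bias so that the output matches the known targets as closely as possible.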
Neural networks are very efficient in modeling non-linear
problems. They are trained from data with the help
of adaptive algorithms such as the back propagation algorithm
(Rumelhart, Hinton, & Williams, 1986; Werbos, 1974). The main
advantage of neural networks is the ability to learn by example,
in other words to create knowledge from past data. Additionally,
such networks are found to be extremely useful in pattern
recognition. On the other hand, neural networks have also been
criticized, mainly due to the high computational power that is
required, which limits the number of input variables that can be
used. The main disadvantage of a neural network is the lack of
information regarding the impact of every input on the output.
Neural networks are commonly called a "black box", as there is
no information given other than the output.
2.5. Neuro-fuzzy systems - ANFIS
Neuro-fuzzy systems emerged from the need to find a solution to
the disadvantages of neural networks and fuzzy logic, while
maintaining the advantages of both theories. A neuro-fuzzy system is
basically a neural network in which the inputs are transformed into
fuzzy sets. The parameters are determined by an adaptive algorithm
from existing data. The advantages of the neuro-fuzzy architecture
are that knowledge is created from the data, and that the results are
easily interpreted with the use of fuzzy verbal variables. The WASP
system uses the neuro-fuzzy architecture ANFIS proposed by Jang
and Sun (1997), which has been found to be extremely efficient
in forecasting time series. As mentioned before, the ANFIS architecture
uses a Sugeno-type fuzzy system. Diagram 10 shows an ANFIS
system with two inputs and five levels.
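For every level, let $O_{i,j}$ be the output of node j in level i. For level
one

$$O_{1,j} = \mu_{A_j}(x) \quad \text{for } j = 1, 2,$$

and

$$O_{1,j} = \mu_{B_{j-2}}(y) \quad \text{for } j = 3, 4,$$

where $\mu_{A_j}(x)$ and $\mu_{B_j}(y)$ are the membership grades of the inputs x and y in
the fuzzy sets $A_j$ and $B_j$ of the corresponding membership functions.
For level two

$$O_{2,j} = w_j = \mu_{A_j}(x)\,\mu_{B_j}(y) \quad \text{for } j = 1, 2$$

For level three

$$O_{3,j} = \bar{w}_j = \frac{w_j}{w_1 + w_2} \quad \text{for } j = 1, 2$$

For level four

$$O_{4,j} = \bar{w}_j f_j = \bar{w}_j \left(p_j x + q_j y + r_j\right) \quad \text{for } j = 1, 2$$

Finally, for level five

$$O_{5,1} = \sum_{k} \bar{w}_k f_k = \frac{\sum_{k} w_k f_k}{\sum_{k} w_k} \quad \text{for } k = 1, 2$$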
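The five levels can be followed step by step in code. The sketch below assumes generalized bell membership functions and a product in level two; all parameter values in the usage example are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def bell(x, a, b, c):
    """Generalized bell membership function."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


def anfis_forward(x, y, mf_a, mf_b, consequents):
    """Forward evaluation of a two-input, two-rule ANFIS.

    mf_a, mf_b: lists of (a, b, c) bell parameters for the sets A_j and B_j.
    consequents: list of (p, q, r) tuples, one per rule.
    """
    # Level 1: membership grades of x in A_j and of y in B_j.
    mu_a = [bell(x, *params) for params in mf_a]
    mu_b = [bell(y, *params) for params in mf_b]
    # Level 2: firing strength of each rule.
    w = [ma * mb for ma, mb in zip(mu_a, mu_b)]
    # Level 3: normalized firing strengths.
    total = sum(w)
    w_bar = [wj / total for wj in w]
    # Level 4: weighted linear output of each rule.
    rule_out = [wb * (p * x + q * y + r)
                for wb, (p, q, r) in zip(w_bar, consequents)]
    # Level 5: overall output.
    return sum(rule_out)


out = anfis_forward(
    x=0.0, y=0.0,
    mf_a=[(1.0, 1.0, 0.0), (1.0, 1.0, 1.0)],
    mf_b=[(1.0, 1.0, 0.0), (1.0, 1.0, 1.0)],
    consequents=[(0.0, 0.0, 1.0), (0.0, 0.0, 6.0)],
)
```

In this example the firing strengths are 1 and 0.25, so the normalized weights are 0.8 and 0.2 and the output is 0.8 · 1 + 0.2 · 6 = 2.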
Training of the system requires establishing the parameters of the
membership functions and the parameters of the output functions
of the Sugeno-type fuzzy system ($p_i$, $q_i$, $r_i$). The latter are found in
levels four and five of the ANFIS system. Jang proposed a hybrid method
to optimize the ANFIS system, using two phases, a forward
pass and a backward pass. The forward pass uses the least squares
method to optimize the consequent parameters ($p_i$, $q_i$, $r_i$) in levels
four and five, while the backward pass uses a gradient descent
algorithm, such as the back propagation algorithm, to optimize the
premise parameters of the membership functions used as inputs in
levels one to three. The parameters of ANFIS are optimized in two
sets (premise parameters and consequent parameters) to reduce the
computational power required: the consequent parameters are linear,
and therefore a linear method such as the least squares method can be
used, while the premise parameters are non-linear.
2.5.1. Training ANFIS - forward pass
The final output of the ANFIS system is a function of the
parameters p, q, r:

$$O_{5,1} = \sum_{i} \bar{w}_i f_i = (\bar{w}_1 x)\,p_1 + (\bar{w}_1 y)\,q_1 + \bar{w}_1 r_1 + (\bar{w}_2 x)\,p_2 + (\bar{w}_2 y)\,q_2 + \bar{w}_2 r_2$$

which is a linear function of the consequent parameters.
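Because the output is linear in the consequent parameters, the forward pass can be written as an ordinary least squares problem. The following is a sketch of that standard formulation for P training pairs; the symbols A, θ and T are notation introduced here for illustration, with $x^{(p)}$, $y^{(p)}$ and $\bar{w}_j^{(p)}$ denoting the inputs and normalized firing strengths of the pth pair.

$$\theta = (p_1, q_1, r_1, p_2, q_2, r_2)^{\top}, \qquad A_{p,\cdot} = \left(\bar{w}_1^{(p)} x^{(p)},\; \bar{w}_1^{(p)} y^{(p)},\; \bar{w}_1^{(p)},\; \bar{w}_2^{(p)} x^{(p)},\; \bar{w}_2^{(p)} y^{(p)},\; \bar{w}_2^{(p)}\right)$$

$$\hat{\theta} = \arg\min_{\theta} \lVert A\theta - T \rVert^{2} = \left(A^{\top} A\right)^{-1} A^{\top} T$$

Here T is the vector of the P target outputs, and the closed form holds when $A^{\top} A$ is invertible. With the premise parameters held fixed, the normalized firing strengths are constants, so a single linear solve gives the best consequent parameters for the current membership functions.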
2.5.2. Training ANFIS - backward pass
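During the backward pass the parameters of the membership
functions are optimized. The gradient descent method is based
on the minimization of a cost function. The cost function is derived
from the sum of squared errors. Assuming the training data have P
entries, the cost for the pth entry is:

$$J_p = \sum_{i=1}^{\#(L)} \left(T_{i,p} - O^{L}_{i,p}\right)^{2}$$

where $\#(L)$ is the number of nodes in the output level L, $T_{i,p}$ is the ith
element of the pth target output vector, and $O^{L}_{i,p}$ is the ith element of the
actual output vector generated by the pth input vector. The total error is

$$J = \sum_{p=1}^{P} J_p$$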
To use a gradient descent method, the error rate $\partial J_p / \partial O$ of the pth set of
inputs of the training data must be computed. The error rate for the
output node i in level L is derived from the previous
function using the derivative

$$\frac{\partial J_p}{\partial O^{L}_{i,p}} = -2\left(T_{i,p} - O^{L}_{i,p}\right)$$

For an internal node the error rate is

$$\frac{\partial J_p}{\partial O^{k}_{i,p}} = \sum_{m=1}^{\#(k+1)} \frac{\partial J_p}{\partial O^{k+1}_{m,p}}\,\frac{\partial O^{k+1}_{m,p}}{\partial O^{k}_{i,p}}$$

where $1 \le k \le L - 1$. As a conclusion to the above, the error rate of
an internal node is a linear function of the error rates of the nodes of
the subsequent level. Thus, by combining the previous functions,
every error rate can be computed. For the parameter α of a membership
function:

$$\frac{\partial J_p}{\partial \alpha} = \sum_{O \in S} \frac{\partial J_p}{\partial O}\,\frac{\partial O}{\partial \alpha}$$

where S indicates the nodes that include the parameter α.
Diagram 10. ANFIS architecture (Jang & Sun, 1997).
The derivative of the total cost J with respect to the parameter α is:

$$\frac{\partial J}{\partial \alpha} = \sum_{p=1}^{P} \frac{\partial J_p}{\partial \alpha}$$

In every epoch, parameter α is updated according to the function:

$$\Delta\alpha = -\eta\,\frac{\partial J}{\partial \alpha}$$

where η is the learning rate, which changes during the repetitions of
the algorithm according to the following relation:

$$\eta = \frac{k}{\sqrt{\sum_{\alpha} \left(\frac{\partial J}{\partial \alpha}\right)^{2}}}$$

where k is the step size, a very important parameter for finding an
optimum. A very small k might trap the algorithm in a local
optimum instead of a global optimum. On the other hand, a larger step
size might overshoot a local or global optimum. The k-value also
affects the speed of the algorithm.
The parameters that ANFIS needs are: (a) the type of membership
functions; (b) the number of membership functions for each
input; (c) the epochs (repetitions) of the training algorithm and (d)
the step size.
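The update rule can be sketched as a single function. It assumes the gradients of the total cost with respect to all premise parameters have already been computed; the function name and the handling of a zero gradient are assumptions made for this illustration.

```python
import math


def gradient_step(params, grads, k):
    """One gradient descent update with learning rate eta = k / ||grad||.

    params: current premise parameters; grads: dJ/d(alpha) for each of them;
    k: step size, i.e. the Euclidean length of the move in parameter space.
    """
    norm = math.sqrt(sum(g * g for g in grads))
    if norm == 0.0:
        # Assumption: at a stationary point the parameters are left unchanged.
        return list(params)
    eta = k / norm
    return [a - eta * g for a, g in zip(params, grads)]


new_params = gradient_step(params=[1.0, 2.0], grads=[3.0, 4.0], k=0.5)
```

Because the learning rate is normalized by the gradient length, every update moves the parameter vector by exactly k, which is why k controls both how easily the search escapes or overshoots an optimum and how fast it proceeds.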
3. Wave Analysis Stock Prediction - WASP
The greatest challenge of the Elliott Wave Theory is to count the
waves and spot the current position of the market or stock on an
Elliott wave pattern. The same chart can be interpreted in various
ways, and this can lead to disastrous results for the investor. For this
reason, researchers looked for an index to help track the waves. As
mentioned before, the easiest wave to track is the third wave, which
pushed researchers to analyze the behavior during this wave. During
this wave, a recent moving average would be significantly higher
than a longer moving average. This is how the Elliott wave oscillator
emerged. The Elliott wave oscillator is derived by subtracting a
35-day moving average from a 5-day moving average. The oscillator
will have higher values on the third wave, lower but positive values
on the first and fifth waves (but will show corrections and their
significance), and finally, will have negative values on the biggest
corrections, or downtrend impulse waves. The Elliott wave oscillator is
used in technical analysis. The first part of Diagram 11 depicts the
price movement of a stock. The second part shows the moving averages
of 5 and 35 days. The third part indicates the difference of the
moving averages, which is the Elliott wave oscillator (EWO).
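The oscillator can be computed from a list of daily closing prices. The sketch below assumes simple (unweighted) moving averages, which the text does not state explicitly, and returns None until 35 prices are available; the function names are hypothetical.

```python
def sma(prices, window):
    """Simple moving average; None until `window` prices are available."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def elliott_wave_oscillator(prices, fast=5, slow=35):
    """EWO = fast moving average minus slow moving average."""
    fast_ma = sma(prices, fast)
    slow_ma = sma(prices, slow)
    return [None if s is None else f - s for f, s in zip(fast_ma, slow_ma)]


prices = [float(p) for p in range(1, 41)]   # a steadily rising toy series
ewo = elliott_wave_oscillator(prices)
```

On a steadily rising series the short average stays above the long one, so the oscillator is positive, which is the behavior expected during a strong upward wave.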
The EWO is a suitable oscillator to be used with fuzzy logic, as it
is important for a system to recognize high values of the index.
Depending on the value of the stock, the same number can be high
or low in different time periods; therefore, crisp logic cannot be
used. The system uses three inputs. Many trials were carried
out, and the best results appeared when we used the Elliott wave
oscillator and two lags of the oscillator. This helps track
changes of trend. Following the ANFIS architecture, the system
creates different rules (linear functions) for various non-linear
scenarios, which look like the following statement:
If EWO(t) is High and EWO(t-1) is High and EWO(t-2) is Low then
the forecast is . . .
The output of the system is the forecast of the next day's
movement. On the training data, the output has been modified to
+1 indicating a positive rate of change, -1 for a negative rate of
change, and 0 for unchanged price movement. This was done to
reduce the complexity of the results, as it is very challenging to
forecast the exact rate of change, while the important information is
the price trend movement.
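The input and output data described above can be assembled as follows. This is a sketch under the assumption that the oscillator is supplied as a list aligned with the prices, with None where it is not yet defined; the function names are hypothetical.

```python
def sign(value):
    """Return +1, -1 or 0 according to the sign of value."""
    return (value > 0) - (value < 0)


def build_dataset(prices, ewo):
    """Pair [EWO(t), EWO(t-1), EWO(t-2)] with the sign of the next day's change."""
    inputs, targets = [], []
    for t in range(2, len(prices) - 1):
        lags = [ewo[t], ewo[t - 1], ewo[t - 2]]
        if any(v is None for v in lags):
            continue  # oscillator not yet available for all three lags
        inputs.append(lags)
        targets.append(sign(prices[t + 1] - prices[t]))
    return inputs, targets


prices = [10.0, 11.0, 11.0, 12.0, 11.0, 13.0]
ewo = [None, 0.5, 0.2, -0.1, 0.3, 0.4]       # hypothetical oscillator values
inputs, targets = build_dataset(prices, ewo)
```

The last available day has no known next-day movement, so it is excluded from the pairs; its three oscillator values are the ones used to produce the new forecast.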
3.1. Explaining the WASP system
The proposed system is not a single model, but a repetitive
method of selecting nine different ANFIS sub-models, whose
combination gives the final forecast. This method was chosen because
various models (ANFIS with different membership function types,
numbers of membership functions, step sizes and epoch numbers)
gave very good results for some periods, but not consistently,
while other sets of parameters gave very good results for later
periods. The reasoning of WASP is depicted in Diagram 12. From
the price, the moving averages of 5 and 35 days, the
EWO, and the oscillator lags, which form the input data, can
be calculated. The returns can also be calculated from prices and
transformed into the values {-1, 0, 1}, thereby creating the output data.
The last set of the input data is used for the new forecast. The
remaining data, combined with the already known output data,
comprise the total data, which are divided into training data and
testing data. The data used are 2060 daily observations. The last
60 entries are used as testing data and the remaining 2000
are used to train the neuro-fuzzy sub-systems. Two membership
functions for each input were chosen, as the critical point of the
EWO is the change from negative to positive values. The number
of epochs used to train every sub-system is 15. In many cases,
the greatest improvements in the root mean square error took
place during the first 15–20 epochs. More epochs would mean a
longer time needed to train these 42 models, without necessarily
improving the hit rate. The 42 sub-systems are trained with different
combinations of step sizes and membership functions. The values
of step sizes used are [0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5], and the
membership functions used are Bell, Gaussian, Gaussian2
(a Gaussian variation), Triangular, Trapezoidal, and Pi.
These 42 sub-models are evaluated with the testing data.
The models that give the highest hit rates are selected and used
with the new data, giving nine different forecasts for the next
day. The average of these nine forecasts is used as the final forecast.
The method has many advantages besides the final result.
One can take into consideration the number of positive and
negative forecasts as well. Furthermore, the system presents a
confidence index. The confidence index is the average hit rate that the
nine sub-models achieved on the testing data. Obviously, if the
confidence index is below 52%, one is advised not to follow the system,
as it would not make any difference from tossing a coin or
guessing the next day's movement. This is not an irrational restriction,
as such periods are often observed in the marketplace, where
movements seem to follow a chaotic pattern. It is important to be
able to recognize such periods to reduce risk. The typical output of
the WASP system is presented in Diagram 13.
Diagram 11. Elliott wave oscillator.
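The selection and combination step can be sketched independently of the ANFIS training itself. In the code below the hit rates and next-day forecasts of the 42 sub-models are assumed to be given; the values in the usage example are made up for illustration, and the function names and the lower-case labels for the membership functions are hypothetical.

```python
from itertools import product

STEP_SIZES = [0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5]
MF_TYPES = ["bell", "gaussian", "gaussian2", "triangular", "trapezoidal", "pi"]


def model_grid():
    """All combinations of membership function type and step size (6 x 7 = 42)."""
    return list(product(MF_TYPES, STEP_SIZES))


def combine_forecasts(hit_rates, forecasts, n_best=9, threshold=0.52):
    """Select the n_best sub-models by testing hit rate and average their forecasts.

    hit_rates, forecasts: dicts keyed by (mf_type, step_size).
    """
    best = sorted(hit_rates, key=hit_rates.get, reverse=True)[:n_best]
    confidence = sum(hit_rates[m] for m in best) / n_best
    forecast = sum(forecasts[m] for m in best) / n_best
    return {
        "models": best,
        "forecast": forecast,
        "confidence": confidence,
        "follow": confidence >= threshold,   # below 52% the forecast is ignored
    }


grid = model_grid()
# Made-up testing hit rates and next-day forecasts, for illustration only.
hit_rates = {m: 0.50 + 0.005 * i for i, m in enumerate(grid)}
forecasts = {m: (-1 if i == 40 else 1) for i, m in enumerate(grid)}
result = combine_forecasts(hit_rates, forecasts)
```

Returning the selected models alongside the average makes it possible to count positive and negative votes, as the text suggests.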
The output window presents three box-plots. The first one
indicates the hit rate of the nine sub-models on the testing data,
which is also the confidence index. The second indicates the
returns that these sub-models would achieve on the testing data.
The third box-plot depicts the predictions, which are derived by
evaluating the latest set of input data on the sub-models. The
results show that for the next day, eight models forecasted a positive
movement, while one forecasted a negative one. On average, the
prediction is positive and the confidence index is high, ranging
from 66.6% to 75%.
For every subsystem, the characteristics of the ANFIS system are
indicated in Table 1. Depending on the membership
functions selected by WASP, the number of non-linear parameters
is 12 for a two-parameter MF such as the Gaussian function,
18 for a three-parameter MF such as the Bell and Triangular
functions, and 24 for a four-parameter MF such as the trapezoidal
MF. The case of a subsystem using the Bell function is presented
below.
3.2. Results
The WASP system was tested with the stock of the National Bank
of Greece (NBG). The system was retrained daily. A paper portfolio
worth 10,000 euros was simulated. Buy and sell decisions did not take
the confidence index into account, as its use is subjective and
depends on the risk the investor is willing to take, even though a
threshold of 52% is widely acceptable. Stocks were bought whenever
the forecast was positive, and the position was closed when the
forecast became negative. Transaction costs were not taken into
consideration. The system was tested for the period April 2007 to
November 2008, for a total of 400 trading days. It is one of the
longest periods for which any model has been tested for daily trading
decisions. It is worth noting that this period also includes the
great recession of October 2008, where the system achieved
interesting results. For the whole period of 400 trading days, the
hit rate was 58.75%, a figure lowered mainly by the crisis. By
breaking this period into four sub-periods of 100 observations, the
hit rates achieved are 58%, 64%, 60% and 53%, respectively. It is
important to note that during the last sub-period, the confidence
index often had a value below 52%, which would prevent investors from
investing, as will be shown in the diagrams that follow. Again not
including the confidence index, the return of the portfolio during
this period was +6.79%, while the stock lost 60.9% of its value. It
is interesting to note that just before the crisis, the confidence
index fell below 52%. The respective return on that day was +71.49%
for the WASP system, while the Buy&Hold strategy to that day would
have produced a loss of 37.13%. For ease of presentation, the total
period has been divided into three periods. The first includes 200
trading days, between 11/04/2007 and 23/01/2008, the second runs
from 24/01/2008 to 25/06/2008 (100 trading days), and finally the
third from 26/06/2008 to 14/11/2008.
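The trading rule and the hit-rate arithmetic described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `backtest`, the toy data, and the choice to leave the position unchanged on a zero forecast are hypothetical. The long-or-flat rule, the absence of transaction costs, the convention that unchanged prices count as correct forecasts, and the four sub-period hit rates come from the paper.

```python
def backtest(forecasts, returns, initial_value=10_000.0):
    """Simulate the long-or-flat rule on daily data.

    forecasts -- forecast for each day, coded -1, 0 or +1
    returns   -- realized fractional return of the stock on the same day
    """
    value = initial_value
    holding = False
    hits = 0
    for forecast, ret in zip(forecasts, returns):
        if forecast > 0:
            holding = True      # buy (or keep holding) on a positive forecast
        elif forecast < 0:
            holding = False     # close the position on a negative forecast
        if holding:
            value *= 1.0 + ret  # no transaction costs, as in the text
        actual = (ret > 0) - (ret < 0)
        # Unchanged prices count as correct forecasts (the paper's convention).
        if actual == 0 or forecast == actual:
            hits += 1
    return value, hits / len(forecasts)


# Toy data, purely illustrative.
forecasts = [1, 1, -1, -1, 1]
returns = [0.02, -0.01, -0.03, 0.01, 0.0]
final_value, hit_rate = backtest(forecasts, returns)
print(round(final_value, 2), hit_rate)  # 10098.0 0.6

# The four sub-periods have 100 days each, so the overall hit rate is
# the total number of hits divided by 400.
sub_period_hits = [58, 64, 60, 53]
overall_hit_rate = sum(sub_period_hits) / 400
print(overall_hit_rate)  # 0.5875
```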
3.2.1. First period: 11/04/2007–23/01/2008
For every period, the results include the portfolio return
compared to the return produced by a Buy&Hold strategy
(Diagram 14), the confidence index, the hit-rate, and the moving
hit-rate. The moving hit-rate diagram shows the cumulative hit-rate
since the first day of the period. Table 2 outlines some of the
results. Following the WASP system, the return achieved was +17.73%,
which is 27.31 percentage points more than the return of the
Buy&Hold strategy. The hit-rate for this period is 61%. It can be
seen in Diagram 15 that the hit-rate moves towards the 60% region.
The confidence index remained above 52% for the whole period, with
an average of 61%.
3.2.2. Second period: 24/01/2008–25/06/2008
During this second period of 100 trading days, the results are
even better. The WASP system returned a profit of +19.65% against
a price drop of 19.39%, as can be seen in Table 3. Diagram 17 shows
the portfolio returns using WASP and the Buy&Hold strategy. The
hit-rate for this period was 60%, with the moving hit-rate again
remaining in the 60% region, as can be seen in Diagram 16.
The confidence index again remained above 52%, with a low for this
period of 58.7% and a maximum of 71.67%, giving the investor no
reason not to trust the system's output.
3.2.3. Third period: 26/06/2008–14/11/2008
During this period the market was much harder to explain. This
is the period in which the 2008 crisis hit the markets worldwide,
as can be seen in Table 4. The WASP system would have helped
investors to reduce losses, by always achieving better results than
the Buy&Hold strategy. By the end of this period, the stock of NBG
had declined by 46.35% while the WASP system lost 24.28%.
Diagram 12. The WASP system. [Flowchart; box labels: Price; Moving Average 35 Day; Moving Average 5 Day; Returns; (-1,0,+1); EWO(5/35)t; EWO(5/35)t-1; EWO(5/35)t-2; Input Data; Output Data; Total Data; New Data; Testing Data; Training Data; Various ANFIS; Model Evaluation; Best model Choice; Average Prediction; Final Prediction.]
In Diagram 19, the confidence index is also marked as gray circles
on the WASP return line. Every gray circle denotes a day on which the
confidence index was below 52%. On these days, as mentioned before,
the system is unable to support its results, and thus the investor
should not follow the system. The best action would be to sell the
stocks and wait until the index gives an acceptable result. The
system gave early signs to exit the market. At the point of the
first exit signal, the WASP system had produced profits of 26.61%
against profits of 3.95% for the Buy&Hold strategy. After this
signal, the index moved again slightly above 52%, but not higher
than 53%, giving another signal 9 days later, with the results being
+15.2% for the WASP system and -9.21% for the Buy&Hold strategy.
Diagram 18 shows the moving hit-rate, which moves to the 60% region
at the beginning but falls to approximately 53% at the end. The most
important conclusion for this period is the significance of the
confidence index and how it can be used to give exit signals.
3.2.4. Further results
Diagram 20 shows the results for the WASP system for the whole
period. The WASP system outperformed the Buy&Hold strategy.
Since the forecasting problem has been converted to a
classification problem, as the return has been converted to -1, 0, +1
values for negative, unchanged or positive rates of change, it is
Diagram 13. WASP system output.
Table 1
Characteristics of ANFIS sub-system.
ANFIS parameter type               Value
MF type                            Bell function
Number of MFs                      6
Output MF                          Linear
Number of nodes                    34
Number of linear parameters        32
Number of nonlinear parameters     18
Total number of parameters         50
Number of training data pairs      2000
Number of evaluating data pairs    60
Number of fuzzy rules              8
Diagram 14. Returns for the first period. [Line chart "Strategy Returns": portfolio value in euros (axis 8000 to 13000) against day (1 to 197); series: Buy & Hold, W.A.S.P.]
Diagram 15. Moving hit rate for the first period. [Line chart "Moving Hit-Rate": hit rate (axis 0 to 1) against days (1 to 197).]
common to present classification results. Results are classified into
four categories: True Positive (TP), True Negative (TN), False
Positive (FP) and False Negative (FN). FP cases are the ones that
incur losses. FN cases prevent profits, while TP cases produce
profits and TN cases prevent losses. It is important to note that
the days on which the stock price remained unchanged are counted as
correct forecasts, as they bring no change to the portfolio value.
Table 5 shows the classification results for the whole testing
period. The results are much better if the third period is excluded,
as indicated in Table 6.
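The percentages in Tables 5 and 6 follow directly from the four counts. The sketch below recomputes them. The function name and the metric names `precision`, `recall` and `specificity` are standard classification terms introduced here for illustration; the paper itself reports only the counts and the column percentages.

```python
def classification_summary(tp, fp, fn, tn):
    """Summary rates for a two-class forecast (rows: forecast, columns: movement)."""
    total = tp + fp + fn + tn
    return {
        "hit_rate": (tp + tn) / total,
        "precision": tp / (tp + fp),    # share of positive forecasts that were right
        "recall": tp / (tp + fn),       # share of positive movements that were forecast
        "specificity": tn / (tn + fp),  # share of negative movements that were forecast
    }


# Counts taken from Table 5 (whole period) and Table 6 (first two periods).
whole = classification_summary(tp=114, fp=87, fn=78, tn=121)
first_two = classification_summary(tp=86, fp=54, fn=64, tn=96)

for name, result in (("whole period", whole), ("first two periods", first_two)):
    print(name, {key: round(value, 4) for key, value in result.items()})
```

The whole-period hit rate comes out at 235/400 = 58.75%, matching the figure quoted earlier, and rises to 182/300 (about 60.7%) when the third period is left out.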
Another interesting result is the number of consecutive correct
and wrong forecasts achieved by the system. The ideal situation would
be one where false forecasts occur singly between many correct
forecasts. Table 7 presents the analysis of the whole testing period.
Column 1 shows the number of consecutive forecasts. Columns
2 and 4 depict the frequency of wrong and correct forecasts
respectively, while columns 3 and 5 give the number of
Table 2
Results for the first period.
Date          Buy&Hold (%)   WASP (%)   Difference (%)
20/06/2007    +0.47          +9.61      +9.14
30/08/2007    +3.10          +8.01      +4.90
08/11/2007    +10.12         +21.04     +10.91
23/01/2008    -9.57          +17.73     +27.31
Table 3
Results for the second period.
Date          Buy&Hold (%)   WASP (%)   Difference (%)
26/02/2008    +3.68          +16.79     +13.11
09/04/2008    -6.57          +7.13      +13.71
19/05/2008    -3.71          +21.18     +24.89
25/06/2008    -19.39         +19.65     +39.05
Diagram 16. Moving hit-rate for the second period. [Line chart "Moving Hit-Rate": hit rate (axis 0 to 1) against days (1 to 97).]
Diagram 17. Results for the second period. [Line chart "Strategy Returns": portfolio value in euros (axis 8000 to 13000) against day (1 to 97); series: Buy & Hold, W.A.S.P.]
Table 4
Results for the third period.
Date          Buy&Hold (%)   WASP (%)   Difference (%)
03/09/2008    +8.53          +26.50     +7.97
24/10/2008    -59.45         -35.28     +18.17
14/11/2008    -46.35         -24.28     +26.51
Diagram 18. Moving hit rate for the third period. [Line chart "Moving Hit-Rate": hit rate (axis 0 to 1) against days (1 to 97).]
Diagram 19. Results for the third period. [Line chart "Strategy Returns": portfolio value in euros (axis 3000 to 13000) against day (1 to 97); series: Buy&Hold, WASP, Confidence Index.]
Diagram 20. Returns for whole period. [Line chart "Strategy Returns": portfolio value in euros (axis 0 to 20000) against day (1 to 401); series: Buy & Hold, W.A.S.P.]
Table 5
Classification of results for the whole period.
Forecast \ Movement    Positive             Negative
Positive               114 (59.37%) TP      87 (41.82%) FP
Negative               78 (40.625%) FN      121 (58.17%) TN
Total                  192                  208
observations that were part of each category (obtained simply by
multiplying columns 1 and 2, or columns 1 and 4).
For example, a single wrong forecast occurred 59 times. Two
consecutive wrong forecasts occurred 20 times, three consecutive
wrong forecasts occurred 12 times, etc. On the other hand, a single
correct forecast occurred 40 times, two consecutive correct
forecasts occurred 23 times, three consecutive correct forecasts
occurred 12 times, etc.
From columns 3 and 5, it can be concluded that the system gave
more consecutive correct forecasts than wrong ones.
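A run-length table of the kind shown in Table 7 can be built from a sequence of correct/wrong flags. The sketch below is illustrative only: `run_length_table` and the short example sequence are hypothetical, while the `TABLE_7` dictionary simply restates the frequencies printed in Table 7 so that columns 3 and 5 can be summed.

```python
from collections import Counter
from itertools import groupby


def run_length_table(correct_flags):
    """Map run length -> (number of wrong runs, number of correct runs)."""
    wrong, correct = Counter(), Counter()
    for flag, group in groupby(correct_flags):
        length = sum(1 for _ in group)  # length of this run of equal flags
        (correct if flag else wrong)[length] += 1
    lengths = sorted(set(wrong) | set(correct))
    return {n: (wrong[n], correct[n]) for n in lengths}


# Hypothetical outcome sequence: True = correct forecast, False = wrong forecast.
flags = [True, True, False, True, False, False, True, True, True, False]
table = run_length_table(flags)
print(table)  # {1: (2, 1), 2: (1, 1), 3: (0, 1)}

# Frequencies from Table 7: run length -> (wrong runs, correct runs).
TABLE_7 = {1: (59, 40), 2: (20, 23), 3: (12, 12), 4: (4, 8),
           5: (0, 6), 6: (1, 7), 7: (0, 0), 8: (1, 1)}
wrong_total = sum(n * w for n, (w, _) in TABLE_7.items())    # sum of column 3
correct_total = sum(n * c for n, (_, c) in TABLE_7.items())  # sum of column 5
print(wrong_total, correct_total)  # 165 234
```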
4. Conclusions and future research
The system presented in this paper achieved remarkable results
over a very long period of 400 trading days, with the system
retrained daily. As mentioned before, the WASP system is a
methodology that selects 9 ANFIS systems, based on the hit-rate each
sub-system achieves on out-of-sample data. Even those results
are extremely interesting, with some sub-models achieving a
hit-rate above 75% for a sample of 60 trading days. This is evidence
of how effective neuro-fuzzy architectures can be in stock market
forecasting. The system showed a tendency to achieve hit rates
around the 60% mark, which is significantly better than forecasting
with the toss of a coin. During this period of 400 trading days, the
WASP system made 63 transactions. This gives a rough average of 1
transaction every 6 days.
Variations of the WASP system could produce different results.
There are many parameters that can be changed, as can the number
of sub-models to be chosen. The selection process of the sub-models
can also differ. Risk-averse investors might want to choose the
sub-models depending on the number of false positive cases.
A daily forecast needs approximately 90 s on a Core 2 Duo laptop
at 1.86 GHz with 3 GB of RAM.
Of course the system has some restrictions, deriving from fuzzy
logic theory, neural networks, and the Elliott Wave Theory. The
first is that the Elliott wave oscillator is not a normalized index,
so price variations change the value of the oscillator
significantly, producing values outside the range of values used to
train the neuro-fuzzy system. The second restriction also stems from
the Elliott wave oscillator. It is a slow-moving oscillator, as it is
derived from two moving averages. This makes the separation of
if-then scenarios more difficult.
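The first restriction can be illustrated numerically. The sketch below assumes that the oscillator is the 5-day simple moving average of the price minus the 35-day one, as the EWO(5/35) label in Diagram 12 suggests. The helper names and the price series are hypothetical.

```python
def simple_moving_average(values, window):
    """Trailing simple moving average; the first window-1 entries are None."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def elliott_wave_oscillator(prices, fast=5, slow=35):
    """EWO(fast/slow) as fast SMA minus slow SMA (assumed definition, fast <= slow)."""
    fast_ma = simple_moving_average(prices, fast)
    slow_ma = simple_moving_average(prices, slow)
    return [None if s is None else f - s for f, s in zip(fast_ma, slow_ma)]


prices = [20.0 + 0.1 * i for i in range(40)]  # hypothetical rising price series
ewo = elliott_wave_oscillator(prices)
ewo_doubled = elliott_wave_oscillator([2 * p for p in prices])

# The same relative price path at twice the price level doubles the oscillator,
# which is what "not a normalized index" means in practice.
print(round(ewo[-1], 6), round(ewo_doubled[-1], 6))
```

Because the oscillator scales with the price level, a stock that trades far above or below its training-period prices feeds the neuro-fuzzy model input values it has not seen. Dividing by the slow moving average would be one possible normalization, though the paper does not propose it.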
References
Abraham, A. (2004). Neuro fuzzy systems: State-of-the-art modelling techniques.
ArXiv Computer Science e-prints: cs/0405011.
Abraham, A., Nath, B., & Mahanti, P. (2001). Hybrid intelligent systems for stock
market analysis. Lecture Notes in Computer Science, 337–345.
Atsalakis, G., Skiadas, C., & Braimis, I. (2007). Probability of trend prediction of
exchange rate by neuro-fuzzy techniques. Recent advances in stochastic modelling
and data analysis. London: World Scientific Publishing Co. Pte. Ltd.
Atsalakis, G., & Ucenic, C. (2005). Time series prediction of the Greek manufacturing
index for the non-metallic minerals sector using a neuro-fuzzy approach
(ANFIS). In Conference international symposium on applied stochastic models and
data analysis, May, Brest, France (p. 211).
Atsalakis, G., & Nezis, D. (2008). Moving average, neural networks and genetic
algorithms for stock market prediction. Journal of Computational Optimization in
Economics and Finance, 1(1), 42–57.
Atsalakis, G. S., & Valavanis, K. P. (2008). Surveying stock market forecasting
techniques – Part II: Soft computing methods. Expert Systems with Applications,
36, 5932–5941.
Atsalakis, G., & Valavanis, K. A. (2009). Forecasting stock market short-term trends
using a neuro-fuzzy based methodology. Journal of Expert Systems with
Applications, 36, 10696–10707.
Atsalakis, G., & Valavanis, K. (2010a). Forecasting stock trends using a combined
technical analysis and neuro-fuzzy based approach. Journal of Financial Decision
Making, 6(1), 79–94.
Atsalakis, G., & Valavanis, K. (2010b). Surveying stock market forecasting
techniques – Part I: Conventional methods. Journal of Computational
Optimization in Economics and Finance, 2(1).
Atsalakis, G., & Zopounidis, C. (2009). Forecasting turning points in stock market
prices by applying a neuro-fuzzy model. International Journal of Engineering and
Management, 1(1), 19–28.
Atsalakis, G., Valavanis, K., & Ucenic, C. (2001). Elliot waves, marketing and neuro-
fuzzy based techniques for marketing of stocks. Finance, accounting,
management, marketing (vol. 2, pp. 398–405). Târgu-Mureș: The Works of
Scientific Session of Petru Maior University, Publishing House of Petru Maior
University. ISBN 973-8084-53-9.
Bekiros, S. D. (2007). A neuro-fuzzy model for stock market trading. Applied
Economics Letters, 14(1), 53–57.
Ghandar, A., Schmidt, Z., To, M., & Zurbruegg, R. (2007). A computational
intelligence portfolio construction system for equity market trading. In IEEE
Congress on Evolutionary Computation, 2007, CEC 2007.
Jang, J.-S. R., & Sun, C.-T. (1997). Neuro-fuzzy and soft computing: A computational
approach to learning and machine intelligence. Upper Saddle River, NJ, USA:
Prentice-Hall, Inc.
Nishina, T., & Hagiwara, M. (1997). Fuzzy inference neural network.
Neurocomputing, 14(3), 223–239.
Pokropinska, A., & Scherer, R. (2008). Financial prediction with neuro-fuzzy
systems. Lecture Notes in Computer Science, 5097, 1120–1126.
Prechter, R., & Frost, A. (1998). Elliott wave principle: Key to market behaviour. New
Classics Library.
Rast, M. (1999). Forecasting with fuzzy neural networks: A case study in
stock market crash situations. In 18th International Conference of the North
American Fuzzy Information Processing Society, NAFIPS.
Rumelhart, D., Hinton, G., & Williams, R. (1986). Learning internal representations
by error propagation. Parallel distributed processing: Explorations in the
microstructure of cognition (vol. 1, p. 319).
Siekmann, S., Kruse, R., Gebhardt, J., Van Overbeek, F., & Cooke, R. (2001).
Information fusion in the context of stock index prediction. International
Journal of Intelligent Systems, 16(11), 1285–1298.
Werbos, P. J. (1974). Beyond regression: New tools for prediction and analysis in
the behavioral sciences. Ph.D. dissertation, Harvard University.
Wong, F., Wang, P., Goh, T., & Quek, B. (1992). Fuzzy neural systems for stock
selection. Financial Analysts Journal, 48(1), 47–52.
Wu, J., Fung, M., & Flitman, A. (2001). Forecasting stock market performance using
hybrid intelligent system. Lecture Notes in Computer Science, 447–458.
Zadeh, L. (1965). Fuzzy sets. Information and Control, 8(3), 338–353.
Table 6
Classification of results for the first two periods.
Forecast \ Movement    Positive             Negative
Positive               86 (57.33%) TP       54 (36.00%) FP
Negative               64 (42.66%) FN       96 (64.00%) TN
Total                  150                  150
Table 7
Consecutive correct and wrong forecasts.
Number of      Frequency of   Number of    Frequency of   Number of
consecutive    wrong          wrong        correct        correct
forecasts      forecasts      forecasts    forecasts      forecasts
(1)            (2)            (3)          (4)            (5)
1              59             59           40             40
2              20             40           23             46
3              12             36           12             36
4              4              16           8              32
5              0              0            6              30
6              1              6            7              42
7              0              0            0              0
8              1              8            1              8