CHAPTER 6
HYBRID METHODS FOR MINING ASSOCIATION RULES
6.1 GENERAL
This chapter reviews the necessity of hybridizing GA and PSO methods,
[Figure: block diagram of the hybrid GPSO model. Block labels: Evaluate Fitness, Ranked Population, Upper, Lower, Genetic Algorithm, Particle Swarm Optimization, Updated Population]
The population size is fixed based on the size of the dataset to which
ARM is applied. Binary encoding is adopted for representing the data. The
fitness function given in Equation (4.2) is used to calculate the fitness
values. The experimental setting and results of the GPSO methodology for
mining ARs are presented in the next section.
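One generation of such a hybrid can be sketched roughly as follows. This is only an illustration under stated assumptions: the ranked population is assumed to be split in half, with the upper half passed to the GA operators and the lower half to the PSO update (the routing is a guess, not taken from the text). The fitness function is a stand-in for Equation (4.2), which is not reproduced in this section. All function names are hypothetical.

```python
import random

_rng = random.Random(0)  # fixed seed so the toy run is repeatable


def gpso_generation(population, fitness, ga_step, pso_step):
    """One generation of a GA/PSO hybrid on a ranked population.

    Assumption: upper half -> GA operators, lower half -> PSO update.
    """
    ranked = sorted(population, key=fitness, reverse=True)
    half = len(ranked) // 2
    upper, lower = ranked[:half], ranked[half:]
    best = ranked[0]
    # Merge both halves back into the updated population.
    return ga_step(upper) + pso_step(lower, best)


def toy_fitness(bits):
    # Stand-in for Equation (4.2): simply counts the set bits.
    return sum(bits)


def toy_ga_step(individuals, mutation_rate=0.2):
    # Bit-flip mutation only; crossover is omitted to keep the sketch short.
    return [[1 - b if _rng.random() < mutation_rate else b for b in ind]
            for ind in individuals]


def toy_pso_step(individuals, best):
    # Each bit copies the best individual's bit with probability 0.5,
    # a crude stand-in for a binary PSO position update.
    return [[g if _rng.random() < 0.5 else b for b, g in zip(ind, best)]
            for ind in individuals]


# Binary-encoded toy population of four individuals.
population = [[0, 1, 0, 1], [1, 1, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
next_population = gpso_generation(population, toy_fitness,
                                  toy_ga_step, toy_pso_step)
```

Passing the two operators in as functions keeps the ranking, splitting and merging logic separate from whichever GA and PSO variants are actually used.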
6.3.2 Experimental Results and Discussion
To test the performance of the hybrid GPSO for mining ARs,
experiments were carried out on well-known benchmark datasets from the UCI
repository.
The parameters that play a major role during rule discovery in the
hybrid GPSO methodology are listed in Table 6.1. The population size is the
number of individuals used in the experiments. The crossover and
mutation rates are the GA operator settings. c1 and c2 are the acceleration
coefficients used in the PSO velocity update given in Equation (3.2).
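The role of c1 and c2 can be illustrated with the textbook PSO velocity update. Equation (3.2) is not reproduced in this section, so the form below is an assumption and may differ from it in detail. The inertia weight `w` and the function names are hypothetical additions for illustration.

```python
import random


def update_velocity(v, x, pbest, gbest, c1=2.0, c2=2.0, w=1.0, rng=random):
    """Textbook PSO velocity update, applied per dimension.

    v_new = w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x)
    where r1 and r2 are fresh random numbers in [0, 1).
    """
    new_v = []
    for vi, xi, pi, gi in zip(v, x, pbest, gbest):
        r1, r2 = rng.random(), rng.random()
        new_v.append(w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi))
    return new_v


def update_position(x, v):
    """Move the particle by its new velocity."""
    return [xi + vi for xi, vi in zip(x, v)]
```

With c1 = c2 = 2, the pull toward the particle's own best position and the pull toward the global best are weighted equally. With binary encoding, the velocity is typically mapped to a bit probability rather than added to the position; that step is not shown here.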
Evolutionary algorithms are relatively simple to implement, robust, and
perform very well on a wide spectrum of problems. This study proposes a
hybrid of two evolutionary algorithms, GA and PSO, for ARM. The
scope of this study on mining ARs using GPSO is to:
Study the performance of GA over generations
Analyze the performance of PSO over generations
Identify the limitations of GA and PSO in mining ARs in terms of PA and execution speed
Compare the performance of GPSO with GA and PSO
Table 6.1
Parameter             Dataset                        Value
Population Size       Lenses                         20
                      Car Evaluation                 700
                      Haberman's Survival            300
                      Post-operative Patient Care    80
                      Zoo                            100
Crossover Rate        Lenses                         0.6
                      Car Evaluation                 0.7
                      Haberman's Survival            0.75
                      Post-operative Patient Care    0.8
                      Zoo                            0.8
Mutation Rate         Lenses                         0.5
                      Car Evaluation                 0.4
                      Haberman's Survival            0.25
                      Post-operative Patient Care    0.2
                      Zoo                            0.2
Selection Operation                                  Roulette wheel selection
c1                                                   2
c2                                                   2
No. of Generations                                   50
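The roulette wheel selection named in the parameter list can be sketched as below. This is a generic fitness-proportionate selection, not necessarily the exact variant used in the experiments. It assumes non-negative fitness values, and the function name is hypothetical.

```python
import random


def roulette_wheel_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness.

    Assumes all fitness values are non-negative and at least one is positive.
    """
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("roulette wheel selection needs a positive total fitness")
    pick = rng.random() * total  # a point on the wheel, in [0, total)
    cumulative = 0.0
    for individual, fit in zip(population, fitnesses):
        cumulative += fit
        if pick < cumulative:
            return individual
    return population[-1]  # guard against floating-point round-off
```

An individual with twice the fitness of another occupies twice as much of the wheel, so it is selected about twice as often over many draws.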
[Figure residue: Predictive Accuracy (%) plotted by Methodology (GA, PSO, GPSO), with tick values 10 to 50 on a second axis. Panels include (a) and a Car Evaluation Dataset panel. The plotted values were not recoverable.]
[Figure residue: comparison of GA, PSO and GPSO across Datasets. The vertical axis runs from 0 to 10000; its label was not recovered.]
[Figure: flowchart of PSO with SFLA]
1. Generation of initial population (P) and evaluation of the fitness of each particle
2. Velocity and position update of particles
3. Sorting the population in descending order of fitness value
4. Distribution of frogs into M memeplexes
5. Iterative updating of the worst frog in each memeplex
6. Combining all frogs to form a new population
7. Termination criteria satisfied? (Y/N)
[Figure: distribution of frogs (Frog 1 to Frog 8) into three memeplexes (Memeplex 1, Memeplex 2, Memeplex 3)]
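The distribution of sorted frogs into memeplexes can be sketched as follows. The round-robin dealing used here is the usual SFLA convention and is an assumption, since the text only states that the frogs are sorted in descending order of fitness and distributed into M memeplexes. The function name is hypothetical.

```python
def partition_into_memeplexes(frogs, fitness, m):
    """Sort frogs by descending fitness and deal them into m memeplexes.

    Assumed rule: the frog at rank r (0-based) goes to memeplex r % m,
    so each memeplex receives a mix of strong and weak frogs.
    """
    ranked = sorted(frogs, key=fitness, reverse=True)
    memeplexes = [[] for _ in range(m)]
    for rank, frog in enumerate(ranked):
        memeplexes[rank % m].append(frog)
    return memeplexes


# Eight toy frogs whose fitness is their own value, dealt into 3 memeplexes.
groups = partition_into_memeplexes([3, 8, 1, 6, 2, 7, 5, 4], lambda f: f, 3)
```

With eight frogs and three memeplexes, two memeplexes hold three frogs and one holds two.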
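In the standard SFLA formulation, the worst frog of each memeplex is updated as:

$D_i = rand() \times (X_b - X_w)$    (6.1)

$X_w^{new} = X_w + D_i, \quad -D_{max} \le D_i \le D_{max}$    (6.2)

where rand() is a random number between 0 and 1, D_max is the maximum
allowed change in a frog's position, Xw is the worst frog and Xb is the best frog in the
group. If this process produces a better solution, it replaces the worst frog.
Otherwise, the calculations in Equations (6.1) and (6.2) are repeated, but with respect to the
global best frog (i.e. Xg replaces Xb).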
The fitness function defined in Equation (4.2) is used to evaluate the
fitness of the individuals. Both the PSO and APSO methods are combined with
SFLA, resulting in two proposals for ARM, namely PSO+SFLA and
APSO+SFLA.
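The worst-frog update described above can be sketched as follows. This is a minimal sketch, not the implementation used in the experiments. It assumes continuous positions, a fitness to be maximized, and the standard SFLA leap with the step clamped to a maximum `d_max`. The function and parameter names are hypothetical.

```python
import random


def update_worst_frog(memeplex, global_best, fitness, d_max, rng=random):
    """Try to improve the worst frog of one memeplex.

    First leap toward the memeplex best (Xb); if that does not improve the
    worst frog (Xw), repeat the leap toward the global best (Xg).
    """
    ranked = sorted(memeplex, key=fitness, reverse=True)
    best, worst = ranked[0], ranked[-1]

    def leap(target):
        r = rng.random()  # one random factor shared by all dimensions
        step = [max(-d_max, min(d_max, r * (t - w)))
                for t, w in zip(target, worst)]
        return [w + d for w, d in zip(worst, step)]

    for target in (best, global_best):
        candidate = leap(target)
        if fitness(candidate) > fitness(worst):
            return ranked[:-1] + [candidate]  # candidate replaces the worst frog
    # Neither leap helped. The text does not say what happens in this case,
    # so this sketch leaves the memeplex unchanged.
    return ranked
```

Clamping the step to `d_max` stops a frog from jumping past a good region in one move.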
6.5.2 Experimental Results and Discussion
The PSO and APSO methodologies are both combined with SFLA for
local search to mine ARs, as described in the previous section. The five datasets
used for all the other methodologies are also used here for generating ARs.
ARs are mined from the datasets using the two proposals, and the PA of
the generated rules is plotted in Figure 6.7.
The PSO+SFLA methodology performs better than simple
PSO in terms of PA for all five datasets analyzed. The
APSO+SFLA methodology outperforms the other three
methods.
[Figure 6.7: Predictive Accuracy (vertical axis, 80 to 100) across Datasets. Recovered legend entries: PSO, APSO, PSO+SFLA]
[Plot panels, one per dataset (recoverable panel title: Zoo Dataset): Fitness Value versus Iteration Number for PSO, APSO, PSO+SFLA and APSO+SFLA]
Figure 6.8 Fitness Value for PSO with SFLA for ARM
The fitness values of both proposed methodologies, PSO+SFLA and
APSO+SFLA, are higher than those of the respective individual PSO and APSO methods.
Thus, both hybrids perform better than PSO and APSO by generating ARs with enhanced PA.
The performance of the two proposed methods is compared with that of the GA
and PSO methods discussed so far in terms of the PA of the ARs mined; the
results for the five datasets are shown in Table 6.2.
The APSO+SFLA methodology outperforms the other methods on all
five datasets by generating ARs with better PA. The APSO methodology
also generates ARs with high accuracy compared to the other methods. The data
independent adaptation methodologies (SAPSO1, SAPSO2 and SACPSO) rank
next in terms of performance for all five datasets. However, the performance
of the other methods varies among the datasets considered in this study.
The number of rules generated by each methodology for the datasets
analyzed is given in Table 6.3. SAPSO1 performs best among the
data independent adaptation methodologies and is reported here as SAPSO.
The APSO+SFLA methodology generates more rules
than the other methods discussed. SAPSO (SAPSO1) performs better by
generating an optimal number of ARs.
Thus, the proposed APSO+SFLA methodology performs better than
the other methods in terms of PA and number of rules generated.
SFLA performs an effective local search, balancing
exploration and exploitation, which leads to the better performance.
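[Table 6.2: PA of the ARs mined by each methodology. Row and column labels were damaged in extraction. Recovered method labels: GA, PSO, APSO, GPSO, SFLA. Recovered dataset labels: Lenses, Post Operative Patient, Zoo, Haberman's Survival, Car Evaluation. Values are listed in extraction order, nine per group.]
85      91.6    91.6    83.57   97.91   87.5    93.1    97.91   98.1
87      94      97      97.61   99.93   99.86   97.1    99.92   99.82
81      91.6    85      92.86   99.48   96.15   92      99.2    99.83
74      79      82      83.33   99.29   92.86   94.5    99.47   99.37
81      86      89.54   95.45   96.67   94.44   96.5    99.09   99.72
[Truncated values, in extraction order: 87.5, 95.12, 93., 99., 98.05, 97., 90.42, 98., 95.35, 99.]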
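[Table 6.3: number of rules generated by each methodology. The table layout was damaged in extraction. Method labels with the value extracted beside each: Elitist GA 8, PSO 3, WPSO 15, GA 10, CPSO 4, APSO 8, GPSO 3, NPSO 13, SAPSO 14, PSO+SFLA 16.]
[Remaining values, first block, in extraction order: 22, 18, 32, 13, 58, 17, 57, 10, 13, 13, 32, 14, 46, 38, 55, 35, 63, 120, 126, 130, 118, 135]
[Remaining values, second block, in extraction order: 60, 67, 24, 24, 18, 20, 38, 23, 22, 24, 110, 114, 12, 47, 54, 128, 131, 135, 142, 128, 130]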
6.6 SUMMARY
A hybrid method combining Genetic Algorithm and Particle Swarm
Optimization, called GPSO, has been proposed. This method balances
exploration and exploitation, resulting in higher predictive accuracy of
the ARs mined and consistent performance. Two methodologies using PSO
with SFLA for local search (PSO+SFLA and APSO+SFLA) have also been proposed.
Of these, the APSO+SFLA methodology generates ARs with better PA than all
the other methodologies discussed so far.
CHAPTER 7
CONCLUSIONS AND FUTURE WORK
7.1 GENERAL
In this research, various important issues concerning ARM have
been addressed. The investigations focused mainly on
developing an efficient methodology for ARM using
two population based stochastic search algorithms, namely GA and PSO.
The salient conclusions of this research work and the scope for future work are
presented in this chapter.
7.2 SALIENT CONCLUSIONS
ARM was attempted using GA and PSO methodologies. Modifications
and parameter tuning were applied to both methods to enhance the PA of the rules
mined. A hybrid of GA and PSO was proposed for ARM. PSO with SFLA for
local search was also proposed. From the results attained, the salient
inferences are as follows:
(i) Genetic Algorithm, when used for mining ARs, performs better than
other existing traditional methods.
(ii)
(iii)
(iv)
(v)
(vi)
(viii)