by means of their fitness values,

$$p_i = \frac{\mathit{fitness}_i}{\sum_{j=1}^{n} \mathit{fitness}_j} \qquad (2)$$
6. Produce new solutions (new positions) for the onlookers from the solutions X_i, selected depending on the probability p_i, and evaluate them.
7. Apply the greedy selection process between the new and old solutions.
8. Determine the abandoned solution (source), if one exists, and replace it with a new randomly produced solution for the scout.
9. Memorize the best food source position (solution) achieved so far.
10. Repeat steps 3 to 9 until the termination criterion is reached.
Figure 2. Artificial Bee Colony algorithm
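As an illustration of how onlookers could choose a source according to equation (2), the following Python sketch computes the selection probabilities and draws one source by roulette-wheel selection. The function names, the uniform fallback for an all-zero fitness list, and the assumption that fitness values are non-negative with larger meaning better are ours, not part of the algorithm listing above.

```python
import random

def selection_probabilities(fitness):
    """Equation (2): p_i = fitness_i / sum_j fitness_j."""
    total = sum(fitness)
    if total == 0:
        # Assumption: choose uniformly when every fitness value is zero.
        return [1.0 / len(fitness)] * len(fitness)
    return [f / total for f in fitness]

def pick_source(probabilities, rng):
    """Roulette-wheel choice of one food-source index for an onlooker."""
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probabilities):
        cumulative += p
        if r < cumulative:
            return i
    return len(probabilities) - 1  # guard against floating-point round-off

probs = selection_probabilities([1.0, 3.0, 6.0])
chosen = pick_source(probs, random.Random(0))
print(probs, chosen)
```

Sources with a larger share of the total fitness are drawn more often, which is how the elite positions influence the onlookers.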
For test data generation, a random population of candidate solutions is first generated from the input domains. In ABC, solutions are represented by the positions of flower patches. The optimum positions are searched for in such a way that they satisfy the constraint system of the targeted path. The profitability of each flower patch is measured; in computer modeling, this profitability is represented by the fitness of the position. The ABC algorithm works in three phases. The first is the employed phase, in which employed bees modify the positions of elite flower patches (those with higher profitability) within their neighborhood. The second phase is executed by onlookers, which modify the positions of their patches under the influence of elite positions; these positions are known as selected patches. After every phase a greedy selection process is repeated, in which solutions (flower patch positions) compete among themselves for retention in the elite or selected patches on the basis of their fitness. In this process some sources may migrate from one category to another, and some may be abandoned in favor of randomly generated sources; this replacement is simulated by the scout phase of the algorithm. The employed, onlooker, and scout phases are then repeated in cycles until the termination criterion is met. The ABC algorithm offers little flexibility for parameter tuning: the only parameters are the size of the colony and the number of bees allocated to the elite patches. For test case generation, we have taken a colony size of 30, of which half work as employed bees for elite patches.
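The three phases can be sketched in Python as follows. This is a minimal illustration and not the MATLAB implementation used in the experiments: the neighbour rule (perturbing one coordinate relative to another source), the abandonment counter `limit`, the conversion of cost to fitness as 1/(1 + cost), and all names are assumptions borrowed from the commonly described form of ABC.

```python
import random

def abc_minimize(cost, bounds, colony_size=30, cycles=100, limit=20, seed=0):
    """Minimize a non-negative cost over box bounds with a basic ABC loop."""
    rng = random.Random(seed)
    n_sources = colony_size // 2          # half the colony are employed bees
    dim = len(bounds)

    def random_source():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    sources = [random_source() for _ in range(n_sources)]
    costs = [cost(s) for s in sources]
    trials = [0] * n_sources              # cycles without improvement

    def neighbour(i):
        # Perturb one coordinate of source i relative to a different source k.
        k = rng.choice([s for s in range(n_sources) if s != i])
        j = rng.randrange(dim)
        candidate = list(sources[i])
        candidate[j] += rng.uniform(-1.0, 1.0) * (sources[i][j] - sources[k][j])
        lo, hi = bounds[j]
        candidate[j] = min(max(candidate[j], lo), hi)
        return candidate

    def greedy(i, candidate):
        # Greedy selection: keep the candidate only if it is strictly better.
        c = cost(candidate)
        if c < costs[i]:
            sources[i], costs[i], trials[i] = candidate, c, 0
        else:
            trials[i] += 1

    b = min(range(n_sources), key=costs.__getitem__)
    best, best_cost = list(sources[b]), costs[b]
    for _ in range(cycles):
        for i in range(n_sources):                      # employed phase
            greedy(i, neighbour(i))
        fitness = [1.0 / (1.0 + c) for c in costs]      # higher is better
        for _ in range(colony_size - n_sources):        # onlooker phase
            i = rng.choices(range(n_sources), weights=fitness)[0]
            greedy(i, neighbour(i))
        worst = max(range(n_sources), key=trials.__getitem__)
        if trials[worst] > limit:                       # scout phase
            sources[worst] = random_source()
            costs[worst] = cost(sources[worst])
            trials[worst] = 0
        b = min(range(n_sources), key=costs.__getitem__)
        if costs[b] < best_cost:                        # memorize the best
            best, best_cost = list(sources[b]), costs[b]
    return best, best_cost

def sphere(x):
    return sum(v * v for v in x)

print(abc_minimize(sphere, [(-5.0, 5.0)] * 2))
```

For test data generation, `cost` would be the penalty of the compound predicate of the target path, so a cost of zero corresponds to a valid test case.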
B. Fitness Function
For the path testing criterion, in order to traverse a feasible path, control must satisfy all the branch predicates that lie on that path. In our experiments we have used the symbolic execution technique of static structural testing. Corresponding to each path, a compound predicate (CP) is formed by ANDing the branch predicates of the path. A candidate solution must evaluate the CP to true in order to become a valid test case. ABC generates a population of candidate solutions, and these are used to evaluate the CP. If an individual does not evaluate the CP to true, the constraints of that path are split into distinct predicates (DPs), and each DP is evaluated in turn by taking the values of its operands from the candidate solution. A DP contains only one operator (a constraint with the modulus operator is the exception) and can be expressed in the form A op B, where A and B are the left-hand and right-hand sides, each made of one or more operands, and op is a relational operator. The fitness of a candidate solution is determined by the following rule: if the DP is satisfied, no penalty is imposed on the candidate solution; otherwise the candidate solution is penalized according to the branch distance rules shown in Table 1, which are also recommended by Watkins et al. [18] for static structural testing.
Table 1. Branch predicate based fitness function

Violated distinct predicate    Penalty imposed when the predicate is not satisfied
A < B                          A − B + δ
A ≤ B                          A − B
A > B                          B − A + δ
A ≥ B                          B − A
A = B                          abs(A − B)
A ≠ B                          δ − abs(A − B)

A and B are operands and δ is the smallest constant of the operands' universal domain. For integers it is 1; for real values it can be 0.1 or 0.01, depending on the accuracy needed in the solution.

After this, the integrated fitness of the whole CP is determined by adding the penalty values of two DPs if they are connected by a conditional "and" operator. If two DPs are connected by a conditional "or" operator, the minimum of their penalties is used in evaluating the fitness of the whole CP. If the integrated fitness is zero, the CP is said to be satisfied by the individual whose values were substituted into the CP, and the search for that path is terminated; otherwise the search is allowed to proceed.
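The penalty rules can be sketched in Python as below. Several points are assumptions made for illustration: the smallest domain constant is written `DELTA` and set to 1 (integer inputs), the penalty for a violated "not equal" predicate is read as `DELTA - abs(A - B)`, the tuple encoding of a CP is ours, and the example predicate for an equilateral-triangle path is hypothetical rather than taken from the test objects.

```python
import operator

DELTA = 1  # assumption: integer inputs, so the smallest step is 1

# Penalty for a violated predicate "a op b", following Table 1.
PENALTY = {
    "<":  lambda a, b: a - b + DELTA,
    "<=": lambda a, b: a - b,
    ">":  lambda a, b: b - a + DELTA,
    ">=": lambda a, b: b - a,
    "==": lambda a, b: abs(a - b),
    "!=": lambda a, b: DELTA - abs(a - b),
}
HOLDS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
         ">=": operator.ge, "==": operator.eq, "!=": operator.ne}

def dp_penalty(op, a, b):
    """Zero if the distinct predicate holds, else its branch-distance penalty."""
    return 0 if HOLDS[op](a, b) else PENALTY[op](a, b)

def cp_penalty(node, env):
    """node is ("and", l, r), ("or", l, r), or (op, lhs_fn, rhs_fn)."""
    kind = node[0]
    if kind == "and":   # penalties of both sides are added
        return cp_penalty(node[1], env) + cp_penalty(node[2], env)
    if kind == "or":    # the smaller penalty is taken
        return min(cp_penalty(node[1], env), cp_penalty(node[2], env))
    return dp_penalty(kind, node[1](env), node[2](env))

# Hypothetical CP for an "equilateral" path: a + b > c and a == b and b == c.
equilateral = ("and",
               (">", lambda e: e["a"] + e["b"], lambda e: e["c"]),
               ("and",
                ("==", lambda e: e["a"], lambda e: e["b"]),
                ("==", lambda e: e["b"], lambda e: e["c"])))

print(cp_penalty(equilateral, {"a": 3, "b": 4, "c": 5}))  # 2
print(cp_penalty(equilateral, {"a": 4, "b": 4, "c": 4}))  # 0
```

The candidate (3, 4, 5) satisfies the first predicate but violates both equalities by 1 each, giving a penalty of 2; the candidate (4, 4, 4) has zero penalty and would be accepted as a test case.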
V. EXPERIMENTAL SETUP AND RESULTS
To demonstrate the worth of the ABC method for test data generation, we experimented on ten real-world programs. The aim of the experiment was to generate test cases automatically from the corresponding CFG using the standard ABC algorithm. The CFGs of the programs are constructed automatically from the respective source code, and all feasible paths are identified manually. The fitness function corresponding to the target path is constructed using the concepts of symbolic testing and the path constraint system, as described in Section IV-B of this paper. The ABC algorithm is implemented in the MATLAB programming environment. The performance of the algorithm is measured using the Average Test Cases generated Per Path (ATCPP) and Average Percentage Coverage (APC) metrics. A high value of ATCPP signifies the difficulty faced by the search method in generating test data. The experiment is conducted 10 times to average the results. In each attempt, ABC is iterated for 100 generations in each of 10 runs. In each run except the first, the first-generation population is seeded with the best solution from the previous run. This is done to check premature convergence of the population. The total number of real-encoded individuals in each population is 30. If a solution is not found within all runs, which together generate 30,000 invalid test cases, test case generation is declared to have failed for that attempt. This value is obtained by multiplying the number of runs (10), the number of generations (100), and the number of individuals in each population (30). An invalid test case is a solution that does not qualify to become a test case.
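The attempt protocol and its budget can be summarized in a short Python sketch. The callback `run_search` is hypothetical: it stands for one 100-generation ABC run that takes the previous run's best solution (or `None` for the first run) and returns the best solution, whether it is a valid test case, and how many invalid test cases it generated.

```python
RUNS, GENERATIONS, POPULATION = 10, 100, 30
BUDGET = RUNS * GENERATIONS * POPULATION   # 30,000 invalid test cases

def attempt(run_search):
    """One attempt: chain the runs, seeding each with the previous best."""
    invalid = 0
    seed = None                      # the first run starts from a random population
    for _ in range(RUNS):
        best, found, n_invalid = run_search(seed)
        invalid += n_invalid
        if found:
            return best, invalid     # a valid test case was generated
        seed = best                  # later runs are seeded with the previous best
    return None, invalid             # budget exhausted: the attempt has failed

def never_succeeds(seed):
    # Stand-in search that spends a full run without finding a test case.
    return seed, False, GENERATIONS * POPULATION

print(BUDGET, attempt(never_succeeds))
```

With a search that never succeeds, the attempt ends after exactly the 30,000 invalid test cases stated above.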
We have chosen 10 real-world programs for the test data generation activity. Some of these are frequently used by researchers. They are called test objects here, and a brief explanation of each is given below.
• Triangle classifier (TC) is one of the most widely used programs in test data generation experiments. It accepts three inputs as the sides of a triangle and decides whether these sides form a triangle and, if so, of what type. This program contains a total of 7 feasible paths, of which four involve equality constraints.
• Line-rectangle classifier (LRC) identifies whether a line cuts a rectangle, lies completely outside it, or lies completely inside it. The program takes eight inputs: four for the coordinates of the rectangle and four to define the line. Some of the nodes in the CFG of this program have a very high level of nesting. This is the main reason for using this program, so that the difficulty of testing a nested structure can be assessed.
• Number of days between two dates (DBTD) accepts six integer input variables representing two dates. The input ranges for the first and second date years are between 2000 and 2100. This program contains many branches with equality conditions; some of them use the remainder operator, which adds discontinuity to the decision domains, so the tester may face greater difficulty in finding test cases that cover those branches. The nesting level is very high for some of the nodes. These characteristics make this program an ideal one for evaluating the effectiveness and efficiency of an automatic test generator for the path coverage criterion. This program also contains several loops. We have converted the loops into case statements in such a way that each condition within a loop is executed at least once by the test cases and each statement in the loop is covered.
• a2f (A2F) converts a numeric string into a real value. The main reason for taking this program is its complexity and nested structure, which make the compound constraint in symbolic testing more complex and harder to satisfy. It takes as input an array of numeric characters. The input domain for each position in the array is 0 to 127, which represents the characters in the ASCII table. This program has 15 decision nodes. The highest nesting level is seven, which is rare in most real-world programs. This program also contains a few branches with equality conditions and several loops. We have allowed loops to execute at most three times, thereby limiting the explosion in the number of paths while still giving every statement in the loop enough chance to execute and to carry its effect into later executions of the loop.
• Binary search (BS) accepts a variable-size array of at most 80 elements. The loop in the program is allowed to execute at most 5 times.
• Remainder (REM) finds the remainder of two integer numbers. It contains 4 loops, which are again restricted to 5 executions.
• Bubble sort (BUB) arranges an array in ascending order. It accepts a variable-size array. It is the only program in this set that contains a nested loop structure.
• Quadratic equation (QUAD) finds the roots of a quadratic equation. This program also tests the equation for linearity or infeasibility.
• Min-max (MINMAX) finds the minimum and maximum values in an array. In this program the loop is again allowed to execute at most 5 times.
• Isprime (ISPRIME) tests an integer for primality. This is the simplest program in the list.
Detailed characteristics of these test objects are given in Table 2.
TABLE 2. Test object characteristics

Name of Program  Lines of Code  Cyclomatic Complexity  Number of Decision Nodes  Highest Nesting Level  Total Paths in CFG  Feasible Paths
TC               35             7                      6                         5                      7                   7
LRC              56             19                     18                        12                     17                  17
DBTD             123            26                     22                        5                      1643                566
A2F              48             15                     14                        7                      910                 568
BS               23             5                      4                         3                      124                 62
REM              35             10                     8                         4                      22                  22
BUB              21             4                      3                         3                      121                 31
QUAD             24             6                      5                         3                      6                   6
MINMAX           27             4                      3                         3                      121                 121
ISPRIME          16             3                      2                         2                      10                  8

Table 3 presents the results of the testing effort on the 10 test objects selected for experimentation. Test cases for the TC and LRC programs are generated for each path from both a small and a large input domain, of size $10^4$ and $10^8$ respectively. ABC is able to generate test cases for all paths except in the cases of TC (small as well as large domain), LRC (large domain), and the BS program. This indicates that ABC is poorly suited to large input domains. It also frequently fails to generate test data for TC (small domain) for the path in which the triangle must be shown to be equilateral. We can therefore also conclude that the performance of the search algorithm is affected by the number of equality constraints in the target path. Other than these, binary search is the only program for which ABC fails to generate test cases for some paths. This may be due to the need to input a variable-size array to satisfy the boundary cases. Although we have taken a fixed array of size 80, its effective size is varied through an external variable n during experimentation. We have used the same approach for the A2F and BUB programs, but in these the boundary cases are not required to be satisfied.

TABLE 3. ATCPP and APC for test objects

Name of Program      ATCPP   APC
TC (small domain)    6197    85.71
TC (large domain)    17156   42.86
LRC (small domain)   1255    100
LRC (large domain)   3924    89.06
DBTD                 206     100
A2F                  3195    100
BS                   15545   51.94
REM                  970     100
BUB                  258     100
QUAD                 1930    100
MINMAX               619     100
ISPRIME              52      100

VI. CONCLUSION
We have proposed a swarm intelligence based approach for structural software testing. Experiments were performed on ten real-world programs. A symbolic execution method based on static testing was used: first the target path is selected from the CFG of the program, and then inputs are generated using the ABC method to satisfy the compound predicate corresponding to the target path. The technique performed satisfactorily for most of the programs, except those with large input domains and many equality-based path constraints.

REFERENCES
[1] J. Wegener, A. Baresel and H. Sthamer, "Evolutionary test environment for automatic structural testing," Information and Software Technology, 2001; Vol. 43, pp. 841-854.
[2] A. Windisch, S. Wappler and J. Wegener, "Applying particle swarm optimization to software testing," Proc. conference on Genetic and evolutionary computation GECCO'07, London, England, United Kingdom, July 2007, pp. 711.
[3] T. Schmickl, R. Thenius and K. Crailsheim, "Simulating swarm intelligence in honey bees: foraging in differently fluctuating environments," GECCO'05, Washington, DC, USA, 2005, pp. 273-274.
[4] T. Seeley, The Wisdom of the Hive, Harvard University Press, Cambridge, MA, 1995.
[5] E. Díaz, T. Javier, B. Raquel and J. Jose, "A tabu search algorithm for structural software testing," Computers and Operations Research (2007), doi:10.1016/j.cor.2007.01.009
[6] J. Edvardsson, "A survey on automatic test data generation," In Proceedings of the second conference on computer science and engineering, Linkoping: ESCEL; October 1999; pp. 21-28.
[7] P. Frankl and E. Weyuker, "An applicable family of data flow testing criteria," IEEE Transaction On Software Engineering. 1988; 14(10); pp. 1483-1498.
[8] J. Duran and S. Ntafos, "A report on random testing," International Conference on Software Engineering, Proceedings of the 5th international conference on Software engineering 1981, San Diego, California, United States, March 09 - 12, 1981.
[9] R. Thayer, M. Lipow and E. Nelson, Software Reliability, North-Holland, Amsterdam, 1978.
[10] D. Karaboga and B. Akay, "A survey: algorithms simulating bee swarm intelligence," Artificial Intelligence Review, Volume 31, Numbers 1-4, 2009, pp. 64-85.
[11] K. Ayari, S. Bouktif and G. Antoniol, "Automatic mutation test input data generation via ant colony," GECCO'07, July 7-11, 2007, London, England, United Kingdom.
[12] C. Chong, M. Low, A. Sivakumar and K. Gay, "A bee colony optimization algorithm to job shop scheduling," Proceedings of the 37th Winter Simulation, Monterey, California, 2006, pp. 1954-1961.
[13] B. Korel, "Automated software test data generation," IEEE transaction on software engineering, 1990; 16(8): pp. 870-879.
[14] R. Demillo and J. Offutt, "Constraint-based automatic test data generation," IEEE transaction on Software engineering, 1991; 17(9): pp. 900-910.
[15] J. Lin and P. Yeh, "Automatic test data generation for path testing using GAs," Information Sciences 2001; 131: pp. 47-64.
[16] N. Tracey, A Search-Based Automated Test-Data Generation Framework for Safety Critical Software, PhD thesis, University of York, 2000.
[17] N. Mansour and M. Salame, "Data generation for path testing," Software Quality Journal 2004; 12: pp. 121-136.
[18] A. Watkins and E. Hufnagel, "Evolutionary test data generation: a comparison of fitness functions," Software Practice & Experience 2006; 36: pp. 95-116.
[19] P. McMinn, "Search-based software test data generation: A survey," Software Testing, Verification and Reliability June 2004; 14(2): pp. 105-156.
[20] C. Michael, G. McGraw and M. Schatz, "Generating software test data by evolution," IEEE Transactions on Software Engineering 2001; 27(12): pp. 1085-1110.
[21] W. Miller and D. Spooner, "Automatic generation of floating-point test data," IEEE Transactions on Software Engineering 1976; 2(3): pp. 223-226.
[22] A. Watkins, "The automatic generation of test data using genetic algorithms," In The fourth software quality conference 1995; 2: pp. 300-309.
[23] G. Myers, The art of software testing. New York: Wiley; 1979.
[24] R. Pargas, M. Harrold and R. Peck, "Test-data generation using genetic algorithms," Journal of Software Testing, Verification and Reliability 1999; 9(4): pp. 263-82.
[25] S. Xanthakis, C. Ellis, C. Skourlas, A. Gall, S. Katsikas and K. Karapoulios, "Application of genetic algorithms to software testing," In The fifth international conference on software engineering 1992; pp. 625-36.
[26] M. Roper, "Computer aided software testing using genetic algorithms," In 10th International Software Quality Week, San Francisco, USA, 1997.
[27] Z. Yuan, A search-based framework for automatic test-set generation for MATLAB/Simulink Models. PhD Thesis, University of York Department of Computer Science, December 2005.
[28] S. Dahiya, J. Chhabra and S. Kumar, "Application of Particle Swarm Optimization Algorithm to Symbolic Software Testing," IISN-2010, to be held on 24-27 February 2010. (Accepted for publication)
[29] S. Nakrani and C. Tovey, "On honey bees and dynamic allocation in an internet server colony," Proceedings of 2nd International Workshop on the Mathematics and Algorithms of Social Insects, Atlanta, Georgia, USA, 2003.
[30] D. Pham, S. Otri, A. Afify, M. Mahmuddin, and H. Al-Jabbouli, "Data clustering using the bees algorithm," In 40th CIRP International Seminar on Manufacturing Systems. 2007: Liverpool.
[31] D. Pham, S. Otri, A. Ghanbarzadeh and E. Koç, "Application of the bees algorithm to the training of learning vector quantisation networks for control chart pattern recognition," ICTTA'06 Information and Communication Technologies, 2006, pp. 1624-1629.