
2013 Sixth International Conference on Business Intelligence and Financial Engineering

Scheduling Workflow in Cloud Computing Based on Ant Colony Optimization Algorithm

Yue Zhou
School of Computer Science and Technology
East China Normal University
Shanghai, China
E-mail: zhouyue0101@126.com

XinLi Huang
School of Computer Science and Technology
East China Normal University
Shanghai, China
E-mail: xlhuang@cs.ecnu.edu.cn
characteristics of cloud computing, this paper substantially improves the
ant colony optimization algorithm to increase its optimization ability and
execution speed.

Abstract: Cloud computing environments facilitate applications by
providing virtualized resources that can be provisioned dynamically.
Scheduling applications onto cloud resources must therefore take the
execution time into account. In this paper, we propose a scheduling
strategy based on ant colony optimization (ACO) in which a two-way ants
mechanism is introduced. A pheromone threshold is set to avoid premature
convergence; in addition, a two-tier search strategy and a pre-execution
time are introduced to avoid local optima, so that tasks can be assigned to
the most efficient computing resources. The simulation results show that
the algorithm can greatly shorten the time to find computing resources in a
cloud computing environment and significantly improve efficiency.

II.

RELATED WORK

There are many papers about scheduling in grid environments. For
conventional applications with independent tasks, simple methods such as
Min-Min and Max-Min [6, 7] are used to meet the QoS. In [8], Suraj Pandey
et al. used particle swarm optimization (PSO) to schedule workflows in
order to reduce the time. Time was also treated as the optimization
objective in [9] and [10]. Paper [11] proposed Multiobjective Differential
Evolution (MODE). Wei-Neng Chen and Jun Zhang [12] used ACO to solve
workflow scheduling with various QoS requirements in grid computing.
Genetic-based optimization techniques have also been used to solve
scheduling problems in grid environments; for a review, readers are
referred to [13, 14, 15]. Although these approaches worked effectively in
grid environments, they cannot be directly applied to scheduling problems
in cloud computing, because cloud computing is more commercialized than
grid computing.
In cloud computing, Meng Xu et al. [16] introduced a Multiple QoS
constrained scheduling strategy of Multi-Workflows (MQMW) to solve the
workflow scheduling problem. Zhangjun Wu et al. [17] used ACO, GA and PSO
to solve the workflow scheduling problem; the experimental results show
that the ACO-based scheduling algorithm performs better than the others.

Keywords: cloud computing; ACO; workflow; two-way ants; two-tier search;
pre-execution
I.

INTRODUCTION

Cloud computing does not have a common definition; Vaquero et al. [1]
compared more than 20 different concepts of cloud computing in order to
give a common definition. Cloud computing is a further development of
distributed computing, parallel processing and grid computing, and is
Internet-based computing [2]. Cloud computing is a new type of shared
infrastructure which can connect huge pools of systems to provide users
with a variety of storage and computing resources via the Internet [3].
The business features of cloud computing require it to meet users'
application needs. These applications can usually be broken down into a
number of interrelated tasks [4]. Therefore, effective scheduling
optimization is needed so that the workflow tasks can together complete an
entire application.
The ant colony optimization algorithm is a self-adaptive search algorithm
that simulates the process of ants looking for food. It is parallel,
distributed, scalable, easy to implement and robust, and it also shows high
flexibility and robustness in dynamic environments. It has successfully
solved many combinatorial optimization problems. The ant colony
optimization algorithm is a very appropriate way to solve resource
scheduling problems in cloud computing [5]. The efficiency and accuracy of
the scheduling algorithm directly affect the performance of the cloud. This
article studies the mechanism of the ant colony optimization algorithm and,
combined with the
III.

SCHEDULING WORKFLOW STRATEGY BASED ON ANT COLONY ALGORITHM

A. Two-way ant mechanism


The ants are divided into two categories: the Forward-Ant and the
Back-Ant. A Forward-Ant is used to find the available virtual machine nodes
in the cloud. A Back-Ant is produced when the Forward-Ant finds an
available resource. The Back-Ant returns along the original path and leaves
the pheromone of the available resource on the nodes it passes. The
structures of the Forward-Ant and the Back-Ant are shown in Table I and
Table II.
This work is supported by the project of Shanghai Science and Technology
Innovation Action Plan (Grant No. 13511500400).
Corresponding Author: XinLi Huang. E-mail: xlhuang@cs.ecnu.edu.cn

978-1-4799-4777-5/14 $31.00 2014 IEEE
DOI 10.1109/BIFE.2013.14

TABLE I. FORWARD-ANT

Ant ID | Nodes on the path
Ant ID | Ns


TABLE II. BACK-ANT

Ant ID | Nodes on the path | Pre-execution time | Pheromone of the available resource
Ant ID | NS                | ETe                |
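
The two ant records in Table I and Table II can be pictured as small data structures. The following is a minimal Python sketch, not the authors' implementation; the class, field and function names (`ForwardAnt`, `BackAnt`, `path`, `make_back_ant`) are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ForwardAnt:
    """Searches for available nodes and records the nodes it visits (Table I)."""
    ant_id: int
    path: List[int] = field(default_factory=list)  # nodes on the path

    def visit(self, node_id: int) -> None:
        self.path.append(node_id)


@dataclass
class BackAnt:
    """Carries the result of a successful search back to the Master node (Table II)."""
    ant_id: int
    path: List[int]            # nodes on the path, copied from the Forward-Ant
    pre_execution_time: float  # pre-execution time of the available node
    pheromone: float           # pheromone of the available resource

    def return_path(self) -> List[int]:
        # The Back-Ant walks the Forward-Ant's path in reverse order.
        return list(reversed(self.path))


def make_back_ant(forward: ForwardAnt, pre_execution_time: float,
                  pheromone: float) -> BackAnt:
    """Create the Back-Ant that an available node produces (hypothetical helper)."""
    return BackAnt(forward.ant_id, list(forward.path), pre_execution_time, pheromone)
```

Copying the path into the Back-Ant keeps the two records independent, so a Forward-Ant could in principle continue searching after a Back-Ant has been sent.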

Where is the pheromone of the node at ,
represents the pheromone of the node at when the task is
assigned, is the adjustment factor and
.
When a task runs to completion or fails, the load on the system is
reduced. To keep the load on the node balanced, the pheromone is updated
according to formula (8)

B. The definition and updating of pheromone


1) The definition of pheromone
We use the virtual hardware resources of a node as its pheromone.
represents the number of CPUs, stands for the CPU
processing power (MIPS), represents the memory capacity,
is the storage capacity and is the bandwidth. A threshold
value is set for each parameter according to formula (1); if a parameter
exceeds its threshold, it is set to the threshold value.






Where
is the pheromone of the node at ,
represents the pheromone of the node at when the task
runs to completion or failure, is the adjustment factor and
.
When a task runs to completion successfully on a node, the pheromone of
that node increases; if the task fails, the pheromone of the node
decreases. Therefore, on the basis of the above formula, the factor 2 is
added to update the pheromone of the node, giving formula (9)

  

Firstly initialize the pheromone of the hardware:


The pheromone of CPU:




  

  
The pheromone of storage:
  



The pheromone of bandwidth:

  

Where is the pheromone of the available node ,


represents the node in the back path.
is the adjustment
factor and
.

  
The pheromone of node i is calculated according to formula (6):

The number of tasks on the available nodes decreases with the passage of
time and the load is reduced, so the pheromone increases. Therefore, from
time to time, the pheromone of every node on the path is increased
according to formula (11).

  

is the pheromone of the node .
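
The threshold rule and the per-resource pheromones described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exact forms of formulas (1) to (6) are not reproduced here, so the weighted sum below, the weights 4, 2, 2, 2 (taken from the experimental settings for CPU, memory, storage and bandwidth) and all names (`Resources`, `clamp`, `node_pheromone`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    cpu_count: int
    mips: float       # CPU processing power
    memory: float     # memory capacity
    storage: float    # storage capacity
    bandwidth: float  # bandwidth


def clamp(value: float, threshold: float) -> float:
    """If the threshold is exceeded, the parameter is set to the threshold."""
    return min(value, threshold)


def node_pheromone(res: Resources, limit: Resources,
                   weights=(4, 2, 2, 2)) -> float:
    """Combine the clamped hardware parameters into one pheromone value.

    Assumption: CPU pheromone = cpu_count * mips, and the node pheromone is a
    weighted sum of the four resource pheromones.
    """
    cpu = clamp(res.cpu_count, limit.cpu_count) * clamp(res.mips, limit.mips)
    memory = clamp(res.memory, limit.memory)
    storage = clamp(res.storage, limit.storage)
    bandwidth = clamp(res.bandwidth, limit.bandwidth)
    a, b, c, d = weights
    return a * cpu + b * memory + c * storage + d * bandwidth
```

For example, a node with 2 CPUs at 300 MIPS, memory 1024, storage 15 and bandwidth 1.5, with limits 4, 350, 768, 20 and 2, has its memory clamped to 768 before the weights are applied.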


2) Update the pheromone
The modification of pheromone is divided into two categories: one updates
the node that is an available resource; the other updates the nodes that
the Back-Ant passes through on its way back.
a) Update the available node
When a new task is assigned to the node, CPU utilization increases and the
pheromone decreases, so the pheromone is updated according to formula (7)


  

Where
is the pheromone of the node at ,
represents the pheromone of the node at . If the task runs
to completion successfully, 0<2<1; otherwise -1<2<0.
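
The three updates of the available node (assignment, release of load, success or failure) can be illustrated as follows. This is a hedged sketch: the text states only the direction of each change and the parameter ranges, so the multiplicative forms, the function names and the default values (0.2 and plus or minus 0.2, borrowed from the experimental settings) are assumptions rather than the paper's formulas (7) to (9).

```python
def on_task_assigned(tau: float, rho: float = 0.2) -> float:
    """Assumed form of formula (7): a new task lowers the node's pheromone."""
    return (1.0 - rho) * tau


def on_task_finished(tau: float, rho: float = 0.2) -> float:
    """Assumed form of formula (8): finishing or failing frees load,
    so the pheromone rises again."""
    return (1.0 + rho) * tau


def on_task_outcome(tau: float, success: bool,
                    rho: float = 0.2, outcome_factor: float = 0.2) -> float:
    """Assumed form of formula (9): formula (8) plus an outcome factor that is
    positive after a success and negative after a failure."""
    signed = outcome_factor if success else -outcome_factor
    return (1.0 + rho + signed) * tau
```

With these assumed forms a node that keeps failing regains less pheromone than a node that completes its tasks, which matches the intent described in the text.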
b) Update the pheromone of the nodes on the path
The available node produces a Back-Ant carrying its pheromone. The
Back-Ant returns to the Master node and leaves the pheromone on the nodes
along the path. The pheromone of each node on the path is updated according
to formula (10).

The pheromone of memory:



  

  
Where is the pheromone of the node at ,
represents the pheromone of the node at , is the
adjustment factor and
.


C. The definition of the pre-execution time


1) The definition of the pre-execution time
In a cloud computing environment, a node may have to run multiple tasks at
the same time. If the cloud assigns tasks to the more efficient nodes, the
performance of the cloud improves considerably, so the calculated
pre-execution time of a task plays a key role in selecting the next node.
This paper introduces the task pre-execution time according to formula (12)

  
Where
is the probability that the ant on node selects
the next node , represents the pre-execution time of
node j, represents the importance of the pheromone,
represents the importance of the pre-execution time.
is the
set of nodes next to node , node is one of the ,
represents the set of nodes next to node ,
is the adjustment factor and
.

  
Where is the pre-execution time of a task,
represents the predicted number of tasks,
represents the previous number of tasks,
stands for the real execution time,
stands for the previous pre-execution time, is the adjustment
factor and
.
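
To make the role of these quantities concrete, the sketch below estimates a pre-execution time from the last measurement and the last estimate. The exact form of formula (12) is not reproduced here, so this is an assumption: the previous estimate and the real execution time are blended by an adjustment factor (called `lam` here) and scaled by the ratio of predicted to previous task counts. The function and parameter names are hypothetical.

```python
def pre_execution_time(predicted_tasks: int, previous_tasks: int,
                       real_time: float, previous_estimate: float,
                       lam: float = 0.5) -> float:
    """Assumed estimate of a task's pre-execution time on a node.

    lam weights the real execution time against the previous estimate
    (0 < lam < 1); the result grows with the predicted load on the node.
    """
    if previous_tasks <= 0:
        raise ValueError("previous_tasks must be positive")
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    blended = lam * real_time + (1.0 - lam) * previous_estimate
    return blended * predicted_tasks / previous_tasks
```

For instance, with 4 predicted tasks, 2 previous tasks, a real time of 10, a previous estimate of 20 and `lam` = 0.5, the blended time is 15 and the estimate is 30.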

IV.

DESCRIPTION OF THE SCHEDULING WORKFLOW STRATEGY

The scheduling workflow algorithm is as follows:

(1) Initialize the pheromone of each node.
(2) Submit the batch of jobs to the Master node.
(3) The Master node selects the first job . Assume that the size of the
    job is and that the job is divided into tasks, each of size . The
    Master node starts a timer and sends forward-ants. is a parameter
    that decides the multiple relationship between the number of
    forward-ants and the number of tasks. Every forward-ant randomly
    selects the next node.
(4) When a forward-ant arrives at node , node i is added to the list of
    path nodes of the Forward-Ant. The pre-execution time is calculated
    according to formula (12); if the pre-execution time is less than ,
    node i is an available node, otherwise it is not.
(5) If the node is an available node, a Back-Ant is produced, which takes
    the list of path nodes of the Forward-Ant, the pheromone of the node
    and the pre-execution time. The pheromone of each node on the back
    path is updated according to formula (10).
(6) If the node is not an available node and the pheromone of the node
    does not reach the threshold , the Forward-Ant selects the next node
    randomly.
(7) If the node is not an available node but the pheromone of the node
    reaches the threshold , the forward-ant selects the next node
    according to formula (13).
(8) The pheromone of the nodes on the back path is updated according to
    formula (11) from time to time.
(9) Before the timer of the Master node reaches zero, if the Master node
    receives a Back-Ant, the Master node assigns the tasks to the
    available nodes which have the least pre-execution time. The
    pheromone of the available node is updated according to formula (7).
    If the Master node does not

D. Rules for the Forward-Ant to choose the next node


1) Description of the problem
In the ant colony algorithm, an ant selects the next node based on the
concentration of the pheromone. If the selection is based only on the
pheromone concentration, then at the beginning of the algorithm the ants
select the paths through nodes with a high initial pheromone, which leads
to singularity of the solution. As time goes on, the algorithm easily falls
into a local optimum. To solve the problem of singularity, this paper
proposes setting a threshold. To solve the problem of local optima, this
paper proposes a two-tier search strategy.
2) Solution to the problem
a) Setting the threshold value
Owing to the characteristics of the ant colony algorithm, an ant always
tends to select the node with the higher pheromone concentration. When the
high pheromone concentration lies on just a few nodes, those nodes are more
likely to be selected. If all the ants concentrate on those few paths early
in the search, the solution becomes singular. To solve this and let the
ants find more effective nodes, this paper proposes the following solution:
a pheromone threshold is set on every node. If the pheromone on the node is
less than the threshold, the ants ignore the pheromone concentration and
select the next node randomly; if the pheromone on the node is not less
than the threshold, the ant selects the next node according to the two-tier
search strategy. Because the ants can select multiple paths early in the
search, a diversity of solutions is ensured.
b) Two-tier search strategy
In the ant colony algorithm, the ants select the next node based on the
pheromone concentration. When the pheromone concentration on a path is too
high or too low, the algorithm easily falls into a local optimum. To solve
this problem, we use the two-tier search strategy to select the next node.
When ant k is on the search path (i, j), both the pheromone of node j and
that of the nodes next to j are taken into account. This keeps the
algorithm from falling into a local optimum. The ants select the next node
according to formula (13)
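
The threshold rule and the two-tier search can be combined into one selection routine. The sketch below is illustrative only: the exact form of formula (13) is not reproduced here, so the score used (pheromone of the candidate plus an adjustment factor times the mean pheromone of its successors, weighted against the inverse pre-execution time) is an assumption, as are the names `two_tier_score`, `selection_probabilities`, `choose_next` and the parameters `alpha`, `beta`, `gamma`.

```python
import random


def two_tier_score(j, pheromone, neighbors, pre_exec,
                   alpha=2.0, beta=2.5, gamma=0.5):
    """Score candidate j using its own pheromone and that of its successors."""
    successors = neighbors.get(j, [])
    second_tier = (sum(pheromone[k] for k in successors) / len(successors)
                   if successors else 0.0)
    combined = pheromone[j] + gamma * second_tier
    # A shorter pre-execution time makes the node more attractive.
    return (combined ** alpha) * ((1.0 / pre_exec[j]) ** beta)


def selection_probabilities(i, pheromone, neighbors, pre_exec, **params):
    """Normalize the two-tier scores of the successors of node i."""
    scores = {j: two_tier_score(j, pheromone, neighbors, pre_exec, **params)
              for j in neighbors[i]}
    total = sum(scores.values())
    return {j: score / total for j, score in scores.items()}


def choose_next(i, pheromone, neighbors, pre_exec, threshold,
                rng=random, **params):
    """Below the pheromone threshold pick randomly, otherwise use two tiers."""
    candidates = neighbors[i]
    if pheromone[i] < threshold:
        return rng.choice(candidates)
    probs = selection_probabilities(i, pheromone, neighbors, pre_exec, **params)
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when comparing parameter settings.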


receive a Back-Ant, there is no available node and the Master node does
not assign tasks.
(10) When tasks are completed or have failed, the pheromone of the
     available node is updated according to formula (9); the tasks that
     were not completed are assigned to another node by the Master node.
(11) The Master node accepts the next job. Repeat steps (3) to (10).
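
Step (9) of the list above, in which the Master node hands tasks to the available nodes with the least pre-execution time, can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `assign_tasks`, the `(node_id, pre_execution_time)` report format and the wrap-around when there are more tasks than available nodes are all assumptions.

```python
from typing import Dict, Hashable, List, Tuple


def assign_tasks(tasks: List[Hashable],
                 reports: List[Tuple[int, float]]) -> Dict[Hashable, int]:
    """Map each task to an available node, best pre-execution time first.

    reports holds one (node_id, pre_execution_time) pair per Back-Ant that
    reached the Master node before its timer expired.
    """
    if not reports:
        # No Back-Ant arrived: there is no available node, assign nothing.
        return {}
    best: Dict[int, float] = {}
    for node_id, time in reports:
        # Keep the smallest reported time if a node was reported twice.
        if node_id not in best or time < best[node_id]:
            best[node_id] = time
    ranked = sorted(best, key=best.get)
    # Assumption: wrap around when tasks outnumber available nodes.
    return {task: ranked[idx % len(ranked)] for idx, task in enumerate(tasks)}
```

Tasks that later fail would be passed through the same function again with a fresh set of reports, mirroring step (10).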
V.

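TABLE III (continued): parameter values for experiments 7 to 12

Experiment | N | Importance of pheromone | Importance of pre-execution time
7          | 1 | 1                       | 2.5
8          | 1 | 2                       | 2.5
9          | 2 | 1                       | 2.5
10         | 1 | 1                       | 3
11         | 1 | 2                       | 3
12         | 2 | 1                       | 3
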

As can be seen from TABLE III, the scheduling time is the least (3.671,
experiment 8) when the three parameters are set to 1, 2 and 2.5.
With a job size of 2000 MB and the three parameters set to 1, 2 and 2.5,
TABLE IV shows the scheduling times of ACON, ACO, AS and ACS.
As can be seen from TABLE IV, the scheduling time of ACON is the least,
which is due to the shorter time spent searching for virtual machines.
With a single job size of 2000 MB, we submit 5 to 25 jobs, repeat the
experiment 10 times and take the average time of all the results; see
Figure 1.
As can be seen from Figure 1, the strategy proposed in this paper is more
efficient than ACO, AS and ACS.

EXPERIMENTS AND RESULTS

To verify the new algorithm, we use CloudSim to run the simulations. We
compare the resource scheduling time of ACON (Ant Colony Optimization New)
with that of AS, based on the ant system, ACS, based on the ant colony
system, and another ant colony optimization algorithm (ACO).
A. Time Complexity
The time complexity of the algorithm
is
, where k is the number of tasks,
is the time complexity of the Forward-Ants finding nodes and
is the time complexity of assigning the tasks to the nodes. The algorithm
introduced in this paper minimizes the time needed to find the available
resources in order to improve efficiency.

VI.

SUMMARY

Considering the defects of ant colony optimization and the characteristics
of cloud computing, this paper proposes a new scheduling strategy, Ant
Colony Optimization New (ACON), based on ant colony optimization (ACO).
ACON combines a two-way ants mechanism, a pheromone threshold, a two-tier
search strategy and a pre-execution time. The simulation results show that
the algorithm can greatly shorten the time to find computing resources in a
cloud computing environment and significantly improve efficiency.

B. Experimental parameter setting


In the algorithm, , , , represent the importance of the CPU, memory,
storage and bandwidth of the virtual machine. Because the CPU is more
important, the values of , , , are set to 4, 2, 2, 2. , represent the
importance of the pheromone and of the pre-execution time, and
represents the number of ants. The values of , , are set by
experiment. The adjustment factors , , , are set to 0.2. When the task
is completed successfully, is set to 0.2; otherwise it is set to -0.2.
is set to 0.3. is set to 0.4.
The number of virtual machine nodes is 200. The Master's timer is set to 5
seconds. The size of a task is 500M. The CPU speed is 200MIPS~400MIPS, the
memory size is 512M~1G, the bandwidth is 1M/S~2M/S and the storage size
is 10G~20G.
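
For readers who want to reproduce a comparable setup outside CloudSim, the settings above can be written down as a small configuration. The sketch below is an assumption-laden illustration: the names (`SETTINGS`, `VmNode`, `make_nodes`), uniform sampling within each range and the reading of 1G as 1024M are not from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class VmNode:
    node_id: int
    mips: float
    memory_mb: float
    bandwidth_mb_s: float
    storage_gb: float


SETTINGS = {
    "node_count": 200,
    "master_timer_s": 5,
    "task_size_mb": 500,
    "resource_weights": (4, 2, 2, 2),  # CPU, memory, storage, bandwidth
    "mips_range": (200, 400),
    "memory_mb_range": (512, 1024),
    "bandwidth_mb_s_range": (1, 2),
    "storage_gb_range": (10, 20),
}


def make_nodes(settings=SETTINGS, seed=0):
    """Draw virtual machine nodes uniformly from the configured ranges."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    return [
        VmNode(
            node_id=i,
            mips=rng.uniform(*settings["mips_range"]),
            memory_mb=rng.uniform(*settings["memory_mb_range"]),
            bandwidth_mb_s=rng.uniform(*settings["bandwidth_mb_s_range"]),
            storage_gb=rng.uniform(*settings["storage_gb_range"]),
        )
        for i in range(settings["node_count"])
    ]
```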

TABLE IV. THE DIFFERENT SCHEDULING TIME OF ACON, ACO, AS, ACS

Algorithm | N | Importance of pheromone | Importance of pre-execution time | Total time
ACON      | 1 | 2                       | 2.5                              | 3.671
ACO       | 1 | 2                       | 2.5                              | 3.683
AS        | 1 | 2                       | 2.5                              | 3.713
ACS       | 1 | 2                       | 2.5                              | 3.693
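
As a quick arithmetic check on Table IV, the relative difference between ACON and each of the other algorithms can be computed directly from the reported total times. The helper name `relative_gain` is an illustrative assumption; the numbers are those of the table.

```python
# Total scheduling times reported in Table IV.
totals = {"ACON": 3.671, "ACO": 3.683, "AS": 3.713, "ACS": 3.693}


def relative_gain(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is shorter than `baseline`."""
    return (baseline - improved) / baseline * 100.0


gains = {
    name: round(relative_gain(total, totals["ACON"]), 2)
    for name, total in totals.items()
    if name != "ACON"
}

for name, gain in gains.items():
    print(f"ACON is {gain}% faster than {name}")
```

On these figures ACON is about 0.33% faster than ACO, 1.13% faster than AS and 0.6% faster than ACS.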

C. The results and analysis


The size of the job is 2000 MB, and the job is submitted 12 times. We learn
that the scheduling time is the least when the three parameters take the
values 1, 2 and 2.5. Repeating the experiment with different job sizes
gives the same result, so the three parameters are set to 1, 2 and 2.5.
Refer to TABLE III.
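TABLE III. THE DIFFERENT SCHEDULING TIME WITH DIFFERENT VALUES OF N AND THE TWO IMPORTANCE PARAMETERS

Experiment | N | Importance of pheromone | Importance of pre-execution time | Total time
1          | 1 | 1                       | 1.5                              | 3.756
2          | 1 | 2                       | 1.5                              | 3.744
3          | 2 | 1                       | 1.5                              | 3.730
4          | 1 | 1                       | 2                                | 3.733
5          | 1 | 2                       | 2                                | 3.720
6          | 2 | 1                       | 2                                | 3.711
7          | 1 | 1                       | 2.5                              | 3.698
8          | 1 | 2                       | 2.5                              | 3.671
9          | 2 | 1                       | 2.5                              | 3.699
10         | 1 | 1                       | 3                                | 3.715
11         | 1 | 2                       | 3                                | 3.703
12         | 2 | 1                       | 3                                | 3.714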

Figure 1. The time of ACON, ACS, AS, ACO


REFERENCES
[1] Vaquero L, Rodero-Merino L, Caceres J, Lindner M (2009) A break in the
    clouds: towards a cloud definition. ACM SIGCOMM Computer Communication
    Review.
[2] Armbrust, M., Fox, A., et al.: Above the Clouds: A Berkeley View of
    Cloud Computing. Technical Report No. UCB/EECS-2009-28, University of
    California at Berkeley, USA (February 10, 2009).
[3] M.A. Vouk. Cloud computing issues, research and implementations.
    Journal of Computing and Information Technology, 16(4):235-246, 2008.
[4] Yu, J., Buyya, R.: A taxonomy of scientific workflow systems for Grid
    computing, SIGMOD Record, Special Section on Scientific Workflows.
    2005, 34(3):44-49.
[5] Dorigo M, Caro GD. Ant colony optimization: A new meta-heuristic [A].
    Proc. of the 1999 Congress on Evolutionary Computation [C].
    Washington: IEEE Press, 1999. 1470-1477.
[6] D. B. Tracy, J. S. Howard, and B. Noah. Comparison of Eleven Static
    Heuristics for Mapping a Class of Independent Tasks onto Heterogeneous
    Distributed Computing Systems. Journal of Parallel and Distributed
    Computing, vol. 61, no. 6, 2001, pp. 810-837.
[7] J. Yu, R. Buyya. Workflow Scheduling Algorithms for Grid Computing.
    Metaheuristics for Scheduling in Distributed Computing Environments,
    F. Xhafa and A. Abraham (eds), ISBN: 978-3-540-69260-7, Springer,
    Berlin, Germany, 2008.
[8] Suraj Pandey, Linlin Wu, Siddeswara Guru, Rajkumar Buyya. A Particle
    Swarm Optimization (PSO)-based Heuristic for Scheduling Workflow
    Applications in Cloud Computing Environments. Technical Report,
    CLOUDS-TR-2009-11, Cloud Computing and Distributed Systems Laboratory,
    The University of Melbourne, Australia, October 2009.
[9] R. Sakellariou, H. Zhao, E. Tsiakkouri, and M. D. Dikaiakos.
    Scheduling Workflows with Budget Constraints. CoreGRID Workshop on
    Integrated Research in Grid Computing. Technical Report TR-05-22,
    University of Pisa, Dipartimento di Informatica, Pisa, Italy, November
    2005, pp: 347-357.
[10] J. Yu, R. Buyya, and C. K. Tham. A Cost-based Scheduling of Scientific
    Workflow Applications on Utility Grids. Proc. of the 1st IEEE
    International Conference on e-Science and Grid Computing, Melbourne,
    Australia, December 2005, pp: 140-147.
[11] Khaled Talukder, Michael Kirley, Rajkumar Buyya. Multiobjective
    Differential Evolution for Scheduling Workflow Applications on Global
    Grids. Concurrency and Computation: Practice and Experience. Wiley
    Press, New York, USA. 21(13), pp: 1742-1756, 2009.
[12] Wei-Neng Chen, Jun Zhang. An Ant Colony Optimization Approach to a
    Grid Workflow Scheduling Problem with Various QoS Requirements. IEEE
    Transactions on Systems, Man, and Cybernetics, 2009.
[13] J. Yu and R. Buyya. Scheduling Scientific Workflow Applications with
    Deadline and Budget Constraints using Genetic Algorithms. Scientific
    Programming Journal, IOS Press, 2006, 14(3-4), pp: 217-230.
[14] Kim S, Weissman JB. A genetic algorithm based approach for scheduling
    decomposable data Grid applications. ICPP. IEEE Computer Society:
    Silver Spring, MD, 2004, pp: 406-413.
[15] Ye G, Rao R, Li M. A multiobjective resource scheduling approach
    based on genetic algorithms in Grid environment. Fifth International
    Conference on Grid and Cooperative Workshops, Hunan, China, 2006;
    504-509.
[16] Meng Xu, Lizhen Cui, Haiyang Wang, Yanbing Bi. A Multiple QoS
    Constrained Scheduling Strategy of Multiple Workflows for Cloud
    Computing. 2009 IEEE International Symposium on Parallel and
    Distributed Processing with Applications. 2009, pp: 629-634.
[17] Zhangjun Wu, Xiao Liu, Zhiwei Ni, Dong Yuan, Yun Yang. A
    Market-Oriented Hierarchical Scheduling Strategy in Cloud Workflow
    Systems. Journal of Supercomputing, Special Issue on Advances in
    Network & Parallel Computing, to appear, 2012.
[18] Mohammed Alhamad, Tharam Dillon, Elizabeth Chang. Conceptual SLA
    framework for Cloud Computing. 4th IEEE International Conference on
    Digital Ecosystems and Technologies. 2010, pp: 606-610.
