
Reinforcement Learning Based Virtual Cluster Management

Anshul Gangwar
Dept. of Computer Science and Automation, Indian Institute of Science



Virtualization:
The ability to run multiple operating systems on a single physical system and share the underlying hardware resources.*

Cloud Computing:
The provisioning of services in a timely (near instant), on-demand manner, to allow the scaling up and down of resources.**

* VMware white paper, Virtualization Overview
** Alan Williamson, quoted in Cloud BootCamp, March 2009

The Traditional Server Concept


[Diagram: Servers 1 through n, each running a single application on its own OS, at utilizations App 1: 30%, App 2: 40%, App 3: 25%, App 4: 30%, App 5: 20%, ..., App m: 28%, App n: 50%]

[Graph: rate of server accesses over time; high from 9 A.M. to 5 P.M. M-F, low at all other times]

Machine provisioning is done for peak demand
Processors are underutilized during off-peak hours
Resources are wasted

Needed: technology and algorithms that allow allocation of only as many resources as are required

Server Consolidation Process


[Diagram, before consolidation: Servers 1 through n, each running a single application on its own OS at the utilizations shown on the previous slide]

Consolidation process:
Applications are moved into VMs that share fewer servers, each running on a hypervisor
Allows shutting down of idle PMs, saving operational costs

[Diagram, after consolidation: SERVER 1 hosts VM 1 (App 1, 30%), VM 2 (App 2, 40%), and VM 3 (App 3, 25%) on a hypervisor; SERVER 2 hosts VM 4 (App 4, 30%) and VM 5 (App 5, 20%); SERVER m hosts VM m (App m, 28%) and VM n (App n, 50%)]

Load Distribution
[Diagram: application workloads shift over time while the VMs keep running; App 2 drops from 40% to 20%, App 3 rises from 25% to 45%, and App 5 rises from 20% to 25%, changing the load on each server]

Live Migration
[Diagram: live migration rebalances load; before, SERVER 1 hosts VM 1 (App 1, 30%), VM 2 (App 2, 20%), VM 3 (App 3, 45%), and VM 4 (App 4, 30%), while SERVER m hosts VM 5 (App 5, 25%), VM m (App m, 28%), and VM n (App n, 50%); migrations such as "Migrate VM 5" move VM 3, VM 4, and VM 5 onto SERVER 2, leaving SERVER 1 with VM 1 and VM 2, and SERVER m with VM m and VM n]

Dynamic Resource Allocation

Dynamic workloads require dynamic resource management:

Allocation of resources (CPU, memory, etc.) to the VMs in each PM
Allocation of VMs to PMs, minimizing the number of operational PMs

Modern virtualization platforms (e.g., Xen) allow:
Resource allocation within a PM
Dynamic allocation of VMs to PMs through live migration

Required: an architecture and mechanisms for
Determining the resource allocation to each VM within a PM
Determining the deployment of VMs on PMs
so that capital and operational costs are minimized and application performance is maximized

Two Level Controller Architecture


[Diagram: a central VM Placement Controller sits above the PMs; each PM runs several VMs plus a PMA and a RAC; the PMAs send performance measures up to the VM Placement Controller, which sends migration decisions back down, while each RAC determines the resource requirements of the VMs on its PM]

PMA: Performance Measurement Agent
RAC: Resource Allocation Controller

Note: PMs (physical machines/servers) are assumed to be homogeneous.
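As a rough illustration of this two-level split, the hypothetical Java interfaces below only mirror the data flow in the diagram (PMA reports measures, the placement controller decides migrations, the RAC sets per-PM shares); none of these type or method names come from the actual system.

```java
import java.util.List;

// Hypothetical interfaces mirroring the two-level architecture.
// All names are illustrative assumptions, not from the original project.
interface PerformanceMeasurementAgent {
    // Per-VM performance measures (e.g., response time, throughput) on one PM.
    List<Double> measure();
}

interface ResourceAllocationController {
    // Decide CPU/memory shares for the VMs hosted on this PM.
    void allocateResources(List<Double> demands);
}

interface VmPlacementController {
    // Global decision: which VM (if any) to migrate, and to which PM.
    // Returns {vmId, targetPmId}, or null for "no migration".
    int[] decideMigration(List<List<Double>> perPmMeasures);
}
```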

Problem Definition
The VM Placement Controller has to make optimal migration decisions at regular intervals, resulting in:

Low SLA violations
Reduction in the number of busy PMs

SLAs (Service Level Agreements) are performance guarantees that the data center owner negotiates with the user. These guarantees can include average response time, maximum delay, maximum downtime, etc.
Idle PMs can be switched off or run in low-power mode.

Issues in Server Consolidation/Distribution

There are various issues involved in server consolidation:

Interference of VMs: bad behavior of one application in a VM adversely affects (degrades the performance of) the other VMs on the same PM
Delayed effects: resource configuration changes for a VM show their effects only after some delay
Migration cost: live migration involves a cost (performance degradation)
Workload pattern: not deterministic or known a priori

These difficulties motivate us to use a Reinforcement Learning approach.

Reinforcement Learning (RL)

[Diagram: the agent-environment interaction in RL]

The goal of the agent is to maximize the cumulative long-term reward based on the immediate reward r_{n+1}.

RL has two major benefits:
It doesn't require a model of the system
It captures delayed effects in decision making, so it can take action before a problem arises
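The experiments later in the deck use Q-learning with a non-greedy policy; as background, here is a minimal sketch of the tabular Q-learning update with epsilon-greedy action selection. The class name, state/action encoding, and parameter values are our illustrative assumptions, not the project's code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Minimal tabular Q-learning sketch (illustrative; states and actions are
// opaque strings, parameter values are placeholders).
class QLearningAgent {
    private final Map<String, Double> q = new HashMap<>(); // Q(s,a) table
    private final double alpha = 0.1;   // learning rate (assumed)
    private final double gamma = 0.9;   // discount factor (assumed)
    private final double epsilon = 0.1; // exploration rate (assumed)
    private final Random rng = new Random();

    double getQ(String state, String action) {
        return q.getOrDefault(state + "|" + action, 0.0);
    }

    // Epsilon-greedy ("non-greedy") action selection.
    String selectAction(String state, String[] actions) {
        if (rng.nextDouble() < epsilon) {
            return actions[rng.nextInt(actions.length)]; // explore
        }
        String best = actions[0];
        for (String a : actions) {
            if (getQ(state, a) > getQ(state, best)) best = a;
        }
        return best; // exploit
    }

    // Standard Q-learning update:
    // Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    void update(String s, String a, double reward, String sNext, String[] nextActions) {
        double maxNext = 0.0;
        for (String an : nextActions) {
            maxNext = Math.max(maxNext, getQ(sNext, an));
        }
        double old = getQ(s, a);
        q.put(s + "|" + a, old + alpha * (reward + gamma * maxNext - old));
    }
}
```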

Problem Formulation for RL Framework: System Assumptions

M PMs: assumed to be homogeneous
N VMs: each assumed to run one application whose performance metrics of interest are throughput and response time
Response time is measured at the server level only (not as seen by the user)
Workload per VM is assumed to be cyclic
Resource requirement is assumed to be equal to workload
Cyclic workload model: the time period is assumed to be divided into phases

[Graph: rate of server accesses over time; one time period is divided into phases (Phase 1, Phase 2, ...) and the pattern repeats periodically]
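One way to realize this cyclic workload assumption in code is a fixed per-phase utilization table that repeats every period; the sketch below is an assumed construction, and the phase levels in the example are made up for illustration.

```java
// Sketch of the assumed cyclic workload model: utilization per VM repeats
// over a fixed set of phases. Phase levels here are illustrative only.
class CyclicWorkload {
    private final double[] phaseUtilization; // one utilization level per phase

    CyclicWorkload(double[] phaseUtilization) {
        this.phaseUtilization = phaseUtilization;
    }

    // Utilization demanded during decision step n (one step per phase).
    double utilizationAt(int step) {
        return phaseUtilization[step % phaseUtilization.length];
    }
}

// Example with a 5-phase cycle like the experiments use (values assumed):
// new CyclicWorkload(new double[] {0.1, 0.3, 0.5, 0.3, 0.1});
```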

Problem Formulation for RL Framework

[Slide content (the state, action, and reward definitions) was an image and is not recoverable here; later slides use states of the form (phase; VM-to-PM allocation), e.g., (1;((1,2,3,4,5),(),())), and migration actions (i,j) meaning "migrate VM i to PM j"]
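Since the original formulation did not survive extraction, the following is only an illustrative reward shape consistent with the stated goals (low SLA violations, few busy PMs, penalized migrations); the structure and weights are our assumptions, not the deck's definition.

```java
// Illustrative reward shape only: the actual reward definition from the
// original slide is not recoverable. Weights are placeholder assumptions.
class RewardSketch {
    static final double W_POWER = 1.0;     // weight on busy-PM (power) cost
    static final double W_SLA = 1.0;       // weight on SLA violations
    static final double W_MIGRATION = 0.5; // weight on migration cost

    // Higher reward for fewer busy PMs, fewer SLA violations, no migration.
    static double reward(int busyPms, int totalPms, int slaViolations, boolean migrated) {
        double powerCost = W_POWER * ((double) busyPms / totalPms);
        double slaCost = W_SLA * slaViolations;
        double migrationCost = migrated ? W_MIGRATION : 0.0;
        return -(powerCost + slaCost + migrationCost);
    }
}
```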

RL Agent Implemented in CloudSim

CloudSim is a Java-based simulation tool for cloud computing. We have implemented the following additions to it:

Response time and throughput calculations
Cyclic workload model
Interference model
An RL agent which takes migration decisions

The agent implements Q-learning with a non-greedy policy, in four variants:
1) Full state representation without batch updates
2) Full state representation with batch updates of batch size 200
3) Function approximation without batch updates
4) Function approximation with batch updates of batch size 200

CloudSim was used for the implementation of all our algorithms.
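The slides do not spell out the batch update precisely; one plausible reading, sketched below, buffers 200 observed transitions and applies the standard Q-learning update over the whole buffer at once. The BatchUpdater name, the Transition record, and the reuse of the QLearningAgent sketch above are our assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of batch updates with batch size 200: transitions are buffered and
// the tabular update is applied to the whole batch together (interpretation
// assumed; details of the original implementation are not in the slides).
class BatchUpdater {
    record Transition(String s, String a, double r, String sNext, String[] nextActions) {}

    private static final int BATCH_SIZE = 200;
    private final List<Transition> buffer = new ArrayList<>();
    private final QLearningAgent agent; // the tabular agent sketched earlier

    BatchUpdater(QLearningAgent agent) { this.agent = agent; }

    void observe(Transition t) {
        buffer.add(t);
        if (buffer.size() >= BATCH_SIZE) {
            for (Transition b : buffer) {
                agent.update(b.s(), b.a(), b.r(), b.sNext(), b.nextActions());
            }
            buffer.clear();
        }
    }
}
```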

Workload Model Graphs for Experiments

[Graph: Workload Model 1 (wm1)]

[Graph: Workload Model 2 (wm2)]

The graphs show cyclic workloads with 5 phases that repeat periodically.

Experiment Setup

5 VMs, 3 PMs, and 5 phases in the cyclic workload model (shown on the previous slide)
Migration decisions are taken at the end of each phase

Experiments:
1) One VM has workload model wm1 and the others have workload model wm2
2) Two VMs have workload model wm1 and the others have workload model wm2

For each of experiments (1) and (2), the following scenarios:
1) All costs are negligible except power
2) High interference cost, with VMs 4 and 5 interfering (high performance degradation due to interference of VMs)
3) High migration cost, with VMs 4 and 5 interfering (high performance degradation due to VM migration)
4) High migration cost and high interference cost

Policy Generated after Convergence

Initial state: all VMs on PM 1 in Phase 1, i.e., (1;((1,2,3,4,5),(),()))
Scenario: all costs are negligible except power

[Diagram: utilization of VMs 1-5 (roughly 0.1 to 0.5) across phases 1-5 of two consecutive periods; the converged policy migrates VM 1 between PM 1 and PM 2 as its utilization changes, e.g., "Migrate VM 1 to PM 2" and later "Migrate VM 1 to PM 1"]

Policy Generated after Convergence

Initial state: all VMs on PM 1 in Phase 1, i.e., (1;((1,2,3,4,5),(),()))
Scenario: high migration cost and high interference cost; VMs 3 and 4 are interfering

[Diagram: utilization of VMs 1-5 across phases 1-5 of two consecutive periods; the converged policy performs a single migration of VM 1]

Results with Full State Representation

The algorithm was verified to converge most of the time in 15000 steps in case 1 and 80 steps in case 2, and to converge every time in 20000 steps in case 1 and 115 steps in case 2.

Feature-Based Function Approximation

[Slide content was an image; the following slides indicate that Q-values are computed from a feature vector via multiplications and additions, consistent with a linear approximation Q(s,a) = theta^T phi(s,a)]

Features for Function Approximation

Example: 5 VMs, 3 PMs, and 2 phases in the cyclic workload model
State = [1,(1,2,3),(4),(5)] and Action = (4,3), i.e., migrate VM 4 to PM 3
Next-state allocation = [1,(1,2,3),(),(4,5)]

The k features are:
Phase indicator of the cyclic workload model (utilization level): phase 1 of 2 gives [1, 0]
Fraction of busy PMs in the next state (power savings): 2 of 3 PMs busy gives 0.7
Pairwise indicators of whether VMs (i,j) are allocated on the same PM in the next state (interference): here pairs (1,2), (1,3), (2,3), and (4,5) are 1, and the remaining pairs, e.g., (1,4), ..., (3,5), are 0
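To make the three feature groups concrete, here is one possible way to assemble the k-dimensional vector for a next-state allocation; the encoding (one-hot phase indicator, busy fraction, lexicographic pair indexing) is assumed from this slide rather than taken from the actual implementation.

```java
// Sketch: build the k-dimensional feature vector for a next-state allocation.
// Encoding details are assumed from the slide, not from the real code.
class FeatureBuilder {
    // pmToVms: for each PM, the 1-based ids of the VMs it hosts, e.g.
    // {{1,2,3},{},{4,5}} for the allocation [ (1,2,3), (), (4,5) ].
    static double[] features(int phase, int numPhases, int[][] pmToVms, int numVms) {
        int pairs = numVms * (numVms - 1) / 2;
        double[] f = new double[numPhases + 1 + pairs];

        // 1) Phase indicator (one-hot over phases).
        f[phase - 1] = 1.0;

        // 2) Fraction of busy PMs (power-savings signal).
        int busy = 0;
        for (int[] vms : pmToVms) if (vms.length > 0) busy++;
        f[numPhases] = (double) busy / pmToVms.length;

        // 3) Pairwise co-location indicators (interference signal).
        for (int[] vms : pmToVms) {
            for (int i = 0; i < vms.length; i++) {
                for (int j = i + 1; j < vms.length; j++) {
                    int a = Math.min(vms[i], vms[j]), b = Math.max(vms[i], vms[j]);
                    f[numPhases + 1 + pairIndex(a, b, numVms)] = 1.0;
                }
            }
        }
        return f;
    }

    // Index of pair (a,b), a < b, in lexicographic order (1,2), (1,3), ...
    static int pairIndex(int a, int b, int numVms) {
        int idx = 0;
        for (int i = 1; i < a; i++) idx += numVms - i;
        return idx + (b - a - 1);
    }
}
```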

Features for Function Approximation (contd.)

The feature vector for a state-action pair is a concatenation of blocks f_1, ..., f_6: one block of k features per action ("Migrate VM 1" through "Migrate VM 5", plus "No Migration"). The k features in each block correspond to the three feature groups on the previous slide.

[Diagram: for the action "Migrate VM 4", blocks f_1, f_2, f_3, f_5, and f_6 are zero vectors; only block f_4 (containing values such as 0.7 and 1) is non-zero, beginning at its start index]

The position of block f_i captures the migration cost of migrating VM i
All blocks except f_4 are zero vectors, so store only the f_4 features and their start index
Perform multiplication and addition operations only for the k features starting from the start index

This idea reduces the number of multiplication and addition operations by around five times the number of VMs.
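Assuming the linear form Q(s,a) = theta^T phi(s,a) implied by the operation counts above, the block structure lets both the Q-value evaluation and the semi-gradient update touch only the k non-zero entries. The sketch below is our illustration of that trick; class and method names are assumptions.

```java
// Sketch of the sparse Q-value computation: with one k-feature block per
// action and all blocks zero except the chosen action's, the dot product
// theta . phi(s,a) only touches k weights starting at the block's offset.
// The linear form Q(s,a) = theta^T phi(s,a) is assumed here.
class SparseLinearQ {
    private final double[] theta; // weights over all (numActions * k) positions

    SparseLinearQ(int numActions, int k) {
        this.theta = new double[numActions * k];
    }

    // activeBlock: the non-zero k features for the chosen action;
    // startIndex: offset of that action's block inside theta.
    double qValue(double[] activeBlock, int startIndex) {
        double q = 0.0;
        for (int i = 0; i < activeBlock.length; i++) {
            q += theta[startIndex + i] * activeBlock[i];
        }
        return q;
    }

    // Semi-gradient step over the same sparse block.
    void update(double[] activeBlock, int startIndex, double tdError, double alpha) {
        for (int i = 0; i < activeBlock.length; i++) {
            theta[startIndex + i] += alpha * tdError * activeBlock[i];
        }
    }
}
```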

Results with Function Approximation

With the feature-based Q-learning with function approximation algorithm:

The features are found to be non-differentiating for some state-action (s;a) tuples.
For example, consider the tuples ((5;((1,2,3,4),(5),())); (1,2)) and ((5;((2,3,4),(1,5),())); (1,1)).
These tuples are differentiated by the pairwise indicators only: pair (1,5) in the first case, and pairs (1,2), (1,3), (1,4) in the second.
As the next slide shows, action (1,2) is good for state (5;((1,2,3,4),(5),())) while action (1,1) is bad for state (5;((2,3,4),(1,5),())).
This implies that pairs (1,3) and (1,4) are bad allocations while pair (1,5) is good, even though the two deployments are equivalent.

Optimal Policy

Initial state: all VMs on PM 1 in Phase 1, i.e., (1;((1,2,3,4,5),(),()))
Scenario: high migration cost and high interference cost; VMs 3 and 4 are interfering

[Diagram: same setting as the previous convergence plot; the optimal policy likewise performs a single migration of VM 1]

Conclusion and Future Work

We conclude from this project that:

The RL algorithm with full state representation works very well, but has the problem of a huge state space
For the present features to work with function approximation, we have to add more features for interference of VMs, which results in a huge feature set and thus the same problem as before

Future work would involve the following three issues:

Features that differentiate well between (s;a) tuples
Fast convergence of the algorithm
Scalability of the algorithm

References

1. Norman Wilde and Thomas Huber. Virtualization and Cloud Computing. http://uwf.edu/computerscience/seminar/Documents/2009
2. VCONF: A Reinforcement Learning Approach to Virtual Machines Auto-Configuration. http://portal.acm.org/citation.cfm?id=1555263
3. L. A. Prashanth and Shalabh Bhatnagar. Reinforcement learning with function approximation for traffic signal control.

Thank You !
