You are on page 1of 7

International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.

6, November 2014

ANALYSIS OF QUALITY OF SERVICE IN CLOUD


STORAGE SYSTEMS
Hamed Alizadeh1 and Jaber Karimpour 2
1

Department of Computer Engineering, Science and Research Branch, Islamic Azad


University, Zanjan, Iran
2
Department of Computer Science, University of Tabriz, Tabriz, Iran

ABSTRACT
Cloud storage is a system composed of multiple computers that cooperate to optimally save lots of files.
Due to a file or server failure in this system, the service may be stopped and users may get no response
from the system. In this paper, we analyze how to apply quality of service in cloud storage systems to
improve fault tolerance and availability.

KEYWORDS
Cloud Storage, Quality of Service, High Availability, Fault Tolerance.

1. INTRODUCTION
Cloud storage [1] is a model of networked online storage where data is stored in storage machines
which are generally hosted by third parties. Hosting companies have large data centers, and
people who want their data to be hosted buy storage capacity from them. Cloud storage services
such as Amazon S3 [2], cloud storage products such as EMC Atmos [3], and distributed storage
research projects such as OceanStore [4] are examples of file storage. Quality of service (QoS)
[5] is the overall performance of a cloud storage seen by the users of the cloud storage. To
measure quality of service, several related aspects of the cloud service are considered, such as
error rate, bandwidth, throughput, transmission delay, and availability. Quality of service is
particularly important for the transport of files with special requirements.
There exist a number of studies (reviewed in Section II) on QoS in clouds, but they mostly
consider cloud computing, not cloud storage. The few studies that worked on QoS in cloud
storage, consider the service qualities given to different users, not to different files. In contrast,
we want to analyze how to give better services to more important files without considering which
users they belong to. The rest of this paper is organized as follows. We review the related work in
Section II. In Section III, we analyze QoS in cloud storage. Finally, Section IV concludes the
paper.

2. R ELATED WORK
In this section, we review related researches that focus on designing cloud storage systems and
QoS-based cloud systems.

2.1. Cloud Storage Systems


In a cloud storage system, it is difficult to balance the huge elastic capacity of storage and
investment of expensive cost for it. In order to solve this problem in the cloud storage
infrastructure, low cost cluster based storage server is configured in [6] to be activated for large
DOI:10.5121/ijfcst.2014.4607

71

International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.6, November 2014

amount of data to provide for cloud users. BlueSky [7] is a network file system backed by cloud
storage. BlueSky stores data persistently in a cloud storage provider allowing users to take
advantage of the reliability and large storage capacity of cloud providers and avoid the need for
dedicated server hardware. Authors in [8] address the problem of building a secure cloud storage
system that supports dynamic users and data provenance. Gecko [9] is a design for storage arrays
where a single log structure is distributed across a chain of drives, physically separating the tail of
the log from its body. This design provides the benefits of logging fast, sequential writes for any
number of contending applications while eliminating the disruptive effect of log cleaning
activity on application I/O. Authors in [10] present a power-lean storage system where racks of
servers can be powered down to save energy.

2.2. QoS-based Cloud Systems


Service Level Agreement (SLA) is a contract between a user and a service provider to provide a
pre-determined quality of service for the user. The problem of maximizing the provider's income
through SLA-based dynamic resource allocation is addressed in [11] as SLA plays an important
role in cloud computing to connect service providers and customers. Authors in [12] present
vision, challenges, and architectural elements of SLA-oriented resource management in cloud
systems. A generic QoS framework is proposed in [13] for cloud workflow systems. The
framework consists of the following four components: QoS requirement specification, QoS-aware
service selection, QoS consistency monitoring, and QoS violation handling. In [14], authors
tackle a cloud workflow scheduling problem which enables users to define various QoS
constraints like the deadline constraint, the budget constraint, and the reliability constraint. A
scheduling heuristic is presented in [15] considering multiple SLA parameters for deploying
applications in Clouds. In [16], authors added trust into workflow's QoS target and proposed a
novel customizable cloud workflow scheduling model. In [17], authors describe an approach and
methodology to develop novel cloud monitoring techniques and services enabling automated
application QoS management under uncertainties. In [18], authors propose resource allocation
algorithms for cloud providers who want to minimize infrastructure cost and SLA violations.

3. QOS IN CLOUD STORAGE


In this section, we represent a basic cloud storage system and call it OCSS (Ordinary Cloud
Storage System). Then, we present our idea on this system. Afterwards, we analyse how to apply
the idea.

3.1. Basic Cloud Storage


In this paper, we define the cloud model shown in Fig. 1. We assume that the cloud is composed
of a central controller and N physical computers working as cloud machines. The machines may
face failure or shutting down at any time except for the central controller. There are millions of
internet users who send requests to the central controller to read a file from the system or write a
file to the system. Then, the central controller forwards the request to a machine to handle it. If
the user requested to write a file, the selected machine receives the file from the user and saves it
on its storage disk. If the user requested to read a file, the selected machine sends the file from its
storage disk to the user.

72

International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.6, November 2014

Figure 1. Cloud system model

3.2. QoS Problems in Cloud Storage


Quality of service is affected by several factors which can be divided into human and technical
factors. Many things can happen to files as they travel from the cloud storage to the user, resulting
in the following problems:

Low throughput
Dropped packets
Errors
Latency
Jitter
Out-of-order delivery

3.3. Improving Fault Tolerance


For increased stability against failure, the most widely used method (generally called Replication)
is that each file is saved on more than one machine. Then in case of failure of one of these
machines, the file can be read from another machine. We save the original and copies of every
file on multiple machines.

3.4. Our QoS mechanism


We consider the following two parameters:
File read delay
File read failure probability
When an internet user sends a request to the cloud system, we define file read delay as the time
duration from the point of receiving the file read request by the central controller to the point just
before sending the first bit of the file from the system to the requesting user. This delay measures
the latency of both handling the file read request in the system and internal data/control message
transfers between system machines. It does not includes the latency of data transfer between the
user and the system. We define file read failure probability as the probability of failure in reading
the file due to only machine failure.

3.5. Classes of Service


Class of service is a parameter to differentiate the types of files being transmitted. The purpose of
such differentiation is generally associated with assigning priorities to the files.

73

International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.6, November 2014

We assume that the owner of a file who wants to write the file into the system, first determines its
QoS class and writes this class number inside the file.
We define three values for QoS class:
QoS=3 (Highest quality): Read delay and failure probability for this file should be three
times less than class-1 files. (Stability against simultaneous failures of multiple cloud
machines)
QoS=2 (Middle quality): Read delay and failure probability for this file should be twice less
than class-1 files. (Stability against simultaneous failure of two cloud machines)
QoS=1 (Lowest quality): Read delay and failure probability for this file must be low enough
so that the system can meet the service requirements of class-2 and class-3 files, and then
gives the remaining service to class-1 files. (Stability against failure of a cloud machine)

3.6. How to give priority to more valuable files


Class-2 and Class-3 files are more valuable files. Our system is supposed to give a better service
to Class-2 and Class-3 files. When the system faces machine failure and/or heavy load, Class-2
and Class-3 files should experience less failure and delay compared to Class-1 files. Now we
explain what mechanisms we integrated in our design to achieve this.
To reduce read delay of Class-2 and Class-3 files, we reduced the followings for these files:
Total distance between original blocks of the file.
Duration of file extraction after failure.
To reduce failure probability of Class-2 and Class-3 files, we increased copy blocks for these
files.

3.7. High Availability


Cloud storage centers require high availability of their systems to perform routine daily activities.
Availability refers to the ability of the user community to obtain a service or good, access the
system, whether to submit new file, update or alter an existing file, or read an existing file. If a
user cannot access the system, it is - from the users point of view - unavailable. Generally, the
term downtime is used to refer to periods when a system is unavailable.
There are three principles in high availability. They are:
1. Elimination of single points of failure: This means adding redundancy to the system so that
failure of a single component does not mean failure of the entire system.
2. Reliable crossover. In multithreaded systems, the crossover point between components tends to
become a single point of failure: High availability engineering must provide a reliable crossover.
3. Detection of failures as they occur: If the two former principles are observed, then a user may
never see a failure. But there must be maintenance activities.

3.8. System design for high availability


Adding more components (including machines, files, network equipments) to the cloud system
design increases availability. Advanced system designs allow the system to be patched and
upgraded without compromising service availability.
High availability requires less human intervention to restore operation in cloud systems, the
reason for this being that the most common cause for outages is human error.
74

International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.6, November 2014

Redundancy is used to create cloud systems with high levels of Availability. In this case, it is
required to have high levels of failure detectability and avoidance of common cause failures.
Modelling and simulation are used to evaluate the theoretical reliability for large cloud systems.
Simulation of a cloud system is usually done using the CloudSim [19] simulator. The outcome of
this kind of model is used to evaluate different design options. A model of the entire cloud system
is created, and the model is stressed by removing components (including machines, files, network
equipments). Redundancy simulation involves the N-x criteria. N is the total number of
components in the system. x represents the number of components used to stress the system. The
(N-1) criteria means the model is stressed by evaluating performance with all possible
combinations where one cloud machine is failed. The (N-2) criteria means the model is stressed
by evaluating performance with all possible combinations where two cloud machines are failed
simultaneously.

3.9. Reasons for unavailability


There are the following reasons [20][21] that may cause the system or file read to fail:

Monitoring of cloud machines


Requirements and procurement
Operations
Avoidance of network failures
Avoidance of internal application failures
Avoidance of external services that fail
Physical environment
File/Network redundancy
Technical solution of backup
Process solution of backup
Physical location
Infrastructure redundancy
Storage architecture redundancy

4. SIMULATION
We implemented our design in the CloudSim [19] simulator. In this section, we evaluate the
performance of our design. To do this, we consider the simulation parameters presented in Table
1. In this simulation, we change number of failures in different executions whereas the other
parameters are constant.
Table 1. Simulation Parameters.

Parameter

Value

Machine failure rate


Number of computers

from 0.01 to 1.25


100

File Write Rate

50 requests per recond

File Read Rate

50 requests per second

File Size
Number of Files
Simulation Duration

100 Kbytes
10000
1 hour

75

International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.6, November 2014

Figure 2. Average file read failure rate in our design for QoS classes

Fig. 2 shows average file read failure rate versus machine failure rate in our design for files of the
three QoS classes. These results prove that our design gives better service to Class-2 and Class-3
files in term of file read failure rate compared to Class-1 files.

5. C ONCLUSIONS
In this paper, we analyzed the design options and requirements for applying QoS in cloud storage
systems. The results of this analysis show the cost of QoS in cloud storage is high and it has to be
implemented only for important files.

REFERENCES
[1]
[2]
[3]
[4]

Cloud Storage, http://en.wikipedia.org/wiki/Cloud_storage.


Amazon S3, http://en.wikipedia.org/wiki/Amazon_S3.
EMC Atmos, http://en.wikipedia.org/wiki/EMC_Atmos.
Sean Rhea, Chris Wells, Patrick Eaton, Dennis Geels, Ben Zhao, Hakim Weatherspoon, and John
Kubiatowicz, Maintenance-Free Global Data Storage, IEEE Internet Computing , Vol 5, No 5,
September/October 2001, pp 4049.
[5] M Natkaniec, K Kosek-Szott, S Szott, G Bianchi, A Survey of Medium Access Mechanisms for
Providing QoS in Ad-Hoc Networks, IEEE Communications Surveys & Tutorials, Volume:15 ,
Issue: 2, pp. 592-620, 2012.
[6] Tin Tin Yee, Thinn Thu Naing, PC-Cluster based Storage System Architecture for Cloud Storage,
International Journal on Cloud Computing: Services and Architecture, Volume: 1 - volume NO: 3 Issue: November 2011.
[7] Michael Vrable, Stefan Savage, and Geoffrey M. Voelker, BlueSky: A Cloud-Backed File System
for the Enterprise, Proceedings of the 7th USENIX Conference on File and Storage Technologies
(FAST), San Jose, CA, February 2012.
[8] Sherman S. M. Chow, Cheng-Kang Chu, Xinyi Huang, Jianying Zhou, Robert H. Deng, Dynamic
Secure Cloud Storage with Provenance, Cryptography and Security, pp. 442-464, 2012.
[9] Ji-Yong Shin , Mahesh Balakrishnan , Lakshmi Ganesh , Tudor Marian , Hakim Weatherspoon,
Gecko: A Contention-Oblivious Design for Cloud Storage, In Proceedings of the USENIX
Workshop on Hot Topics in Storage and File Systems (HotStorage), Boston, MA, U.S.A., Jun 2012.
[10] Lakshmi Ganesh, Hakim Weatherspoon, Ken Birman, Beyond Power Proportionality: Designing
Power-Lean Cloud Storage, NCA 2011, pp.147-154, 2011.
[11] G. Feng, S. Garg, R. Buyya, and W. Li, Revenue Maximization using Adaptive Resource
Provisioning in Cloud Computing Environments, Proc. 13th ACM/IEEE Int. Conf. Grid Computing,
pp. 192200, 2012.
[12] R. Buyya, S.K. Garg, and R.N. Calheiros. SLA-Oriented Resource Provisioning for Cloud
Computing: Challenges, Architecture, and Solutions, Proc. Int. Conf. Cloud and Service Computing,
pp. 1-10, 2011.

76

International Journal in Foundations of Computer Science & Technology (IJFCST), Vol.4, No.6, November 2014

[13] X. Liu, Y. Yang, D. Yuan, G. Zhang, W. Li, and D. Cao, A Generic QoS Framework for Cloud
Workflow Systems, Proc. Ninth IEEE Int. Conf. Dependable, Autonomic and Secure Computing, pp.
713-720, 2011.
[14] W.N. Chen, and J. Zhang, A Set-Based Discrete PSO for Cloud Workflow Scheduling with UserDefined QoS Constraints, Proc. IEEE Int. Conf. Systems, Man and Cybernetics, pp. 773-778, 2012.
[15] V.C. Emeakaroha, I. Brandic, M. Maurer, and I. Breskovic, SLA-Aware Application Deployment
and Resource Allocation in Clouds, Proc. 35th IEEE Annual Computer Software and Applications
Conference Workshops (COMPSACW), pp. 298-303, 2011.
[16] W. Li, Q. Zhang, J. Wu, J. Li, and H. Zhao, Trust-based and QoS Demand Clustering Analysis
Customizable Cloud Workflow Scheduling Strategies, Proc. IEEE Int. Conf. Cluster Comp.
Workshops, pp. 111119, 2012.
[17] K. Alhamazani, R. Ranjan, F. Rabhi, L. Wang, and K. Mitra, Cloud Monitoring for Optimizing the
QoS of Hosted Applications, Proc. 4th IEEE Int. Conf. Cloud Comp. Tech. and Sc., pp. 765-770,
2012.
[18] L. Wu, S.K. Garg, and R. Buyya, SLA-based Resource Allocation for Software as a Service Provider
(SaaS) in Cloud Computing Environments, Proc. 11th IEEE/ACM Int. Symp. Cluster, Cloud and
Grid Com-puting, pp. 195-204, 2011.
[19] Rodrigo N. Calheiros, Rajiv Ranjan, Anton Beloglazov, Csar A. F. De Rose, Rajkumar Buyya,
CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of
resource provisioning algorithms, SoftwarePractice & Experience, Volume 41 Issue 1, Pages 2350, January 2011.
[20] Marcus, E.; Stern, H. (2003). Blueprints for high availability (Second ed.). Indianapolis, IN: John
Wiley & Sons. ISBN 0-471-43026-9.
[21] IBM Global Services, Improving systems availability, IBM Global Services, 1998.

Authors
Hamed Alizedeh was born in 1986 in Iran. Mr. Alizadeh received her B.Engr.degree
from Islamic Azad University khoy, and his M.S. degree from Islamic Azad University,
Zanjan branch (Zanjan, Iran) in computer engineering in 2012.

Jaber Karimpour was born in 1975 in Iran. Dr. Karimpour received his B.Engr. degree
and his M.S. degree from University of Tabriz in computer science. He also received his
Phd degree from University of Tabriz (Tabriz, Iran) in computer science in 2009.

77

You might also like