On
Bachelor of Engineering
of
University of Pune
in
Information Technology
Ashwin Habbu
Luv Varma
Shreya Tripathi
Utsav Dusad
Guided By
Prof.S.S Pande
Mr.Amol Pujari
Mr. Prasad Joshi
I. T. Department
Pune Institute of Computer Technology.
S. No. 27, Dhankawadi, Pune Satara Road, Pune 411043.
University of Pune
2013-14
Certificate
This is to certify that,
Ashwin Habbu
Luv Varma
Shreya Tripathi
Utsav Dusad
have successfully completed this project report entitled Scaling REST Services,
under my guidance in partial fulfillment of the requirements for the degree of
Bachelor of Engineering in Department of Information Technology of University
of Pune during the academic year 2013-14.
Prof.S.S.Pande
Guide
Dr.Emmanuel.M.
HoD
Acknowledgement
Ashwin Habbu
Luv Varma
Shreya Tripathi
Utsav Dusad
Abstract
REST services play a very important role in building modern web applications and
services. Good web application design involves several distributed components,
including one or more REST services backed by SQL and NoSQL data stores. Scaled
applications and services allow billions of users to access them seamlessly.
While static data can be cached at every stage, dynamic data needs to be
refreshed regularly and quickly. REST is of significant importance in achieving
this. Hence we focus on making a REST service serve its contents as fast as
possible. This project aims to create a REST service deployed using a new and/or
custom micro web server or framework that supports only REST features. It
targets only GET requests.
Contents
List of Figures
1 Introduction
1.1 Need
1.2 Basic Concepts
1.2.1 Scaling
1.2.2 REST
2 Aim
3 Objective
4 Literature Survey
4.1 REST
4.1.1
4.2 HTTP
4.2.1 HTTP Verbs
4.2.2 GET
4.2.3 Representation
4.2.4 Response Codes
4.3
4.3.1
4.4 MVC
4.5 Tools Studied
4.5.1 cURL
4.5.2 Apache Bench
5 Problem Statement
6 Project Requirements
6.1 Hardware Requirements
6.2 Software Requirements
7 Design
7.1 UML Diagrams
8 Planning and Scheduling
9 References
List of Figures
4.1
4.2
7.1 Use Case Diagram
7.2 Class Diagram
7.3 Deployment Diagram
7.4 Sequence Diagram
7.5 Communication Diagram
7.6 Activity Diagram
Chapter 1
Introduction
1.1
Need
REST services play a crucial role in building modern web applications and
services. Good web application design involves several distributed components,
including one or more REST services backed by SQL and NoSQL data stores.
Caching is usually applied to content served over HTTP so that applications and
their services can scale seamlessly, serving up to billions of users at a time.
However, caching suits static data; dynamic data has to be refreshed and served
at extremely high speed. REST has a major role to play in this. This project
deals with creating REST services that can serve such content at extremely high
speed, without session management or any of the other modules that a full-stack
web application usually requires.
1.2
Basic Concepts
Understanding how REST services are scaled requires a basic understanding of the
terms REST and scaling. These terms are described below.
1.2.1
Scaling
1.2.2
REST
Chapter 2
Aim
The aim of this project is to create a REST service deployed using a new or custom
micro web server and/or framework that supports only REST features. By doing this
we want to make the service scalable according to the application and able to
serve a large number of clients at extremely high speed. The service will be
customized for GET requests only.
Chapter 3
Objective
A number of other protocols built on top of HTTP are designed specifically for
building web services. They use HTTP as a transport protocol for enormous
XML-encoded messages. The resulting service is very complex, almost impossible
to debug, and will not work unless the clients have exactly the same setup. The
main issue is that HTTP is used merely as a means of sending XML-encoded data,
when HTTP can send the content as it is. RESTful web services require none of
this internal complexity, so the effort can instead be spent on additional
features or on making multiple services interact. We want to create or customise
a web framework that deploys a RESTful web service that is lightweight and
scalable and serves clients at very high speed. It will, however, be customized
for GET requests only.
Chapter 4
Literature Survey
Our project at GSLAB requires us to understand different web service
architectures, such as SOAP and REST, and the tools used to test the performance
of web services. In Sections 4.1 to 4.3 we study REST, HTTP and its verbs, and
scaling. In Sections 4.4 and 4.5 we look at MVC and at the tools used to test
the performance of web services.
4.1
REST
4.1.1
4.2
HTTP
HTTP is the protocol that allows documents to be sent back and forth on the web.
A protocol is a set of rules that determines which messages can be exchanged, and
which messages are appropriate replies to others. In HTTP there are two roles:
server and client. In general, the client always initiates the conversation and
the server replies. HTTP is text based; that is, messages are essentially pieces
of text, although the message body can also contain other media. An HTTP message
is made of a header and a body. The body, which can often remain empty, contains
the data to be transmitted over the network. The header contains metadata, such
as encoding information; in the case of a request, it also contains the HTTP
method. In the REST style, header data is often more significant than the body.
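The header/body split described above can be made concrete with a short sketch. The function name parse_http_message and the sample request (host example.com, path /users/42) are our own illustrative assumptions; the code handles only the simple case of one header per line and is not a complete HTTP parser.

```python
def parse_http_message(raw: str):
    """Split a raw HTTP message into its start line, headers, and body."""
    # The header section and the body are separated by one empty line.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# Hypothetical GET request; a GET normally carries no body.
request = (
    "GET /users/42 HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Accept: application/json\r\n"
    "\r\n"
)

start_line, headers, body = parse_http_message(request)
method, target, version = start_line.split(" ")
print(method, target, version)
print(headers)
print(repr(body))
```

The method, the target and the protocol version all come from the first line, and everything the server needs to choose a representation is in the headers, which is why the body can stay empty for a GET.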
4.2.1
HTTP Verbs
Each request specifies a certain HTTP verb, or method, in the request header. It
is the first word of the request, written in capitals. An example of the GET
method being used is: GET / HTTP/1.1. HTTP verbs tell the server what to do with
the data identified by the URL. The request can optionally contain additional
information in its body, which might be required to perform the operation; for
instance, data you want to store with the resource. You can supply this data in
cURL with the -d option. If you have ever created HTML forms, you will be familiar
with two of the most important HTTP verbs: GET and POST. But there are far more
HTTP verbs available. The most important ones for building a RESTful API are
GET, POST, PUT and DELETE.
4.2.2
GET
GET is the simplest type of HTTP request method; it is the one that browsers use
each time you click a link or type a URL into the address bar. It instructs the
server to transmit the data identified by the URL to the client. Data should never
be modified on the server side as a result of a GET request. In this sense, a GET
request is read-only, but of course, once the client receives the data, it is free
to do any operation with it on its own side; for instance, format it for display.
4.2.3
Representation
The HTTP client and HTTP server exchange information about resources identified
by URLs. Both the header and the body are pieces of the representation. The
HTTP headers, which contain metadata, are tightly defined by the HTTP spec;
they can only contain plain text, and must be formatted in a certain manner.
The body can contain data in any format, and this is where the power of HTTP
truly shines. You can send plain text, pictures, HTML, and XML in any human
language. Through request metadata or different URLs, you can choose between
different representations of the same resource. For example, you might send a
web page to browsers and JSON to applications.
The HTTP response should specify the content type of the body. This is done
in the header, in the Content-Type field; for instance:
Content-Type: application/json
For simplicity, our example application only sends JSON back and forth, but
the application should be architected in such a way that the format of the data
can easily be changed to suit different clients or user preferences.
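As a minimal sketch of choosing between representations of the same resource, the function below returns either HTML or JSON depending on the request's Accept header. The function name render and the sample resource are our own assumptions, and the Accept handling is deliberately simplified: it only looks for a media type by substring and ignores quality values.

```python
import json


def render(resource: dict, accept: str):
    """Return (content_type, body) for the representation the client asked for."""
    # Simplified negotiation: substring match only, q-values are ignored.
    if "text/html" in accept:
        items = "".join(f"<li>{key}: {value}</li>" for key, value in resource.items())
        return "text/html", f"<ul>{items}</ul>"
    # Fall back to JSON, the format our example application uses.
    return "application/json", json.dumps(resource)


# Hypothetical resource used only for illustration.
user = {"id": 42, "name": "Asha"}

print(render(user, "application/json"))
print(render(user, "text/html,application/xhtml+xml"))
```

Keeping the choice of format in one function like this is one way to make the output format easy to change later without touching the code that loads the resource.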
4.2.4
Response Codes
HTTP response codes standardize a way of informing the client about the result
of its request.
You might have noticed that the example application uses the PHP header()
function, passing some strange-looking strings as arguments. The header()
function prints the HTTP headers and ensures that they are formatted
appropriately. Headers should be the first thing in the response, so you should
not output anything else before you are done with the headers. Sometimes your
HTTP server may be configured to add other headers, in addition to those you
specify in your code.
Headers contain all sorts of meta information; for example, the text encoding
used in the message body or the MIME type of the body's content. In this case,
we are explicitly specifying the HTTP response codes. By default, PHP returns a
200 response code, which means that the request was successful. The server
should return the most appropriate HTTP response code; this way, the client can
attempt to repair its errors, assuming there are any. Most people are familiar
with the common "404 Not Found" response code; however, there are many more
available to fit a wide variety of situations.
Keep in mind that the meaning of an HTTP response code is not extremely
precise; this is a consequence of HTTP itself being rather generic. You should
attempt to use the response code which most closely matches the situation at
hand. That being said, do not worry too much if you cannot find an exact fit.
Here are some HTTP response codes, which are often used with REST:
1. 200 OK: This response code indicates that the request was successful.
2. 201 Created: This indicates the request was successful and a resource was
created. It is used to confirm success of a PUT or POST request.
3. 400 Bad Request: The request was malformed. This happens especially with
POST and PUT requests, when the data does not pass validation, or is in
the wrong format.
4. 404 Not Found: This response indicates that the required resource could not
be found. This is generally returned to all requests which point to a URL
with no corresponding resource.
5. 401 Unauthorized: This error indicates that you need to perform authentication before accessing the resource.
6. 405 Method Not Allowed: The HTTP method used is not supported for this
resource.
7. 409 Conflict: This indicates a conflict. For instance, you are using a PUT
request to create the same resource twice.
8. 500 Internal Server Error: When all else fails; generally, a 500 response is
used when processing fails due to unanticipated circumstances on the server
side, which causes the server to error out.
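To illustrate how a GET-only service of the kind this project targets might pick among these codes, here is a small sketch. The handle function and the in-memory RESOURCES table are hypothetical names of our own; a real service would sit behind an actual web server or framework.

```python
import json
from http import HTTPStatus

# Hypothetical in-memory data set standing in for a real data store.
RESOURCES = {"/users/42": {"id": 42, "name": "Asha"}}


def handle(method: str, path: str):
    """Return (status, headers, body) for a request to a GET-only service."""
    if method != "GET":
        # A 405 response tells the client which methods are allowed.
        return HTTPStatus.METHOD_NOT_ALLOWED, {"Allow": "GET"}, ""
    if path not in RESOURCES:
        return HTTPStatus.NOT_FOUND, {}, ""
    body = json.dumps(RESOURCES[path])
    return HTTPStatus.OK, {"Content-Type": "application/json"}, body


for method, path in [("GET", "/users/42"), ("GET", "/users/7"), ("POST", "/users/42")]:
    status, headers, body = handle(method, path)
    print(method, path, "->", status.value, status.phrase)
```

Because the service only reads data, codes such as 201 and 409, which concern creating resources, would not be returned by it.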
4.3
which is unacceptable. REST takes a different approach from RPC in the matter of
interfaces: the uniform interface constraint (constraint 3). It means that all
resources expose the same interface to the client in a REST architecture. The
best-known implementation of REST is the HTTP protocol, and REST is grounded in
a thorough understanding of HTTP. In REST, HTTP is a state transfer protocol
rather than merely a data transport protocol. HTTP not only locates a resource
uniquely, but also tells us how to operate on it.
The stateless interactions constraint (constraint 6) also enhances the
scalability of a RESTful architecture. It demands that each client request
contain all the application state necessary to understand that request. No state
information is kept on the server, and none is implied by previous requests.
Stateless interactions reduce the cost of growing the system. Because all
conversations are stateless, scaling up only requires adding more load-balanced
servers; no coordination between the servers is needed. Some RPC conversations
are stateful. Faced with the same problem, RPC has to use additional techniques,
such as data duplication or shared memory, to achieve the same effect.
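The scaling argument above can be sketched in a few lines. The StatelessServer class, the server names and the request fields are our own illustrative assumptions, and round robin is just one simple load-balancing policy.

```python
from itertools import cycle


class StatelessServer:
    """A server that keeps no per-client state between requests."""

    def __init__(self, name: str):
        self.name = name

    def handle(self, request: dict) -> str:
        # Everything needed to answer is carried by the request itself.
        return f"{self.name}: page {request['page']} for user {request['user']}"


servers = [StatelessServer("s1"), StatelessServer("s2"), StatelessServer("s3")]
balancer = cycle(servers)  # round robin over the available servers

# One client paging through results; each request states the page it wants.
incoming = [{"user": "asha", "page": page} for page in (1, 2, 3, 4)]
replies = [next(balancer).handle(request) for request in incoming]

for reply in replies:
    print(reply)
```

Consecutive requests from the same client land on different servers, yet each is answered correctly, since no server had to remember the previous request. Adding capacity is then a matter of appending another server to the list.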
4.3.1
Caching allows you to store your web assets at remote points along the way to
your visitors' browsers. The browser itself also maintains an aggressive cache,
which keeps clients from having to ask your server for a resource each time it
comes up.
Cache configuration for your web traffic is critical for any well-performing
site. If you pay for bandwidth, make revenue from an e-commerce site, or even
just like keeping your reputation as a web-literate developer intact, you need
to know what caching gets you and how to set it up.
Assets such as your company logo, the favicon for your site, or your core CSS
files are not likely to change from request to request, so it is safe to tell
the requester to hold on to its copy of the asset for a while.
By cutting down on the requests your server has to deal with, you are able to
handle more requests, and your users enjoy a faster browsing experience. The
reduced traffic at the server side in turn helps in scaling those web
applications. Generally, assets like images, JavaScript files, and style sheets
can all be cached fairly heavily, while assets that are dynamically generated,
like dashboards, forums, or many types of web applications, benefit less, if at
all.
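One way to tell the requester how long it may keep its copy is the Cache-Control response header. The sketch below is an assumption-laden illustration: the list of static extensions, the one-day lifetime (max-age=86400) and the function name cache_headers are our own choices, not values taken from any particular server configuration.

```python
import posixpath

# Assumed set of file types that rarely change between requests.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".ico"}


def cache_headers(path: str) -> dict:
    """Pick a Cache-Control header based on the kind of asset requested."""
    extension = posixpath.splitext(path)[1].lower()
    if extension in STATIC_EXTENSIONS:
        # Any cache may keep the asset for one day (86400 seconds).
        return {"Cache-Control": "public, max-age=86400"}
    # Dynamic content: a cache must revalidate with the server before reuse.
    return {"Cache-Control": "no-cache"}


for path in ("/static/logo.png", "/static/site.css", "/dashboard"):
    print(path, cache_headers(path))
```

Static assets receive a long lifetime, while dynamically generated pages are marked so that caches check with the server each time, which matches the distinction drawn above.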
4.4
MVC
4.5
Tools Studied
4.5.1
cURL
cURL is a command line tool that is available on all major operating systems. It
is a utility for making HTTP requests to a given URL and, more generally, for
getting or sending files using URL syntax; it writes the HTTP response to
standard output. Once you have cURL installed, type: curl -v google.com. This
will display the complete HTTP conversation. Since cURL uses libcurl, it
supports a range of common Internet protocols, currently including HTTP, HTTPS,
FTP, FTPS, SCP, SFTP, TFTP, LDAP, LDAPS, DICT, TELNET, FILE, IMAP, POP3, SMTP
etc.
Examples of cURL use from command line:
1. Make requests with different HTTP method type without data:
(a) curl -X POST http://www.somedomain.com/
(b) curl -X DELETE http://www.somedomain.com/
(c) curl -X PUT http://www.somedomain.com/
2. Make requests with data: send login data with POST request curl request
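For comparison, the method-only requests in item 1 can be constructed from Python's standard library. This is only a sketch: the helper name build is our own, the URL is the placeholder from the list above, and the requests are built but deliberately not sent.

```python
from urllib.request import Request

# Placeholder URL taken from the cURL examples; nothing is sent to it here.
URL = "http://www.somedomain.com/"


def build(method: str) -> Request:
    """Create (but do not send) a request that uses the given HTTP method."""
    return Request(URL, method=method)


for method in ("POST", "DELETE", "PUT"):
    request = build(method)
    print(request.get_method(), request.full_url)
```

Passing a built request to urllib.request.urlopen would actually send it, much as running the corresponding curl command does.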
4.5.2
Apache Bench
ApacheBench (ab) is a single-threaded command line program for measuring the
performance of HTTP web servers. Originally designed to test the Apache HTTP
Server, it is generic enough to test any web server. Example usage:
ab -n 100 -c 10 http://www.yahoo.com/
This executes 100 HTTP GET requests against the specified URL, in this example
http://www.yahoo.com/, processing up to 10 requests concurrently.
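The idea behind the -n and -c options can be sketched as a small load generator. This is not how ab is implemented (ab is single-threaded, whereas the sketch uses a thread pool), and the names run_load and fake_request are our own assumptions; fake_request merely sleeps to stand in for a real HTTP GET.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(send_request, total: int, concurrency: int) -> dict:
    """Call send_request() `total` times with up to `concurrency` calls in flight."""

    def timed_call(_):
        start = time.perf_counter()
        ok = send_request()
        return ok, time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total)))
    wall_time = time.perf_counter() - wall_start

    latencies = [elapsed for _, elapsed in results]
    return {
        "completed": sum(1 for ok, _ in results if ok),
        "failed": sum(1 for ok, _ in results if not ok),
        "mean_latency_s": sum(latencies) / len(latencies),
        "requests_per_s": total / wall_time,
    }


def fake_request() -> bool:
    """Stand-in for one HTTP GET; sleeps briefly and reports success."""
    time.sleep(0.001)
    return True


# Same shape as `ab -n 100 -c 10 ...`: 100 requests, 10 at a time.
report = run_load(fake_request, total=100, concurrency=10)
print(report["completed"], "completed,", report["failed"], "failed")
```

Replacing fake_request with a function that performs a real GET and returns whether it succeeded would turn the sketch into a rough benchmark, reporting the same kind of figures ab does: completed and failed requests, mean latency, and requests per second.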
Chapter 5
Problem Statement
We want to conceive and design a REST service deployed using a new or custom
micro web server and/or framework that supports only REST features, with no
session management or any of the other modules that a full-stack web application
usually requires. We have also decided to target GET requests only.
Chapter 6
Project Requirements
6.1
Hardware Requirements
1. Test machine
6.2
Software Requirements
1. Windows 7/Vista OS
2. Tools: cURL, ApacheBench
Chapter 7
Design
7.1
UML Diagrams
Figure 7.1: Use Case Diagram
Figure 7.2: Class Diagram
Figure 7.3: Deployment Diagram
Figure 7.4: Sequence Diagram
Figure 7.5: Communication Diagram
Figure 7.6: Activity Diagram
Chapter 8
Planning and Scheduling
1. Months 1, 2, 3: Study of various web service protocols and architectures,
such as HTTP, SOAP, REST and MVC, and of the advantages of REST, with the focus
on understanding the importance of the REST architecture.
2. Month 4: Development of RESTful web services on various web frameworks.
These services will be analysed and their performance measured. The different
modules of a micro web server or framework will also be examined.
3. Month 5: Development of a testing environment in which the performance of
each service is measured and reported. Implementation will also begin, in the
form of customising, compiling and deploying framework modules.
4. Months 6, 7: Implementation of the RESTful architecture by customising
a web framework. Show how the RESTful server is more efficient than the
other servers. Also highlight the performance before and after the
customization, how the change was brought about and what was actually changed.
Chapter 9
References
[1] Pautasso, E. Wilde, and A. Marinos. First International Workshop on
RESTful Design (WS-REST 2010), Apr. 2010.
[2] Richardson and S. Ruby. RESTful Web Services. O'Reilly, Oct. 2007.
[3] Paul Adamczyk, Patrick H. Smith, Ralph E. Johnson, and Munawar Hafiz.
REST and Web Services: In Theory and in Practice.
[4] Berners-Lee, R. Fielding, and H. Frystyk. RFC 1945: Hypertext Transfer
Protocol -- HTTP/1.0, May 1996.