
VOICE OVER INTERNET PROTOCOL

MITHUN AGGARWAL
A1607108167
B.TECH(E&T)
3RD YEAR SEC –C
TABLE OF CONTENTS

1. OVERVIEW
1.1 Definition
1.2 Introduction
1.3 Why VoIP is important

2. HISTORY
2.1 Origin
2.2 Development

3 STANDARDS FOR VOIP

4. VOIP ENABLED SERVICES


4.1 VIDEO TELEPHONY
4.2 HOW DOES VOIP WORK
4.3 TCP / IP
4.4 UDP
4.5 RTP

5. SPEECH CODING TECHNIQUES

6 H.323
6.1 CODES
6.2 H.323 and Voice over IP services
6.3 H.323 and Videoconference services

7 SIP

8 GLOBAL VIEW ON VOIP

9. CONCLUSION
10. BIBLIOGRAPHY

11. APPENDIX
1. OVERVIEW

First things first, VoIP stands for Voice over Internet Protocol. At a base level that means
phone calls over your broadband connection. You really do need a high-speed connection
to take advantage of VoIP to get phone calls comparable to a normal landline phone.
Traditional "dial-up" connections are not really sufficient.

VoIP works in a different way to your home phone. Your home phone is based on an
analogue network, whereas VoIP is based on a digital one. Essentially when you speak
into a VoIP enabled phone or headset your voice is converted into digital packets; it is
then compressed to help your Internet connection run more efficiently and then it is
transferred down the connection much like an email. Once it reaches its destination the
process is reversed.

1.1 DEFINITION

Voice over IP (VoIP) is a general term for a family of transmission technologies for
delivery of voice communications over IP networks such as the Internet or other packet-
switched networks. Other terms frequently encountered and synonymous with VoIP
are IP telephony, Internet telephony, voice over
broadband (VoBB), broadband telephony, and broadband phone.

Internet telephony refers to communications services (voice, facsimile, and/or
voice-messaging applications) that are transported via the Internet, rather than the public
switched telephone network (PSTN). The basic steps involved in originating an Internet
telephone call are conversion of the analog voice signal to digital format and
compression/translation of the signal into Internet protocol (IP) packets for transmission
over the Internet; the process is reversed at the receiving end.

VoIP systems employ session control protocols to control the set-up and tear-down of
calls as well as audio codecs which encode speech allowing transmission over an IP
network as digital audio via an audio stream. Codec use varies between
implementations of VoIP (and often a range of codecs is used); some implementations
rely on narrowband and compressed speech, while others support high-fidelity
stereo codecs.

It is also described as follows.

VoIP is a technology that allows telephone calls to be made over computer networks like
the Internet. VoIP converts analog voice signals into digital data packets and supports
real-time, two-way transmission of conversations using Internet Protocol (IP).

VoIP calls can be made on the Internet using a VoIP service provider and standard
computer audio systems. Alternatively, some service providers support VoIP through
ordinary telephones that use special adapters to connect to a home computer network.
Many VoIP implementations are based on the H.323 technology standard.

VoIP offers substantial cost savings over traditional long-distance telephone calls. The
main disadvantage of VoIP, as with cell phones, is a greater potential for dropped calls and
generally lower voice quality.
1.2 INTRODUCTION

Since the telephone was invented in the late 1800s, telephone communication has not
changed substantially. Of course, new technologies like digital circuits, DTMF (or "touch
tone"), and caller ID have improved on this invention, but the basic functionality is still
the same. Over the years, service providers made a number of changes "behind the scenes"
to improve on the kinds and types of services offered to subscribers, including toll-free
numbers, call-return, call forwarding, etc. By and large, users do not know how those
services work, but they do know two things: the same old telephone is used and the
service provider charges for each and every little incremental service addition introduced.

In the 1990s, a number of individuals in research environments, both in educational and
corporate institutions, took a serious interest in carrying voice and video over IP
networks, especially corporate intranets and the Internet. This technology is commonly
referred to today as VoIP and is, in simple terms, the process of breaking up audio or
video into small chunks, transmitting those chunks over an IP network, and reassembling
those chunks at the far end so that two people can communicate using audio and video.

This idea of VoIP is certainly not new, as there are research papers and patents dating
back several decades and demonstrations of the concept given at various times over the
years. VoIP took center stage with the "information super highway" (or, the Internet)
concept that was popularized by former Vice President Al Gore in the 1990s, as the
Internet would make it possible to interconnect every home and every business with a
packet-switched data network. Before Al Gore's effort to grow the Internet, the Internet
was generally limited to use in academic environments, but the possibility of mass
deployment of the Internet sparked this renewed interest in VoIP.
1.3 WHY VOIP IS IMPORTANT

One of the most important things to point out is that VoIP is not limited to voice
communication. In fact, a number of efforts have been made to change this popular
marketing term to better reflect the fact that VoIP means voice, video, and data
conferencing. All such attempts have failed up to this point, but do understand that video
telephony and real-time text communication (ToIP), for example, are definitely within the
scope of VoIP.

VoIP is important because, for the first time in more than 100 years, there is an
opportunity to bring about significant change in the way that people communicate. In
addition to being able to use the telephones we have today to communicate in real-time,
we also have the possibility of using pure IP-based phones, including desktop and
wireless phones. We also have the ability to use videophones, much like those seen in
science fiction movies. Rather than calling home to talk to the family, a person can call
home to see the family.

One of the more interesting aspects of VoIP is that we also have the ability to integrate a
stand-alone telephone or videophone with the personal computer. One can use a
computer entirely for voice and video communications (softphones), use a telephone for
voice and the computer for video, or can simply use the computer in conjunction with a
separate voice/video phone to provide data conferencing functions, like application
sharing, electronic whiteboarding, and text chat.

VoIP allows something else: the ability to use a single high-speed Internet connection for
all voice, video, and data communications. This idea is commonly referred to as
convergence and is one of the primary drivers for corporate interest in the technology.
The benefit of convergence should be fairly obvious: by using a single data network for
all communications, it is possible to reduce the overall maintenance and deployment
costs. The benefit for both home and corporate customers is that they now have the
opportunity to choose from a much larger selection of service providers to provide voice
and video communication services. Since the VoIP service provider can be located
virtually anywhere in the world, a person with Internet access is no longer geographically
restricted in their selection of service providers and is certainly not bound to their Internet
access provider.

In short, VoIP enables people to communicate in more ways and with more choices.

2. HISTORY
2.1 Origin

Russel Shaw posted quite a nice article on his ZDNet blog. For everyone who's
wondering how VoIP came about, what the original idea was, where VoIP originated
from, here's part of the answer. VoIP's birth:

I would like to offer up a suggestion for a product, or perhaps I should say a technology.
This is an idea that I had that is really an extension of existing products, but I want to go
on record as proposing this now so that when someone gets the bright idea in a few
months or years, I can point to this as "prior art" (the Telecom Archives ARE permanent,
aren't they?).

The idea is this: At some point on the Internet you have a server that connects to the
telephone network. It can detect ringing and seize (answer) the line, or it can pick up the
line and initiate outdialing. So far all of this can be done using existing products
(modems, for example). But what I would then propose for this new technology is to take
the audio from the phone line and convert it into an audio data stream that can be sent to
another location on the Internet. In a similar manner, this product should be able to accept
an audio stream from the Internet and send it out to the phone line.

On the user (client) end, a companion product (designed to work with the server) would
operate similar to IPhone or another two-way voice over Internet product, except that
when the server receives a ringing signal from the telephone line, it would send a data
packet to the user's program that would cause an audible (or other) signal to sound or
appear on the video display of the user's computer.

2.2 Development

1974 — The Institute of Electrical and Electronic Engineers (IEEE) published a paper
titled "A Protocol for Packet Network Intercommunication."

1981 — IPv4 is described in RFC 791.

1985 — The National Science Foundation commissions the creation of NSFNET.

1995 — VocalTec releases the first commercial Internet phone software.

1996 —ITU-T begins development of standards for the transmission and signaling of
voice communications over Internet Protocol networks with the H.323 standard.

US telecommunication companies petition the US Congress to ban Internet phone
technology.

1997 — Level 3 began development of its first softswitch, a term they coined in 1998.

1999 —The Session Initiation Protocol (SIP) specification RFC 2543 is released.

Mark Spencer of Digium develops the first open source private branch exchange (PBX)
software (Asterisk).

2004 — Commercial VoIP service providers proliferate.

2005 — OpenSER (later Kamailio and OpenSIPS) SIP proxy server is forked from
the SIP Express Router.

2006 — FreeSWITCH open source software is released.


3 STANDARDS FOR VOIP

There are a number of protocols that may be employed in order to provide
VoIP communication services. In this section, we will focus on those which are most
common to the majority of the devices deployed and being deployed today.

Virtually every VoIP device in the world uses a standard called the Real-time Transport
Protocol (RTP) for transmitting audio and video packets between communicating computers.
RTP is defined by the IETF in RFC 3550. The payload formats for a number of CODECs are
defined in RFC 3551, though payload format specifications are also defined in documents
published by the ITU and in other IETF RFCs. RTP also addresses issues like packet
order and provides mechanisms (via the RTP Control Protocol, or RTCP, also
defined in RFC 3550) to help address delay and jitter.

One of the areas of concern for people communicating over the Internet is the potential for a
person to eavesdrop on communication. To address these security concerns, RTP was
improved upon with the result being called Secure RTP (defined in RFC 3711). Secure
RTP provides for encryption, authentication, and integrity of the audio and video packets
transmitted between communicating devices.

Before audio or video media can flow between two computers, various protocols must be
employed to find the remote device and to negotiate the means by which media will flow
between the two devices. The protocols that are central to this process are referred to as
call-signaling protocols, the most popular of which are H.323 and the Session Initiation
Protocol (SIP). Both rely on static provisioning, RAS (ITU-T Rec. H.225.0),
DNS, TRIP (RFC 3219), ENUM (RFC 3761), and other protocols to find other users.

H.323 and SIP both have their origins in 1995 as researchers looked to solve the problem
of how two computers can initiate communication in order to exchange audio and video
media streams. H.323 enjoyed the first commercial success, due to the fact that those
working on the protocol in the ITU worked quickly to publish the first standard in early
1996. SIP, on the other hand, progressed much more slowly in the IETF, with the first
draft published in 1996, but the first recognized "standard" published later in 1999. SIP
was revised over the years and re-published in 2002 as RFC 3261, which is the currently
recognized standard for SIP. These delays in the standards process resulted in delays in
market adoption of the SIP protocol.

Fundamentally, H.323 and SIP allow users to do the same thing: to establish multimedia
communication (audio, video, or other data communication). However, H.323 and SIP
differ significantly in design, with H.323 borrowing heavily from legacy communication
systems and being a binary protocol, and with SIP not adopting many of the information
elements found in legacy systems and being an ASCII-based protocol. Supporters of each
protocol have debated at length as to which approach is better and the results are certainly
mixed.

Over the years, there have been a lot of papers debating H.323 vs. SIP, but most of the
arguments have been "religious" in nature (e.g., "ITU vs. IETF" and "binary versus
ASCII"). Very few of the papers and reports have compared the protocols on the basis of
functionality and what really matters: does the protocol do the job? The fact is, both can
do the job, though H.323 is superior in a number of ways: better interoperability with
the PSTN, better support for video, excellent interoperability with legacy video systems
(e.g., H.320), and reliable out-of-band transport of DTMF. SIP, being a "session initiation
protocol", was not designed to address many of the problems that were raised and solved
in legacy communication systems. SIP was also popularized in the market through
misstatements that it was "easy to implement and debug". The truth is that there is a
certain amount of complexity in any communication system and, no matter how one
looks at it, it requires about the same amount of work to do the same thing two different
ways.

In the simplest deployment, the SIP implementation is certainly easier to develop and
troubleshoot. However, there are very few real-world deployments that are "simple". As a
result, SIP proponents have defined a number of non-standard variations of SIP (e.g.,
SIP-T and SIP-I), as well as a number of non-standard extensions in order to carry the
necessary information or provide the required functionality. Some have said that there are
as many variations of SIP as there are SIP deployments.

Today, H.323 still commands the bulk of the VoIP deployments in the service provider
market for voice transit, especially for transporting voice calls internationally. H.323 is
also widely used in room-based video conferencing systems and is the #1 protocol for IP-
based video systems. SIP has, most recently, become more popular for use in instant
messaging systems, though there have been no successful commercial deployments of
SIP-based instant messaging at the time of this writing.

Both H.323 and SIP can be referred to as "intelligent endpoint protocols". What this
means is that all of the intelligence required to locate the remote endpoint and to establish
media streams between the local and remote device is an integral part of the protocol.
There is another class of protocols which is complementary to H.323 and SIP referred to
as "device control protocols". Those protocols are H.248 and MGCP.

To understand the purpose of H.248 and MGCP, it is important to first understand the
function of a gateway. A gateway is a device that offers an IP interface on one side and
some sort of legacy telephone interface on the other side. The legacy telephone interface
may be complex, such as an interface to a legacy PSTN switch, or may be a simple
interface that allows one to connect one or a few traditional telephones. Depending on the
size and purpose of the gateway, it may allow IP-originated calls to terminate to the
PSTN (and vice-versa) or may simply provide a means for a person to connect a
telephone to the Internet.

Originally, gateways were viewed as monolithic devices that had call control (H.323/SIP)
and hardware required to control the PSTN interface. In 1998, the idea of splitting the
gateway into two logical parts was proposed: one part, which contains the call control
logic, is called the media gateway controller (MGC) or call agent (CA), and the other
part, which interfaces with the PSTN, is called the media gateway (MG). With this
functional split, a new interface existed (going between the MGC and MG), driving the
necessity to define MGCP and H.248.

Some service providers provide users with devices that implement H.248 or MGCP (or
comparable protocols). In the core of the network, some device serving as the MGC
provides the H.323 or SIP logic necessary to properly terminate VoIP calls around the
world.

Outside of H.323/SIP and H.248/MGCP, there are also non-standard protocols introduced
by various companies that have been very successful in the market. Skype is one such
company that has been extremely successful using a proprietary protocol. Which protocol
is best for you? It really depends on your requirements, but most people simply want to
make a phone call and, as such, it really does not matter.

It is also important to remember that, just as with every other new capability introduced
in the world of high-tech, there is always something new and bigger coming down the
road. Presently, the ITU is working on a new protocol that will have much more
capability than either SIP or H.323. The new protocol is referred to as H.325 and is
expected to enable voice, video, and data communications capabilities across a number of
separate devices that work together, such as a mobile phone, a PC, and even an HD TV!

4. VOIP ENABLED SERVICES

Many people have proclaimed that VoIP enables all kinds of new services that were
never possible before. This is certainly true, though the hype far exceeds reality and what
is practical. Even so, there are a number of new capabilities which are practical and will
come forward as we continue to deploy VoIP systems.

Video telephony is probably the first new service that will come forward that helps set
VoIP apart from traditional telephone systems. Service providers are already rolling out
services offering video terminals to allow people to call friends and family using video-
enabled phones.

VoIP also allows one to potentially launch calls from the PC, determine the availability
of friends and family members (called "presence"), control telephone services from the
PC, etc. The market acceptance of most of these new kinds of services is questionable at
this point, but the potential is there and has certainly garnered a tremendous amount of
focus from companies trying to find a niche in this new market.

The one business application that VoIP, video telephony (or, videoconferencing), and
instant messaging will enable is application sharing and electronic whiteboarding.
The ITU has defined a suite of protocols (called T.120) to address this application and it
has been used in tools like Microsoft NetMeeting. While NetMeeting met some success,
it failed to gain wider market adoption due to the fact that it was somewhat difficult to set
up and use in a corporate environment. By having better integration with the phone and
wider deployment of VoIP, businesses will probably find the ability to do application
sharing and electronic whiteboarding very appealing in order to improve productivity.
These kinds of services that are related to VoIP are most exciting.

4.1 VIDEO TELEPHONY


Video telephony is the term used for communication between people using video in
addition to other forms of media (e.g., audio). Video telephony systems exist for
ISDN, PSTN, and IP networks.

Video Telephony is virtually the same as video conferencing and, as such, we suggest
that readers follow the links and information related to video conferencing for more
details on the subject.

4.2 HOW DOES VOIP WORK


Many people have used a computer and a microphone to record a human voice or other
sounds. The process involves sampling the sound that is heard by the computer at a very
high rate (8,000 times per second or more) and storing those "samples" in
memory or in a file on the computer. Each sample of sound is just a very tiny bit of the
person's voice or other sound recorded by the computer. The computer has the
wherewithal to take all of those samples and play them, so that the listener can hear what
was recorded.

VoIP is based on the same idea, but the difference is that the audio samples are not stored
locally. Instead, they are sent over the IP network to another computer and played there.

Of course, there is much more required in order to make VoIP work. When recording the
sound samples, the computer might compress those sounds so that they require less space
and will certainly record only a limited frequency range. There are a number of ways to
compress audio; the algorithm that does so is referred to as a "compressor/de-compressor",
or simply CODEC. Many CODECs exist for a variety of applications (e.g., movies and
sound recordings) and, for VoIP, the CODECs are optimized for compressing voice,
which significantly reduces the bandwidth used compared to an uncompressed audio
stream. Speech CODECs are optimized to improve spoken words at the expense of
sounds outside the frequency range of human speech. Recorded music and other sounds
do not generally sound very good when passed through a speech CODEC, but that is
perfectly OK for the task at hand.

Once the sound is recorded by the computer and compressed into very small samples, the
samples are collected together into larger chunks and placed into data packets for
transmission over the IP network. This process is referred to as packetization. Generally, a
single IP packet will contain 10 or more milliseconds of audio, with 20 or 30
milliseconds being most common.
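
The arithmetic behind packetization can be sketched as follows. This is only an illustration under stated assumptions that are not taken from the text above: 8-bit samples at 8,000 samples per second (as in a G.711-style CODEC) and the usual minimum header sizes of 20 bytes for IPv4, 8 bytes for UDP, and 12 bytes for RTP. The function name packet_stats is hypothetical.

```python
# Assumptions (not from the report): 8-bit samples at 8 kHz and
# minimum-size IPv4 (20) + UDP (8) + RTP (12) headers on every packet.
SAMPLE_RATE_HZ = 8000
BITS_PER_SAMPLE = 8
HEADER_BYTES = 20 + 8 + 12


def packet_stats(frame_ms: int) -> dict:
    """Payload size, packet rate and one-way bit rate for a given frame length."""
    samples_per_packet = SAMPLE_RATE_HZ * frame_ms // 1000
    payload_bytes = samples_per_packet * BITS_PER_SAMPLE // 8
    packets_per_second = 1000 / frame_ms
    wire_bps = (payload_bytes + HEADER_BYTES) * 8 * packets_per_second
    return {
        "samples_per_packet": samples_per_packet,
        "payload_bytes": payload_bytes,
        "packets_per_second": packets_per_second,
        "wire_bps": wire_bps,
    }


for ms in (10, 20, 30):
    print(ms, "ms:", packet_stats(ms))
```

Under these assumptions a 20 ms packet carries 160 bytes of audio and the stream needs about 80 kbit/s in each direction. Shorter packets lower the delay but spend a larger share of the bandwidth on headers.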

Vint Cerf, who is often called the Father of the Internet, once explained packets in a way
that is very easy to understand. Paraphrasing his description, he suggested thinking of a
packet as a postcard sent via postal mail. A postcard contains just a limited amount of
information. To deliver a very long message, one must send a lot of postcards. Of course,
the post office might lose one or more postcards. One also has to assemble the received
postcards in order, so some kind of mechanism must be used to properly order the
postcards, such as placing a sequence number on the bottom right corner. One can think
of data packets in an IP network as postcards.
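
The postcard analogy can be made concrete with a minimal sketch. It assumes each packet is a (sequence number, payload) pair and that the receiver knows which range of sequence numbers to expect; the function name reassemble is hypothetical, and sequence-number wrap-around is ignored for brevity.

```python
def reassemble(packets, first_seq, count):
    """Put (seq, payload) pairs in order and report which numbers never arrived."""
    received = {}
    for seq, payload in packets:
        received.setdefault(seq, payload)  # keep the first copy, ignore duplicates
    expected = range(first_seq, first_seq + count)
    ordered = [received.get(seq) for seq in expected]  # None marks a lost packet
    missing = [seq for seq in expected if seq not in received]
    return ordered, missing


# Postcards 1..4 were sent; number 2 was lost and number 1 arrived twice.
arrived = [(3, "c"), (1, "a"), (4, "d"), (1, "a")]
ordered, missing = reassemble(arrived, first_seq=1, count=4)
print(ordered)  # ['a', None, 'c', 'd']
print(missing)  # [2]
```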

Just like postcards sent via the postal system, some IP data packets get lost and the
CODECs must compensate for lost packets by "filling in the gaps" with audio that is
acceptable to the human ear. This process is referred to as packet-loss
concealment (PLC). In some cases, packets are sent multiple times in order to overcome
packet loss. This method is called, appropriately enough, redundancy. Another method to
address packet loss, known as forward-error correction (FEC), is to include some
information from previously transmitted packets in subsequent packets. By performing
mathematical operations in a particular FEC scheme, it is possible to reconstruct a lost
packet from information bits in neighboring packets.
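
One very simple FEC scheme, chosen here only for illustration and not necessarily the one any particular VoIP product uses, is XOR parity: the sender adds one parity packet per group, and the receiver can rebuild any single lost packet of that group. The sketch assumes all packets in a group have the same length; the helper names are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets):
    """XOR all packets of a group together to form one parity packet."""
    parity = bytes(len(packets[0]))  # all-zero start value
    for packet in packets:
        parity = xor_bytes(parity, packet)
    return parity


def recover(packets, parity):
    """Rebuild the single missing packet (marked None) from the parity packet."""
    missing = [i for i, packet in enumerate(packets) if packet is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one lost packet")
    rebuilt = parity
    for packet in packets:
        if packet is not None:
            rebuilt = xor_bytes(rebuilt, packet)
    result = list(packets)
    result[missing[0]] = rebuilt
    return result


group = [b"aaaa", b"bbbb", b"cccc"]
parity = make_parity(group)
print(recover([b"aaaa", None, b"cccc"], parity))  # [b'aaaa', b'bbbb', b'cccc']
```

The price of this protection is one extra packet per group and the delay of waiting for the whole group before a lost packet can be rebuilt.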

Packets are also sometimes delayed, just as with the postcards sent through the post
office. This is particularly problematic for VoIP systems, as a delay in delivering a voice
packet can mean the information is too old to play. Such old packets are simply discarded,
just as if the packet was never received. This is acceptable, as the same PLC algorithms
can smooth the audio to provide good audio quality.

Computers generally measure the packet delay and expect the delay to remain relatively
constant, though delay can increase and decrease during the course of a conversation.
Variation in delay (called jitter) is the most frustrating for IP devices. Delay, itself, just
means it takes longer for the recorded voice spoken by the first person to be heard by the
user on the far end. In general, good networks have an end-to-end delay of less than
100ms, though delay up to 400ms is considered acceptable (especially when using
satellite systems). Jitter can result in choppy voice or temporary glitches, so VoIP devices
must implement jitter buffer algorithms to compensate for jitter. Essentially, this means
that a certain number of packets are queued before play-out and the queue length may be
increased or decreased over time to reduce the number of discarded, late-arriving packets
or to reduce "mouth to ear" delay. Such "adaptive jitter buffer" schemes are also used by
CD recorders and other types of devices that deal with variable delay.
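
Before a device can size its jitter buffer it needs an estimate of the jitter itself. The sketch below follows the running interarrival-jitter estimate described in RFC 3550, where each new delay difference moves the estimate by one sixteenth of the gap. For readability it uses milliseconds rather than RTP timestamp units, and the function name and sample values are assumptions made for illustration.

```python
def interarrival_jitter(send_times, recv_times):
    """Running jitter estimate, in the same time unit as the inputs."""
    jitter = 0.0
    for i in range(1, len(send_times)):
        # Change in transit time between packet i-1 and packet i.
        d = (recv_times[i] - recv_times[i - 1]) - (send_times[i] - send_times[i - 1])
        jitter += (abs(d) - jitter) / 16
    return jitter


sent = [0, 20, 40, 60]  # one packet every 20 ms
print(interarrival_jitter(sent, [50, 70, 90, 110]))   # constant delay -> 0.0
print(interarrival_jitter(sent, [50, 70, 100, 110]))  # one late packet -> 1.2109375
```

A constant delay, however large, yields zero jitter; only variation in delay raises the estimate, which matches the distinction between delay and jitter drawn above.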

Video works in much the same way as voice. Video information received through a
camera is broken into small pieces, compressed with a CODEC, placed into small
packets, and transmitted over the IP network. This is one reason why VoIP is promising
as a new technology: adding video or other media is relatively simple. Of course, there
are certain issues that must be considered that are unique to video (e.g., frame refresh and
much higher bandwidth requirements), but the basic principles of VoIP equally apply
to video telephony.

Of course there is much more to VoIP than just sending the audio/video packets over the
Internet. There must also be an agreed protocol for how computers find each other and
how information is exchanged in order to allow packets to ultimately flow between the
communicating devices. There must also be an agreed format (called payload format) for
the contents of the media packets. We will describe some of the popular VoIP protocols
in the next section.

Through this section, we have focused on computers that communicate with each other.
However, VoIP is certainly not limited to desktop computers. VoIP is implemented in a
variety of hardware devices, including IP phones, analog terminal adapters (ATAs),
and gateways. In short, a large number of devices can enable VoIP communication, some
of which allow one to use traditional telephone devices to interface with the IP networks:
one does not have to throw out existing equipment to migrate to VoIP.

4.3 TCP / IP
What is TCP/IP? TCP/IP (Transmission Control Protocol/Internet Protocol) is the basic
communication language or protocol of the Internet. It can also be used as a
communications protocol in a private network (either an intranet or an extranet). When
you are set up with direct access to the Internet, your computer is provided with a copy of
the TCP/IP program just as every other computer that you may send messages to or get
information from also has a copy of TCP/IP.

TCP/IP is a two-layer program. The higher layer, Transmission Control Protocol,
manages the assembling of a message or file into smaller packets that are transmitted over
the Internet and received by a TCP layer that reassembles the packets into the original
message. The lower layer, Internet Protocol, handles the address part of each packet so
that it gets to the right destination. Each gateway computer on the network checks this
address to see where to forward the message. Even though some packets from the same
message are routed differently than others, they'll be reassembled at the destination.

TCP/IP uses the client/server model of communication in which a computer user (a
client) requests and is provided a service (such as sending a Web page) by another
computer (a server) in the network. TCP/IP communication is primarily point-to-point,
meaning each communication is from one point (or host computer) in the network to
another point or host computer. TCP/IP and the higher-level applications that use it are
collectively said to be "stateless" because each client request is considered a new request
unrelated to any previous one (unlike ordinary phone conversations that require a
dedicated connection for the call duration). Being stateless frees network paths so that
everyone can use them continuously. (Note that the TCP layer itself is not stateless as far
as any one message is concerned. Its connection remains in place until all packets in a
message have been received.)

Many Internet users are familiar with the even higher layer application protocols that use
TCP/IP to get to the Internet. These include the World Wide Web's Hypertext Transfer
Protocol (HTTP), the File Transfer Protocol (FTP), Telnet, which lets you log on
to remote computers, and the Simple Mail Transfer Protocol (SMTP). These and other
protocols are often packaged together with TCP/IP as a "suite."

Personal computer users with an analog phone modem connection to the Internet usually
get to the Internet through the Serial Line Internet Protocol (SLIP) or the Point-to-Point
Protocol (PPP). These protocols encapsulate the IP packets so that they can be sent over
the dial-up phone connection to an access provider's modem.

Protocols related to TCP/IP include the User Datagram Protocol (UDP), which is used
instead of TCP for special purposes. Other protocols are used by network host computers
for exchanging router information. These include the Internet Control Message Protocol
(ICMP), the Interior Gateway Protocol (IGP), the Exterior Gateway Protocol (EGP), and
the Border Gateway Protocol (BGP).

4.4 UDP

The User Datagram Protocol (UDP) is one of the core members of the Internet Protocol
Suite, the set of network protocols used for the Internet. With UDP, computer
applications can send messages, in this case referred to as datagrams, to other hosts on
an Internet Protocol (IP) network without requiring prior communications to set up
special transmission channels or data paths. UDP is sometimes called the Universal
Datagram Protocol. The protocol was designed by David P. Reed in 1980 and formally
defined in RFC 768.

UDP uses a simple transmission model without implicit hand-shaking dialogues for
guaranteeing reliability, ordering, or data integrity. Thus, UDP provides an unreliable
service and datagrams may arrive out of order, appear duplicated, or go missing without
notice. UDP assumes that error checking and correction is either not necessary or
performed in the application, avoiding the overhead of such processing at the network
interface level. Time-sensitive applications often use UDP because dropping packets is
preferable to waiting for delayed packets, which may not be an option in a real-time
system. If error correction facilities are needed at the network interface level, an
application may use the Transmission Control Protocol (TCP) or Stream Control
Transmission Protocol (SCTP) which are designed for this purpose.

UDP's stateless nature is also useful for servers that answer small queries from huge
numbers of clients. Unlike TCP, UDP is compatible with packet broadcast (sending to all
hosts on the local network) and multicasting (sending to all subscribers).

Common network applications that use UDP include: the Domain Name
System (DNS), streaming media applications such as IPTV, Voice over
IP (VoIP), Trivial File Transfer Protocol (TFTP) and many online games.
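
The connectionless nature of UDP can be seen in a minimal sketch using Python's standard socket module. Both sockets live on the same machine and talk over the loopback address, which is an assumption made so the example is self-contained; the payload text is arbitrary.

```python
import socket

# Receiver: bind to the loopback address and let the OS pick a free port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # UDP gives no delivery guarantee, so never wait forever
address = receiver.getsockname()

# Sender: no connection set-up or handshake, the datagram is simply sent.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"voice frame 1", address)

data, source = receiver.recvfrom(2048)
print(data, "from", source)

sender.close()
receiver.close()
```

If the datagram were lost, recvfrom would raise a timeout error instead of the sender being told to retransmit, which is exactly the trade-off time-sensitive applications accept.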

4.5 RTP

The Real-time Transport Protocol (RTP) was developed by the Audio/Video Transport
working group of the IETF standards organization. RTP is used in conjunction with other
protocols such as H.323 and RTSP. The RTP standard defines a pair of protocols, RTP and
the Real-time Transport Control Protocol (RTCP). RTP is used for the transfer of
multimedia data, and RTCP is used to periodically send control information and QoS
parameters.

RTP is designed for end-to-end, real-time transfer of multimedia data. The protocol
provides facilities for jitter compensation and for detecting out-of-sequence arrival of
data, both of which are common during transmission over an IP network. RTP supports data
transfer to multiple destinations through multicast. RTP is regarded as the primary
standard for audio/video transport in IP networks and is used with an associated profile
and payload format.
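
As an illustration of how jitter can be measured before it is compensated, the sketch
below follows the running interarrival jitter estimate described in the RTP
specification (RFC 3550), which smooths the change in transit time with a gain of 1/16.
The function name and the plain-list inputs are assumptions made for this example.

```python
def interarrival_jitter(send_ts, recv_ts):
    """Running jitter estimate, in the same units as the timestamps."""
    jitter = 0.0
    for i in range(1, len(send_ts)):
        # Change in transit time between two consecutive packets.
        d = (recv_ts[i] - recv_ts[i - 1]) - (send_ts[i] - send_ts[i - 1])
        # Exponential smoothing with gain 1/16.
        jitter += (abs(d) - jitter) / 16.0
    return jitter
```

If every packet experiences the same network delay the estimate stays at zero; a single
late packet raises it, and it then decays slowly as later packets arrive on time.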

Real-time multimedia streaming applications require timely delivery of information and
can tolerate some packet loss to achieve this goal. For example, loss of a packet in an
audio application may result in the loss of a fraction of a second of audio data, which
can be made unnoticeable with suitable error concealment algorithms.[4] The Transmission
Control Protocol (TCP), although standardized for RTP use (RFC 4571), is not often used
by RTP because of the inherent latency introduced by connection establishment and error
correction; instead, the majority of RTP implementations are built on the User Datagram
Protocol (UDP).[4] Other transport protocols specifically designed for multimedia sessions
are SCTP and DCCP, although they are not yet in widespread use.

Protocol components
The RTP specification describes two sub-protocols:

• The data transfer protocol, RTP, which deals with the transfer of real-time multimedia
data. Information provided by this protocol includes timestamps (for synchronization),
sequence numbers (for packet loss detection) and the payload format, which indicates
the encoded format of the data.
• The Real-time Transport Control Protocol (RTCP), which is used to provide Quality of
Service (QoS) feedback and synchronization between the media streams. The bandwidth of
RTCP traffic compared to RTP is small, typically around 5%.
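
The fields listed above can be made concrete with a small sketch of the 12-byte fixed
RTP header (version, marker bit, payload type, sequence number, timestamp and SSRC
source identifier). It assumes no padding, header extension or CSRC entries, and the
helper names are hypothetical. The last function shows how gaps in the sequence numbers
reveal lost packets.

```python
import struct

RTP_VERSION = 2

def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Build the 12-byte fixed RTP header (no padding, extension or CSRCs)."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)

def parse_rtp_header(packet):
    """Decode the fixed header from the first 12 bytes of a packet."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }

def missing_sequence_numbers(received):
    """Sequence numbers absent between the lowest and highest seen (ignores wraparound)."""
    seen = set(received)
    return [s for s in range(min(received), max(received) + 1) if s not in seen]
```

A receiver would typically use the timestamp to schedule playout and the sequence
number to reorder packets or conceal the missing ones.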

5. SPEECH CODING TECHNIQUES


Speech coding is the application of data compression to digital audio signals
containing speech. Speech coding uses speech-specific parameter estimation, based on
audio signal processing techniques, to model the speech signal, combined with generic
data compression algorithms to represent the resulting modeled parameters in a compact
bitstream.

The two most important applications of speech coding are mobile telephony and Voice
over IP.

The techniques used in speech coding are similar to those in audio data
compression and audio coding, where knowledge of psychoacoustics is used to transmit
only data that is relevant to the human auditory system. For example,
in narrowband speech coding, only information in the frequency band 400 Hz to 3500 Hz
is transmitted, but the reconstructed signal is still adequate for intelligibility.

Speech coding differs from other forms of audio coding in that speech is a much simpler
signal than most other audio signals, and that there is a lot more statistical information
available about the properties of speech. As a result, some auditory information which is
relevant in audio coding can be unnecessary in the speech coding context. In speech
coding, the most important criterion is preservation of intelligibility and "pleasantness" of
speech, with a constrained amount of transmitted data.

It should be emphasised that the intelligibility of speech includes, besides the actual
literal content, also speaker identity, emotions, intonation, timbre etc. that are all
important for perfect intelligibility. The more abstract concept of pleasantness of
degraded speech is a different property than intelligibility, since it is possible that
degraded speech is completely intelligible, but subjectively annoying to the listener.

In addition, most speech applications require low coding delay, as long coding delays
interfere with speech interaction.

From this viewpoint, the A-law and μ-law algorithms (G.711) used in
traditional PCM digital telephony can be seen as a very early precursor of speech
encoding, requiring only 8 bits per sample but giving effectively 12 bits of resolution.
Although this would generate unacceptable distortion in a music signal, the peaky nature
of speech waveforms, combined with the simple frequency structure of speech as a
periodic waveform with a single fundamental frequency and occasional added noise
bursts, makes these very simple instantaneous compression algorithms acceptable for
speech.
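
The idea of spending 8 bits per sample while keeping fine resolution for quiet sounds
can be sketched with the continuous μ-law companding curve (μ = 255). This is only an
illustration: G.711 itself specifies a piecewise-linear approximation with a fixed bit
layout, which this sketch does not reproduce, and the function names are hypothetical.

```python
import math

MU = 255  # companding parameter of the mu-law curve

def mu_law_compress(x: float) -> float:
    """Map a sample in [-1, 1] to [-1, 1] on a logarithmic scale."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)

def encode_8bit(x: float) -> int:
    """Compand, then quantize uniformly to one of 256 codes (0..255)."""
    y = mu_law_compress(max(-1.0, min(1.0, x)))
    return int(round((y + 1) / 2 * 255))

def decode_8bit(code: int) -> float:
    """Map an 8-bit code back to an approximate sample value."""
    return mu_law_expand(code / 255 * 2 - 1)
```

Because the quantization steps are small near zero and large near full scale, a quiet
sample is reproduced far more accurately than a uniform 8-bit quantizer would allow,
which is the property the text attributes to A-law and μ-law coding.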

A wide variety of other algorithms were tried at the time, mostly variants on delta
modulation, but after careful consideration, the A-law/μ-law algorithms were chosen by
the designers of the early digital telephony systems. At the time of their design, their 33%
bandwidth reduction for a very low complexity made them an excellent engineering
compromise. Their audio performance remains acceptable, and there has been no need to
replace them in the stationary phone network.

In 2008, the G.711.1 codec, which has a scalable structure, was standardized by ITU-T.
Its input sampling rate is 16 kHz.

6. H.323
H.323 is a recommendation from the ITU Telecommunication Standardization Sector
(ITU-T) that defines the protocols to provide audio-visual communication sessions on
any packet network. The H.323 standard addresses call signaling and control, multimedia
transport and control, and bandwidth control for point-to-point and multi-point
conferences.

It is widely implemented by voice and videoconferencing equipment manufacturers, is
used within various Internet real-time applications such as GnuGK and NetMeeting, and
is widely deployed worldwide by service providers and enterprises for both voice
and video services over Internet Protocol (IP) networks.

It is a part of the ITU-T H.32x series of protocols, which also
address multimedia communications over Integrated Services Digital
Network (ISDN), Public Switched Telephone Network (PSTN) or Signaling System
7 (SS7), and 3G mobile networks.

H.323 Call Signaling is based on the ITU-T Recommendation Q.931 protocol and is
suited for transmitting calls across networks using a mixture of IP, PSTN, ISDN,
and QSIG over ISDN. A call model, similar to the ISDN call model, eases the
introduction of IP telephony into existing networks of ISDN-based PBX systems,
including transitions to IP-based Private Branch eXchanges (PBXs).

Within the context of H.323, an IP-based PBX might be an H.323 Gatekeeper or other
call control element that provides service to telephones or videophones. Such a device
may provide or facilitate both basic services and supplementary services, such as call
transfer, park, pick-up, and hold.

While H.323 excels at providing basic telephony functionality and interoperability,
H.323’s strength lies in multimedia communication functionality designed specifically
for IP networks.

6.1 CODECS

H.323 utilizes both ITU-defined codecs and codecs defined outside the ITU. Codecs that
are widely implemented by H.323 equipment include:

• Audio codecs: G.711, G.729 (including G.729a), G.723.1, G.726, G.722, G.728, Speex
• Text codecs: T.140
• Video codecs: H.261, H.263, H.264

All H.323 terminals providing video communications shall be capable of encoding and
decoding video according to H.261 QCIF. All H.323 terminals shall have an audio codec
and shall be capable of encoding and decoding speech according to ITU-T Rec. G.711.
All terminals shall be capable of transmitting and receiving A-law and μ-law. Support for
other audio and video codecs is optional.
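
Since G.711 is the mandatory audio codec, it is useful to estimate what one call costs
on the network. The sketch below is a back-of-the-envelope calculation, assuming 20 ms
of speech per packet (a common but not mandated choice), IPv4 without options, and RTP
carried over UDP; link-layer framing is ignored and the function name is hypothetical.

```python
SAMPLE_RATE_HZ = 8000   # narrowband telephony sampling rate
BITS_PER_SAMPLE = 8     # G.711 A-law / mu-law
IPV4_HEADER = 20        # bytes, no options
UDP_HEADER = 8          # bytes
RTP_HEADER = 12         # bytes, fixed header only

def g711_bandwidth_bps(packet_ms=20):
    """Return (codec_bps, ip_layer_bps) for one direction of a call."""
    codec_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE
    payload_bytes = codec_bps * packet_ms // 1000 // 8
    packets_per_second = 1000 / packet_ms
    packet_bytes = payload_bytes + IPV4_HEADER + UDP_HEADER + RTP_HEADER
    return codec_bps, packet_bytes * 8 * packets_per_second
```

Under these assumptions the 64 kbit/s codec stream becomes 80 kbit/s at the IP layer,
and shorter packets lower the delay at the price of more header overhead.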

6.2 H.323 and Voice over IP services

Voice over Internet Protocol (VoIP) describes the transmission of voice using the Internet
or other packet-switched networks. ITU-T Recommendation H.323 is one of the
standards used in VoIP. VoIP requires a connection to the Internet or another
packet-switched network, a subscription to a VoIP service provider and a client (an
analogue telephone adapter (ATA), VoIP phone or "soft phone"). The service provider
offers the connection to other VoIP services or to the PSTN. Most service providers
charge a monthly fee, then additional costs when calls are made. A service provider is
not always required: using VoIP between two enterprise locations, for example, would
not necessarily require one. H.323 has been widely deployed by companies who wish to
interconnect remote locations over IP using a variety of wired and wireless technologies.

6.3 H.323 and Videoconference services

A videoconference, or videoteleconference (VTC), is a set of telecommunication
technologies allowing two or more locations to interact via two-way video and audio
transmissions simultaneously. There are basically two types of videoconferencing:
dedicated VTC systems have all required components packaged into a single piece of
equipment, while desktop VTC systems are add-ons to normal PCs, transforming them into
VTC devices. Simultaneous videoconferencing among three or more remote points is
possible by means of a Multipoint Control Unit (MCU). There are MCU bridges for IP and
ISDN-based videoconferencing. Due to the price point and proliferation of the Internet,
and broadband in particular, there has been a strong spurt of growth and use of
H.323-based IP videoconferencing. H.323 is accessible to anyone with a high-speed
Internet connection, such as DSL. Videoconferencing is utilized in various situations,
for example distance education, telemedicine and business.

7. SIP

The Session Initiation Protocol (SIP) is an IETF-defined signaling protocol, widely used
for controlling multimedia communication sessions such as voice and video calls
over Internet Protocol (IP). The protocol can be used for creating, modifying and
terminating two-party (unicast) or multiparty (multicast) sessions consisting of one or
several media streams. The modification can involve changing addresses or ports,
inviting more participants, and adding or deleting media streams. Other feasible
application examples include video conferencing, streaming multimedia
distribution, instant messaging, presence information, file transfer and online games.

SIP was originally designed by Henning Schulzrinne and Mark Handley starting in 1996.
The latest version of the specification is RFC 3261 from the IETF Network Working
Group. In November 2000, SIP was accepted as a 3GPP signaling protocol and
permanent element of the IP Multimedia Subsystem (IMS) architecture for IP-based
streaming multimedia services in cellular systems.

SIP is an Application Layer protocol designed to be independent of the underlying
transport layer; it can run on Transmission Control Protocol (TCP), User Datagram
Protocol (UDP), or Stream Control Transmission Protocol (SCTP). It is a text-based
protocol, incorporating many elements of the Hypertext Transfer Protocol (HTTP)
and the Simple Mail Transfer Protocol (SMTP).

7.1 PROTOCOL DESIGN


SIP employs design elements similar to the HTTP request/response transaction
model. Each transaction consists of a client request that invokes a particular method or
function on the server and at least one response. SIP reuses most of the header fields,
encoding rules and status codes of HTTP, providing a readable text-based format.
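
To show how closely a SIP request resembles an HTTP request, the sketch below parses a
hand-written INVITE. The addresses, tags and branch value are made up for illustration
and the parser is deliberately minimal: it keeps only the last value of a repeated
header and does not handle folded header lines or compact header names.

```python
def parse_sip_request(message: str):
    """Split a SIP request into (method, uri, version, headers, body)."""
    head, _, body = message.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line, e.g. "INVITE sip:bob@example.org SIP/2.0"
    method, uri, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, uri, version, headers, body

# Hypothetical example message; every identifier in it is invented.
EXAMPLE_INVITE = (
    "INVITE sip:bob@example.org SIP/2.0\r\n"
    "Via: SIP/2.0/UDP client.example.com:5060;branch=z9hG4bK74bf9\r\n"
    "From: <sip:alice@example.com>;tag=9fxced76sl\r\n"
    "To: <sip:bob@example.org>\r\n"
    "Call-ID: 3848276298220188511@client.example.com\r\n"
    "CSeq: 1 INVITE\r\n"
    "Max-Forwards: 70\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
```

The method (here INVITE) plays the role that GET or POST plays in HTTP, and a response
such as "SIP/2.0 200 OK" follows the same status-line pattern.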

SIP works in concert with several other protocols and is only involved in the signaling
portion of a communication session. SIP clients typically use TCP or UDP on port
numbers 5060 and/or 5061 to connect to SIP servers and other SIP endpoints. Port 5060
is commonly used for non-encrypted signaling traffic whereas port 5061 is typically used
for traffic encrypted with Transport Layer Security (TLS). SIP is primarily used in setting
up and tearing down voice or video calls. It has also found applications in messaging
applications, such as instant messaging, and event subscription and notification. There are
a large number of SIP-related Internet Engineering Task Force (IETF) documents that
define behavior for such applications. The voice and video stream communications in SIP
applications are carried over another application protocol, the Real-time Transport
Protocol (RTP). Parameters (port numbers, protocols, codecs) for these media streams are
defined and negotiated using the Session Description Protocol (SDP) which is transported
in the SIP packet body.
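
The negotiation of media parameters can be illustrated by reading the media lines of an
SDP body. The session description below is a hypothetical offer for one audio stream;
payload types 0 and 8 are the static RTP assignments for G.711 μ-law (PCMU) and A-law
(PCMA). The parser is a sketch that looks only at "m=" and "a=rtpmap" lines.

```python
def parse_sdp_media(sdp: str):
    """Return one dict per m= line with its port, transport and codec map."""
    media = []
    for line in sdp.splitlines():
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...
            kind, port, proto, *formats = line[2:].split()
            media.append({"kind": kind, "port": int(port), "proto": proto,
                          "formats": formats, "rtpmap": {}})
        elif line.startswith("a=rtpmap:") and media:
            # a=rtpmap:<payload type> <encoding>/<clock rate>
            pt, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            media[-1]["rtpmap"][pt] = encoding
    return media

# Hypothetical offer; host names and addresses are invented.
EXAMPLE_SDP = (
    "v=0\r\n"
    "o=alice 2890844526 2890844526 IN IP4 client.example.com\r\n"
    "s=-\r\n"
    "c=IN IP4 192.0.2.10\r\n"
    "t=0 0\r\n"
    "m=audio 49170 RTP/AVP 0 8\r\n"
    "a=rtpmap:0 PCMU/8000\r\n"
    "a=rtpmap:8 PCMA/8000\r\n"
)
```

The answering endpoint would reply with its own SDP listing the subset of formats it
accepts, after which RTP packets flow to the advertised port.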

A motivating goal for SIP was to provide a signaling and call setup protocol for IP-based
communications that can support a superset of the call processing functions and features
present in the public switched telephone network (PSTN). SIP by itself does not define
these features; rather, its focus is call-setup and signaling. However, it was designed to
enable the construction of functionalities of network elements designated proxy servers
and user agents. These are features that permit familiar telephone-like operations: dialing
a number, causing a phone to ring, hearing ringback tones or a busy signal.
Implementation and terminology are different in the SIP world but to the end-user, the
behavior is similar.

SIP-enabled telephony networks can also implement many of the more advanced call
processing features present in Signaling System 7 (SS7), though the two protocols
themselves are very different. SS7 is a centralized protocol, characterized by a complex
central network architecture and dumb endpoints (traditional telephone handsets). SIP is
a peer-to-peer protocol, thus it requires only a simple (and thus scalable) core network
with intelligence distributed to the network edge, embedded in endpoints (terminating
devices built in either hardware or software). SIP features are implemented in the
communicating endpoints (i.e. at the edge of the network) contrary to traditional SS7
features, which are implemented in the network.

Although several other VoIP signaling protocols exist, SIP is distinguished by its
proponents for having roots in the IP community rather than the telecommunications
industry. SIP has been standardized and governed primarily by the IETF, while other
protocols, such as H.323, have traditionally been associated with the International
Telecommunication Union (ITU).

The first proposed standard version (SIP 2.0) was defined by RFC 2543. This version of
the protocol was further refined and clarified in RFC 3261, although some
implementations still rely on the older definitions.

8. GLOBAL VIEW ON VOIP

VoIP and related real-time communication applications, such as video conferencing and
instant messaging, continue to attract considerable interest worldwide, with millions of
active private and business VoIP users today. Carriers and enterprises are increasingly
seeing the benefits of VoIP services that allow voice messaging and video conferencing
to be conducted securely, like email, as communications are transferred freely over
traditional phone networks and the Internet.

VoIP promises many business benefits and efficiency gains, from integrated and
streamlined voice and data communications to cost savings. In the rush to realise these
benefits and the economic impact it is easy to forget that VoIP is an IP service and that
VoIP networks and applications servers are exposed to all of the threats and risks that
face other IP network services.

Voice over Internet Protocol (VoIP) services have proved to be a disruptive technology
that has transformed the
telecommunication industry. VoIP has gained widespread acceptance among service
providers, consumers and businesses, offering a cheaper way to get in touch. Instead of
using conventional landlines, people can make phone calls via the Internet. And operators
themselves are saving money by using IP-based networks.

Convergence and VoIP services are redefining markets and blurring boundaries between
networks and content. They are eliminating barriers to entering markets (as competitors
no longer need to own a network) and bringing facilities-based providers into direct
competition with service-based competitors, redefining the role of regulators in the
process.

The size of the market

Estimating the global number of VoIP subscribers is difficult, for several reasons. The
various definitions in use mean that countries report different numbers. Also, it is hard to
estimate the number of computer-to-computer or pure VoIP users, including those who
employ such services as Skype, or who use embedded VoIP in online games. This means
that estimates of the total number of VoIP subscribers are almost always presented as a
range; for example, the number of residential VoIP customers in the United States is
projected to reach anywhere between 12 and 44 million by 2010.

As regards the worldwide number of VoIP subscribers, Infonetics Research, based in the
United States, estimates that there were some 80 million by the end of 2008. Point Topic,
of the United Kingdom, suggests there were 92.2 million in the first quarter of 2009,
while IDATE, of France, projected 175 million VoIP subscribers by 2009, equivalent to
10 per cent of total mainline subscribers, and more than 200 million by 2012 (see Figure
1).

According to Point Topic, Western Europe accounted for the largest tranche (38 per cent)
of all VoIP subscribers in March 2009 (see Figure 2). But this share is declining as VoIP
gains popularity elsewhere. North America and the Asia-Pacific region are the next
largest markets. South-East Asia, Latin America and Eastern Europe all have relatively
small market shares, but these are growing fast. TeleGeography Research, of the United
States, projected that international VoIP traffic reached 94.8 billion minutes in 2008,
accounting for around one quarter of the world’s international traffic in that year (see
Figure 3).

Meanwhile, the popularity of VoIP as a business also continues to grow. AMI Research,
of the United States, projects that global revenues from IP private branch exchanges (IP
PBX), VoIP gateways, soft switches, VoIP application services, IP phones and adapters
will reach USD 9.7 billion in 2010.

Figure 1: Estimated number of VoIP subscribers worldwide, 2005–2011 (total and as a
proportion of mainlines). Source: IDATE

Figure 2: Distribution of VoIP subscribers worldwide (March 2009); regional distribution
of VoIP subscribers, first quarter of 2009. Source: IDATE

A core element of business

VoIP is changing the telecommunication industry by opening up new markets and
bringing in different players. Broadband, cable modem and wireless providers are now
competing directly with each other. And VoIP boosts service-based competition by
enabling operators to participate without wholesale access to infrastructure.

The perception of VoIP is of new market entrants competing with traditional
telecommunication providers. However, the reality is that most incumbents now use
wholesale VoIP to carry international traffic over backbone networks. Wik Consult, a
research firm based in Germany, has observed that “large and small operators,
incumbents and competitors, are converting their networks to next-generation networks
(NGN) and are betting their businesses on a successful migration to VoIP”.

VoIP is now central to the business strategy of many service providers in both developed
and developing countries. Incumbents in Bangladesh, Fiji, Ghana, Tunisia and Sudan, for
instance, all use VoIP for the transmission of their international traffic.

Potentially, the costs of carrying telecommunication traffic can be slashed. The cost of
transmitting calls over IP could be as little as a quarter of that for sending calls through
the public switched telephone network (PSTN), and maintenance expenses might be cut
by 50 to 60 per cent because VoIP calls use only 10 per cent of the bandwidth required
for a PSTN call.

There are other forces behind the move to VoIP, too. Some operators point to the high
costs of maintaining legacy infrastructure and the need to upgrade to intelligent networks
based on the latest technologies. Other operators are trying to respond to competitors
(domestic and foreign) and position themselves in a truly global communications
industry. As operators integrate voice and data networks, IP-based networks may be seen
as the foundation for business applications. And consumer VoIP runs over a range of
devices, offering flexibility in the first step towards seamless communications. On the
other hand, incumbents may be reluctant to introduce VoIP because they already offer
voice services over PSTN and do not wish to cannibalize their higher-margin
international service offerings.

For some operators, IP-based transmission is the first incarnation of a next-generation
network. It could be that cable television firms are at an advantage compared with PSTN
operators in this field, because it is easier to adapt cable networks for VoIP (which is
transmitted in a similar way to video) than it is for fixed-line operators to add high-speed
data, video and Internet services.

Figure 3: Growth in international VoIP and time division multiplexing (TDM) traffic.
Source: IDATE
Note: VoIP traffic includes all cross-border voice calls over IP networks, but
terminated on PSTN. Computer-to-computer and private network traffic are excluded.
Figures for 2008 are projections.

Figure 4: Worldwide regulation of VoIP (2004–2009). Source: IDATE
Note: “Closed” means countries where wholesale VoIP is permitted, but retail VoIP is
banned, as well as those countries where only the incumbent is licensed to provide VoIP.

Regulatory challenges

VoIP service providers, such as Vonage, Fastweb or Skype, often have quite different
business models and service portfolios. Defining VoIP is one basic step every country
can take in determining the regulatory environment it wishes to see. And if VoIP is to
spread, it needs broadband networks, deployed within the “level playing field” of a
technologically neutral and competitive environment.

Most countries view broadband Internet access as the future of modern communications.
By 2008, according to ITU data, broadband Internet services were commercially
available in 182 countries. Other regulatory measures that encourage the growth of VoIP
include ensuring number portability between PSTN and VoIP users, and rules to prevent
the blocking of VoIP traffic.

By 2004, VoIP had been explicitly legalized in 46 countries (see Figure 4), mainly in
Europe, North America and Asia. In another 57 countries, VoIP was also broadly
permitted, while 80 countries prohibited VoIP services, mainly in Africa and some Arab
States. In contrast, today 92 countries have explicitly legalized VoIP and it is tolerated in
just over two-thirds of the world’s nations, while the number of countries banning VoIP
has fallen to 49, or around a quarter of all countries for which data exist.

This growth raises a host of issues for regulatory frameworks designed mainly for the
PSTN world. The main questions are whether VoIP should be regulated as an alternative
to PSTN telephony, and whether the regulation of VoIP services should differ when they
come from PSTN incumbents or from VoIP operators (including Internet service
providers).

Many developing countries still retain outdated telecommunication legislation from an
era long before VoIP. Legacy obligations that worked well for the PSTN (and
more recently, updated regulations for mobile networks) can coexist with growth in
VoIP, but it is difficult to apply them directly to VoIP services. For example, access to
emergency service numbers is more difficult to achieve with VoIP, and some providers
argue that requiring them to offer such services is, effectively, a barrier to entering the
market.

When the European Commission first examined VoIP regulation in 2004, it advocated a
“light regulatory touch”. European regulators are now moving on to consider geographic
numbering, nomadic services and caller location, as well as interconnection issues and
lawful interception of calls. In the United States, VoIP has gradually become more
regulated, especially in the context of security concerns (whether and how VoIP traffic
can be monitored) and the provision of emergency calls. Regulators in the
Commonwealth of Independent States take various views. For example, Georgia and
Kazakhstan have generally allowed VoIP operators to flourish, while Turkmenistan
applies a strict licensing regime.

The bottom line

Although VoIP can save money, incumbents may also be concerned about its impact on
revenues. In several countries, greater use of VoIP has been widely associated with
declining revenues for international calls, alongside the growth of such options as e-mail
and the international short message service (SMS). For example, Ghana Telecom’s
revenues from international calls dropped from USD 42 million in 1998 to USD 14.4
million in 2002. FINTEL, the sole provider of telecommunication services to and from
Fiji, saw its revenues fall from USD 41.27 million in 2000 to USD 24.91 million in 2004,
as VoIP eroded its international business.

The effect of VoIP on an incumbent’s revenues depends on the structure of its traffic. The
CEO of Etisalat, a telecommunication provider based in the United Arab Emirates,
commented in January 2008 that, overall, the company did not expect a huge net impact
from any future roll-out of VoIP, given the scale of its business in sixteen markets. And
growth in the use of VoIP does not always mean that a country’s incumbent operator will
lose revenue. This is because the opportunities and volumes that the new technology may
open up can compensate for losses, especially if countries actively promote the expansion
of VoIP. For example, in Bahrain, over the two-and-a-half years to July 2008, VoIP
captured 60 per cent of international call minutes and about 40 per cent of revenues,
taking these away from Bahrain Telecommunications Company (Batelco). But the overall
market in Bahrain is growing and there is still money to be made. PSTN incumbents can
also consider enhancing their revenues through offering value-added services, including
IP television.

9. CONCLUSION

Major telecommunications operators recognize that VoIP may become the dominant
mechanism for voice communication in the future. Although VoIP is considerably
complex, it also brings flexibility and service convergence. Design complexity matters
little once any associated performance problems are solved, as service convergence
eventually lowers overall cost. The main reason for this flexibility is the ability to
transport and route voice traffic using the ubiquitous IP transport network.

To meet the ever increasing needs for networking services, network managers constantly
face the situation of expanding the existing enterprise network not only via wireline
extensions but also via the wireless mode. Providing voice communications in addition to
traditional data services over the same network infrastructure is just another emerging
need. It has been reported (4) that wireless networks perform much worse than their
wireline counterparts with respect to VoIP services. From the simulation results presented
in the previous section, we further demonstrate that some of the network traffic
characteristics regarding the HTTP, email and database services are also affected
differentially by the wireline and wireless expansion modes in the presence of VoIP
services. It seems that VoIP services impose extra overhead (albeit more heavily for the
wireless mode than the wireline mode) on the network as the number of nodes (hence
demand) is increased in the network. This result is of significant importance for network
managers who want to expand networks using wireless technology and add Internet
telephony via the computer networks at the same time.

10. BIBLIOGRAPHY
REFERENCES

1. Ali, M. G. and S. Zahir. "Performance Evaluation for Web Applications with Web
Caching in a Distributed Wireless System using Opnet(TM)," Journal of Computer
Information Systems, 46:3, 2006, pp. 57-66.

2. Angerer, C. "IP-enabled communication - The future of Voice," Journal of
the Communications Network, 4:3, 2005, pp. 173-175.

3. Bayrak, T. and M. R. Barbowski. "Critical Infrastructure Network Evaluation," Journal
of Computer Information Systems, 46:3, 2006, pp. 67-86.

4. http://www.prlog.org/10016683-the-economic-impact-of-voip-security-vulnerabilities-and-how-to-secure-voip-for-business.html

5. McGraw-Hill Networking, Carrier Grade Voice over IP, Second Edition.

6. http://www.networksorcery.com/enp/protocol/udp.htm

7. http://en.wikipedia.org/wiki/Voice_over_IP

8. http://www.fcc.gov/voip/

9. http://www.protocols.com/pbook/VoIP.htm

10. http://voip.internet2.edu/

11. http://www.zoesnet.net/VOIP.htm

12. http://www.cse.ohio-state.edu/~jain/refs/ref_voip.htm#gateway
