Networking Basics

Computers running on the Internet communicate with each other using either the Transmission Control
Protocol (TCP) or the User Datagram Protocol (UDP).

When you write Java programs that communicate over the network, you are programming at the
application layer. Typically, you don't need to concern yourself with the TCP and UDP layers. Instead, you
can use the classes in the java.net package. These classes provide system-independent network
communication. However, to decide which Java classes your programs should use, you do need to
understand how TCP and UDP differ.

TCP

When two applications want to communicate with each other reliably, they establish a connection and send
data back and forth over that connection. This is analogous to making a telephone call. If you want to
speak to Aunt Beatrice in Kentucky, a connection is established when you dial her phone number and she
answers. You send data back and forth over the connection by speaking to one another over the phone
lines. Like the phone company, TCP guarantees that data sent from one end of the connection actually
gets to the other end and in the same order it was sent. Otherwise, an error is reported.

TCP provides a point-to-point channel for applications that require reliable communications. The
Hypertext Transfer Protocol (HTTP), File Transfer Protocol (FTP), and Telnet are all examples of
applications that require a reliable communication channel. The order in which the data is sent and
received over the network is critical to the success of these applications. When HTTP is used to read from
a URL, the data must be received in the order in which it was sent. Otherwise, you end up with a jumbled
HTML file, a corrupt zip file, or some other invalid information.

Definition:

TCP (Transmission Control Protocol) is a connection-based protocol that provides a reliable flow of data
between two computers.

UDP

The UDP protocol provides for communication that is not guaranteed between two applications on the
network. UDP is not connection-based like TCP. Rather, it sends independent packets of data,
called datagrams, from one application to another. Sending datagrams is much like sending a letter
through the postal service: The order of delivery is not important and is not guaranteed, and each
message is independent of any other.

Definition:

UDP (User Datagram Protocol) is a protocol that sends independent packets of data, called datagrams,
from one computer to another with no guarantees about arrival. UDP is not connection-based like TCP.

For many applications, the guarantee of reliability is critical to the success of the transfer of information
from one end of the connection to the other. However, other forms of communication don't require such
strict standards. In fact, they may be slowed down by the extra overhead or the reliable connection may
invalidate the service altogether.

Consider, for example, a clock server that sends the current time to its client when requested to do so. If
the client misses a packet, it doesn't really make sense to resend it because the time will be incorrect
when the client receives it on the second try. If the client makes two requests and receives packets from
the server out of order, it doesn't really matter because the client can figure out that the packets are out of
order and make another request. The reliability of TCP is unnecessary in this instance because it causes
performance degradation and may hinder the usefulness of the service.
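
The clock server described above can be sketched with the java.net datagram classes. This is a minimal illustration under stated assumptions, not a production time service: the class name UdpClockSketch, the helpers serveOnce and requestTime, the one-byte request, and the 8-byte millisecond reply are all invented for this example, and both ends run on the loopback address.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.ByteBuffer;

class UdpClockSketch {
    // Hypothetical helper: answers exactly one request with the current time in milliseconds.
    static void serveOnce(DatagramSocket server) throws IOException {
        DatagramPacket request = new DatagramPacket(new byte[1], 1);
        server.receive(request); // blocks until a datagram arrives
        byte[] now = ByteBuffer.allocate(Long.BYTES).putLong(System.currentTimeMillis()).array();
        // The reply goes back to whatever address and port the request came from.
        server.send(new DatagramPacket(now, now.length, request.getAddress(), request.getPort()));
    }

    // Hypothetical helper: sends one request and waits for one reply.
    static long requestTime(InetAddress host, int port, int timeoutMillis) throws IOException {
        try (DatagramSocket client = new DatagramSocket()) {
            client.setSoTimeout(timeoutMillis);
            client.send(new DatagramPacket(new byte[] {0}, 1, host, port));
            DatagramPacket reply = new DatagramPacket(new byte[Long.BYTES], Long.BYTES);
            // A lost datagram is never resent by UDP; receive() just times out.
            client.receive(reply);
            return ByteBuffer.wrap(reply.getData(), 0, reply.getLength()).getLong();
        }
    }

    public static void main(String[] args) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket server = new DatagramSocket(0, loopback)) { // port 0: OS picks a port
            Thread t = new Thread(() -> {
                try {
                    serveOnce(server);
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            t.start();
            long time = requestTime(loopback, server.getLocalPort(), 2000);
            t.join();
            System.out.println("Server time (ms since epoch): " + time);
        }
    }
}
```

If the reply is lost, the client gets a SocketTimeoutException and can simply ask again, which yields a fresher time than a retransmitted old packet would.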

Another example of a service that doesn't need the guarantee of a reliable channel is the ping command.
The purpose of the ping command is to test the communication between two programs over the network.
In fact, ping needs to know about dropped or out-of-order packets to determine how good or bad the
connection is. A reliable channel would invalidate this service altogether.

Note:

Many firewalls and routers have been configured not to allow UDP packets. If you're having trouble
connecting to a service outside your firewall, or if clients are having trouble connecting to your service,
ask your system administrator if UDP is permitted.

Understanding Ports

Generally speaking, a computer has a single physical connection to the network. All data destined for a
particular computer arrives through that connection. However, the data may be intended for different
applications running on the computer. So how does the computer know to which application to forward
the data? Through the use of ports.

Data transmitted over the Internet is accompanied by addressing information that identifies the computer
and the port for which it is destined. The computer is identified by its IP address (32 bits in IPv4), which IP uses to
deliver data to the right computer on the network. Ports are identified by a 16-bit number, which TCP and
UDP use to deliver the data to the right application.

In connection-based communication such as TCP, a server application binds a socket to a specific port
number. This has the effect of registering the server with the system to receive all data destined for that
port. A client can then rendezvous with the server at the server's port.

Definition:

The TCP and UDP protocols use ports to map incoming data to a particular process running on a
computer.

In datagram-based communication such as UDP, the datagram packet contains the port number of its
destination and UDP routes the packet to the appropriate application.

Port numbers range from 0 to 65,535 because ports are represented by 16-bit numbers. The port
numbers ranging from 0 to 1023 are restricted; they are reserved for use by well-known services such as
HTTP and FTP and other system services. These ports are called well-known ports. Your applications
should not attempt to bind to them.
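
The port ranges above can be checked in a few lines. The following is a small sketch: the class PortRangeSketch and its helper methods are names made up for this illustration. It also shows the common way to stay out of the restricted range, which is to pass 0 to ServerSocket so that the system picks a free port.

```java
import java.io.IOException;
import java.net.ServerSocket;

class PortRangeSketch {
    // A 16-bit number can hold the values 0 through 2^16 - 1.
    static final int MAX_PORT = (1 << 16) - 1;

    static boolean isValidPort(int port) {
        return port >= 0 && port <= MAX_PORT;
    }

    static boolean isWellKnownPort(int port) {
        return port >= 0 && port <= 1023;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Highest port number: " + MAX_PORT);
        // Port 0 asks the system to choose any free port for us.
        try (ServerSocket server = new ServerSocket(0)) {
            int port = server.getLocalPort();
            System.out.println("System assigned port " + port
                    + ", well-known: " + isWellKnownPort(port));
        }
    }
}
```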

The term network programming refers to writing programs that execute across multiple devices (computers), in which
the devices are all connected to each other using a network.

The java.net package of the J2SE APIs contains a collection of classes and interfaces that provide the low-level
communication details, allowing you to write programs that focus on solving the problem at hand.

The java.net package provides support for the two common network protocols:

• TCP: TCP stands for Transmission Control Protocol, which allows for reliable communication between two
applications. TCP is typically used over the Internet Protocol, which is referred to as TCP/IP.
• UDP: UDP stands for User Datagram Protocol, a connection-less protocol that allows for packets of data to be
transmitted between applications.

This tutorial gives a good understanding of the following two subjects:

• Socket Programming: This is the most widely used concept in networking, and it is explained in detail below.
• URL Processing: This is covered separately.

Socket Programming:
Sockets provide the communication mechanism between two computers using TCP. A client program creates a
socket on its end of the communication and attempts to connect that socket to a server.

When the connection is made, the server creates a socket object on its end of the communication. The client and
server can now communicate by writing to and reading from the socket.

The java.net.Socket class represents a socket, and the java.net.ServerSocket class provides a mechanism for the
server program to listen for clients and establish connections with them.

The following steps occur when establishing a TCP connection between two computers using sockets:

• The server instantiates a ServerSocket object, denoting which port number communication is to occur on.

• The server invokes the accept() method of the ServerSocket class. This method waits until a client connects to the
server on the given port.

• After the server is waiting, a client instantiates a Socket object, specifying the server name and port number to
connect to.

• The constructor of the Socket class attempts to connect the client to the specified server and port number. If
communication is established, the client now has a Socket object capable of communicating with the server.

• On the server side, the accept() method returns a reference to a new socket on the server that is connected to the
client's socket.

After the connections are established, communication can occur using I/O streams. Each socket has both an
OutputStream and an InputStream. The client's OutputStream is connected to the server's InputStream, and the
client's InputStream is connected to the server's OutputStream.

TCP is a two-way communication protocol, so data can be sent across both streams at the same time. The
following classes provide a complete set of methods to implement sockets.
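
The connection steps and the paired streams described above can be tried out in a single process. The following is a minimal sketch, assuming both ends run on the loopback address; the class ConnectionStepsSketch, the method exchange, and the "echo: " reply format are invented for this illustration.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class ConnectionStepsSketch {
    // Hypothetical helper: sends one greeting to an in-process server and returns its reply.
    static String exchange(String greeting) throws IOException, InterruptedException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Step 1: the server creates a ServerSocket (port 0 lets the system choose a free port).
        try (ServerSocket listener = new ServerSocket(0, 1, loopback)) {
            Thread server = new Thread(() -> {
                // Steps 2 and 5: accept() blocks, then returns a Socket connected to the client.
                try (Socket peer = listener.accept();
                     DataInputStream in = new DataInputStream(peer.getInputStream());
                     DataOutputStream out = new DataOutputStream(peer.getOutputStream())) {
                    // What the client writes to its OutputStream arrives on this InputStream.
                    out.writeUTF("echo: " + in.readUTF());
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            server.start();
            // Steps 3 and 4: the Socket constructor connects the client to the server's port.
            try (Socket client = new Socket(loopback, listener.getLocalPort());
                 DataOutputStream out = new DataOutputStream(client.getOutputStream());
                 DataInputStream in = new DataInputStream(client.getInputStream())) {
                out.writeUTF(greeting);
                String reply = in.readUTF();
                server.join();
                return reply;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(exchange("hello")); // prints "echo: hello"
    }
}
```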

ServerSocket Class Methods:


The java.net.ServerSocket class is used by server applications to obtain a port and listen for client requests.

The ServerSocket class has four constructors:

SN  Constructor and Description

1   public ServerSocket(int port) throws IOException
    Attempts to create a server socket bound to the specified port. An exception occurs if the port is
    already bound by another application.

2   public ServerSocket(int port, int backlog) throws IOException
    Similar to the previous constructor; the backlog parameter specifies how many incoming clients to
    store in a wait queue.

3   public ServerSocket(int port, int backlog, InetAddress address) throws IOException
    Similar to the previous constructor; the InetAddress parameter specifies the local IP address to
    bind to. The InetAddress is used for servers that may have multiple IP addresses, allowing the
    server to specify which of its IP addresses to accept client requests on.

4   public ServerSocket() throws IOException
    Creates an unbound server socket. When using this constructor, use the bind() method when you
    are ready to bind the server socket.

If the ServerSocket constructor does not throw an exception, it means that your application has successfully bound to
the specified port and is ready for client requests.
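
As a short sketch of the constructors above, the following example creates an unbound ServerSocket, binds it later, and then shows that a second attempt to bind the same address and port fails. The class ServerSocketBindSketch and the helper isPortInUse are names made up for this illustration, and the backlog of 50 is an arbitrary choice.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

class ServerSocketBindSketch {
    // Hypothetical helper: true if a new server socket cannot be bound to addr:port.
    static boolean isPortInUse(InetAddress addr, int port) {
        try (ServerSocket probe = new ServerSocket(port, 50, addr)) {
            return false; // the bind succeeded, so nothing else was listening there
        } catch (IOException e) {
            return true;  // typically a BindException: the address is already in use
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket()) { // the no-argument constructor
            System.out.println("Bound after construction? " + server.isBound());
            server.bind(new InetSocketAddress(loopback, 0), 50); // port 0: any free port
            int port = server.getLocalPort();
            System.out.println("Bound to port " + port + "? " + server.isBound());
            System.out.println("Second bind to the same port fails? "
                    + isPortInUse(loopback, port));
        }
    }
}
```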

Here are some of the common methods of the ServerSocket class:

SN  Method and Description

1   public int getLocalPort()
    Returns the port that the server socket is listening on. This method is useful if you passed in 0 as
    the port number in a constructor and let the server find a port for you.

2   public Socket accept() throws IOException
    Waits for an incoming client. This method blocks until either a client connects to the server on the
    specified port or the socket times out, assuming that the time-out value has been set using the
    setSoTimeout() method. Otherwise, this method blocks indefinitely.

3   public void setSoTimeout(int timeout) throws SocketException
    Sets the time-out value, in milliseconds, for how long the server socket waits for a client during accept().

4   public void bind(SocketAddress host, int backlog) throws IOException
    Binds the socket to the specified server and port in the SocketAddress object. Use this method if
    you instantiated the ServerSocket using the no-argument constructor.

When the ServerSocket invokes accept(), the method does not return until a client connects. After a client does
connect, the ServerSocket creates a new Socket and returns a reference to it. The new Socket uses the same local
port as the ServerSocket; the connection is distinguished by the client's address and port. A TCP connection now
exists between the client and server, and communication can begin.
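
The behaviour of accept() and setSoTimeout() can be observed with a short experiment on the loopback address. This is only a sketch; the class AcceptSketch, the helper acceptTimesOut, and the 200 ms time-out are choices made for this illustration. Note that the accepted Socket reports the listener's own local port, while its remote port is the client's local port.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class AcceptSketch {
    // Hypothetical helper: true if accept() gave up because no client connected in time.
    static boolean acceptTimesOut(ServerSocket listener, int timeoutMillis) throws IOException {
        listener.setSoTimeout(timeoutMillis);
        try (Socket unexpected = listener.accept()) {
            return false; // a client connected before the time-out expired
        } catch (SocketTimeoutException e) {
            return true;  // the listener itself stays open and usable
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback)) {
            System.out.println("Timed out with no client: " + acceptTimesOut(listener, 200));
            // The pending connection waits in the backlog queue until accept() is called.
            try (Socket client = new Socket(loopback, listener.getLocalPort());
                 Socket accepted = listener.accept()) {
                System.out.println("Listener port:        " + listener.getLocalPort());
                System.out.println("Accepted local port:  " + accepted.getLocalPort());
                System.out.println("Accepted remote port: " + accepted.getPort());
                System.out.println("Client local port:    " + client.getLocalPort());
            }
        }
    }
}
```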

Socket Class Methods:


The java.net.Socket class represents the socket that both the client and server use to communicate with each other.
The client obtains a Socket object by instantiating one, whereas the server obtains a Socket object from the return
value of the accept() method.

A client can use the following five constructors of the Socket class to connect to a server:

SN  Constructor and Description

1   public Socket(String host, int port) throws UnknownHostException, IOException
    Attempts to connect to the specified server at the specified port. If this constructor
    does not throw an exception, the connection is successful and the client is connected to the
    server.

2   public Socket(InetAddress host, int port) throws IOException
    Identical to the previous constructor, except that the host is denoted by an
    InetAddress object.

3   public Socket(String host, int port, InetAddress localAddress, int localPort) throws IOException
    Connects to the specified host and port, creating a socket on the local host at the specified
    address and port.

4   public Socket(InetAddress host, int port, InetAddress localAddress, int localPort) throws IOException
    Identical to the previous constructor, except that the host is denoted by an
    InetAddress object instead of a String.

5   public Socket()
    Creates an unconnected socket. Use the connect() method to connect this socket to a server.

Except for the no-argument form, a Socket constructor does not simply instantiate a Socket object; it actually
attempts to connect to the specified server and port before it returns.

Some methods of interest in the Socket class are listed here. Notice that both the client and server have a Socket
object, so these methods can be invoked by both the client and server.

SN  Method and Description

1   public void connect(SocketAddress host, int timeout) throws IOException
    Connects the socket to the specified host. This method is needed only when you
    instantiated the Socket using the no-argument constructor.

2   public InetAddress getInetAddress()
    Returns the address of the other computer that this socket is connected to.

3   public int getPort()
    Returns the port the socket is bound to on the remote machine.

4   public int getLocalPort()
    Returns the port the socket is bound to on the local machine.

5   public SocketAddress getRemoteSocketAddress()
    Returns the address of the remote socket.

6   public InputStream getInputStream() throws IOException
    Returns the input stream of the socket. The input stream is connected to the output stream of the
    remote socket.

7   public OutputStream getOutputStream() throws IOException
    Returns the output stream of the socket. The output stream is connected to the input stream of the
    remote socket.

8   public void close() throws IOException
    Closes the socket, which makes this Socket object no longer capable of connecting again to any
    server.
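
As a brief sketch of the no-argument constructor together with connect(), getInetAddress(), getPort(), getLocalPort() and close(), the example below connects to a listener on the loopback address. The class SocketConnectSketch, the helper describeConnection, and the 1000 ms time-out are assumptions made for this illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

class SocketConnectSketch {
    // Hypothetical helper: connects with a time-out and describes both ends of the connection.
    static String describeConnection(InetAddress host, int port, int timeoutMillis)
            throws IOException {
        try (Socket socket = new Socket()) { // unconnected until connect() is called
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return "local port " + socket.getLocalPort()
                    + " -> " + socket.getInetAddress().getHostAddress()
                    + ":" + socket.getPort();
        } // try-with-resources calls close(); the Socket cannot be reconnected afterwards
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // A listener is needed so that the connection attempt has something to reach.
        try (ServerSocket listener = new ServerSocket(0, 1, loopback)) {
            System.out.println(describeConnection(loopback, listener.getLocalPort(), 1000));
        }
    }
}
```

The time-out only limits how long connect() waits for the connection to be established; it does not limit later reads.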

InetAddress Class Methods:

This class represents an Internet Protocol (IP) address. Here are the useful methods that you will need
while doing socket programming:

SN  Method and Description

1   static InetAddress getByAddress(byte[] addr) throws UnknownHostException
    Returns an InetAddress object given the raw IP address.

2   static InetAddress getByAddress(String host, byte[] addr) throws UnknownHostException
    Creates an InetAddress based on the provided host name and IP address.

3   static InetAddress getByName(String host) throws UnknownHostException
    Determines the IP address of a host, given the host's name.

4   String getHostAddress()
    Returns the IP address string in textual presentation.

5   String getHostName()
    Gets the host name for this IP address.

6   static InetAddress getLocalHost() throws UnknownHostException
    Returns the local host.

7   String toString()
    Converts this IP address to a String.
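
A few of these methods can be tried without any network access. In the sketch below, the class InetAddressSketch, the helper fromOctets, and the host name example.test are invented for illustration. The helper exists because Java bytes are signed, so an octet such as 192 needs a cast.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class InetAddressSketch {
    // Hypothetical helper: builds an IPv4 address from four octets in the range 0 to 255.
    static InetAddress fromOctets(int a, int b, int c, int d) throws UnknownHostException {
        byte[] raw = {(byte) a, (byte) b, (byte) c, (byte) d};
        return InetAddress.getByAddress(raw);
    }

    public static void main(String[] args) throws UnknownHostException {
        InetAddress address = fromOctets(192, 168, 0, 1);
        System.out.println(address.getHostAddress());            // 192.168.0.1
        System.out.println(address.getAddress().length * 8);     // 32 (bits in an IPv4 address)

        // getByAddress(String, byte[]) attaches a name without consulting any name service.
        InetAddress named = InetAddress.getByAddress("example.test", new byte[] {127, 0, 0, 1});
        System.out.println(named.getHostName());                 // example.test
        System.out.println(named);                               // example.test/127.0.0.1
    }
}
```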

Socket Client Example:


The following GreetingClient is a client program that connects to a server by using a socket and sends a greeting,
and then waits for a response.

// File Name GreetingClient.java

import java.net.*;
import java.io.*;

public class GreetingClient
{
   public static void main(String [] args)
   {
      // Command-line arguments: server name, then port number.
      String serverName = args[0];
      int port = Integer.parseInt(args[1]);
      try
      {
         System.out.println("Connecting to " + serverName
                             + " on port " + port);
         // The constructor connects immediately or throws an IOException.
         Socket client = new Socket(serverName, port);
         System.out.println("Just connected to "
                      + client.getRemoteSocketAddress());

         // Send the greeting over the socket's output stream.
         OutputStream outToServer = client.getOutputStream();
         DataOutputStream out =
                       new DataOutputStream(outToServer);
         out.writeUTF("Hello from "
                      + client.getLocalSocketAddress());

         // Block until the server's reply arrives on the input stream.
         InputStream inFromServer = client.getInputStream();
         DataInputStream in =
                        new DataInputStream(inFromServer);
         System.out.println("Server says " + in.readUTF());
         client.close();
      }catch(IOException e)
      {
         e.printStackTrace();
      }
   }
}

Socket Server Example:


The following GreetingServer program is an example of a server application that uses the ServerSocket class to
listen for clients on a port number specified by a command-line argument:

// File Name GreetingServer.java

import java.net.*;
import java.io.*;

public class GreetingServer extends Thread
{
   private ServerSocket serverSocket;

   public GreetingServer(int port) throws IOException
   {
      serverSocket = new ServerSocket(port);
      // accept() gives up after 10 seconds without a client.
      serverSocket.setSoTimeout(10000);
   }

   public void run()
   {
      // Serve one client at a time until a time-out or an I/O error occurs.
      while(true)
      {
         try
         {
            System.out.println("Waiting for client on port " +
            serverSocket.getLocalPort() + "...");
            Socket server = serverSocket.accept();
            System.out.println("Just connected to "
                  + server.getRemoteSocketAddress());
            // Read the client's greeting, then send the reply.
            DataInputStream in =
                  new DataInputStream(server.getInputStream());
            System.out.println(in.readUTF());
            DataOutputStream out =
                 new DataOutputStream(server.getOutputStream());
            out.writeUTF("Thank you for connecting to "
              + server.getLocalSocketAddress() + "\nGoodbye!");
            server.close();
         }catch(SocketTimeoutException s)
         {
            System.out.println("Socket timed out!");
            break;
         }catch(IOException e)
         {
            e.printStackTrace();
            break;
         }
      }
   }

   public static void main(String [] args)
   {
      // Command-line argument: the port number to listen on.
      int port = Integer.parseInt(args[0]);
      try
      {
         Thread t = new GreetingServer(port);
         t.start();
      }catch(IOException e)
      {
         e.printStackTrace();
      }
   }
}

Compile the client and server, and then start the server as follows:

$ java GreetingServer 6066
Waiting for client on port 6066...

Run the client program as follows:

$ java GreetingClient localhost 6066
Connecting to localhost on port 6066
Just connected to localhost/127.0.0.1:6066
Server says Thank you for connecting to /127.0.0.1:6066
Goodbye!

TCP and UDP are two transport layer protocols that are used extensively on the Internet
to transmit data from one host to another. A good knowledge of how TCP and
UDP work is essential for any programmer. That's why the difference between TCP and
UDP is one of the most popular programming interview questions. I have seen this
question many times in various Java interviews, especially for server-side Java
developer positions. Since the FIX (Financial Information Exchange) protocol is also a
TCP-based protocol, several investment banks, hedge funds, and exchange solution
providers look for Java developers with a good knowledge of TCP and UDP. Writing FIX
engines and server-side components for high-speed electronic trading platforms needs
capable developers with a solid understanding of fundamentals, including data
structures, algorithms, and networking. By the way, the use of TCP and UDP is not limited
to one area; they are at the heart of the Internet. HTTP, the protocol at the core of the
web, is based on TCP. One more reason why Java developers should understand these two
protocols in detail is that Java is used extensively to write multi-threaded, concurrent,
and scalable servers. Java also provides a rich socket programming API for both TCP- and
UDP-based communication. In this article, we will learn the key differences between the
TCP and UDP protocols, which is useful to every Java programmer. To start with, TCP
stands for Transmission Control Protocol and UDP stands for User Datagram Protocol,
and both are used extensively to build Internet applications.

Difference between TCP vs UDP Protocol

I love to compare two things on different points; this not only makes them easy to
compare but also makes it easy to remember the differences. When we compare TCP to
UDP, we learn how both TCP and UDP work, which one provides reliable and
guaranteed delivery and which doesn't, which protocol is faster and why,
and most importantly when to choose TCP over UDP while building your own
distributed application. In this article we will see the difference between UDP and TCP in
10 points: connection set-up, reliability, ordering, data boundaries, speed, overhead,
header size, congestion control, usage, and the different protocols based upon TCP and
UDP. By learning these differences, you will not only be able to answer this
interview question better but also understand some important details about two of
the most important protocols of the Internet.

1) Connection-oriented vs connectionless

The first and foremost difference between them is that TCP is a connection-oriented
protocol and UDP is a connectionless protocol. With TCP, a connection is established
between client and server before they can send data. The connection establishment
process is also known as the TCP handshake, in which control messages are exchanged
between client and server. The client, which is the initiator of the TCP connection, sends
a SYN message to the server, which is listening on a TCP port. The server receives it and
sends a SYN-ACK message, which the client receives and answers with an ACK. Once
the server receives this ACK message, the TCP connection is established and ready for
data transmission. UDP, on the other hand, is a connectionless protocol, and no
point-to-point connection is established before sending messages. That's the reason UDP
is more suitable for multicast distribution of messages, that is, one-to-many distribution
of data in a single transmission.

2) Reliability
TCP provides a delivery guarantee, which means a message sent using the TCP protocol
is either delivered to the receiver or reported to the sender as an error. If a message is
lost in transit, it is recovered by resending, which is handled by the TCP protocol itself.
UDP, on the other hand, is unreliable; it doesn't provide any delivery guarantee. A
datagram packet may be lost in transit. That's why UDP is not suitable for programs that
require guaranteed delivery.

3) Ordering
Apart from the delivery guarantee, TCP also guarantees the order of messages. Messages
will be delivered to the receiver in the same order in which the sender sent them, even
though it is possible that they reach the other end of the network out of order. The TCP
protocol will do all the sequencing and ordering for you. UDP doesn't provide any
ordering or sequencing guarantee. Datagram packets may arrive in any order. That's why
TCP is suitable for applications that need delivery in a sequenced manner, though there
are UDP-based protocols as well that provide ordering and reliability by using sequence
numbers and redelivery, e.g. TIBCO Rendezvous, which is actually a UDP-based application.
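
To make the idea of ordering by sequence number concrete, here is a minimal sketch of a receive-side reorder buffer. It is only an illustration of the general technique; it is not how TCP or TIBCO Rendezvous is actually implemented, and the class ReorderBufferSketch, its receive method, and the convention that sequence numbers start at 0 are assumptions of this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class ReorderBufferSketch {
    // Payloads that arrived early, keyed by sequence number.
    private final TreeMap<Integer, String> pending = new TreeMap<>();
    private int nextExpected = 0;

    // Accepts one payload tagged with a sequence number and returns every
    // payload that can now be handed to the application in order.
    List<String> receive(int sequenceNumber, String payload) {
        List<String> deliverable = new ArrayList<>();
        if (sequenceNumber < nextExpected) {
            return deliverable; // duplicate of something already delivered
        }
        pending.putIfAbsent(sequenceNumber, payload);
        // Drain the buffer for as long as the next expected number is present.
        while (pending.containsKey(nextExpected)) {
            deliverable.add(pending.remove(nextExpected));
            nextExpected++;
        }
        return deliverable;
    }

    public static void main(String[] args) {
        ReorderBufferSketch buffer = new ReorderBufferSketch();
        System.out.println(buffer.receive(1, "b")); // [] because 0 has not arrived yet
        System.out.println(buffer.receive(0, "a")); // [a, b]
        System.out.println(buffer.receive(2, "c")); // [c]
    }
}
```

A real protocol would also need acknowledgements and retransmission to recover a sequence number that never arrives; this sketch covers ordering only.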

4) Data Boundary
TCP does not preserve data boundaries; UDP does. In the Transmission Control Protocol,
data is sent as a byte stream, and no distinguishing indications are transmitted to
signal message (segment) boundaries. With UDP, packets are sent individually and are
checked for integrity only if they arrive. Packets have definite boundaries which are
honoured upon receipt, meaning a read operation at the receiver socket will yield an
entire message as it was originally sent. TCP also delivers every byte, in order, but a
single read may return part of a message or several messages at once, so the receiving
application has to find the message boundaries itself. Messages are stored in TCP buffers
before sending to make optimum use of network bandwidth.
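
Because a TCP byte stream carries no message boundaries, applications usually add their own framing. The sketch below shows one common approach, a 4-byte length prefix in front of each message. The class FramingSketch and its two helpers are invented for this illustration, and a ByteArrayOutputStream stands in for the socket's stream so that the example runs without a network.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class FramingSketch {
    // Writes the message length first, then the message bytes.
    static void writeMessage(DataOutputStream out, String message) throws IOException {
        byte[] body = message.getBytes(StandardCharsets.UTF_8);
        out.writeInt(body.length);
        out.write(body);
    }

    // Reads exactly one message, however the bytes were split up in transit.
    static String readMessage(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] body = new byte[length];
        in.readFully(body); // keeps reading until all 'length' bytes have arrived
        return new String(body, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a socket's output stream: both messages end up in one byte sequence.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        writeMessage(out, "first");
        writeMessage(out, "second message");

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(readMessage(in)); // first
        System.out.println(readMessage(in)); // second message
    }
}
```

With a real connection, the same helpers would wrap the streams returned by getOutputStream() and getInputStream() of a Socket.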

5) Speed
In short, TCP is slower and UDP is faster. Since TCP has to create a connection and
ensure guaranteed and ordered delivery, it does a lot more than UDP. This costs TCP in
terms of speed. That's why UDP is more suitable where speed is a concern, for
example online video streaming, telecasts, or online multiplayer games.

6) Heavyweight vs lightweight

Because of the overhead mentioned above, the Transmission Control Protocol is
considered heavyweight compared to the lightweight UDP protocol. UDP's simple mantra
of delivering messages without bearing the overhead of creating a connection or
guaranteeing delivery or order keeps it lightweight. This is also reflected
in their header sizes, which carry the metadata.

7) Header size
TCP has a bigger header than UDP. The usual header size of a TCP packet is 20 bytes
(more when options are present), which is more than double the 8-byte header of a UDP
datagram packet. The TCP header contains Source port, Destination port, Sequence
Number, Ack number, Data offset, Reserved, Control bits, Window, Checksum, Urgent
Pointer, Options, and Padding, while the UDP header contains only Source port,
Destination port, Length, and Checksum.

[Figure: TCP Header Format]

[Figure: UDP Header Format]
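
To see why the UDP header is only 8 bytes, the four fields can be laid out by hand: each is 16 bits wide, in the order source port, destination port, length, and checksum. The sketch below is for illustration only; the class UdpHeaderSketch and its methods are invented names, the checksum is simply left at 0 rather than computed, and real applications never build this header themselves because the operating system does it.

```java
import java.nio.ByteBuffer;

class UdpHeaderSketch {
    static final int UDP_HEADER_BYTES = 8; // four 16-bit fields

    // Lays out the header fields in network (big-endian) byte order.
    static byte[] encode(int sourcePort, int destinationPort, int payloadLength) {
        return ByteBuffer.allocate(UDP_HEADER_BYTES)
                .putShort((short) sourcePort)
                .putShort((short) destinationPort)
                .putShort((short) (UDP_HEADER_BYTES + payloadLength)) // length covers header + data
                .putShort((short) 0) // checksum not computed in this sketch
                .array();
    }

    // Returns {source port, destination port, length, checksum} as unsigned values.
    static int[] decode(byte[] header) {
        ByteBuffer buffer = ByteBuffer.wrap(header);
        return new int[] {
            Short.toUnsignedInt(buffer.getShort()),
            Short.toUnsignedInt(buffer.getShort()),
            Short.toUnsignedInt(buffer.getShort()),
            Short.toUnsignedInt(buffer.getShort())
        };
    }

    public static void main(String[] args) {
        byte[] header = encode(6066, 53, 12);
        int[] fields = decode(header);
        System.out.println("Header bytes: " + header.length);   // 8
        System.out.println("Source port: " + fields[0]);        // 6066
        System.out.println("Destination port: " + fields[1]);   // 53
        System.out.println("Length: " + fields[2]);             // 20
    }
}
```

Because ports are stored in 16 bits, a value such as 65535 does not fit in a signed Java short; Short.toUnsignedInt recovers it when decoding.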

8) Congestion or Flow control

TCP does flow control. TCP requires three packets to set up a socket connection
before any user data can be sent, and it handles reliability and congestion control.
UDP, on the other hand, does not have an option for flow control.

9) Usage and application
Where are TCP and UDP used on the Internet? After knowing the key differences between
TCP and UDP, we can easily conclude which situations suit them. Since TCP provides
delivery and sequencing guarantees, it is best suited for applications that require high
reliability and for which transmission time is relatively less critical. UDP is more
suitable for applications that need fast, efficient transmission, such as games. UDP's
stateless nature is also useful for servers that answer small queries from huge numbers
of clients. In practice, TCP is used in the finance domain, e.g. the FIX protocol is a
TCP-based protocol, while UDP is used heavily in gaming and entertainment sites.

10) TCP and UDP based Protocols

The best examples of higher-level protocols based on TCP are HTTP and HTTPS, which
are everywhere on the Internet. In fact, most of the common protocols you are familiar
with, e.g. Telnet, FTP, and SMTP, are all based on the Transmission Control Protocol. UDP
doesn't have anything as popular as HTTP, but UDP is used in protocols like DHCP (Dynamic
Host Configuration Protocol) and DNS (Domain Name System). Some of
the other protocols based on the User Datagram Protocol are the Simple Network
Management Protocol (SNMP), TFTP, BOOTP, and NFS (early versions).
