
Get IT Done: Dissecting and diagnosing TCP/IP routing problems


Takeaway: Find out how TCP/IP sends information across your network

Most of us take for granted the complexities of the Internet and even our own intranets when
we start up a Web browser and browse the Web. However, in order for the packets to flow
from your computer to the server, there are a variety of mechanisms being used by the local
computer and its nearest neighbor routers that you should know about. By understanding the
process by which a computer discovers routes, you can make better decisions about how to
architect your network and how to troubleshoot any routing problems that may arise.

TCP/IP basics
Almost everyone who has been exposed to TCP/IP knows that there are three pieces of
information that are mandatory in a networked TCP/IP environment: IP address, subnet mask,
and default gateway. The function of the IP address is clear; it is a unique address that refers
to the machine just like a street address refers to a house.

There is, however, a lot more confusion about how the subnet mask and the default gateway
are used. The subnet mask, put simply, determines whether the destination host for a packet is
local or not. The subnet mask is logically ANDed with the IP address of the local machine
and, separately, with the destination IP address. If the two results are the same, then the
destination is local. If not, it is remote. A logical AND compares two values bit by bit and
returns a one only where both bits are ones; everywhere else it returns a zero. Because the
mask has ones only in the network portion, ANDing an IP address with a subnet mask leaves
only the network portion of the address, and so you can determine whether the host is local.
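
The comparison described above can be sketched in a few lines of Python. This is only an illustration: the function name same_network is made up for this example, and the addresses are borrowed from the sample output later in this article.

```python
import ipaddress

def same_network(local_ip: str, dest_ip: str, mask: str) -> bool:
    """Return True when both addresses share the same network portion."""
    local = int(ipaddress.IPv4Address(local_ip))
    dest = int(ipaddress.IPv4Address(dest_ip))
    netmask = int(ipaddress.IPv4Address(mask))
    # Bitwise AND keeps only the bits where the mask has a one.
    return (local & netmask) == (dest & netmask)

print(same_network("10.254.1.16", "10.254.1.254", "255.255.255.0"))   # True
print(same_network("10.254.1.16", "216.37.52.229", "255.255.255.0"))  # False
```

A True result means the destination would be resolved with ARP; False means the packet is handed to the routing table.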

If the address is local, TCP/IP uses the address resolution protocol (ARP) to determine the
physical address or media access control (MAC) address. Ultimately, communication on a
physical network is done by identifying the hardware address for which the packet is
intended. It is for this reason that TCP/IP must broadcast to determine the physical address
for an IP address. Typically, this represents only a small percentage of the number of packets
on the network, because once the address is discovered, it is cached by the local machine.

If, on the other hand, the address is not local, the computer uses a local routing table to
determine where it should send the packet. The default gateway is simply a special default
entry in the routing table that is used whenever the table holds no more specific entry for
the destination.

Routing table basics


Every computer builds its own routing table when it boots up. The table is used to determine
how to send a packet from its source to its destination. Above, when I mentioned that the
subnet mask is logically ANDed to determine whether the address is local or not, I was
referring to a small part of the process in which the computer consults the routing table to
determine what to do with each packet.

Each routing table contains the entries needed to hand a packet destined for the local
network to ARP so that the IP address can be resolved. The same routing table pushes
a packet destined for a remote network toward a router connected to the local subnet.

In the simplest form, the routing table contains entries for:

• Every local adapter
• The networks attached to every local adapter
• Default gateways
• A local loop back address
• A multicast address

In more complicated environments, the routing table would also contain entries for the
networks that have routers connected to the local network.

The local adapter entries point packets that are destined for a network that is locally attached
to the computer. The loop back address entry sends packets back to an internal interface in
the computer for processing. The multicast address, although rarely used, routes packets in
such a way that they can be sent to multiple destinations simultaneously.

The routing table is searched using three criteria. First, the length of the subnet mask is
considered. The more specific the entry in the routing table, the more likely that it will be
used. This is necessary to allow you to have routes to specific locations and default
destinations for traffic that has no specific routes. Routes to specific destinations have long
subnet masks associated with them. Traffic without a specific route uses the default gateway
entry, which has a subnet mask of zero length (0.0.0.0).
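
As a rough sketch of this first criterion, the Python below picks the matching entry with the longest subnet mask. The ROUTES table and the pick_route name are hypothetical, chosen to mirror the addresses used in the redirect example later in this article, and real operating systems store and search their tables differently.

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "10.55.1.1"),      # default gateway, zero-length mask
    (ipaddress.ip_network("10.55.1.0/24"), "on-link"),     # locally attached network
    (ipaddress.ip_network("10.254.1.0/24"), "10.55.1.2"),  # specific route
]

def pick_route(dest: str):
    """Return the (network, next hop) entry with the longest matching mask."""
    addr = ipaddress.ip_address(dest)
    candidates = [(net, hop) for net, hop in ROUTES if addr in net]
    # The default route matches everything, so candidates is never empty here.
    return max(candidates, key=lambda entry: entry[0].prefixlen)

print(pick_route("10.254.1.13")[1])    # 10.55.1.2
print(pick_route("216.37.52.229")[1])  # 10.55.1.1
```

The default route matches every destination but always loses to any more specific entry, which is exactly why it only catches traffic that has no other route.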

The second criterion is the metric associated with the route. This helps determine the cost of
the route. It is used to provide standby routes in the event of a primary route loss. In other
words, it is used primarily to trigger dial-up backup routes when the main line is cut. In most
networks, metrics are not used on PCs. They exist only in the routing tables of the core
routers.
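
To illustrate how a metric can provide a standby route, here is a minimal Python sketch. The Route class, the up flag, and both gateway addresses are assumptions made for the example; they do not correspond to any particular router's implementation.

```python
from dataclasses import dataclass

@dataclass
class Route:
    gateway: str
    metric: int
    up: bool = True  # hypothetical flag: is the link behind this route working?

def best_route(routes):
    """Among usable routes to the same destination, prefer the lowest metric."""
    usable = [r for r in routes if r.up]
    return min(usable, key=lambda r: r.metric) if usable else None

primary = Route("10.55.1.1", metric=1)   # main line
backup = Route("10.55.1.9", metric=10)   # dial-up backup
routes = [primary, backup]

print(best_route(routes).gateway)  # 10.55.1.1
primary.up = False                 # the main line is cut
print(best_route(routes).gateway)  # 10.55.1.9
```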

The final criterion on a Windows computer is a random order in which entries of equal subnet
mask length and metric are tried. One entry starts at the top of this list and is not bumped
from its spot until Windows tries to send a packet to that gateway and fails. From that point,
the next entry is used until it, too, cannot be reached. This randomness only applies when
there are two routes with equal priority. This rarely occurs, unless there are two default
gateways. This might occur if you have a local area network and you also dial up to the
network. Your local area network has a default gateway, as does the dial-up connection.

The building of a routing table


Routing tables are built through local interfaces, static routes, routing protocols, and router
discovery messages. The local interfaces are automatically added when they are activated.
Static routes are those routes that have been added to the routing table manually. They are
added by using the ROUTE command.

Routing protocols let routers communicate with one another and learn a complete set of
routes. They are typically not used on computers. However, several versions of Windows
servers offer some of the basic routing protocols. These protocols are not installed by
default, but they can be added, and they can automatically modify the routing table.

The final way that routing tables are updated is by Internet Control Message Protocol (ICMP)
redirect messages. A router sends this message back when it receives a packet but knows that
it is not the best route to the final destination. These messages cause the computer to add
the new router and the route to its routing table. They are the reason why a network can have
two different routers connected to the same network, leading to different places, with only
one default gateway configured on each computer.

The making of a redirect message


Redirect messages are sent back to a computer when a router detects that it is receiving a
packet from an interface where the best route would send it back out that same interface. Let
us say a router has a local interface with an address of 10.55.1.1 (255.255.255.0), and it has a
route in its routing table that sends 10.254.1.0 (255.255.255.0) to 10.55.1.2 for further
routing. When it receives a packet from a computer on 10.55.1.3 destined for 10.254.1.13, it
responds by indicating that 10.55.1.2 is the best route to the destination. The computer adds
an entry to its routing table indicating that it should use the router on 10.55.1.2 to reach the
host.
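
The router's decision in this example can be sketched roughly as follows. The code uses the addresses from the paragraph above; the names LOCAL_IFACE, ROUTES, and redirect_for are invented for the illustration, and a real router applies more conditions than this before sending a redirect.

```python
import ipaddress

# The router's own interface and the one route from the example above.
LOCAL_IFACE = ipaddress.ip_interface("10.55.1.1/24")
ROUTES = {ipaddress.ip_network("10.254.1.0/24"): ipaddress.ip_address("10.55.1.2")}

def redirect_for(src: str, dest: str):
    """Return the better gateway to advertise, or None when no redirect applies."""
    src_addr = ipaddress.ip_address(src)
    dest_addr = ipaddress.ip_address(dest)
    for network, next_hop in ROUTES.items():
        if dest_addr in network:
            # The packet arrived from the local network and would go straight
            # back out to a next hop on that same network.
            if src_addr in LOCAL_IFACE.network and next_hop in LOCAL_IFACE.network:
                return str(next_hop)
    return None

print(redirect_for("10.55.1.3", "10.254.1.13"))  # 10.55.1.2
```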

In effect, these ICMP redirect messages allow the client computers to be configured with
only a single default router, when, in fact, there are several routers on the local network that
the computer may have to communicate with in order to reach both internal and external
hosts.

Route and repeat


One of the fundamentals of IP routing is that each device gets the packet closer to the
destination. Each router knows a small amount about the IP addresses that are in use on the
Internet. These routers route the packet to the best of their ability. The hope is that the
destination is closer after the route than before. The process is repeated as the packet is
transmitted from router to router until it reaches its destination.

However, this isn’t always the case. It is possible for routers to route a packet back and forth
between two neighbors. This case, called a routing loop, causes the packet to be bounced
back and forth until a special field in the packet, called time to live (TTL), reaches zero.

TTL is decremented by each router before it routes the packet on. When the time to live
reaches zero, a response is sent to the originating computer indicating that the time to live has
expired. This is the message that PING will show you when a routing loop exists. This
technique is used to prevent packets from routing back and forth forever.
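
A toy simulation makes the role of TTL easier to see. In this sketch each router is reduced to a single next-hop entry, and the names forward and next_hop are hypothetical. The starting TTL of 8 is chosen only to keep the example short.

```python
def forward(next_hop: dict, start: str, dest: str, ttl: int = 8):
    """Follow next-hop entries until dest is reached or the TTL runs out."""
    router, path = start, [start]
    while router != dest:
        ttl -= 1  # each router decrements the TTL before passing the packet on
        if ttl == 0:
            return path, "TTL expired"
        router = next_hop[router]
        path.append(router)
    return path, "delivered"

healthy = {"A": "B", "B": "C"}
looping = {"A": "B", "B": "A"}  # A and B point at each other: a routing loop

print(forward(healthy, "A", "C"))     # (['A', 'B', 'C'], 'delivered')
print(forward(looping, "A", "C")[1])  # TTL expired
```

Without the TTL check, the second call would never return, which is the problem the field exists to prevent.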

Address resolution protocol


Thus far, I’ve been talking about how packets are routed from one router to another until they
reach their final destination. However, you should have a basic understanding of how packets
are transmitted on the local network before I explore how to troubleshoot problems.

As I mentioned above, ARP is responsible for associating IP addresses with hardware (MAC)
addresses. A transmission on a local network can be directed to a single machine or to all
machines on the network, and every transmission uses a hardware address to identify its
destination. If a packet is transmitted with all bits in the destination hardware address set,
every machine on the network receives a copy and processes it. This is a broadcast.

The hardware address is technically named a MAC address because the address operates at
the media access control layer of the protocol stack. IP addresses live in the network layer of
the OSI network protocol model. MAC addresses in an Ethernet environment are six bytes
(48 bits) long. They are unique because each vendor is assigned a prefix that is three
bytes (24 bits) long. Each vendor is then responsible for keeping the hardware IDs within
that prefix unique.
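
The two halves of a MAC address can be pulled apart with a few lines of Python. The helper name split_mac is made up for this sketch, and the sample address is the default gateway's hardware address from the ARP table shown later in this article.

```python
def split_mac(mac: str):
    """Split a six-byte MAC address into its vendor prefix and device portion."""
    octets = mac.replace(":", "-").lower().split("-")
    if len(octets) != 6:
        raise ValueError("expected six bytes")
    # First three bytes: vendor prefix. Last three: assigned by that vendor.
    return "-".join(octets[:3]), "-".join(octets[3:])

vendor, device = split_mac("00-10-5a-07-84-23")
print(vendor)  # 00-10-5a
print(device)  # 07-84-23
```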

The process of resolving a hardware address from an IP address isn’t complicated, but it does
involve a broadcast packet. The first step is that the computer looks in the routing table and
determines that the address is a local address. From there, it transmits a broadcast packet
from the appropriate interface. The packet contains the hardware address of the current
system and the IP address that is being sought. The system that has the IP address in question
responds to the packet by sending a packet back to the originating computer.

Only the first solicitation packet is broadcast and then all of the remaining packets are sent
directly between the two computers that are communicating. This is important because
switches are a common part of network infrastructure today. They forward packets to
computers only if they need to see them. This is in contrast to a hub, which sends all packets
to all ports. Because switches send only the necessary packets to each port, they can improve
performance on a network by allowing the traffic to exceed the bandwidth of any one port.
Switches must transmit broadcast packets to every port. When there are a large number of
broadcast packets on a network, the value of network switches is reduced.

Once ARP has looked up an IP address, it is added to its local ARP table. The ARP table is
simply a list of IP addresses and their associated hardware addresses. ARP tables are created
primarily through the discovery process discussed above, but can also have static entries
added.

One odd thing about ARP is that it is used even when the packet’s final destination isn’t
local. This is because the hardware address of the default gateway must be located. So even if
none of the packets are local, ARP will have to be used at least once.
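
The lookup logic described in this section can be sketched as follows. Everything here is a simplification: broadcast_arp_request is a stand-in that answers from a hard-coded table rather than touching the network, and the addresses are taken from the sample ARP table in the next section.

```python
import ipaddress

LOCAL = ipaddress.ip_interface("10.254.1.16/24")
DEFAULT_GATEWAY = ipaddress.ip_address("10.254.1.254")
arp_cache = {}  # IP address -> hardware address

def broadcast_arp_request(ip):
    """Stand-in for the real broadcast; answers from a hypothetical host table."""
    hosts = {"10.254.1.247": "00-01-03-d0-b4-8f", "10.254.1.254": "00-10-5a-07-84-23"}
    return hosts[str(ip)]

def hardware_address_for(dest: str) -> str:
    addr = ipaddress.ip_address(dest)
    # Remote destinations are handed to the default gateway, so it is the
    # gateway's hardware address that has to be resolved.
    target = addr if addr in LOCAL.network else DEFAULT_GATEWAY
    if target not in arp_cache:
        arp_cache[target] = broadcast_arp_request(target)  # only the first lookup broadcasts
    return arp_cache[target]

print(hardware_address_for("10.254.1.247"))   # 00-01-03-d0-b4-8f
print(hardware_address_for("216.37.52.229"))  # 00-10-5a-07-84-23
```

Note that the remote address never appears in the cache; only the gateway's entry does.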

Seeing your ARP table


If you want to see what’s happening behind the scenes, you can look at your ARP table by
typing:
ARP -a

at the command line. You’ll see a response similar to:


Interface: 10.254.1.16 on Interface 0x1000004
  Internet Address      Physical Address      Type
  10.254.1.247          00-01-03-d0-b4-8f     dynamic
  10.254.1.254          00-10-5a-07-84-23     dynamic

This shows the machines on the local network that ARP has found and, thus, the hardware
addresses that have been resolved. In this case, 10.254.1.247 is a domain controller and
10.254.1.254 is the default gateway on the local network. As you can see, even the default
gateway’s address gets resolved.
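
If you want to work with this output in a script, a small parser along these lines may be enough. It assumes the Windows-style layout shown above (dotted IP address, dash-separated hardware address, then a type); other platforms print ARP tables differently, and the names SAMPLE, ENTRY, and parse_arp_table are invented for the sketch.

```python
import re

SAMPLE = """\
Interface: 10.254.1.16 on Interface 0x1000004
  Internet Address      Physical Address      Type
  10.254.1.247          00-01-03-d0-b4-8f     dynamic
  10.254.1.254          00-10-5a-07-84-23     dynamic
"""

# One table row: IP address, six dash-separated hex bytes, entry type.
ENTRY = re.compile(
    r"^\s*(\d+\.\d+\.\d+\.\d+)\s+((?:[0-9a-f]{2}-){5}[0-9a-f]{2})\s+(\w+)\s*$",
    re.IGNORECASE,
)

def parse_arp_table(text: str) -> dict:
    """Map each IP address to a (hardware address, type) tuple."""
    entries = {}
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:  # header and title lines simply fail to match
            ip, mac, kind = match.groups()
            entries[ip] = (mac.lower(), kind)
    return entries

print(parse_arp_table(SAMPLE)["10.254.1.254"])  # ('00-10-5a-07-84-23', 'dynamic')
```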

Troubleshooting your routing
There are two basic tools used in the troubleshooting of IP networks. The first tool, which is
perhaps the most often used TCP/IP network-testing tool, is PING. It’s joined by
TRACEROUTE, a more informative tool that can help you diagnose the path a packet takes
to its destination. On Windows operating systems this is called TRACERT.

PING
The PING command, in its simplest form, uses only one parameter. That parameter is the IP
address to be pinged. PING will return one of only a few responses. The first possible
response is the number of milliseconds that it took for the PING command to send a packet to
the remote machine and for a response to be returned. If PING received a reply, then there are
no problems with basic connectivity to the remote device.

The second possible response is No Response (Windows displays this as Request timed out).
This message is generated when the PING command didn’t receive a response to its request.
The most likely cause is that the device is offline or that a device between you and the
destination, such as a firewall, will not pass along ICMP messages. Both TRACEROUTE and
PING rely on ICMP messages to do their work, so neither of the two tools that you typically
have at your disposal for resolving TCP/IP problems will function past such a device. If the
destination is local, you should verify its connectivity to the network. If it is remote, you’ll
have to investigate what devices are between you and the destination and try to diagnose the
problem from the device that isn’t allowing ICMP messages to be transmitted.

The third possible response from PING is Destination Host Unreachable. In this case, you
either have not specified a valid default gateway, or one of the routers along the path to the
destination has lost its connection. This response tells you that the route that should lead to
the destination is not working. It is most typically seen when the only link to the site
is down. If you receive this message, you should follow up by using the TRACEROUTE
command to determine which router believes the destination is unreachable.

The fourth possible response from PING is Time To Live Expired. This message typically
indicates a routing loop where one router sends a packet to its peer and then the peer router
sends it back. This generally indicates a routing table problem. You’ll need to use the
TRACEROUTE command to locate the routers that have the problem.

There are other possible responses from PING, such as Hardware Failure. This can occur
when you disconnect the network cable during the PING process. However, most of the other
messages that can be generated by PING are messages that are not normally associated with
the troubleshooting process.
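
The decision process above can be summarized as a small lookup. This sketch matches on fragments of the message text; the exact wording differs between operating systems and versions, so the strings below are assumptions to adjust for your platform, and the function name diagnose is hypothetical.

```python
def diagnose(ping_line: str) -> str:
    """Suggest a next step for one line of PING output."""
    text = ping_line.lower()
    # Error replies can also begin with "Reply from", so test them first.
    if "destination host unreachable" in text:
        return "route broken: check the default gateway, then trace the path"
    if "ttl expired" in text or "time to live" in text:
        return "possible routing loop: trace the path to find the routers involved"
    if "timed out" in text or "no response" in text:
        return "no reply: host offline or ICMP filtered along the path"
    if "reply from" in text:
        return "connectivity is fine"
    return "unrecognized response"

print(diagnose("Reply from 10.254.1.254: bytes=32 time<10ms TTL=128"))
print(diagnose("Request timed out."))
```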

TRACEROUTE
PING is a great tool, but it gives a rather limited set of information. TRACEROUTE,
on the other hand, can return the complete path that the packet takes on its way to the final
destination. The basic execution of the TRACEROUTE command is simply the command
name followed by the IP address to trace to. In the case of Windows, the command is called
TRACERT; on UNIX systems, it is TRACEROUTE. The command will return output
similar to the following:

Tracing route to penguin.datacenterdaily.com [216.37.52.229]
over a maximum of 30 hops:

1 <10 ms 10 ms 10 ms WEBMASTER [10.254.1.254]
2 <10 ms <10 ms 10 ms adsl-68-23-14-174.dsl.lgtpmi.ameritech.net [68.23.14.174]
3 20 ms 40 ms 20 ms adsl-68-23-14-1.dsl.lgtpmi.ameritech.net [68.23.14.1]
4 20 ms 30 ms 40 ms dist1-vlan50.ipltin.ameritech.net [67.36.128.226]
5 10 ms 20 ms 20 ms bb1-fa2-1-0.ipltin.ameritech.net [67.36.128.115]
6 30 ms 30 ms 20 ms sl-gw22-chi-2-0.sprintlink.net [144.228.153.125]
7 30 ms 20 ms 30 ms 144.232.10.9
8 30 ms 50 ms 40 ms sl-st21-chi-14-1.sprintlink.net [144.232.20.86]
9 30 ms 30 ms 40 ms 204.255.174.153
10 40 ms 30 ms 40 ms 0.so-3-1-0.XL2.CHI2.ALTER.NET [152.63.71.97]
11 50 ms 40 ms 30 ms 0.so-7-0-0.XR2.CHI2.ALTER.NET [152.63.67.134]
12 121 ms 60 ms 60 ms 192.at-6-2-0.CL2.IND6.ALTER.NET [152.63.66.217]
13 40 ms 60 ms 50 ms 190.ATM7-0.GW5.IND1.ALTER.NET [152.63.68.245]
14 60 ms 50 ms 50 ms onecall-POS-core-gw1.customer.alter.net [63.122.162.214]
15 50 ms 60 ms 50 ms Obelisk-2-Cedar-Oc3c.Onecall.net [216.37.0.114]
16 60 ms 50 ms 60 ms JayQualls-55-60.OneCall.Net [216.37.55.60]
17 40 ms 50 ms 121 ms GM-Colo-52-229.OneCall.Net [216.37.52.229]

Trace complete.

This shows the complete path that is taken for a packet from my private network attached to
Ameritech DSL to a system located in One Call Internet’s colocation facility. If the utility
had returned a list of alternating routers, then you could identify that one or the other of those
routers had a configuration problem. Alternatively, you may receive a message indicating
Destination Unreachable. This message indicates that the router's path to the destination has
been severed.
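
Spotting alternating routers in a long trace is easy to automate. The sketch below takes the hop addresses in order and reports the first pair that repeats back to back. The function name find_loop is hypothetical, and the looping list is invented for illustration by reusing addresses from the trace above; it is not real output.

```python
def find_loop(hops):
    """Return the first pair of routers that alternate (A, B, A, B), or None."""
    for i in range(len(hops) - 3):
        a, b, c, d = hops[i:i + 4]
        if a == c and b == d and a != b:
            return (a, b)
    return None

healthy = ["10.254.1.254", "68.23.14.174", "68.23.14.1", "67.36.128.226"]
# Made-up trace in which two routers keep handing the packet to each other.
looping = ["10.254.1.254", "68.23.14.174", "68.23.14.1", "68.23.14.174", "68.23.14.1"]

print(find_loop(healthy))  # None
print(find_loop(looping))  # ('68.23.14.174', '68.23.14.1')
```

Either router in the reported pair may hold the bad routing table entry, so both are worth checking.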

It’s not that complicated


When troubleshooting TCP/IP problems, keep in mind that it’s critical that you get the IP
address, subnet mask, and default gateway correct. Once those few parameters are set
correctly, the TCP/IP infrastructure can begin to help your system reach every other connected
computer. The PING and TRACEROUTE commands are key to diagnosing your network
problems.
