
File Transfer Protocol (FTP)

This chapter is a précis of Chapter 34 of Doug Comer's book [Comer 2004].

FTP is defined in [RFC 959].

A file is a fundamental abstraction for long-term storage of information. As
networks emerged, so did the need to transfer files from one computer to
another. File transfer is complicated because it must accommodate
differences among the ways computer systems store files, e.g. rules for valid
file names, file extensions and file ownership.

The File Transfer Protocol


The most widely deployed Internet file transfer service uses the File Transfer
Protocol (FTP). The characteristics of FTP are as follows.

• General purpose - FTP handles the file-transfer problems discussed above
• Arbitrary file contents - FTP transfers arbitrary data, e.g. text and binary
  files
• Authentication and ownership - FTP allows files to have ownership and
  access restrictions
• Accommodates heterogeneity - FTP hides the details of individual
  computer systems

FTP is one of the oldest application protocols still used on the Internet, and is
invoked by web browsers when a user requests a file download. FTP predates
both IP and TCP. As TCP/IP was created, a new version of FTP was
developed to work with the new Internet protocols. FTP is still heavily used --
only in 1995 did web traffic on the Internet surpass FTP traffic for the first
time.

FTP Commands

FTP is designed to be run either from a program, e.g. a browser, or
interactively by a user. When invoked by a program, FTP must handle all the
details and then report whether the operation succeeded or failed; the user
never sees the FTP interface.

FTP has commands that allow users to connect to a remote computer,
provide authentication, find out what remote files are available, and request
file transfers. The FTP protocol specifies exactly how FTP software on one
computer interacts with FTP software on another. However, it does not specify
a user interface. Consequently, implementations vary, though most use the
interface originally written for the BSD UNIX system. Figure 1 lists the
command names.

Figure 1: FTP commands

There are over 50 commands. Some are seldom implemented
(e.g. proxy permits simultaneous communication with two remote computers),
some are irrelevant (e.g. tenex refers to an obsolete operating system), and
some are aliases (e.g. dir and ls both request a directory listing).
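
The program-driven style of use described above can be sketched with Python's
standard ftplib module. This is a minimal sketch, not part of the chapter: the
function names list_and_fetch and fetch_from are hypothetical, and the
comments that map library calls to interactive commands are approximate.

```python
from ftplib import FTP, all_errors


def list_and_fetch(ftp, remote_name):
    """Return (names, data): the remote file names and one file's bytes.

    `ftp` is assumed to be a connected, logged-in ftplib.FTP object
    (or any object offering the same nlst/retrbinary methods).
    """
    names = ftp.nlst()                    # roughly the interactive "ls"
    chunks = []
    # Roughly the interactive "get"; RETR is the RFC 959 retrieve command.
    ftp.retrbinary("RETR " + remote_name, chunks.append)
    return names, b"".join(chunks)


def fetch_from(host, user, password, remote_name):
    """Run a whole session and report only success or failure."""
    try:
        with FTP(host) as ftp:            # roughly "open host"
            ftp.login(user, password)     # supply authentication
            return True, list_and_fetch(ftp, remote_name)
    except all_errors as exc:             # any FTP or network failure
        return False, str(exc)
```

A caller of fetch_from sees only a success flag and either the result or an
error message, which mirrors the idea that the user never sees the FTP
interface when a program drives the transfer.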

Although file representations may differ on two computer systems, FTP does
not attempt to handle all possible representations. Instead, FTP defines two
basic types of transfer -- text and binary. A text file contains a sequence of
characters (usually in ASCII) separated into lines. Binary transfer mode is
used for all non-text files. FTP does not interpret the contents of a file
transferred in binary mode, which can cause problems: for example, a file of
32-bit floating point numbers is copied byte for byte even if the two
computers represent such numbers differently.
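
The difference between the two transfer types can be illustrated with a short
Python sketch. It assumes the RFC 959 convention that text (ASCII type) is
sent with each line ending in CR LF, and it uses byte order as one example of
a differing floating point representation. The function names are
hypothetical.

```python
import struct


def to_network_text(local_text, local_newline="\n"):
    """Convert local text to the wire form: ASCII with CR LF line endings."""
    return "\r\n".join(local_text.split(local_newline)).encode("ascii")


def from_network_text(data, local_newline="\n"):
    """Convert wire-form text back to the receiver's own line convention."""
    return data.decode("ascii").replace("\r\n", local_newline)


# Text mode: line endings are translated, so the file stays readable.
wire = to_network_text("first line\nsecond line\n")
restored = from_network_text(wire)

# Binary mode: bytes are copied unchanged, so meaning can be lost.
stored = struct.pack(">f", 1.0)              # 1.0 written in big-endian order
same_order = struct.unpack(">f", stored)[0]  # reader with the same byte order
other_order = struct.unpack("<f", stored)[0] # reader with the opposite order
```

Here same_order is 1.0 again, while other_order is a different number even
though every byte arrived intact.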

Control and Data Connections

Like other network applications, FTP uses the client-server paradigm. The
client establishes a control connection. When the user enters a command, the
client forms a request using the FTP protocol and sends it to the server.
Similarly, the server uses the FTP protocol to send a reply.

FTP uses the control connection only to send and receive control messages.
It establishes a separate data connection for each file transfer. Although these
data connections appear and disappear frequently, the control connection
persists for the entire session. Figure 2 illustrates the concept.

Figure 2: FTP Connections

The arrows on the connections show which side initiated the connection.
Using separate connections for control and data transfer has several
advantages: it simplifies the protocol and allows the control connection to be
used during a file transfer, e.g. to abort the transfer.
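
The messages that travel over the control connection can be sketched in
Python. The details below come from RFC 959 rather than from this chapter:
requests are text lines ending in CR LF, replies begin with a three-digit
code, PORT tells the server which client address to use for a data
connection, and ABOR asks the server to abort a transfer. The function names,
the address and the reply text are illustrative assumptions, and only
single-line replies are handled.

```python
def format_command(verb, argument=""):
    """Build one control-connection request line, ending in CR LF."""
    line = verb if not argument else verb + " " + argument
    return (line + "\r\n").encode("ascii")


def parse_reply(line):
    """Split a single-line reply into (three-digit code, text)."""
    text = line.decode("ascii").rstrip("\r\n")
    return int(text[:3]), text[4:]


def port_argument(ip, port):
    """Encode a data-connection address as h1,h2,h3,h4,p1,p2 for PORT."""
    return ",".join(ip.split(".") + [str(port // 256), str(port % 256)])


# The client announces where it will accept a data connection ...
announce = format_command("PORT", port_argument("192.168.1.2", 5001))
# ... requests a file, and can still send ABOR while data is flowing.
request = format_command("RETR", "notes.txt")
abort = format_command("ABOR")
# An illustrative reply; the text after the code varies between servers.
code, text = parse_reply(b"226 Transfer complete.\r\n")
```

Because these lines travel only on the control connection, the data
connection carries nothing but file contents, which is what keeps the
protocol simple.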
