S.Nagaraju
M050216CS
CERTIFICATE
Dr. M. P. Sebastian
Professor and Head of the Department
Dept. of Computer Engineering
NIT Calicut

Dr. M. P. Sebastian
Project Guide
Professor and Head of the Department
Dept. of Computer Engineering
NIT Calicut
ACKNOWLEDGEMENT
S.Nagaraju.
Contents
1 Introduction
2 Assumptions or requirements
3 Protocol
3.1 Variables
3.2 Opening the Initial Connection
3.3 Connection re-establishment
4 Performance
5 Network Configurations
6 Recovering from Logged Data
7 Conclusion
8 References
Abstract
We present an implementation of a fault-tolerant TCP (FT-TCP)
that allows a faulty server to keep its TCP connections open until
it either recovers or is failed over to a backup. The failure and
recovery of the server process are completely transparent to the
client processes connected to it via TCP. FT-TCP does not affect the
software running on a client, does not require changes to the server's
TCP implementation, and does not use a proxy.
1 Introduction
Failure of the server is within the control of the organization, while
failure of the client is not, so server recovery is important. Previous
solutions follow three approaches:
1. Application-level Approach
2. Proxy-based Approach
3. TCP Replication Approach
Figure 4: FT-TCP.
Working of SSW
Working of NSW
The NSW (north side wrap) intercepts read and write socket calls
passing from the application layer to the TCP layer. It logs the amount
of data returned by each read socket call (the read length). While a
crashed server recovers, the NSW forces each read socket call to return
the same data and the same read length as before the crash, which makes
recovery deterministic. It discards write socket calls so that data is
not resent to the client.
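
The NSW behaviour described above can be illustrated with a minimal sketch. The class name NorthSideWrap, its methods, and the rule that recovery ends once every logged read length has been replayed are assumptions made for illustration; they are not taken from the FT-TCP implementation.

```python
import io


class NorthSideWrap:
    """Toy model of the NSW (hypothetical names, not the FT-TCP code)."""

    def __init__(self, logged_read_lengths=None):
        # Read lengths recovered from the logger; empty on a fresh start.
        self.replay_lengths = list(logged_read_lengths or [])
        # Assumption: we are recovering while logged lengths remain.
        self.recovering = bool(self.replay_lengths)
        # Read lengths recorded during normal operation.
        self.read_log = []

    def read(self, stream, max_bytes):
        """Wrap a read socket call on any object with a .read(n) method."""
        if self.recovering:
            # Force the same read length as before the crash.
            n = self.replay_lengths.pop(0)
            if not self.replay_lengths:
                self.recovering = False
            return stream.read(n)
        data = stream.read(max_bytes)
        self.read_log.append(len(data))
        return data

    def write(self, send, data):
        """Wrap a write socket call; `send` is the real send function."""
        if self.recovering:
            # Discard the write so the client does not see the data twice.
            return len(data)
        return send(data)


# Normal run: two reads are logged.
nsw = NorthSideWrap()
source = io.BytesIO(b"hello world")
first = nsw.read(source, 5)
second = nsw.read(source, 100)

# Recovery run: the same read lengths are forced, whatever the caller asks for.
recovered = NorthSideWrap(logged_read_lengths=nsw.read_log)
replayed = io.BytesIO(b"hello world")
replay_first = recovered.read(replayed, 100)
```

In this sketch the replayed stream stands in for the data that the logger would feed back to the restarted server.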
Working of Logger
2 Assumptions or requirements
A restarting server restarts the application from its initial state,
and the process issues the same sequence of read socket calls when
replayed. FT-TCP also requires a mechanism that allows another process
or processor to take over the IP address of a process on a failed
processor. This mechanism must also update the ARP cache of any client
on the same physical network.
3 Protocol
3.1 Variables
- delta_seq = offset that allows the SSW to map seq #s between the
server's TCP layer and the client's view of the connection.
- stable_seq = smallest seq # that the SSW does not know to be
logged.
- server_seq = highest seq # acknowledged by the client.
- unstable_reads = no. of read socket calls whose read lengths the
NSW does not know to be logged.
- restarting = true when the server is not in normal operation.
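
As a rough illustration, the five variables can be grouped into one per-connection record. The class name FtTcpState, the zero defaults, and the two helper methods are assumptions; the helpers only follow from the definition of the unstable-reads counter given above.

```python
from dataclasses import dataclass


@dataclass
class FtTcpState:
    """Per-connection FT-TCP variables (hypothetical layout and defaults)."""

    delta_seq: int = 0        # offset the SSW uses to map seq #s
    stable_seq: int = 0       # smallest seq # not known to be logged
    server_seq: int = 0       # highest seq # acknowledged by the client
    unstable_reads: int = 0   # reads whose lengths are not known to be logged
    restarting: bool = False  # True when the server is not in normal operation

    def on_read(self):
        # A read socket call returned; its length is not yet known to be logged.
        self.unstable_reads += 1

    def on_read_length_logged(self):
        # The logger confirmed one read length; never go below zero.
        if self.unstable_reads > 0:
            self.unstable_reads -= 1


state = FtTcpState()
state.on_read()
state.on_read()
state.on_read_length_logged()
```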
3.3 Connection re-establishment
When a server crashes:
- The logger detects the server failure and temporarily takes over by
sending TCP segments with a closed window and acks up to stable_seq.
- The server restarts and FT-TCP reconnects with the logger. FT-TCP
sets stable_seq and server_seq from the logged data, sets
unstable_reads to 0 and restarting to true.
- The logger implicitly relinquishes generation of acks to the SSW.
- The restarting application executes either an accept or a connect
socket call.
- The SSW fabricates a SYN that appears to come from the client, with
an initial seq # of stable_seq, and passes it to the TCP layer.
- The SSW captures the acknowledging SYN from the server's TCP and
sets delta_seq to the original initial seq # minus the newly proposed
initial seq #.
- The SSW discards this segment, fabricates an ACK, and passes it to
the server's TCP.
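
The sequence-number bookkeeping in the steps above can be sketched as follows. The function names, the dictionary used to represent a segment, and the direction in which the offset is applied (added to outgoing seq #s, subtracted from incoming acks) are assumptions for illustration. The modulo reflects the fact that TCP sequence numbers are 32-bit values that wrap around.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def fabricate_syn(stable_seq):
    """Build the fake client SYN that the SSW hands to the server's TCP."""
    return {"flags": "SYN", "seq": stable_seq % SEQ_MOD}


def compute_delta_seq(original_isn, new_isn):
    """Original initial seq # minus the newly proposed initial seq #."""
    return (original_isn - new_isn) % SEQ_MOD


def to_client_seq(server_tcp_seq, delta_seq):
    """Map a seq # chosen by the restarted TCP into the client's numbering."""
    return (server_tcp_seq + delta_seq) % SEQ_MOD


def to_server_ack(client_ack, delta_seq):
    """Map an ack # from the client into the restarted TCP's numbering."""
    return (client_ack - delta_seq) % SEQ_MOD


# Example: the original connection started at 1000, and the restarted
# TCP proposes 4000000000 as its initial seq #.
delta = compute_delta_seq(1000, 4_000_000_000)
outgoing = to_client_seq(4_000_000_005, delta)
incoming = to_server_ack(1005, delta)
```

With these values the fifth byte after the new initial seq # appears to the client as 1005, and an ack of 1005 from the client maps back to 4000000005.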
4 Performance
Prototype implementation:
- The client transmits a stream of bulk data to the server as fast as
it can.
- The server just discards this data.
Quantities measured:
- Throughput of FT-TCP.
- Additional latency introduced by FT-TCP.
- Recovery time of the server.
5 Network Configurations
1. Client and server share a 10 Mbps Ethernet, and server and logger
share another 10 Mbps Ethernet (10-10).
2. All three are on the same 10 Mbps Ethernet (10 Shared).
3. Client and server share a 10 Mbps Ethernet, and server and logger
share a 100 Mbps Ethernet (10-100).
6 Recovering from Logged Data
To avoid large latency, FT-TCP sends recovery data to the logger
asynchronously. Some recovery data may therefore be lost, so recovery
can restore the server to a state earlier than the one that the client
knows about. For example, suppose the ack seq # sent by the server is
asn but the logger has only recorded asn-l. When the server recovers,
TCP knows only about asn-l, so its next packet has an ack # less than
asn. But the client may have already discarded the data up to asn,
since it received an ack for it. FT-TCP solves this problem by making
the SSW not allow the outgoing ack seq # to be larger than asn-l+1.
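
The clamping rule can be sketched as below. Treating asn-l+1 as the smallest seq # not yet known to be logged (the stable seq variable of Section 3.1) is a reading of the text, and the function names are hypothetical. The comparison uses 32-bit wraparound ordering because TCP sequence numbers wrap.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def seq_after(a, b):
    """True if seq # a comes after seq # b in 32-bit wraparound order."""
    return 0 < (a - b) % SEQ_MOD < SEQ_MOD // 2


def clamp_outgoing_ack(ack, stable_seq):
    """Never let an outgoing ack # exceed what is known to be logged."""
    return stable_seq if seq_after(ack, stable_seq) else ack


# The server's TCP wants to ack 5000, but only data below 4000 is logged.
clamped = clamp_outgoing_ack(5000, 4000)
# An ack that is already at or below the limit passes through unchanged.
unchanged = clamp_outgoing_ack(3000, 4000)
```

Because the client never receives an ack for data the logger has not recorded, it keeps that data and can retransmit it after the server recovers.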
7 Conclusion
FT-TCP wraps an existing TCP layer to mask server failures from
unmodified clients. If the server-logger connection is fast, the
additional overhead on throughput and latency is low. FT-TCP solves
the network part of the fault-tolerance puzzle: to the client, the
connection is indistinguishable from non-fault-tolerant TCP.
8 References
[1] P. M. Chen et al. “The Rio file cache: Surviving operating system
crashes”. In Proceedings of the Seventh International Conference on
Architectural Support for Programming Languages and Operating
Systems, October 1996, pp. 74-83.
[2] E. Elnozahy, L. Alvisi, Y.M. Wang, and D.B. Johnson. “A Survey
of Rollback-Recovery Protocols in Message Passing Systems”. CMU
Technical Report CMU-CS-99-148, June 1999.
[3] D. Maltz and P. Bhagwat. “TCP splicing for application layer
proxy performance”. IBM Research Report 21139 (Computer
Science/Mathematics), IBM Research Division, 17 March 1998.