1. Introduction.
2. Fault Tolerance by Duplication.
3. Types:
Hardware Fault Tolerance Systems
Software Fault Tolerance Systems
4. General Fault Tolerance Procedure.
5. Fault Tolerance in Distributed Systems.
6. Conclusion.
7. References.
Introduction
• Fault tolerance is the property that enables a
system to continue operating properly when
some of its components fail.
• A fault-tolerant system is designed from the
ground up for reliability by duplicating
all critical components, such as
CPUs, memory, and disks.
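• Three terms describe how a problem
propagates through a system:
Fault: a defect in a component
Error: an incorrect internal state caused by a fault
Failure: a deviation of the system from its specified service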
• Fault-tolerance is not just a property of
individual machines; it may also characterise
the rules by which they interact.
• Recovery from errors in fault-tolerant systems
can be characterised as either roll-forward or
roll-back.
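The two recovery styles can be illustrated with a small sketch. It is not taken from the slides: the `Account` class, its validity rule (a balance must not be negative), and the repair rule used for roll-forward are all hypothetical, chosen only to show that roll-back returns to a saved checkpoint while roll-forward corrects the current state and continues.

```python
import copy


class Account:
    """Toy stateful component with checkpoint-based recovery (illustrative only)."""

    def __init__(self, balance):
        self.state = {"balance": balance}
        self._checkpoint = copy.deepcopy(self.state)

    def checkpoint(self):
        # Save a copy of the current, known-good state.
        self._checkpoint = copy.deepcopy(self.state)

    def is_valid(self):
        # Error detection (assumed rule): a negative balance is an erroneous state.
        return self.state["balance"] >= 0

    def roll_back(self):
        # Roll-back: discard the erroneous state and return to the last checkpoint.
        self.state = copy.deepcopy(self._checkpoint)

    def roll_forward(self):
        # Roll-forward: repair the erroneous state in place and keep going.
        self.state["balance"] = max(self.state["balance"], 0)


acct = Account(100)
acct.checkpoint()

acct.state["balance"] -= 150          # a fault pushes the state into error
if not acct.is_valid():
    acct.roll_back()
balance_after_roll_back = acct.state["balance"]

acct.state["balance"] -= 150          # the same fault occurs again
if not acct.is_valid():
    acct.roll_forward()
balance_after_roll_forward = acct.state["balance"]
```

Roll-back needs a stored checkpoint and loses any work done since it was taken. Roll-forward needs enough knowledge of the error to correct the state directly.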
Fault-tolerance by duplication:
Duplication can provide fault tolerance in three ways:
Replication
Redundancy
Diversity
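As a rough sketch of how the three approaches differ, the code below uses their usual meanings: replication runs identical instances in parallel and takes the majority result, redundancy keeps spare instances and fails over to one when the active instance fails, and diversity uses different implementations of the same specification. All function names here are hypothetical.

```python
from collections import Counter


def replicated(replicas, x):
    # Replication: send the task to every instance and accept the majority result.
    results = [replica(x) for replica in replicas]
    value, votes = Counter(results).most_common(1)[0]
    if votes <= len(results) // 2:
        raise RuntimeError("no majority among replicas")
    return value


def redundant(instances, x):
    # Redundancy: use the first instance and fail over to a spare if it fails.
    for instance in instances:
        try:
            return instance(x)
        except Exception:
            continue
    raise RuntimeError("all instances failed")


def square(x):
    return x * x


def square_by_sum(x):
    # Diversity: a second implementation of the same specification
    # (valid for non-negative integers), unlikely to share a bug with square().
    return sum(x for _ in range(x))


def faulty_square(x):
    # Simulated faulty unit that returns a wrong value.
    return x * x + 1


def crashed_square(x):
    # Simulated failed unit that cannot answer at all.
    raise RuntimeError("unit down")


voted = replicated([square, square, faulty_square], 4)
failed_over = redundant([crashed_square, square], 4)
diverse = replicated([square, square_by_sum, faulty_square], 4)
```

Voting masks a unit that returns a wrong answer, whereas failover only helps when the failure is detectable, such as a crash.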
A redundant array of independent disks (RAID)
is an example of a fault-tolerant storage device that
uses redundancy.
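The slide does not name a RAID level, so the following is only a sketch of one common scheme, single XOR parity (as used in RAID 5-style layouts). The block contents and the helper `xor_blocks` are made up for illustration. The parity block is the XOR of the data blocks, so any one lost block can be rebuilt from the survivors.

```python
def xor_blocks(blocks):
    # Byte-wise XOR of equal-length blocks.
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# Three data blocks, one per (hypothetical) disk, plus one parity block.
data = [b"disk", b"will", b"fail"]
parity = xor_blocks(data)

# Simulate losing the second disk, then rebuild it from the others plus parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

With single parity the array survives the loss of any one disk. A second simultaneous loss would be unrecoverable in this scheme.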
Tandem and Stratus were among the first manufacturers
dedicated to building fault-tolerant
computer systems for the online transaction processing
(OLTP) market.
Types:
Fault-tolerant systems are of two types.