FDIcourse

Fault Tolerant Control - an Engineering Approach
Mogens Blanke
Department of Control Engineering
Aalborg University, Denmark
email: blanke@control.auc.dk
September 1996
Contents
1 Introduction
1.1 Acronyms and Abbreviations
1.1.1 Definitions
1.1.2 Acronyms
1.1.3 Abbreviations
2 About Fault Tolerant Control
2.1 Introduction
2.2 How Fault Tolerance is Obtained
2.2.1 Open and closed loop systems
2.2.2 Reliability Analysis
2.2.3 A systematic Approach
2.3 Component Based Analysis of Fault Propagation
2.3.1 The Matrix FMEA Method
2.3.2 Completeness
2.3.3 Fault propagation in closed loop
2.3.4 Other approaches
2.3.5 Decision about fault handling
2.3.6 Fault Accommodation
2.4 Models for FDI
2.4.1 FDI based on dynamic models
2.5 Interconnection at subsystem level
2.5.1 The link to FDI models
2.6 An Architecture for Supervisory Control
2.7 Systematic Design
2.8 Supervisor Design and Implementation
2.8.1 Array based logic
2.8.2 Petri net implementation
2.8.3 Reflective programming implementation
2.8.4 A prototype implementation
2.9 Example: Temperature Control
2.10 Summary
3 Control System Interface with Physical Plant
3.1 Component Failure Modes
3.1.1 Sensor and Actuator Types
3.1.2 Level Measurement
3.1.3 Temperature Measurement
3.1.4 Pressure Measurement
3.2 Angular Position Measurement
3.2.1 Potentiometer
3.2.2 Flow Measurement
3.3 Actuators for Flow Control
3.3.1 Three-way Valve
3.3.2 Pumps
3.4 FMEA Schemes for Sensors and Actuators
3.4.1 Sensor Faults
3.4.2 Actuator Faults
3.5 Requirements to Interface
3.5.1 Component Categorization
3.5.2 Sensors
3.5.3 Single Sensor Fault Detection
3.5.4 Multiple sensor fault detection
3.5.5 Filtering of Electromagnetic Spikes
3.6 Actuators
3.7 Interface Requirements
3.7.1 Requirements to hardware and firmware
3.7.2 Combined Hardware - Software requirements
4 Fault Detection and Isolation
4.1 FDI in Closed Loop Control Systems
4.2 Requirements to FDI
4.3 Modelling of Faults and Fault-propagation
4.4 Methods for Change Detection
4.5 Geometric Approaches to Change Detection
4.5.1 Generation of residuals
4.5.2 Parity Equations
4.5.3 Diagnostic Observer
4.5.4 Unknown Input Observer
4.6 Statistical Methods to Generate Residuals
4.6.1 Kalman Filtering
4.6.2 Parameter Estimation
5 The Change Detection Problem
5.1 About stochastic signals
5.1.1 Amplitude distribution
5.1.2 Mean and variance of a stationary process
5.1.3 Mean and variance of a filtered stationary process
5.2 Measuring the difference between statistical signals
5.2.1 The Kullback Distance between Gaussian signals
5.3 Change Evaluation
5.3.1 Threshold tests
5.4 Statistical Detection
5.4.1 The Weighted Sum-squared Residual Technique
5.4.2 Sequential Probability Ratio Test
List of Figures
2.1 Failure Mode and Effect Analysis scheme illustrated graphically. Two component levels are shown.
2.2 Propagation of fault effects in closed loop control of 3-way valve. Solid lines show fault propagation. Points marked with star show where propagation can be stopped.
2.3 Block diagram for cooling system with two 3-way valves and sketch of surrounding components.
2.4 Bond graph model principle for cooling loop, with serial and parallel connections of components.
2.5 Three layer model for autonomous controller with link to upper level plant wide control or to operator interface.
2.6 Design method for dependable controller with autonomous fault detection and accommodation.
2.7 Temperature control loop with 3-way valve.
3.1 Switch arrangement for level switch.
3.2 3-wire resistance measurement of resistance in Pt element to measure temperature with compensation of wire resistance.
3.3 Pressure measurement using a strain gauge bridge fitted to a membrane converted to a 4-20 mA current out of the transducer.
3.4 Binary pressure indicator. The solid line is mechanical difference and the bottom of it is the adjustable set-point value.
3.5 Electrical diagram of potentiometer and computer interface to enable fault detection at the single sensor level.
3.6 Valve characteristics for diverging and converting operation (the use of A or B ports for inflow).
3.7 Operation of 3-way valve actuator with relay operated induction motor. Abbreviations are: o:open, c:close, s:stop, HTR:Heater, LS:Limit Switch, TS:Torque Switch.
3.8 Standby pump set with remote control. The control computers are independent and have mutual supervision.
3.9 Triple conversion sampling has only marginal overhead but offers both significant electromagnetic spike suppression and consistency check within one sample.
4.1 Levels of FDIA automation.
4.2 A general procedure of residual generation in FDI.
4.3 Geometric interpretation for detection of several faults.
4.4 Geometric interpretation for detection and isolation of one fault at a time.
4.5 Geometric interpretation for simultaneous fault detection and isolation.
4.6 Illustration of a bank of Kalman filters for statistical FDI.
4.7 Residual generation based on parameter estimation.
5.1 Statistical detection is applied to the residual.
List of Tables
2.1 FMEA Scheme for 3-way Valve
Chapter 1
Introduction
This document is a lecture note in fault-tolerant control used at the 9th
semester course for MS students in process control at Aalborg University.
1.1 Acronyms and Abbreviations

1.1.1 Definitions
The following definitions were made by the Safeprocess Technical Committee
of IFAC.
1.1.2 Acronyms
Dependable System : A system that has high reliability in terms of high
availability and where the consequences of a fault are limited to the
system itself, i.e., local faults do not develop into failure at plant level.
Event : An internal or external occurrence involving equipment performance or human action that causes a system upset.
Failure : The inability of a system or subsystem to accomplish its required
function.
Fault : A change in the characteristics of a part or component such that
its mode of operation or performance is changed in an undesired way.
Required specifications are no longer fulfilled.
Fault tolerant system : A system where a fault may lead to change of
operation or reduced performance but a single fault does not develop
into a failure on a subsystem or system level.
Failure Modes : The various ways in which failures occur.
Hazard : An intrinsic property or condition that has the potential to cause
an accident.
Reliability : The probability that a system, subsystem or component
will perform its intended function for a specified period of time under
normal conditions.
Slow-down : A change in operation, usually a reduction of capacity or
power, to protect machinery from damage or excessive wear.
Safety system : Electronic equipment that receives sensor information
about critical quantities and activates a dedicated actuator to stop
machinery if specified conditions exist. Condition check is usually
made as a simple limit check of sensor values. The purpose of a safety
system is to protect machinery from permanent damage due to, e.g.,
overspeed or lack of lubrication oil or cooling water.
Shut-down : A stop of a machinery system to protect it from permanent
damage. A shut-down is usually made by a dedicated safety system
that measures essential
System hierarchy : The following system hierarchy is used:
Component: the lowest level which is considered as fault candidates/maintenance. Components may comprise or use information from other components. Sensors and actuators are examples
of components.
Subsystem: a collection of components that has a defined purpose
and requirements to performance and operational modes.
System: a collection of sub-systems
Plant: the entirety of a physical system with its own purpose. A
plant is the largest entity considered.
1.1.3 Abbreviations
AI : Analog Input. Part of computer process interface
AO : Analog Output. Part of computer process interface
A/D : Analog to Digital conversion. Part of analog input.
D/A : Digital to Analog conversion. Part of analog output.
DI : Digital Input. Part of computer process interface
DiGraph : Directed Graph. Used for fault models in reliability analysis.
DO : Digital Output. Part of computer process interface
ETA : Event Tree Analysis
FMEA : Failure Mode and Effect Analysis
FTA : Fault Tree Analysis
FPA : Fault Propagation Analysis
HazOp : Hazard and Operability Analysis
IFAC : International Federation of Automatic Control
I/O : Input/Output
ISC : Integrated Ship Control.
LO : Lubricating Oil
PHA : Preliminary Hazard Analysis
Chapter 2
About Fault Tolerant Control
Fault tolerant controls have the ability to be resilient to simple faults in
control loop components. When faults occur, performance may be reduced
or close-down may be needed in some cases, but a simple fault will never be
amplified and cause plant failure. Detection and accommodation techniques
can be employed to obtain these features. Theory and basic development
methods exist for several fault detection problems, and some attempts have
been made in the field of supervisory control, but fault handling has not yet
been the subject of systematic research. This paper focuses on fault tolerant
control with specific emphasis on fault handling design and implementation.
Aims and means are discussed and ideas towards consistent design methods
are presented. A promising method is shown to be an automated analysis
of component fault modes and their effects. This method provides decision
tables for fault handling that show how fault migration can be stopped.
The potential of these techniques, when fully developed, is shown to be
significantly improved fault tolerance and autonomy of control loops. The
impetus is that this is obtainable with fairly simple means.
2.1 Introduction

Process technology has changed to more complex plants with a high degree
of automation. This has enhanced quality and efficiency in normal operation, but also made systems more vulnerable to faults. As a consequence,
industrial attention has changed towards increased dependability, a synonym
for a high degree of availability, reliability, and safety.
Advances in automation have provided integration of monitoring and
control functions to enhance the operator's overview and ability to take
remedial actions when faults occur. The manual supervision, to detect a
fault, isolate its cause, and accommodate the system to a new condition,
has been much improved. However, the complexity and fast response time
required make it appealing to move the more basic supervision down from
the operator to the automation level. To achieve this, plant supervision
needs to be automated and become more autonomous.
This is technically possible with integrated automation systems as platforms, but new design methods are needed to cope efficiently with the complexity and ensure that the functionality of a supervisor is correct and consistent. Fail-safe systems, known from avionics and other safety critical
applications, are expensive in both hardware and development effort, and
are prohibitive in cost for ordinary process automation. Here, additional
hardware should not be required and implementation costs should be very limited.
The occurrence of faults can be tolerated, but faults should be prevented from
developing into failures at a subsystem or plant level. Furthermore, it
should be guaranteed that all essential faults are detected and all critical
faults are accommodated.
Fault Detection and Isolation (FDI) theory has matured over the last
decade. Efficient methods exist to detect additive faults, where faults are
understood as signal vectors in a state space or polynomial system description (Gertler, 1993 [25]; Patton, 1995 [43]; Isermann, 1994 [29]). Difficulties
with fault detection in nonlinear systems have started to be solved (Frank,
1995 [21]; Shields, 1994 [46]), and robustness problems have been
dealt with in various ways: fuzzification (Frank, 1994 [20]), threshold adaption (Emami-Naeini, 1988 [17]; Ding and Frank, 1991 [15]; Jørgensen, 1995
[31]), and statistical hypothesis testing (Basseville and Nikiforov, 1994 [2]).
Detectability was investigated in Chen and Patton (1994) [11].
Much less work has been devoted to the problem of what to do when a
fault has been detected. An overall approach was taken by Åström et al.
(1986) [1], where not-normal controller operation and tuning were key issues.
The accommodation problem was treated by Tsui (1994) [47] for a narrow
scenario where state feedback was required and faults needed to be state
disturbance signals, similar to actuator faults.
The scope in much control systems research has been limited to solving
the fairly well formulated problem starting off with mathematical models
of control objects and faults represented as additive signal vectors. A general design concept was treated in Blanke (1995) [6] and marine application
studies were presented in Blanke and Jørgensen (1993 [8] and 1995 [5]).
The paper by Bøgh et al. (1995) [10] discusses autonomous, fault tolerant
control of a micro-satellite using the same basic ideas.
This chapter focuses on development of an overall concept that meets
industrial requirements to development methods. A method is suggested
that gives a consistent design and assures system dependability. The basic
philosophy has been to use existing sensors and actuators in an integrated
system and make systematic use of both direct and indirect redundancy
in the available information. Component based fault analysis is shown to
assure a much higher degree of completeness than otherwise achievable.

Section 2 deals with fault tolerance, what it is and how it can be achieved.
Section 3 presents the component based analysis developed by the author
and co-workers. Section 4 deals with the problem of getting models for FDI
design without major modelling efforts. Section 5 presents an architecture
for fault tolerant control where a three layer model with a supervisor at
the top level appears to be an efficient vehicle for implementation. Section 6
discusses supervisor design and implementation in general terms, and section
7 lists the details of a procedure for systematic design. Section 8 shows
an example on the use of the technique and reports on experience with a
prototype tool. A conclusion and a reference list complete the chapter.
2.2 How Fault Tolerance is Obtained

Faults in one subsystem of an automated plant often have undesired effects
on other subsystems if remedial actions are not taken after a fault occurs.
Today, shut down functions and interlocks are used to prevent failures from
propagating from one sub-system to another. The use of such functions has, however, the consequence that plant availability is sometimes reduced without
good reason. With the ever higher degree of automation, this has been the
key cause of increased vulnerability to simple faults, particularly in sensors
and actuators.
Dependability of a control system can be obtained by giving it the ability
to detect and isolate faults and react with actions that accommodate the
control system to the fault. Fault accommodation is predetermined at the
design stage: a control system can freeze to a safe state or the controller can
be re-configured, e.g., by using a reduced set of sensors if a sensor fault has
occurred.
2.2.1 Open and closed loop systems

Handling of faults in open loop systems, e.g., monitoring and remote control,
is technically straightforward, but the reactions used to accommodate a
fault need to be designed with careful consideration to safety and availability
of the total plant. Optimization at a local level may easily violate an overall
safety goal.
Handling of faults in closed loop components is a more difficult and
challenging task. Properly designed systems can accommodate the effects
of faults whereas less careful designs can let fault effects propagate to other
subsystems.
2.2.2 Reliability Analysis

For the reasons given above, fault analysis needs to incorporate analysis
throughout a system. Traditional methods for Fault Detection and Isolation
(FDI) do not cover this problem. They are very able to detect the presence
of a fault as a difference between actual and expected behaviour. Isolation
of a particular fault requires a hypothesis about the observed effects from
this fault. This is obtained by ad hoc engineering and requires deep process
knowledge and engineering skills to make a successful design. It is expensive
in terms of both key personnel and time.
Analysis of system reliability is mandatory for safety critical systems but
is also more and more often used for common industrial systems, driven by
the increasing environment and safety awareness in recent years. The state
of the art is such that no method can guarantee a complete description of
all possible fault modes of a system. Certain forms of risk analysis provide,
nevertheless, a very systematic approach to fault modelling once possible
component faults have been identified. Faults in common industrial components are subject to constant study, and a methodology based on component
fault modelling could use accumulated knowledge for each type of component. The number of principally different components in a certain branch
of industry is small enough to make this a manageable exercise.
2.2.3 A systematic Approach

A systematic approach can be made if the basic methodology from risk
analysis is adapted to the detailed mathematical models needed for real time
FDI. A link has to be established from qualitative, static risk models at the
component level to quantitative, dynamic FDI descriptions of input-output
relations to achieve this goal.
The link is obviously to merge the component based generic dynamic
models (energy, momentum, and flow relations) with component fault models from the risk analysis. The generic dynamic models can be extended to
subsystem input-output descriptions using a system behaviour description
approach (Willems, 1991 [49]) or a more traditional bond graph approach
(Karnopp and Rosenberg, 1983 [32]).
The systematic approach shall provide the following information:
1. List of faults to detect
2. Mathematical model for use in FDI
3. Basic character/criticality of each fault
4. Required reaction to each fault
These details are elaborated in the following.
2.3 Component Based Analysis of Fault Propagation

2.3.1 The Matrix FMEA Method
A Failure Mode and Effects Analysis (FMEA) (Legg, 1978 [33]; Herrin, 1981 [26];
Yuan, 1985 [51]; Bell, 1989 [4]) starts with selection of the lowest level of analysis. In this context, this is sensors, valves, motors and similar components.
All potential faults and their effects are determined. An FMEA scheme for
each component shows how fault effects out of the component relate to faults
at inputs, outputs, or parts within the components. This is illustrated in
figure 2.1.
Using $f_{ci}$ for component faults and $e_{ci}$ for the effects, the FMEA scheme
can also be expressed as:
Figure 2.1: Failure Mode and Effect Analysis scheme illustrated graphically.
Two component levels are shown.
\[ e_{ci} \leftarrow A^f_i \otimes f_{ci} \tag{2.1} \]
where $A^f_i$ is a Boolean matrix representing the propagation. The index
$i$ is a component identifier and $\otimes$ the inner product disjunction operator.
The operation carried out by the operator is equivalent to the scalar
Boolean disjunction "$\vee$" and the inner product to the "$\wedge$", i.e., row no. $k$ of
(2.1) is
\[ e_{ck} \leftarrow (a_{k1} \wedge f_{c1}) \vee (a_{k2} \wedge f_{c2}) \vee \ldots \vee (a_{kn} \wedge f_{cn}) \tag{2.2} \]
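As a minimal sketch of the Boolean propagation in (2.1) and (2.2), the product can be written in a few lines of Python. The helper name `bool_product`, the matrix entries and the fault vector are hypothetical and chosen for illustration only; they do not come from an actual FMEA scheme in these notes.

```python
def bool_product(A, f):
    """Boolean matrix-vector product: row k gives OR over j of (A[k][j] AND f[j])."""
    if any(len(row) != len(f) for row in A):
        raise ValueError("each row of A must have as many entries as f")
    return [any(a and x for a, x in zip(row, f)) for row in A]


# Hypothetical propagation matrix: 3 effects (rows), 4 faults (columns).
A_f = [
    [True, False, False, True],    # effect 1 is caused by fault 1 or fault 4
    [False, True, False, False],   # effect 2 is caused by fault 2
    [False, False, False, False],  # effect 3 is caused by none of the listed faults
]

f_c = [False, False, False, True]  # only fault 4 is present
e_c = bool_product(A_f, f_c)       # effect vector of the component
print(e_c)
```

Each row of the matrix reads as one line of an FMEA scheme: the effect occurs if any of the marked faults is present.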
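When some faults are effects that are propagated from other components,
we get
\[ e_{ci} \leftarrow A^f_i \otimes \begin{bmatrix} f_{ci} \\ e_{c,i-1} \end{bmatrix} \tag{2.3} \]
System descriptions are obtained from interconnection of component descriptions. The description of a system with three components and open
loop structure is
\[
\begin{aligned}
e_{c1} &\leftarrow A^f_1 \otimes [f_{c1}] \\
e_{c2} &\leftarrow A^f_2 \otimes \begin{bmatrix} f_{c2} \\ e_{c1} \end{bmatrix} \\
e_{c3} &\leftarrow A^f_3 \otimes \begin{bmatrix} f_{c3} \\ e_{c2} \end{bmatrix}
\end{aligned}
\tag{2.4}
\]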
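The fault effect scheme for this example is
\[
\begin{aligned}
e_{c3} &\leftarrow A^f_3 \otimes \begin{bmatrix} f_{c3} \\ e_{c2} \end{bmatrix}
= A^f_3 \otimes \begin{bmatrix} I & 0 \\ 0 & A^f_2 \end{bmatrix} \otimes \begin{bmatrix} f_{c3} \\ f_{c2} \\ e_{c1} \end{bmatrix} \\
&= A^f_3 \otimes \begin{bmatrix} I & 0 \\ 0 & A^f_2 \end{bmatrix} \otimes \begin{bmatrix} I & 0 \\ 0 & A^f_1 \end{bmatrix} \otimes \begin{bmatrix} f_{c3} \\ f_{c2} \\ f_{c1} \end{bmatrix} \\
e_{c3} &\leftarrow A^f_{sys} \otimes f_{sys}
\end{aligned}
\tag{2.5}
\]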
Effects are seen to be propagated to the next level of analysis and act
as parts faults at that level. This is continued until the system level is
reached. The schemes give a surjective mapping from faults to effects: there
is a unique path from fault to end effect, but different faults may cause the
same end effect.
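The level-by-level propagation described above can be sketched as a loop over an open-loop chain of components, where each component sees its own faults stacked on top of the effects of the previous one. This is an illustrative sketch only: the function names, the column ordering (own faults first, upstream effects last) and the three small matrices are assumptions made for the example.

```python
def bool_product(A, f):
    """Boolean matrix-vector product: row k gives OR over j of (A[k][j] AND f[j])."""
    return [any(a and x for a, x in zip(row, f)) for row in A]


def propagate_chain(matrices, faults):
    """Propagate effects along an open-loop chain of components.

    matrices[i] has one column per own fault of component i, followed by
    one column per effect of the previous component (none for the first).
    """
    effects = []
    for A, f in zip(matrices, faults):
        stacked = list(f) + effects  # own faults on top of upstream effects
        if any(len(row) != len(stacked) for row in A):
            raise ValueError("matrix width must match faults plus upstream effects")
        effects = bool_product(A, stacked)
    return effects


# Hypothetical three-component chain (1 = propagates, 0 = does not).
A1 = [[1, 0], [0, 1]]        # 2 own faults -> 2 effects
A2 = [[1, 1, 0], [0, 0, 1]]  # 1 own fault + 2 upstream effects -> 2 effects
A3 = [[1, 0, 1]]             # 1 own fault + 2 upstream effects -> 1 end effect

# Second fault of component 1 present, everything else healthy.
end_effect = propagate_chain([A1, A2, A3], [[0, 1], [0], [0]])
print(end_effect)
```

In this made-up chain the second fault of component 1 and the own fault of component 3 lead to the same end effect, which illustrates why the mapping from faults to effects is surjective rather than one-to-one.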
Reversal is obtainable through finding the inverse relation to (2.5)
\[ f_{sys} \leftarrow A^b_{sys} \otimes e_{c3} \tag{2.6} \]
The matrices $A^f$ and $A^b$ are each other's pseudo-inverse in the Boolean sense.

When there is no feedback involved, the result is the capability of isolation
of fault effects at any level.
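One simple reading of the backward relation is a lookup of all faults that can explain an observed effect, which amounts to using the transposed propagation matrix. The sketch below takes that reading as an assumption; it is not necessarily how the backward matrix is constructed in the method itself, and the matrix values are hypothetical.

```python
def candidate_faults(A, observed_effects):
    """Return indices of faults that could cause at least one observed effect."""
    n_faults = len(A[0])
    return [
        j
        for j in range(n_faults)
        if any(A[k][j] and observed_effects[k] for k in range(len(A)))
    ]


# Hypothetical system-level propagation matrix: 2 effects, 4 faults.
A_sys = [
    [True, False, False, True],
    [False, True, False, False],
]

# Only the first effect is observed: faults 0 and 3 are both candidates.
print(candidate_faults(A_sys, [True, False]))
```

The result for the first effect contains two candidates, so the effect can be traced back to a set of possible faults but not to a single one.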
Recent results from application experience indicate that the fault vector
needs to be extended such that each component is a logical expression of
more basic fault events,
\[ f_i = f_k \wedge f_l \tag{2.7} \]
as an example. This makes the above procedure more elaborate
but still solvable.
2.3.2 Completeness
Completeness of the fault effect vector is a necessary prerequisite for later
fault detection and isolation, because the only faults that can be isolated
are those specified in the design. Completeness is obtained if all possible
component faults are considered. This is not achievable in a rigorous sense,
but engineering experience from risk analysis makes it possible for practical
purposes.
It is noted that completeness does not ensure that component fault
isolation is possible since several component faults could cause the same
effects.
2.3.3 Fault propagation in closed loop

The FMEA scheme for a set of components connected in a closed loop is
principally described as
\[ e_{ci} \leftarrow A^f_i \otimes \begin{bmatrix} f_{ci} \\ e_{ci} \end{bmatrix} \tag{2.8} \]
Looking at the logic operation of this equation, it is obvious that the
solution is, if it exists
\[ e_{ci} \leftarrow A^f_i \otimes [f_{ci}] \tag{2.9} \]
With closed loop feedback and negative loop amplification, equation
(2.8) is unstable, however, and a steady state solution does not exist.
2.3.4 Other approaches

An alternative to the suggested FMEA based method of analysis could be
Petri nets, see, e.g., David and Alla (1994) [12] and Jensen (1994) [30]. Petri nets
enable analysis of discrete event systems and extensions of the theory enable
modelling of mixed continuous and discrete systems. The Petri net approach
has not been pursued in this context, but is an obvious research task.
2.3.5 Decision about fault handling

The implication is that an automated analysis will need to consider closed
loops as special cases. The interpretation of a closed loop in an FMEA
scheme is merely the observation that closed loop operation may amplify or
attenuate the effect of a fault. Which of the two happens depends on the
dynamic properties of the control loop and this question is outside the scope
of the matrix FMEA analysis.
Figure 2.2 shows the graphical representation of a closed loop FMEA
analysis. Bold lines in the scheme show how faults propagate. The important
observation is that propagation can be stopped at the points marked with
stars. This means that fault handling should be applied exactly at these
points.
The component based analysis can thus provide both a list of fault effects
and a suggestion of where in a system fault propagation can be stopped. In
the design method, it is then up to the designer to evaluate the severity of
each fault effect and determine which fault accommodation actions shall be
implemented.
Various examples have been investigated to illustrate the application of
the method, to provide insight into traps, and to highlight features (Blanke
and Jørgensen, 1993 [8]; Jørgensen, 1995 [31]).
2.3.6 Fault Accommodation

Once a sensor or actuator fault has been detected and isolated, some action
is needed by the control system to reduce or eliminate the fault effects, if
possible. This is referred to as fault accommodation. The specific actions
required to accommodate faults can be one or more of the items listed below:
- Decrease performance, e.g., reduce through-put of the controlled system.
- Change settings in the surrounding process to decrease the requirements to the controlled system.
- Change controller parameters.
- Change controller structure.
Figure 2.2: Propagation of fault effects in closed loop control of 3-way valve.
Solid lines show fault propagation. Points marked with star show where
propagation can be stopped.
- Use component redundancy if possible.
- Replace defective sensor with signal estimator/observer (analytical redundancy). Note, this operation may be limited in time
because external disturbances may increase the estimation error.
- If the fault is a set point error then freeze at last fault-free set
point and continue control operation. Issue an alert message to
operators.
- Freeze controller output to a predetermined value. Zero, maximum or last fault-free value are three commonly required values
- the one to be used is entirely application dependent. Finally
disable the controller.
- Fail-to-safe operation.
- Emergency stop of physical process (safety system).
The accommodation actions needed follow from the FMEA analysis.
The requirements on part of software are that such fault accommodation
actions can be easily specified and that autonomous fault detection and
accommodation is part of controller and safety system specifications. This
is beyond the scope of present automation equipment but is believed to be
an essential part of requirements to come to improve overall reliability of
automated machinery systems.
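The requirement that accommodation actions should be easy to specify can be illustrated with a small lookup table from isolated fault effect to predetermined action. This is a sketch under stated assumptions: the fault effect names, the selection of actions and the fallback to an emergency stop for unlisted effects are all hypothetical choices made for the example.

```python
from enum import Enum


class Action(Enum):
    """A few of the accommodation actions listed in the text."""
    REDUCE_PERFORMANCE = "decrease performance"
    USE_ESTIMATOR = "replace defective sensor with estimator"
    FREEZE_SET_POINT = "freeze at last fault-free set point and alert operators"
    FREEZE_OUTPUT = "freeze controller output and disable controller"
    EMERGENCY_STOP = "emergency stop of physical process"


# Hypothetical decision table: isolated fault effect -> accommodation action.
DECISION_TABLE = {
    "sensor_signal_lost": Action.USE_ESTIMATOR,
    "set_point_error": Action.FREEZE_SET_POINT,
    "actuator_stuck": Action.FREEZE_OUTPUT,
}


def accommodate(fault_effect):
    """Look up the predetermined action for a fault effect.

    Unlisted effects fall back to the most conservative action, which is
    an assumption of this sketch.
    """
    return DECISION_TABLE.get(fault_effect, Action.EMERGENCY_STOP)


print(accommodate("set_point_error").value)
```

Keeping the table as data rather than as control flow makes it straightforward to review against the FMEA results.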
2.4 Models for FDI

2.4.1 FDI based on dynamic models
Once the list of all component fault effects is established, dynamic models
for FDI need to be generated.
The Bond Graph approach (see e.g. Karnopp and Rosenberg, 1983 [32])
is well suited for component based dynamic modelling. An example on the
modelling associated with a 3-way valve in a cooling system is seen in table
2.1.
Two models are needed corresponding to each of the two physical laws
of the thermal system: conservation of mass - the flow equation, and conservation of energy - the temperature equation.
The flow model is, in equation form:
Table 2.1: FMEA Scheme for 3-way Valve
Effects (columns): Flow zero; Flow reduced; A flow high; B flow high.
Faults (rows): input fault; comp. fault; output fault.
Entries: rotor fault; pipe leak, power fault; pipe broken; pipe clogged, power fault; pipe leak, port A or B clogged; rotor fault, bearing worn; setp. fault; setp. fault.
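\[
\begin{aligned}
\text{Input:} \quad & q_1 \\
\text{Input:} \quad & q_2 \\
\text{Fault:} \quad & \Delta q \;
\begin{cases}
= 0 & \text{: no fault} \\
< 0 & \text{: reduced flow} \\
= -q_1 - q_2 & \text{: no flow}
\end{cases} \\
\text{Internal:} \quad & R_1 = f(\cdot), \ \text{where } 0 < f(\cdot) < K \\
& R_2 = K - f(\cdot) \\
\text{Output:} \quad & q_3 = q_1 + q_2 + \Delta q
\end{aligned}
\tag{2.10}
\]
The flow fault effect is incorporated as an additive output fault. The
flow into port A is $q_1$, and $q_2$ into port B. The output flow is $q_3$.
The temperature model is
\[
\begin{aligned}
\text{Input:} \quad & t_1 \\
\text{Input:} \quad & t_2 \\
\text{Internal:} \quad & K_1 = f(\cdot), \ \text{where } 0 < f(\cdot) < K \\
& K_2 = K - f(\cdot) \\
\text{Output:} \quad & t_3 = t_1 K_1 + t_2 K_2
\end{aligned}
\tag{2.11}
\]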
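The two valve relations can be tried out numerically with a few lines of Python. The sketch assumes that the flow fault enters as an additive term on the output and that the temperature weights sum to a constant K, here set to 1 so that the output is a weighted average; the function names and numbers are illustrative assumptions.

```python
def valve_flow(q1, q2, delta_q=0.0):
    """Output flow of the valve with an additive output fault delta_q."""
    return q1 + q2 + delta_q


def valve_temperature(t1, t2, k1, K=1.0):
    """Output temperature as a weighted sum of the two port temperatures.

    k1 plays the role of the position dependent weight of port A and
    K - k1 the weight of port B; K = 1 is an assumption of this sketch.
    """
    if not 0.0 < k1 < K:
        raise ValueError("k1 must lie strictly between 0 and K")
    k2 = K - k1
    return t1 * k1 + t2 * k2


print(valve_flow(2.0, 3.0))                 # no fault
print(valve_flow(2.0, 3.0, delta_q=-1.5))   # reduced flow
print(valve_flow(2.0, 3.0, delta_q=-5.0))   # no flow
print(valve_temperature(80.0, 20.0, 0.25))  # mixing of hot and cold inlet
```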
Such models are easily represented as bond graphs. The causality of
a bond graph model is determined by the interconnection, i.e., how the
components are used. There is hence freedom to use the same component
representation in different connections and use different directions of causality. This flexibility makes the bond graphs well suited for use in a component
library, see de Vries (1994) [13].
2.5 Interconnection at subsystem level

A dynamic model at the subsystem level is obtained from specification of
links between components. This is illustrated schematically in Figures 2.3
and 2.4 for two 3-way valve cooling loops and a process being cooled. The diagrams are not complete but serve as illustration. The principal issue is that
a graphical component description with component links has an underlying
model and the structures of the two can be directly related.
Figure 2.3: Block diagram for cooling system with two 3-way valves and
sketch of surrounding components.
2.5.1 The link to FDI models

Analytical methods for detection of faults and their subsequent isolation (FDI) require a dynamic fault model like equation (2.12):

\begin{align}
\dot{x}(t) &= A x(t) + B u(t) + E_f f(t) + E_d d(t) \\
y(t) &= C x(t) + D u(t) + G_f f(t) + G_d d(t) \tag{2.12}
\end{align}

The symbols in eq. (2.12) are: state vector, x, control input, u, disturbance, d, additive fault vector, f, and output vector, y. Fault propagation is described by the plant dynamics and the matrices E_f and G_f.
Parity equation and fault detection observer based FDI methods detect a deviation from normal and isolate the component of the fault vector, f, which is the most likely cause of the deviation. Identification approaches determine changes of parameters in any of the system matrices.
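To illustrate how a residual reveals an additive fault in a model of the form (2.12), the following minimal sketch simulates a scalar instance together with a simple observer. The plant parameters, the observer gain, the fault time and the Euler step are all assumed values chosen for illustration; it is not one of the FDI designs referenced in the text.

```python
def simulate_residual(fault_time=5.0, fault_size=1.0, t_end=10.0, dt=0.01):
    """Euler simulation of a scalar instance of eq. (2.12) with an observer.

    All numerical values are illustrative assumptions.
    """
    a, b, c = -1.0, 1.0, 1.0       # plant: x' = a x + b u,  y = c x
    e_f, g_f = 0.0, 1.0            # fault enters the output only (sensor fault)
    gain = 2.0                     # observer gain, assumed
    x = x_hat = 0.0
    u = 1.0
    residuals = []
    for k in range(int(round(t_end / dt))):
        t = k * dt
        f = fault_size if t >= fault_time else 0.0
        y = c * x + g_f * f
        r = y - c * x_hat                              # measured minus predicted output
        residuals.append((t, r))
        x += dt * (a * x + b * u + e_f * f)            # plant state update
        x_hat += dt * (a * x_hat + b * u + gain * r)   # observer state update
    return residuals


res = simulate_residual()
# Times at which a simple threshold test on the residual would raise an alarm.
alarms = [t for t, r in res if abs(r) > 0.1]
```

The residual stays at zero while the model matches the plant and becomes non-zero once the fault appears; a threshold test on it is the simplest possible detector.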
A crucial point about FDI methods is that only fault effects which have been included in the model can be isolated. The FDI methods alone cannot guarantee that all relevant faults can be isolated. This obstacle is overcome by using the risk analysis approach to define the fault effects.

Figure 2.4: Bond graph model principle for cooling loop, showing serial and parallel connections of components (block labels: 3WV, TS, ME, Cntrl).
Transformation of a bond graph model to an FDI state space description can be completely automated when causality in the loop is properly defined. The interested reader should consult de Vries (1994) [13] or Karnopp and Rosenberg (1983) [32].
2.6 An Architecture for Supervisory Control

The first step to achieve fault tolerant control is detection of a non-normal condition. The second step is to isolate the cause to one or more possible component faults. The third step is to evaluate the condition, decide which actions to activate to accommodate the fault, and finally enforce the handling actions.
These functions are adequately implemented as a supervisory structure with three levels:
1. A lower level with control and input/output.
2. A second level with functions to detect fault conditions in sensors, actuators, control loops and control algorithms where needed.
3. A third level with decision logic which reacts on the current condition, receiving inputs from detectors on any non-normal state and the operational mode of the process. Dedicated effector modules will also exist to execute handling actions when required.

Figure 2.5: Three layer model for autonomous controller with link to upper level plant wide control or to operator interface.

The 2nd and 3rd levels are meta-levels which together constitute a supervisory control. Levels 1 and 2 are executed in real-time. Level 3 is executed when triggered by events at a lower level.
2.7 Systematic Design

The method for systematic design is a computer assisted interactive design process in which the designer's judgment is used to determine fault handling actions. Parts of the method are automated: the FMEA analysis, the logic inference, the consistency check and the implementation of decision logic in a supervisor.
The systematic design is illustrated in figure 2.6 and comprises the following steps:
1. Component based FMEA
   - Detailed FMEA model for each component
   - List all potential component faults
   - Find fault effects for each component fault
2. Criticality assessment
   - Propagate fault effects through the system and make a list of fault effects at the subsystem/system level
   - Evaluate the criticality of each fault effect
3. Deduce remedy actions
   - Locate closed loop points and determine desired fault reactions
   - Use system functionality requirements in this analysis
   - Make a list of effects to be handled
   - Determine remedy actions for each
4. Fault accommodation design
   - Determine control requirements and inputs/outputs for each fault effect to be accommodated.
   - Determine controller configuration, including possible sensor signal estimation for each fault effect. This step may include reactions ranging from plant shut-down to issue of an operator warning.
   - Determine how reconfiguration shall be done.
5. Reverse component FMEA
   - Determine faults that cause the end effects with high criticality level. These faults should be detected and subsequently handled.
   - This is done by reversal of the FMEA logic from the list of fault effects to be handled.
6. System modelling
   - Model relevant parts of the system as required by the FDI methods to be employed.
7. Fault detector design
   - Determine type of detection method.
   - Determine robustness requirements.
   - Determine detection method and parameters.
8. Supervisor design and implementation
   - Determine decision logic.
   - Determine conditions for use of each detector.
   - Assure consistency and correct implementation (a subject of active research).
   - Take account of modes of operation.

Figure 2.6: Design method for dependable controller with autonomous fault detection and accommodation.

The problem of consistency and completeness can be partly solved using an inference engine. Reliable implementation is also a subject for research.
2.8 Supervisor Design and Implementation

The implementation of a supervisor and its detectors/effectors is required to be correct. This is a difficult task. Correctness cannot be tested in the way controllers are tested in normal operation, because most parts of a supervisor will only be activated when certain faults happen. Furthermore, since these parts of the supervisor are not used in daily operation, one will not know of an implementation fault before a control loop fault has caused plant malfunction. For these reasons, automatic code generation and systematic design methods are key issues.
There are three main concerns when implementing a supervisor. First, a design methodology must ensure that there is a unique mapping from the system fault description to supervisor logic. Second, consistency of the logic should be provable. Third, automatic code generation of the state-event logic is preferred for reasons of implementation reliability.
2.8.1 Array based logic

Array based logic (Møller, 1995 [40]; More, 1981 [41]; Franksen, 1978 [22]) is the basis for a commercial inference engine and run time system. It uses array theory to verify consistency of the logic at compile time, and has deterministic execution duration at run time. This is obtained by representing the logic in a rule base as matrices. Logic inference then consists of boolean operations on logic matrices. This software tool makes it possible to analyze logical relations and to implement state-event machines automatically once it has verified the describing logic for correctness. A main advantage of this tool is that its logic matches the fault propagation description.
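The idea of logic inference as boolean operations on matrices can be sketched in a few lines. The example below is a simplified illustration and not the commercial tool's method or interface: the table `M`, its dimensions and the function names are hypothetical, and the reverse lookup only considers single faults.

```python
def propagate(matrix, faults):
    """Boolean matrix-vector product: effect i is present if any active
    fault j has matrix[i][j] set (OR over AND terms)."""
    return [any(m and f for m, f in zip(row, faults)) for row in matrix]


def candidate_faults(matrix, observed):
    """Reverse lookup: indices of single faults whose propagated effects
    equal the observed effect pattern."""
    n = len(matrix[0])
    hits = []
    for j in range(n):
        single = [k == j for k in range(n)]
        if propagate(matrix, single) == observed:
            hits.append(j)
    return hits


# Hypothetical table with 3 effects (rows) and 4 component faults (columns).
M = [
    [True,  True,  False, False],
    [False, True,  True,  False],
    [False, False, False, True],
]

effects = propagate(M, [False, True, False, False])
suspects = candidate_faults(M, [True, True, False])
```

Because every operation is a fixed-size loop over the table, the execution time is bounded in advance, which is the property the text points to as deterministic execution duration.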
2.8.2 Petri net implementation

If the fault effect propagation had been described in terms of Petri nets or colored Petri nets (Jensen, 1994) [30], automatic analysis and code generation could also have been possible. However, formal verification of consistency would not be immediately available.
2.8.3 Reflective programming implementation

When it comes to correct implementation and easy maintenance, computer science may have another tempting method for supervisor and detector/effector implementation. Reflective programming (Maes, 1987 [39]; Lunau, 1995, 1997 [36, 37]; Lunau and Nielsen, 1995 [38]) is a technique within the object oriented programming field where one can obtain transparent implementation of the various detectors and effectors. Reflective mechanisms enable objects, with their inherent methods, to be executed in an order which depends on external conditions. The parts of the software in use can thus change dynamically. This is exactly what is needed when reconfiguration shall take place: new controller code shall replace what is obsolete in the new condition, and detectors shall adapt to the changed conditions. The objects which implement the detector and effector functions are referred to as meta-objects.
The reflective paradigm is particularly useful because each detector can be implemented such that it needs to know only the conditions under which it is active. This makes implementation much more versatile and maintainable than traditional case statement implementations, and re-use of detector code becomes a genuine possibility. How to ensure consistency and correctness of a reflective implementation is, however, still an area of active research.
2.8.4 A prototype implementation
The supervisor has been implemented as a BEOLOGIC® generated state-event machine on a small scale prototype.
The methodology for dependable design was implemented as a prototype tool using off-the-shelf software to the extent possible. FMEA schemes for components were entered in a spreadsheet, and a dedicated compiler translated them to a language understood by the BEOLOGIC® inference engine. The logic of this commercial tool is rather more advanced than the basic matrix formulation presented here, and array logic is employed to solve the interconnection and analysis problems (Møller, 1995 [40]; Franksen, 1978 [22]; More, 1981 [41]).
The tool was able to generate the necessary tables for fault handling for the cooling system as desired. A second benefit was easy access to the inverse tables, which show all possible component faults once a certain combination of fault effects is observed. This list could be useful in its own right for fault diagnosis purposes and for advice about the severity of an observed condition. The particular tool offers translation of the logic and thus has a fixed maximum calculation time at run-time.
One difficulty encountered was the analysis of closed loop systems. Equation (2.2) cannot be solved directly for a closed loop configuration, and it is not meaningful to consider loop gain when faults are described only qualitatively, so the inference engine had difficulties. The work-around was to incorporate an additional state with each FMEA block stating whether a logic search had already been through this part of the diagram. The result was easy determination of closed loop paths, which could be used to identify potential points where fault handling could be activated to stop further propagation (Blanke et al., 1995) [7].
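The work-around of marking blocks that a search has already passed corresponds to an ordinary graph search with a visited record. The following sketch shows the idea under assumed names: the block names and the `links` table are hypothetical, and the visited record is kept per search path rather than as a state inside each FMEA block.

```python
def find_loops(links, start):
    """Depth-first search over FMEA blocks that records closed paths
    returning to the start block.

    links maps a block name to the blocks its fault effects propagate to.
    """
    loops = []

    def visit(block, path):
        for nxt in links.get(block, []):
            if nxt == start:
                loops.append(path + [nxt])   # propagation returned to its origin
            elif nxt not in path:            # block not yet searched on this path
                visit(nxt, path + [nxt])

    visit(start, [start])
    return loops


# Hypothetical interconnection of a single control loop with an alarm output.
links = {
    "controller": ["valve"],
    "valve": ["process"],
    "process": ["sensor", "alarm"],
    "sensor": ["controller"],
}

closed_paths = find_loops(links, "controller")
```

Each block on a closed path returned by the search is a candidate point where fault handling could cut the propagation.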
The false detection problem is not solved in this way, however. False detection and noise on FDI residuals may cause considerable diagnosis uncertainty. This problem needs to be solved using, e.g., the usual stochastic detection methods or fuzzy detection techniques, Frank (1995) [21].
Automatic handling of bond graph interconnection and translation to state space models has not been pursued. The reason was that other groups have reported such results, see de Vries (1994) [13].
The prototype tool is certainly far from a full scale implementation, but the experience has shown that the concept as such seems to be worth pursuing at a larger scale.
Figure 2.7: Temperature control loop with 3-way valve.
2.9 Example: Temperature Control

A three-way valve controls the mixing of hot oil from an engine and tempered oil from a heat exchanger. The valve is controlled by the temperature control loop, which consists of:
- actuator with AC motor
- temperature sensor
- controller with process interface
- filter system
The control loop is shown in figure 2.7.

The temperature control loop is a cascade control with position control of the valve as the inner loop. Stability of the total loop is not guaranteed if the inner loop becomes open due to a component fault.
Three-way valve actuator. The valve is driven by an AC motor which is activated by a double acting relay for either direction of rotation. End stop switches are supposed to avoid motor overload and to prevent the motor from being forced into the mechanical stop should the position control loop fail in some way. The potentiometer gives position feedback. The position loop fails if either the potentiometer or the end stop switches fail.
Reconfiguration of the valve control, should this be required, can be done fairly simply.
1. Use an estimate of the valve position in the motor controller instead of a faulty position signal.
2. Override the limit switch information if both the position sensor feedback and an estimated position show that a limit switch fault has occurred.
An observer for this purpose is quite elementary. The estimated valve position is increased or decreased in proportion to the time either of the two motor relays is activated. This requires no additional hardware, only a few lines of observer code. Accommodation of any of these sensor faults will make it possible to continue operation while giving an alert about required maintenance. Without accommodation, the temperature control loop would probably fail due to the loop becoming unstable without the internal position feedback.
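The few lines of observer code mentioned above could look like the following sketch. It assumes a constant motor speed, a normalised position between 0 (closed) and 1 (open), and a known full-stroke travel time; the class and parameter names are hypothetical.

```python
class ValvePositionEstimator:
    """Integrates relay on-time to estimate valve position (0 = closed, 1 = open)."""

    def __init__(self, travel_time_s, position=0.0):
        self.rate = 1.0 / travel_time_s    # fraction of full stroke per second
        self.position = position

    def update(self, open_relay, close_relay, dt):
        direction = int(open_relay) - int(close_relay)       # +1, 0 or -1
        self.position += direction * self.rate * dt
        self.position = min(1.0, max(0.0, self.position))    # clamp at end stops
        return self.position


# Assumed 60 s full stroke: 15 s of opening moves the estimate to about 0.25.
est = ValvePositionEstimator(travel_time_s=60.0)
for _ in range(150):
    est.update(open_relay=True, close_relay=False, dt=0.1)
opened = est.position
for _ in range(300):
    est.update(open_relay=False, close_relay=True, dt=0.1)
closed = est.position
```

The clamping plays the role of the mechanical end stops, so the estimate cannot drift outside the physical range even after long relay activation.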
Figure ?? showed the FMEA scheme in graphic form for the valve control part of the loop. The components are: potentiometer, limit switches, motor, 3-way valve, and digital controller.
Faults in a limit switch will prevent motion in the clockwise or counter-clockwise direction, i.e., opening or closing of the valve. The consequence is a severe offset of the temperature control if fault handling is not initiated. A breakdown of the position feedback element will cause a breakdown of the temperature control loop because the motor will be driven rapidly to the fully open or fully closed position.
Because several faults can cause the same effect, it is necessary to isolate the failure source. When the source is isolated it is possible to decide the reaction:
- Actuator fault (fault in the valve limit switch): the motor must be stopped immediately.
- Position sensor fault: the controller should be re-configured. From the analytical relation between duration of relay pulses and motor shaft position, a position estimate is readily available. The estimate is used until the fault is repaired.
- Temperature sensor fault: the reference to the position controller fails. The controller is re-configured and a time-history roll back is made of the reference signal, and the mean is used as new reference until the fault has been repaired.
These examples show situations where temperature would deviate significantly or the control would simply fail with the existing controller design. Fault handling, by contrast, could assure plant availability with simple means.
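The reaction to a temperature sensor fault can be sketched as a small buffer of past reference values. The sketch assumes that the fault is flagged by a separate detector (`sensor_ok`) and that the mean of the most recent fault-free references is an acceptable substitute; the history length and all names are hypothetical.

```python
from collections import deque


class ReferenceHold:
    """Stores recent fault-free position references and, while a temperature
    sensor fault is flagged, returns the mean of the stored history."""

    def __init__(self, history_len=5):
        self.history = deque(maxlen=history_len)

    def reference(self, computed_ref, sensor_ok):
        if sensor_ok:
            self.history.append(computed_ref)    # normal operation: pass through
            return computed_ref
        if not self.history:
            raise RuntimeError("no fault-free history available")
        return sum(self.history) / len(self.history)


hold = ReferenceHold(history_len=3)
for ref in [0.2, 0.4, 0.6, 0.8]:
    hold.reference(ref, sensor_ok=True)
# With the sensor flagged faulty, the computed reference (9.9) is ignored.
fallback = hold.reference(9.9, sensor_ok=False)
```

In practice the history would have to reach back before the fault onset, since values computed between fault occurrence and detection may already be corrupted.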
2.10 Summary
This chapter has given an overview of ideas for making a systematic design to obtain fault tolerant control. It showed how a matrix formulation of an FMEA method could be adapted to fit the fault detection and isolation problem. State space descriptions of system dynamics and fault propagation could be obtained from generic bond graph models of components. It was shown how the component models could be simplified into generic types for use in the design, and how the generic types were used in the model building stage. It was further shown that the FMEA method and the generic component types enable isolation of failure modes with different degrees of criticality and determination of control system reactions to faults.
A systematic design method was further presented, which led to a three level architecture for a supervisor based fault tolerant controller. Various implementation problems were discussed.
The main contribution was the suggestion of a new method for systematic capture of requirements for fault detection and accommodation, and a systematic way of specifying FDIA properties related to component failure modes. A salient feature was shown to be the completeness properties obtained with this method when combined with array theory based implementation of the supervisor logic.
Chapter 3
Control System Interface with Physical Plant
Reliability is a key issue when designing interfaces between a control system and the physical plant. High availability is mandatory, and increased complexity of automation has increased the dependency on sensors, remotely controlled actuators, and their interfaces with computerized control systems.
Computer controlled parts of a machinery system or other process are vulnerable to sensor faults, because a control loop may amplify the effects of a fault, and safety systems depend on sensor information to make immediate slow-down or shut-down of essential equipment. Even simple faults can therefore cause failure or shut-down of an entire subsystem in the plant.
The ability to detect faults in sensors and actuators, and thus to avoid undesired consequences of simple faults, depends to a large extent on the interface between machinery systems and computers.
Requirements for sensors, actuators, and their interfaces are discussed in this chapter. A fundamental prerequisite is shown to be that sensor and actuator interfaces must be designed such that fault detection is possible. This problem is shown to address both the hardware interface and the software methods that, together, make it possible to detect and isolate faults.
Based on the analysis presented, requirements for the interface between computers and a physical plant are suggested for a selected set of sensors and actuators. The requirements address the combination of hardware and software needed to detect, isolate, and react to generic types of faults.
3.1 Component Failure Modes

This section gives an assessment of fault types for selected, commonly used components in machinery systems. Each sub-section gives an overview of different component principles, and selected components are described in some detail based on information extracted from supplier data sheets. Fault models are developed in a manner that enables later fault analysis of a marine lubrication oil ancillary system, which was selected as an example for demonstration of the analysis methods.
3.1.1 Sensor and Actuator Types

An overview of sensor and actuator types available for different supervision and control purposes in process control is provided in this sub-section. A more detailed description is given for the most commonly used types. The selection of these types is made such that the most relevant components in fluid processes are included.
Sensors:
1. Level measurement
   - Level transducer based on pressure/strain gauge measurement. (Analog signal)
   - Level switch based on a float. (Binary signal)
2. Temperature measurement
   - Temperature transducer based on resistance measurement.
3. Pressure measurement
   - Pressure transducer based on strain gauge measurement. (Analog signal)
   - Pressure switch. (Binary signal)
4. Flow measurement
   - Flow meter based on rotor revolutions.
5. Position measurement
   - Potentiometer.
Actuators:
1. Actuators for flow control
   - Three-way valve with motor control.
2. Pumps
   - Standby pump set with remote control.

3.1.2 Level Measurement
Measurement of level is needed for assessment of the volume of the contents of tanks holding liquids. The following methods are commercially used:
1. Sounding of the height of the surface relative to the top of a tank:
   - electro-magnetic impulse travel time (microwave radar principle)
   - ultrasound impulse travel time or standing wave reflection
2. Indirect assessment through measuring the pressure near the tank bottom:
   - measurement of pressure at the tank bottom. The level calculation depends on fluid specific weight and temperature.
   - measurement of air pressure in a tube that extends to the tank bottom, where an opening in the tube allows air bubbles to be released.
3. Measurement of tank level by a float with an angle transmitter or a switch that indicates low or high level.
Measurements made according to principle a) (item 1 above) are the most accurate but also expensive. Calculation of the volume of tank contents from the level measurement requires a calculation, or table lookup, to take account of tank geometry. On a ship, compensation for trim and heel angles may also be needed. Calculation of the mass contained in a tank from the calculated volume requires measurement of the temperature of the tank contents.
Measurement according to principle b) (item 2 above) will require the same calculations. In addition, however, conversion from pressure, p, to level, L, depends on the specific weight, $\rho(T)$, of the liquid in the tank. The latter is a function of the temperature, T:

\begin{align}
L &= L_o + \frac{\Delta p}{\rho(T)} \\
\Delta p &= p - p_{atm} \tag{3.1}
\end{align}

where
$L_o$ is the transducer offset from the tank bottom
$p$ is the tank pressure
$p_{atm}$ is the atmospheric air pressure
$\rho(T)$ is the specific mass (temperature dependent)

Principle c) (item 3 above) is inherently non-linear. It is well suited where a binary indication is needed to show whether the tank contents are above or below a certain level.
Level Transducer
The level measurement system considered here is of type b). It consists of a strain gauge pressure transducer mounted in the tank and a signal amplifier/transmitter. The transducer and the transmitter are interconnected by means of a vented cable. The vent tube in the cable provides the transducer with the reference pressure p_atm.
The pressure transducer consists of a sensing diaphragm and a resistive strain gauge bridge. The bridge converts the diaphragm deformation, due to the pressure difference, to a voltage. In the transmitter the bridge output is converted to a 4-20 mA current signal. The bridge is supplied from the transmitter board, which takes the necessary power from the 4-20 mA line. Figure 3.1 shows the principle of the measurement system with electrical wiring.
A range selector combined with span and zero adjustment potentiometers is available for calibration of the system.
The input to the level transducer is the strain gauge bridge signal, which is caused by deformation of the sensing diaphragm due to a pressure difference over the diaphragm. The output is a 4-20 mA current signal. The tank level is almost proportional to the deformation. If the level, current and pressure spans are L_span, i_span and p_span respectively, then the linear, relative relationship between input and output (level relative to electric current) is:

\begin{align}
\frac{\hat{L}}{L_{span}} &= \frac{\Delta p}{p_{span}} \frac{\rho(T_o)}{\rho(T)} + \frac{L_o}{L_{span}} \\
\frac{\hat{L}}{L_{span}} &= \frac{i - i_o}{i_{span}} + \frac{L_o}{L_{span}} \tag{3.2}
\end{align}

where $\Delta p = p - p_{atm}$.
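As a worked illustration of the relation between loop current and level, the following sketch converts a 4-20 mA reading to a level estimate. The zero current of 4 mA and span of 16 mA follow from the 4-20 mA signal; the tank values, the function name and the optional density correction factor rho(T_o)/rho(T) are assumptions made for the example.

```python
def level_from_current(i_mA, l_span, l_offset, i_zero=4.0, i_span=16.0,
                       density_ratio=1.0):
    """Level estimate from a 4-20 mA signal, following eq. (3.2).

    density_ratio = rho(T_o) / rho(T) corrects for liquid temperature;
    1.0 means the liquid is at the calibration temperature T_o.
    """
    relative = (i_mA - i_zero) / i_span        # (i - i_o) / i_span
    return l_span * relative * density_ratio + l_offset


# Assumed tank: 5 m level span, transducer mounted 0.2 m above the bottom.
mid_level = level_from_current(12.0, l_span=5.0, l_offset=0.2)
empty_level = level_from_current(4.0, l_span=5.0, l_offset=0.2)
```

With these assumed values a reading of 12 mA, half of the current span, corresponds to half of the level span plus the transducer offset.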
Level Switch
Most level switches use principle c) above. They are used to indicate a full or empty tank, and to provide a binary indication of high/low level. One level switch has one of these functions. Commercial level switches have double contacts to enable detection of contact faults. One set is off when the other is on. Figure 3.1 shows the electrical switch arrangement.
Figure 3.1: Switch arrangement for level switch. The A-A contact set is used for low level indication and the B-B set for high level indication, as contacts are always arranged as closed in normal condition.
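The benefit of double contacts, where one set is off when the other is on, can be shown with a small consistency check. This is a sketch of the principle only: the function name and the encoding of the two contact sets as booleans are assumptions, and a real interface would also have to tolerate the short interval in which the contacts change over.

```python
def read_level_switch(contact_a, contact_b):
    """Returns (state, fault) for a switch with two complementary contact sets.

    In a healthy switch exactly one set is closed. If both sets read the same
    value, a contact or wiring fault is assumed and no state is reported.
    """
    if contact_a == contact_b:
        return None, True
    return contact_a, False


healthy = read_level_switch(True, False)
broken = read_level_switch(False, False)
```

A single contact could not distinguish an open switch from a broken wire, whereas the complementary pair makes that fault directly observable.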
3.1.3 Temperature Measurement

Measurement of temperature is needed for various monitoring and control purposes. The type of sensor used depends primarily on the temperature level and range needed.
1. Temperature range -50 to 300 deg C. PT100 sensors are most feasible in this range. PT100 sensors comprise a platinum resistance element encapsulated in glass. The measurement principle is based on the change of resistance with temperature.
2. Temperature range 300 to 600 deg C. Thermocouple sensors are most feasible in this range. Ni/Cr sensors are the most commonly used. The measurement principle is that the voltage over a junction between two metals will change with temperature. A thermocouple measurement requires compensation by a reverse junction at a known temperature. There are also special requirements for wire materials. This makes wiring for thermocouples more expensive.
Industrial temperature sensors are normally made with a metallic enclosure for the sensor element, for reasons of mechanical robustness. A housing for wire termination is mounted on the enclosure.
PT100 sensor applications include tank temperature and temperature of liquids like cooling water, lubrication oil, etc. Thermocouple applications are mainly exhaust gas temperature and similar high temperature measurements.

Figure 3.2: 3-wire measurement of the resistance in the Pt element to measure temperature with compensation of wire resistance.
PT100 Temperature Transducer

The temperature transducer considered here is of type a). The PT100 temperature transducer consists of a platinum measurement element with a nominal resistance of 100 ohm at 0 deg C. The resistance of the element changes almost linearly with temperature. The nonlinearity (eq. 3.3) nevertheless needs to be considered when an accuracy better than 0.5 deg C is needed or the range exceeds 0-100 deg C.
Resistance measurement is, in most cases, measurement of the voltage resulting from passing a known current through a resistance element. To avoid errors from resistance in connecting cables, a 3 or 4 wire measurement is needed.
A 3-wire measurement is illustrated in figure 3.2. The 3-wire coupling eliminates wire resistance if Rw1 and Rw2 are equal, because the measurement wire Rw3 carries no current.
The resistance of the measurement element has a nearly linear dependency on the temperature. The general relationship between the resistance and temperature of the element is:
\begin{equation}
R_{PT100} = R_0 (1 + \alpha T + \beta T^2 + \gamma T^3 + \dots) \tag{3.3}
\end{equation}

where
$R_0$ is the resistance at 0 deg C
$\alpha$, $\beta$ and $\gamma$ are temperature coefficients
$T$ is the temperature of the element in deg C

The magnitudes of the higher order terms in the above equation are small, and the relation between resistance and temperature can be considered linear in the 0-100 deg C range within an accuracy of 0.1 deg C:

\begin{equation}
R_{PT100} = R_0 (1 + \alpha T) \tag{3.4}
\end{equation}
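and the temperature estimate, expressed in terms of the measured voltage $V_{PT100}$ and the reference current $i_{ref}$, is

\begin{equation}
\hat{T} = \frac{1}{\alpha R_0} \left( \frac{V_{PT100}}{i_{ref}} - R_0 \right) \tag{3.5}
\end{equation}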
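The linear conversion from measured voltage to temperature can be sketched as follows. The nominal resistance of 100 ohm at 0 deg C is from the text; the coefficient value 3.85e-3 per deg C is the figure commonly quoted for standard industrial PT100 elements and is an assumption here, as are the function name and the 1 mA reference current in the example.

```python
def pt100_temperature(v_pt100, i_ref, r0=100.0, alpha=3.85e-3):
    """Temperature in deg C from the linear model R = R0 (1 + alpha T).

    v_pt100 is the voltage over the element in volts and i_ref the known
    reference current in amperes. alpha is an assumed typical value.
    """
    resistance = v_pt100 / i_ref               # Ohm's law for the element
    return (resistance / r0 - 1.0) / alpha     # inverse of the linear model


# Assumed 1 mA reference current: 138.5 mV corresponds to 138.5 ohm.
t_hot = pt100_temperature(0.1385, 1e-3)
t_zero = pt100_temperature(0.1000, 1e-3)
```

The sketch assumes that wire resistance has already been removed by the 3-wire or 4-wire arrangement, so the voltage is the one over the element alone.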
3.1.4 Pressure Measurement

Pressure measurement is needed in a large number of machinery subsystems for reasons of monitoring and control. It is needed in various applications where the pressure itself is the key parameter. It is also used for indirect assessment of flow through units like heat exchangers, filters, and associated piping.
Pressure measurement can be:
1. absolute, with analog measurement of pressure
2. differential, i.e., the pressure difference between two measurement points, e.g., the difference between inlet and outlet of a pump or a filter. This type is also analog.
3. absolute measurement with binary indication of pressure above or below a set value
4. differential measurement with binary indication of pressure above or below a set value
The analog transducers will often use a strain gauge measurement. The binary types will most often use a spring load to determine the switch-over point.
Pressure measurement accuracy is in many cases limited by hysteresis properties of the material of the sensing element.
Examples of application of absolute measurement are steam pressure in a steam generator, starting air for a diesel engine, or hydraulic pressure in steering gear supply pumps.
Safety shut-down of a subsystem is often based on binary pressure indication. A low lubrication oil pressure will, as an example, cause an immediate diesel engine shut-down.
Pressure Transducer
The pressure transducer considered here is of type b), and based on strain gauge measurements. The transducer and transmitter are combined in one unit. Power to the unit is taken from the 4-20 mA interface. Figure 3.3 shows the pressure measurement principle and electrical connections.

Figure 3.3: Pressure measurement using a strain gauge bridge fitted to a membrane, converted to a 4-20 mA current out of the transducer.

The relationship between the pressure and the output current is:

\begin{align}
i &= \frac{i_{span}}{p_{span}} \Delta p + i_0 \\
\Delta p &= p - p_{atm} \tag{3.6}
\end{align}
Pressure Switch
The pressure switch considered here is of type c). It is a pressure controlled switch, where the position of the switch depends on the adjusted set-point value and the pressure in the connection.
Figure 3.4 illustrates the operation of the switch. The contacts 1-4 close while 1-0 break as the pressure rises above the set-point value. The contacts return to the initial position when the pressure falls to the set-point value minus the mechanical hysteresis.
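The switching behaviour with mechanical hysteresis can be modelled in a few lines, which is useful when simulating the binary pressure indication. The class name, the numerical values and the representation of the contact position as a single boolean are assumptions for illustration.

```python
class PressureSwitch:
    """Binary pressure indication with mechanical hysteresis."""

    def __init__(self, set_point, hysteresis):
        self.set_point = set_point
        self.hysteresis = hysteresis
        self.active = False    # False: initial contact position, True: switched over

    def update(self, pressure):
        if pressure > self.set_point:
            self.active = True                      # rises above the set-point
        elif pressure <= self.set_point - self.hysteresis:
            self.active = False                     # falls to set-point minus hysteresis
        return self.active


# Assumed set-point 5.0 and hysteresis 0.5 (same pressure unit).
switch = PressureSwitch(set_point=5.0, hysteresis=0.5)
states = [switch.update(p) for p in [4.0, 5.2, 4.8, 4.6, 4.4]]
```

Between the two thresholds the switch keeps its previous state, so the indication depends on the pressure history and not only on the current value.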
3.2 Angular Position Measurement

Angular position measurement is needed in remotely controlled valves and in a large number of other devices. The measurement principles for angle are numerous, and will not be discussed in this context. One very popular component is the electrical potentiometer. It is available in both rotating and linear versions.

Figure 3.4: Binary pressure indicator. The solid line is the mechanical difference (hysteresis) and the bottom of it is the adjustable set-point value.

Principles for very accurate applications include: optical encoder principles, magnetic induction (the inductosyn) and synchro transmitter measurements. Optical encoders and inductosyns are made in both rotating and linear versions. Linear position measurement is also available from differential transformer based sensors.
Magnetostrictive materials have been used for robust components since about 1980. These elements are very robust but nonlinear, in the order of magnitude of 2-5%.
3.2.1 Potentiometer
A potentiometer changes the position of contact between a resistance element and a wiper when the turning angle is changed. The potentiometer can be considered a voltage divider with a division ratio that is a function of the turning angle. Linear potentiometers have a very accurate linear relation between turning angle and division ratio. Figure 3.5 shows the typical connection diagram. Fault detection ability is discussed in a subsequent section.
3.2.2 Flow Measurement

Measurement of flow is used for loading/discharge of liquid cargo and supplies. It is also used for consumption assessment, e.g., of diesel, fuel, and lubrication oils. Available measurement principles include:
1. Rotation of a propeller or similar rotor by passage of the liquid. The rotor revolution count is proportional to the volume of fluid that has passed the sensor. The rotating device will have contact with the liquid, while the rotation sensor(s) can be made to work without being in direct contact with the liquid. Accuracy is usually satisfactory for consumption measurement because the basic principle is a volume measurement.
2. Shift in resonance frequency when a velocity difference exists in two paths for the fluid, and this path is exposed to a magnetic field. Sensor components are not in contact with the liquid, and obtainable accuracy is very high (0.1%). The electronic circuitry is complex, and therefore expensive.
3. Doppler shift of an ultrasound signal across a section of a tube carrying the liquid. Transducers using this principle have no sensitive components in contact with the liquid. Accuracy is usually considered high.
4. Pressure drop over a cavity of the pipe carrying the liquid (Pitot tube). Flow and pressure are mutually dependent according to Bernoulli's law. Measurement accuracy is normally considered insufficient for consumption assessment, because the principle is basically flow based. The pressure sensor will normally be in contact with the liquid.

Figure 3.5: Electrical diagram of potentiometer and computer interface to enable fault detection at the single sensor level.

Flowmeter
The flow measurement principle considered here is of type a). The flow meter consists of a housed 4-bladed rotor which is placed in the fluid stream. The axis of rotation of the rotor is parallel to the direction of the flow. The incoming fluid forces the blades to rotate at an angular velocity approximately proportional to the flow rate. A magnetic coupling transmits the rotor rotation to an inductive pulse transmitter. A pulse discriminator may be included as an option.
A pulse discriminator prevents measurement faults due to pipeline vibrations, pressure fluctuations, or non-steady flow. The problem is that an error will occur if a back and forth fluctuation of the rotor creates multiple forward pulses. By using two pulse transmitters, which generate two signals with a phase shift of 90 degrees, these measurement errors can be eliminated.
The electrical output from the flow meter is a binary signal. The supply is a voltage of 24 V DC. The load current will change value according to the state of the binary signal.
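The way two signals with a 90 degree phase shift remove the error from back and forth fluctuation can be sketched as a direction-aware counter. This is a generic quadrature decoding sketch and not a description of the discriminator in any particular flow meter; the sample encoding as (A, B) pairs and the function name are assumptions.

```python
# Order of the (A, B) signal pair during forward rotation (Gray code sequence).
_FORWARD = [(0, 0), (1, 0), (1, 1), (0, 1)]


def net_quarter_steps(samples):
    """Net number of forward quarter-steps in a sequence of (A, B) samples."""
    count = 0
    prev = samples[0]
    for cur in samples[1:]:
        step = (_FORWARD.index(cur) - _FORWARD.index(prev)) % 4
        if step == 1:
            count += 1      # one quarter-step forward
        elif step == 3:
            count -= 1      # one quarter-step backward
        # step == 0: no change; step == 2: invalid double jump, ignored here
        prev = cur
    return count


# One full forward cycle, and a rotor that only jitters around one edge.
forward = net_quarter_steps([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)])
jitter = net_quarter_steps([(0, 0), (1, 0), (0, 0), (1, 0), (0, 0)])
```

A counter on signal A alone would register two pulses in the jitter sequence, whereas the direction-aware count returns to zero.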
3.3 Actuators for Flow Control

Flow ontrol is the most widespread a tuator fun tion in ma hinery systems.
It is used where a shut o of a pipe onne tion is needed, where a medium
need to be ow ontrolled, and where ontrol loops manipulate a ow of
a liquid to ontrol temperature. Flow ontrol an be open/ losed, variable
throughput, or redire tion of ow from one pipe into two (3 way valves).
Only remote ontrolled valves are relevant here.
1. Valve for open-close of pipe connection. Actuation can be:
   - hydraulically activated. Open when pressure is applied and closed with no pressure - or reverse.
   - electrically activated.
   Indication can be analog or binary for these valves. Binary indication can be limited to "closed", and "not closed" should then be interpreted as open. However, most valves have indication for both open and closed positions.
2. Two way valve. Sends flow in one direction in a hydraulic circuit, stops the flow in neutral position, or reverses the flow when activated in the opposite direction. Two way valves are used in hydraulic control circuits.
   - activation is electric with proportional or bang-bang (solenoid) type of control valve.
3. Valve for variable bypass. A rotor is turned mechanically to change the opening area of the valve.
   - hydraulic activation: the volume of hydraulic oil in a rotary vane or in cylinders determines the mechanical position of the rotor. Oil flow is controlled by a smaller, electrically activated valve.
   - electro-mechanical activation: an electric motor with gear turns the rotor shaft of the valve.
   A position sensor serves as feedback element. Switches are, in many cases, mounted to provide end position indication for both fully open and fully closed.
4. Three way valve for redirection of flow. Is used in temperature control systems, for example, to pass a sufficient fraction of flow through a cooler to obtain a desired temperature. Opening and closing of the valve will change the ratio between flow through the cooler and that bypassing it. Actuation can be as for the variable bypass valve (type 3). Electrical activation is the most common.
In this context, we need to describe an electro-mechanical three-way
valve, type 4, in more detail.
3.3.1 Three-way Valve

The actuator system consists of a three-way valve. It has a common port
and two other ports, referred to as A and B. The rotor position determines
the opening area between the common port and ports A and B. The valve
distributes the flow between ports A and B. The distribution is controlled
by the rotor angle. An electro-mechanical device is attached to the valve to
control the rotor position.
Two modes of service exist for the valve: diverging and converging. In
diverging service, flow enters the common port and flows out of one or both
of the two other ports (A and B). In converging service, flow enters from
port A and/or B and leaves through the common port. The position of the
rotor determines whether one or both of the ports A and B are in use.
The relation between rotor position and flow through the ports A and
B is nonlinear in converging mode and linear in diverging mode. Figure 3.6
shows the valve characteristic for ports A and B.
The electro-mechanical positioner consists of a motor and a gear. The
rotor position is changed by running the motor in clockwise (CW) or counter-clockwise (CCW) directions. The motor can be in one of the following states:
stopped, rotation CW, or rotation CCW.
The state is controlled by activation of two relay contacts. They are
denoted Open (o) and Close (c) respectively. A potentiometer is used to
measure the actual rotor position. Figure 3.7 shows the principle of the
actuator operation and the electrical connections.
[Plot: percent of full travel and percent of total flow (0 to 100) for ports A and B, with curves for diverging and converging service.]
Figure 3.6: Valve characteristics for diverging and converging operation (the
use of A or B ports for inflow).

[Diagram: 220 V AC supply, motor with Open and Close relay contacts, heaters HTR1 and HTR2, limit switches (LS), torque switches (TS), and potentiometer with terminals A, B and C.]
Figure 3.7: Operation of 3-way valve actuator with relay operated induction
motor. Abbreviations are: o: open, c: close, s: stop, HTR: Heater, LS: Limit
Switch, TS: Torque Switch.
Limit switches on the rotor provide indication of rotor end positions to
permit adequate controller design and fault detection. Torque switches are
mounted to provide overload protection.
3.3.2 Pumps
Pumps are used to drive a liquid through a machinery system or to move
a liquid from one tank to another. Electrically driven pumps are normally
used. Critical functions, like lubrication oil supply and cooling water for the
main engine, are served by redundant pumps that automatically start if
supply pressure drops below a certain level. It is essential that the standby
pump control has no possible single point failures.
Standby Pump Set with Remote Control

The pumps considered are electrically driven. The pumps run with constant speed and can be in one of two states: stopped or running. Control can be local or remote. The pump provides a pressure and a flow which are related through the pump characteristic, which depends somewhat on the degree of wear of the pump's rotor. The operating point on the characteristic depends solely on the flow resistance of connected equipment.
A pump is started by closing a start contact at signal current level. The contact activates a motor starter which will turn the power on to the pump. Startup can be gradual with two or more steps because a pump is a fairly large consumer on a ship's power system. A two step starter has a set of resistors which are in series with the pump for a number of seconds after startup. The resistors are bypassed after the startup period has elapsed. The switching is done by contactors (a three phase relay capable of handling the large currents is needed).
A standby pump set consists of two pumps with individual starters and a pressure measurement in the outlet from each pump. If the measured pressure on one pump is lower than a predetermined value, the other is automatically started up. Figure 3.8 illustrates a standby pump control system.

[Diagram: Control of Twin Stand-by Pump System. Computer 1 and Computer 2; Pump 1 and Pump 2 with motors M1 and M2, motor starters 1 and 2 (start, stop, rem.loc.blok), pressure switches PS 1 and PS 2, and non return valves.]
Figure 3.8: Standby pump set with remote control. The control computers
are independent and have mutual supervision.
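The start rule for the standby pump set can be written down in a few lines. This is a minimal sketch under stated assumptions: the function name, the argument names and the pressure limit `p_min` are hypothetical, and start sequencing, mutual supervision between the two computers and local/remote blocking are left out.

```python
def standby_start_commands(p1, p2, running1, running2, p_min):
    """Return (start1, start2) for a twin standby pump set.

    p1, p2       outlet pressures of pump 1 and pump 2
    running1/2   True if the respective pump is running
    p_min        predetermined pressure limit (hypothetical parameter)

    A stopped pump is commanded to start when the other pump is running
    but delivers less than p_min at its outlet.
    """
    start1 = running2 and p2 < p_min and not running1
    start2 = running1 and p1 < p_min and not running2
    return start1, start2


# Pump 1 is running but its outlet pressure has dropped below the limit:
print(standby_start_commands(2.0, 0.0, True, False, p_min=3.0))
```

The rule only covers the pressure criterion stated in the text; the requirement that the standby control has no single point failures has to be met by the surrounding architecture.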
3.4 FMEA Schemes for Sensors and Actuators

3.4.1 Sensor Faults
Models for each sensor's potential failure modes and their effects are developed in this section. The models are presented in a manner that enables
direct use in the later FMEA on the lubricating oil ancillary system.
Level Transducer
With reference to Fig. ?? the following table is developed.

Comp./effect | Too low signal | Signal not related to physics | Fluctuating signal | Too high signal
Input | - | vent. tube clogged | - | -
Output | wire broken | - | connection fault | short circuit
Comp. | transmitter adj. fault | salt water, sensor damage | transmitter defect | transmitter adj. fault
Level Switch
Comp./effect | Too low signal | Signal not related to physics | Fluctuating signal | Too high signal
Input | - | - | - | -
Output | low level signal, e.g. broken wire (low alarm config.) | - | connection fault | high level signal, e.g. broken wire (high alarm config.)
Comp. | - | mech. damage | - | -
Temperature Transducer
Comp./effect | Too low signal | Signal not related to physics | Fluctuating signal | Too high signal
Input | - | mounting fault | - | -
Output | short circuit | - | loose connection | broken wire
Comp. | e.g. salt water | sensor element fault | sensor element fault | sensor element fault
Pressure Transducer
Referring to Fig. ?? the following FMEA table is developed for the pressure
transducer.
Comp./effect | Too low signal | Signal not related to physics | Fluctuating signal | Too high signal
Input | - | pipe broken, ref. press. failure | - | -
Output | broken wire | - | loose connection | short circuit
Comp. | amplifier failure | waterfilled, amplifier failure, unit damage | amplifier failure | amplifier failure
Pressure Switch (Low level indicator)

With a binary pressure switch, the normal condition needs to be defined.
Take the normal condition to be pressure above a setpoint. The output contact
is closed in this condition (NC). The following table is developed.
Comp./effect | Too low signal (closed contact) | Signal not related to physics | Too high signal (closed contact) | Fluctuating signal
Input | blocked or leak in pipe | setpoint error | setpoint error | vibration
Output | wire broken | loose connection | short on contact wires | -
Comp. | - | waterfilled, unit damage | increased hysteresis | increased hysteresis
Potentiometer

Comp./effect | Too low signal | Not related to angle | Fluctuating signal | Too high signal
Input | broken wire at A, short at A-B | loss of supply | vibration | broken wire at A, short circuit A-C
Output | short B-C | broken wire at C | - | -
Comp. | - | stuck, shaft or element broken | loose connection | wiper fault
Flowmeter
For the flow-meter, a fluctuating signal can only occur if the
instrument has double pickups and indication of direction of rotation.
Comp./effect | Too low signal | Signal not related to physics | Fluctuating signal | Too high signal
Input | power loss | - | - | -
Output | wire broken, short circuit | - | - | -
Comp. | rotor stuck | press. fluctuation, vibration | press. fluctuation, vibration | rotor, pick-up damage
3.4.2 Actuator Faults

The position of the three-way valve is controlled. The controller consists
of a potentiometer for position measurement, limit switches for end-stop
indication and the three-way valve. The failure modes for the limit switch
and the valve are given below.
Limit Switch (normally closed).

Normally closed means that the contact is closed when no voltage is applied.
Comp./effect | End pos. reached, but not indicated | Signal not related to physics | Fluctuating signal | End pos. indicated, but not reached
Input | - | - | - | -
Output | short circuit | loose wire, connection problem | - | broken wire
Comp. | mech. damage | - | mech. damage | mech. damage
Three-way Valve
With reference to Fig. ?? the following tables are developed.
Fault effects (columns): flow too low; flow not related to control angle; fluctuating flow output; flow too high.
Input: pipe clogged, set-point low, pipe leak, power low; too high setpoint; input flow fluctuating; pipe broken, set-point high.
Output: pipe clogged; pipe A or B broken, clogged, or leak.
Comp.: damage, wear; hysteresis; damage, wear.
3.5 Requirements to Interface

This section elaborates on the interface to these components and discusses
how various component faults can be detected. First, the set of components is categorized according to type of electrical interface. Second,
each fault type is treated, and hardware requirements to the computer interface are derived. Third, signal characteristics are taken into account, and
methods for fault detection and isolation are discussed. Finally, results of
the analysis are summarized in the form of requirements to the combined
hardware and software interface.
3.5.1 Component Categorization According to Electrical Interface
The components selected and discussed in chapter 6 were seen to have only a few standardized types of output. The
computer interface for each component is listed in the tables below.
3.5.2 Sensors
Sensors | Output from comp. | Computer interface | Comments
Level sensor, differential pressure meas. | Strain-gauge bridge to transmitter, 4-20 mA output | Current | 24 V DC supply from computer
Pressure transducer, abs. or differential | Current 4-20 mA | Current | 24 V DC supply from computer
Pressure switch, abs. or differential | NC contact | Digital | -
PT100 element | Resist. varies with temp. | Resistance measurem. with wire comp. | Constant current supply from interface
Angle: pot.meter meas. | a) voltage divider, ratio varies with angle; b) resist. varies with angle | a) 3 terminal measurem.: supply voltage to resist. element, measure at wiper; b) resist. measurem. with compens. of wire resist. | -
Norm. closed switch | a) Contact; b) Contact, resistor in parallel | a) Digital input; b) Resistance measurement | a) No wire supervis.; b) With wire supervis.
Norm. open switch | a) Contact; b) Contact, resistor in parallel | a) Digital input; b) Resist. measurem. | a) Without wire supervision; b) With wire supervis.

Sensor outputs are seen to belong to one of the categories listed below.
Sensor electrical output | Supply from computer | Measurem. by computer interface | Comments
Resistance | DC current, usually 1-5 mA | a) 1 voltage measurem.; b) 2 voltage measurem. | a) 4 wire connection; b) 3 wire connection
Pot.meter (a) | a) DC voltage | a) Wiper voltage measurem. | -
Pot.meter (b) | b) see resist. meas. | b) resistance meas. | -
Current | DC voltage supply, usually 24 V | Voltage meas. over reference resist. | -
Voltage | None | Voltage | External power supply
NC contact | Voltage or current supply | a) Digital input; b) Low precision resistance meas. | -
NO contact | Voltage or current supply | a) Digital input; b) Low precision resist. meas. | -
3.5.3 Single Sensor Fault Detection

The detection of a fault in any of the above elements depends on whether the
fault causes a deviation from normal that can be recognized from a voltage
measurement.
Basic principles for single sensor fault detection are:
1. Range check: Check whether specified upper and lower limits are exceeded
2. Slew rate check: Check whether a specified rate of change has been
exceeded
3. RMS value check: Check whether the RMS value of the signal exceeds
a specified limit
Range Check
Range checking is a very efficient way to detect broken wires or short circuits
between wires for all voltage or current based sensors. The requirement to
hardware is that all such faults lead to a transition of the measured voltage
into an "out of range" region.
The time interval elapsed from the time the fault occurs until it is detected depends on how fast a limit is reached. The allowable time to detect
depends on the actual use of the sensor signal. This is discussed below.
For 4-20 mA current output from sensors/transmitters, the ranges come naturally. The requirements are
1. A/D converter range is 0 to 24 mA.
2. If the current exceeds 24 mA, or is below 0 mA (reverse current), the
converter must indicate 24 mA or 0 mA respectively, and not wrap
around.
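A range check for a 4-20 mA loop read through a 0-24 mA converter could look as follows. This is a sketch under assumptions: the function name and the alarm limits of 3.5 mA and 21.0 mA are hypothetical choices for illustration; actual limits depend on the transmitter and the converter.

```python
def check_current_loop(i_ma, low=3.5, high=21.0):
    """Classify a 4-20 mA loop current read by a 0-24 mA A/D converter.

    low and high are hypothetical alarm limits placed just outside the
    4-20 mA live range.
    """
    # The converter must saturate at its range ends and never wrap around.
    i_ma = min(max(i_ma, 0.0), 24.0)
    if i_ma < low:
        return "low"   # e.g. broken wire or loss of supply
    if i_ma > high:
        return "high"  # e.g. short circuit
    return "ok"


print(check_current_loop(0.0), check_current_loop(12.0), check_current_loop(30.0))
```

The saturation step mirrors requirement 2: a converter that wrapped around could map a fault current back into the valid range and hide the fault.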
For voltage based measurements, range checking requires that all short
circuit and open circuit conditions can be detected:
1. any wire shorts to any other wire
2. any wire shorts to ground
3. any wire is open
For potentiometer measurements, detection requires:
1. Voltage supply is unipolar. A symmetrical supply around zero, as is
common practice, can not detect shorts between ground/zero level
and wiper.
2. Input circuit on voltage amplifier is driven out of range if input wire
is open.
Point 2 implies that a current (iF in figure 6 in chapter 6) is injected
into the measurement circuit. When an "open" fault occurs, the input voltage will change with a rate of change that depends on circuit capacitances
and the magnitude of the current injected. As the current may have to be
chosen small in order to avoid too heavy an impact on measurement nonlinearity/accuracy, the rate of change may be too small to meet the required time to
detect the fault. If so, slew rate checking can be adopted.
Slew rate check
Slew rate is the change of signal between consecutive samples - the "derivative" of the signal. If an open fault occurs, input circuits should be designed
such that the slew rate of the signal in this condition is several times higher
than possible in normal operation. A slew rate detector can then be used
to considerably reduce the time to detect.
The slew rate algorithm should be robustly implemented with adequate
filtering that can be tuned with the time constant \tau_d:
    x(k) = \frac{1}{\tau_d} \left( y(k) - \hat{y}(k) \right)
    \hat{y}(k+1) = \hat{y}(k) + T_s x(k)                              (3.7)
where T_s is the sampling time, y is the measured signal, \hat{y} is the filtered signal, and x is the slew rate
estimate.
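The filter in Eq. (3.7) can be tried on a ramp signal. The sketch below is illustrative only; the function name and the choice to initialise the filter state at the first sample are assumptions.

```python
def slew_rate_estimates(y, ts, tau_d):
    """Filtered slew rate estimate following Eq. (3.7).

    x(k)       = (y(k) - y_hat(k)) / tau_d
    y_hat(k+1) = y_hat(k) + ts * x(k)
    """
    y_hat = y[0]  # assumption: filter state starts at the first sample
    x = []
    for yk in y:
        xk = (yk - y_hat) / tau_d
        x.append(xk)
        y_hat += ts * xk
    return x


# Ramp with slope 1.0 per second, sampled every 0.1 s.
ts = 0.1
ramp = [0.1 * k for k in range(200)]
x = slew_rate_estimates(ramp, ts=ts, tau_d=0.5)
print(round(x[-1], 6))  # settles at the slope of the ramp
```

A smaller tau_d gives faster response but passes more measurement noise into the estimate, so the detection limit has to be set with that trade-off in mind.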
RMS value check
With reference to the fault schemes, signal fluctuation is sometimes a symptom of a fault in development. In addition to signal fluctuations and wiring
defects in development, electro-magnetic disturbances will cause an increase in the
Root Mean Square (RMS) value of a signal. Electro-magnetic interference
should be damped by filtering and screening of cables. An increase in the RMS
value may therefore be an indication of a screening or EMC defect or other
faults in the interface.
An estimator for the RMS value is best made recursively. This
also requires a recursive calculation of the signal's mean:
    \bar{y}(0) = y(0)
    \bar{y}(k+1) = \bar{y}(k) + \frac{1}{k+1} \left( y(k+1) - \bar{y}(k) \right)
    \sigma^2(k+1) = \sigma^2(k) + \frac{1}{k} \left( (y(k) - \bar{y}(k))^2 - \sigma^2(k) \right)       (3.8)
where N is a fixed number (the horizon), \bar{y} is the mean, and \sigma is the RMS value.
In order to determine whether a change has happened in the RMS value,
a hypothesis test should be made. This is fairly simple but is not within the
scope of this report.
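A recursive estimator in the spirit of Eq. (3.8) is sketched below. It is not a literal transcription: as an assumption, the gain is taken as 1/k until k reaches the horizon N and is then held at 1/N, which is one common way to give the estimator a finite memory. The class and attribute names are hypothetical.

```python
import math


class RecursiveRms:
    """Recursive mean and RMS-about-the-mean with a fixed horizon."""

    def __init__(self, horizon):
        self.horizon = horizon  # N, the fixed horizon
        self.k = 0              # number of samples seen
        self.mean = 0.0
        self.var = 0.0

    def update(self, y):
        """Process one sample and return the current RMS estimate."""
        if self.k == 0:
            # Initialisation: the mean starts at the first sample.
            self.mean = y
            self.k = 1
            return 0.0
        self.k += 1
        gain = 1.0 / min(self.k, self.horizon)
        dev = y - self.mean  # deviation from the previous mean
        self.mean += gain * dev
        self.var += gain * (dev * dev - self.var)
        return math.sqrt(self.var)


est = RecursiveRms(horizon=50)
for sample in [5.0] * 10:
    rms = est.update(sample)
print(est.mean, rms)  # a constant signal has zero RMS about its mean
```

The returned value would be compared with a limit, or fed to the hypothesis test mentioned above, to decide whether the RMS level has changed.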
Requirements on time to detect

The requirements on time to detect a fault differ according to how the sensor
information is used in the system. Faults in sensors used for monitoring of
the average value of some physical quantity need not be very rapidly detected. Faults in feedback elements in control loops or sensors for safety
systems may need immediate detection because the fault will have immediate effect on the machinery.
Faults in a control loop can be categorized in the generic types listed in the
table:
Fault type | Required time to detect | Level for detection | Level for remedy action
Reference value fault (setpoint) | Several samples | Process interface hw & sw | Controller
Feedback element fault | Down to one sample | Process interface hw & sw | Controller. Faulty info. should not be used for command calculation
Actuator fault | Down to one sample | Process interface hw & sw | Controller if possible or safety system
Execution fault, e.g. in timing | - | Computer firmware and/or safety system | Safety system
Application SW, system or HW fault in computer controller | - | Comp. firmware and/or safety system | Safety system
Supply fault or other fatal error | - | Safety system | Safety system & fail-to-safe design
As apparent from the table, faults in feedback elements and actuators
are most demanding because there is often very little time to detect a fault
before the fault effect occurs.
3.5.4 Multiple sensor fault detection

If more measurements of the same quantity are available, fault detection
methods can utilize the redundant information. If 3 voltage measurements
V_1, V_2, and V_3 are available, a fault in one sensor can be found with very
low computational burden as follows:
    V_{12} = V_1 - V_2 ;  V_{13} = V_1 - V_3 ;  V_{23} = V_2 - V_3 ;

If a fault exists in measurement number one, then V_{12} and V_{13} will be
different from zero whereas V_{23} will be zero or close to zero if V_2 and V_3
are equal. Any of the non-faulty values can be used. The mean of V_2 and
V_3 could be used if measurement noise is to be minimized.
Fault isolation can also be done as:
    F_1 = V_{12} V_{13} ;  F_2 = V_{12} V_{23} ;  F_3 = V_{23} V_{13} ;

F_1 will be large in magnitude if measurement 1 is faulty and close to zero if it is valid.
Detection of whether F is close to zero or not needs, strictly speaking, a
detector based on the stochastic nature of measurement noise. A simple
threshold can, however, be used with appropriate results in many cases.
This multiple sensor fault detection scheme can be used even if the measurements are from different sources. A stochastic based detector will then
be needed, however.
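The isolation rule based on the products F_1, F_2, F_3 can be sketched with a simple threshold. The function name and the threshold value are assumptions; note that the threshold applies to a product of two differences and therefore has the unit of the measurement squared.

```python
def isolate_faulty_sensor(v1, v2, v3, threshold):
    """Return (faulty_index, value) for three redundant measurements.

    faulty_index is 1, 2 or 3, or None if no fault is indicated.
    value is the mean of the measurements considered valid.
    threshold is a hypothetical limit on |F_i|.
    """
    v12, v13, v23 = v1 - v2, v1 - v3, v2 - v3
    # |F_i| is large only when both differences involving sensor i are large.
    f = [abs(v12 * v13), abs(v12 * v23), abs(v23 * v13)]
    worst = max(range(3), key=lambda i: f[i])
    if f[worst] <= threshold:
        return None, (v1 + v2 + v3) / 3.0
    good = [v for i, v in enumerate((v1, v2, v3)) if i != worst]
    return worst + 1, sum(good) / 2.0


print(isolate_faulty_sensor(5.0, 1.0, 1.02, threshold=0.01))
```

Averaging the two remaining measurements follows the remark above that the mean can be used when measurement noise should be minimized.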
3.5.5 Filtering of Electromagnetic Spikes

Electromagnetic Compatibility (EMC) is a key issue for marine process control systems. Classification society approval requires EMC properties to be sufficient so that the equipment will not be disturbed by radio frequency signals
in daily operation. One of the means to meet EMC requirements is to
use EMC filtering and spike suppression in the front stages of process computer
interfaces.
In relation to discrete control loops, difficulties arise if a sensor value is
corrupted by a transient electromagnetic disturbance. The value is "frozen"
once every sample, and the consequence of a transient disturbance is therefore more severe in computerized equipment than with earlier generations based
on analogue techniques.
The multiple sensor idea can be used here to considerably reduce spike
sensitivity with very little software overhead. In each sampling cycle, the
sampling should consist of 3 A/D conversions in rapid succession. If
the three measurements differ, one measurement is discarded as described
above.
[Timing diagram: time axis with sampling instants Ts, 2Ts, 3Ts.]
Figure 3.9: Triple conversion sampling has only marginal overhead but offers both significant electromagnetic spike suppression and consistency check
within one sample.
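Triple conversion sampling can be sketched as follows. Note that this sketch swaps in a median-of-three selection rather than the product test of Section 3.5.4; with three values, the median discards a single outlier and is a simple stand-in. The function name, the `convert` callable and the tolerance are assumptions.

```python
def triple_sample(convert, tolerance):
    """Take three conversions in quick succession.

    convert    callable returning one A/D conversion (hypothetical hook)
    tolerance  largest spread accepted as consistent (hypothetical value)

    Returns (value, consistent): the median of the three conversions and
    a flag telling whether all three agreed within the tolerance.
    """
    low, median, high = sorted(convert() for _ in range(3))
    consistent = (high - low) <= tolerance
    return median, consistent


# The second conversion is hit by a spike; the median ignores it.
readings = iter([2.0, 9.7, 2.1])
print(triple_sample(lambda: next(readings), tolerance=0.5))
```

The consistency flag gives the check within one sample mentioned in the figure caption, and can be counted over time as an indication of screening or wiring problems.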
3.6 Actuators
Actuators | Type of input to component | Computer interface | Comments
Three way valve with AC motor positioner | a) closure of "O" contact activates "open" relay (220 V AC); b) closure of "C" contact activates "close" relay (220 V AC) | a) Close "O" contact for opening (digital output relay); b) Close "C" contact for closing (digital output relay); c) Pot. meter input; d) 2 NC switch inputs | a and b) Valve moves as long as open or close signals are present. c) Angle indication with potentiometer. d) End position indication with 2 NC switches.
Motor starter for pump | a) Closed contact activates start (220 V AC) | a) Close contact for start (digital output relay); b) NC switch to indicate running; c) NC switch to indicate local | Timing of start sequence is local within the starter
Actuator fault detection can be made using information on both control
and feedback signals from the actuator. Analytic redundancy and model
based methods are very efficient in this respect. These methods use knowledge about static and, if needed, dynamic relations within the actuator to
compare expected performance with measurements. Key issues are treated
in Blanke et al. (1993) [9], where detection methods and other issues of relevance to closed loop fault handling are discussed.
3.7 Interface Requirements

The discussion of interfaces and the analysis of fault propagation for the
selected components have led to a set of requirements to the interface in order to
obtain fault detection at the single component level. The requirements affect
both hardware and firmware/software of the process control or monitoring
system.
3.7.1 Requirements to hardware and firmware

General requirements for analog signals and critical binary signals:
1. Short-circuit between any two wires or any wire and ground shall be
detected.
2. Any open connection shall be detected.
Specific requirements:
For 4-20 mA current output from sensors/transmitters:

1. A/D converter range is 0 to 24 mA.
2. If the current exceeds 24 mA, or is below 0 mA (reverse current), the converter must indicate 24 mA or 0 mA respectively, and not wrap around.
For voltage measurements, the following shall be detected:
1. any wire shorts to any other wire
2. any wire shorts to ground
3. any wire is open
Potentiometer measurements:
1. Voltage supply is unipolar. A symmetrical supply around zero, as is
common practice, can not detect shorts between ground/zero level
and wiper.
2. Input circuit on voltage amplifier is driven out of range if input wire
is open.
3.7.2 Combined Hardware - Software requirements

For feedback elements and critical sensors for shut-down systems, the time to
detect an open connection or short-circuit fault shall be
down to one sample.
For set-point elements, time-history roll-back shall be possible.
The following methods are recommended to be available for single sensor
detection:
1. Range check: Check whether specified upper and lower limits are exceeded
2. Slew rate check: Check whether a specified rate of change has been
exceeded
3. RMS value check: Check whether the RMS value of the signal exceeds
a specified limit. Sampling is recommended to be made such that
transient electromagnetic disturbances can be detected and isolated within
one measurement cycle.
Faults within computer hardware and software must be detected by
firmware to the extent possible, i.e., according to classification society rules.
Chapter 4
Fault Detection and Isolation

Extension of feedback control systems with methods for fault detection,
isolation and accommodation is needed to avoid unacceptable excitations
in plant states when faults occur. Production stop, plant failure or direct
damage should be avoided to the extent possible.
Fault-tolerant control design is the technique to prevent simple faults
associated with control loops from developing into failures when a possible degradation in performance can be tolerated. If not, fail-safe design needs to be
employed. This chapter deals with fault-tolerant design where the ingredients are detection, isolation and subsequent accommodation of faults. The
first step is to introduce relevant techniques for fault detection and isolation.
This is the subject of this chapter.
4.1 FDI in Closed Loop Control Systems

Faults in the instruments of control loop systems are especially crucial for
the entire operation of a system. The possibilities to accomplish FDIA
in such systems depend on the ability to distinguish the conditions met
in fault situations from plant behaviour in normal operation. In normal
operation, feedback control should keep a process state equal to a desired
setpoint, while the influences from process disturbances and measurement
noise are kept minimal. In abnormal operation, when faults have occurred,
the control loop should react immediately in a way that prevents a fault
from developing into a malfunction of the system being controlled.
The general procedure of FDIA consists of three basic steps:
- Change Detection. Residuals are generated by means of a mathematical system model and measurements. Residuals are signals that carry information on the system operational conditions, i.e., whether the system operates under normal or abnormal conditions.
- Change Evaluation. If the information contained in the residuals is decided to be caused by a fault, then it is isolated, i.e., the location and time (sometimes also type, size, source) are determined.
- Fault accommodation. In the case of an isolated fault, an action is taken which ensures that the process accommodates to the faulty situation.

[Block diagram: Process & Control, Change Detector, Change Evaluation, Fault Accommodator.]
Figure 4.1: Levels of FDIA automation.
The procedure of model based FDI is shown in Fig. 4.1. When model
based methods are applied for FDI, it is necessary that input/output signals
of the monitored process are available and that the dynamic characteristics of
the system are known with a reasonable degree of precision. The information
about the faults contained in the residuals depends highly on the available
model. An accurate model gives residuals with desirable relations between
the residuals and the faults. An inaccurate model will produce relations
which deviate from the desired ones. An inaccurate model must necessarily be
used when the process knowledge is low and/or in order to decrease the
design and on-line calculation complexity.
Several articles concerning FDI investigations using model based methods have been published. Examples are Patton, Frank and Clarke (1989)
[45], Patton and Chen (1991) [42], Frank (1991) [19], Isermann (1991)
[16], the classic survey paper by Willsky (1976) [50] and later surveys by
Frank (1990) [18], Isermann (1994) [29], and Gertler (1993) [25]. Methods
for statistical change detection are dealt with in the next chapter. A key
reference is Basseville and Nikiforov (1993) [3].
4.2 Requirements to FDI

The requirements to fault detection are closely related to the application
of the result of a detection. One important parameter is the time to detect
that a fault has occurred.
- With incipient component faults detected for use in maintenance planning, FDI response may be rather slow (minutes to hours).
- With abrupt component faults in a process where detection is used for
operator assisted change of operational mode, detection must be more
responsive (seconds to minutes).
- If abrupt faults in set-point values to a closed loop control are considered, and the detection is used by the controller for automatic re-configuration, time
to detect should be within a few samples (5 to 10).
- If abrupt faults in feedback elements in a closed loop control are considered, time to detect should be within one to two samples.
The categorization above indicates that we need to distinguish between the
temporal development of a fault (abrupt to incipient) and the use of the information (from maintenance planning to autonomous re-configuration of a
controller). In this context, fault tolerant control is in focus. Process faults
are, therefore, only considered to the extent that their effects propagate through
the control system and, at the same time, the control system can alter the
propagation.
4.3 Modelling of Faults and Fault-propagation

Fault effects were analysed in Chapter 1 using FMEA techniques. The result
of this analysis was a list of fault effects to be considered. A subsequent step
in the overall design methodology was to make a mathematical model
including the listed fault effects. This eventually led to mathematical models
of the relevant parts of the particular process and its control system. These
faults are described as either additive or multiplicative. The two types
of faults enter differently in a state space model. We use a discrete time
representation, knowing that the plant under concern is usually continuous
with discrete time measurements.
A system with measured inputs u(k) and output measurements y(k)
is considered. The system can be subject to any actuator or sensor fault,
reflected in the vector f(k), unknown inputs (disturbances) d(k), and process
and measurement noise, w(k) and v(k) respectively. Considering the fault
and disturbance vectors as purely additive, such a system is modelled as
    x(k+1) = A x(k) + B u(k) + E_1 d(k) + F_1 f(k) + w(k)
    y(k) = C x(k) + E_2 d(k) + F_2 f(k) + v(k)                        (4.1)
where x(k) is the state vector. A, B and C are known system matrices,
E_1 and E_2 are known matrices for unknown inputs and F_1 and F_2 are known
fault entry matrices. w(k) and v(k) are discrete time Gaussian white noise
processes (vectors) with zero mean and covariances
    E[w(k) w^T(k)] = \Sigma_1
    E[w(k) v^T(k)] = \Sigma_{12}
    E[v(k) v^T(k)] = \Sigma_2                                          (4.2)
If more general multiplicative faults are considered, we get
    x(k+1) = A \left( I + E_A f(k) + F_A d(k) \right) x(k) + B \left( I + E_B f(k) + F_B d(k) \right) u(k) + w(k)
    y(k) = C \left( I - E_C f(k) + F_C d(k) \right) x(k) + v(k)        (4.3)
The effects of multiplicative faults are clearly dependent on the state
and input signals. The additive fault case is simpler, as the effect of a fault
depends on the dynamics of the plant (and not its present state)
and the dynamics of the fault. Fault detection is somewhat different in the
two cases. Additive faults can be detected using filtering techniques; multiplicative faults are better detected using system parameter
estimation techniques. Within the limited scope of this course, additive
faults will be the prime concern.
Considering only additive faults, the state space description in Eq. (4.1)
has the equivalent input/output model
    y(z) = C (zI - A)^{-1} \left( B u(z) + E_1 d(z) + F_1 f(z) + w(z) \right) + E_2 d(z) + F_2 f(z) + v(z)
         = H_{yu}(z) u(z) + H_{yd}(z) d(z) + H_{yf}(z) f(z) + H_{yw}(z) w(z) + v(z)      (4.4)
where each element of the H(z) matrices is a transfer function, that is a
rational function of the shift operator z, i.e.:
    H_{yu}(z) = C (zI - A)^{-1} B
    H_{yd}(z) = C (zI - A)^{-1} E_1 + E_2
    H_{yf}(z) = C (zI - A)^{-1} F_1 + F_2
    H_{yw}(z) = C (zI - A)^{-1}                                        (4.5)
The dynamic properties of the fault and the filter functions above determine the dynamic changes in the measurements when a fault occurs.
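How the fault entry matrices shape the measured response can be seen in a scalar, noise-free instance of Eq. (4.1). The sketch below is illustrative only: the function name and all numerical values are assumptions, and disturbances and noise are omitted.

```python
def simulate(a, b, c, f1, f2, u, f):
    """Scalar instance of Eq. (4.1) with d = w = v = 0.

    x(k+1) = a x(k) + b u(k) + f1 f(k)
    y(k)   = c x(k) + f2 f(k)
    """
    x, y = 0.0, []
    for uk, fk in zip(u, f):
        y.append(c * x + f2 * fk)
        x = a * x + b * uk + f1 * fk
    return y


n = 60
u = [1.0] * n
fault = [0.0] * 30 + [0.3] * 30  # abrupt fault of size 0.3 at k = 30

# Sensor fault (f1 = 0, f2 = 1): the output jumps immediately.
y_sensor = simulate(0.5, 1.0, 1.0, 0.0, 1.0, u, fault)
# Actuator fault (f1 = 1, f2 = 0): the effect is filtered by the plant.
y_actuator = simulate(0.5, 1.0, 1.0, 1.0, 0.0, u, fault)

print(round(y_sensor[30], 3), round(y_actuator[30], 3))
print(round(y_sensor[-1], 3), round(y_actuator[-1], 3))
```

The sensor fault appears in the measurement in the same sample through F_2, whereas the actuator fault passes through C(zI - A)^{-1} F_1 and builds up over several samples, which is what Eq. (4.5) expresses.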
4.4 Methods for Change Detection

Residuals for change detection are generated by means of available measurements and process models. A variety of methods for residual generation
have been presented in the literature. They can be brought down to two
basic concepts:
Geometric approaches:
- Parity space equations.
- Diagnostic observers.
Statistical approaches:
- Kalman filtering.
- Parameter estimation.
- Statistical change detection.
The geometric approaches generate residuals which contain information
on system changes due to faults, as changes in magnitude. Under ideal
conditions, when the system operates normally, residuals are close to zero.
If a faulty condition arises, one or more elements of the residual vector
change to nonzero.
The statistical approaches generate residuals with information on changes
in system statistics due to faults, e.g., changes in mean value or covariance.
4.5 Geometric Approaches to Change Detection

In the geometric framework, residuals for change detection are generated by
reconstruction of output measurements. The basis for the design is a description
of the system including potential faults and unknown inputs. The differences
between measured and reconstructed outputs are error signals. These are
sensitive to faults and, in many cases, also to unknown inputs. In order
to give the error signals specified properties in relation to the faults (i.e.,
to obtain fault isolation), it may be necessary to filter the error signals. A
very general architecture for residual generation is shown in Fig. 4.2. The
residual vector r(z) can be calculated as a function of the fault vector f(z)
so that
    r(z) = H(z) f(z)                                                   (4.6)
Definition 1
Detectability: the ability to detect the presence of one out
of several faults f_1(k), f_2(k), ..., f_m(k), defined as
    r(k) \neq 0 \quad \text{when} \quad f(k) \neq 0                    (4.7)
[Figure 4.2: A general procedure of residual generation in FDI. Blocks: physical plant (inputs, faults, disturbances; outputs), plant model, output error (error signals), filter (residuals).]

This means that when a fault occurs, one or more components of the residual vector will change in magnitude and make it possible to recognize, e.g.
by a threshold detector, that some change has taken place. Detectability
implies that a set of faults can be detected, but not necessarily isolated.
In mathematical terms, the residual vector r(k) is a function of all possible
faults:

$$ r(k) = h(f_1(k), f_2(k), \ldots, f_m(k)). \tag{4.8} $$

The geometric interpretation is depicted in Fig. 4.3, where three faults are
considered.
Definition 2
Isolability: the ability to isolate a fault, defined as

$$ r_i(k) \neq 0 \quad \text{when} \quad f_i(k) \neq 0 \tag{4.9} $$

Isolability means that the i'th residual is affected by only the i'th fault; the
fault has then been isolated (Frank, 1990) [18].
4.5.1 Generation of residuals

In general, two types of fault specific residuals can be generated:

- The faults $f_1(k), f_2(k), \ldots, f_m(k)$ can be detected and isolated simultaneously.
- The faults $f_1(k), f_2(k), \ldots, f_m(k)$ can be detected and isolated one at a time.

[Figure 4.3: Geometric interpretation for detection of several faults. Axes: r1, r2, r3; the fault space is indicated.]

The first approach leads to a single residual generator where the residual
vector r(k) is filtered to give desired sensitivity to particular faults. The
second can be implemented as a bank of filters, each being sensitive to a
particular fault. The two strategies are elaborated below.
One filter to generate a vector residual

The first strategy can be implemented as one filter producing a residual
vector with the desired properties. Each residual is constructed to be insensitive to one specific fault. The i'th residual becomes a function of all faults
except number i:

$$ r_i(k) = h_i(f_1, f_2, \ldots, f_{i-1}, f_{i+1}, \ldots, f_m). \tag{4.10} $$

If the fault $f_i(k)$ happens, then all residuals except the i'th will respond
to that fault. The number of residuals must be larger than two. The geometric interpretation is depicted in Fig. 4.4. With a single residual generator
configuration, the number of possible error signals equals the number of
available measurements. This gives one bound on the number of faults it is
possible to isolate. If m measurements are considered, m error signals can
be constructed. The faults are mapped onto the components of the residual vector
following Eq. (4.8). The number of faults which can theoretically be detected
and isolated is $n \le m$. The fault isolation procedure depends on the possibility of designing a filter so that each component of the residual vector
has specific properties with respect to particular fault effects or a combination of these.
[Figure 4.4: Geometric interpretation for detection and isolation of one fault at a time. Axes: r1, r2, r3; directions: Fault 1, Fault 2, Fault 3.]
Bank of filters to generate residuals

A bank of residual generators, each of which is usually designed to be optimally sensitive to one specific fault, gives a larger freedom in the design due
to a larger number of independent parameters in the bank of detectors than
in an implementation with just one vectorized detector. The i'th residual
is constructed to be sensitive to only the i'th fault: $r_i(k) = h_i(f_i(k))$. The
geometric interpretation is depicted in Fig. 4.4 (Frank, 1990) [18].
Each residual generator is constructed to have optimal properties with respect to
one specific fault. Both input/output measurements and residual generator
dynamics are selected so that error signals are generated in an optimal way in
relation to the specific fault. It is possible to obtain the same specifications
for the two cases.
The following descriptions of geometric approaches for FDI consider a
single residual generator configuration. A set of such residual generators
can subsequently be coupled into a bank and perform simultaneous detection
and isolation. The next sections describe different techniques for residual
generation within the geometric approach.
4.5.2 Parity Equations

Parity equations are signals showing inconsistency between the non-faulty
model and process output signals. The name parity comes from chemical
relations where the left and right hand sides of a reaction must be in balance.
Residual generation using parity equations is shown in Fig. 4.5.

[Figure 4.5: Geometric interpretation for simultaneous fault detection and isolation. Blocks: plant (input u, output y), plant model (output ŷ), difference y - ŷ, filter.]

The output measurements are generated from the non-faulty nominal process
description, referring to Eq. (4.5) without noise contributions:

$$ \hat{y}(k) = H_u(z)\,u(k) \tag{4.11} $$

The output error signals become:

$$
\begin{aligned}
e_y(k) &= y(k) - \hat{y}(k) \\
       &= y(k) - H_u(z)\,u(k) \\
       &= H_{ud}(z)\,d(k) + H_f(z)\,f(k)
\end{aligned}
\tag{4.12}
$$

Those error signals can be sensitive to all potential faults and to the disturbances. A filter W(z) is constructed to generate residuals, $r(k) = W(z)e_y(k)$,
which have the desired properties with respect to faults.
To decouple the disturbances, $W(z)H_{ud}(z)$ must be zero. Fault detection can now be performed, but not fault isolation. For detection and isolation of the i'th fault on the i'th residual, the i'th row of W(z) multiplied
with any column of $H_f(z)$ except the i'th must equal zero.
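The output error of Eq. (4.12) can be illustrated with a small simulation. This is only a sketch under stated assumptions: a scalar plant with hypothetical parameters, no noise or disturbance, a step fault at k = 20, and a fixed threshold of 0.02 chosen for illustration.

```python
def simulate(a, b, c, u, f=None, f1=0.0):
    """Simulate x(k+1) = a x(k) + b u(k) + f1 f(k), y(k) = c x(k), x(0) = 0."""
    x, y = 0.0, []
    for k, uk in enumerate(u):
        y.append(c * x)
        fk = f[k] if f is not None else 0.0
        x = a * x + b * uk + f1 * fk
    return y


n = 40
u = [1.0] * n
fault = [0.0] * 20 + [1.0] * 20                  # additive fault from k = 20

y = simulate(0.9, 0.1, 1.0, u, fault, f1=0.05)   # "measured" output
y_hat = simulate(0.9, 0.1, 1.0, u)               # nominal model output H_u(z) u(k)
e_y = [yk - yhk for yk, yhk in zip(y, y_hat)]    # output error, Eq. (4.12)

threshold = 0.02                                 # illustrative fixed threshold
first_alarm = next(k for k, e in enumerate(e_y) if abs(e) > threshold)
print(first_alarm, round(e_y[-1], 3))
```

Because the fault enters through the state equation, it first shows up in the output one sample after it occurs, which is when the error signal leaves zero.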
4.5.3 Diagnostic Observer

Consider the process description in Eq. (4.1), neglecting the noise terms. The
output measurements can be reconstructed by means of the observer

$$
\begin{aligned}
\hat{x}(k+1) &= A\hat{x}(k) + Bu(k) + K\left(y(k) - \hat{y}(k)\right) \\
\hat{y}(k) &= C\hat{x}(k)
\end{aligned}
\tag{4.13}
$$

where $\hat{x}(k)$ are the estimated states, $\hat{y}(k)$ the estimated outputs and K is
the observer feedback gain matrix.
Using $G = A - KC$, the state estimation error, $e_x(k)$, and the output
estimation error, $e_y(k)$, are:

$$
\begin{aligned}
e_x(k+1) &= x(k+1) - \hat{x}(k+1) \\
e_x(k+1) &= G e_x(k) + (E_1 - KE_2)d(k) + (F_1 - KF_2)f(k) + w(k) - Kv(k) \\
e_x(k) &= (zI - G)^{-1}\left[(F_1 - KF_2)f(k) + (E_1 - KE_2)d(k)\right] + (zI - G)^{-1}\left[w(k) - Kv(k)\right] \\
e_y(k) &= y(k) - \hat{y}(k) \\
       &= C e_x(k) + E_2 d(k) + F_2 f(k) + v(k)
\end{aligned}
\tag{4.14}
$$

The error vector $e_y(k)$ can be given fault specific properties by multiplication
with a filter matrix W(z),

$$ r(z) = W(z)\,e_y(z) \tag{4.15} $$

The task is to design the observer feedback gain K and the filter matrix W so that the observer has optimal properties with respect to faults and unknown
inputs.
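How the observer gain shapes the fault response of the output error can be seen in a scalar sketch of Eqs. (4.13) and (4.14). The plant parameters, the gain K = 0.4 and the sensor bias are hypothetical values chosen for illustration; noise and disturbances are left out.

```python
def observer_residual(a, b, c, K, u, f_sensor):
    """Output error e_y(k) = y(k) - c x_hat(k) for a scalar plant with sensor fault."""
    x = x_hat = 0.0
    residual = []
    for uk, fk in zip(u, f_sensor):
        y = c * x + fk                           # measurement with additive fault (F2 = 1)
        e_y = y - c * x_hat                      # output estimation error
        residual.append(e_y)
        x_hat = a * x_hat + b * uk + K * e_y     # observer update, Eq. (4.13)
        x = a * x + b * uk                       # plant update (no noise)
    return residual


n = 60
u = [1.0] * n
f_sensor = [0.0] * 15 + [1.0] * 45               # sensor bias from k = 15
r = observer_residual(a=0.9, b=0.1, c=1.0, K=0.4, u=u, f_sensor=f_sensor)
print(r[14], r[15], round(r[-1], 3))
```

With these numbers G = A - KC = 0.5, and the error settles at 1 - cK/(1 - G) = 0.2 times the bias: the observer partly tracks the faulty measurement, so a high gain weakens the steady-state fault signature.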
Several methods exist for designing the two matrices. One is the eigenstructure
assignment approach, Patton and Chen (1991) [42]. This approach for observer design is based on the eigenpair equations, with $v_j$ as
right side eigenvector of G, $w_j^T$ as left side eigenvector of G, and $\lambda_j$ an
eigenvalue:

$$
\begin{aligned}
\left[\lambda_j I - G\right] v_j &= 0 \\
w_j^T \left[\lambda_j I - G\right] &= 0 \\
\det(\lambda_j I - G) &= 0
\end{aligned}
\tag{4.16}
$$

Following the matrix inversion lemma:

$$ (A + BCD)^{-1} = A^{-1} - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1} \tag{4.17} $$

Since this lemma is valid for all sets of A, B, C, D matrices having appropriate dimensions, and A being invertible, we may use the following renaming:
$A \to C^{-1}$, $B \to D$, $C \to A^{-1}$, $D \to B$ and obtain

$$ \left(C^{-1} + DA^{-1}B\right)^{-1} = C - CD(A + BCD)^{-1}BC \tag{4.18} $$
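A quick numerical check of the lemma in Eq. (4.17) and of the renamed form in Eq. (4.18) may help when reading the derivation. The sketch below uses arbitrary invertible 2x2 matrices (assumed values, not from the text) and small helper functions written for this example.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(X, Y, sign=1.0):
    """X + sign * Y for 2x2 matrices."""
    return [[X[i][j] + sign * Y[i][j] for j in range(2)] for i in range(2)]


def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))


# Arbitrary example matrices (assumptions for illustration).
A = [[2.0, 1.0], [0.0, 3.0]]
B = [[1.0, 0.0], [2.0, 1.0]]
C = [[1.0, 2.0], [0.0, 1.0]]
D = [[0.0, 1.0], [1.0, 1.0]]

Ai, Ci = inv2(A), inv2(C)
BCD = matmul(matmul(B, C), D)
inner = inv2(add(Ci, matmul(matmul(D, Ai), B)))   # (C^-1 + D A^-1 B)^-1

# Eq. (4.17): (A + BCD)^-1 = A^-1 - A^-1 B (C^-1 + D A^-1 B)^-1 D A^-1
lhs_417 = inv2(add(A, BCD))
rhs_417 = add(Ai, matmul(matmul(matmul(Ai, B), inner), matmul(D, Ai)), sign=-1.0)

# Eq. (4.18): (C^-1 + D A^-1 B)^-1 = C - C D (A + BCD)^-1 B C
rhs_418 = add(C, matmul(matmul(matmul(C, D), lhs_417), matmul(B, C)), sign=-1.0)

err_417 = max_abs_diff(lhs_417, rhs_417)
err_418 = max_abs_diff(inner, rhs_418)
print(err_417, err_418)
```

Both differences should be at the level of floating point rounding.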
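This is used to rewrite the estimation error $e_x(k)$ in Eq. (4.14) by setting
$(zI - G)^{-1} = (A + BCD)^{-1}$, i.e., $zI = A$ and $-G = BCD$. Neglecting the noise terms, $e_x(k)$
becomes:

$$
\begin{aligned}
e_x(k) &= \left(z^{-1}I + Gz^{-2} + \ldots\right)\left[(E_1 - KE_2)d(k) + (F_1 - KF_2)f(k)\right] \\
       &= z^{-1}\sum_{m=0}^{\infty} G^m z^{-m}\left[(E_1 - KE_2)d(k) + (F_1 - KF_2)f(k)\right]
\end{aligned}
\tag{4.19}
$$

giving the residual r(k):

$$
\begin{aligned}
r(z) &= W\left[Cz^{-1}\sum_{m=0}^{\infty} G^m z^{-m}\left((E_1 - KE_2)d(z) + (F_1 - KF_2)f(z)\right) + E_2 d(z) + F_2 f(z)\right] \\
     &= W\left[Cz^{-1}\sum_{m=0}^{\infty} G^m z^{-m} E_{1tot} + E_{2tot}\right]u_{tot}(z)
\end{aligned}
\tag{4.20}
$$

where

$$
\begin{aligned}
E_{1tot} &= \left[(E_1 - KE_2) \;\; (F_1 - KF_2)\right] \\
E_{2tot} &= \left[E_2 \;\; F_2\right] \\
u_{tot}(z) &= \left[d(z) \;\; f(z)\right]^T
\end{aligned}
\tag{4.21}
$$

The goal is to design the matrices W and K so that one residual, $r_i(k)$,
from the vector r(k) in Eq. (4.20) is decoupled from all other influences (faults
or unknown inputs) except the particular fault $f_i(k)$. The left and right side
eigenvector assignments are described for this purpose.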
Left Side Eigenvector Assignment
If a row $w_j^T C$ of WC in Eq. (4.20) is made a left side eigenvector of G for
an eigenvalue $\lambda_j$, then in relation to Eq. (4.16):

$$
\begin{aligned}
WC(\lambda_j I - G) &= 0 \\
&\Downarrow \\
WC\lambda_j &= WCG \\
&\Downarrow \\
WC\sum_{m=0}^{\infty} G^m &= WC\sum_{m=0}^{\infty} \lambda_j^m
\end{aligned}
\tag{4.22}
$$

Using Eq. (4.22) in Eq. (4.20) gives the residual:

$$ r(z) = W\left[Cz^{-1}\sum_{m=0}^{\infty} \lambda_j^m z^{-m} E_{1tot} + E_{2tot}\right]u_{tot}(z) \tag{4.23} $$

The limit value of the sum as m goes to infinity is:

$$ \sum_{m=0}^{\infty}\left(\frac{\lambda_j}{z}\right)^m = \frac{z}{z - \lambda_j} \quad \text{for } |\lambda_j| < 1 \tag{4.24} $$

and the residual becomes:

$$ r_j(z) = W(z)\left[\frac{CE_{1tot}}{z - \lambda_j} + E_{2tot}\right]u_{tot}(z) \tag{4.25} $$

This means that the left side eigenstructure assignment results in a filter
matrix which is independent of z.
If W furthermore is designed so that the i'th row of W multiplied with
any column except the i'th of $CE_{1tot}$ and $E_{2tot}$ equals 0, then the i'th residual
is only affected by the i'th disturbance. Fault detection and isolation of the
i'th fault is thus accomplished.
Right Side Eigenvector Assignment
If all columns of $E_{1tot}$ in Eq. (4.21) are made right side eigenvectors
of G for an eigenvalue $\lambda_j$, then in relation to Eq. (4.16):

$$
\begin{aligned}
(\lambda_j I - G)E_{1tot} &= 0 \\
&\Downarrow \\
\lambda_j E_{1tot} &= GE_{1tot} \\
&\Downarrow \\
\sum_{m=0}^{\infty} \lambda_j^m E_{1tot} &= \sum_{m=0}^{\infty} G^m E_{1tot}
\end{aligned}
\tag{4.26}
$$

Substituting Eq. (4.26) into Eq. (4.20), r(k) will take the form of Eq.
(4.25).
The right side eigenstructure assignment determines the values in K but
not W. The latter matrix is therefore free for further decoupling of any
other external influence except the fault to be detected.
Still, if the i'th disturbance is to be detected on the i'th residual, then
the i'th row of W multiplied with any column except the i'th of

$$ \frac{CE_{1tot}}{z - \lambda_j} + E_{2tot} \tag{4.27} $$

must equal 0. The result is that all influences, except the i'th fault, are
decoupled from the i'th residual. In this design W becomes a function of z.
Alternatively, the W matrix design can be done as described for the left
side assignment, where W is independent of z.
Instead of making each column of $E_{1tot}$ a right side eigenvector of G,
it may be sufficient to use only selected columns. If, for instance, the i'th
fault within the fault vector f(k) in Eq. (4.21) is to be detected, then all
columns of $E_{1tot}$ except the i'th are made right side eigenvectors of G.
4.5.4 Unknown Input Observer

Another way of generating the residuals is to consider the estimator as an
unknown input observer, Patton & Chen [42], Frank [18]. This approach
considers a new state $\mathbf{z}(k)$ as a linear transformation of the state vector
x(k), so that $\mathbf{z}(k) = Tx(k)$. The observer has the form:

$$
\begin{aligned}
\mathbf{z}(k+1) &= M\mathbf{z}(k) + Ju(k) + Ly(k) \\
\hat{y}(k) &= CT^{-1}\mathbf{z}(k)
\end{aligned}
\tag{4.28}
$$

giving the estimation and output errors:

$$
\begin{aligned}
e_x(k+1) &= Tx(k+1) - \mathbf{z}(k+1) \\
         &= Me_x(k) + (TA - MT - LC)x(k) + (TB - J)u(k) \\
         &\quad + (TE_1 - LE_2)d(k) + (TF_1 - LF_2)f(k) \\
e_y(k) &= y(k) - CT^{-1}\mathbf{z}(k) \\
       &= C\left(x(k) - T^{-1}\mathbf{z}(k)\right) + E_2 d(k) + F_2 f(k) \\
       &= CT^{-1}e_x(k) + E_2 d(k) + F_2 f(k)
\end{aligned}
\tag{4.29}
$$

Now, if every part involving the states, the inputs and the unknown inputs
can be made zero, the state estimation error will only be influenced by the
faults, i.e.:

$$
\begin{aligned}
TA - MT - LC &= 0 \\
TB - J &= 0 \\
TE_1 - LE_2 &= 0
\end{aligned}
\tag{4.30}
$$

and the state and output errors become:

$$
\begin{aligned}
e_x(k) &= (zI - M)^{-1}(TF_1 - LF_2)\,f(k) \\
e_y(k) &= CT^{-1}(zI - M)^{-1}(TF_1 - LF_2)\,f(k) + E_2 d(k) + F_2 f(k)
\end{aligned}
\tag{4.31}
$$

Now fault detection is accomplished. Furthermore, if the design matrices
contain elements so that $e_y(k)$ has specified properties with respect to the faults,
isolation is achieved. Again, it may be necessary to apply a filter to the
output error to give the residual specified properties with respect to the faults. Here the
filter will become a function of z. For unknown input decoupling:

$$ W(z)E_2 = 0 \tag{4.32} $$

Detection and isolation of the i'th fault on the i'th residual implies that the
i'th row of W(z) multiplied with any column of

$$ CT^{-1}(zI - M)^{-1}(TF_1 - LF_2) + F_2 \tag{4.33} $$

except the i'th must equal zero.
4.6 Statistical Methods to Generate Residuals

In the statistical framework, the residuals for FDI are signals or quantities which contain information about the fault statistics. The basis for the designs
is knowledge of the system statistics, both under normal and faulty conditions, and the system descriptions. Inspection of changes in mean value
or covariance of the quantities is used for determining whether a fault has
occurred. The detection problem is further elaborated in the chapter on
statistical fault detection.
4.6.1 Kalman Filtering

Application of the Kalman filter in a statistical setting uses the fact
that the Kalman gain, K, is designed so that the state estimation error
is white and has minimal covariance under normal operation and in the
presence of random noise. For sensor and actuator fault detection, a bank of
filters is designed, each considering different fault conditions. Under normal
operation all innovations are expected to be distributed with zero mean and a
known covariance. The occurrence of a fault will cause at least one residual
to change statistics, and the detection can then be performed by inspection
of changes in the mean value of the residual.
The process model underlying the Kalman filter is given in Eq. (4.1). The output measurements
are reconstructed by means of the estimator given in Eq. (4.13). The state
and output estimation errors are then:

$$
\begin{aligned}
e_x(k+1) &= x(k+1) - \hat{x}(k+1) \\
e_x(k+1) &= Ge_x(k) + (E_1 - KE_2)d(k) + (F_1 - KF_2)f(k) + w(k) - Kv(k) \\
e_x(k) &= (zI - G)^{-1}\left((E_1 - KE_2)d(k) + (F_1 - KF_2)f(k) + w(k) - Kv(k)\right) \\
e_y(k) &= y(k) - \hat{y}(k) \\
       &= Ce_x(k) + E_2 d(k) + F_2 f(k) + v(k)
\end{aligned}
\tag{4.34}
$$

The design approach is, normally, to minimize the variance of the estimation
error, which is denoted P(k):

$$ P(k) = E\left\{\left(e_x(k) - \bar{e}_x(k)\right)\left(e_x(k) - \bar{e}_x(k)\right)^T\right\} \tag{4.35} $$

This means that K is not designed to give the residuals optimal properties
for the detection of faults. The gain matrix K and the estimation error
variance P(k) can be determined from Eq. (4.36).

$$
\begin{aligned}
K(k) &= AP(k)C^T\left[\Sigma_v + CP(k)C^T\right]^{-1} \\
P(k+1) &= AP(k)A^T + \Sigma_w - AP(k)C^T\left[\Sigma_v + CP(k)C^T\right]^{-1}CP(k)A^T
\end{aligned}
\tag{4.36}
$$

The latter is the usual matrix Riccati equation, where the covariance matrices
of state and measurement noise are

$$ \Sigma_w = E\left\{ww^T\right\}, \qquad \Sigma_v = E\left\{vv^T\right\} $$
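The Riccati recursion can be iterated until the gain settles. The sketch below is a scalar version written in the standard one-step predictor form, under the assumption that w is the process noise (variance q_w) and v the measurement noise (variance q_v); all numerical values are hypothetical.

```python
def kalman_steady_state(a, c, q_w, q_v, p0=1.0, iterations=200):
    """Iterate the scalar Riccati recursion; return the final gain K and variance P."""
    p = p0
    for _ in range(iterations):
        k_gain = a * p * c / (q_v + c * p * c)      # gain for the current P
        p = a * p * a + q_w - k_gain * c * p * a    # variance update
    k_gain = a * p * c / (q_v + c * p * c)
    return k_gain, p


# Assumed example: slow first-order process, noisy measurement.
K, P = kalman_steady_state(a=0.9, c=1.0, q_w=0.1, q_v=1.0)
print(round(K, 4), round(P, 4))
```

For these numbers the stationary variance solves P^2 + 0.09 P - 0.1 = 0, which gives a convenient check of the iteration. The gain depends only on the noise variances and the model, not on any fault signature, in line with the remark that K is not tuned for fault detection.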
Fault specific properties can be given to the error vector, $e_y(k)$, by multiplying with a filter matrix W so that:

$$ r(k) = We_y(k) \tag{4.37} $$

This filter must ensure that unknown inputs are decoupled from r(k),
and that the desired properties with respect to the faults are obtained. If the i'th row of

$$ r(k) = W\left[C(zI - G)^{-1}(E_1 - KE_2) + E_2\right]d(k) \tag{4.38} $$

equals zero, then the unknown inputs are decoupled from the i'th residual.
It is now possible to perform fault detection, but not isolation, because the
i'th residual depends on all faults. If the i'th fault, $f_i(k)$, contained in
the vector f(k), is to be detected and isolated from all other faults on the
residual $r_i(k)$, then the i'th row of

$$ r(k) = W\left[C(zI - G)^{-1}(F_1 - KF_2) + F_2\right]f(k) \tag{4.39} $$

must equal zero as well in every column except the i'th. This procedure determines the i'th row of the filter
matrix W and is repeated for any other row.
The matrix W is a function of z. If m influences are to be detected and
isolated simultaneously, then m innovations must be available and m - 1
disturbances must be decoupled from each of them. The result is residuals
which are sensitive to only one influence. Usually, the Kalman filtering used
for statistical FDI is configured as a bank of filters.
The overall structure of the filter bank is illustrated in Fig. 4.6.
If no fault is present, both $r_s(k)$ and $r_a(k)$ are innovations with zero
mean and a known covariance. On the other hand, if the system actuator
fails, then $r_a(k)$ is distributed as $N(\mu_f, \sigma_f)$, while $r_s(k)$ is still distributed
as $N(0, \sigma)$ (Gertler, 1988) [23], (Tzafestas and Watanabe, 1990) [48].
[Figure 4.6: Illustration of a bank of Kalman filters for statistical FDI. Blocks: plant (input u), Kalman filter for sensor fault detection (residual r_s), Kalman filter for actuator fault detection (residual r_a).]

4.6.2 Parameter Estimation

In general, the procedure of parameter estimation optimizes a performance
function in relation to the parameters to be estimated, using measurements
and a model structure. This means that the parameters change when the
system dynamics change.
A typical model is linear with lumped parameters, of the form:

$$
\begin{aligned}
a_n y(k+n) + \cdots + a_1 y(k+1) + y(k) &= b_0 u(k) + \cdots + b_m u(k+m) + \varepsilon \\
y(k) &= \varphi^T(k)\,\theta
\end{aligned}
\tag{4.40}
$$

where $\varepsilon$ is a discrete time noise process. The parameters are estimated using:

$$ \hat{y}(k) = \varphi^T(k)\,\hat{\theta} \tag{4.41} $$

where $\varphi(k)$ is a vector containing the measurements and $\theta$ is a vector
containing the parameters to be estimated:

$$
\begin{aligned}
\varphi^T(k) &= \left[-y(k+1) \;\cdots\; -y(k+n);\; u(k) \;\cdots\; u(k+m)\right] \\
\theta^T &= \left[a_1 \ldots a_n;\; b_0 \ldots b_m\right]
\end{aligned}
\tag{4.42}
$$

The output error

$$ y(k) - \hat{y}(k) = y(k) - \varphi^T(k)\,\hat{\theta} \tag{4.43} $$

is minimized by optimizing the performance function, which means that the
parameters change with system changes.
When the nominal parameters and their variance are known, fault detection can be performed by comparing estimated values with their nominal
values. The procedure of generating residuals by means of parameter identification is shown in Fig. 4.7.
If a fault happens, it is detected as a change in the mean value of the parameter vector $\theta$.
Procedures and methods for parameter estimation in the FDI framework
are treated in (Isermann, 1991 [28]; Isermann, 1984 [27]; Tzafestas and
Watanabe, 1990 [48]; Ljung and Söderström, 1983 [35]; Ljung, 1987 [34]).
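As an illustration of fault detection by parameter comparison, the sketch below fits a first-order model by least squares and compares the estimate with nominal values. It is a simplification under stated assumptions: the model is written in the common backward-shift form y(k) = -a1 y(k-1) + b0 u(k-1) rather than the forward-shift form of Eq. (4.40), the data are noise free, and the parameter values and the simulated loss of input gain are hypothetical.

```python
def simulate_arx(a1, b0, u):
    """Generate y(k) = -a1 y(k-1) + b0 u(k-1) with y(0) = 0."""
    y = [0.0]
    for k in range(1, len(u)):
        y.append(-a1 * y[k - 1] + b0 * u[k - 1])
    return y


def estimate_arx(u, y):
    """Least squares estimate of (a1, b0) via the 2x2 normal equations."""
    s11 = s12 = s22 = t1 = t2 = 0.0
    for k in range(1, len(u)):
        p1, p2 = -y[k - 1], u[k - 1]      # regressor phi(k)
        s11 += p1 * p1
        s12 += p1 * p2
        s22 += p2 * p2
        t1 += p1 * y[k]
        t2 += p2 * y[k]
    det = s11 * s22 - s12 * s12
    return (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det


# Square wave input so that both parameters are excited.
u = [1.0 if (k // 5) % 2 == 0 else -1.0 for k in range(100)]

nominal = (-0.8, 0.5)                               # assumed nominal (a1, b0)
y_faulty = simulate_arx(-0.8, 0.3, u)               # assumed fault: input gain drops to 0.3
estimate = estimate_arx(u, y_faulty)
delta = tuple(pn - p for pn, p in zip(nominal, estimate))   # change: nominal minus estimate
print([round(d, 3) for d in delta])
```

Only the second component of `delta` moves away from zero, which points to the input gain rather than the pole as the changed quantity.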
[Figure 4.7: Residual generation based on parameter estimation. Online: actual plant (signals u, f, u_d), parameter identification, calculation of physical parameters P. Offline: nominal plant, parameter identification, calculation of physical parameters P_n. Determination of changes: ΔP = P_n - P.]
Chapter 5
The Change Detection Problem

Once residuals have been generated, the issue arises of how to determine whether a fault has occurred. This analysis is disturbed by
noise in the measurements and by stochastic disturbances on the
process at hand. It is therefore necessary to consider the statistical properties of a residual and to perform statistically based testing of the hypotheses:
fault or no fault. This chapter starts by recalling some elementary properties of stochastic signals: the properties of the amplitude distribution and the
time/frequency domain behaviour. The concept of distance is then introduced with the aim of explaining how change detection can be accomplished.
Methods for change detection are then dealt with: the simple threshold test;
the CUSUM based detector for evaluation of statistical signals, where a certain
confidence can be obtained about the correctness of a detection; and some
implementation considerations, in particular for on-line detection, where a
recursive method is needed. Simple examples illustrate the concepts.
5.1 About stochastic signals

Stochastic signals have a random variation and are described by two main
features:

- The amplitude distribution.
- The time/frequency domain properties.

By random we mean that there is no way to predict an exact value at a
future instant of time.
5.1.1 Amplitude distribution

A random process can be characterized through the amplitude of measurements taken as a time sequence. The properties can be fully determined by
calculation of the moments of the process:

$$ P_n = \int_{-\infty}^{\infty} x^n p(x)\,dx $$

where p(x) is the probability density function of the signal, and x the amplitude. The first moment is the mean value,

$$ \mu = \int_{-\infty}^{\infty} x\,p(x)\,dx $$

in other words, the mean value is the weighted linear sum of x(t) over all
values of x. Similarly, the mean square, which equals the variance when the mean value is zero, is
given by the second order moment

$$ \sigma^2 = \int_{-\infty}^{\infty} x^2 p(x)\,dx $$

One of the easiest probability density functions, in terms of calculations, is
the Gaussian distribution with mean value $\mu$ and variance $\sigma^2$:

$$ p(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right) $$

which has the well-known bell shape.
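5.1.2 Mean and variance of a stationary process

A stationary stochastic process is one whose statistical properties are unchanged in time. The most
important measures of a stochastic signal are the two lowest order moments:
the mean value and the variance. Assuming a stationary process, the mean
value is defined by averaging over time

$$ \mu = \bar{x} = E\{x(t)\} = \lim_{N\to\infty}\frac{1}{N}\sum_{i=0}^{N} x(t_i) \tag{5.1} $$

The variance of a scalar process is defined as the second order moment about the mean.

$$ \sigma^2 = E\left\{(x(t) - \bar{x})^2\right\} = \lim_{N\to\infty}\frac{1}{N-1}\sum_{i=0}^{N}\left(x(t_i) - \bar{x}\right)^2 \tag{5.2} $$

For a vector valued process

$$ x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} $$

the definition of variance is extended to the covariance matrix. Using
the notation

$$ \sigma_{jk}^2 = E\left\{(x_j - \bar{x}_j)(x_k - \bar{x}_k)\right\} \tag{5.3} $$

the covariance matrix is

$$ Q = E\left\{(x - \bar{x})(x - \bar{x})^T\right\} = \begin{bmatrix} \sigma_{11}^2 & \sigma_{12}^2 & \cdots \\ \sigma_{21}^2 & \cdots & \cdots \\ \cdots & \cdots & \sigma_{nn}^2 \end{bmatrix} \tag{5.4} $$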
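For a finite record, the mean vector and the covariance matrix are estimated by replacing the limits with sums over the available samples. The sketch below assumes the 1/(N - 1) normalisation with N the number of samples; the data are made up for illustration.

```python
def mean_and_covariance(samples):
    """Sample mean vector and covariance matrix of a list of equally long tuples."""
    n_samples, dim = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n_samples for j in range(dim)]
    cov = [[sum((s[j] - mean[j]) * (s[k] - mean[k]) for s in samples) / (n_samples - 1)
            for k in range(dim)]
           for j in range(dim)]
    return mean, cov


# Four samples of a two-dimensional signal (illustrative values).
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 9.0)]
mean, Q = mean_and_covariance(samples)
print(mean, Q)
```

The off-diagonal elements are equal by construction, so Q is symmetric, and the diagonal holds the variances of the individual components.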
5.1.3 Mean and variance of a filtered stationary process

The concept of a white noise process is very popular for computational
reasons. The white noise concept is easy to deal with when we look at
discrete time. Whiteness means that there is no correlation between any
two samples of the process, regardless of how close in time the samples are.
This concept has somewhat different properties in discrete and
continuous time.

Discrete time

The most important question for a stochastic process, in engineering terms,
is what happens when a filter is applied to it. A filter
can be described as a set of state space equations

$$ x_{k+1} = Ax_k + Bw $$

with outputs (measurements)

$$ y_k = Cx_k + v $$

where v and w are measurement noise and process noise, respectively, with
covariances

$$ E\left\{ww^T\right\} = Q_w, \qquad E\left\{vv^T\right\} = Q_v $$

The mean value is simply given by the propagation of the mean value of
the noise through the filter. In steady state,

$$ I\bar{x} = A\bar{x} + B\bar{w}, \qquad \bar{y} = C\bar{x} + \bar{v} $$

or, in filter terms,

$$ \bar{y} = \lim_{z\to 1} C(zI - A)^{-1}B\bar{w} + \bar{v} $$

which, naturally, just expresses the DC gain of the filter. The variance
at the output of a filter is only slightly more complicated to calculate, and
the following expression is very useful to know, and remember, since the
sole purpose of most filtering is to reduce the variance of the stochastic part
of a signal. With $P_k$ being the covariance of the state,

$$ P_k = E\left\{(x_k - \bar{x})(x_k - \bar{x})^T\right\} $$

then

$$ P_{k+1} = AP_kA^T + BQ_wB^T $$

and the covariance of the output is $CP_kC^T + Q_v$.
This equation is the discrete-time matrix Lyapunov equation, the linear special case of the matrix Riccati equation.
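The effect of a filter on the variance can be checked with a scalar sketch. It uses the standard discrete-time covariance recursion $P_{k+1} = AP_kA^T + BQ_wB^T$ for the state and $CPC^T + Q_v$ for the output, stated here as an assumption of the example; the numerical values are hypothetical.

```python
def propagate_variance(a, b, q_w, p0=0.0, steps=500):
    """Iterate P(k+1) = a P(k) a + b q_w b for a scalar filter."""
    p = p0
    for _ in range(steps):
        p = a * p * a + b * q_w * b
    return p


# Assumed scalar filter and noise variances.
a, b, c = 0.9, 1.0, 2.0
q_w, q_v = 0.19, 0.5

p_state = propagate_variance(a, b, q_w)     # stationary state variance
p_output = c * p_state * c + q_v            # stationary output variance
print(round(p_state, 6), round(p_output, 6))
```

For a stable scalar filter the recursion converges to b^2 q_w / (1 - a^2), so the numbers above settle at a state variance of 1 and an output variance of 4.5.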
Continuous processes

Going to continuous time, a difficulty is that, in principle, it is only the
integral of a stochastic process with uncorrelated increments which exists.
The problem lies in the fact that we cannot define a white noise source with
finite intensity and assume it has infinite bandwidth, since this would be a
signal with infinite power. It is outside the scope of this chapter to treat the
continuous noise case, so the interested reader should consult classical texts
in statistics and control, e.g. (Jazwinski, 19xx; Åström, 1970; Maybeck,
1980).
To quote the result, the dynamic filter is

$$ \dot{x} = Ax + B\varepsilon, \qquad y = Cx + \nu $$

where $\varepsilon$ and $\nu$ are continuous time stochastic processes with intensities $Q_\varepsilon$
and $Q_\nu$ defined by

$$ E\left\{(\varepsilon(t) - \bar{\varepsilon})(\varepsilon(\tau) - \bar{\varepsilon})^T\right\} = Q_\varepsilon\,\delta(t - \tau) $$

and similarly for $Q_\nu$. The propagation equations for the mean value are

$$ \dot{\bar{x}}(t) = A\bar{x}(t) + B\bar{\varepsilon}, \qquad \bar{y}(t) = C\bar{x}(t) + \bar{\nu} $$

The propagation equations for the covariance are

$$ \dot{P} = AP + PA^T + BQ_\varepsilon B^T + Q_\nu $$

The steady state covariance matrix is obtained by setting $\dot{P} = 0$ and solving the resulting
algebraic matrix Riccati equation. There are standard routines to do this.
5.2 Measuring the difference between statistical signals

In geometry, distance is the length of the shortest line between two points. This is easy
to grasp in three dimensional space, and easy to extend to more dimensions.
The definition of distance makes it possible to measure area and volume
and determine whether two objects are equal in, e.g., area or volume. We
need similar measures for stochastic signals. Probability theory defines a
Kullback-Leibler information between two probability densities of a random
variable. The two probability distributions are made to assume two different
sets of parameters, most simply two different hypotheses about the mean or
variance. Basseville and Nikiforov (1993) [3] show that the Kullback-Leibler
measure can be approximated by

$$ K_N(\theta_0, \theta_1) = \frac{1}{N}\sum_{i=1}^{N}\ln\frac{p_{\theta_0}(y_i \mid y_1 \ldots y_{i-1})}{p_{\theta_1}(y_i \mid y_1 \ldots y_{i-1})} $$

where the key function is the natural logarithm of the ratio of the probability density functions. These functions, in turn, are calculated from the
measurements up to instant no. i. This function is referred to as the log-likelihood ratio

$$ s_i = \ln\frac{p_{\theta_0}(y_i \mid y_1 \ldots y_{i-1})}{p_{\theta_1}(y_i \mid y_1 \ldots y_{i-1})} $$

Its role is paramount in statistical fault detection.
Considering fault detection, a residual generator generates sequences,
r(i), which are characterized by a Gaussian distribution with mean vector
$\mu$ and covariance matrix Q

$$ p(r(i)) = N(\mu, Q) \tag{5.5} $$

The statistical detectability is defined in terms of the Kullback distance
between two conditional distributions before and after a change. The change
is from parameter $\theta_0$ to $\theta_1$

$$ s_i = \ln\frac{p_{\theta_1}(r_k(1), r_k(2), \ldots, r_k(i))}{p_{\theta_0}(r_k(1), r_k(2), \ldots, r_k(i))} \tag{5.6} $$

where $p_{\theta_x}(r_k(1), r_k(2), \ldots, r_k(i))$ is the probability density function for the
sequence with mean value $\mu_x$, taken over the samples $r_k(1)$ to $r_k(i)$. The
change is said to be detectable if the Kullback distance exists and satisfies:

$$ K(\theta_0, \theta_1) > 0 \tag{5.7} $$

In other words, a change is detectable if the log likelihood ratio has
changed after a fault. In order to be statistically detectable, the mean
value, $\mu_0$, needs to be known, or estimated, for the system operating
under normal conditions. The description considers changes from normal
to faulty conditions, and not changes from one faulty situation to another.
The Kullback measure is illustrated in the following example.
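5.2.1 The Kullback Distance between Gaussian signals

Assume a Gaussian distribution $N(\mu_0, \sigma_0^2)$, i.e. with mean value $\mu_0$ and
variance $\sigma_0^2$, representing the no-fault condition $H_0$

$$ p(r_i) = \frac{1}{\sigma_0\sqrt{2\pi}}\exp\left(-\frac{(r_i - \mu_0)^2}{2\sigma_0^2}\right) \tag{5.8} $$

The faulty condition $H_1$ is characterized by the Gaussian process $N(\mu_1, \sigma_1^2)$.
The Kullback information between the two conditions is

$$ K(\theta_0, \theta_1) = \frac{1}{N}\sum_{i=1}^{N}\ln\frac{p_{H_1}(r_i)}{p_{H_0}(r_i)} \tag{5.9} $$

First, we calculate the log-likelihood ratio

$$ s_i = \ln\frac{p_{H_1}(r_i)}{p_{H_0}(r_i)} = \ln\left(\frac{\dfrac{1}{\sigma_1\sqrt{2\pi}}\exp\left(-\dfrac{(r_i - \mu_1)^2}{2\sigma_1^2}\right)}{\dfrac{1}{\sigma_0\sqrt{2\pi}}\exp\left(-\dfrac{(r_i - \mu_0)^2}{2\sigma_0^2}\right)}\right) \tag{5.10} $$

or

$$ s_i = \ln\frac{\sigma_0}{\sigma_1} + \frac{(r_i - \mu_0)^2}{2\sigma_0^2} - \frac{(r_i - \mu_1)^2}{2\sigma_1^2} \tag{5.11} $$

It is useful to look at one change of property only. We thus assume
that a residual generator has been designed such that, upon occurrence of a
certain fault, the residual will change in the prescribed way.
A change in mean from $\mu_0$ to $\mu_1$, with unchanged variance, $\sigma_0 = \sigma_1 = \sigma$,
gives

$$ s_i = \frac{\mu_1 - \mu_0}{\sigma^2}\left(r_i - \frac{\mu_1 + \mu_0}{2}\right) \tag{5.12} $$

A change in variance from $\sigma_0^2$ to $\sigma_1^2$, but unchanged mean $\mu_0 = \mu_1 = \mu$,
gives

$$ s_i = \ln\frac{\sigma_0}{\sigma_1} + \frac{\sigma_1^2 - \sigma_0^2}{2\sigma_0^2\sigma_1^2}(r_i - \mu)^2 \tag{5.13} $$

This means that the log likelihood ratio is a function of the observation $r_i$
and that the Kullback distance is an average of those over N observations,

$$ K(\theta_0, \theta_1) = \frac{1}{N}\sum_{i=1}^{N} s_i \tag{5.14} $$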
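The Gaussian log-likelihood ratios are simple to evaluate. The sketch below implements the general expression of Eq. (5.11) together with the two special cases of Eqs. (5.12) and (5.13), and the sample average of Eq. (5.14); the residual values and the hypothesised means are made up for illustration.

```python
import math


def llr_gauss(r, mu0, sigma0, mu1, sigma1):
    """General Gaussian log-likelihood ratio ln(p_H1(r) / p_H0(r)), Eq. (5.11)."""
    return (math.log(sigma0 / sigma1)
            + (r - mu0) ** 2 / (2 * sigma0 ** 2)
            - (r - mu1) ** 2 / (2 * sigma1 ** 2))


def llr_mean_change(r, mu0, mu1, sigma):
    """Change in mean with unchanged variance, Eq. (5.12)."""
    return (mu1 - mu0) / sigma ** 2 * (r - (mu1 + mu0) / 2)


def llr_variance_change(r, mu, sigma0, sigma1):
    """Change in variance with unchanged mean, Eq. (5.13)."""
    return (math.log(sigma0 / sigma1)
            + (sigma1 ** 2 - sigma0 ** 2) / (2 * sigma0 ** 2 * sigma1 ** 2) * (r - mu) ** 2)


def average_llr(residuals, mu0, mu1, sigma):
    """Average of s_i over the record, Eq. (5.14), for a change in mean."""
    return sum(llr_mean_change(r, mu0, mu1, sigma) for r in residuals) / len(residuals)


before = [-0.2, 0.1, 0.3, -0.1, 0.0]     # residuals scattered around mu0 = 0
after = [0.9, 1.2, 1.1, 0.8, 1.0]        # residuals scattered around mu1 = 1
k_before = average_llr(before, mu0=0.0, mu1=1.0, sigma=0.5)
k_after = average_llr(after, mu0=0.0, mu1=1.0, sigma=0.5)
print(k_before, k_after)
```

The average is negative while the residual stays near $\mu_0$ and positive once it moves to $\mu_1$; this sign change is what the sequential tests later in the chapter exploit.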
5.3 Change Evaluation

In the previous chapters, residual generation has been described. The next
step in the fault handling procedure is the evaluation of the residuals to
decide whether a fault has occurred. Two basically different approaches can
be used:

- Change evaluation using a fixed threshold test.
- Change evaluation using a statistical test.
5.3.1 Threshold tests

The simplest way of deciding whether a fault has occurred is to test each
component of a residual against a fixed threshold value. A fixed threshold
test is rarely sufficient for robust fault detection because errors in the design
model make the residuals dependent on the input excitation. Robustness
to the input excitation can be obtained by use of an adaptive threshold
sequence which is a function of the input signal. When considering actuator
and sensor faults, the reference signal for a controlled process may be applied
with advantage, because it will not be influenced by any faults. Basic ideas
for generating an adaptive threshold were presented by (Emami-Naeini, 1976
[17]; Ding and Frank [15]; and later Patton and Chen, 1992 [44]).
In general, all model based methods for residual generation can be described as shown in Fig. 5.1. $G_u(z)$ represents the input/output description for
the process used in the design, $\Delta G_u(z)$ represents additive modelling errors,
while $H_u(z)$ and $H_y(z)$ are the transfer functions which give the relation
between the input/output measurements and the estimated output, $\hat{y}$. For
the ideal case, with no modelling errors, the residual is zero:

$$ r(k) = y(k) - \hat{y}(k) = G_u(z)u(k) - H_u(z)u(k) - H_y(z)y(k) = 0 \tag{5.15} $$

[Figure 5.1: Statistical detection is applied to the residual. Blocks: process G_u + ΔG_u (input u), H_u, H_y, summation giving ŷ.]

When modelling errors are considered, the residual becomes:

$$
\begin{aligned}
r(k) = y(k) - \hat{y}(k) &= \left(G_u(z) + \Delta G_u(z)\right)u(k) - H_u(z)u(k) - H_y(z)y(k) \\
&= \left(I - H_y(z)\right)\Delta G_u(z)u(k)
\end{aligned}
\tag{5.16}
$$

It is assumed that $\Delta G_u(z)$ is bounded by the limit value $\delta$, i.e.:

$$ \|\Delta G_u(z)\| \le \delta \tag{5.17} $$

This gives a residual vector which is bounded by:

$$ \|r(k)\| \le \delta\,\|(I - H_y(z))u(k)\| \tag{5.18} $$

An adaptive threshold can be determined as a function of u(k):

$$ Th(k) = \delta\,(I - H_y(z))u(k) \tag{5.19} $$
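A scalar sketch of an adaptive threshold of this kind is given below. The filter $H_y(z) = (1 - \alpha)/(z - \alpha)$, the bound $\delta = 0.2$ and the signals are assumptions made for the example, and the magnitude of the filtered input is used so that the threshold is non-negative.

```python
def adaptive_threshold(u, delta, alpha):
    """Th(k) = delta * |(1 - H_y(z)) u(k)| with the assumed H_y(z) = (1 - alpha)/(z - alpha)."""
    filtered = 0.0                       # output of H_y(z) driven by u
    th = []
    for uk in u:
        th.append(delta * abs(uk - filtered))
        filtered = alpha * filtered + (1.0 - alpha) * uk
    return th


def detect(residual, th):
    """Alarm (1) whenever the residual magnitude exceeds the threshold."""
    return [1 if abs(r) > t else 0 for r, t in zip(residual, th)]


u = [0.0] * 5 + [1.0] * 10                                   # input step at k = 5
th = adaptive_threshold(u, delta=0.2, alpha=0.5)

# Residual with a model-error transient at the step and a later fault-like jump.
residual = [0.0] * 5 + [0.15, 0.08] + [0.0] * 5 + [0.15] + [0.0] * 2
alarms = detect(residual, th)
print(th[5], th[6], alarms.index(1))
```

The threshold rises during the input transient, so the model-error response at k = 5 and 6 is tolerated, while a residual of the same size at k = 12, where the input is steady, raises an alarm.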
5.4 Statistical Detection

A number of decision techniques are covered by statistical (or hypothesis)
testing. These techniques basically state different hypotheses, $H_i$, concerning the system's operational conditions and, by means of a decision function, $g_i(H_i)$, determine which hypothesis is accepted. This section
describes two different techniques:

- Weighted sum-squared residual (WSSR) technique
- Sequential probability ratio test (SPRT)

The observations (residuals) used in the hypothesis testing are considered to be Gaussian sequences with zero mean, $\mu_0 = 0$, and a known (or
estimated) variance, $\sigma_0^2$, under normal (non-faulty) operational conditions,
i.e., typically residuals generated by means of Kalman filtering. The probability density function of a Gaussian sequence, $r_k$, with $N(\mu, \sigma^2)$ is described
by:

$$ p(r_k) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(r_k - \mu)^2}{2\sigma^2}} \tag{5.20} $$

Statistical testing methods are described in Basseville and Nikiforov
(1994) [2], Gertler (1991) [24], Tzafestas and Watanabe (1990) [48].
5.4.1 The Weighted Sum-squared Residual Technique

The WSSR technique uses the relation between a Gaussian and a $\chi^2$ distribution. The following is described for the one-dimensional case, but can
easily be converted to a multi-variable case. One residual generated from a
Kalman filter, $[r(1), r(2), \ldots, r(j)] = r$, where j is the number of samples, is
characterized by a Gaussian distribution with zero mean and a covariance
$\sigma^2$. The sum of the squared samples $[r^2(1) + r^2(2) + \ldots + r^2(j)] = \Sigma(r^2)$ has
a $\chi^2$ distribution. If the Gaussian distribution is given by N(0, 1), then
the sum of squares has the distribution $\chi^2(n)$, where n is the number of degrees of
freedom. If r has the distribution $N(\mu, 1)$, then the sum of squares has the
distribution $\chi^2(n, \lambda)$, where:

$$ \lambda(j) = \sum_{k=1}^{j}\mu^2(k) \tag{5.21} $$

The mean and the variance of the distribution are given by:

$$ \mu_{\chi^2} = n + \lambda \tag{5.22} $$

$$ \sigma_{\chi^2}^2 = 2n + 4\lambda \tag{5.23} $$
If r has the distribution N (; 2 ), then the sum of squares has the same
distribution as for 2 = 1, when ! 1.
Now, if the residual $r(k)$ is weighted with the standard deviation, $\sigma$, which is considered known, then the result, $r_w(k)$, is a signal with zero mean and a variance of one. This means that the distribution of the weighted sum squared sequence, $I(k)$, only contains one parameter for determination, namely the degrees of freedom, $n$, which is determined as $n = j - 1$.
\[ r_w(k) = r(k)\,\sigma^{-1}, \qquad I(j) = \sum_{k=1}^{j} r_w^T(k)\, r_w(k) = \sum_{k=1}^{j} r^T(k)\,\sigma^{-2}\, r(k) \tag{5.24} \]
If a system fault happens, the mean value of the residuals will change. Hypotheses are stated for the different faulty and non-faulty conditions. For instance, the hypothesis that nothing has occurred, $H_0 = \{\mu : \mu = \mu_0\}$, and the hypothesis that something has occurred, $H_1 = \{\mu : \mu \neq \mu_0\}$, give the following detection rule using the weighted sum squared sequence $I(k)$, where $h$ denotes the decision threshold:
accept $H_0$ if $I(k) \le h$
reject $H_0$ (or accept $H_1$) if $I(k) > h$
With the aid of $\chi^2$ tables, the values for the innovation window length (degrees of freedom) and the decision threshold must be chosen so that a trade-off is made between the probability of false detections, $P_F$ (rejecting $H_0$ when it should be accepted, i.e. accepting $H_1$ when the reality is $H_0$), and the probability of missed detections, $P_M$ (accepting $H_0$ when it should be rejected, i.e. accepting $H_0$ when the reality is $H_1$). $P_F$ can be determined from the choice of confidence level, $1 - P_F$, which is typically chosen as 0.95, 0.995 or 0.999, giving a probability of false detection of 5%, 0.5% or 0.1% respectively. $P_M$ depends on the statistics of the sequence when a fault is present, which might not be known. The decision rule can be written as:
\[ g(k) = \begin{cases} 1 & \text{when } I(k) > h \\ 0 & \text{when } I(k) \le h \end{cases} \tag{5.25} \]
In the above formulation, the algorithm is running only once. The algorithm should be reset to zero every time a hypothesis has been confirmed in order to run sequentially for on-line detection.
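The weighted sum squared test with reset can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the course: the function name `chi_square_detector`, the scalar residual, and the fixed-length window are all hypothetical choices, and the threshold is passed in as a number read from a $\chi^2$ table.

```python
from typing import Iterable, List


def chi_square_detector(residuals: Iterable[float], sigma: float,
                        window: int, threshold: float) -> List[int]:
    """Return one decision per completed window: 1 = reject H0, 0 = accept H0.

    Assumes a scalar residual with known standard deviation `sigma`.
    """
    if sigma <= 0 or window <= 0:
        raise ValueError("sigma and window must be positive")
    decisions: List[int] = []
    acc = 0.0   # running weighted sum of squares I(k)
    count = 0   # samples accumulated in the current window
    for r in residuals:
        rw = r / sigma            # weighted residual r_w(k) = r(k) / sigma
        acc += rw * rw
        count += 1
        if count == window:
            decisions.append(1 if acc > threshold else 0)
            acc = 0.0             # reset so the test runs sequentially
            count = 0
    return decisions
```

For example, with a window of 11 samples, the text's rule $n = j - 1$ gives 10 degrees of freedom, for which a $\chi^2$ table lists 18.307 at the 0.95 level. The call `chi_square_detector([0.1] * 11 + [2.0] * 11, sigma=1.0, window=11, threshold=18.307)` then returns `[0, 1]`: the first window is accepted as fault-free and the second is flagged. Samples left over in an incomplete window produce no decision in this sketch.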
5.4.2 Sequential Probability Ratio Test
If a fault occurs, the effect is a change of the mean value, $\mu$, of the residual $r(k)$. Then a set of simple hypotheses can be considered, where $H_0 = \{\mu : \mu = \mu_0\}$ is the hypothesis concerning the process under normal operational conditions. $H_i = \{\mu : \mu = \mu_i\}$ is the hypothesis concerning the process under a faulty condition, $i = 1, 2, \ldots, m$, where $m$ is the number of fault conditions. In the following description, testing between the two hypotheses, $H_0$ and $H_1$, is described. The tool for testing between the two hypotheses is based on the log-likelihood ratio, defined by:
\[ s(j) = \ln \frac{p_{H_1}(r_i(1), r_i(2), \ldots, r_i(j))}{p_{H_0}(r_i(1), r_i(2), \ldots, r_i(j))} \tag{5.26} \]
where $p_{H_i}(r_i(1), r_i(2), \ldots, r_i(j))$ is the probability density function considering that hypothesis $H_i$ is true and taken over the samples $r_i(1)$ to $r_i(j)$. The expectation value of $s(j)$, $E[s(j)]$, is less than zero when hypothesis $H_0$ is true, while it is above zero when hypothesis $H_1$ is true. A change in the mean value is then reflected as a change of sign in the mean value of $s(j)$. The cumulative sum of $s(j)$:
\[ S(j) = \sum_{k=1}^{j} s(k) \tag{5.27} \]
is the log likelihood ratio for the observations from $r(1)$ to $r(j)$ and is the decision function, when testing between $H_0$ and $H_1$ using the following decision rule:
accept $H_0$ when $S \le a$
accept $H_1$ when $S \ge h$
continue to observe and test when $a < S < h$
which can be rewritten to:
\[ g(r) = \begin{cases} 1 & \text{when } S \ge h \\ 0 & \text{when } S \le a \end{cases} \tag{5.28} \]
The threshold values $a$ and $h$ must fulfil the inequality $a < S < h$ when the test is started; with the cumulative sum initialised to zero this requires $a < 0 < h$.
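To make the sign property of the log-likelihood ratio concrete, consider a worked case under assumptions that are not stated in the text: independent Gaussian residual samples with known variance $\sigma^2$, mean $\mu_0$ under $H_0$ and mean $\mu_1$ under $H_1$, and the ratio evaluated for one sample at a time. The increment then becomes
\[ s(k) = \ln \frac{p_{H_1}(r(k))}{p_{H_0}(r(k))} = \frac{\mu_1 - \mu_0}{\sigma^2} \left( r(k) - \frac{\mu_0 + \mu_1}{2} \right) \]
and its expectation under the two hypotheses is
\[ E[s(k) \mid H_0] = -\frac{(\mu_1 - \mu_0)^2}{2\sigma^2} < 0, \qquad E[s(k) \mid H_1] = +\frac{(\mu_1 - \mu_0)^2}{2\sigma^2} > 0 \]
so the cumulative sum drifts towards $a$ while the process is fault-free and towards $h$ once the mean has changed to $\mu_1$.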
The two threshold values can be selected by the designer to reflect the trade-off between the probability of false alarms, $P_F$ ($H_1$ (a fault) is detected but the real condition is $H_0$ (no fault)), and the probability of missed detections, $P_M$ ($H_0$ (no fault) is detected but the real condition is $H_1$ (fault)).
Determination of the threshold values using $P_F$ and $P_M$ directly was proposed by Deckert (1978) [14]. The thresholds $h$ and $a$ are given by
\[ h = \ln\left(\frac{1 - P_M}{P_F}\right), \qquad a = \ln\left(\frac{P_M}{1 - P_F}\right) \tag{5.29} \]
This is a very useful result from an engineering point of view. In an implementation, it will usually be decided to run the test sequentially. This means that, as soon as one of the thresholds is reached, the associated hypothesis is declared TRUE, and the test is restarted. This enables us to find a fault that occurs at a random instant in time. While the test runs for the first time, we cannot assume any of the hypotheses to be true with the desired probability ($P_M$, $P_F$).
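The sequential procedure described above can be sketched as follows. This is an illustrative Python sketch, not the course's implementation: the function names `sprt_thresholds` and `sequential_sprt` are hypothetical, and the log-likelihood increment assumes independent scalar Gaussian residuals with known standard deviation `sigma` and known means `mu0` and `mu1`, which the text does not prescribe.

```python
import math
from typing import Iterable, List, Tuple


def sprt_thresholds(p_false: float, p_missed: float) -> Tuple[float, float]:
    """Return (a, h) computed from P_F and P_M as in equation (5.29)."""
    if not (0.0 < p_false < 1.0 and 0.0 < p_missed < 1.0):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    h = math.log((1.0 - p_missed) / p_false)
    a = math.log(p_missed / (1.0 - p_false))
    return a, h


def sequential_sprt(residuals: Iterable[float], mu0: float, mu1: float,
                    sigma: float, a: float, h: float) -> List[Tuple[int, int]]:
    """Run the test repeatedly and return (sample index, decision) pairs.

    Decision 1 means H1 was accepted, decision 0 means H0 was accepted.
    """
    if sigma <= 0 or not a < 0.0 < h:
        raise ValueError("need sigma > 0 and a < 0 < h")
    decisions: List[Tuple[int, int]] = []
    S = 0.0  # cumulative log-likelihood ratio S(j)
    for k, r in enumerate(residuals, start=1):
        # Assumed Gaussian increment: s(k) = (mu1 - mu0)/sigma^2 * (r - (mu0 + mu1)/2)
        S += (mu1 - mu0) / sigma ** 2 * (r - 0.5 * (mu0 + mu1))
        if S >= h:
            decisions.append((k, 1))
            S = 0.0  # restart the test after declaring H1
        elif S <= a:
            decisions.append((k, 0))
            S = 0.0  # restart the test after declaring H0
    return decisions
```

With $P_F = P_M = 0.01$, `sprt_thresholds(0.01, 0.01)` gives $h = \ln 99 \approx 4.595$ and $a = -\ln 99$. As a hand-checkable run with simpler thresholds, `sequential_sprt([0, 0, 0, 0, 1, 1, 1, 1], mu0=0.0, mu1=1.0, sigma=1.0, a=-2.0, h=2.0)` returns `[(4, 0), (8, 1)]`: the sum reaches $a$ after four fault-free samples, restarts, and reaches $h$ four samples after the mean has changed.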
Bibliography
[1] K. J. Åström, J. J. Anton, and K. E. Årzén. Expert control. Automatica, 22(3):227-286, 1986.
[2] M. Basseville and I. Nikiforov. Statistical Change Detection. Prentice Hall, 1994.
[3] M. Basseville and I. V. Nikiforov. Detection of Abrupt Changes: Theory and Application. Information and System Science. Prentice Hall, New York, 1993.
[4] T. E. Bell. Managing Murphy's law: Engineering a minimum-risk system. Spectrum, 1989.
[5] M. Blanke. Aims and means in the evolution of fault tolerant control. In European Science Foundation Workshop, Control of Complex Systems (COSY), pages 22-32, Sept. 1995.
[6] M. Blanke. Design of dependable control systems using a component based approach. In Proc. IFAC Workshop: On-line Fault Detection and Supervision in the Chemical Process Industries, pages 187-195, Newcastle upon Tyne, UK, Jun. 1995.
[7] M. Blanke, S. A. Bøgh, R. B. Jørgensen, and R. J. Patton. Fault detection for a diesel engine actuator - a benchmark for FDI. Control Engineering Practice, 3:1731-1740, Dec. 1995.
[8] M. Blanke and R. B. Jørgensen. Reliability related to sensor and actuator interface in machinery systems. Technical report, Aalborg University R93-4016, 1993.
[9] M. Blanke, S. B. Nielsen, and R. B. Jørgensen. Fault Accommodation in Feedback Control Systems. Lecture Notes in Computer Science Vol. 736. Springer Verlag, 1993. ed. R.L. Grossman, A. Nerode, A.P. Ravn, and H. Rischel.
[10] S. A. Bøgh, R. Izadi-Zamanabadi, and M. Blanke. Onboard supervisor for the Ørsted satellite attitude control system. In Artificial Intelligence and Knowledge Based Systems for Space, 5th Workshop, pages 137-152, Noordwijk, Holland, Oct. 1995. The European Space Agency, Automation and Ground Facilities Division.
[11] J. Chen and R. J. Patton. A reexamination of fault detectability and isolability in linear dynamic systems. In IFAC Safeprocess 94, Helsinki, Finland, pages 590-596, 1994.
[12] R. David and H. Alla. Petri-nets for modeling of dynamic systems - a survey. Automatica, 30(2):175-202, 1994.
[13] T. J. A. de Vries. Conceptual Design of Controlled Electro-Mechanical Systems. PhD thesis, Universiteit Twente, NL, 1994.
[14] J. C. Deckert. Definition of the F-8 DFBW aircraft control sensor analytic redundancy management algorithm. Technical report, C.S. Draper Laboratory, Cambridge, Massachusetts, 1978. Report R-1178.
[15] X. Ding and P. M. Frank. An approach to robust residual generation and evaluation. In Proc. Conference on Decision and Control, pages 656-661, Brighton, UK, Dec. 1991. IEEE.
[16] R. I. (ed). Postprints of IFAC Safeprocess 91, Baden-Baden. Pergamon Press, Oxford, UK, 1991.
[17] A. Emami-Naeini, M. M. Akhter, and S. M. Rock. Effect of model uncertainty on failure detection: The threshold selector. IEEE AC, 33(12):1106-1115, Dec. 1988.
[18] P. M. Frank. Fault diagnosis in dynamic systems using analytical and knowledge-based redundancy. Automatica, 26(3):459-474, 1990.
[19] P. M. Frank. Enhancement of robustness in observer-based fault detection. In Preprints of IFAC/IMACS Symposium SAFEPROCESS'91, volume 1, pages 275-287, Baden-Baden, Sept. 10-13 1991. A modified version also published in Int. J. Control, Vol. 59, No. 4, 955-981, 1994.
[20] P. M. Frank. Application of fuzzy logic to process supervision and fault diagnosis. In Preprints of the IFAC Sympo. on Fault Detection, Supervision and Safety for Technical Processes: SAFEPROCESS'94, volume 2, pages 531-538, Espoo, Finland, Jun. 13-16 1994.
[21] P. M. Frank. Advances in fault tolerance by model-based fault diagnosis. In ESF Workshop, COSY'95, pages 15-22, Rome, Italy, Sept. 1995.
[22] O. I. Franksen. Group representation of finite polyvalent logic - a case study using APL notation. Proc. IFAC World Congress, Helsinki, pages 875-887, 1978.
[23] J. J. Gertler. Survey of model-based failure detection and isolation in complex plants. IEEE Control Syst. Mag., 8(6):3-11, 1988.
[24] J. J. Gertler. Analytical redundancy methods in failure detection and isolation. In Preprints of IFAC/IMACS Symposium SAFEPROCESS'91, volume 1, pages 9-21, Baden-Baden, Sept. 10-13 1991. Also published in a revised version in Control - Theory and Advanced Technology, Vol. 9, No. 1, 259-285, 1993.
[25] J. J. Gertler. Analytical redundancy methods in failure detection and isolation. Control Theory and Advanced Technology, 1(9):259-285, 1993.
[26] S. A. Herrin. Maintainability applications using the matrix FMEA technique. Transactions on Reliability, R-30(2):212-217, Jun. 1981.
[27] R. Isermann. Process fault detection based on modelling and estimation methods: A survey. Automatica, 20(4):387-404, 1984.
[28] R. Isermann, editor. Preprints of IFAC/IMACS Symposium on Fault Detection, Supervision and Safety for Technical Processes - SAFEPROCESS'91, Baden-Baden, Germany, Sept. 10-13 1991.
[29] R. Isermann. Integration of fault detection and diagnosis methods. In Preprints of the IFAC Sympo. on Fault Detection, Supervision and Safety for Technical Processes: SAFEPROCESS'94, volume 2, pages 597-612, Espoo, Finland, Jun. 13-16 1994.
[30] K. Jensen. Coloured Petri Nets, volume 2 of EATCS Monographs on Theoretical Computer Science. Springer Verlag, 1994.
[31] R. B. Jørgensen. Development and Test of Methods for Fault Detection and Isolation. PhD thesis, Department of Control Engineering, Aalborg University, Fredrik Bajers Vej 7C, DK 9220 Aalborg, Denmark, Jul. 1995.
[32] D. Karnopp and R. Rosenberg. Introduction to Physical System Dynamics. McGraw-Hill, 1983.
[33] J. M. Legg. Computerized approach for matrix-form FMEA. IEEE Transactions on Reliability, R-27(1):254-257, Jan. 1978.
[34] L. Ljung. System Identification: Theory for the User. Prentice Hall, 1987.
[35] L. Ljung and T. Söderström. Theory and Practice of Recursive Identification. The MIT Press, Massachusetts and London, 1983.
[36] C. P. Lunau. On the design of reflective diagnosis systems. Technical report, Aalborg University, Department of Computer Science, 1995.
[37] C. P. Lunau. A reflective architecture for process control applications. In M. Aksit and S. Matsuoka, editors, ECOOP'97 Object Oriented Programming, pages 170-189. Springer Verlag, 1997. Lecture Notes in Computer Science, Vol. 1241.
[38] C. P. Lunau and J. K. Nielsen. Emma: An emergency management system for use onboard ships. In IFAC Workshop on Control Applications on Marine Systems CAMS'95, pages 164-173, Trondheim, Norway, May 1995. International Federation of Automatic Control.
[39] P. Maes. Computational Reflection. PhD thesis, Artificial Intelligence Laboratory, Vrije Universiteit Brussel, Belgium, 1987. Technical report 87-2.
[40] G. Møller. On the Technology of Array-Based Logic. PhD thesis, Electrical Power Eng. Dept., Tech. University of Denmark, Lyngby, Denmark, 1995.
[41] T. More. Notes on the Diagrams, Logic and Operations of Array Theory. In: Structures and Operations in Eng. Management Systems (eds. Ø. Bjørke and O.I. Franksen), Tapir, 1981.
[42] R. Patton and J. Chen. A review of parity space approaches to fault diagnosis. In Preprints to SafeProcess 1991, volume 1, pages 239-55, 1991.
[43] R. J. Patton. Robust model-based fault diagnosis: The 1995 situation. In Proc. IFAC Workshop on Supervision and Fault Diagnosis in the Chemical Process Industries, Newcastle, UK, pages 55-78, 1995.
[44] R. J. Patton and J. Chen. A robustness study of model-based fault detection for jet engine systems. In Proc. of the 1st IEEE Conf. on Control Application, pages 871-876, Dayton, Ohio, Sept. 13-16 1992.
[45] R. J. Patton, P. M. Frank, and R. N. Clark, editors. Fault Diagnosis in Dynamic Systems, Theory and Application. Control Engineering Series. Prentice Hall, New York, 1989.
[46] D. N. Shields. Robust fault detection for generalized state space systems. In Proc. of the IEE Int. Conf.: Control'94, pages 1335-1349, Warwick, UK, March 21-24 1994. Peregrinus Press, Conf. Pub. No. 389.
[47] C. Tsui. A general failure detection, isolation and accommodation system with model uncertainty and measurement noise. IEEE Transactions on AC, 39(11):2318-2321, 1994.
[48] S. Tzafestas and K. Watanabe. Modern approaches to system/sensor fault detection and diagnosis. Journal A, 31(4):42-57, 1990.
[49] J. C. Willems. Paradigms and puzzles in the theory of dynamical systems. IEEE Trans. AC, 36(3):259-294, 1991.
[50] A. S. Willsky. A survey of design methods for failure detection in dynamic systems. Automatica, 12(6):601-611, 1976.
[51] J. Yuan. Strategy to establish a reliability model with dependent components through FMEA. Reliab. Eng, 11(1):37-45, 1985.
