
Fault Tolerant Control - an Engineering Approach

Mogens Blanke
Department of Control Engineering
Aalborg University, Denmark
email: blanke@control.auc.dk

September 1996

Contents
1 Introduction
1.1 Acronyms and Abbreviations
1.1.1 Definitions
1.1.2 Acronyms
1.1.3 Abbreviations

2 About Fault Tolerant Control
2.1 Introduction
2.2 How Fault Tolerance is Obtained
2.2.1 Open and closed loop systems
2.2.2 Reliability Analysis
2.2.3 A systematic Approach
2.3 Component Based Analysis of Fault Propagation
2.3.1 The Matrix FMEA Method
2.3.2 Completeness
2.3.3 Fault propagation in closed loop
2.3.4 Other approaches
2.3.5 Decision about fault handling
2.3.6 Fault Accommodation
2.4 Models for FDI
2.4.1 FDI based on dynamic models
2.5 Interconnection at subsystem level
2.5.1 The link to FDI models
2.6 An Architecture for Supervisory Control
2.7 Systematic Design
2.8 Supervisor Design and Implementation
2.8.1 Array based logic
2.8.2 Petri net implementation
2.8.3 Reflective programming implementation
2.8.4 A prototype implementation
2.9 Example: Temperature Control
2.10 Summary

3 Control System Interface with Physical Plant
3.1 Component Failure Modes
3.1.1 Sensor and Actuator Types
3.1.2 Level Measurement
3.1.3 Temperature Measurement
3.1.4 Pressure Measurement
3.2 Angular Position Measurement
3.2.1 Potentiometer
3.2.2 Flow Measurement
3.3 Actuators for Flow Control
3.3.1 Three-way Valve
3.3.2 Pumps
3.4 FMEA Schemes for Sensors and Actuators
3.4.1 Sensor Faults
3.4.2 Actuator Faults
3.5 Requirements to Interface
3.5.1 Component Categorization
3.5.2 Sensors
3.5.3 Single Sensor Fault Detection
3.5.4 Multiple sensor fault detection
3.5.5 Filtering of Electromagnetic Spikes
3.6 Actuators
3.7 Interface Requirements
3.7.1 Requirements to hardware and firmware
3.7.2 Combined Hardware - Software requirements

4 Fault Detection and Isolation
4.1 FDI in Closed Loop Control Systems
4.2 Requirements to FDI
4.3 Modelling of Faults and Fault-propagation
4.4 Methods for Change Detection
4.5 Geometric Approaches to Change Detection
4.5.1 Generation of residuals
4.5.2 Parity Equations
4.5.3 Diagnostic Observer
4.5.4 Unknown Input Observer
4.6 Statistical Methods to Generate Residuals
4.6.1 Kalman Filtering
4.6.2 Parameter Estimation

5 The Change Detection Problem
5.1 About stochastic signals
5.1.1 Amplitude distribution
5.1.2 Mean and variance of a stationary process
5.1.3 Mean and variance of a filtered stationary process
5.2 Measuring the difference between statistical signals
5.2.1 The Kullback Distance between Gaussian signals
5.3 Change Evaluation
5.3.1 Threshold tests
5.4 Statistical Detection
5.4.1 The Weighted Sum-squared Residual Technique
5.4.2 Sequential Probability Ratio Test


List of Figures
2.1 Failure Mode and Effect Analysis scheme illustrated graphically. Two component levels are shown.
2.2 Propagation of fault effects in closed loop control of 3-way valve. Solid lines show fault propagation. Points marked with star show where propagation can be stopped.
2.3 Block diagram for cooling system with two 3-way valves and sketch of surrounding components.
2.4 Bond graph model principle for cooling loop, showing serial and parallel connections of components.
2.5 Three layer model for autonomous controller with link to upper level plant wide control or to operator interface.
2.6 Design method for dependable controller with autonomous fault detection and accommodation.
2.7 Temperature control loop with 3-way valve.
3.1 Switch arrangement for level switch.
3.2 3-wire resistance measurement of resistance in Pt element to measure temperature with compensation of wire resistance.
3.3 Pressure measurement using a strain gauge bridge fitted to a membrane, converted to a 4-20 mA current out of the transducer.
3.4 Binary pressure indicator. The solid line is mechanical difference and the bottom of it is the adjustable set-point value.
3.5 Electrical diagram of potentiometer and computer interface to enable fault detection at the single sensor level.
3.6 Valve characteristics for diverging and converting operation (the use of A or B ports for inflow).
3.7 Operation of 3-way valve actuator with relay operated induction motor. Abbreviations are: o: open, c: close, s: stop, HTR: Heater, LS: Limit Switch, TS: Torque Switch.
3.8 Standby pump set with remote control. The control computers are independent and have mutual supervision.
3.9 Triple conversion sampling has only marginal overhead but offers both significant electromagnetic spike suppression and consistency check within one sample.
4.1 Levels of FDIA automation.
4.2 A general procedure of residual generation in FDI.
4.3 Geometric interpretation for detection of several faults.
4.4 Geometric interpretation for detection and isolation of one fault at a time.
4.5 Geometric interpretation for simultaneous fault detection and isolation.
4.6 Illustration of a bank of Kalman filters for statistical FDI.
4.7 Residual generation based on parameter estimation.
5.1 Statistical detection is applied to the residual.

List of Tables
2.1 FMEA Scheme for 3-way Valve

Chapter 1

Introduction
This document is a lecture note in fault-tolerant control used at the 9th
semester course for MS students in process control at Aalborg University.

1.1 Acronyms and Abbreviations

1.1.1 Definitions
The following definitions were made by the Safeprocess Technical Committee
of IFAC.

1.1.2 Acronyms
Dependable System : A system that has high reliability in terms of high
availability and where the consequences of a fault are limited to the
system itself, i.e., local faults do not develop into failure at plant level.
Event : An internal or external occurrence involving equipment performance or human action that causes a system upset.
Failure : The inability of a system or subsystem to accomplish its required
function.
Fault : A change in the characteristics of a part or component such that
its mode of operation or performance is changed in an undesired way.
Required specifications are no longer fulfilled.
Fault tolerant system : A system where a fault may lead to change of
operation or reduced performance but a single fault does not develop
into a failure on a subsystem or system level.
Failure Modes : The various ways in which failures occur.
Hazard : An intrinsic property or condition that has the potential to cause
an accident.

Reliability : The probability that a system, subsystem or component
will perform its intended function for a specified period of time under
normal conditions.
Slow-down : A change in operation, usually a reduction of capacity or
power, to protect machinery from damage or excessive wear.
Safety system : Electronic equipment that receives sensor information
about critical quantities and activates a dedicated actuator to stop
machinery if specified conditions exist. The condition check is usually
made as a simple limit check of sensor values. The purpose of a safety
system is to protect machinery from permanent damage due to, e.g.,
overspeed or lack of lubrication oil or cooling water.
Shut-down : A stop of a machinery system to protect it from permanent
damage. A shut-down is usually made by a dedicated safety system
that measures essential
System hierarchy : The following system hierarchy is used:
Component: the lowest level which is considered as fault candidates/maintenance. Components may comprise or use information from other components. Sensors and actuators are examples
of components.
Subsystem: a collection of components that has a defined purpose
and requirements to performance and operational modes.
System: a collection of sub-systems.
Plant: the entirety of a physical system with its own purpose. A
plant is the largest entity considered.

1.1.3 Abbreviations
AI : Analog Input. Part of computer process interface.
AO : Analog Output. Part of computer process interface.
A/D : Analog to Digital conversion. Part of analog input.
D/A : Digital to Analog conversion. Part of analog output.
DI : Digital Input. Part of computer process interface.
DiGraph : Directed Graph. Used for fault models in reliability analysis.
DO : Digital Output. Part of computer process interface.
ETA : Event Tree Analysis
FMEA : Failure Mode and Effect Analysis
FTA : Fault Tree Analysis
FPA : Fault Propagation Analysis
HazOp : Hazard and Operability Analysis
IFAC : International Federation of Automatic Control
I/O : Input/Output
ISC : Integrated Ship Control
LO : Lubricating Oil
PHA : Preliminary Hazard Analysis


Chapter 2

About Fault Tolerant Control

Fault tolerant controls have the ability to be resilient to simple faults in
control loop components. When faults occur, performance may be reduced
or close-down may be needed in some cases, but a simple fault will never be
amplified and cause plant failure. Detection and accommodation techniques
can be employed to obtain these features. Theory and basic development
methods exist for several fault detection problems, and some attempts have
been made in the field of supervisory control, but fault handling has not yet
been the subject of systematic research. This paper focuses on fault tolerant
control with specific emphasis on fault handling design and implementation.
Aims and means are discussed and ideas towards consistent design methods
are presented. A promising method is shown to be an automated analysis
of component fault modes and their effects. This method provides decision
tables for fault handling that show how fault migration can be stopped.
The potential of these techniques, when fully developed, is shown to be
significantly improved fault tolerance and autonomy of control loops. The
impetus is that this is obtainable with fairly simple means.

2.1 Introduction

Process technology has changed to more complex plants with a high degree
of automation. This has enhanced quality and efficiency in normal operation, but also made systems more vulnerable to faults. As a consequence,
industrial attention has changed towards increased dependability, a synonym
for a high degree of availability, reliability, and safety.
Advances in automation have provided integration of monitoring and
control functions to enhance the operator's overview and ability to take
remedial actions when faults occur. Manual supervision, to detect a
fault, isolate its cause, and accommodate the system to a new condition,
has been much improved. However, the complexity and fast response time
required make it appealing to move the more basic supervision down from
the operator to the automation level. To achieve this, plant supervision
needs to be automated and become more autonomous.
This is technically possible with integrated automation systems as platforms, but new design methods are needed to cope efficiently with the complexity and ensure that the functionality of a supervisor is correct and consistent. Fail-safe systems, known from avionics and other safety critical
applications, are expensive in both hardware and development effort, and
are prohibitive in cost for ordinary process automation. Here, additional
hardware should not be required and implementation costs should be very limited.
The occurrence of faults can be tolerated, but faults should be prevented
from developing into failures at a subsystem or plant level. Furthermore, it
should be guaranteed that all essential faults are detected and all critical
faults are accommodated.
Fault Detection and Isolation (FDI) theory has matured over the last
decade. Efficient methods exist to detect additive faults, where faults are
understood as signal vectors in a state space or polynomial system description (Gertler, 1993 [25]; Patton, 1995 [43]; Isermann, 1994 [29]). Difficulties
with fault detection in nonlinear systems have started to be solved (Frank,
1995 [21]; Shields, 1994 [46]), and robustness problems have been
dealt with in various ways: fuzzification (Frank, 1994 [20]), threshold adaption (Emami-Naeini, 1988 [17]; Ding and Frank, 1991 [15]; Jørgensen, 1995
[31]), and statistical hypothesis testing (Basseville and Nikiforov, 1994 [2]).
Detectability was investigated in Chen and Patton (1994) [11].
Much less work has been devoted to the problem of what to do when a
fault has been detected. An overall approach was taken by Åström et al.
(1986) [1], where not-normal controller operation and tuning were key issues.
The accommodation problem was treated by Tsui (1994) [47] for a narrow
scenario where state feedback was required and faults needed to be state
disturbance signals, similar to actuator faults.
The scope in much control systems research has been limited to solving
the fairly well formulated problem starting off with mathematical models
of control objects and faults represented as additive signal vectors. A general design concept was treated in Blanke (1995) [6] and marine application
studies were presented in Blanke and Jørgensen (1993 [8] and 1995 [5]).
The paper by Bøgh et al. (1995) [10] discusses autonomous, fault tolerant
control of a micro-satellite using the same basic ideas.
This chapter focuses on development of an overall concept that meets
industrial requirements to development methods. A method is suggested
that gives a consistent design and assures system dependability. The basic
philosophy has been to use existing sensors and actuators in an integrated
system and make systematic use of both direct and indirect redundancy
in the available information. Component based fault analysis is shown to
assure a much higher degree of completeness than otherwise achievable.
Section 2 deals with fault tolerance, what it is and how it can be achieved.
Section 3 presents the component based analysis developed by the author
and co-workers. Section 4 deals with the problem of getting models for FDI
design without major modelling efforts. Section 5 presents an architecture
for fault tolerant control where a three layer model with a supervisor at
the top level appears to be an efficient vehicle for implementation. Section 6
discusses supervisor design and implementation in general terms, and section
7 lists the details of a procedure for systematic design. Section 8 shows
an example of the use of the technique and reports on experience with a
prototype tool. A conclusion and a reference list complete the chapter.

2.2 How Fault Tolerance is Obtained

Faults in one subsystem of an automated plant often have undesired effects
on other subsystems if remedial actions are not taken after a fault occurs.
Today, shut down functions and interlocks are used to prevent failures from
spreading from one sub-system to another. The use of such functions has, however, the consequence that plant availability is sometimes reduced without
good reason. With the ever higher degree of automation, this has been the
key cause of increased vulnerability to simple faults, particularly in sensors
and actuators.
Dependability of a control system can be obtained by giving it the ability
to detect and isolate faults and react with actions that accommodate the
control system to the fault. Fault accommodation is predetermined at the
design stage: a control system can freeze to a safe state or the controller can
be re-configured, e.g., by using a reduced set of sensors if a sensor fault has
occurred.

2.2.1 Open and closed loop systems

Handling of faults in open loop systems, e.g., monitoring and remote control,
is technically straightforward, but the reactions used to accommodate a
fault need to be designed with careful consideration to safety and availability
of the total plant. Optimization at a local level may easily violate an overall
safety goal.
Handling of faults in closed loop components is a more difficult and
challenging task. Properly designed systems can accommodate the effects
of faults whereas less careful designs can let fault effects propagate to other
subsystems.

2.2.2 Reliability Analysis

For the reasons given above, fault analysis needs to incorporate analysis
throughout a system. Traditional methods for Fault Detection and Isolation
(FDI) do not cover this problem. They are well able to detect the presence
of a fault as a difference between actual and expected behaviour. Isolation
of a particular fault requires a hypothesis about the observed effects from
this fault. This is obtained by ad hoc engineering and requires deep process
knowledge and engineering skills to make a successful design. It is expensive
in terms of both key personnel and time.
Analysis of system reliability is mandatory for safety critical systems but
is also more and more often used for common industrial systems, driven by
the increasing environment and safety awareness in recent years. The state
of the art is such that no method can guarantee a complete description of
all possible fault modes of a system. Certain forms of risk analysis provide,
nevertheless, a very systematic approach to fault modelling once possible
component faults have been identified. Faults in common industrial components are subject to constant study, and a methodology based on component
fault modelling could use accumulated knowledge for each type of component. The number of principally different components in a certain branch
of industry is small enough to make this a manageable exercise.

2.2.3 A systematic Approach

A systematic approach can be made if the basic methodology from risk
analysis is adapted to the detailed mathematical models needed for real time
FDI. A link has to be established from qualitative, static risk models at the
component level to quantitative, dynamic FDI descriptions of input-output
relations to achieve this goal.
The link is obviously to merge the component based generic dynamic
models (energy, momentum, and flow relations) with component fault models from the risk analysis. The generic dynamic models can be extended to
subsystem input-output descriptions using a system behaviour description
approach (Willems, 1991 [49]) or a more traditional bond graph approach
(Karnopp and Rosenberg, 1983 [32]).
The systematic approach shall provide the following information:
1. List of faults to detect
2. Mathematical model for use in FDI
3. Basic character/criticality of each fault
4. Required reaction to each fault
These details are elaborated in the following.

2.3 Component Based Analysis of Fault Propagation


2.3.1 The Matrix FMEA Method
A Failure Mode and Effects Analysis (FMEA) (Legg, 1978 [33]; Herrin, 1981 [26];
Yuan, 1985 [51]; Bell, 1989 [4]) starts with selection of the lowest level of analysis. In this context, this is sensors, valves, motors and similar components.
All potential faults and their effects are determined. An FMEA scheme for
each component shows how fault effects out of the component relate to faults
at inputs, outputs, or parts within the component. This is illustrated in
Figure 2.1.
Using $f_{ci}$ for component faults and $e_{ci}$ for the effects, the FMEA scheme
can also be expressed as:

Figure 2.1: Failure Mode and Effect Analysis scheme illustrated graphically.
Two component levels are shown.

\[
e_{ci} \leftarrow A^f_i \otimes f_{ci} \tag{2.1}
\]

where $A^f_i$ is a Boolean matrix representing the propagation. The index
$i$ is a component identifier and $\otimes$ the inner product disjunction operator.
The operation carried out by the $\otimes$ operator is equivalent to the scalar
Boolean disjunction "$\vee$" and the inner product to the "$\wedge$", i.e., row no. $k$ of
(2.1) is

\[
e_{ck} \leftarrow (a_{k1} \wedge f_{c1}) \vee (a_{k2} \wedge f_{c2}) \vee \dots \vee (a_{kn} \wedge f_{cn}) \tag{2.2}
\]
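
The row-wise rule in (2.2) can be tried out on a small example. The following is a minimal sketch in Python, assuming a made-up component with three faults and two effects; the matrix entries and the helper name `bool_product` are illustrative and are not taken from the lecture note.

```python
def bool_product(A, f):
    """Boolean matrix-vector product.

    Row k of the result is (A[k][0] and f[0]) or (A[k][1] and f[1]) or ...,
    which is the and/or rule written out in equation (2.2).
    """
    return [any(a and x for a, x in zip(row, f)) for row in A]


# Hypothetical propagation matrix: rows are effects, columns are faults.
A_f = [
    [True,  True,  False],   # effect 1 is caused by fault 1 or fault 2
    [False, False, True],    # effect 2 is caused by fault 3
]

f_c = [False, True, False]   # only fault 2 is present
e_c = bool_product(A_f, f_c)
print(e_c)                   # [True, False]
```

With only fault 2 present, the first effect is raised and the second is not, which is what row 1 and row 2 of the assumed matrix prescribe.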

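When some faults are effects that are propagated from other components,
we get

\[
e_{ci} \leftarrow A^f_i \otimes
\begin{bmatrix} f_{ci} \\ e_{ci-1} \end{bmatrix} \tag{2.3}
\]

System descriptions are obtained from interconnection of component descriptions. The description of a system with three components and open
loop structure is

\[
\begin{aligned}
e_{c3} &\leftarrow A^f_3 \otimes \begin{bmatrix} f_{c3} \\ e_{c2} \end{bmatrix} \\
e_{c2} &\leftarrow A^f_2 \otimes \begin{bmatrix} f_{c2} \\ e_{c1} \end{bmatrix} \\
e_{c1} &\leftarrow A^f_1 \otimes f_{c1}
\end{aligned} \tag{2.4}
\]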

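The fault effect scheme for this example is

\[
\begin{aligned}
e_{c3} &\leftarrow A^f_3 \otimes \begin{bmatrix} f_{c3} \\ e_{c2} \end{bmatrix} \\
e_{c3} &\leftarrow A^f_3 \otimes
\begin{bmatrix} I & 0 \\ 0 & A^f_2 \end{bmatrix} \otimes
\begin{bmatrix} f_{c3} \\ f_{c2} \\ e_{c1} \end{bmatrix} \\
e_{c3} &\leftarrow A^f_3 \otimes
\begin{bmatrix} I & 0 \\ 0 & A^f_2 \end{bmatrix} \otimes
\begin{bmatrix} I & 0 \\ 0 & A^f_1 \end{bmatrix} \otimes
\begin{bmatrix} f_{c3} \\ f_{c2} \\ f_{c1} \end{bmatrix}
\equiv A^f_{sys} \otimes f_{sys}
\end{aligned} \tag{2.5}
\]

where $I$ denotes identity blocks of conformable dimension.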
Effects are seen to be propagated to the next level of analysis and act
as part faults at that level. This is continued until the system level is
reached. The schemes give a surjective mapping from faults to effects: there
is a unique path from fault to end effect, but different faults may cause the
same end effect.
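
The open loop chain of three components can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three matrices are invented, and each matrix is assumed to have its columns ordered as the component's own faults followed by the effects received from the previous component, mirroring the stacked vector in (2.3).

```python
def bool_product(A, v):
    """Row k of the result is OR over j of (A[k][j] AND v[j])."""
    return [any(a and x for a, x in zip(row, v)) for row in A]


def propagate_chain(matrices, faults):
    """Propagate effects through an open loop chain of components.

    matrices[i] is the Boolean propagation matrix of component i+1 and
    faults[i] its own fault vector. The first component receives no effects.
    """
    effects = []
    for A, f in zip(matrices, faults):
        effects = bool_product(A, f + effects)
    return effects


# Hypothetical components (all entries are assumptions for illustration).
A1 = [[True, True]]                 # 2 own faults -> 1 effect
A2 = [[True, False],                # 1 own fault + 1 incoming effect -> 2 effects
      [False, True]]
A3 = [[True, False, True]]          # 1 own fault + 2 incoming effects -> 1 end effect

chain = [A1, A2, A3]
end_a = propagate_chain(chain, [[True, False], [False], [False]])
end_b = propagate_chain(chain, [[False, True], [False], [False]])
end_c = propagate_chain(chain, [[False, False], [True], [False]])
print(end_a, end_b, end_c)          # [True] [True] [False]
```

The first two cases show two different faults in component 1 producing the same end effect, while the fault in component 2 stays local in this made-up example.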


Reversal is obtainable through finding the inverse relation to (2.5),

\[
f_{sys} \leftarrow A^b_{sys} \otimes e_{c3}. \tag{2.6}
\]

The matrices $A^f$ and $A^b$ are each other's pseudo-inverse in the Boolean sense.
When there is no feedback involved, the result is the capability of isolation
of fault effects at any level.
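
One simple way to experiment with the reverse relation is to use the transpose of the forward matrix, which returns every fault that can cause at least one of the observed effects. Taking the transpose as the backward matrix is an assumption of this sketch; the text does not state how the Boolean pseudo-inverse is constructed, and the matrix below is invented for illustration.

```python
def bool_product(A, v):
    """Row k of the result is OR over j of (A[k][j] AND v[j])."""
    return [any(a and x for a, x in zip(row, v)) for row in A]


def transpose(A):
    """Swap rows and columns of a Boolean matrix given as a list of lists."""
    return [list(col) for col in zip(*A)]


# Hypothetical forward matrix: rows are end effects, columns are faults.
A_forward = [
    [True,  True,  False],
    [False, False, True],
]

# Assumed backward matrix: rows are faults, columns are end effects.
A_backward = transpose(A_forward)

observed = [True, False]            # only end effect 1 is observed
candidates = bool_product(A_backward, observed)
print(candidates)                   # [True, True, False]
```

Faults 1 and 2 both remain candidates because they cause the same effect, so the observed effect alone cannot separate them.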
Recent results from application experience indicate that the fault vector
needs to be extended such that each component is a logical expression of
more basic fault events,

\[
f_i = f_k \wedge f_l \tag{2.7}
\]

as an example. This makes the above procedure more elaborate
but still solvable.

2.3.2 Completeness
Completeness of the fault effect vector is a necessary prerequisite for later
fault detection and isolation, because the only faults that can be isolated
are those specified in the design. Completeness is obtained if all possible
component faults are considered. This is not achievable in a rigorous sense,
but engineering experience from risk analysis makes it possible for practical
purposes.
It is noted that completeness does not ensure that component fault
isolation is possible since several component faults could cause the same
effects.

2.3.3 Fault propagation in closed loop

The FMEA scheme for a set of components connected in a closed loop is
principally described as

\[
e_{ci} \leftarrow A^f_i \otimes
\begin{bmatrix} f_{ci} \\ e_{ci} \end{bmatrix} \tag{2.8}
\]

Looking at the logic operation of this equation, it is obvious that the
solution is, if it exists,

\[
e_{ci} \leftarrow A^f_i \otimes [f_{ci}] \tag{2.9}
\]

With closed loop feedback and negative loop amplification, equation
(2.8) is unstable, however, and a steady state solution does not exist.


2.3.4 Other approaches

An alternative to the suggested FMEA based method of analysis could be
Petri nets, see, e.g., David and Alla (1994) [12] and Jensen (1994) [30]. Petri nets
enable analysis of discrete event systems and extensions of the theory enable
modelling of mixed continuous and discrete systems. The Petri net approach
has not been pursued in this context, but is an obvious research task.

2.3.5 Decision about fault handling

The implication is that an automated analysis will need to consider closed
loops as special cases. The interpretation of a closed loop in an FMEA
scheme is merely the observation that closed loop operation may amplify or
attenuate the effect of a fault. Which of the two happens depends on the
dynamic properties of the control loop and this question is outside the scope
of the matrix FMEA analysis.
Figure 2.2 shows the graphical representation of a closed loop FMEA
analysis. Bold lines in the scheme show how faults propagate. The important
observation is that propagation can be stopped at the points marked with
stars. This means that fault handling should be applied exactly at these
points.
The component based analysis can thus provide both a list of fault effects
and a suggestion of where in a system fault propagation can be stopped. In
the design method, it is then up to the designer to evaluate the severity of
each fault effect and determine which fault accommodation actions shall be
implemented.
Various examples have been investigated to illustrate the application of
the method, to provide insight into traps, and to highlight features (Blanke
and Jørgensen, 1993 [8]; Jørgensen, 1995 [31]).

2.3.6 Fault Accommodation

Once a sensor or actuator fault has been detected and isolated, some action
is needed by the control system to reduce or eliminate the fault effects, if
possible. This is referred to as fault accommodation. The specific actions
required to accommodate faults can be one or more of the items listed below:

• Decrease performance, e.g., reduce throughput of the controlled system.
• Change settings in the surrounding process to decrease the requirements to the controlled system.
• Change controller parameters.
• Change controller structure.
• Use component redundancy if possible.
• Replace defective sensor with signal estimator/observer (analytical redundancy). Note that this operation may be limited in time
because external disturbances may increase the estimation error.
• If the fault is a set point error then freeze at last fault-free set
point and continue control operation. Issue an alert message to
operators.
• Freeze controller output to a predetermined value. Zero, maximum or last fault-free value are three commonly required values;
the one to be used is entirely application dependent. Finally
disable the controller.
• Fail-to-safe operation.
• Emergency stop of physical process (safety system).

Figure 2.2: Propagation of fault effects in closed loop control of 3-way valve.
Solid lines show fault propagation. Points marked with star show where
propagation can be stopped. (Components shown: three-way valve, motor/gear,
position controller, limit switch, potentiometer.)

The accommodation actions needed follow from the FMEA analysis.
The requirements on the software side are that such fault accommodation
actions can be easily specified and that autonomous fault detection and
accommodation are part of controller and safety system specifications. This
is beyond the scope of present automation equipment but is believed to be
an essential part of future requirements to improve overall reliability of
automated machinery systems.
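
The requirement that accommodation actions be easy to specify suggests a table-driven form. The following is a minimal sketch of such a decision table in Python; the effect names, the action names, the ordering of actions, and the default for unlisted effects are all assumptions made for illustration and do not come from the lecture note.

```python
# Accommodation actions, ordered from least to most drastic (assumed ordering).
ACTIONS = [
    "continue",
    "use_estimator",
    "freeze_output",
    "fail_to_safe",
    "emergency_stop",
]

# Hypothetical decision table: detected fault effect -> required action.
DECISION_TABLE = {
    "position_sensor_invalid": "use_estimator",
    "valve_stuck": "freeze_output",
    "flow_zero": "emergency_stop",
}


def select_action(active_effects):
    """Return the most drastic action required by any active fault effect.

    Effects missing from the table fall back to "fail_to_safe", a
    conservative default chosen for this sketch.
    """
    required = [DECISION_TABLE.get(e, "fail_to_safe") for e in active_effects]
    if not required:
        return "continue"
    return max(required, key=ACTIONS.index)


print(select_action([]))
print(select_action(["position_sensor_invalid", "valve_stuck"]))
```

Keeping the table as plain data separates the designer's severity decisions from the supervisor logic that enforces them.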

2.4 Models for FDI


2.4.1 FDI based on dynamic models
Once the list of all component fault effects is established, dynamic models
for FDI need to be generated.
The Bond Graph approach (see e.g. Karnopp and Rosenberg, 1983 [32])
is well suited for component based dynamic modelling. An example of the
modelling associated with a 3-way valve in a cooling system is seen in Table
2.1.
Two models are needed corresponding to each of the two physical laws
of the thermal system: conservation of mass - the flow equation, and conservation of energy - the temperature equation.
The flow model is, in equation form:

Table 2.1: FMEA Scheme for 3-way Valve. Columns (effects): flow zero, flow
reduced, A flow high, B flow high. Rows (fault location): input fault,
component fault, output fault. Listed causes include pipe leak, pipe clogged,
pipe broken, power fault, rotor fault, bearing worn, port A or B clogged, and
set-point fault.

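\[
\begin{array}{ll}
\text{Input:} & q_1 \\
\text{Input:} & q_2 \\
\text{Fault:} & \Delta q \;
  \begin{cases}
    = 0 & \text{no fault} \\
    < 0 & \text{reduced flow} \\
    = -q_1 - q_2 & \text{no flow}
  \end{cases} \\
\text{Internal:} & R_1 = f(\cdot), \ \text{where } 0 < f(\cdot) < K \\
                 & R_2 = K - f(\cdot) \\
\text{Output:} & q_3 = q_1 + q_2 + \Delta q
\end{array} \tag{2.10}
\]

The flow fault effect is incorporated as an additive output fault. The
flow into port A is $q_1$, and $q_2$ into port B. The output flow is $q_3$.
The temperature model is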

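\[
\begin{array}{ll}
\text{Input:} & t_1 \\
\text{Input:} & t_2 \\
\text{Internal:} & K_1 = f(\cdot), \ \text{where } 0 < f(\cdot) < K \\
                 & K_2 = K - f(\cdot) \\
\text{Output:} & t_3 = t_1 K_1 + t_2 K_2
\end{array} \tag{2.11}
\]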
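
The two valve models can be written as small functions to check the fault cases numerically. This is a sketch under assumptions: the valve characteristic f is taken to be linear in a normalised valve position between 0 and 1, and K defaults to 1; neither choice is given in the text, and the function names are illustrative.

```python
def valve_share(position, K=1.0):
    """Assumed characteristic f: linear in valve position, with 0 < f < K."""
    if not 0.0 < position < 1.0:
        raise ValueError("position must lie strictly between 0 and 1")
    return K * position


def flow_out(q1, q2, dq=0.0):
    """Flow model: q3 = q1 + q2 + dq, with dq the additive output fault."""
    return q1 + q2 + dq


def temperature_out(t1, t2, position, K=1.0):
    """Temperature model: t3 = t1*K1 + t2*K2 with K1 = f and K2 = K - f."""
    K1 = valve_share(position, K)
    K2 = K - K1
    return t1 * K1 + t2 * K2


print(flow_out(2.0, 3.0))                 # no fault: 5.0
print(flow_out(2.0, 3.0, dq=-5.0))        # dq = -q1 - q2, no flow: 0.0
print(temperature_out(80.0, 20.0, 0.25))  # 35.0
```

Setting dq to zero, to a negative value, or to minus the sum of the inflows reproduces the three fault cases of the flow model.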

Such models are easily represented as bond graphs. The causality of
a bond graph model is determined by the interconnection, i.e., how the
components are used. There is hence freedom to use the same component
representation in different connections and use different directions of causality. This flexibility makes the bond graphs well suited for use in a component
library, see de Vries (1994) [13].

2.5 Interconnection at subsystem level

A dynamic model at the subsystem level is obtained from specification of
links between components. This is illustrated schematically in Figures 2.3
and 2.4 for two 3-way valve cooling loops and a process being cooled. The diagrams are not complete but serve as illustration. The principal issue is that
a graphical component description with component links has an underlying
model and the structures of the two can be directly related.

Figure 2.3: Block diagram for cooling system with two 3-way valves and
sketch of surrounding components.

2.5.1 The link to FDI models


Analytical methods for detection of faults and their subsequent isolation (FDI) require a dynamic fault model like equation (2.12).

\[
\begin{aligned}
\dot{x}(t) &= A x(t) + B u(t) + E_f f(t) + E_d d(t) \\
y(t) &= C x(t) + D u(t) + G_f f(t) + G_d d(t)
\end{aligned}
\tag{2.12}
\]

The symbols in eq. (2.12) are: state vector, x, control input, u, disturbance, d, additive fault vector, f, and output vector, y. Fault propagation is described by the plant dynamics and the matrices Ef and Gf.
Parity equation and fault detection observer based FDI methods detect a deviation from normal and isolate the component of the fault vector, f, which is the most likely cause of the deviation. Identification approaches determine changes of parameters in any of the system matrices.
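The idea of detecting a deviation from normal can be illustrated with a residual computed against a fault-free model run in parallel with the plant. The sketch below is a strong simplification of the parity equation and observer methods named above: it assumes a scalar, stable, discrete-time version of (2.12) without disturbances, and all function names and numerical values are hypothetical.

```python
def simulate(a, b, e_f, c, g_f, u_seq, f_seq, x0=0.0):
    """Simulate x[k+1] = a*x[k] + b*u[k] + e_f*f[k], y[k] = c*x[k] + g_f*f[k]."""
    x, ys = x0, []
    for u, f in zip(u_seq, f_seq):
        ys.append(c * x + g_f * f)
        x = a * x + b * u + e_f * f
    return ys


def residuals(a, b, c, u_seq, y_seq, x0=0.0):
    """Residual r[k] = y[k] - y_model[k], with the fault-free model run in parallel."""
    y_model = simulate(a, b, 0.0, c, 0.0, u_seq, [0.0] * len(u_seq), x0)
    return [y - ym for y, ym in zip(y_seq, y_model)]


def first_alarm(res, threshold):
    """Index of the first residual whose magnitude exceeds the threshold, else None."""
    for k, r in enumerate(res):
        if abs(r) > threshold:
            return k
    return None


# Additive sensor fault (g_f = 1, e_f = 0) that appears at sample 5.
u = [1.0] * 10
f = [0.0] * 5 + [0.5] * 5
y = simulate(0.9, 0.1, 0.0, 1.0, 1.0, u, f)
alarm_index = first_alarm(residuals(0.9, 0.1, 1.0, u, y), threshold=0.1)
```

A fault that is not represented by an entry in e_f or g_f cannot be attributed to a component of f, which is the limitation discussed in the text that follows.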
A crucial point about FDI methods is that only fault effects which have been included in the model can be isolated. The FDI methods alone cannot guarantee that all relevant faults can be isolated. This obstacle is overcome by using the risk analysis approach to define the fault effects.
Transformation of a bond graph model to an FDI state space description can be completely automated when causality in the loop is properly defined. The interested reader should consult de Vries (1994) [13] or Karnopp and Rosenberg (1983) [32].

Figure 2.4: Bond graph model principle for cooling loop, showing serial and parallel connection of components. Diagram labels: 3WV, TS, ME, Cntrl.

2.6 An Architecture for Supervisory Control


The first step to achieve fault tolerant control is detection of a non-normal condition. The second step is to isolate the cause to one or more possible component faults. The third step is to evaluate the condition, decide which actions to activate to accommodate the fault, and finally enforce the handling actions.
These functions are adequately implemented as a supervisory structure with three levels:
1. A lower level with control and input/output.
2. A second level with functions to detect fault conditions in sensors, actuators, control loops and control algorithms where needed.
3. A third level with decision logic which reacts on the current condition, receiving inputs from detectors on any non-normal state and the operational mode of the process. Dedicated effector modules will also exist to execute handling actions when required.
The 2nd and 3rd levels are meta-levels which together constitute a supervisory control. Levels 1 and 2 are executed in real-time. Level 3 is executed when triggered by events at a lower level.

Figure 2.5: Three layer model for autonomous controller with link to upper level plant wide control or to operator interface.

2.7 Systematic Design


A method for systematic design is a computer assisted interactive design process where the designer's judgment is used to determine fault handling actions. Parts of the method are automated: the FMEA analysis, the logic inference, consistency check and implementation of decision logic in a supervisor.
The systematic design is illustrated in figure ?? and comprises the following steps:
1. Component based FMEA
   • Detailed FMEA model for each component
   • List all potential component faults
   • Find fault effects for each component fault

2. Criticality assessment
   • Propagate fault effects through the system and make a list of fault effects at the subsystem/system level
   • Evaluate the criticality of each fault effect

3. Deduce remedy actions
   • Locate closed loop points and determine desired fault reactions
   • Use system functionality requirements in this analysis
   • Make list of effects to be handled
   • Determine remedy actions for each

4. Fault accommodation design
   • Determine control requirements and inputs/outputs for each fault effect to be accommodated.
   • Determine controller configuration, including possible sensor signal estimation for each fault effect. This step may include reactions from plant shut-down to issue of an operator warning.
   • Determine how reconfiguration shall be done.

5. Reverse component FMEA
   • Determine faults that cause the end effects with high criticality level. These faults should be detected and subsequently handled. This is done by reversal of the FMEA logic from the list of fault effects to be handled.

6. System modelling
   • Model relevant parts of the system as required by the FDI methods to be employed.

7. Fault detector design
   • Determine type of detection method.
   • Determine robustness requirements.
   • Determine detection method and parameters.

8. Supervisor design and implementation
   • Determine decision logic.
   • Determine conditions for use of each detector.
   • Assure consistency and correct implementation (a subject of active research).
   • Take account of modes of operation

Figure 2.6: Design method for dependable controller with autonomous fault detection and accommodation

The problem of consistency and completeness can be partly solved using an inference engine. Reliable implementation is also a subject for research.

2.8 Supervisor Design and Implementation


Implementation of a supervisor and its detectors/effectors is required to be correct. This is a difficult task. Correctness cannot be tested like controllers in normal operation. Most parts of a supervisor will only be activated when certain faults happen. Furthermore, since these parts of the supervisor are not used in daily operation, one will not know of an implementation fault before a control loop fault has caused plant malfunction. For these reasons, automatic code generation and systematic design methods are key issues.
There are three main concerns when implementing a supervisor. First, a design methodology must ensure that there is a unique mapping from the system fault description to supervisor logic. Second, consistency of the logic should be provable and, third, automatic code generation of the state-event logic is preferred for reasons of implementation reliability.

2.8.1 Array based logic


Array based logic (Møller (1995) [40]; More (1981) [41]; Franksen (1978) [22]) is the basis for a commercial inference engine and run time system. It uses array theory to verify consistency of the logic at compile time, and has deterministic execution duration at run time. This is obtained by representing the logic in a rule base as matrices. Logic inference then consists of boolean operations on logic matrices. This software tool makes it possible to analyze logical relations and implement state event machines automatically once it has verified the describing logic for correctness. A main advantage of this tool is that its logic matches the fault propagation description.
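The statement that logic inference is boolean operations on logic matrices can be illustrated with a toy propagation matrix. This is a minimal sketch and not the commercial inference engine: the function names, the matrix layout (rows are effects, columns are faults) and the example entries are assumptions chosen for illustration.

```python
def propagate(matrix, faults):
    """Boolean matrix-vector product: effect i is present if any fault j with matrix[i][j] is present."""
    return [any(m and f for m, f in zip(row, faults)) for row in matrix]


def candidate_faults(matrix, observed_effects):
    """Inverse lookup: single faults whose effects match the observed pattern exactly."""
    n_faults = len(matrix[0])
    candidates = []
    for j in range(n_faults):
        single = [k == j for k in range(n_faults)]
        if propagate(matrix, single) == list(observed_effects):
            candidates.append(j)
    return candidates


# Hypothetical example. Columns: pipe broken, pipe clogged, bearing worn.
# Rows: flow zero, flow reduced.
M = [
    [True, False, False],
    [False, True, True],
]
effects = propagate(M, [False, True, False])
suspects = candidate_faults(M, [False, True])
```

Because the run-time work is a fixed number of boolean operations over a matrix of known size, the execution time is bounded, which is the property the text attributes to the array based approach.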

2.8.2 Petri net implementation


If the fault effect propagation had been described in terms of Petri nets or colored Petri nets (Jensen, 1994) [30], automatic analysis and code generation could also have been possible. However, formal verification of consistency would not be immediately available.

2.8.3 Reflective programming implementation


When it comes to correct implementation and easy maintenance, computer science may have another tempting method for supervisor and detector/effector implementation. Reflective programming (Maes, 1987 [39]; Lunau, 1995, 1997 [36, 37]; Lunau and Nielsen, 1995 [38]) is a technique within the object oriented programming field where one can obtain transparent implementation of the various detectors and effectors. Reflective mechanisms enable objects - with their inherent methods - to be executed in an order which depends on external conditions. The parts of software used can thus change dynamically. This is exactly what is needed when reconfiguration shall take place. New controller code shall replace what is obsolete in the new condition and detectors shall adapt to the changed conditions. The objects which implement the detector and effector functions are referred to as meta-objects.
The reflective paradigm is particularly useful because each detector can be implemented such that it needs to know only the conditions under which it is active. This makes implementation much more versatile and maintainable than traditional case statement implementations, and re-use of detector code becomes a genuine possibility. The ways to ensure consistency and correctness of a reflective implementation are, however, still an area of active research.

2.8.4 A prototype implementation

Implementation of the supervisor has been made using a BEOLOGIC® generated state-event machine on a small scale prototype.
The methodology for dependable design was implemented as a prototype tool using off the shelf software to the extent possible. FMEA schemes for components were entered in a spreadsheet, and a dedicated compiler translated them to a language understood by the BEOLOGIC® inference engine. The logic of this commercial tool is rather more advanced than the basic matrix formulation presented here, and array logic is employed to solve the interconnection and analysis problems (Møller, 1995 [40]; Franksen, 1978 [22]; More, 1981 [41]).
The tool was able to generate the necessary tables for fault handling for the cooling system as desired. A second benefit was easy access to the inverse tables, which show all possible component faults once a certain combination of fault effects is observed. This list could be useful in its own right for fault diagnosis purposes and for advice about the severity of an observed condition.
The particular tool offers translation of the logic and has, thus, a fixed maximum calculation time at run-time.
One difficulty encountered was the analysis of closed loop systems. Equation (2.2) cannot be solved directly for a closed loop configuration, and it is not meaningful to consider loop gain when faults are described only qualitatively, so the inference engine had difficulties. The work-around was to incorporate an additional state with each FMEA block stating whether a logic search had already been through this part of the diagram. The result was easy determination of closed loop paths, which could be used to identify potential points where fault handling could be activated to stop further propagation (Blanke et al., 1995) [7].
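The work-around of marking blocks that a search has already passed through corresponds to a depth-first search that records the current path. The sketch below is an illustration under assumptions: the component names and the dictionary of links are hypothetical, and the path list plays the role of the additional "already visited" state.

```python
def find_loops(links, start):
    """Depth-first search over component links; returns the closed paths found from start."""
    loops = []

    def visit(node, path):
        for nxt in links.get(node, []):
            if nxt in path:
                # Block already visited on this path: a closed loop has been found.
                loops.append(path[path.index(nxt):] + [nxt])
            else:
                visit(nxt, path + [nxt])

    visit(start, [start])
    return loops


# Hypothetical link structure for a valve position loop.
links = {
    "controller": ["motor"],
    "motor": ["valve"],
    "valve": ["potentiometer", "pipe"],
    "potentiometer": ["controller"],
    "pipe": [],
}
loops = find_loops(links, "controller")
```

Each component on a returned loop is a candidate point where fault handling could break further propagation.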
The false detection problem is not solved in this way, however. False detection and noise on FDI residuals may cause considerable diagnosis uncertainty. This problem needs to be solved using, e.g., the usual stochastic detection methods or fuzzy detection techniques, see Frank (1995) [21].
Automatic handling of bond graph interconnection and translation to state space models has not been pursued. The reason was that other groups have reported such results, see de Vries (1994) [13].
The prototype tool is certainly far from a full scale implementation, but the experience has shown that the concept as such seems to be worth pursuing at a larger scale.



Figure 2.7: Temperature control loop with 3-way valve. Diagram labels: computer with control algorithm, D/A (AO) and A/D (AI) interfaces, plant with TS, filter unit, flow from heat exchange subsystem, flow to main engine.

2.9 Example: Temperature Control


A three way valve controls the mixing of hot oil from an engine and tempered oil from a heat exchanger. The valve is controlled by the temperature control loop, which consists of:

- actuator with AC motor
- temperature sensor
- controller with process interface
- filter system

The control loop is shown in figure 2.3.


The temperature control loop is a cascade control with position control of the valve as the inner loop. Stability of the total loop is not guaranteed if the inner loop becomes open due to a component fault.
Three way valve actuator. The valve is driven by an AC motor which is activated by a double acting relay to either side of rotation. End stop switches are intended to prevent motor overload, and to prevent the motor from being forced into the mechanical stop, should the position control loop fail in some way. The potentiometer gives position feedback. The position loop fails if either potentiometer or end stop switches fail.


Reconfiguration of the valve control, should this be required, can be done fairly simply.
1. Use an estimate of the valve position in the motor controller instead of a faulty position signal.
2. Override limit switch information if both position sensor feedback and an estimated position show that a limit switch fault has occurred.
An observer for this purpose is quite elementary. The estimated valve position is increased or decreased in proportion to the time either of the two motor relays is activated. This requires no additional hardware, only a few lines of observer code. Accommodation of any of these sensor faults will make it possible to continue operation while giving an alert about required maintenance. Without accommodation, the temperature control loop would probably fail due to the loop becoming unstable without the internal position feedback.
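The elementary observer described above can be sketched as a single update function. The function name, the normalised position range from 0 (closed) to 1 (open), and the `travel_time` parameter for a full stroke are assumptions introduced for illustration; a real actuator would need its measured stroke time.

```python
def estimate_position(position, open_relay_on, close_relay_on, dt, travel_time):
    """Update an estimated valve position in [0, 1] from the relay states over dt seconds.

    travel_time is the assumed time in seconds for a full stroke from closed to open.
    """
    rate = 1.0 / travel_time
    if open_relay_on and not close_relay_on:
        position += rate * dt
    elif close_relay_on and not open_relay_on:
        position -= rate * dt
    # Clamp to the mechanical end stops.
    return min(1.0, max(0.0, position))
```

Called once per sample with the commanded relay states, the estimate can replace a faulty potentiometer signal in the position loop, and comparing it with the potentiometer reading gives the second opinion needed to override a faulty limit switch.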
Figure ?? showed the FMEA scheme in graphic form for the valve control part of the loop. The components are: potentiometer, limit switches, motor, 3 way valve, and digital controller.
Faults in a limit switch will prevent motion in clockwise or counter clockwise direction - opening or closing of the valve. The consequence is a severe offset of the temperature control if fault handling is not initiated. A breakdown of the position feedback element will cause a breakdown of the temperature control loop because the motor will be driven rapidly to fully open or fully closed position.
Because several faults can cause the same effect, it is necessary to isolate the failure source. When the source is isolated it is possible to decide the reaction:

Actuator fault (fault in the valve limit switch): the motor must be stopped immediately.
Position sensor fault: the controller should be re-configured. From the analytical relation between duration of relay pulses and motor shaft position, a position estimate is readily available. The estimate is used until the fault is repaired.
Temperature sensor fault: the reference to the position controller fails. The controller is re-configured, a time-history roll back is made of the reference signal, and the mean is used as new reference until the fault has been repaired.
These examples show situations where temperature would deviate significantly or the control would simply fail with the existing controller design. Fault handling, by contrast, could assure plant availability with simple means.
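The reaction to a temperature sensor fault, rolling back over the reference history and using its mean, can be sketched with a bounded buffer. The class name, the buffer length and the choice of an unweighted mean are assumptions for illustration only.

```python
from collections import deque


class ReferenceHistory:
    """Keeps recent position references so a mean can replace them after a sensor fault."""

    def __init__(self, length):
        self.samples = deque(maxlen=length)

    def record(self, reference):
        """Store a reference value produced while the temperature sensor was healthy."""
        self.samples.append(reference)

    def fallback_reference(self):
        """Mean of the stored fault-free references."""
        if not self.samples:
            raise ValueError("no fault-free history available")
        return sum(self.samples) / len(self.samples)
```

Recording must stop as soon as the fault is detected, otherwise faulty references would contaminate the mean.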


2.10 Summary
This chapter has given an overview of ideas for systematic design to obtain fault tolerant control. It showed how a matrix formulation of an FMEA method could be adapted to fit into the fault detection and isolation problem. State space descriptions of system dynamics and fault propagation could be obtained from generic bond graph models of components. It was shown how the component models could be simplified into generic types for use in the design, and how the generic types were used in the model building stage. It was further shown that the FMEA method and the generic component types enable isolation of failure modes with different degrees of criticality and determination of control system reactions to faults.
A systematic design method was further presented which led to a three level architecture for a supervisor based fault tolerant controller. Various implementation problems were discussed.
The main contribution was the suggestion of a new method for systematic capture of requirements for fault detection and accommodation, and a systematic way of specifying FDIA properties related to component failure modes. A salient feature was shown to be the completeness properties obtained with this method if combined with array theory based implementation of the supervisor logic.

Chapter 3

Control System Interface with Physical Plant
Reliability is a key issue when designing interfaces between a control system and the physical plant. High availability is mandatory, and increased complexity of automation has increased the dependency on sensors, remotely controlled actuators, and their interfaces with computerized control systems.
Computer controlled parts of a machinery system or other process are vulnerable to sensor faults, because a control loop may amplify the effects of a fault, and safety systems depend on sensor information to make immediate slow-down or shut-down of essential equipment. Even simple faults can therefore cause failure or shut-down of an entire subsystem in the plant. The ability to detect faults in sensors and actuators, and thus to avoid undesired consequences of simple faults, depends to a large extent on the interface between machinery systems and computers.
Requirements to sensors, actuators, and their interface are discussed in this chapter. A fundamental prerequisite is shown to be that sensor and actuator interfaces must be designed such that fault detection is possible. This problem is shown to address both the hardware interface and software methods that, together, make it possible to detect and isolate faults.
Based on the analysis presented, requirements to the interface between computers and a physical plant are suggested for a selected set of sensors and actuators. The requirements address the combination of hardware and software needed to detect, isolate, and react to generic types of faults.


3.1 Component Failure Modes


This section gives an assessment of fault types for selected, commonly used components in machinery systems. Each sub-section gives an overview of different component principles, and selected components are described in some detail from information extracted from supplier data sheets. Fault models are developed in a manner that enables later fault analysis on a marine lubrication oil ancillary system, which was selected as an example for demonstration of the analysis methods.

3.1.1 Sensor and Actuator Types


An overview of sensor and actuator types available for different supervision and control purposes in process control is provided in this sub-section. A more detailed description is given for the most commonly used types. The selection of these types is made such that the most relevant components in fluid processes are included.
Sensors:

1. Level measurement
   • Level transducer based on pressure/strain gauge measurement. (Analog signal)
   • Level switch based on a float. (Binary signal)

2. Temperature measurement
   • Temperature transducer based on resistance measurement.

3. Pressure measurement
   • Pressure transducer based on strain gauge measurement. (Analog signal)
   • Pressure switch. (Binary signal)

4. Flow measurement
   • Flow meter based on rotor revolutions.

5. Position measurement
   • Potentiometer.

Actuators:

1. Actuators for flow control
   • Three-way-valve with motor control.

2. Pumps
   • Standby pump set with remote control.


3.1.2 Level Measurement
Measurement of level is needed for assessment of the volume of the contents of tanks containing liquids. The following methods are commercially used:
1. Sounding of height of surface relative to the top of a tank:
   • electro-magnetic impulse travel time (microwave radar principle)
   • ultrasound impulse travel time or standing wave reflection
2. Indirect assessment through measuring the pressure near the tank bottom:
   • measurement of pressure at the tank's bottom. Level calculation is dependent on fluid specific weight and temperature.
   • measurement of air pressure in a tube that extends to the tank bottom where an opening in the tube allows air bubbles to be released.
3. Measurement of tank level by a float with an angle transmitter or switch that indicates low or high level.
Measurements made according to principle a) above are most accurate but also expensive. Calculation of the volume of tank contents, from the level measurement, requires a calculation, or table lookup, to take account of tank geometry. On a ship, compensation of trim and heel angles may also be needed. Calculation of mass contained in a tank, from the calculated volume, requires measurement of the temperature of the tank contents.
Measurement according to principle b) above will require the same calculations. In addition, however, conversion from pressure, p, to level, L, depends on the specific weight, ρ(T), of the liquid in a tank. The latter is a function of the temperature, T:

\[
\begin{aligned}
L &= L_o + \frac{\Delta p}{\rho(T)} \\
\Delta p &= p - p_{atm}
\end{aligned}
\tag{3.1}
\]

where Lo is the transducer offset from the tank bottom
p is tank pressure
patm is atmospheric air pressure
ρ(T) is specific weight (temperature dependent)
Principle c) is inherently non-linear. It is well suited where a binary indication is needed to show whether the tank contents are above or below a certain level.
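The pressure to level conversion of equation (3.1) can be sketched as follows. The linear temperature model for the specific weight and all numerical values (reference specific weight, temperature coefficient) are placeholder assumptions, not data from the source; a real application would use the properties of the actual liquid.

```python
def specific_weight(temp_c, gamma_ref=9000.0, temp_ref=20.0, coeff=7.0e-4):
    """Assumed linear model of specific weight [N/m^3] versus temperature [deg C]."""
    return gamma_ref * (1.0 - coeff * (temp_c - temp_ref))


def level_from_pressure(p, p_atm, temp_c, l_offset):
    """Equation (3.1): L = Lo + (p - p_atm) / specific_weight(T), in metres."""
    delta_p = p - p_atm
    return l_offset + delta_p / specific_weight(temp_c)
```

For the same measured pressure, a warmer and therefore lighter liquid gives a higher computed level, which is why the tank temperature has to be measured as well.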

Level Transducer
The level measurement system considered here is of type b. It consists of a strain gauge pressure transducer mounted in the tank and a signal amplifier/transmitter. The transducer and the transmitter are interconnected by means of a vented cable. The vent tube in the cable provides the transducer with the reference pressure patm.
The pressure transducer consists of a sensing diaphragm and a resistive strain gauge bridge. The bridge converts the diaphragm deformation, due to pressure difference, to a voltage. In the transmitter the bridge output is converted to a 4-20 mA current signal. The bridge is supplied from the transmitter board, which also takes the necessary power from the 4-20 mA line. Figure 3.1 shows the principle of the measurement system with electrical wiring.
A range selector combined with span and zero adjustment potentiometers is available for calibration of the system.
Input to the level transducer is the strain gauge bridge signal, which is caused by sensing diaphragm deformation due to a pressure difference over the diaphragm. The output is a 4-20 mA current signal. The tank level is almost proportional to the deformation. If the level, current and pressure spans are Lspan, ispan and pspan respectively, then the linear, relative relationship between input and output (level relative to electric current) is:

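\[
\begin{aligned}
\frac{\hat{L}}{L_{span}} &= \frac{\Delta p}{P_{span}} \, \frac{\rho(T_o)}{\rho(T)} + \frac{L_o}{L_{span}} \\
\frac{\hat{L}}{L_{span}} &= \frac{i - i_o}{i_{span}} + \frac{L_o}{L_{span}}
\end{aligned}
\tag{3.2}
\]

where \Delta p = p - p_{atm}.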
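The second line of equation (3.2) gives the level estimate directly from the loop current. The sketch below also adds a plausibility check on the current, since a 4-20 mA signal that falls well outside its live range indicates a wiring or transmitter fault. The limits of 3.6 mA and 21.0 mA are assumed values for illustration, and the function names are hypothetical.

```python
I_ZERO_MA = 4.0    # current at the lower end of the range, i_o
I_SPAN_MA = 16.0   # current span, i_span


def level_from_current(i_ma, l_span, l_offset):
    """Equation (3.2) solved for the level estimate: L = Lspan*(i - io)/ispan + Lo."""
    return l_span * (i_ma - I_ZERO_MA) / I_SPAN_MA + l_offset


def current_in_range(i_ma, low_ma=3.6, high_ma=21.0):
    """True if the loop current is plausible; the limits are assumed values."""
    return low_ma <= i_ma <= high_ma
```

A broken wire gives a current near 0 mA, which `current_in_range` rejects before the value is converted into a misleading level.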

Level Switch
Most level switches use principle c) above. They are used to indicate full or empty tank, and to provide a binary indication of high/low level. One level switch has one of these functions. Commercial level switches have double contacts to enable detection of contact faults. One set is off when the other is on. Figure 3.1 shows the electrical switch arrangement.

Figure 3.1: Switch arrangement for level switch. The A-A contact set is used for low level indication and the B-B set for high level indication, as contacts are always arranged as closed in normal condition.
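The benefit of double contacts, where one set is off when the other is on, can be shown with a small consistency check. The function name and the convention that the returned state follows contact set A are assumptions for illustration.

```python
def read_level_switch(contact_a_closed, contact_b_closed):
    """Return (state, fault) for a switch with two complementary contact sets.

    state is True when set A is closed, and None when the contacts disagree.
    """
    if contact_a_closed == contact_b_closed:
        # Both open or both closed cannot occur in a healthy switch.
        return None, True
    return contact_a_closed, False
```

A single contact cannot distinguish "switch open" from "wire broken"; with two complementary sets, a broken wire or a welded contact shows up as an inconsistent pair.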

3.1.3 Temperature Measurement


Measurement of temperature is needed for various monitoring and control purposes. The type of sensor used depends primarily on the temperature level and range needed.
1. Temperature range -50 to 300 deg C. PT100 sensors are most feasible in this range. PT100 sensors comprise a platinum resistance element encapsulated in glass. The measurement principle is based on the change of resistance with temperature.
2. Temperature range 300 to 600 deg C. Thermocouple sensors are most feasible in this range. Ni/Cr sensors are the most commonly used. The measurement principle is that the voltage over a junction between two metals will change with temperature. A thermocouple measurement requires compensation by a reverse junction at a known temperature. There are also special requirements to wire materials. This makes wiring for thermocouples more expensive.
Industrial temperature sensors are normally made as a metallic enclosure for the sensor element, for reasons of mechanical robustness. A housing for wire termination is mounted on the enclosure.
PT100 sensor applications include tank temperature and temperature of liquids like cooling water, lubrication oil etc. Thermocouple application is mainly exhaust gas temperature and similar high temperature measurements.

Figure 3.2: 3-wire resistance measurement of resistance in Pt element to measure temperature with compensation of wire resistance. Diagram labels: sensor (RPT100), wiring (Rw1, Rw2, Rw3), ISC with reference current iref and MUX.

PT100 Temperature Transducer
The temperature transducer considered here is of type a). The PT100 temperature transducer consists of a platinum measurement element with a nominal resistance of 100 ohm at 0 deg C. The resistance of the element changes almost linearly with temperature. The nonlinearity (Eq. 3.3) needs, nevertheless, to be considered when an accuracy better than 0.5 deg C is needed or the range exceeds 0-100 deg C.
Resistance measurement is, in most cases, measurement of the voltage resulting from passing a known current through a resistance element. To avoid errors from resistance in connecting cables, a 3 or 4 wire measurement is needed.
A 3-wire measurement is illustrated in figure ??. The 3-wire coupling eliminates wire resistance if Rw1 and Rw2 are equal, because the measurement wire Rw3 has no current load.
The resistance of the measurement element has a nearly linear dependency on the temperature. The general relationship between the resistance and temperature of the element is:
\[
R_{PT100} = R_0 \left( 1 + \alpha T + \beta T^2 + \gamma T^3 + \ldots \right)
\tag{3.3}
\]

where R0 is the resistance at 0 deg C
α, β and γ are temperature coefficients
T is the temperature in deg C of the element
The magnitude of the higher order terms in the above equation is small, and the relation between resistance and temperature can be considered linear in the 0-100 deg C range within an accuracy of 0.1 deg C:

\[
R_{PT100} = R_0 \left( 1 + \alpha T \right)
\tag{3.4}
\]

and for the temperature estimate, given the voltage over the element and the current iref:

\[
\hat{T} = \frac{1}{\alpha R_0} \left( \frac{V_{PT100}}{i_{ref}} - R_0 \right)
\tag{3.5}
\]
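The linear relation (3.4) and its inverse can be checked numerically. The sketch assumes the commonly used Pt100 coefficient of 0.00385 per deg C, which the source does not state, and a hypothetical reference current of 1 mA in the example values.

```python
R0_OHM = 100.0     # nominal resistance at 0 deg C
ALPHA = 3.85e-3    # assumed linear coefficient per deg C (typical for Pt100)


def pt100_resistance(temp_c):
    """Linear model (3.4): R = R0 * (1 + alpha * T)."""
    return R0_OHM * (1.0 + ALPHA * temp_c)


def pt100_temperature(v_pt100, i_ref):
    """Inverse of (3.4) with R = V / i_ref: T = (V / i_ref - R0) / (alpha * R0)."""
    return (v_pt100 / i_ref - R0_OHM) / (ALPHA * R0_OHM)
```

At 100 deg C the model gives 138.5 ohm, so a 1 mA reference current produces 138.5 mV over the element; the small voltage change per degree is what makes uncompensated lead resistance significant.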

3.1.4 Pressure Measurement


Pressure measurement is needed in a large number of machinery subsystems for reasons of monitoring and control. Pressure measurement is needed in various applications where the pressure itself is the key parameter. It is also used for indirect assessment of flow through units like heat exchangers, filters, and associated piping.
Pressure measurement can be:
1. absolute with analog measurement of pressure
2. differential - i.e., the pressure difference between two measurement points, e.g., the difference between inlet and outlet of a pump or a filter. This type is also analog.
3. absolute measurement with binary indication of pressure above or below a set value
4. differential measurement with binary indication of pressure above or below a set value
The analog transducers will often use a strain gauge measurement. The binary types will most often use a spring load to determine the switch-over point.
Pressure measurement accuracy is in many cases limited by hysteresis properties of the material of the sensing element.
Examples of application of absolute measurement are steam pressure in a steam generator, starting air for a diesel engine, or hydraulic pressure in steering gear supply pumps.
Safety shut-down of a subsystem is often done based on binary pressure indication. A low lubrication oil pressure will, as an example, cause an immediate diesel engine shut down.

Pressure Transducer
The pressure transducer considered here is of type b, and based on strain gauge measurements. The transducer and transmitter are collected as one unit. Power to the box is taken from the 4-20 mA interface. Figure 3.3 shows the pressure measurement principle and electrical connections.

Figure 3.3: Pressure measurement using a strain gauge bridge fitted to a membrane, converted to a 4-20 mA current out of the transducer. Diagram labels: transducer with four strain gauges (SG), bridge supply (Vsup+, Vsup-), bridge output (Vout+, Vout-), wiring, ISC with 4-20 mA supply.
The relationship between the pressure and the output current is:

\[
\begin{aligned}
i &= \frac{i_{span}}{P_{span}} \, \Delta p + i_0 \\
\Delta p &= p - p_{atm}
\end{aligned}
\tag{3.6}
\]

Pressure Switch
The pressure switch considered here is of type c. It is a pressure controlled switch, where the position of the switch depends on the adjusted set-point value and the pressure in the connection.
Figure ?? illustrates the operation of the switch. The contacts 1-4 close while 1-0 break as the pressure rises above a set-point value. The contacts return to initial position when the pressure falls to the set-point value minus the mechanical hysteresis.
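The switching behaviour with mechanical hysteresis can be sketched as a function of the present pressure and the previous contact state. The function and parameter names are hypothetical, and the returned value represents contacts 1-4 only.

```python
def pressure_switch(pressure, set_point, hysteresis, contact_closed):
    """Return the new state of contacts 1-4 given the previous state."""
    if pressure > set_point:
        return True               # contacts 1-4 close above the set-point
    if pressure <= set_point - hysteresis:
        return False              # contacts return once pressure has fallen by the hysteresis
    return contact_closed         # inside the hysteresis band: keep the previous state
```

Inside the band between the set-point and the set-point minus the hysteresis, the output depends on the pressure history, so a single binary reading does not determine the pressure uniquely.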

3.2 Angular Position Measurement


Angular position measurement is needed in remotely controlled valves and in a large number of other devices. The measurement principles for angle are numerous, and will not be discussed in this context. One very popular component is the electrical potentiometer. It is available in both rotating and linear versions.



Figure 3.4: Binary pressure indicator. The solid line is mechanical difference and the bottom of it is the adjustable set-point value. Diagram labels: P, contacts 4 and 2, mechanical hysteresis, setpoint.
Principles for very accurate applications include: optical encoder principles, magnetic induction (the inductosyn) and synchro transmitter measurements. Optical encoders and inductosyns are made in both rotating and linear versions. Linear position measurement is also available from differential transformer based sensors.
Magnetostrictive materials have been used for robust components since about 1980. These elements are very robust but nonlinear, in the 2-5% order of magnitude.

3.2.1 Potentiometer
A potentiometer changes the position of contact between a resistance element and a wiper when the turning angle is changed. The potentiometer can be considered a voltage divider with a division ratio that is a function of the turning angle. Linear potentiometers have a very accurate linear relation between turning angle and division ratio. Figure ?? shows the typical connection diagram. Fault detection ability is discussed in a subsequent section.
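For a linear potentiometer, the voltage divider relation translates directly into an angle. The sketch below assumes the wiper voltage and supply voltage are both measured, and that the division ratio runs from 0 to 1 over `full_angle_deg`; these names and the range check are illustrative assumptions.

```python
def angle_from_voltage(v_wiper, v_supply, full_angle_deg):
    """Linear potentiometer: division ratio v_wiper / v_supply is proportional to angle."""
    ratio = v_wiper / v_supply
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("wiper voltage outside the supply range")
    return ratio * full_angle_deg
```

Using the ratio rather than the wiper voltage alone makes the result insensitive to slow variations in the supply voltage.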

3.2.2 Flow Measurement


Measurement of flow is used for load/discharge of liquid cargo and supply. It is also used for consumption assessment, e.g., of diesel, fuel, and lubrication oils. Available measurement principles include:
1. Rotation of propeller or similar rotor by passage of fluid. The rotor revolutions count is proportional to the volume of fluid that has passed the sensor. The rotating device will have contact with the liquid while the rotation sensor(s) can be made to work without being in direct contact with the liquid. Accuracy is usually satisfactory for consumption measurement because the basic principle is a volume measurement.
2. Shift in resonance frequency when a velocity difference exists in two paths for the fluid, and this path is exposed to a magnetic field. Sensor components are not in contact with the liquid, and obtainable accuracy is very high (0.1%). Electronic circuitry is complex, and therefore expensive.
3. Doppler shift of ultrasound signal across a section of a tube carrying the liquid. Transducers using this principle have no sensitive components in contact with the liquid. Accuracy is usually considered high.
4. Pressure drop over a cavity of the pipe carrying the liquid (Pitot tube). Flow and pressure are mutually dependent according to Bernoulli's law. Measurement accuracy is normally considered insufficient for consumption assessment, because the principle is basically flow based. The pressure sensor will normally be in contact with the liquid.

Figure 3.5: Electrical diagram of potentiometer and computer interface to enable fault detection at the single sensor level. Diagram labels: potentiometer, wiring, iF, ISC.

Flowmeter
The flow measurement principle considered here is of type 1. The flow-meter consists of a housed four-bladed rotor which is placed in the fluid stream. The axis of rotation of the rotor is parallel to the direction of the flow. The incoming fluid forces the blades to rotate at an angular velocity approximately proportional to the flow rate. A magnetic coupling transmits the rotor rotation to an inductive pulse transmitter. A pulse discriminator may be included as an option.
A pulse discriminator prevents measurement faults due to pipeline vibrations, pressure fluctuations, or non-steady flow. The problem is that an error will occur if a back and forth fluctuation of the rotor creates multiple forward pulses. By using two pulse transmitters, which generate two signals with a phase shift of 90°, these measurement errors can be eliminated.
The electrical output from the flow-meter is a binary signal. The supply is a voltage of 24 V DC. The load current changes value according to the state of the binary signal.
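The effect of the two phase-shifted pulse signals can be illustrated with a small sketch. It is not taken from the flow-meter's documentation: the function names, the sampled 0/1 representation of the two signals, and the assumed forward order of the signal states are all illustrative assumptions. A single-channel edge counter is compared with a two-channel (quadrature) counter that subtracts backward steps.

```python
def count_rising_edges(samples):
    """Single pick-up: every 0 -> 1 transition counts as a forward pulse."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev == 0 and cur == 1)


def count_net_steps(a_samples, b_samples):
    """Two pick-ups with 90 degree phase shift: count net forward quarter periods.

    Assumption: forward rotation visits the (A, B) states in the order
    00 -> 10 -> 11 -> 01 -> 00.
    """
    order = {(0, 0): 0, (1, 0): 1, (1, 1): 2, (0, 1): 3}
    net = 0
    prev = order[(a_samples[0], b_samples[0])]
    for a, b in zip(a_samples[1:], b_samples[1:]):
        cur = order[(a, b)]
        step = (cur - prev) % 4
        if step == 1:
            net += 1  # one quarter period forward
        elif step == 3:
            net -= 1  # one quarter period backward
        # step == 0: no change; step == 2: missed sample, ignored in this sketch
        prev = cur
    return net


# Rotor vibrating back and forth around one edge of pick-up A.
jitter_a = [0, 1, 0, 1, 0, 1]
jitter_b = [0, 0, 0, 0, 0, 0]
single = count_rising_edges(jitter_a)
net = count_net_steps(jitter_a, jitter_b)
```

With the jitter sequence above, the single pick-up reports three forward pulses, while the two-channel count nets out to a single quarter step.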

3.3 Actuators for Flow Control


Flow control is the most widespread actuator function in machinery systems. It is used where shut-off of a pipe connection is needed, where the flow of a medium needs to be controlled, and where control loops manipulate the flow of a liquid to control temperature. Flow control can be open/closed, variable throughput, or redirection of flow from one pipe into two (three-way valves). Only remote controlled valves are relevant here.
1. Valve for open-close of a pipe connection. Actuation can be:
   - hydraulically activated: open when pressure is applied and closed with no pressure, or the reverse.
   - electrically activated.
   Indication can be analog or binary for these valves. Binary indication can be limited to "closed", and "not closed" should then be interpreted as open. However, most valves have indication for both open and closed positions.
2. Two-way valve. Sends flow in one direction in a hydraulic circuit, stops the flow in neutral position, or reverses the flow when activated in the opposite direction. Two-way valves are used in hydraulic control circuits.
   - activation is electric, with a proportional or bang-bang (solenoid) type of control valve.
3. Valve for variable bypass. A rotor is turned mechanically to change the opening area of the valve.
   - hydraulic activation: the volume of hydraulic oil in a rotary vane or in cylinders determines the mechanical position of the rotor. Oil flow is controlled by a smaller, electrically activated valve.
   - electro-mechanical activation: an electric motor with gear turns the rotor shaft of the valve.
   A position sensor serves as feedback element. Switches are, in many cases, mounted to provide end position indication for both fully open and fully closed.
4. Three-way valve for redirection of flow. It is used in temperature control systems, for example, to pass a sufficient fraction of flow through a cooler to obtain a desired temperature. Opening and closing of the valve will change the ratio between flow through the cooler and that bypassing it. Actuation can be as in 3. Electrical activation is the most common.
In this context, we need to describe an electro-mechanical three-way valve, type 4, in more detail.

3.3.1 Three-way Valve


The actuator system consists of a three-way valve. It has a common port and two other ports, referred to as A and B. The rotor position determines the opening area between the common port and ports A and B. The valve distributes the flow between ports A and B. The distribution is controlled by the rotor angle. An electro-mechanical device is attached to the valve to control the rotor position.
Two modes of service exist for the valve: diverging and converging. In diverging service, flow enters the common port and flows out of one or both of the two other ports (A and B). In converging service, flow enters from port A and/or B and leaves through the common port. The position of the rotor determines whether one or both of the ports A and B are in use.
The relation between rotor position and flow through the ports A and B is nonlinear in converging mode and linear in diverging mode. Figure ?? shows the valve characteristic for ports A and B.
The electro-mechanical positioner consists of a motor and a gear. The rotor position is changed by running the motor in the clockwise (CW) or counter-clockwise (CCW) direction. The motor can be in one of the following states: stopped, rotation CW, or rotation CCW.
The state is controlled by activation of two relay contacts. They are denoted Open (o) and Close (c), respectively. A potentiometer is used to measure the actual rotor position. Figure ?? shows the principle of the actuator operation and the electrical connections.
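As a minimal sketch, the mapping from the two relay contacts to the three motor states could look as follows. Two points are assumptions not stated in the text: that "open" corresponds to clockwise rotation, and that closing both contacts at once is treated as "stopped". All names are hypothetical.

```python
from enum import Enum


class MotorState(Enum):
    STOPPED = "stopped"
    CW = "rotation CW"
    CCW = "rotation CCW"


def motor_state(open_contact, close_contact, open_dir=MotorState.CW):
    """Map the relay contacts Open (o) and Close (c) to a motor state."""
    if open_contact == close_contact:
        # Neither contact closed, or both closed (treated as stop: assumption).
        return MotorState.STOPPED
    close_dir = MotorState.CCW if open_dir is MotorState.CW else MotorState.CW
    return open_dir if open_contact else close_dir
```

The valve moves only while exactly one of the contacts is closed, which is why the potentiometer feedback is needed to know the rotor position actually reached.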

Figure 3.6: Valve characteristics for diverging and converging operation (the use of A or B ports for inflow). Axes: percent of full travel against percent of total flow through ports A and B.

Figure 3.7: Operation of 3-way valve actuator with relay operated induction motor. Abbreviations are: o: open, c: close, s: stop, HTR: heater, LS: limit switch, TS: torque switch.

Limit switches on the rotor provide indication of rotor end positions to permit adequate controller design and fault detection. Torque switches are mounted to provide overload protection.

3.3.2 Pumps
Pumps are used to drive a liquid through a machinery system or to move a liquid from one tank to another. Electrically driven pumps are normally used. Critical functions, like lubrication oil supply and cooling water for the main engine, are handled by redundant pumps that start automatically if supply pressure drops below a certain level. It is essential that the standby pump control has no possible single point failures.

Standby Pump Set with Remote Control


The pumps considered are electrically driven. The pumps run at constant speed and can be in one of two states: stopped or running. Control can be local or remote. The pump provides a pressure and a flow which are related through the pump characteristic, which depends somewhat on the degree of wear of the pump's rotor. The operating point on the characteristic depends solely on the flow resistance of connected equipment.
A pump is started by closing a start contact at signal current level. The contact activates a motor starter which turns the power on to the pump. Startup can be gradual, with two or more steps, because a pump is a fairly large consumer on a ship's power system. A two-step starter has a set of resistors which are in series with the pump for a number of seconds after startup. The resistors are bypassed after the startup period has elapsed. The switching is done by contactors (a three-phase relay capable of handling the large currents is needed).

Figure 3.8: Standby pump set with remote control. The control computers are independent and have mutual supervision. (Diagram labels: Computer 1, Computer 2, PS 1, PS 2, Pump 1, Pump 2, M1, M2, non-return valves, motor starters 1 and 2 with start, stop and rem./loc./blok inputs.)

A standby pump set consists of two pumps with individual starters and a pressure measurement in the outlet from each pump. If the measured pressure on one pump is lower than a predetermined value, the other is automatically started up. Figure ?? illustrates a standby pump control system.
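The automatic start rule can be written down as a small sketch. The function name, the argument names and the threshold `p_min` are hypothetical; the sketch only covers the pressure-based start request and ignores local/remote selection, blocking and the stepwise starter sequence.

```python
def standby_start_requests(pressure_1, pressure_2, running_1, running_2, p_min):
    """Return (start_1, start_2) for a twin standby pump set.

    A pump is requested to start when the other pump is running but its
    outlet pressure is below the predetermined value p_min (assumed name).
    """
    start_1 = running_2 and pressure_2 < p_min and not running_1
    start_2 = running_1 and pressure_1 < p_min and not running_2
    return start_1, start_2


# Pump 1 is running but delivers too little pressure: pump 2 is requested.
requests = standby_start_requests(pressure_1=1.0, pressure_2=0.0,
                                  running_1=True, running_2=False, p_min=2.0)
```

Because the request depends on a single pressure measurement per pump, a fault in that measurement is itself a potential single point failure, which is the motivation for the sensor fault analysis in this chapter.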

3.4 FMEA Schemes for Sensors and Actuators


3.4.1 Sensor Faults
Models for each sensor's potential failure modes and their effects are developed in this section. The models are presented in a manner that enables direct use in the later FMEA on the lubricating oil ancillary system.

Level Transducer
With reference to Fig. ??, the following table is developed.

Comp./effect | Too low signal | Signal not related to physics | Fluctuating signal | Too high signal
Input | - | Vent. tube clogged | - | -
Output | wire broken | - | connection fault | short circuit
Comp. | transmitter adj. fault | salt water, sensor damage | transmitter defect | transmitter adj. fault

Level Switch
With reference to Fig. ??, the following table is developed.

Comp./effect | Too low signal | Signal not related to physics | Fluctuating signal | Too high signal
Input | - | - | - | -
Output | low level signal, e.g. broken wire (low alarm config.) | - | connection fault | high level signal, e.g. broken wire (high alarm config.)
Comp. | - | - | Mech. damage | -

Temperature Transdu er
With referen e to Fig.?? the following table is developed.
Too
Signal not
Flu tuComp./
low
related
ating
e e t
signal
to physi s
signal
mounting
Input
fault
loose
short
onne Output
ir uit
tion
Comp.

eg. salt
water

sensor
element
fault

sensor
element
fault

Too
high
signal
Broken
wire
sensor
element
fault


Pressure Transducer
Referring to Fig. ??, the following FMEA table is developed for the pressure transducer.

Comp./effect | Too low signal | Too high signal | Signal not related to physics | Fluctuating signal
Input | - | - | pipe broken, ref. press. failure | -
Output | Broken wire | Short circuit | - | loose connection
Comp. | amplifier failure | amplifier failure | water filled, amplifier failure, unit damage | amplifier failure

Pressure Switch (Low level indicator)


With a binary pressure switch, the normal condition needs to be defined. Take the normal condition to be pressure above a setpoint. The output contact is closed in this condition (NC). The following table is developed.
Too low
Signal
Too high
Flu tuComp./
signal
not
signal
ating
e e t
(Closed
related to
(Closed
signal
onta t)
physi s
onta t)
blo ked
setpoint
setpoint
Input
vibration
or leak
error
error
in pipe
loose
short on
onne onta t
Output
wire broken
tion
wires
waterin reain rea lled,
sed hystesed hysteComp.
unit
resis
resis,
damage

Potentiometer
With referen e to Fig.?? the following table is developed.



Comp./
e e t

Too low
signal

Input

broken
wire
at A
short
at A-B

Output

short B-C

Comp.

Not
related
to angle

Flu tuating
signal

Loss of
supply

vibration

Broken
wire
at C
stu k,
shaft or,
element
broken

loose
onne tion

Too high
signal
broken
wire
at A,
short ir uit
A-C

Wiper
fault

Flowmeter
For the flow-meter, a fluctuating signal can only occur if the instrument has double pick-ups and indication of direction of rotation.
Signal
Flu tuComp./
Too low
Too high
not related
ating
e e t
signal
signal
to physi s
signal
power
Input
loss
wire
broken,
Output
short irkuit
press.
press.
u tu u turotor,
Rotor
Comp.
ation,
ation,
pi k-up
stu k
vibravibradamage
tion
tion

3.4.2 Actuator Faults


The position of the three-way valve is controlled. The controller consists of a potentiometer for position measurement, limit switches for end-stop indication, and the three-way valve. The failure modes for the limit switch and the valve are given below.

Limit Switch (normally closed)


Normally closed means that the contact is closed when no voltage is applied.

Comp./effect | End pos. reached, but not indicated | Signal not related to physics | Fluctuating signal | End pos. indicated, but not reached
Input | - | - | - | -
Output | short circuit | loose wire, connection problem | - | broken wire
Comp. | mech. damage | - | mech. damage | mech. damage

Three-way Valve
With referen e to Fig. ?? the following tables are developed.
Flow not
Flu tuFlow
Comp./
Flow
related to
ating
too high
e e t
too low
ontrol
ow
output
angle
pipe
too high
broken,
pipe
set-point
input ow
setpoint
Input
loagged,
u tuset-point
low,
pipe
leak
ating
high,
power
low
pipe
pipe
A or B
loagOutput
broken,
ged
logged,
or leak
damage,
damage,
Comp.
hysteresis
wear
wear

3.5 Requirements to Interface


This section elaborates on the interface to these components and discusses how various component faults can be detected. First, the set of components is categorized according to type of electrical interface. Second, each fault type is treated, and hardware requirements to the computer interface are derived. Third, signal characteristics are taken into account, and methods for fault detection and isolation are discussed. Finally, results of the analysis are summarized in the form of requirements to the combined hardware and software interface.

3.5.1 Component Categorization According to Electrical Interface

The components selected and discussed in chapter 6 were seen to have only a few standardized types of output. The computer interface for each component is listed in the tables below.

3.5.2 Sensors

Sensors | Output from comp. | Computer interface | Comments
Level sensor, differential pressure meas. | Strain-gauge bridge to transmitter, 4-20 mA output | Current | 24 V DC supply from computer
Pressure transducer, abs. or differential | Current, 4-20 mA | Current | 24 V DC supply from computer
Pressure switch, abs. or differential | NC contact | Digital | -
PT100 element | Resist. varies with temp. | Resistance measurem. with wire comp. | Constant current supply from interface
Angle: pot.meter meas. | a) voltage divider varies ratio with angle; b) resist. varies with angle | a) 3 terminal measurem.: supply voltage to resist. element, measure at wiper; b) resist. measurem. of wire resist. with compens. | -
Norm. closed switch | a) Contact; b) Contact, resistor in parallel | a) Digital input; b) Resistance measurement | a) No wire supervis.; b) With wire supervis.
Norm. open switch | a) Contact; b) Contact, resistor in parallel | a) Digital input; b) Resist. measurem. | a) Without wire supervision; b) With wire supervis.

Sensor outputs are seen to belong to one of the categories listed below.

Sensor electrical output | Supply from computer | Measurem. by computer interface | Comments
Resistance | DC current, usually 1-5 mA | a) 1 voltage measurem.; b) 2 voltage measurem. | a) 4 wire connection; b) 3 wire connection
Pot.meter (a), Pot.meter (b) | a) DC voltage; b) see resist. meas. | a) Wiper voltage measurem.; b) resistance meas. | -
Current | DC voltage supply, usually 24 V | Voltage meas. over reference resist. | -
Voltage | None | Voltage | External power supply
NC contact | Voltage or current supply | a) Digital input; b) Low precision resistance meas. | -
NO contact | Voltage or current supply | a) Digital input; b) Low precision resist. meas. | -

3.5.3 Single Sensor Fault Detection


The detection of a fault in any of the above elements depends on whether the fault causes a deviation from normal that can be recognized from a voltage measurement.
Basic principles for single sensor fault detection are:
1. Range check: Check whether specified upper and lower limits are exceeded.
2. Slew rate check: Check whether a specified rate of change has been exceeded.
3. RMS value check: Check whether the RMS value of the signal exceeds a specified limit.

Range Check
Range checking is a very efficient way to detect broken wires or short circuits between wires for all voltage or current based sensors. The requirement to hardware is that all such faults lead to a transition of the measured voltage into an "out of range" region.
The time interval from the moment the fault occurs until it is detected depends on how fast a limit is reached. The allowable time to detect depends on the actual use of the sensor signal. This is discussed below.
For 4-20 mA current output from sensors/transmitters, ranges come naturally. The requirements are:
1. A/D converter range is 0 to 24 mA.
2. If the current exceeds 24 mA, or is below 0 mA (reverse current), the converter must indicate 24 mA or 0 mA respectively, and not wrap around.
For voltage based measurements, range checking requires that all short circuit and open circuit conditions can be detected:
1. any wire shorts to any other wire,
2. any wire shorts to ground,
3. any wire is open.
For potentiometer measurements, detection requires:
1. The voltage supply is unipolar. A symmetrical supply around zero, as is common practice, cannot detect shorts between ground/zero level and wiper.
2. The input circuit on the voltage amplifier is driven out of range if the input wire is open.
Point 2 implies that a current (iF in figure 6 in chapter 6) is injected into the measurement circuit. When an "open" fault occurs, the input voltage will change with a rate of change that depends on circuit capacitances and the magnitude of the current injected. As the currents may have to be chosen small in order to avoid too heavy an impact on measurement nonlinearity/accuracy, the rate of change may be too small to meet the required time to detect the fault. If so, slew rate checking can be adopted.
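A range check for a 4-20 mA loop can be sketched as below. The saturation limits 0 and 24 mA come from the requirements above; the alarm limits of 3.5 mA and 20.5 mA and all function names are assumptions chosen for illustration only.

```python
def saturate_adc(current_ma, lo=0.0, hi=24.0):
    """Model of an A/D converter that clamps at its range limits instead of wrapping."""
    return min(max(current_ma, lo), hi)


def range_check(current_ma, low_limit=3.5, high_limit=20.5):
    """Classify a 4-20 mA reading; the alarm limits are assumed example values."""
    reading = saturate_adc(current_ma)
    if reading < low_limit:
        return "low"   # e.g. broken wire or loss of supply
    if reading > high_limit:
        return "high"  # e.g. short circuit
    return "ok"
```

The clamping step matters: if the converter wrapped around at its limits, a short circuit driving the current above 24 mA could reappear as an apparently valid in-range value.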

Slew rate check
Slew rate is the change of signal between consecutive samples, the "derivative" of the signal. If an open fault occurs, input circuits should be designed such that the slew rate of the signal in this condition is several times higher than possible in normal operation. A slew rate detector can then be used to considerably reduce the time to detect.
The slew rate algorithm should be implemented robustly, with adequate filtering that can be tuned with the time constant \tau_d:

x(k) = \frac{1}{\tau_d} \left( y(k) - \hat{y}(k) \right)
\hat{y}(k+1) = \hat{y}(k) + T_s x(k)                    (3.7)

where T_s is the sampling time, y is the measured signal, and x is the slew rate estimate.
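The filtered slew rate estimate of Eq. (3.7) translates almost directly into code. The sketch below is illustrative: the function name and the choice to initialise the filter state with the first sample are assumptions.

```python
def slew_rate_estimates(y, ts, tau_d):
    """Return the slew rate estimate x(k) of Eq. (3.7) for every sample in y.

    ts is the sampling time and tau_d the filter time constant.
    The filter state y_hat is initialised to the first sample (assumption).
    """
    y_hat = y[0]
    rates = []
    for yk in y:
        xk = (yk - y_hat) / tau_d   # x(k) = (y(k) - y_hat(k)) / tau_d
        rates.append(xk)
        y_hat = y_hat + ts * xk     # y_hat(k+1) = y_hat(k) + Ts * x(k)
    return rates


# A ramp with slope 2 units per second: the estimate settles at the true slope.
ts, tau_d = 0.1, 0.5
ramp = [2.0 * k * ts for k in range(300)]
ramp_rate = slew_rate_estimates(ramp, ts, tau_d)[-1]
```

A small tau_d gives a fast but noise-sensitive estimate, a large one a smooth but slow estimate; the recursion is only stable for Ts < 2 tau_d.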

RMS value check
With reference to the fault schemes, signal fluctuation is sometimes a symptom of a fault in development. In addition to signal fluctuations and wiring defects in development, electromagnetic disturbances will cause an increase in the Root Mean Square (RMS) value of a signal. Electromagnetic interference should be damped by filtering and screening of cables. An increase in RMS value may therefore be an indication of a screening or ETC defect or other faults in the interface.
An estimator for the RMS value is best made recursive. This also requires a recursive calculation of the signal's mean:

\bar{y}(0) = y(0)
\bar{y}(k+1) = \bar{y}(k) + \frac{1}{k+1} \left( y(k+1) - \bar{y}(k) \right)
\sigma^2(k+1) = \sigma^2(k) + \frac{1}{k} \left( (y(k) - \bar{y}(k))^2 - \sigma^2(k) \right)                    (3.8)

where N is a fixed number (the horizon), \bar{y} (y with a bar over) is the mean, and \sigma is the RMS value.
In order to determine whether a change has happened in the RMS value, a hypothesis test should be made. This is fairly simple but is not within the scope of this report.
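A recursive estimator in the spirit of Eq. (3.8) can be sketched as follows. How the horizon N enters the recursion is not shown in the equation, so the sketch assumes that the gain 1/k is limited to 1/N once k exceeds N; this, the class name and the use of the already updated mean in the variance update are assumptions.

```python
import math


class RecursiveRms:
    """Recursive mean and RMS (standard deviation about the mean) estimator."""

    def __init__(self, horizon):
        self.horizon = horizon  # N, assumed to cap the averaging window
        self.k = 0
        self.mean = 0.0
        self.var = 0.0

    def update(self, y):
        self.k += 1
        gain = 1.0 / min(self.k, self.horizon)
        if self.k == 1:
            self.mean = y  # initialisation: mean(0) = y(0)
        else:
            self.mean += gain * (y - self.mean)
        self.var += gain * ((y - self.mean) ** 2 - self.var)
        return math.sqrt(self.var)


est = RecursiveRms(horizon=1000)
for sample in [1.0, -1.0] * 500:
    rms = est.update(sample)
```

For the alternating test signal the estimate approaches 1, while a constant signal gives an RMS of zero. With a finite horizon the estimator keeps tracking, so a later increase in fluctuation level shows up in the estimate.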

Requirements on time to detect


The requirements on time to detect a fault differ according to how the sensor information is used in the system. Faults in sensors used for monitoring the average value of some physical quantity need not be detected very rapidly. Faults in feedback elements in control loops, or in sensors for safety systems, may need immediate detection because the fault will have immediate effect on the machinery.
Faults in a control loop can be categorized in the generic types listed in the table:

Fault type | Required time to detect | Level for detection | Level for remedy action
Reference value fault (setpoint) | Several samples | Process interface hw & sw | Controller
Feedback element fault | Down to one sample | Process interface hw & sw | Controller. Faulty info. should not be used for command calculation
Actuator fault | Down to one sample | Process interface hw & sw | Controller if possible, or safety system
Execution fault, e.g. in timing | - | Computer firmware and/or safety system | Safety system
Application SW, system or HW fault in computer controller | - | Comp. firmware and/or safety system | Safety system
Supply fault or other fatal error | - | Safety system | Safety system & fail-to-safe design

As apparent from the table, faults in feedback elements and actuators are the most demanding because there is often very little time to detect a fault before the fault effect occurs.

3.5.4 Multiple sensor fault detection


If more measurements of the same quantity are available, fault detection methods can utilize the redundant information. If three voltage measurements V_1, V_2, and V_3 are available, a fault in one sensor can be found with very low computational burden as follows:

V_{12} = V_1 - V_2;   V_{13} = V_1 - V_3;   V_{23} = V_2 - V_3

If a fault exists in measurement number one, then V_{12} and V_{13} will be different from zero, whereas V_{23} will be zero or close to zero if V_2 and V_3 are equal. Any of the non-faulty values can be used. The mean of V_2 and V_3 could be used if measurement noise is to be minimized.
Fault isolation can also be done as:

F_1 = V_{12} \cdot V_{13};   F_2 = V_{12} \cdot V_{23};   F_3 = V_{23} \cdot V_{13}

F_1 will be large in magnitude if measurement 1 is faulty and close to zero if it is valid. Detecting whether F is close to zero or not requires, strictly speaking, a detector based on the stochastic nature of the measurement noise. A simple threshold can, however, be used with appropriate results in many cases.
This multiple sensor fault detection scheme can be used even if the measurements are from different sources. A stochastic based detector will then be needed, however.
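The isolation rule with the products F_1, F_2 and F_3 can be sketched with a simple threshold. The function name, the use of absolute values, and the choice to pick the largest product (rather than testing each one separately) are assumptions made for this illustration.

```python
def isolate_faulty_sensor(v1, v2, v3, threshold):
    """Return (faulty_sensor, value_to_use) for three redundant measurements.

    faulty_sensor is 1, 2 or 3, or None if no product exceeds the threshold.
    The value to use is the mean of the measurements considered healthy.
    """
    v12, v13, v23 = v1 - v2, v1 - v3, v2 - v3
    # Each product involves the two differences that contain the sensor in question.
    f = [abs(v12 * v13), abs(v12 * v23), abs(v23 * v13)]
    worst = max(range(3), key=lambda i: f[i])
    values = [v1, v2, v3]
    if f[worst] <= threshold:
        return None, sum(values) / 3.0
    healthy = [v for i, v in enumerate(values) if i != worst]
    return worst + 1, sum(healthy) / 2.0


faulty, value = isolate_faulty_sensor(5.0, 5.01, 9.0, threshold=0.1)
```

In the example, sensor 3 is isolated and the mean of the two remaining readings is used. Picking the largest product avoids a false isolation when a very large fault also lifts the other two products above the threshold.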

3.5.5 Filtering of Electromagnetic Spikes


Electromagnetic compatibility (EMC) is a key issue for marine process control systems. Classification society approval requires EMC properties to be sufficient so that the equipment will not be disturbed by radio frequency signals in daily operation. One of the means to meet EMC requirements is to use EMC filtering and spike suppression in the front stages of process computer interfaces.
In relation to discrete control loops, difficulties arise if a sensor value is corrupted by a transient electromagnetic disturbance. The value is "frozen" once every sample, and the consequence of a transient disturbance is therefore higher in computerized equipment than with earlier generations based on analogue techniques.
The multiple sensor idea can be used here to considerably reduce spike sensitivity with very little software overhead. In each sampling cycle, the sampling should consist of three A/D conversions in rapid succession. If the three measurements differ, one measurement is discarded as described above.
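A compact way to illustrate triple conversion is to take the median of the three conversions, which discards a single outlier. Note that this median rule is a simpler stand-in for the product-based isolation described above, not the method of the text; `read_adc` is a hypothetical callable returning one conversion result.

```python
def triple_sample(read_adc):
    """Take three conversions in rapid succession and return the median.

    A single conversion hit by a spike ends up as the minimum or maximum
    and is therefore ignored.
    """
    a, b, c = read_adc(), read_adc(), read_adc()
    return sorted((a, b, c))[1]


# Example with a simulated converter where the second conversion is hit by a spike.
readings = iter([5.0, 50.0, 5.1])
value = triple_sample(lambda: next(readings))
```

The overhead is two extra conversions and one three-element sort per sampling cycle.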

Figure 3.9: Triple conversion sampling has only marginal overhead but offers both significant electromagnetic spike suppression and consistency check within one sample. (Time axis marked at Ts, 2Ts, 3Ts.)

3.6 Actuators

Actuators | Type of input to component | Computer interface | Comments
Three-way valve with AC motor positioner | a) closure of "O" contact activates "open" relay (220 V AC); b) closure of "C" contact activates "close" relay (220 V AC) | a) Close "O" contact for opening (digital output relay); b) Close "C" contact for closing (digital output relay); c) Pot. meter input; d) 2 NC switch inputs | a and b) Valve moves as long as open or close signals are present; c) Angle indication with potentiometer; d) End position indication with 2 NC switches
Motor starter for pump | a) Closed contact activates start (220 V AC) | a) Close contact for start (digital output relay); b) NC switch to indicate running; c) NC switch to indicate local | Timing of start sequence is local within the starter

Actuator fault detection can be made using information on both control and feedback signals from the actuator. Analytic redundancy and model based methods are very efficient in this respect. These methods use knowledge about static and, if needed, dynamic relations within the actuator to compare expected performance with measurements. Key issues are treated in Blanke et al. (1993) [9], where detection methods and other issues of relevance to closed loop fault handling are discussed.
3.7 Interface Requirements


The discussion of interfaces and the analysis of fault propagation for the selected components have led to a set of requirements to the interface in order to obtain fault detection at the single component level. The requirements affect both hardware and firmware/software of the process control or monitoring system.

3.7.1 Requirements to hardware and firmware


General requirements for analog signals and critical binary signals:

1. A short circuit between any two wires, or between any wire and ground, shall be detected.
2. Any open connection shall be detected.

Specific requirements:

For 4-20 mA current output from sensors/transmitters:
1. A/D converter range is 0 to 24 mA.
2. If the current exceeds 24 mA, or is below 0 mA (reverse current), the converter must indicate 24 mA or 0 mA respectively, and not wrap around.

For voltage measurements, the following conditions shall be detected:
1. any wire shorts to any other wire,
2. any wire shorts to ground,
3. any wire is open.

For potentiometer measurements:
1. The voltage supply is unipolar. A symmetrical supply around zero, as is common practice, cannot detect shorts between ground/zero level and wiper.
2. The input circuit on the voltage amplifier is driven out of range if the input wire is open.

3.7.2 Combined Hardware - Software requirements


For feedback elements and critical sensors for shut-down systems, time to detect an open connection or short circuit fault shall be down to one sample.
For set-point elements, time-history roll-back shall be possible.
The following methods are recommended to be available for single sensor detection:
1. Range check: Check whether specified upper and lower limits are exceeded.
2. Slew rate check: Check whether a specified rate of change has been exceeded.
3. RMS value check: Check whether the RMS value of the signal exceeds a specified limit.
Sampling is recommended to be made such that transient electromagnetic disturbances can be detected and isolated within one measurement cycle.
Faults within computer hardware and software must be detected by firmware to the extent possible, i.e., according to classification society rules.

Chapter 4

Fault Detection and Isolation


Extension of feedback control systems with methods for fault detection, isolation and accommodation is needed to avoid unacceptable excitations in plant states when faults occur. Production stop, plant failure or direct damage should be avoided to the extent possible.
Fault-tolerant control design is the technique to prevent simple faults associated with control loops from developing into failures when possible degradation in performance can be tolerated. If not, fail-safe design needs to be employed. This chapter deals with fault-tolerant design, where the ingredients are detection, isolation and subsequent accommodation of faults. The first step is to introduce relevant techniques for fault detection and isolation. This is the subject of this chapter.

4.1 FDI in Closed Loop Control Systems


Faults in the instruments of control loop systems are especially crucial for the entire operation of a system. The possibilities to accomplish FDIA in such systems depend on the ability to distinguish the conditions met in fault situations from plant behaviour in normal operation. In normal operation, feedback control should keep a process state equal to a desired setpoint, while the influences from process disturbances and measurement noise are kept minimal. In abnormal operation, when faults have occurred, the control loop should react immediately in a way that prevents a fault from developing into a malfunction of the system being controlled.
The general procedure of FDIA consists of three basic steps:

Change Detection. Residuals are generated by means of a mathematical system model and measurements. Residuals are signals that carry information on the system operational conditions, i.e., whether the system operates under normal or abnormal conditions.

Change Evaluation. If the information contained in the residuals is decided to be caused by a fault, then it is isolated, i.e., the location and time (sometimes also type, size, source) are determined.

Fault accommodation. In the case of an isolated fault, an action is taken which ensures that the process accommodates to the faulty situation.

Figure 4.1: Levels of FDIA automation. (Blocks: Process & Control, Change Detector, Change Evaluation, Fault Accommodator.)

The procedure of model based FDI is shown in Fig. 4.1. When model based methods are applied for FDI, it is necessary that input/output signals of the monitored process are available and that the dynamic characteristics of the system are known with a reasonable degree of precision. The information about the faults contained in the residuals depends highly on the available model. An accurate model gives residuals with desirable relations between the residuals and the faults. An inaccurate model will produce relations which deviate from the desired. An inaccurate model must necessarily be used when the process knowledge is low and/or in order to decrease the design and on-line calculation complexity.
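The idea of a residual can be illustrated with the simplest possible generator: an open-loop simulation of a scalar first-order model whose output is compared with the measurement. This is a sketch under strong assumptions (perfect model, no noise, no observer feedback); the model x(k+1) = a x(k) + b u(k), y(k) = x(k) and all names are hypothetical.

```python
def residuals(u, y, a, b, x0=0.0):
    """Residual r(k) = y(k) - y_model(k) for the model x(k+1) = a x(k) + b u(k), y = x."""
    x = x0
    r = []
    for uk, yk in zip(u, y):
        r.append(yk - x)
        x = a * x + b * uk
    return r


def first_alarm(r, threshold):
    """Index of the first residual whose magnitude exceeds the threshold, or None."""
    for k, rk in enumerate(r):
        if abs(rk) > threshold:
            return k
    return None


# Simulated plant with an additive sensor fault of size 1.0 from sample 5.
a, b = 0.9, 0.1
u = [1.0] * 10
x, y = 0.0, []
for k, uk in enumerate(u):
    y.append(x + (1.0 if k >= 5 else 0.0))
    x = a * x + b * uk
alarm = first_alarm(residuals(u, y, a, b), threshold=0.5)
```

With a perfect model the residual is zero until the fault occurs and equal to the fault size afterwards. With an inaccurate model the residual would also respond to the input u, which is the deviation from the desired relations mentioned above.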
Several articles concerning FDI investigations using model based methods have been published. Examples are Patton, Frank and Clarke (1989) [45], Patton and Chen (1991) [42], Frank (1991) [19], Isermann (1991) [16], the classic survey paper by Willsky (1976) [50] and later surveys by Frank (1990) [18], Isermann (1994) [29], and Gertler (1993) [25]. Methods for statistical change detection are dealt with in the next chapter. A key reference is Basseville and Nikiforov (1993) [3].


4.2 Requirements to FDI


The requirements to fault detection are closely related to the application of the result of a detection. One important parameter is the time to detect that a fault has occurred.

- With incipient component faults detected for use in maintenance planning, FDI response may be rather slow (minutes to hours).
- With abrupt component faults in a process where detection is used for operator assisted change of operational mode, detection must be more responsive (seconds to minutes).
- If abrupt faults in set-point values to a closed loop control are considered, and used by the controller for automatic re-configuration, time to detect should be within a few samples (5 to 10).
- If abrupt faults in feedback elements in a closed loop control are considered, time to detect should be within one to two samples.

The categorization above indicates that we need to distinguish between the temporal development of a fault (abrupt to incipient) and the use of the information (from maintenance planning to autonomous re-configuration of a controller). In this context, fault tolerant control is in focus. Process faults are, therefore, only considered to the extent that their effects propagate through the control system and, at the same time, the control system can alter the propagation.

4.3 Modelling of Faults and Fault-propagation


Fault effects were analysed in Chapter 1 using FMEA techniques. The result of this analysis was a list of fault effects to be considered. A subsequent step in the overall design methodology was mathematical modelling, including the listed fault effects. This eventually led to mathematical models of the relevant parts of the particular process and its control system. These faults are described as either additive or multiplicative. The two types of faults enter differently in a state space model. We use a discrete time representation, knowing that the plant under concern is usually continuous with discrete time measurements.
A system with measured inputs u(k) and output measurements y(k) is considered. The system can be subject to any actuator or sensor fault, reflected in the vector f(k), unknown inputs (disturbances) d(k), and process and measurement noise, w(k) and v(k) respectively. Considering the fault and disturbance vectors as purely additive, such a system is modelled as

x(k+1) = A x(k) + B u(k) + E_1 d(k) + F_1 f(k) + w(k)
y(k) = C x(k) + E_2 d(k) + F_2 f(k) + v(k)                    (4.1)

where x(k) is the state vector, A, B and C are known system matrices, E_1 and E_2 are known matrices for unknown inputs, and F_1 and F_2 are known fault entry matrices. w(k) and v(k) are discrete time Gaussian white noise processes (vectors) with zero mean and covariances

E[w(k) w^T(k)] = Q_1
E[w(k) v^T(k)] = Q_{12}
E[v(k) v^T(k)] = Q_2                    (4.2)

If more general multiplicative faults are considered, we get

$$
\begin{aligned}
x(k+1) &= A\left[I + E_A f(k) + F_A d(k)\right] x(k) + B\left[I + E_B f(k) + F_B d(k)\right] u(k) + w(k) \\
y(k) &= C\left[I - E_C f(k) + F_C d(k)\right] x(k) + v(k)
\end{aligned}
\tag{4.3}
$$

The effects of multiplicative faults clearly depend on the state
and input signals. The additive fault case is simpler, as the effect of a fault
develops depending on the dynamics of the plant (and not its present state)
and the dynamics of the fault. Fault detection is somewhat different in the
two cases. Additive faults can be detected using filtering techniques;
multiplicative faults are better detected using system parameter
estimation techniques. Within the limited scope of this course, additive
faults will be the prime concern.
Considering only additive faults, the state space description in Eq. (4.1)
has the equivalent input/output model

$$
\begin{aligned}
y(z) &= C(zI - A)^{-1}\left[ B u(z) + E_1 d(z) + F_1 f(z) + w(z) \right] + E_2 d(z) + F_2 f(z) + v(z) \\
     &= H_u(z) u(z) + H_{ud}(z) d(z) + H_f(z) f(z) + H_w(z) w(z) + v(z)
\end{aligned}
\tag{4.4}
$$

where each element of the H(z) matrices is a transfer function, that is, a
rational function of the shift operator z:

$$
\begin{aligned}
H_u(z) &= C(zI - A)^{-1} B \\
H_{ud}(z) &= C(zI - A)^{-1} E_1 + E_2 \\
H_f(z) &= C(zI - A)^{-1} F_1 + F_2 \\
H_w(z) &= C(zI - A)^{-1}
\end{aligned}
\tag{4.5}
$$

The dynamic properties of the fault and of the filter functions above determine the dynamic changes in the measurements when a fault occurs.


4.4 Methods for Change Detection


Residuals for change detection are generated by means of available measurements and process models. A variety of methods for residual generation
have been presented in the literature. They can be brought down to two
basic concepts:

Geometric approaches:

- Parity space equations.
- Diagnostic observers.

Statistical approaches:

- Kalman filtering.
- Parameter estimation.
- Statistical change detection.

The geometric approaches generate residuals which contain information
about system changes due to faults, as changes in magnitude. Under ideal
conditions, when the system operates normally, residuals are close to zero.
If a faulty condition arises, one or more elements of the residual vector
become nonzero.
The statistical approaches generate residuals with information about changes
in system statistics due to faults, e.g., changes in mean value or covariance.

4.5 Geometric Approaches to Change Detection


In the geometric framework, residuals for change detection are generated by
reconstruction of output measurements. The basis for the design is a description
of the system including potential faults and unknown inputs. The differences
between measured and reconstructed outputs are error signals. These are
sensitive to faults and, in many cases, also to unknown inputs. In order
to give the error signals specified properties in relation to the faults (i.e.,
to obtain fault isolation), it may be necessary to filter the error signals. A
very general architecture for residual generation is shown in Fig. 4.2. The
residual vector r(z) can be calculated as a function of the fault vector f(z)
so that

$$
r(z) = H(z) f(z)
\tag{4.6}
$$

Definition 1 Detectability: the ability to detect the presence of one out
of several faults $f_1(k), f_2(k), \ldots, f_m(k)$, defined as

$$
r(k) \neq 0 \quad \text{when} \quad f(k) \neq 0
\tag{4.7}
$$

[Figure 4.2: block diagram. Blocks: Physical Plant, Plant Model, Output error Filter. Signals: Inputs, Faults, Disturbances, Outputs, Error signals, Residuals.]

Figure 4.2: A general procedure of residual generation in FDI


This means that when a fault occurs, one or more components of the residual vector will change in magnitude and make it possible to recognize, e.g.
by a threshold detector, that some change has taken place. Detectability
implies that a set of faults can be detected, but not necessarily isolated.
In mathematical terms, the residual vector r(k) is a function of all possible
faults:

$$
r(k) = h(f_1(k), f_2(k), \ldots, f_m(k)).
\tag{4.8}
$$

The geometric interpretation is depicted in Fig. 4.3, where three faults are
considered.

Definition 2 Isolability: the ability to isolate a fault, defined as

$$
r_i(k) \neq 0 \quad \text{when} \quad f_i(k) \neq 0
\tag{4.9}
$$

Isolability means that the i'th residual is affected by only the i'th fault, so that
the fault has been isolated (Frank, 1990) [18].
4.5.1 Generation of residuals


In general, two types of fault specific residuals can be generated:

- The faults $f_1(k), f_2(k), \ldots, f_m(k)$ can be detected and isolated simultaneously.
- The faults $f_1(k), f_2(k), \ldots, f_m(k)$ can be detected and isolated one
  at a time.

[Figure 4.3: three-dimensional residual space with axes r1, r2, r3; label: Fault space.]

Figure 4.3: Geometric interpretation for detection of several faults.

The first approach leads to a single residual generator where the residual
vector r(k) is filtered to give the desired sensitivity to particular faults. The
second can be implemented as a bank of filters, each being sensitive to a
particular fault. The two strategies are elaborated below.

One filter to generate a vector residual


The first strategy can be implemented as one filter producing a residual
vector with the desired properties. Each residual is constructed to be insensitive to one specific fault. The i'th residual becomes a function of all faults
except number i:

$$
r_i(k) = h_i(f_1, f_2, \ldots, f_{i-1}, f_{i+1}, \ldots, f_m).
\tag{4.10}
$$

If the fault $f_i(k)$ occurs, then all residuals except the i'th will respond
to that fault. The number of residuals must be larger than two. The geometric interpretation is depicted in Fig. 4.4. With a single residual generator
configuration, the number of possible error signals equals the number of
available measurements. This gives one bound on the number of faults it is
possible to isolate. If m measurements are considered, m error signals can
be constructed. The faults are mapped onto components of the residual vector
following Eq. (4.8). The number n of faults which theoretically can be detected
and isolated satisfies $n \le m$. The fault isolation procedure depends on the possibility of designing a filter so that each component of the residual vector
has specific properties with respect to particular fault effects or a combination of these.

[Figure 4.4: three-dimensional residual space with axes r1, r2, r3; labels: Fault 1, Fault 2, Fault 3.]

Figure 4.4: Geometric interpretation for detection and isolation of one fault
at a time.

Bank of filters to generate residuals


A bank of residual generators, each of which is usually designed to be optimally sensitive to one specific fault, gives a larger freedom in the design, because
the bank of detectors has a larger number of independent parameters than
an implementation with just one vectorized detector. The i'th residual
is constructed to be sensitive to only the i'th fault: $r_i(k) = h_i(f_i(k))$. The
geometric interpretation is depicted in Fig. 4.4 (Frank, 1990) [18].
Each residual generator is constructed to have optimal properties with respect to
one specific fault. Both input/output measurements and residual generator
dynamics are selected so that error signals are generated in an optimal way in
relation to the specific fault. It is possible to obtain the same specifications
for the two cases.
The following descriptions of geometric approaches for FDI consider a
single residual generator configuration. A set of such residual generators
can subsequently be coupled into a bank to perform simultaneous detection
and isolation. The next sections describe different techniques for residual
generation within the geometric approach.

4.5.2 Parity Equations


Parity equations give signals that show inconsistency between the non-faulty
model and the process output signals. The name parity comes from chemical
relations where the left and right hand sides of a reaction must be in balance.
Fig. 4.5 shows residual generation using parity equations.
The output measurements are generated from the non-faulty nominal process
description, referring to Eq. (4.5) without noise contributions:

$$
\hat{y}(k) = H_u(z) u(k)
\tag{4.11}
$$

[Figure 4.5: block diagram. Blocks: Plant, Plant Model, Filter. Signals: u, y, $\hat{y}$; the model output $\hat{y}$ is subtracted from the plant output y before the filter.]

Figure 4.5: Geometric interpretation for simultaneous fault detection and
isolation.

The output error signals become:

$$
\begin{aligned}
e_y(k) &= y(k) - \hat{y}(k) \\
       &= y(k) - H_u(z) u(k) \\
       &= H_{ud}(z) d(k) + H_f(z) f(k)
\end{aligned}
\tag{4.12}
$$

These error signals can be sensitive to all potential faults and to the disturbances. A filter W(z) is constructed to generate residuals, $r(k) = W(z) e_y(k)$,
which have the desired properties with respect to faults.
To decouple the disturbances, $W(z) H_{ud}(z)$ must be zero. Fault detection can then be performed, but not fault isolation. For detection and isolation of the i'th fault on the i'th residual, the i'th row of W(z) multiplied
with any column of $H_f(z)$ except the i'th must equal zero.
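The output error of Eq. (4.12) can be sketched for a scalar first-order plant as follows. The plant parameters, the sensor bias of 0.3 and the function name `parity_residual` are assumptions chosen for illustration, and no filter W(z) is applied.

```python
def parity_residual(u_seq, y_seq, a=0.8, b=0.5, c=1.0):
    """Output error e_y(k) = y(k) - y_hat(k), with y_hat from the fault-free
    model run in parallel with the plant (illustrative scalar model)."""
    x_hat, res = 0.0, []
    for u, y in zip(u_seq, y_seq):
        res.append(y - c * x_hat)
        x_hat = a * x_hat + b * u
    return res

# Plant data: same dynamics as the model, plus an assumed sensor bias
# of 0.3 that appears from sample 50 onwards.
u_seq = [1.0] * 100
x, y_seq = 0.0, []
for k, u in enumerate(u_seq):
    bias = 0.3 if k >= 50 else 0.0
    y_seq.append(1.0 * x + bias)
    x = 0.8 * x + 0.5 * u

r = parity_residual(u_seq, y_seq)
```

In this noise-free sketch the residual is zero until the bias appears and equals the bias afterwards, since a sensor fault enters the output directly.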

4.5.3 Diagnostic Observer


Consider the process description in Eq. (4.1) neglecting the noise terms. The
output measurements can be reconstructed by means of the observer

$$
\begin{aligned}
\hat{x}(k+1) &= A \hat{x}(k) + B u(k) + K\left(y(k) - \hat{y}(k)\right) \\
\hat{y}(k) &= C \hat{x}(k)
\end{aligned}
\tag{4.13}
$$

where $\hat{x}(k)$ are the estimated states, $\hat{y}(k)$ the estimated outputs and K is
the observer feedback gain matrix.
Using $G = A - KC$, the state estimation error, $e_x(k)$, and the output
estimation error, $e_y(k)$, are:

$$
\begin{aligned}
e_x(k+1) &= x(k+1) - \hat{x}(k+1) \\
         &= G e_x(k) + (E_1 - K E_2) d(k) + (F_1 - K F_2) f(k) + w(k) - K v(k) \\
e_x(k) &= (zI - G)^{-1}\left[ (F_1 - K F_2) f(k) + (E_1 - K E_2) d(k) \right]
          + (zI - G)^{-1}\left[ w(k) - K v(k) \right] \\
e_y(k) &= y(k) - \hat{y}(k) \\
       &= C e_x(k) + E_2 d(k) + F_2 f(k) + v(k)
\end{aligned}
\tag{4.14}
$$

The error vector $e_y(k)$ can be given fault specific properties by multiplication
with a filter matrix W(z),

$$
r(z) = W(z) e_y(z)
\tag{4.15}
$$

The task is to design the observer feedback gain K and the filter matrix W so that the observer has optimal properties with respect to faults and unknown
inputs.
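As a rough illustration of the observer-based output error, the sketch below runs a scalar observer against a plant with an additive actuator fault. The values of a, b, c and K, the fault size and the name `observer_residuals` are assumptions for illustration; in the notation above this corresponds to $F_1 = b$ and $F_2 = 0$, with no noise and W = 1.

```python
def observer_residuals(u_seq, y_seq, a, b, c, K):
    """Scalar observer: x_hat(k+1) = a x_hat + b u + K (y - c x_hat).
    Returns the output estimation error e_y(k) = y(k) - c x_hat(k)."""
    x_hat, res = 0.0, []
    for u, y in zip(u_seq, y_seq):
        e_y = y - c * x_hat
        res.append(e_y)
        x_hat = a * x_hat + b * u + K * e_y
    return res

a, b, c, K = 0.8, 0.5, 1.0, 0.5          # assumed values; G = a - K c = 0.3
u_seq = [1.0] * 100
x, y_seq = 0.0, []
for k, u in enumerate(u_seq):
    f = 0.2 if k >= 50 else 0.0           # assumed actuator offset from sample 50
    y_seq.append(c * x)
    x = a * x + b * (u + f)

r = observer_residuals(u_seq, y_seq, a, b, c, K)
```

The error dynamics are governed by G, so after the fault the residual settles at $c\,b\,f / (1 - G)$, which is 0.1/0.7 for the values assumed here. A larger gain K gives a faster but smaller steady-state response to this fault.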
Several methods exist for designing the two matrices. One is the eigenstructure
assignment approach (Patton and Chen, 1991) [42]. This approach for observer design is based on the eigenpair equations, with $v_j$ a
right side eigenvector of G, $w_j^T$ a left side eigenvector of G, and $\lambda_j$ an
eigenvalue:

$$
\begin{aligned}
\left[\lambda_j I - G\right] v_j &= 0 \\
w_j^T \left[\lambda_j I - G\right] &= 0 \\
\det(\lambda_j I - G) &= 0
\end{aligned}
\tag{4.16}
$$

Following the matrix inversion lemma:

$$
(A + BCD)^{-1} = A^{-1} - A^{-1} B \left(C^{-1} + D A^{-1} B\right)^{-1} D A^{-1}
\tag{4.17}
$$

Since this lemma is valid for all sets of A, B, C, D matrices having appropriate dimensions, with A and C invertible, we may use the following renaming:
$A \to C^{-1}$, $B \to D$, $C \to A^{-1}$, $D \to B$ and obtain

$$
\left(C^{-1} + D A^{-1} B\right)^{-1} = C - C D (A + BCD)^{-1} B C
\tag{4.18}
$$

This is used to rewrite the estimation error $e_x(k)$ in Eq. (4.14) by setting
$(zI - G)^{-1} = (A + BCD)^{-1}$, i.e., $zI = A$ and $-G = BCD$. Neglecting the noise terms, $e_x(k)$
becomes:

$$
\begin{aligned}
e_x(k) &= \left(z^{-1} I + G z^{-2} + \ldots\right)\left[ (E_1 - K E_2) d(k) + (F_1 - K F_2) f(k) \right] \\
       &= z^{-1} \sum_{m=0}^{\infty} G^m z^{-m} \left[ (E_1 - K E_2) d(k) + (F_1 - K F_2) f(k) \right]
\end{aligned}
\tag{4.19}
$$
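giving the residual r(z):

$$
\begin{aligned}
r(z) &= W \left[ C z^{-1} \sum_{m=0}^{\infty} G^m z^{-m} \left( (E_1 - K E_2) d(z) + (F_1 - K F_2) f(z) \right) + E_2 d(z) + F_2 f(z) \right] \\
     &= W \left[ C z^{-1} \sum_{m=0}^{\infty} G^m z^{-m} E_{1tot} + E_{2tot} \right] u_{tot}(z)
\end{aligned}
\tag{4.20}
$$

where

$$
\begin{aligned}
E_{1tot} &= \left[ (E_1 - K E_2) \;\; (F_1 - K F_2) \right] \\
E_{2tot} &= \left[ E_2 \;\; F_2 \right] \\
u_{tot}(z) &= \left[ d^T(z) \;\; f^T(z) \right]^T
\end{aligned}
\tag{4.21}
$$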

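The goal is to design the matrices W and K so that one residual, $r_i(k)$,
from the vector r(k) in Eq. (4.20) is decoupled from all influences (faults
or unknown inputs) except the particular fault $f_i(k)$. The left and right side
eigenvector assignments are described for this purpose.

Left Side Eigenvector Assignment

If a row $w_j^T C$ of WC in Eq. (4.20) is made a left side eigenvector of G for
an eigenvalue $\lambda_j$, then in relation to Eq. (4.16):

$$
\begin{aligned}
& WC(\lambda_j I - G) = 0 \\
& \Downarrow \\
& WC \lambda_j = WCG \\
& \Downarrow \\
& WC \sum_{m=0}^{\infty} \lambda_j^m = WC \sum_{m=0}^{\infty} G^m
\end{aligned}
\tag{4.22}
$$

Using Eq. (4.22) in Eq. (4.20) gives the residual:

$$
\begin{aligned}
r(z) &= W \left[ C z^{-1} \sum_{m=0}^{\infty} G^m z^{-m} E_{1tot} + E_{2tot} \right] u_{tot}(z) \\
     &= W \left[ C z^{-1} \sum_{m=0}^{\infty} \lambda_j^m z^{-m} E_{1tot} + E_{2tot} \right] u_{tot}(z)
\end{aligned}
\tag{4.23}
$$

The limit value of the sum as m goes to infinity is:

$$
\sum_{m=0}^{\infty} \left( \frac{\lambda_j}{z} \right)^m = \frac{z}{z - \lambda_j}
\quad \text{for } |\lambda_j| < 1
\tag{4.24}
$$

and the residual becomes:

$$
r_j(z) = W \left[ \frac{C E_{1tot}}{z - \lambda_j} + E_{2tot} \right] u_{tot}(z)
\tag{4.25}
$$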


This means that the left side eigenstructure assignment results in a filter
matrix which is independent of z.
If W furthermore is designed so that the i'th row of W multiplied with
any column except the i'th of $C E_{1tot}$ and $E_{2tot}$ equals 0, then the i'th residual
is only affected by the i'th disturbance. Fault detection and isolation of the
i'th fault is thus accomplished.
Right Side Eigenvector Assignment

If all columns of $E_{1tot}$ in Eq. (4.21) are made right side eigenvectors
of G for an eigenvalue $\lambda_j$, then in relation to Eq. (4.16):

$$
\begin{aligned}
& (\lambda_j I - G) E_{1tot} = 0 \\
& \Downarrow \\
& \lambda_j E_{1tot} = G E_{1tot} \\
& \Downarrow \\
& \sum_{m=0}^{\infty} \lambda_j^m E_{1tot} = \sum_{m=0}^{\infty} G^m E_{1tot}
\end{aligned}
\tag{4.26}
$$

Substituting Eq. (4.26) into Eq. (4.20), r(z) will take the form of Eq.
(4.25).
The right side eigenstructure assignment determines the values in K but
not W. The latter matrix is therefore free for further decoupling of any
other external influence except the fault to be detected.
Still, if the i'th disturbance is to be detected on the i'th residual, then
the i'th row of W multiplied with any column except the i'th of

$$
\frac{C E_{1tot}}{z - \lambda_j} + E_{2tot}
\tag{4.27}
$$

must equal 0. The result is that all influences, except the i'th fault, are
decoupled from the i'th residual. In this design W becomes a function of z.
Alternatively, the W matrix design can be done as described for the left
side assignment, where W is independent of z.
Instead of making each column of $E_{1tot}$ a right side eigenvector of G,
it may be sufficient to use only selected columns. If, for instance, the i'th
fault within the fault vector f(k) in Eq. (4.21) is to be detected, then all
columns of $E_{1tot}$ except the i'th are made right side eigenvectors of G.

4.5.4 Unknown Input Observer


Another way of generating the residuals is to consider the estimator as an
unknown input observer (Patton and Chen [42], Frank [18]). This approach
considers a new state z(k) as a linear transformation of the state vector
x(k), so that z(k) = Tx(k). The observer has the form:

$$
\begin{aligned}
z(k+1) &= M z(k) + J u(k) + L y(k) \\
\hat{y}(k) &= C T^{-1} z(k)
\end{aligned}
\tag{4.28}
$$

giving the estimation and output errors:

$$
\begin{aligned}
e_x(k+1) &= T x(k+1) - z(k+1) \\
         &= M e_x(k) + (TA - MT - LC) x(k) + (TB - J) u(k) \\
         &\quad + (T E_1 - L E_2) d(k) + (T F_1 - L F_2) f(k) \\
e_y(k) &= y(k) - C T^{-1} z(k) \\
       &= C\left(x(k) - T^{-1} z(k)\right) + E_2 d(k) + F_2 f(k) \\
       &= C T^{-1} e_x(k) + E_2 d(k) + F_2 f(k)
\end{aligned}
\tag{4.29}
$$

Now if every part involving the states, the inputs and the unknown inputs
can be made zero, the state estimation error will only be influenced by the
faults, i.e.:

$$
\begin{aligned}
TA - MT - LC &= 0 \\
TB - J &= 0 \\
T E_1 - L E_2 &= 0
\end{aligned}
\tag{4.30}
$$

and the state and output errors become:

$$
\begin{aligned}
e_x(k) &= (zI - M)^{-1} (T F_1 - L F_2) f(k) \\
e_y(k) &= C T^{-1} (zI - M)^{-1} (T F_1 - L F_2) f(k) + E_2 d(k) + F_2 f(k)
\end{aligned}
\tag{4.31}
$$

Now fault detection is accomplished. Furthermore, if the design matrices
contain elements so that $e_y(k)$ has specified properties with respect to the faults,
isolation is achieved. Again it may be necessary to apply a filter to the
output error to give the residual specified properties with respect to the faults. Here the
filter will become a function of z. For unknown input decoupling:

$$
W(z) E_2 = 0
\tag{4.32}
$$

Detection and isolation of the i'th fault on the i'th residual implies that the
i'th row of W(z) multiplied with any column of

$$
C T^{-1} (zI - M)^{-1} (T F_1 - L F_2) + F_2
\tag{4.33}
$$

except the i'th must equal zero.
4.6 Statistical Methods to Generate Residuals


In the statistical framework, the residuals for FDI are signals or quantities which contain information about the fault statistics. The basis for the designs
is knowledge of the system statistics, both under normal and faulty conditions, and the system descriptions. Inspection of changes in mean value
or covariance of the quantities is used for determining whether a fault has
occurred. The detection problem is further elaborated in the chapter on
statistical fault detection.

4.6.1 Kalman Filtering


Application of the Kalman filter in a statistical setting uses the fact
that the Kalman gain, K, is designed so that the state estimation error
is white, and has minimal covariance under normal operation and in the
presence of random noise. For sensor and actuator fault detection, a bank of
filters is designed, each considering different fault conditions. Under normal
operation all innovations are expected to be distributed with zero mean and a
known covariance. The occurrence of a fault will cause at least one residual
to change statistics, and the detection can then be performed by inspection
of changes in the mean value of the residual.
The Kalman filter is based on the model given in Eq. (4.1). The output measurements
are reconstructed by means of the estimator given in Eq. (4.13). The state
and output estimation errors are then:

$$
\begin{aligned}
e_x(k+1) &= x(k+1) - \hat{x}(k+1) \\
         &= G e_x(k) + (E_1 - K E_2) d(k) + (F_1 - K F_2) f(k) + w(k) - K v(k) \\
e_x(k) &= (zI - G)^{-1}\left( (E_1 - K E_2) d(k) + (F_1 - K F_2) f(k) + w(k) - K v(k) \right) \\
e_y(k) &= y(k) - \hat{y}(k) \\
       &= C e_x(k) + E_2 d(k) + F_2 f(k) + v(k)
\end{aligned}
\tag{4.34}
$$

The design approach is, normally, to minimize the variance of the estimation
error, which is denoted P(k):

$$
P(k) = E\left[ \left(e_x(k) - \bar{e}_x(k)\right) \left(e_x(k) - \bar{e}_x(k)\right)^T \right]
\tag{4.35}
$$

This means that K is not designed to give the residuals optimal properties
for the detection of faults. The gain matrix K and the estimation error
variance P(k) can be determined from Eq. (4.36).

$$
\begin{aligned}
K(k) &= A P(k) C^T \left[ Q_v + C P(k) C^T \right]^{-1} \\
P(k+1) &= A P(k) A^T + Q_w - A P(k) C^T \left[ Q_v + C P(k) C^T \right]^{-1} C P(k) A^T
\end{aligned}
\tag{4.36}
$$

The latter is the usual matrix Riccati equation, where the covariance matrices
of process and measurement noise are

$$
Q_w = E\left\{ w w^T \right\}, \qquad Q_v = E\left\{ v v^T \right\}
$$
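The gain and covariance recursion can be iterated to its steady state. The sketch below does this for a scalar system, using the standard predictor-form Riccati recursion with `q_w` as process noise variance and `q_v` as measurement noise variance. The numerical values and the function name are assumptions for illustration only.

```python
def riccati_steady_state(a, c, q_w, q_v, p0=1.0, n_iter=500):
    """Iterate the scalar Riccati recursion
        K = a P c / (q_v + c P c)
        P <- a P a + q_w - K c P a
    and return the gain and error variance after n_iter steps."""
    p = p0
    for _ in range(n_iter):
        k_gain = a * p * c / (q_v + c * p * c)
        p = a * p * a + q_w - k_gain * c * p * a
    k_gain = a * p * c / (q_v + c * p * c)
    return k_gain, p

# Assumed example values: slow stable plant, noisy measurement.
K, P = riccati_steady_state(a=0.9, c=1.0, q_w=0.1, q_v=1.0)
G = 0.9 - K * 1.0   # closed-loop observer pole a - K c
```

For these assumed numbers the fixed point satisfies $P^2 + 0.09P - 0.1 = 0$, and the resulting observer pole G lies inside the unit circle, as required for the error dynamics in Eq. (4.34) to be stable.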

Fault specific properties can be given to the error vector, $e_y(k)$, by multiplying with a filter matrix W so that:

$$
r(k) = W e_y(k)
\tag{4.37}
$$

This filter must ensure that unknown inputs are decoupled from r(k),
and that the desired properties with respect to the faults are obtained. If the i'th row of:

$$
r(k) = W \left[ C (zI - G)^{-1} (E_1 - K E_2) + E_2 \right] d(k)
\tag{4.38}
$$

equals zero, then the unknown inputs are decoupled from the i'th residual.
It is now possible to perform fault detection, but not isolation, because the
i'th residual depends on all faults. If the i'th fault, $f_i(k)$, contained in
the vector f(k), is to be detected and isolated from all other faults on the
residual $r_i(k)$, then the i'th row of the matrix multiplying f(k) in

$$
r(k) = W \left[ C (zI - G)^{-1} (F_1 - K F_2) + F_2 \right] f(k)
\tag{4.39}
$$

must be zero in every column except the i'th. This procedure determines the i'th row of the filter
matrix W and is repeated for any other row.
The matrix W is a function of z. If m influences are to be detected and
isolated simultaneously, then m innovations must be available and m - 1
disturbances must be decoupled from each of them. The result is residuals
which are sensitive to only one influence. Usually, the Kalman filtering used
for statistical FDI is configured as a bank of filters.
The overall structure of the filter bank is illustrated in Fig. 4.6.
If no fault is present, both $r_s(k)$ and $r_a(k)$ are innovations with zero
mean and a known covariance. On the other hand, if the system actuator
fails, then $r_a(k)$ is distributed with $N(\mu_f, \sigma_f)$, while $r_s(k)$ still is distributed
with $N(0, \sigma)$ (Gertler, 1988) [23], (Tzafestas and Watanabe, 1990) [48].

[Figure 4.6: block diagram. The input u drives the Plant; one Kalman filter for sensor fault detection gives the residual $r_s$ and one Kalman filter for actuator fault detection gives the residual $r_a$.]

Figure 4.6: Illustration of a bank of Kalman filters for statistical FDI.

4.6.2 Parameter Estimation


In general, the procedure of parameter estimation optimizes a performance
function in relation to the parameters to be estimated, using measurements
and a model structure. This means that the parameters change when the
system dynamics change.
A typical model is linear with lumped parameters, of the form:

$$
\begin{aligned}
a_n y(k+n) + \cdots + a_1 y(k+1) + y(k) &= b_0 u(k) + \cdots + b_m u(k+m) + \varepsilon(k) \\
y(k) &= \varphi^T(k)\, \theta + \varepsilon(k)
\end{aligned}
\tag{4.40}
$$

where $\varepsilon$ is a discrete time noise process. The parameters are estimated using:

$$
\hat{y}(k) = \varphi^T(k)\, \hat{\theta}
\tag{4.41}
$$

where $\varphi(k)$ is a vector containing the measurements and $\theta$ is a vector
containing the parameters to be estimated:

$$
\begin{aligned}
\varphi^T(k) &= \left[ -y(k+1) \cdots -y(k+n),\; u(k) \cdots u(k+m) \right] \\
\theta &= \left[ a_1 \ldots a_n,\; b_0 \ldots b_m \right]^T
\end{aligned}
\tag{4.42}
$$

The output error

$$
y(k) - \hat{y}(k) = y(k) - \varphi^T(k)\, \hat{\theta}
\tag{4.43}
$$

is minimized by optimizing the performance function, which means that the
estimated parameters change when the system changes.
When the nominal parameters and their variance are known, fault detection can be performed by comparing estimated values with their nominal
values. The procedure of generating residuals by means of parameter identification is shown in Fig. 4.7.
If a fault occurs, it is detected as a change in the mean value of the parameter vector $\theta$.
Procedures and methods for parameter estimation in the FDI framework
are treated in (Isermann, 1991 [28]; Isermann, 1984 [27]; Tzafestas and
Watanabe, 1990 [48]; Ljung and Söderström, 1983 [35]; Ljung, 1987 [34]).
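A minimal sketch of residual generation by parameter estimation is given below. It uses an assumed first-order model $y(k) = \theta_1 y(k-1) + \theta_2 u(k-1)$ (a simpler, hypothetical form than the general model above), batch least squares solved through the 2x2 normal equations, and noise-free data. The parameter values and function names are illustrative assumptions.

```python
def least_squares_2(phi_rows, y_vals):
    """Minimize sum (y - phi^T theta)^2 for two parameters via the normal equations."""
    s11 = sum(p[0] * p[0] for p in phi_rows)
    s12 = sum(p[0] * p[1] for p in phi_rows)
    s22 = sum(p[1] * p[1] for p in phi_rows)
    t1 = sum(p[0] * y for p, y in zip(phi_rows, y_vals))
    t2 = sum(p[1] * y for p, y in zip(phi_rows, y_vals))
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

def estimate(a, b, n=200):
    """Generate data from y(k) = a y(k-1) + b u(k-1) and estimate (a, b)."""
    u = [1.0 if (k // 10) % 2 == 0 else -1.0 for k in range(n)]  # square wave excitation
    y = [0.0]
    for k in range(1, n):
        y.append(a * y[k - 1] + b * u[k - 1])
    rows = [(y[k - 1], u[k - 1]) for k in range(1, n)]           # regressors phi(k)
    return least_squares_2(rows, y[1:])

theta_nominal = estimate(0.8, 0.5)     # off-line: nominal plant
theta_faulty = estimate(0.8, 0.35)     # on-line: input gain reduced by an assumed fault
residual = tuple(nom - est for nom, est in zip(theta_nominal, theta_faulty))
```

The residual is the difference between nominal and estimated parameters, so here only the second component (the input gain) deviates from zero. With noisy data the same comparison would be made against the known variance of the nominal estimates.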

[Figure 4.7: block diagram. ONLINE branch: the Actual Plant (input u, faults f, unknown inputs $u_d$) feeds Parameter Identification, followed by Calculation of physical parameters P. OFFLINE branch: the Nominal Plant feeds Parameter Identification, followed by Calculation of physical parameters $P_n$. Final block: Determination of changes, $\Delta P = P_n - P$.]

Figure 4.7: Residual generation based on parameter estimation.


Chapter 5

The Change Detection Problem

Once residuals have been generated, the issue arises of how to determine whether a fault has occurred. This analysis is disturbed by
noise in the measurements and by stochastic disturbances on the
process at hand. It is therefore necessary to consider the statistical properties of a residual and to perform statistically based testing of the hypotheses:
fault or no fault. This chapter starts by recalling some elementary properties of stochastic signals: the properties in amplitude distribution and
time/frequency domain behaviour. The concept of distance is then introduced with the aim of explaining how change detection can be accomplished.
Methods for change detection are then dealt with: the simple threshold test;
the cusum based detector for evaluation of statistical signals, where a certain
confidence can be obtained about the correctness of a detection; and some
implementation considerations, in particular for on-line detection, where a
recursive method is needed. Simple examples illustrate the concepts.

5.1 About stochastic signals


Stochastic signals have a random variation and are described by two main
features:

- The amplitude distribution.
- The time/frequency domain properties.

By random we mean that there is no way to predict an exact value at a
future instant of time.

5.1.1 Amplitude distribution


A random process can be characterized through the amplitude of measurements taken as a time sequence. The properties can be fully determined by
calculation of the moments of the process:

$$
P_n = \int_{-\infty}^{\infty} x^n p(x)\, dx
$$

where p(x) is the probability density function of the signal, and x the amplitude. The first moment is the mean value,

$$
\mu = \int_{-\infty}^{\infty} x\, p(x)\, dx
$$

in other words, the mean value is the linear sum of x over all
values of x, weighted by p(x). Similarly, the mean square is
given by the second order moment

$$
\sigma^2 = \int_{-\infty}^{\infty} x^2 p(x)\, dx
$$

which equals the variance when the mean value is zero.
The probability density function that is easiest to handle in calculations is
the Gaussian distribution with mean value $\mu$ and variance $\sigma^2$:

$$
p(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)
$$

which has the well-known bell shape.
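The moment integrals can be checked numerically. The sketch below evaluates the first and second moments of a Gaussian density with the midpoint rule. The values of the mean and standard deviation, the integration range and the function names are assumptions for illustration.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Gaussian probability density with mean mu and standard deviation sigma."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def moment(n, mu, sigma, half_width=10.0, steps=20000):
    """n-th moment about the origin, integrated by the midpoint rule
    over the interval mu +/- half_width * sigma."""
    lo = mu - half_width * sigma
    dx = 2 * half_width * sigma / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += (x ** n) * gaussian_pdf(x, mu, sigma) * dx
    return total

mean = moment(1, mu=1.0, sigma=2.0)
mean_square = moment(2, mu=1.0, sigma=2.0)
variance = mean_square - mean ** 2      # central second moment
```

With a nonzero mean the second moment about the origin is $\sigma^2 + \mu^2$, so the variance is recovered by subtracting the squared mean.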

5.1.2 Mean and variance of a stationary process


A stationary stochastic process is one whose statistical properties are unchanged in time. The most
important measures of a stochastic signal are the two lowest order moments:
the mean value and the variance. Assuming a stationary process, the mean
value is defined by averaging over time

$$
\mu = \bar{x} = E\{x(t)\} \triangleq \lim_{N \to \infty} \frac{1}{N} \sum_{i=0}^{N} x(t_i)
\tag{5.1}
$$

The variance of a scalar process is defined as the second order central moment.

$$
\sigma^2 = E\left\{ (x(t) - \bar{x})^2 \right\} = \lim_{N \to \infty} \frac{1}{N - 1} \sum_{i=0}^{N} (x(t_i) - \bar{x})^2
\tag{5.2}
$$

For a vector valued process

$$
x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}
$$

the definition of variance is extended to the covariance matrix. Using
the notation

$$
\sigma_{jk}^2 = E\left\{ (x_j - \bar{x}_j)(x_k - \bar{x}_k) \right\}
\tag{5.3}
$$

$$
Q = E\left\{ (x - \bar{x})(x - \bar{x})^T \right\} =
\begin{bmatrix}
\sigma_{11}^2 & \sigma_{12}^2 & \cdots \\
\sigma_{21}^2 & \cdots & \cdots \\
\cdots & \cdots & \sigma_{nn}^2
\end{bmatrix}
\tag{5.4}
$$

5.1.3 Mean and variance of a filtered stationary process


The concept of a white noise process is very popular for computational
reasons. The white noise concept is easy to deal with when we look at
discrete time. Whiteness means that there is no correlation between any
two samples of the process, regardless of how close in time they are.
This concept has somewhat different properties in discrete and
continuous time.

Discrete time


The most important question for a stochastic process, in engineering terms,
is what happens when a filter is applied to it. A filter
can be described as a set of state space equations

$$
x_{k+1} = A x_k + B w_k
$$

with outputs (measurements)

$$
y_k = C x_k + v_k
$$

where v and w are measurement noise and process noise, respectively, with
covariances

$$
E\left\{ w w^T \right\} = Q_w, \qquad E\left\{ v v^T \right\} = Q_v
$$

The mean value is simply given by the propagation of the mean value of
the noise through the filter

$$
\begin{aligned}
\bar{x} &= A \bar{x} + B \bar{w} \\
\bar{y} &= C \bar{x} + \bar{v}
\end{aligned}
$$

or, in filter terms

$$
\bar{y} = \lim_{z \to 1} C (zI - A)^{-1} B \bar{w} + \bar{v}
$$

which, naturally, just expresses the DC gain of the filter. The variance
at the output of a filter is only slightly more complicated to calculate, and
the following expression is very useful to know, and remember, since the
sole purpose of most filtering is to reduce the variance of the stochastic part
of a signal. With $P_k$ being the covariance of the state,

$$
P_k = E\left\{ (x_k - \bar{x})(x_k - \bar{x})^T \right\}
$$

then

$$
P_{k+1} = A P_k A^T + B Q_w B^T
$$

and the covariance of the output is $C P_k C^T + Q_v$.
This equation is the discrete time matrix Lyapunov equation.
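The covariance propagation can be sketched for a scalar filter as below, assuming the standard discrete time recursion $P_{k+1} = A P_k A^T + B Q_w B^T$ for the state and $C P C^T + Q_v$ for the output. The numerical values are assumptions for illustration, and a seeded Monte Carlo run is included only as a plausibility check.

```python
import random

def propagate_variance(a, b, q_w, n_iter=200, p0=0.0):
    """Iterate the scalar recursion P <- a P a + b q_w b towards steady state."""
    p = p0
    for _ in range(n_iter):
        p = a * p * a + b * q_w * b
    return p

a, b, c, q_w, q_v = 0.5, 1.0, 2.0, 1.0, 0.25   # assumed example values
p_state = propagate_variance(a, b, q_w)          # tends to b^2 q_w / (1 - a^2)
p_output = c * p_state * c + q_v                 # output variance incl. measurement noise

# Monte Carlo check of the state variance with a fixed seed
rng = random.Random(1)
x, samples = 0.0, []
for _ in range(50000):
    x = a * x + b * rng.gauss(0.0, q_w ** 0.5)
    samples.append(x)
mean = sum(samples) / len(samples)
sample_var = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
```

For these values the state variance settles at 4/3 and the sample variance of the simulated sequence agrees with it to within a few percent.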

Continuous processes


In continuous time, a difficulty is that, in principle, it is only the
integral of a stochastic process with uncorrelated increments which exists.
The problem lies in the fact that we cannot define a white noise source with
finite intensity and assume it has infinite bandwidth, since this would be a
signal with infinite power. It is outside the scope of this chapter to treat the
continuous noise case, so the interested reader should consult classical texts
in statistics and control, e.g. (Jazwinski, 19xx; Åström, 1970; Maybeck,
1980).
To quote the result, the dynamic filter is

$$
\begin{aligned}
\dot{x} &= A x + B \varepsilon \\
y &= C x + \eta
\end{aligned}
$$

where $\varepsilon$ and $\eta$ are continuous time stochastic processes with intensities $Q_\varepsilon$
and $Q_\eta$ defined by

$$
E\left\{ (\varepsilon(t) - \bar{\varepsilon}) (\varepsilon(\tau) - \bar{\varepsilon})^T \right\} = Q_\varepsilon\, \delta(t - \tau)
$$

and similarly for $Q_\eta$. The propagation equations for the mean value are

$$
\begin{aligned}
\dot{\bar{x}}(t) &= A \bar{x}(t) + B \bar{\varepsilon} \\
\bar{y}(t) &= C \bar{x}(t) + \bar{\eta}
\end{aligned}
$$

The propagation equation for the covariance of the state is

$$
\dot{P} = A P + P A^T + B Q_\varepsilon B^T
$$

The steady state covariance matrix is obtained by setting $\dot{P} = 0$ and solving the resulting
algebraic matrix Lyapunov equation. There are standard routines to do this.

5.2 Measuring the difference between statistical signals


In geometry, distance is the length of the shortest line between two points. This is easy
to grasp in three dimensional space, and easy to extend to more dimensions.
The definition of distance makes it possible to measure area and volume
and determine whether two objects are equal in e.g. area or distance. We
need similar measures for stochastic signals. Probability theory defines the
Kullback-Leibler information between two probability densities of a random
variable. The two probability distributions are made to assume two different
sets of parameters, most simply two different hypotheses about the mean or
variance. Basseville and Nikiforov (1993) [3] show that the Kullback-Leibler
measure can be approximated by

$$
K_N(\theta_0, \theta_1) \approx \frac{1}{N} \sum_{i=1}^{N} \ln \frac{p_{\theta_1}(y_i \mid y_1 \ldots y_{i-1})}{p_{\theta_0}(y_i \mid y_1 \ldots y_{i-1})}
$$

where the key function is the natural logarithm of the ratio of the probability density functions. These functions, in turn, are calculated from the
measurements up to instant no. i. This function is referred to as the log-likelihood ratio

$$
s_i = \ln \frac{p_{\theta_1}(y_i \mid y_1 \ldots y_{i-1})}{p_{\theta_0}(y_i \mid y_1 \ldots y_{i-1})}
$$

Its role is paramount in statistical fault detection.
Considering fault detection, a residual generator generates sequences,
r(i), which are characterized by a Gaussian distribution with mean vector
$\mu$ and covariance matrix Q

$$
p(r(i)) = N(\mu, Q)
\tag{5.5}
$$

The statistical detectability is defined in terms of the Kullback distance
between two conditional distributions before and after a change. The change
is from parameter $\mu_0$ to $\mu_1$

$$
s_i = \ln \frac{p_{\mu_1}(r_k(1), r_k(2), \ldots, r_k(i))}{p_{\mu_0}(r_k(1), r_k(2), \ldots, r_k(i))}
\tag{5.6}
$$

where $p_{\mu_x}(r_k(1), r_k(2), \ldots, r_k(i))$ is the probability density function for the
sequence with mean value $\mu_x$, taken over the samples $r_k(1)$ to $r_k(i)$. The
change is said to be detectable if the Kullback distance exists and satisfies:

$$
K(\mu_0, \mu_1) > 0
\tag{5.7}
$$

In other words, a change is detectable if the log likelihood ratio has
changed after a fault. In order to be statistically detectable, the mean
value, $\mu_0$, needs to be known, or estimated, for the system operating
under normal conditions. The description considers changes from normal
to faulty conditions, and not changes from one faulty situation to another.
The Kullback measure is illustrated in the following example.

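5.2.1 The Kullback Distance between Gaussian signals


Assume a Gaussian distribution $N(\mu_0, \sigma_0^2)$, i.e. with mean value $\mu_0$ and
variance $\sigma_0^2$, representing the no fault condition $H_0$

$$
p(r_i) = \frac{1}{\sigma_0 \sqrt{2\pi}} \exp\left( -\frac{(r_i - \mu_0)^2}{2\sigma_0^2} \right)
\tag{5.8}
$$

The faulty condition is $H_1$, characterized by the Gaussian process $N(\mu_1, \sigma_1^2)$.
The Kullback information between the two conditions is

$$
K(\mu_0, \mu_1) = \frac{1}{N} \sum_{i=1}^{N} \ln \frac{p_{H_1}(r_i)}{p_{H_0}(r_i)}
\tag{5.9}
$$

First, we calculate the log-likelihood ratio

$$
\begin{aligned}
s_i &= \ln \frac{p_{H_1}(r_i)}{p_{H_0}(r_i)} \\
    &= \ln \left( \frac{ \dfrac{1}{\sigma_1 \sqrt{2\pi}} \exp\left( -\dfrac{(r_i - \mu_1)^2}{2\sigma_1^2} \right) }
                       { \dfrac{1}{\sigma_0 \sqrt{2\pi}} \exp\left( -\dfrac{(r_i - \mu_0)^2}{2\sigma_0^2} \right) } \right)
\end{aligned}
\tag{5.10}
$$

or

$$
s_i = \ln \frac{\sigma_0}{\sigma_1} + \frac{(r_i - \mu_0)^2}{2\sigma_0^2} - \frac{(r_i - \mu_1)^2}{2\sigma_1^2}
\tag{5.11}
$$

It is useful to look at one change of property only. We thus assume
that a residual generator has been designed such that, upon occurrence of a
certain fault, the residual will change in the prescribed way.
A change in mean from $\mu_0$ to $\mu_1$, with unchanged variance, $\sigma_0 = \sigma_1 = \sigma$,
gives

$$
s_i = \frac{\mu_1 - \mu_0}{\sigma^2} \left( r_i - \frac{\mu_1 + \mu_0}{2} \right)
\tag{5.12}
$$

A change in variance from $\sigma_0$ to $\sigma_1$, but unchanged mean $\mu_0 = \mu_1 = \mu$,
gives

$$
s_i = \ln \frac{\sigma_0}{\sigma_1} + \frac{\sigma_1^2 - \sigma_0^2}{2 \sigma_0^2 \sigma_1^2} (r_i - \mu)^2
\tag{5.13}
$$

This means that the log likelihood ratio is a function of the observation $r_i$
and that the Kullback distance is an average of those over N observations,

$$
K(\mu_0, \mu_1) = \frac{1}{N} \sum_{i=1}^{N} s_i
\tag{5.14}
$$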
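The change-in-mean case can be sketched as follows. The closed form of the log likelihood ratio for equal variances is compared with the direct difference of Gaussian log densities, and the sample average over a few post-fault residuals is formed. The residual values, the parameters and the function names are illustrative assumptions.

```python
import math

def gaussian_logpdf(r, mu, sigma):
    """Natural logarithm of the Gaussian density N(mu, sigma^2) at r."""
    return -math.log(sigma * math.sqrt(2.0 * math.pi)) - (r - mu) ** 2 / (2.0 * sigma ** 2)

def llr_mean_change(r, mu0, mu1, sigma):
    """Log likelihood ratio for a change in mean with unchanged variance."""
    return (mu1 - mu0) / sigma ** 2 * (r - (mu1 + mu0) / 2.0)

def kullback_estimate(residuals, mu0, mu1, sigma):
    """Sample average of the log likelihood ratio over the given residuals."""
    return sum(llr_mean_change(r, mu0, mu1, sigma) for r in residuals) / len(residuals)

mu0, mu1, sigma = 0.0, 1.0, 2.0                                  # assumed parameters
after_fault = [mu1 + d for d in (-1.0, -0.5, 0.0, 0.5, 1.0)]     # assumed residual samples
k_hat = kullback_estimate(after_fault, mu0, mu1, sigma)

# Closed form versus direct evaluation of ln(p1 / p0) at one sample
direct = gaussian_logpdf(0.7, mu1, sigma) - gaussian_logpdf(0.7, mu0, sigma)
closed = llr_mean_change(0.7, mu0, mu1, sigma)
```

For residuals centred on the new mean, the average equals $(\mu_1 - \mu_0)^2 / (2\sigma^2)$, which is 0.125 for the values assumed here. The average is positive after the change and negative before it, which is what a detector based on the log likelihood ratio exploits.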

5.3 Change Evaluation


In the previous chapters, residual generation has been described. The next
step in the fault handling procedure is the evaluation of the residuals to
decide whether a fault has occurred. Two basically different approaches can
be used:

- Change evaluation using a fixed threshold test.
- Change evaluation using a statistical test.

5.3.1 Threshold tests


The simplest way of de iding whether a fault has o urred is to test ea h
omponent of a residual against a xed threshold value. A xed threshold
test is rarely su ient for robust fault dete tion be ause errors in the design
model makes the residuals dependent of the input ex itation. Robustness
to the input ex itation an be obtained by use of an adaptive threshold
sequen e whi h is a fun tion of the input signal. When onsidering a tuator
and sensor faults, the referen e signal for a ontrolled pro ess may be applied
with advantage, be ause it will not be in uen ed by any faults. Basi ideas
for generating an adaptive threshold were presented by (Emami-Naeini, 1976
[17; Ding and Frank [15 and later Patton and Chen, 1992 [44).
In general, all model based methods for residual generation an be des ribed as shown in 5.1. Gu (z ) represents the input/output des ription for
the pro ess used in the design, Gu (z ) represents additive modelling errors,
whileHu(z ) and Hy (z ) are the transfer fun tions, whi h give the relation
between the input/output measurements and the estimated output, y^. For
the ideal ase, with no modelling errors the residual is zero:

r(k) = y(k) y^(k) = Gu (z )u(k) Hu(z )u(k) Hy (z )y(k) = 0


when modelling errors are onsidered the residual be omes:

(5.15)

92

CHAPTER 5. THE CHANGE DETECTION PROBLEM

u
Gu+Gu

Hy

Hu
+

y^

Figure 5.1: Statisti al dete tion is applied to the residual.

r(k) = y(k) y^(k) = (Gu (z ) + Gu (k))u(k) Hu (z )u(k) Hy (z )y(k)


= (I Hy (z ))Gu (z )u(k)
(5.16)
It is assumed that $\Delta G_u(z)$ is bounded by the limit value $\delta$, i.e.:

$$ \|\Delta G_u(z)\| \le \delta \tag{5.17} $$

This gives a residual vector which is bounded by:

$$ \|r(k)\| \le \delta \, \|(I - H_y(z))u(k)\| \tag{5.18} $$

An adaptive threshold can be determined as a function of $u(k)$:

$$ T_h(k) = \delta \, (I - H_y(z))u(k) \tag{5.19} $$
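The adaptive threshold (5.19) can be illustrated with a minimal scalar sketch. Everything specific in it is an assumption made for illustration only: `delta` stands for the bound in (5.17), $H_y(z)$ is taken to be a first-order low-pass filter with pole `a`, and the magnitude of (5.19) is used so that it can be compared with $|r(k)|$ as in (5.18). The function names are hypothetical.

```python
def adaptive_threshold(u, delta, a):
    """Return Th(k) = delta * |(1 - H_y(z)) u(k)| for a scalar input sequence u.

    Assumption (illustration only): H_y(z) is the first-order low-pass filter
    w(k) = a*w(k-1) + (1-a)*u(k), so that (1 - H_y(z)) u(k) = u(k) - w(k).
    """
    thresholds = []
    w = 0.0  # filter state, i.e. H_y(z) u(k)
    for uk in u:
        w = a * w + (1.0 - a) * uk
        thresholds.append(delta * abs(uk - w))
    return thresholds


def exceeds_threshold(residuals, thresholds):
    """Flag every sample where |r(k)| is larger than the threshold Th(k)."""
    return [abs(r) > th for r, th in zip(residuals, thresholds)]


if __name__ == "__main__":
    u = [1.0, 1.0, 1.0]              # step in the input signal
    th = adaptive_threshold(u, delta=0.5, a=0.5)
    print(th)                        # threshold shrinks as the excitation settles
    print(exceeds_threshold([0.3, 0.1, 0.0], th))
```

With a constant input this particular threshold decays towards zero, so a practical implementation would probably add a constant floor for noise; that floor is not part of (5.19) and is left out here.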

5.4 Statistical Detection

A number of decision techniques are covered by statistical (or hypothesis)
testing. These techniques basically state different hypotheses, $H_i$, concerning the system's operational conditions and, by means of a decision function, $g_i(H_i)$, determine which hypothesis is accepted. This section
describes two different techniques:

• Weighted sum-squared residual (WSSR) technique

• Sequential probability ratio test (SPRT)


The observations (residuals) used in the hypothesis testing are considered to be Gaussian sequences with zero mean, $\mu_0 = 0$, and a known (or
estimated) variance, $\sigma_0^2$, under normal operational conditions (non-faulty),
i.e., typically residuals generated by means of Kalman filtering. The probability density function of a Gaussian sequence, $r_k$, with $N(\mu, \sigma^2)$ is described
by:

$$ p(r_k) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(r_k - \mu)^2}{2\sigma^2}} \tag{5.20} $$

Statistical testing methods are described in Basseville and Nikiforov
(1994) [2], Gertler (1991) [24], Tzafestas and Watanabe (1990) [48].

5.4.1 The Weighted Sum-squared Residual Technique

The WSSR technique uses the relation between a Gaussian and a $\chi^2$ distribution. The following is described for the one-dimensional case, but can
easily be converted to the multi-variable case. One residual generated from a
Kalman filter, $[r(1), r(2), \ldots, r(j)] = r$, where $j$ is the number of samples, is
characterized by a Gaussian distribution with zero mean and a covariance
$\sigma^2$. The sum of the squared samples $[r^2(1) + r^2(2) + \ldots + r^2(j)] = \sum(r^2)$ has
a $\chi^2$ distribution. If the Gaussian distribution is given by $N(0, 1)$, then
the sum of squares has the distribution $\chi^2(n)$, where $n$ is the number of degrees of
freedom. If $r$ has the distribution $N(\mu, 1)$ then the sum of squares has the
distribution $\chi^2(n, \lambda)$, where:

$$ \lambda(j) = \sum_{k=1}^{j} \mu^2(k) \tag{5.21} $$

The mean and the variance of the distribution are given by:

$$ \mu_{\chi^2} = n + \lambda \tag{5.22} $$

$$ \sigma^2_{\chi^2} = 2n + 4\lambda \tag{5.23} $$
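As a plausibility check of (5.21) to (5.23), the moments can be compared with a small simulation. The sketch below is an illustration only: the function names are hypothetical, the samples are drawn with unit variance and known means, and the number of degrees of freedom is simply taken as the number of squared terms.

```python
import random


def noncentrality(means):
    """lambda = sum of squared means, as in (5.21)."""
    return sum(m * m for m in means)


def chi2_moments(n, lam):
    """Mean and variance of chi^2(n, lambda), as in (5.22) and (5.23)."""
    return n + lam, 2 * n + 4 * lam


def simulate_sum_of_squares(means, trials, seed=0):
    """Estimate mean and variance of sum_k r(k)^2 with r(k) ~ N(means[k], 1)."""
    rng = random.Random(seed)
    samples = [sum(rng.gauss(m, 1.0) ** 2 for m in means) for _ in range(trials)]
    mean = sum(samples) / trials
    var = sum((s - mean) ** 2 for s in samples) / (trials - 1)
    return mean, var


if __name__ == "__main__":
    means = [1.0, 1.0, 1.0]                      # three samples, each with mean 1
    lam = noncentrality(means)                   # lambda = 3
    print(chi2_moments(len(means), lam))         # theoretical mean and variance
    print(simulate_sum_of_squares(means, 20000)) # simulated estimates
```

The simulated values should lie close to the theoretical pair, with the agreement improving as `trials` grows.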

If $r$ has the distribution $N(\mu, \sigma^2)$, then the sum of squares has the same
distribution as for 2 = 1, when  ! 1.
Now, if the residual $r(k)$ is weighted with the standard deviation, $\sigma$,
which is considered known, then the result, $r_w(k)$, is a signal with zero
mean and a variance of one. This means that the distribution of the weighted
sum squared sequence, $I(k)$, only contains one parameter to be determined,
namely the degrees of freedom, $n$, which is determined as $n = j - 1$.
$$ r_w(k) = r(k)\,\sigma^{-1} $$

$$ I(j) = \sum_{k=1}^{j} r_w^T(k)\, r_w(k) = \sum_{k=1}^{j} r^T(k)\, \sigma^{-2}\, r(k) \tag{5.24} $$
If a system fault happens, the mean value of the residuals will change.
Hypotheses are stated for the different faulty and non-faulty conditions. For
instance, the hypothesis that nothing has occurred, $H_0 = \{\mu : \mu = \mu_0\}$, and
the hypothesis that something has occurred, $H_1 = \{\mu : \mu \ne \mu_0\}$, give the
following detection rule using the weighted sum squared sequence $I(k)$:

accept $H_0$ if $I(k) \le \varepsilon$
reject $H_0$ (or accept $H_1$) if $I(k) > \varepsilon$

where $\varepsilon$ is the decision threshold. With the aid of $\chi^2$ tables, the values for the innovation window length
(degrees of freedom) and the decision threshold must be chosen so that a trade-off
is made between the probability of false detections (rejecting $H_0$ when
it should be accepted, i.e. accepting $H_1$ when the reality is $H_0$), $P_F$, and the
probability of missed detections (accepting $H_0$ when it should be rejected,
i.e. accepting $H_0$ when the reality is $H_1$), $P_M$. $P_F$ can be determined from
the choice of confidence level $\alpha = 1 - P_F$. $\alpha$ is typically chosen as 0.95,
0.995 or 0.999, giving a probability of false detection of 5%, 0.5% or 0.1%
respectively. $P_M$ depends on the statistics of the sequence when a fault
is present, which might not be known. The decision rule can be written as:

$$ g(k) = \begin{cases} 1 & \text{when } I(k) > \varepsilon \\ 0 & \text{when } I(k) \le \varepsilon \end{cases} \tag{5.25} $$

In the above formulation, the algorithm runs only once. To run sequentially for on-line detection, the algorithm should be reset to zero every time a hypothesis has been confirmed.
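The WSSR test with reset can be sketched as follows. This is a minimal scalar illustration under stated assumptions, not a definitive implementation: the function name `wssr_detect` is hypothetical, the standard deviation `sigma` is taken as known, a decision is made once per window of `window` samples, and the threshold is passed in by the caller (for example a value read from a $\chi^2$ table).

```python
def wssr_detect(residuals, sigma, window, threshold):
    """Weighted sum-squared residual test, evaluated once per full window.

    Returns a list of (k, g) pairs: k is the sample index (starting at 1)
    at which a window is completed, g is 1 if the weighted sum of squares
    exceeded the threshold (fault hypothesis accepted) and 0 otherwise.
    The accumulated sum is reset to zero after every decision.
    """
    decisions = []
    total = 0.0   # running value of the weighted sum of squares
    count = 0     # samples accumulated in the current window
    for k, r in enumerate(residuals, start=1):
        rw = r / sigma            # weighted residual with unit variance
        total += rw * rw
        count += 1
        if count == window:
            decisions.append((k, 1 if total > threshold else 0))
            total = 0.0           # reset so the test can run sequentially
            count = 0
    return decisions


if __name__ == "__main__":
    # Window of j = 4 samples, i.e. n = j - 1 = 3 degrees of freedom;
    # 7.815 is the tabulated 95% point of the chi-square distribution for n = 3.
    residuals = [2.0] * 4 + [6.0] * 4   # mean shifts in the second window
    print(wssr_detect(residuals, sigma=2.0, window=4, threshold=7.815))
```

A longer window gives a sharper test at the price of a longer detection delay, which is the trade-off between $P_F$ and $P_M$ discussed above.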

5.4.2 Sequential Probability Ratio Test

If a fault occurs, the effect is a change in the mean value, $\mu$, of the residual
$r(k)$. Then a set of simple hypotheses can be considered, where $H_0 = \{\mu : \mu = \mu_0\}$
is the hypothesis concerning the process under normal operational
conditions. $H_i = \{\mu : \mu = \mu_i\}$ is the hypothesis concerning the process
under a faulty condition, $i = 1, 2, \ldots, m$, where $m$ is the number of fault
conditions. In the following description, testing between the two hypotheses,
$H_0$ and $H_1$, is described. The tool for testing between the two hypotheses
is based on the log-likelihood ratio, defined by:

$$ s(j) = \ln \frac{p_{H_1}(r_i(1), r_i(2), \ldots, r_i(j))}{p_{H_0}(r_i(1), r_i(2), \ldots, r_i(j))} \tag{5.26} $$

where $p_{H_i}(r_i(1), r_i(2), \ldots, r_i(j))$ is the probability density function considering that hypothesis $H_i$ is true and taken over the samples $r_i(1)$ to $r_i(j)$.
The expectation value of $s(j)$, $E[s(j)]$, when hypothesis $H_0$ is true is less
than zero, while it is above zero when hypothesis $H_1$ is true. A change in the
mean value is then reflected as a change of sign in the mean value of $s(j)$.
The cumulative sum of $s(j)$:

$$ S(j) = \sum_{k=1}^{j} s(k) \tag{5.27} $$

is the log-likelihood ratio for the observations from $r(1)$ to $r(j)$ and is
the decision function, when testing between $H_0$ and $H_1$ using the following
decision rule:

• accept $H_0$ when $S \le a$
• accept $H_1$ when $S \ge h$
• continue to observe and test when $a < S < h$

which can be rewritten as:

$$ g(r) = \begin{cases} 1 & \text{when } S \ge h \\ 0 & \text{when } S \le a \end{cases} \tag{5.28} $$

The threshold values $a$ and $h$ must fulfil the inequality:

$$ a < S < h $$
The two threshold values can be selected by the designer to reflect the
trade-off between the probability of false alarms, $P_F$ ($H_1$ (a fault) is detected but the real condition is $H_0$ (no fault)), and the probability of missed
detections, $P_M$ ($H_0$ (no fault) is detected but the real condition is $H_1$ (fault)).
Determination of the threshold values using $P_F$ and $P_M$ directly was
proposed by Deckert (1978) [14]. The thresholds $h$ and $a$ are given by

$$ h = \ln\left(\frac{1 - P_M}{P_F}\right), \qquad a = \ln\left(\frac{P_M}{1 - P_F}\right) \tag{5.29} $$
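The decision rule (5.28) with the thresholds (5.29) can be sketched for a scalar residual as follows. The sketch rests on assumptions that are not stated in the text: the residual samples are independent and Gaussian with known standard deviation `sigma`, and the fault mean `mu1` is known. Under these assumptions the per-sample log-likelihood ratio is the standard expression $s(k) = \frac{\mu_1 - \mu_0}{\sigma^2}\left(r(k) - \frac{\mu_0 + \mu_1}{2}\right)$. The function names are hypothetical.

```python
import math


def sprt_thresholds(p_false, p_miss):
    """Thresholds (a, h) computed from P_F and P_M as in (5.29)."""
    h = math.log((1.0 - p_miss) / p_false)
    a = math.log(p_miss / (1.0 - p_false))
    return a, h


def sprt(residuals, mu0, mu1, sigma, p_false, p_miss):
    """Sequential probability ratio test for a change of mean from mu0 to mu1.

    Returns a list of (k, g) pairs: at sample k (starting at 1) hypothesis
    H1 (g = 1) or H0 (g = 0) was accepted. The cumulative sum is restarted
    after every decision so that the test runs sequentially.
    """
    a, h = sprt_thresholds(p_false, p_miss)
    decisions = []
    S = 0.0
    for k, r in enumerate(residuals, start=1):
        # log-likelihood ratio increment for a Gaussian mean change
        S += (mu1 - mu0) / sigma ** 2 * (r - 0.5 * (mu0 + mu1))
        if S >= h:
            decisions.append((k, 1))
            S = 0.0
        elif S <= a:
            decisions.append((k, 0))
            S = 0.0
    return decisions


if __name__ == "__main__":
    # Ten fault-free samples followed by ten samples with the mean shifted to mu1.
    residuals = [0.0] * 10 + [1.0] * 10
    print(sprt(residuals, mu0=0.0, mu1=1.0, sigma=1.0, p_false=0.01, p_miss=0.01))
```

With $P_F = P_M = 0.01$ the thresholds are symmetric, $h = -a = \ln 99 \approx 4.6$, so in this noise-free example each decision needs ten samples.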


The threshold formula (5.29) is a very useful result from an engineering point of view. In an implementation, it will usually be decided to run the test sequentially. This
means that, as soon as one of the thresholds is reached, the associated hypothesis
is declared TRUE, and the test is restarted. This enables us to find a fault
that occurs at a random instant in time. While the test runs for the first
time, we cannot assume any of the hypotheses to be true with the desired
probability ($P_M$, $P_F$):

Bibliography
[1] K. J. Åström, J. J. Anton, and K. E. Årzén. Expert control. Automatica, 22(3):227–286, 1986.
[2] M. Basseville and I. Nikiforov. Statistical Change Detection. Prentice Hall, 1994.
[3] M. Basseville and I. V. Nikiforov. Detection of Abrupt Changes: Theory and Application. Information and System Science. Prentice Hall, New York, 1993.
[4] T. E. Bell. Managing Murphy's law: Engineering a minimum-risk system. Spectrum, 1989.
[5] M. Blanke. Aims and means in the evolution of fault tolerant control. In European Science Foundation Workshop, Control of Complex Systems (COSY), pages 22–32, Sept. 1995.
[6] M. Blanke. Design of dependable control systems using a component based approach. In Proc. IFAC Workshop: On-line Fault Detection and Supervision in the Chemical Process Industries, pages 187–195, Newcastle upon Tyne, UK, Jun. 1995.
[7] M. Blanke, S. A. Bøgh, R. B. Jørgensen, and R. J. Patton. Fault detection for a diesel engine actuator - a benchmark for FDI. Control Engineering Practice, 3:1731–1740, Dec. 1995.
[8] M. Blanke and R. B. Jørgensen. Reliability related to sensor and actuator interface in machinery systems. Technical report, Aalborg University, R93-4016, 1993.
[9] M. Blanke, S. B. Nielsen, and R. B. Jørgensen. Fault Accommodation in Feedback Control Systems. Lecture Notes in Computer Science Vol. 736. Springer Verlag, 1993. Ed. R. L. Grossman, A. Nerode, A. P. Ravn, and H. Rischel.
[10] S. A. Bøgh, R. Izadi-Zamanabadi, and M. Blanke. Onboard supervisor for the Ørsted satellite attitude control system. In Artificial Intelligence and Knowledge Based Systems for Space, 5th Workshop, pages 137–152, Noordwijk, Holland, Oct. 1995. The European Space Agency, Automation and Ground Facilities Division.
[11] J. Chen and R. J. Patton. A reexamination of fault detectability and isolability in linear dynamic systems. In IFAC Safeprocess 94, Helsinki, Finland, pages 590–596, 1994.
[12] R. David and H. Alla. Petri-nets for modeling of dynamic systems - a survey. Automatica, 30(2):175–202, 1994.
[13] T. J. A. de Vries. Conceptual Design of Controlled Electro-Mechanical Systems. PhD thesis, Universiteit Twente, NL, 1994.
[14] J. C. Deckert. Definition of the F-8 DFBW aircraft control sensor analytic redundancy management algorithm. Technical report, C.S. Draper Laboratory, Cambridge, Massachusetts, 1978. Report R-1178.
[15] X. Ding and P. M. Frank. An approach to robust residual generation and evaluation. In Proc. Conference on Decision and Control, pages 656–661, Brighton, UK, Dec. 1991. IEEE.
[16] R. Isermann (ed.). Postprints of IFAC Safeprocess 91, Baden-Baden. Pergamon Press, Oxford, UK, 1991.
[17] A. Emami-Naeini, M. M. Akhter, and S. M. Rock. Effect of model uncertainty on failure detection: The threshold selector. IEEE AC, 33(12):1106–1115, Dec. 1988.
[18] P. M. Frank. Fault diagnosis in dynamic systems using analytical and knowledge-based redundancy. Automatica, 26(3):459–474, 1990.
[19] P. M. Frank. Enhancement of robustness in observer-based fault detection. In Preprints of IFAC/IMACS Symposium SAFEPROCESS'91, volume 1, pages 275–287, Baden-Baden, Sept. 10-13 1991. "A modified version also published in Int. J. Control, Vol. 59, No. 4, 955-981, 1994".
[20] P. M. Frank. Application of fuzzy logic process supervision and fault diagnosis. In Preprints of the IFAC Sympo. on Fault Detection, Supervision and Safety for Technical Processes: SAFEPROCESS'94, volume 2, pages 531–538, Espoo, Finland, Jun. 13-16 1994.
[21] P. M. Frank. Advances in fault tolerance by model-based fault diagnosis. In ESF Workshop, COSY'95, pages 15–22, Rome, Italy, Sept. 1995.
[22] O. I. Franksen. Group representation of finite polyvalent logic - a case study using APL notation. Proc. IFAC World Congress, Helsinki, pages 875–887, 1978.


[23] J. J. Gertler. Survey of model-based failure detection and isolation in complex plants. IEEE Control Syst. Mag., 8(6):3–11, 1988.
[24] J. J. Gertler. Analytical redundancy methods in failure detection and isolation. In Preprints of IFAC/IMACS Symposium SAFEPROCESS'91, volume 1, pages 9–21, Baden-Baden, Sept. 10-13 1991. Also published in a revised version in "Control - Theory and Advanced Technology, Vol. 9, No. 1, 259-285, 1993".
[25] J. J. Gertler. Analytical redundancy methods in failure detection and isolation. Control Theory and Advanced Technology, 1(9):259–285, 1993.
[26] S. A. Herrin. Maintainability applications using the matrix FMEA technique. Transactions on Reliability, R-30(2):212–217, Jun. 1981.
[27] R. Isermann. Process fault detection based on modelling and estimation methods: A survey. Automatica, 20(4):387–404, 1984.
[28] R. Isermann, editor. Preprints of IFAC/IMACS Symposium on Fault Detection, Supervision and Safety for Technical Processes - SAFEPROCESS'91, Baden-Baden, Germany, Sept. 10-13 1991.
[29] R. Isermann. Integration of fault detection and diagnosis methods. In Preprints of the IFAC Sympo. on Fault Detection, Supervision and Safety for Technical Processes: SAFEPROCESS'94, volume 2, pages 597–612, Espoo, Finland, Jun. 13-16 1994.
[30] K. Jensen. Coloured Petri Nets, volume 2 of EATCS Monographs on Theoretical Computer Science. Springer Verlag, 1994.
[31] R. B. Jørgensen. Development and Test of Methods for Fault Detection and Isolation. PhD thesis, Department of Control Engineering, Aalborg University, Fredrik Bajers Vej 7C, DK 9220 Aalborg, Denmark, Jul. 1995.
[32] D. Karnopp and R. Rosenberg. Introduction to Physical System Dynamics. McGraw-Hill, 1983.
[33] J. M. Legg. Computerized approach for matrix-form FMEA. IEEE Transactions on Reliability, R-27(1):254–257, Jan. 1978.
[34] L. Ljung. System Identification: Theory for the User. Prentice Hall, 1987.
[35] L. Ljung and T. Söderström. Theory and Practice of Recursive Identification. The MIT Press, Massachusetts and London, 1983.


[36] C. P. Lunau. On the design of reflective diagnosis systems. Technical report, Aalborg University, Department of Computer Science, 1995.
[37] C. P. Lunau. A reflective architecture for process control applications. In M. Aksit and S. Matsuoka, editors, ECOOP'97 Object Oriented Programming, pages 170–189. Springer Verlag, 1997. Lecture Notes in Computer Science, Vol. 1241.
[38] C. P. Lunau and J. K. Nielsen. Emma: An emergency management system for use onboard ships. In IFAC Workshop on Control Applications on Marine Systems CAMS'95, pages 164–173, Trondheim, Norway, May 1995. International Federation of Automatic Control.
[39] P. Maes. Computational Reflection. PhD thesis, Artificial Intelligence Laboratory, Vrije Universiteit Brussel, Belgium, 1987. Technical report 87-2.
[40] G. Møller. On the Technology of Array-Based Logic. PhD thesis, Electrical Power Eng. Dept., Tech. University of Denmark, Lyngby, Denmark, 1995.
[41] T. More. Notes on the Diagrams, Logic and Operations of Array Theory. In: Structures and Operations in Eng. Management Systems (eds. Ø. Bjørke and O. I. Franksen). Tapir, 1981.
[42] R. Patton and J. Chen. A review of parity space approaches to fault diagnosis. In Preprints to SafeProcess 1991, volume 1, pages 239–55, 1991.
[43] R. J. Patton. Robust model-based fault diagnosis: The 1995 situation. In Proc. IFAC Workshop on Supervision and Fault Diagnosis in the Chemical Process Industries, Newcastle, UK, pages 55–78, 1995.
[44] R. J. Patton and J. Chen. A robustness study of model-based fault detection for jet engine systems. In Proc. of the 1st IEEE Conf. on Control Application, pages 871–876, Dayton, Ohio, Sept. 13-16 1992.
[45] R. J. Patton, P. M. Frank, and R. N. Clark, editors. Fault Diagnosis in Dynamic Systems, Theory and Application. Control Engineering Series. Prentice Hall, New York, 1989.
[46] D. N. Shields. Robust fault detection for generalized state space systems. In Proc. of the IEE Int. Conf.: Control '94, pages 1335–1349, Warwick, UK, March 21-24 1994. Peregrinus Press, Conf. Pub. No. 389.
[47] C. Tsui. A general failure detection, isolation and accommodation system with model uncertainty and measurement noise. IEEE Transactions on AC, 39(11):2318–2321, 1994.


[48] S. Tzafestas and K. Watanabe. Modern approaches to system/sensor fault detection and diagnosis. Journal A, 31(4):42–57, 1990.
[49] J. C. Willems. Paradigms and puzzles in the theory of dynamical systems. IEEE Trans. AC, 36(3):259–294, 1991.
[50] A. S. Willsky. A survey of design methods for failure detection in dynamic systems. Automatica, 12(6):601–611, 1976.
[51] J. Yuan. Strategy to establish a reliability model with dependent components through FMEA. Reliab. Eng., 11(1):37–45, 1985.
