You are on page 1of 6

Joint 48th IEEE Conference on Decision and Control and 28th Chinese Control Conference Shanghai, P.R.

China, December 16-18, 2009

WeBIn4.6

PLS-based FDI of a Three-Tank Laboratory System


S. Klinkhieo & R. J. Patton, Senior Member IEEE

AbstractThe problems of fault detection and isolation of dynamic systems has been studied intensively in the recent years and many successful industrial applications have been reported. In the main these studies have been restricted to model based techniques, with few reports of successful implementation of data driven approaches. These data driven approaches have been range from the application of linear regression techniques, to neuro-fuzzy systems. This paper reports on application of, Multivariate Statistical Process Control (MSPC) methodologies, which can provide a diagnostic tool for the on-line or real time monitoring and detection of the process malfunction is proposed. Finally the effectiveness of Partial Least Squares (PLS) in FDI of the Three-Tank system are represented and discussed through simulation results.

comprehensive monitoring and detection of process malfunctions. In addition, these methods have been shown to be particular valuable for application to processes involving large amounts of data, as discussed by [1] and [10] In this paper, Multivariate Statistical Process Control (MSPC) methods are used as an effective set of tools for FDI and the diagnosis of abnormal operating conditions of the process variables. It should be noted that only a sudden fault (step fault) is studied and presented in this research. In Section II, the fundamental concept of PLS algorithms is first represented. Section III, the study on three-tank systems and fault diagnosis is discussed. Then, the developed experimental platform for on-line FDI with the three-tank system is presented in Section IV, and finally, the PLS based FDI simulation results are reported and analyzed. II. MULTIVARIATE STATISTICAL PROCESS CONTROL MSPC is based on statistical projection techniques, namely Principal Component Analysis (PCA), [11] and Projection to Partial Least Squares, [12]. These two methods are employed extensively for on-line or real time multivariate monitoring and detection schemes of the process malfunction. PLS offers certain attractive features, both in MSPC applications and control system applications. The basic approach of the algorithm is to identify the principal features in the data by dividing the variable space in to cause and effect, and performing a dimensionality reduction on them. In a sense, PLS provides a method for compressing the dimensionality of the data space. It then identifies the primary features in the cause variables that are able to describe the variation in the effect variables. A major feature of PLS is that it will deal with correlated data and will produce answers when Ordinary Least Squares cannot [12] and [13]. This property is not unique to PLS as there are alternative algorithms that can also cope with correlated data, such as recursive least squares and ridge regression [14]. A significant benefit that PLS offers over these approaches is that once the model is developed it can be used for both monitoring the condition of a plant, but also to control it. The PLS approach relies on decomposing the input matrix,

I. INTRODUCTION

uring the last two decades, there have been immense advances in the areas of advanced process control, specifically dealing with fault detection and isolation (FDI). The objective of these developments is to be able to detect and distinguish not only sudden faults (step faults) but also incipient faults (including drift faults) occurring in the process. The ability to detect and isolate these faults is crucial in order to avoid loss of product and reduced profitability and eventual inappropriate shut-down. In addition FDI methods enhance the safe operability of the processes, lack of attention to a suitable level of process monitoring and FDI will obviously lead to environmental and health issues [1] and [2]. There have been a number of successful industrial studies and these have been reported in research literature [2], [3], [4] and [5]. At the same time there have also been comprehensive reviews on the theoretical approaches to fault detection and process monitoring [6]. In the process control industry, data driven approaches such those based on soft computing [19] and MSPC techniques [2], are preferred to model based approaches for FDI. These methods are varied and a good overview providing a classification of methods, with a good framework under which quantitative model-based methods using neural networks, fuzzy logic, neuro-fuzzy, etc can be found in [7], [8], [9], [19], and [20]. Work reported in [1] and [2] developed concepts of process performance monitoring (allied to FDI) through MSPC for chemical processes. These methods are based on Principal Component Analysis (PCA) and Partial Least Squares (PLS). These studies have shown that MSPC is efficient in providing early warning capabilities for the

X KxM and the output matrix, Y KxN , to sum of rank one component matrices, [15]. Here K is the number of measurements or data sizes, M is the number of output variables and N is the number of input variables.
Typically, the PLS decomposition of X and Y are given is carried out in the following manner [14]:

R. J. Patton is with the Department of Engineering, University of Hull, Hull, HU6 7RX UK (phone: +44 (0)1482-46-5117; fax: +44 (0)1482-46-6664; e-mail: R.J. Patton@hull.ac.uk). S. Klinkhieo is with the Department of Engineering, University of Hull, Hull, HU6 7RX UK (e-mail: S.Klinkhieo@2005.hull.ac.uk).

978-1-4244-3872-3/09/$25.00 2009 IEEE

1896

WeBIn4.6
n

i =1 n

~ X i + Fn =

t p
i =1 n

T i i

+ Fn
T

fp
I O
T

T = Tn Pn + Fn = X n + Fn ,

(1) + En

PLS

PLS

i =1

~ Yi + E n =

i =1

i q iT u

fm

QT + E = Y +E , = U n n n n n where n is the number of rank one component matrices, ~ ~ i qiT , retained in the decomposition. X i = ti piT and Yi = u

Fig. 1. Generic setup of PLS In this case, it should be noted that the T PLS transformation is not invertible, the dimensionality of X is less than the dimensionality of I (DIM X < DIM I) and the dimensionality of Y is less than the dimensionality of O (DIM Y < DIM O). Namely, this has two implications: (a) the relationship between X & Y need not have the same characteristics as that between I & O (b) there need not be a direct one to one relationship between the variables in the space X and those in the space I on one hand and the space Y and the space O on the other. In addition, numerical analysis carried out in the transformed space cannot be translated back to the original space. III. THE NECST THREE-TANK BENCHMARK SIMULATION
where Bn represents a diagonal matrix containing the regression coefficients of the score model, bi , determined by PLS algorithm. The loading vectors are identified using a technique known as singular value decomposition (SVD), which calculates the loadings such that the first score explains the greatest variation in the data, the second score the next and so on. Each of the scores is uncorrelated to one another. However, it should be noted that the scores are often referred to as latent variables, i.e. each score vector represents an instance of a particular latent variable or LV [14]. Another way of looking at PLS is that it creates a new set of variables (latent variables) which are reflective of true dimensionality of the system. This new set of variables, are a set of strongly correlated variables and often only a few t-score vectors are needed to describe the process variation or process performance [12]. The number of retained tscore vectors is typically determined by cross-validation or the analysis of variance, etc as demonstrated for example by [3] and [10]. Fig. 1 shows the example of PLS algorithms the system which contains original input (I) and original output (O), respectively. PLS provides the TPLS -transformation which can reduce the dimensionality of the original input (I) and output (O) with the relationship I = f p (O) to the

i are referred to as the t-score vector The vectors t i and u and u-score vector, the vectors p i and qi are loading
represent the sum of n the rank one vectors, X n and Y n component matrices to reconstruct the input matrix and predict the output matrix, respectively, E n and Fn are

i , can be residual matrices. The u-score vectors, u estimated from the t-score vectors as follows:
= [u 1 , L , u 1 ] = [t1b1 , L , t nbn ] = Tn Bn U i (2)

The NECST 3-Tank Bench Mark Simulation [16] and [17] has been selected for both testing and verifying the ideas developed in this paper. The model represents a real 3-tank system (Fig 2) from the Research Centre for Automatic Control in CRAN-UHP, Nancy, FRANCE. The concepts presented will be later tested on this bench mark system. It can be seen that Fig. 2 shows the subsystem-1 has 3 inputs ( Q 01 , Pw and Q12 ) and 3 states ( x1 = [ L1, T1 , V12 ] ) with the following dynamics:
& = Q Q Q S1 L 1 01 12 10 & = Q (T T ) + S1 L1T 1 01 0 1 & =Q V 12 12 Subsystem-2 has 2 inputs ( Q 02 and Q 20 ) and 3 states ( x 2 = [ L 2 , T 2 , V 20 ] ) with the following dynamics;
& = Q + Q Q + (Q + Q ) Q S2L 2 12 02 20 32 32 leak 2 & )(T0 T2 ) S 2 L 2 T2 = Q12 (T1 T2 ) + Q 02 (T0 T2 ) + (Q32 + Q32 & =Q V
20 20

Pw c

(3)

(4) Subsystem-3 has 2 inputs ( Q 03 and Q32 ) and 2 states ( x 3 = [ L3 , V32 ] ) with following dynamics:
& = Q (Q + Q ) Q Q S3 L 3 03 32 32 30 leak 3 & =Q V
32 32

new dimensionality of X and Y which can reserve the relationship between input and output to the term of Y = f m ( X ).

(5)

1897

WeBIn4.6
yielding good fault-tolerance properties. However, in this study the fault information is not actually used for improving the fault-tolerant control performance. Details of this approach can be found in [21]. Further studies have involved fault compensation within this distributed system structure [22].
L
0.8 0.7 0.6 0.5
id

: (dashed lines)

L i : (solid lines)

L 1

Level (m)

L 3
0.4 0.3

Fig. 2. Schematic diagram of NeCST Three-Tank Benchmark System where S i , Li , Ti , Vij , Qij , Qleaki i = 0, 1, 2, 3 are the crosssectional areas of each tank, the level of liquid in tank-i, the temperature of liquid at the centre of the tank-i, the volume of liquid passing from tank-i to tank-j, the liquid flow rates between tank-i to tank-j, and the leak from tanki, respectively. ij =0 means the buffer tank. Pw is the power input. and c are the density and the specific heat capacity of the liquid inside the tank. It should be noted that the flow-rates on the system are controlled by either pumps or valves. The system working as follows:

L 2
0.2 0.1 0

500

1000

1500

2000

2500

3000

3500

4000

4500

5000

time (s)

T id : (dashed lines)
32

T i : (solid lines)

30

T1

Temperature ( oC)

28

T2
26

24

Pump_1 is kept constant at 0.75 m3/sec. Tank-1 is fed by Valve01 (Q01). Tank-2 is fed by Valve02 (Q02) and Pump_2 (Q12). Tank-3 is fed by Pump_3 (Q03).

22

20

500

1000

1500

2000

2500

3000

3500

4000

4500

5000

time (s)

The liquid inside the Tank-1 is heated by an electrical heater. The Tank-2 is taking preheated liquid from the Tank-1 (Q12) and mixes it with a solution coming from the Tank-3 (Q32). Valve32, Valve_leak_1, Valve_leak_2, and Valve_leak_3 are totally closed (Q32 = Q10 = Qleak2 = Qleak3 = 0) and Valve30 is totally opened during the simulation. The performance objectives for the system are as follows:

Fig. 3. System outputs without fault IV. FDI FOR EXPERIMENTAL PLATFORM In order to study and develop FDI for the system when subjected to different types of faults, the experiment platform has been enhanced with the addition of 3 new software modules, as shown in Fig. 4. This modification allows for real-time /on-line FDI. It should be noted that the simulation is running in real-time.
Input parameters Data Collection Three-Tank Simulation FDI Fault analyzer Information of faults

Maintain the levels of each tank, L1 at 0.75 m, L2 at 0.3 m and L3 at 0.5 m. Maintain the temperature of 1st and 2nd tank, T1 at 30oC and T2 at 28oC.

The system is simulated initially without any faults and from Fig. 3 it can be seen that the outputs track the reference signal. The desired values (reference) for each control objective are shown in the dashed lines [18]. This example system constitutes a distributed control problem under autonomous learning supervision with each tank represented as one subsystem in the inter-connected structure. This involves the use of two-level constrained optimal control, fitting to the structure of a receding horizon control problem with an added supervision layer to compensate the inter-connection disturbance (states)

O I PLS algorithm

Y X

Fig. 4. On-line FDI scheme for Three-Tank System. First, the original input (I) and output (O) variables from Three-Tank simulation are collected and sent to PLS platform by the Data collection module. This module is

1898

WeBIn4.6
developed using LabVIEW (National Instruments, http://www.ni.com) running as a real-time parallel with the three-tank simulation. Namely, the both input and output variables from the Three-tank simulation (using MATLAB) have been send to Data collection module (using LabVIEW) every time interval (second) and such data will be collected and saved into the buffer within this module in term of the input and output matrices with the matrix sizes KxN and KxM, respectively, and then transfer these matrices to the next module. Second module is the PLS module, which takes the data, collates them into the relevant matrices, and performs the dimensionality reduction, and also determines the other values required for the FDI. However, as discussed in Section II, it should be noted that the dimensionality of original inputs and outputs are less than the dimensionality of the new ones and these new variables are linear combinations of the original measured variables. In the simulation, 44 input variables (I) were measured along with 22 output variables (O). The results from the PLS module using these variables are presented in Tables 1, and 2. Fig.5 shows the results after processing the PLS variables, shows the score and the corresponding explanations for the variances. From the results, it can be seen that the first three scores of the new dimensionality (X and Y) have captured the major features in the data. In other words only three score/latent variables (1st, 2nd and 3rd) are sufficient to describe the process performance while the 4th score contains very little variation and it can be assumed that this value explains the measurement noise in the system. Table 1 & 2 below list the variance of each of these scores: Score 1 2 3 4 % Variance
68.26 20.272 11.118
Variance Explained (%) 80 60 40 20 0 1 2 3 Score/Latent vector 4 Input variables (X) Output variables (Y)

Fig.5. Example of new input/output data sets produced by PLS algorithm. The results of the PLS module are then analysed by the fault analyzer module. This can provide information of: (a) process performance, (b) changes in the underlying operations of the process and (c) real- time implementation of a process monitoring schemes. However, when a process is recognized to be out of control or there are faults occurring, the fault analyzer needs specific methods to identify which variables are responsible for the change in the process. It may be possible to identify the set of the original variables whose contribution has increased over that defined in the nominal model and which are responsible for non-conforming behavior. One approach is through the implementation of a process variable contribution plot which depicts the change in the new observation variables relative to the average value calculated from the nominal PLS model [1] and [2]. Next section, fault analyzer, using (xi, yi) pairs which can provide the certain cluster for the particular patterns of faults will be discussed and the simulation results are also presented. V. FAULT ANALYSIS MODULE As discussed in Section IV, the fault-detection abilities of the FDI systems were evaluated using the simulation datasets, for period of 5,000 seconds under fault condition. Fig. 6 shows the FDI system for three-tank system in case of fault free. Definition 1: dU il = U il (t + 1) U il (t ), t where U il is output score/latent value-l of tank-i, respectively.

Cumulative Variance
68.26 88.532 99.65

0.35

100

Table 1: Variance information of the input variables Score 1 2 3 4 % Variance


75.331 15.3 7.569

Cumulative Variance
75.331 90.631 98.2

1.8

100

Table 2: Variance information of the output variables Fig.5 also shows that the first three latent values of X and Y can describe over 99.65% and 98.2% of the variation. However, as has been pointed out in [1] and [2], it is very important that the lower scores are not to ignored since they can provide additional information to aid process fault detection.

1899

WeBIn4.6
PLS for Tank-1
(a)

Fault detected

(b)

Relative to the magnitude of fault Fig. 6. FDI residuals for three-tank simulation for faultfree case The next sets of simulation results were carried out with bias faults (20%, 40% and 60%) of the electrical heater operating point after t = 2500 seconds. The results of these simulations are shown in Figs. 7, 8 and 9. PLS for Tank-1
(a)

Fig. 8. The example of the FDI system for three-tank simulation: (a) the fault detection (Fault in tank-1 is detected only) and (b) the relationship between the process and /or quality variables between output latent variable #1 and #4 due to bias fault (40% of heater operating point). PLS for Tank-1 Fault detected
(a)

Fault detected

Output latent value #4

20 15 10 5 0 -50

(b) (b)

Relative to the magnitude of fault


-30 -10 10 30 Output latent value #1 50

Relative to the magnitude of fault

Fig. 7. The example of the FDI system for three-tank simulation: (a) the fault detection (Fault in tank-1 is detected only) and (b) the relationship between the process and /or quality variables between output latent variable #1 and #4 due to bias fault (20% of heater operating point).

Fig. 9. The example of the FDI system for three-tank simulation: (a) the fault detection (Fault in tank-1 is detected only) and (b) the relationship between the process and /or quality variables between output latent variable #1 and #4 due to bias fault (60% of heater operating point). Fig. 7(a), 8(a) and 9(a) show the faults with 20%, 40% and 60% bias of heater operating point in tank-1 are detected (only detected).

1900

WeBIn4.6
Fig. 7(b), 8(b) and 9(b) present (xi, yi) pair with the particular pattern of clusters (fault information) depending upon; fault types, fault sizes, etc. It should be noted that in this study, the only the electrical heater fault with the various magnitudes (e.g. the magnitudes of bias faults; 20%, 40% and 60% of heater operating point, respectively) are presented. Throughout the simulation results it can be seen that the FDI units using PLS algorithm is able to detect faults and also identify the subsystem where the fault is occurred. ACKNOWLEDGMENT S Klinkhieo acknowledges PhD scholarship funding from the Synchrotron Light Research Institute (SLRI) under the Royal Thailand Government and discussions with C Kambhampati of the Computer Science Department at Hull University. The authors are grateful for funding support from the EC FP6 IST programme under contract No. IST004303NeCST (Project NeCST http://www.strepnecst.org/ ) VI. CONCLUSION There have been numerous studies on the utility of MSPC in general and PLS in particular. These results and studies have shown that PLS is a useful tool for both modeling, and controlling process plants as well as for an integrated condition monitoring system. All of these studies indicate that that MSPC based methods are able to handle the high dimensionality of the data, which is a given in process plants, and also extract relevant information from this data. It is this property which is attractive, and has been used here. This paper presented ideas for the development of an FDI methods based on PLS and clustering. However, the concepts though are generic; they require fine tuning and are application dependent. The results obtained from the test platform used, namely three-tank system using primary process data indicates promising results. Currently work is underway to make the FDI method more efficient and less process specific. REFERENCES
[1] A. J. Morris& E. B. Martin (1997), Process Performance monitoring Fault Detection Through Multivariate Statistical Process Control, IFAC Fault Detection, Supervision and Safety for Technical Process, UK. [2] A. Simoglou, E. B. Martin,A. J. Morris, M. Wood & G. C. Jones (1997), Multivariate Statistical Process Control System, IFAC Fault Detection, Supervision and Safety for Technical Process, UK. [3] J. F. MacGregor & T. Kouriti (1995), Statistical process control of multivariate processes. Control Engineering Practice, 3(3): 403-414. [4] S. L. Jms-Jounela, M. Vermasvuori, P. Endn & S. Haavisto (2003), A process monitoring system based on the Kohonen self-organizing maps, Control Engineering Practice, 11: 83-92. [5] T. Komulainen, M. Sourander & S. L. Jms-Jounela (2004), An online application of dynamic PLS to a dearomatization process, Computers & Chemical Engineering, 28(11): 2611-2619. [6] R. J. Patton, P. M. Frank & R. N. Clark (1989), Fault diagnosis in dynamic system: theory and applications, Prentice Hall. [7] R. Isermann & P. Ball (1997), Trends in the Application of Model-based Fault Detection and Diagnosis of Technical Processes, Control Engineering Practise, 5: 709-719. [8] J. Chen & R. J. Patton, Robust Model Based Fault Diagnosis for Dynamic Systems, Kluwer Academic Publishers ISBN 0-7923-841-3, 1999. [9] J. M. F. Calado, J. Korbicz, K. Patan, R. J. Patton & J. M. G. S da Costa, Soft Computing Approaches to Fault Diagnosis for Dynamic Systems, Invited Special Issue Paper, European Journal of Control, 7(2-3), July 2001. [10] J. F. MacGregor, T. E. Marlin, J. V. Kresta & B. Skagerberg (1991), Multivariate statistical methods in process analysis and control, AIChE symposium proceedings of the 4th international conference on chemical process control, 67: 79-99, New York: AIChE Publ. [11] I. T. Jolliffe (1986), Principal Component Analysis. Springer-Verlag, New York. [12] A. Hoskkuldon (1988), PLS Regression Methods, J. Chemometric, 2: 211-228. [13] B. Jnsson B, L. Klarns & M. Norberg (2003), Inteligent Alarm Handing In the Steel Manufacturing Industry, PerOlof Norberg, Advanced Process Control Ltd. & O Marjanovic and Keith Smith, Control Technology Centre Ltd. [14] X. Wang, U. Kruger & B. Lennox (2003), Recursive partial least squares algorithms for monitoring complex industrial processes. Control Engineering Practice, 11: 613-631. [15] P. Geladi & B. R. Kowalski (1986), Partial least squares regression: A tutorial. Analytica Chimica Acta, 187; 1-17. [16] D. Sauter, T. Boukhobza & F. Hamelin (2005), Decentralized and Autonomous Design for FDI/FTC of Networked Control Systems, NeCST Workshop in Ajaccio. [17] S. Sauter & C. Aubrun (2006), A heating system as a benchmark for networked control systems tolerant to faults. Internal Documentation of the EU Project IST-004303 NeCST, UHP, Nancy. [18] C. Perkgoz, C. Kambhampati, R. J. Patton & V. Palacka (2005), Integration of OPC Tools in Network Control Systems Tolerant to Faults, Proceedings of the 1st NeCST Workshop on Networked Control Systems, Corsica, http://www.strep-necst.org. [19] J. M. F. Calado, J. Korbicz, K. Patan & R. J. Patton (2001), Soft Computing Approaches to Fault Diagnosis for Dynamic Systems. European Journal of Control, 7, 248286 [20] B. Lennox (2005), Ed. Special issue on Recent Advances in Process Fault Monitoring. International Journal of Adaptive Control and Signal Processing, 19(4) [21] R.J. Patton, C. Kambhampati, A. Casavola, P. Zhang, S. Ding & D. Sauter (2007), A Generic Strategy for FaultTolerance in Control Systems Distributed over a Network, Invited Special Issue Paper, European Journal of Control, 13: 280-296. [22] R. J. Patton & S. Klinkhieo (2009), A Two-Level Approach to Fault-Tolerant Control of Large Scale Systems based on the Sliding Mode, Safeprocess09, 30th June-3rd July, Barcelona.

1901

You might also like