12 Field data collection and feedback
12.1 REASONS FOR DATA COLLECTION
Failure data can be collected from prototype and production models or from the field. In either
case a formal failure-reporting document is necessary in order to ensure that the feedback is both
consistent and adequate. Field information is far more valuable since it concerns failures and
repair actions which have taken place under real conditions. Since recording field incidents
relies on people, it is subject to errors, omissions and misinterpretation. It is therefore important
to collect all field data using a formal document. Information of this type has a number of uses,
the main two being feedback, resulting in modifications to prevent further defects, and the
acquisition of statistical reliability and repair data. In detail, then, they:
Indicate design and manufacture deficiencies and can be used to support reliability growth
programmes (Section 11.3);
Provide quality and reliability trends;
Identify wearout and decreasing failure rates;
Provide subcontractor ratings;
Contribute statistical data for future reliability and repair time predictions;
Assist second-line maintenance (workshop);
Enable spares provisioning to be refined;
Allow routine maintenance intervals to be revised;
Enable the field element of quality costs to be identified.

A failure-reporting system should be established for every project and product. Customer
cooperation with a reporting system is essential if feedback from the field is required and this
could well be sought, at the contract stage, in return for some other concession.

12.2 INFORMATION AND DIFFICULTIES


A failure report form must collect information covering the following:
Repair time - active and passive.
Type of fault - primary or secondary, random or induced, etc.
Nature of fault - open or short circuit, drift condition, wearout, design deficiency.
Fault location - exact position and detail of LRA or component.
Environmental conditions - where these are variable, record conditions at time of fault if
possible.

152 Reliability, Maintainability and Risk

Action taken - exact nature of replacement or repair.


Personnel involved.
Equipment used.
Spares used.
Unit running time.
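
The items listed above could be held in a simple record type once a form has been transcribed. The following is a minimal sketch only: the class name `FailureReport`, the field names and the choice of hours as the unit are illustrative assumptions and do not come from any particular form.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FailureReport:
    """One transcribed failure report (hypothetical field names)."""
    active_repair_hours: float     # repair time - active
    passive_repair_hours: float    # repair time - passive
    fault_type: str                # e.g. "primary", "secondary", "induced"
    fault_nature: str              # e.g. "open circuit", "drift", "wearout"
    fault_location: str            # position and detail of LRA or component
    environment: str               # conditions at time of fault, if known
    action_taken: str              # exact nature of replacement or repair
    unit_running_hours: float      # unit running time
    personnel: List[str] = field(default_factory=list)
    equipment_used: List[str] = field(default_factory=list)
    spares_used: List[str] = field(default_factory=list)

    def total_repair_hours(self) -> float:
        """Active plus passive repair time."""
        return self.active_repair_hours + self.passive_repair_hours

    def missing_fields(self) -> List[str]:
        """Names of key text fields left blank, to flag incomplete reports."""
        required = ["fault_type", "fault_nature", "fault_location", "action_taken"]
        return [name for name in required if not getattr(self, name).strip()]


# Hypothetical example: the fault location was not filled in.
report = FailureReport(1.5, 2.0, "primary", "drift", "", "humid", "item replaced", 1200.0)
print(report.total_repair_hours(), report.missing_fields())
```

Checking for blank entries at the point of data entry is one way of catching omissions before the report leaves the person who completed it.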
The main problems associated with failure recording are:

1. Inventories: Whilst failure reports identify the numbers and types of failure they rarely
   provide a source of information as to the total numbers of the item in question and their
   installation dates and running times.
2. Motivation: If the field service engineer can see no purpose in recording information it is
   likely that items will be either omitted or incorrectly recorded. The purpose of fault reporting
   and the ways in which it can be used to simplify the task need to be explained. If the engineer
   is frustrated by unrealistic time standards, poor working conditions and inadequate
   instructions, then the failure report is the first task which will be skimped or omitted. A
   regular circulation of field data summaries to the field engineer is the best (possibly the only)
   way of encouraging feedback. It will help him to see the overall field picture and advice on
   diagnosing the more awkward faults will be appreciated.
3. Verification: Once the failure report has left the person who completes it the possibility of
   subsequent checking is remote. If repair times or diagnoses are suspect then it is likely that
   they will go undetected or be unverified. Where failure data are obtained from customer's
   staff, the possibility of challenging information becomes even more remote.
4. Cost: Failure reporting is costly in terms of both the time to complete failure-report forms
   and the hours of interpretation of the information. For this reason, both supplier and customer
   are often reluctant to agree to a comprehensive reporting system. If the information is
   correctly interpreted and design or manufacturing action taken to remove failure sources,
   then the cost of the activity is likely to be offset by the savings and the idea must be sold
   on this basis.
5. Recording non-failures: The situation arises where a failure is recorded although none exists.
   This can occur in two ways. First, there is the habit of locating faults by replacing suspect
   but not necessarily failed components. When the fault disappears the first (wrongly removed)
   component is not replaced and is hence recorded as a failure. Failure rate data are therefore
   artificially inflated and spares depleted. Second, there is the interpretation of secondary
   failures as primary failures. A failed component may cause stress conditions upon another
   which may, as a result, fail. Diagnosis may reveal both failures but not always which one
   occurred first. Again, failure rates become wrongly inflated. More complex maintenance
   instructions and the use of higher-grade personnel will help reduce these problems at a
   cost.
6. Times to failure: These are necessary in order to establish wearout. See next section.

12.3 TIMES TO FAILURE


In most cases fault data schemes yield the numbers of failures/defects of equipment.
Establishing the inventories, and the installation dates of items, is also necessary if the
cumulative times are to be determined. This is not always easy as plant records are often
incomplete (or out of date) and the exact installation dates of items have sometimes to be
guessed.


Nevertheless, establishing the number of failures and the cumulative time enables failure rates
to be inferred as was described in Chapter 5.
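
As a reminder of the arithmetic, the point estimate is simply the number of failures divided by the cumulative time. The sketch below assumes a constant failure rate; the function name and the inventory figures are hypothetical and are chosen only to show the calculation.

```python
from typing import Sequence


def failure_rate(failures: int, item_hours: Sequence[float]) -> float:
    """Point estimate: number of failures divided by cumulative item-hours."""
    total = sum(item_hours)
    if total <= 0:
        raise ValueError("cumulative time must be positive")
    return failures / total


# Hypothetical inventory: four items with these running hours, two failures in total.
hours = [8760.0, 8760.0, 4380.0, 4380.0]
rate = failure_rate(2, hours)
print(f"{rate * 1e6:.1f} failures per million hours")
```

The estimate is only as good as the inventory: a guessed installation date changes the cumulative hours and hence the inferred rate.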
Although this failure rate information provides a valuable input to reliability prediction
(Chapter 8) and to optimum spares provisioning (Chapter 15), it does not enable the wearout and
burn-in characteristics of an item to be described. In Chapter 6 the Weibull methodology for
describing variable failure rates was described and, in Chapter 15, it is shown how to use this
information to optimize replacement intervals.
For this to happen it is essential that each item is separately identified (usually by a tag
number) and that each failure is attributed to a specific item. Weibull models are usually,
although not always, applicable at the level of a specific failure mode rather than to the failures
as a whole. A description of failure mode is therefore important and the physical mechanism,
rather than the outcome, should be described. For example the phrase 'out of adjustment' really
describes the effect of a failure whereas 'replaced leaking diaphragm' more specifically
describes the mode.
Furthermore, if an item is removed, replaced or refurbished as new then this needs to be
identified (by tag number) in order for the correct start times to be identified for each
subsequent failure time. In other words if an item which has been in situ for 5 years had
a new diaphragm fitted 1 year ago then, for diaphragm failures, the time to failure dates from
the latter. On the other hand failures of another mode might well be treated as times dating
from the former.
Another complication is in the use of operating time rather than calendar time. In some ways
the latter is more convenient if the data are intended for generic use. In some cases however,
especially where the mode is related to wear and the operating time is short compared with
calendar time, then operating hours will be more meaningful. In any case consistency is the
rule.
If this information is available then it will be possible to list:
- individual times to failure (calendar or operating)
- times for items which did not fail
- times for items which were removed without failing

In summary the following are needed:


- Installed (or replaced/refurbished) dates and tag numbers
- Failure dates and tag numbers
- Failure modes (by physical failure mechanism)
- Running times/profiles unless calendar time is to be used
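
Given records of this kind, the times to failure and the times for items which have not failed can be listed mechanically. The following is a sketch under stated assumptions: calendar days are used, a repair is assumed to renew the affected part as new on the date of failure, and the tag number, failure mode names and dates are hypothetical.

```python
from datetime import date


def times_to_failure(starts, failures, end_of_study):
    """
    starts:   {(tag, mode): date the relevant part was installed or renewed}
    failures: list of (tag, mode, failure_date)
    Returns (failed, surviving), each a list of (tag, mode, days).
    """
    clock = dict(starts)                      # current start date per (tag, mode)
    failed = []
    for tag, mode, when in sorted(failures, key=lambda f: f[2]):
        key = (tag, mode)
        failed.append((tag, mode, (when - clock[key]).days))
        clock[key] = when                     # assumption: renewed as new on failure
    # Whatever is still running at the end of the study has a time without failure.
    surviving = [(tag, mode, (end_of_study - start).days)
                 for (tag, mode), start in sorted(clock.items())]
    return failed, surviving


# Item in situ since 2015; a new diaphragm was fitted at the start of 2019.
starts = {
    ("V-101", "diaphragm leak"): date(2019, 1, 1),
    ("V-101", "seized stem"): date(2015, 1, 1),
}
failures = [("V-101", "diaphragm leak", date(2020, 1, 1))]
failed, surviving = times_to_failure(starts, failures, date(2020, 12, 31))
print(failed)
print(surviving)
```

Keeping a separate start date for each tag and failure mode is what allows the diaphragm failure to be timed from the refit while other modes are timed from the original installation.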

12.4 SPREADSHEETS AND DATABASES


Many data-collection schemes arrange for the data to be manually transferred, from the written
form, into a computer. In order to facilitate data sorting and analysis it is very useful if the
information can be in a coded form. This requires some form of codes database for the field
maintenance personnel in order that the various entries can be made by means of simple
alphanumerics. This has the advantage that field reports are more likely to be complete since
there is a code available for each box on the form. Furthermore, the codes then provide
definitive classifications for subsequent sorting. Headings include:


Equipment code
Preferably a hierarchical coding scheme which defines the plant, subsystem and item as, for
example, RC1-66-03-5555, where:

Code    Meaning
R       Southampton Plant
C1      Compression system
66      Power generation
03      Switchgear
5555    Actual item
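
A hierarchical code of this kind is easy to split into its levels for sorting. The sketch below assumes the layout of the example, that is, a one-letter plant code followed by a two-character system code, then three hyphen-separated groups. The names `EquipmentCode`, `parse_equipment_code` and the field names are hypothetical.

```python
from typing import NamedTuple


class EquipmentCode(NamedTuple):
    plant: str       # e.g. "R"    (Southampton Plant in the example)
    system: str      # e.g. "C1"   (Compression system)
    subsystem: str   # e.g. "66"   (Power generation)
    item_type: str   # e.g. "03"   (Switchgear)
    item: str        # e.g. "5555" (actual item)


def parse_equipment_code(code: str) -> EquipmentCode:
    """Split a code such as 'RC1-66-03-5555' into its hierarchy levels."""
    parts = code.strip().upper().split("-")
    if len(parts) != 4 or len(parts[0]) != 3:
        raise ValueError(f"unexpected equipment code format: {code!r}")
    head, subsystem, item_type, item = parts
    return EquipmentCode(head[0], head[1:], subsystem, item_type, item)


print(parse_equipment_code("RC1-66-03-5555"))
```

Because the result is a tuple ordered from plant down to item, a list of parsed codes sorts naturally by plant, then system, and so on.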

How found
The reason for the defect being discovered as, say, a two-digit code:
Code    Meaning
01      Plant shutdown
02      Preventive maintenance
03      Operating problem
etc.

Type of fault
The failure mode, for example:
Code    Meaning
01      Short circuit
02      Open circuit
03      Leak
04      Drift
05      No fault found
etc.

Action taken
Examples are:
Code    Meaning
01      Item replaced
02      Adjusted
03      Item repaired
etc.

Discipline
Where more than one type of maintenance skill is used, as is often the case on big sites, it is
desirable to record the maintenance discipline involved. These are useful data for future
maintenance planning and costing. Thus:

Code    Meaning
01      Electrical
02      Instrument
03      Mechanical
etc.

Free text
In addition to the coded report there needs to be some provision for free text in order to amplify
the data.

Each of the above fields may run to several dozen codes which would be issued to the field
maintenance personnel as a handbook. Two suitable types of package for analysis of the data are
spreadsheets and databases. If the data can be entered directly into one of these packages, so
much the better. In some cases the data are resident in a more wide-ranging, field-specific,
computerized maintenance system. In those cases it will be worth writing a download program
to copy the defect data into one of the above types of package.
Spreadsheets such as Lotus 1-2-3 (Appendix 10) allow the data, including text, to be placed
in cells arranged in rows and columns. Sorting is available as well as mathematical manipulation
of the data.
In some cases the quantity of data may be such that spreadsheet manipulation becomes slow
and cumbersome, or is limited by the extent of the PC memory. The use of database packages
such as FOCUS (Appendix 10) permits more data to be handled and more flexible and fast
sorting. Sorting is far more flexible than with spreadsheets since words within text, within
headings or even 'sound-alike' words can be sorted.
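
Whatever package is used, coded entries reduce sorting to a matter of counting codes. The sketch below uses the example 'Type of fault' codes given earlier in this section; the records themselves, the tuple layout and the function name `fault_counts` are hypothetical. It also shows 'No fault found' entries being kept out of the failure counts.

```python
from collections import Counter

# Example 'Type of fault' codes from this section.
FAULT_TYPES = {
    "01": "Short circuit",
    "02": "Open circuit",
    "03": "Leak",
    "04": "Drift",
    "05": "No fault found",
}

# Hypothetical coded records: (equipment code, how found, type of fault, action taken)
records = [
    ("RC1-66-03-5555", "02", "03", "01"),
    ("RC1-66-03-5556", "03", "03", "03"),
    ("RC1-66-04-0012", "01", "01", "01"),
    ("RC1-66-04-0013", "03", "05", "02"),
    ("RC1-67-01-0101", "03", "03", "01"),
    ("RC1-67-01-0102", "01", "01", "01"),
    ("RC1-67-02-0007", "02", "04", "02"),
]


def fault_counts(rows, exclude_no_fault=True):
    """Count reports by fault type, most frequent first."""
    codes = (row[2] for row in rows)
    if exclude_no_fault:
        codes = (code for code in codes if code != "05")
    return [(FAULT_TYPES[code], n) for code, n in Counter(codes).most_common()]


for name, n in fault_counts(records):
    print(f"{name:15s} {n}")
```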

12.5 ANALYSIS AND PRESENTATION OF RESULTS


Once collected, data must be analysed and put to use or the system of collection will lose
credibility and, in any case, the cost will have been wasted. A Pareto analysis of defects is
a powerful method of focusing attention on the major problems. If the frequency of each
defect type is totalled and the types then ranked in descending order of frequency it will
usually be seen that a high percentage of the defects are concentrated in only a few types. A
still more useful approach, if cost information is available, is to multiply each defect type
frequency by its cost and then to rerank the categories in descending order of cost. Thus the
most expensive group of defects, rather than the most frequent, heads the list, as can be seen
in Figure 12.1.
Note the emphasis on cost and that the total has been shown as a percentage of sales. It
is clear that engineering effort could profitably be directed at the first two items which
together account for 38% of the failure cost. The first item is a mechanical design problem
and the second a question of circuit tolerancing.
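
The cost ranking can be reproduced in a few lines. This sketch uses the repetitive-failure costs and the total cost of calls as read from Figure 12.1; the shortened category names and the function name `pareto_by_cost` are assumptions made for illustration.

```python
def pareto_by_cost(costs, total=None):
    """Rank categories by cost; return (name, cost, % of total, cumulative %)."""
    total = sum(costs.values()) if total is None else total
    ranked = sorted(costs.items(), key=lambda kv: kv[1], reverse=True)
    rows, running = [], 0.0
    for name, cost in ranked:
        running += cost
        rows.append((name, cost, 100 * cost / total, 100 * running / total))
    return rows


# Costs (in pounds) of the repetitive failures, as read from Figure 12.1.
costs = {
    "Transporter belt adjustment": 935,
    "Receiver carrier detector drift": 680,
    "Electromechanical relays": 340,
    "Gear meshing": 340,
    "Printed board 182c": 300,
    "Lamps": 170,
}

# 4250 is the total cost of calls, including non-repetitive faults.
for name, cost, pct, cum in pareto_by_cost(costs, total=4250):
    print(f"{name:32s} {cost:5d} {pct:5.1f}% {cum:5.1f}%")
```

The cumulative column makes it easy to see how few categories need to be tackled to address a given share of the field cost.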
It is also useful to know whether the failure rate of a particular failure type is
increasing, decreasing or constant. This will influence the engineering response. A decreasing failure rate indicates the need for further action in test to eliminate the early failures.
Increasing failure rate shows wearout, requiring either a design solution or preventive
replacement. Constant failure rate suggests a reliability level which is inherent to that
design configuration. Chapter 6 explains how failure data can be analysed to quantify these
trends. The report in Figure 12.1 might well contain other sections showing reliability
growth, analysis of wearout, progress on engineering actions since the previous report,
etc.
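
One common way of judging the trend is to fit a Weibull shape parameter to the times to failure of a single failure mode: a value below 1 suggests a decreasing failure rate, about 1 a constant rate, and above 1 an increasing rate. The sketch below is not necessarily the procedure of Chapter 6. It assumes a complete sample (every item has failed), uses Bernard's median rank approximation with a least-squares line, and the tolerance band around 1 and the sample times are arbitrary illustrative choices.

```python
import math


def weibull_shape(times):
    """Slope of ln(-ln(1 - F)) against ln(t), using median ranks for F."""
    ts = sorted(times)
    n = len(ts)
    if n < 2 or ts[0] <= 0 or ts[0] == ts[-1]:
        raise ValueError("need at least two distinct positive times")
    xs = [math.log(t) for t in ts]
    # Bernard's approximation to the median rank of the i-th ordered failure.
    ys = [math.log(-math.log(1 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def trend(beta, tol=0.1):
    """Classify the failure rate trend from the shape parameter (tol is arbitrary)."""
    if beta < 1 - tol:
        return "decreasing"
    if beta > 1 + tol:
        return "increasing"
    return "constant"


# Hypothetical times to failure (hours) for one failure mode.
sample = [410.0, 720.0, 980.0, 1150.0, 1400.0, 1720.0]
beta = weibull_shape(sample)
print(f"shape = {beta:.2f}, failure rate {trend(beta)}")
```

With only a handful of failures the fitted shape is very uncertain, so a result close to 1 should not be over-interpreted.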


1. Summary of Data
Number of machines in field          50
Operating hours (this period)        5320
Number of corrective calls           39
Total cost of calls                  £4250 (labour, travel and spares)
Total cost as % of sales             4%

2. Incident Analysis
                                         Frequency   Cost £   % of total
Repetitive Failures
a) Mechanical transporter assembly -
   belt adjustment                                    935      22
b) Receiver carrier detector drift                    680      16
c) Electromechanical relays              4            340      8
d) Gear Meshing                          3            340      8
e) Printed Board 182c output VT2         2            300      7
f) Lamps                                              170      4
Non-repetitive Faults
g) Printed Board 424a IC5
h) Printed Board 111e R2
etc.                                     15           1485     35
Total                                    39           4250     100

Figure 12.1 Quarterly incident report summary - product Y

12.6 EXAMPLES OF FAILURE REPORT FORMS


Figure 12.2 shows an example of a well-designed and thorough failure recording form as
once used by the European companies of the International Telephone and Telegraph
Corporation. This single form strikes a balance between the need for detailed failure
information and the requirement for a simple reporting format. A feature of the ITT form is
the use of four identical print-through forms. The information is therefore accurately recorded
four times with minimum effort. As an example of the need for more elaborate reporting,
consider an air traffic control service which is essentially concerned with the safety of life
in a dynamic situation. It is not surprising therefore to find a detailed maintenance reporting
system in such an organization. Three of the forms used by the British Civil Aviation
Authority are shown in Figure 12.3. They deal with corrective and planned maintenance
reporting in great detail.
It is unfortunate that few forms give adequate breakdown of maintenance times separated
into the various passive and active elements. To identify and record this level of information
increases the maintenance time and cost. It has to be justified if a special investigation is
required. Such an analysis can result in improved maintenance procedures, in which case it
may pay for itself by reducing long-term costs.

Figure 12.2 ITT Europe failure report and action form

Figure 12.3(a) NATS equipment defect report

Figure 12.3(b) NATS planned maintenance report

