
Reliability Engineering

Outline

• Reliability definition
• Reliability estimation
• System reliability calculations

Reliability

The probability that no (system) failure will occur in a given time interval.

"A reliable system is one that meets the specifications." Do you accept this?

Reliability Importance

• Reliability is one of the most important characteristics of a product: it is a measure of the product's performance over time (e.g., transatlantic and transpacific cables).

• Product recalls are common, and they happen only after time has elapsed. In October 2006, the Sony Corporation recalled up to 9.6 million of its personal computer batteries.

• Products are discontinued because of fatal accidents (Pinto, Concorde).

• Medical devices and organs (reliability of artificial organs).

Reliability Definitions

Reliability is a time-dependent characteristic.

• It can only be determined after an elapsed time but can be predicted at any time.

• It is the probability that a product or service will operate properly for a specified period of time (design life) under the design operating conditions without failure.

Reliability is ….

• Reliability: the ability of a system or component to perform its required functions under stated conditions for a specified period of time.

• Reliability Engineering: an engineering field that deals with the study of reliability and is concerned with meeting the specified probability of success at a specified statistical confidence level.

• KEY DEFINITIONS OF RELIABILITY

– First, reliability is a probability. This means that failure is regarded as a random phenomenon: it is a recurring event, and we do not express any information on individual failures, the causes of failures, or relationships between failures, except that the likelihood of failure varies over time according to the given probability function.

– Second, reliability is predicated on "intended function". Generally, this is taken to mean operation without failure. However, if no individual part of the system fails but the system as a whole does not do what was intended, the shortfall is still charged against the system reliability. The system requirements specification is the criterion against which reliability is measured.

– Third, reliability applies to a specified period of time. In practical terms, this means that a system has a specified chance of operating without failure before time t. Reliability engineering ensures that components and materials will meet the requirements during the specified time. (The period may also be specified in other units, such as miles or rounds of gunfire.)

– Fourth, reliability is restricted to operation under stated conditions. This constraint is necessary because it is impossible to design a system for unlimited conditions. A Mars Rover will have different specified conditions than the family car. The operating environment must be addressed during design and testing. That same rover may also be required to operate in varying conditions, which requires additional scrutiny.

RELIABILITY is ….

• Reliability: the ability of a system or component to perform its required functions under stated conditions for a specified period of time.

• Availability: the percentage of time that the system is operational. The availability of a hardware/software module can be obtained from the formula Availability = MTBF / (MTBF + MTTR).

• MTTR: Mean Time To Repair (MTTR) is the average time taken to repair a failed hardware module. In an operational system, repair generally means replacing the hardware module.

• MTBF: Mean Time Between Failures (MTBF), as the name suggests, is the average time between failures of hardware modules. It is the average time a manufacturer estimates will elapse before a failure occurs in a hardware module.

• MEAN: the expected value of a random variable, also called the population mean.

• VARIANCE: the expected squared deviation of a random variable from its expected value (mean).

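As a rough numerical illustration of availability, the sketch below assumes the common steady-state relation Availability = MTBF / (MTBF + MTTR). The function name and the MTBF and MTTR values are hypothetical and chosen only for illustration.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: fraction of time the module is operational.

    Assumes the common relation MTBF / (MTBF + MTTR).
    """
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical module: fails on average every 10,000 h, takes 2 h to replace.
a = availability(mtbf_hours=10_000.0, mttr_hours=2.0)

# Expected downtime over one year of continuous operation (8760 h).
downtime_hours_per_year = (1.0 - a) * 8760.0

print(f"availability = {a:.5%}")
print(f"expected downtime = {downtime_hours_per_year:.2f} h/year")
```

Shortening the repair time raises availability even when the failure behavior (MTBF) is unchanged, which is why availability is reported separately from reliability for repairable systems.
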
What do Reliability Engineers Do?

• Implement Reliability Engineering Programs across all functions
– Engineering
– Research
– Manufacturing
– Testing
– Packaging
– Field service
How to Make A System Reliable

[Figure: flow diagram with the blocks System overview, System design, Component determination, Component design or purchase, Reliability testing & data analyses, Product manufacturing, and Maintenance. Two question marks appear in the diagram.]
Reliability as a Process

[Figure: INPUT feeds the Reliability Assurance Module, which delivers Product Assurance.]

INPUT
• Reliability Goals
• Schedule time
• Budget Dollars
• Test Units
• Design Data

Reliability Assurance Module: Internal Methods
• Design Rules
• Components Testing
• Subsystem Testing
• Architectural Strategy
• Life Testing
• Prototype Testing
• Field Testing
• Reliability Predictions (models)
Early product failure
• Strongest effect on customer satisfaction
– A field day for competitors
• The most expensive to repair
– Why?
– Rings through the entire production system
– High volume
– Long C/T (cycle time)
• Examples from GE (but problem not confined to GE!)
– GE Variable Power module for House Air Conditioning
– GE Refrigerators
– GE Cellular
Early Product Failure
• Can be catastrophic for human life
– Challenger, Columbia
– Titanic
– DC 10
– Auto design
– Aircraft Engine
– Military equipment
Reliability as a function of System Complexity
Why computers made of tubes (or discrete transistors) cannot be made to work

System reliability (%) when all components are in series:

# of components    Component reliability    Component reliability
in series          = 99.999%                = 99.99%
100                99.90                    99.01
250                99.75                    97.53
500                99.50                    95.12
1000               99.01                    90.48
10,000             90.48                    36.79
100,000            36.79                    0.005
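The table values can be checked with a few lines of code. This is a minimal sketch assuming that the components fail independently and that the system works only if every component works, so the system reliability is the component reliability raised to the number of components. The function name is illustrative.

```python
def series_reliability(component_reliability, n_components):
    """Reliability of n independent, identical components in series."""
    return component_reliability ** n_components


print("components   r=99.999%   r=99.99%")
for n in (100, 250, 500, 1000, 10_000, 100_000):
    r_high = series_reliability(0.99999, n)  # component reliability 99.999%
    r_low = series_reliability(0.9999, n)    # component reliability 99.99%
    print(f"{n:>10}   {100 * r_high:8.3f}%   {100 * r_low:8.3f}%")
```

With 100,000 components at 99.99% each, the system is almost certain to fail, which is the point of the slide: system reliability falls off exponentially with part count.
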
Three Classifications of Reliability Failure

Type                               Old remedy (repair mentality)
Early (infant mortality)           Burn-in
Wearout (physical degradation)     Maintenance
Chance (overstress)                In-service testing


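Bathtub Curve

[Figure: bathtub curve of failure rate (failures per million hours) versus time, with three regions: infant mortality, useful life, and wear-out. The useful-life region is annotated: no memory, no improvement, no wear-out, random causes.]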
Reliability

[Figure: probability of dying in the next year (deaths per 1000, vertical axis 0 to 90) versus age (0 to 86).]
From the Statistical Bulletin 79, no 1, Jan-Mar 1998
Chance Failures
(Occur throughout the life of a product at a constant rate)

• Insufficient safety factors in design
• Higher than expected random loads
• Human errors
• Misapplication
• Developing world concerns
Wear-out
(Occur late in life and increase with age)

• Aging
• degradation in strength
• Materials Fatigue
• Creep
• Corrosion
• Poor maintenance
• Developing World Concerns
Failure Types
• Catastrophic
• Degradation
• Drift
• Intermittent
Failure Effects
(What customer experiences)
• Noise
• Erratic operation
• Inoperability
• Instability
• Intermittent operation
• Impaired Control
• Impaired operation
• Roughness
• Excessive effort requirements
• Unpleasant or unusual odor
• Poor appearance
Failure Modes
• Cracking
• Deformation
• Wear
• Corrosion
• Loosening
• Leaking
• Sticking
• Electrical shorts
• Electrical opens
• Oxidation
• Vibration
• Fracturing
Reliability Remedies
• Early: quality manufacture, robust design
• Wearout: physically-based models, preventative maintenance, robust design (FMEA)
• Chance: tight customer linkages, testing, HAST
Some Initial Thoughts
Warranty
• Will you buy additional warranty?
Burn-in and removal of early failures (Lemon Law).

[Figure: failure rate versus time, with regions labeled early failures, constant failure rate, and increasing failure rate.]
Other Measures of Reliability

Availability is used for repairable systems.

• It is the probability that the system is operational at any random time t.

• It can also be specified as the proportion of time that the system is available for use in a given interval (0, T).

Some Initial Thoughts
Repairable and Non-Repairable
Another measure of reliability is availability (the probability that the system provides its functions when needed).

[Figure: reliability versus time for a system with repairs and with no repairs; a maximum reliability level is marked.]
Other Measures of Reliability

Mean Time To Failure (MTTF): It is the average time that elapses until a failure occurs. It does not provide information about the distribution of the time to failure (TTF), hence we also need to estimate the variance of the TTF.

Mean Time Between Failures (MTBF): It is the average time between successive failures. It is used for repairable systems.
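Mean Time to Failure: MTTF

$$\mathrm{MTTF} = \int_0^\infty t\, f(t)\, dt = \int_0^\infty R(t)\, dt$$

From n observed failure times t_i:

$$\mathrm{MTTF} = \frac{1}{n} \sum_{i=1}^{n} t_i$$

[Figure: reliability R(t) versus time t for two curves, labeled 1 and 2. Is 2 better than 1?]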
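Both MTTF expressions can be evaluated numerically. The sketch below is illustrative only: it assumes an exponential reliability function with a made-up constant failure rate for the integral form, and a made-up list of failure times for the sample form. All names and values are hypothetical.

```python
import math


def mttf_from_reliability(reliability, t_max, steps=100_000):
    """Approximate MTTF = integral of R(t) dt over [0, t_max] (trapezoidal rule).

    t_max must be large enough that R(t_max) is negligible.
    """
    dt = t_max / steps
    total = 0.5 * (reliability(0.0) + reliability(t_max))
    for k in range(1, steps):
        total += reliability(k * dt)
    return total * dt


def mttf_from_sample(failure_times):
    """Sample estimate of MTTF: the arithmetic mean of observed failure times."""
    return sum(failure_times) / len(failure_times)


FAILURE_RATE = 1e-3  # assumed constant failure rate per hour (illustrative)


def exp_reliability(t):
    """R(t) = exp(-rate * t) for the assumed exponential model."""
    return math.exp(-FAILURE_RATE * t)


mttf_numeric = mttf_from_reliability(exp_reliability, t_max=20_000.0)
mttf_sample = mttf_from_sample([400.0, 900.0, 1700.0])  # hypothetical data

print(f"MTTF from R(t): {mttf_numeric:.2f} h (exact value is 1/rate)")
print(f"MTTF from sample: {mttf_sample:.2f} h")
```

Two products can share the same MTTF while having very different R(t) curves, which is why the mean alone does not settle which of two designs is better.
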
Mean Time Between Failures: MTBF

Other Measures of Reliability

Mean Residual Life (MRL): It is the expected remaining life, T − t, given that the product, component, or system has survived to time t.

$$L(t) = E[T - t \mid T \ge t] = \frac{1}{R(t)} \int_t^\infty \tau f(\tau)\, d\tau - t$$

Failure Rate (measured in FITs, failures in 10^9 hours): The failure rate in a time interval [t_1, t_2] is the probability per unit time that a failure occurs in the interval, given that no failure has occurred prior to the beginning of the interval.

Hazard Function: It is the limit of the failure rate as the length of the interval approaches zero.
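The relationship between the interval failure rate and the hazard function can be seen numerically. The sketch below assumes the usual form of the interval failure rate, (R(t1) − R(t2)) / ((t2 − t1) R(t1)), and an illustrative Weibull reliability function; the parameter values and function names are hypothetical.

```python
import math

SHAPE, SCALE = 2.0, 1000.0  # assumed Weibull parameters (illustrative only)


def reliability(t):
    """Weibull reliability R(t) = exp(-(t / SCALE) ** SHAPE)."""
    return math.exp(-((t / SCALE) ** SHAPE))


def interval_failure_rate(t1, t2):
    """Probability of failure per unit time in [t1, t2], given survival to t1."""
    return (reliability(t1) - reliability(t2)) / ((t2 - t1) * reliability(t1))


def hazard(t):
    """Closed-form Weibull hazard: the limit of the interval failure rate."""
    return (SHAPE / SCALE) * (t / SCALE) ** (SHAPE - 1.0)


def to_fits(rate_per_hour):
    """Convert a rate per hour to FITs (failures in 10^9 hours)."""
    return rate_per_hour * 1e9


t = 500.0
for width in (100.0, 10.0, 0.1):
    rate = interval_failure_rate(t, t + width)
    print(f"interval width {width:6.1f} h: failure rate = {rate:.6e} per hour")
print(f"hazard at t = {t:.0f} h: {hazard(t):.6e} per hour = {to_fits(hazard(t)):.0f} FITs")
```

As the interval shrinks, the interval failure rate approaches the hazard value at t.
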
Basic Calculations

Suppose n_0 identical units are subjected to a test. During the interval (t, t+∆t), we observed n_f(t) failed components. Let n_s(t) be the number of surviving components at time t. Then the MTTF, failure density, hazard rate, and reliability at time t are:

$$\mathrm{MTTF} = \frac{\sum_{i=1}^{n_0} t_i}{n_0}, \qquad \hat{f}(t) = \frac{n_f(t)}{n_0\, \Delta t},$$

$$\hat{\lambda}(t) = \frac{n_f(t)}{n_s(t)\, \Delta t}, \qquad \hat{R}(t) = \Pr(T > t) = \frac{n_s(t)}{n_0}$$
Basic Definitions Cont’d

The unreliability F(t) is

$$F(t) = 1 - R(t)$$

Time Interval (Hours)    Failures in the interval
0-1000                   100
1001-2000                40
2001-3000                20
3001-4000                15
4001-5000                10
5001-6000                8
6001-7000                7
Total                    200
Calculations

Failure density f(t) and hazard rate h(t), both in units of 10^-4 per hour:

Time Interval (Hours)    Failures in the interval    f(t) × 10^-4                 h(t) × 10^-4
0-1000                   100                         100/(200 × 10^3) = 5.0       100/(200 × 10^3) = 5.0
1001-2000                40                          40/(200 × 10^3) = 2.0        40/(100 × 10^3) = 4.0
2001-3000                20                          20/(200 × 10^3) = 1.0        20/(60 × 10^3) = 3.33
3001-4000                15                          ……                           ……
4001-5000                10                          ……                           ……
5001-6000                8                           ……                           ……
6001-7000                7                           7/(200 × 10^3) = 0.35        7/(7 × 10^3) = 10
Total                    200
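The failure density and hazard rate for every interval of the test data can be reproduced with a short script. This is a minimal sketch assuming equal 1000-hour intervals, and assuming the hazard rate divides by the number of units surviving at the start of each interval, as in the arithmetic on this slide. Variable names are illustrative.

```python
# Failures observed in each 1000-hour interval (0-1000, 1001-2000, ...).
failures = [100, 40, 20, 15, 10, 8, 7]
n0 = sum(failures)  # 200 units placed on test
dt = 1000.0         # interval width in hours

density = []  # f(t) = n_f / (n0 * dt)
hazard = []   # h(t) = n_f / (n_s * dt), n_s = survivors at start of interval
survivors = n0
for n_f in failures:
    density.append(n_f / (n0 * dt))
    hazard.append(n_f / (survivors * dt))
    survivors -= n_f

print("interval start (h)   f(t) x 1e-4   h(t) x 1e-4")
for k, (f, h) in enumerate(zip(density, hazard)):
    print(f"{int(k * dt):>18}   {f * 1e4:11.2f}   {h * 1e4:11.2f}")
```

The density falls steadily because it always divides by the original 200 units, while the hazard rate rises again in the last intervals because few survivors remain to share the failures.
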

Failure Density vs. Time

[Figure: failure density, in units of 10^-4, versus time in hours from 1 × 10^3 to 7 × 10^3.]
Hazard Rate vs. Time

[Figure: hazard rate, in units of 10^-4, versus time in hours from 1 × 10^3 to 7 × 10^3.]

Calculations

Time Interval (Hours)    Failures in the interval    Reliability R(t)
0-1000                   100                         200/200 = 1.0
1001-2000                40                          100/200 = 0.5
2001-3000                20                          60/200 = 0.30
3001-4000                15                          ……
4001-5000                10                          ……
5001-6000                8                           ……
6001-7000                7                           7/200 = 0.035
Total                    200
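The reliability column, including the rows the slide omits, follows from the survivor counts. This is a minimal sketch that takes R(t) as the fraction of the original units still surviving at the start of each interval; the unreliability is its complement. Variable names are illustrative.

```python
# Failures observed in each 1000-hour interval (0-1000, 1001-2000, ...).
failures = [100, 40, 20, 15, 10, 8, 7]
n0 = sum(failures)  # 200 units placed on test

reliability = []  # R(t) = n_s / n0 at the start of each interval
survivors = n0
for n_f in failures:
    reliability.append(survivors / n0)
    survivors -= n_f

# Unreliability is the complement of reliability.
unreliability = [1.0 - r for r in reliability]

for k, (r, f) in enumerate(zip(reliability, unreliability)):
    print(f"start of interval {k + 1}: R = {r:.3f}, F = {f:.3f}")
```
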

Reliability vs. Time

[Figure: reliability versus time in hours from 1 × 10^3 to 7 × 10^3.]
