
RELIABILITY AND DIAGNOSTIC

Anna Andonova E-mail: ava@ecad.tu-sofia.bg

Faculty of Electronic Engineering and Technologies

1. NEED OF RELIABILITY IN PRODUCT DESIGN

Factors:
• product complexity
• insertion of reliability-related clauses in design specifications
• competition
• awareness of cost effectiveness
• public demand
• past system failures

1.1. Reliability in the product design process


Reliability tasks:
• establishing the reliability requirements definition
• using reliability design standards/guides/checklists
• allocating reliability
• predicting reliability
• reliability modeling
• monitoring subcontractor/supplier reliability activities
• performing failure modes, effects, and criticality analysis
• monitoring reliability growth
• assessing software reliability
• environmental stress screening
• preparing the critical items list
• performing electronic parts/circuits tolerance analysis

1.2. Terms and definitions


• Reliability
• Failure
• Downtime
• Maintainability
• Redundancy
• Active redundancy
• Availability
• Mean time to failure
• Useful life
• Mission time
• Human error
• Human reliability

2. FAILURE DATA COLLECTION AND ANALYSIS


Sources of failure-related data during the equipment life cycle:
1. Warranty claims
2. Previous experience with similar or identical equipment
3. Repair facility records
4. Factory acceptance testing
5. Records generated during the development phase
6. Customers' failure reporting systems
7. Tests: field demonstration, environmental qualification, and field installation
8. Inspection records generated by quality control/manufacturing groups

• IEEE 500: Guide to the Collection and Presentation of Electrical, Electronic, Sensing Component, and Mechanical Equipment Reliability Data for Nuclear Power Generating Stations
• IEC 362: Guide for the Collection of Reliability, Availability, and Maintainability Data from Field Performance of Electronic Items

Table 1.
Common reasons for product failures
Wrong usage by the consumer Incorrect manufacturing Faulty reasoning

Failure rates for some electronic items, failures/10^6 h


Neon lamp 0,2

Failure rates for some mechanical items, failures/10^6 h


Heat exchanger Nut or bolt Knob (general) 6.11 244.3 0,02 2.082a

Human error rates for some tasks, error rate


Improper servicing/reassembly by the maintenance person Closing valve incorrectly Incorrect adjustment by the maintenance person Procedural error in reading given instructions Reading incorrectly Misunderstanding requirements by operator Installation error gauge of the 0.0153a

Solid state relay (commercial grade) Solid state relay (military specification grade) Single fiber connector Spring connection Terminal connection optic

0,0551a 0,029a

1800b 0,0134a

Poor understanding of the problem to be solved Unsatisfactory data collection Incorrect storage

0,1

Flexible coupling Slip ring (general) Pivot

9,987b

64500b

contact block

0,17a 0,062

0,667 1

5000a 0,0076a

Incorrectly stated problem with respect to basic principles Wrong or overextended assumptions Erroneous data

Crimp connection

0,00026a

Piston

0.0401a

a) Use environment: ground, benign.

a) Use environment: ground, fixed. b) Use environment: ground, mobile.

a) Errors per plant month (for pressurized water reactors). b) Errors per million operations.

2.1. Reliability and maintainability management tasks in the product life cycle

The life cycle phases of a system:

• the concept and definition phase
• the acquisition phase
• the operation and maintenance phase
• the disposal phase

The Concept and Definition Phase


• Defining all terms used, the system capability requirements, management controls, and parts control requirements
• Defining a failure
• Defining the reliability and maintainability goals for the system in quantitative terms
• Defining hardware and software standard documents to be used to fulfill reliability and maintainability requirements
• Defining methods to be followed during the design and manufacturing phase
• Defining constraints proven to be harmful to reliability
• Defining system safety requirements and the basic maintenance philosophy
• Defining data collection and analysis needs during the system life cycle
• Defining management control for documentation and providing necessary documentation
• Defining system environmental factors during its life cycle

The Acquisition Phase


• Define:
  - the system technical requirements
  - major design and development methods to be used
  - documents required as part of the final system
  - type of evaluation methods to assess the system, and demonstration requirements
• Define the reliability and maintainability needs that must be fulfilled
• Define the meaning of a failure or a degradation
• Define the types of reviews to be conducted
• Define the type of data to be supplied by the manufacturer to the customer
• Define the life-cycle cost information to be developed
• Define the kind of field studies to be performed, if any
• Define the kind of logistic support required

The Operation and Maintenance Phase

• Developing failure data banks
• Analyzing reliability and maintainability data
• Providing adequate tools for maintenance
• Providing appropriately trained manpower
• Managing and predicting spare parts
• Developing engineering change proposals
• Preparing maintenance documents
• Reviewing documents in light of any engineering change

The Disposal Phase

• carrying out the activities needed to remove the system and its nonessential supporting parts
• calculating the final life-cycle cost
• calculating the final reliability and maintainability values of the system in question

3. BASIC RELIABILITY EVALUATION AND ALLOCATION TECHNIQUES


The system reliability requirements are specified in the form of:
• failure rate
• mean time between failures (MTBF)
• availability

Well-known reliability evaluation techniques include:
• Block diagram
• Decomposition
• Delta-star
• Markov modeling

Reliability allocation is a top-down process whose basic objective is to translate the system reliability requirement into reliability requirements for individual parts or components.

3.1. Bathtub hazard rate curve

Figure 1.

3.2. General reliability analysis related formulas

Failure density function


(1)

Hazard rate function


(2)

(3)

General reliability function


(4)

since at t = 0, R(t) = 1

(5)

(6)

(7)
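The bodies of the numbered equations in this subsection did not survive in this copy. As a hedged sketch, the standard relations between the failure density f(t), the hazard rate λ(t), and the reliability R(t) are given below; which line corresponds to which equation number is an assumption, and the integration variable x is introduced here only for clarity.

```latex
\begin{align*}
f(t) &= -\frac{dR(t)}{dt} \\
\lambda(t) &= \frac{f(t)}{R(t)} = -\frac{1}{R(t)}\,\frac{dR(t)}{dt} \\
-\lambda(t)\,dt &= \frac{dR(t)}{R(t)} \\
-\int_0^t \lambda(x)\,dx &= \int_1^{R(t)} \frac{dR}{R} = \ln R(t)
  \qquad \text{(lower limit 1 because } R(0) = 1\text{)} \\
R(t) &= \exp\!\left[-\int_0^t \lambda(x)\,dx\right]
\end{align*}
```

For a constant hazard rate λ the last line reduces to R(t) = exp(-λt), which is the form used in the examples that follow.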

Mean time to failure


(8)

(9)

(10)

Example 1: Assume that the failure rate of a microprocessor, λ, is constant. Obtain expressions for the microprocessor reliability and mean time to failure and, using the reliability function, prove that the microprocessor failure rate is constant.

(11)

(12)

(13)

Example 2: Assume that the failure rate of an automobile is 0.0004 failures/h. Calculate the automobile reliability for a 15-h mission and mean time to failure.

(14)

(15)
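The calculation in Example 2 can be checked numerically. The sketch below assumes the constant failure rate model, R(t) = exp(-λt) and MTTF = 1/λ; the function names are illustrative and not part of the original slides.

```python
import math


def reliability(failure_rate, mission_time):
    """R(t) = exp(-lambda * t), valid for a constant failure rate."""
    return math.exp(-failure_rate * mission_time)


def mean_time_to_failure(failure_rate):
    """MTTF = 1 / lambda, valid for a constant failure rate."""
    return 1.0 / failure_rate


automobile_rate = 0.0004  # failures per hour, from Example 2
r_15h = reliability(automobile_rate, 15.0)
mttf_hours = mean_time_to_failure(automobile_rate)

print(f"R(15 h) = {r_15h:.4f}")      # about 0.9940
print(f"MTTF = {mttf_hours:.0f} h")  # 2500 h
```

Under these assumptions the automobile has roughly a 99.4% chance of completing the 15-h mission without failure.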

3.3. Reliability networks

Series network

Fig.2

(16) for independent units


Let Ri = P(Ei) for i = 1, 2, 3, ..., n, where Ri is the reliability of unit i and P(Ei) is the probability of occurrence of event Ei.

(17)

(18)

for constant failure rate, λi, of unit i (19)

(20)

(21)

(22)
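A minimal sketch of the series network relations, assuming independent units with constant failure rates: the system reliability is the product of the unit reliabilities, the system failure rate is the sum of the unit failure rates, and the MTTF is the reciprocal of that sum. The three failure rates and the 100-h mission below are hypothetical values chosen only for illustration.

```python
import math


def series_reliability(unit_reliabilities):
    """Series system: every unit must work, so reliabilities multiply."""
    return math.prod(unit_reliabilities)


def series_failure_rate(unit_failure_rates):
    """With constant unit failure rates, the system rate is their sum."""
    return sum(unit_failure_rates)


def series_mttf(unit_failure_rates):
    """MTTF of a series system with constant unit failure rates."""
    return 1.0 / sum(unit_failure_rates)


# Hypothetical data (not from the slides): three units, 100-h mission.
rates = [0.001, 0.002, 0.003]  # failures per hour
mission = 100.0
unit_r = [math.exp(-lam * mission) for lam in rates]

r_series = series_reliability(unit_r)
print(f"R_s(100 h) = {r_series:.4f}")
print(f"lambda_s = {series_failure_rate(rates):.3f} failures/h")
print(f"MTTF_s = {series_mttf(rates):.1f} h")
```

Multiplying the unit reliabilities gives the same result as exp(-λs t) with λs equal to the summed failure rates, which is why a series system of exponential units is itself exponential.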


Fig.3

Parallel network



(23)

for independent units (24)

Let Fi = P(Ēi) for i = 1, 2, 3, ..., n, where Ēi denotes the failure of unit i.

for identical units (25)

for constant failure rate (26) (27) (28)

Example 2: A system is composed of two independent units in parallel. The failure rates of units A and B are 0.002 failures per hour and 0.004 failures per hour, respectively. Calculate the system reliability for a 50-h mission and mean time to failure.

Let λA be the failure rate of unit A and λB the failure rate of unit B. For n = 2:

(29)
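The parallel example can be worked numerically as follows. This is a sketch assuming independent units with constant failure rates: the system fails only if every unit fails, so R_p = 1 - Π(1 - Ri), and for two non-identical units the MTTF follows from integrating R_p(t), giving 1/λA + 1/λB - 1/(λA + λB). Function names are illustrative.

```python
import math


def parallel_reliability(unit_reliabilities):
    """Parallel system: fails only when all units fail."""
    return 1.0 - math.prod(1.0 - r for r in unit_reliabilities)


def two_unit_parallel_mttf(rate_a, rate_b):
    """Integral of exp(-a t) + exp(-b t) - exp(-(a + b) t) over [0, inf)."""
    return 1.0 / rate_a + 1.0 / rate_b - 1.0 / (rate_a + rate_b)


rate_a, rate_b = 0.002, 0.004  # failures per hour, from the example
mission = 50.0                 # hours

r_a = math.exp(-rate_a * mission)
r_b = math.exp(-rate_b * mission)
r_parallel = parallel_reliability([r_a, r_b])
mttf_parallel = two_unit_parallel_mttf(rate_a, rate_b)

print(f"R_p(50 h) = {r_parallel:.4f}")   # about 0.9827
print(f"MTTF = {mttf_parallel:.2f} h")   # about 583.33 h
```

Note that the MTTF of the pair (about 583 h) exceeds the MTTF of the better unit alone (500 h), which is the benefit of active redundancy.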

r-out-of-n network

The parallel and series networks are special cases of this network for r = 1 and r = n, respectively.

for independent and identical units (30)

(31)

for constant failure rates of units (32)

(33)

Example 3: A computer system has three independent and identical units in parallel. At least two units must work normally for the system success. Calculate the computer system mean time to failure, if the unit failure rate is 0.0004 failures per hour.
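Example 3 is a 2-out-of-3 network. The sketch below assumes independent, identical units with a constant failure rate and uses the binomial form of the r-out-of-n reliability together with the resulting MTTF, (1/λ) Σ 1/i for i = r..n. The function names are illustrative, not taken from the slides.

```python
import math


def r_out_of_n_reliability(r, n, unit_reliability):
    """Probability that at least r of n identical, independent units work."""
    return sum(
        math.comb(n, i) * unit_reliability**i * (1.0 - unit_reliability) ** (n - i)
        for i in range(r, n + 1)
    )


def r_out_of_n_mttf(r, n, failure_rate):
    """MTTF for identical units with a constant failure rate."""
    return sum(1.0 / i for i in range(r, n + 1)) / failure_rate


unit_rate = 0.0004  # failures per hour, from Example 3
mttf_2_of_3 = r_out_of_n_mttf(2, 3, unit_rate)
print(f"MTTF (2-out-of-3) = {mttf_2_of_3:.2f} h")  # about 2083.33 h

# Special cases: r = 1 is a parallel network, r = n is a series network.
print(r_out_of_n_reliability(1, 3, 0.9), 1 - (1 - 0.9) ** 3)
print(r_out_of_n_reliability(3, 3, 0.9), 0.9**3)
```

The 2-out-of-3 MTTF (about 2083 h) is lower than that of a single unit (2500 h), even though its short-mission reliability is higher.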

Standby redundancy

For independent and identical units, perfect switching and standby units, and a time-dependent unit failure rate (34)

For a constant unit failure rate (35) (36)

Example 4: Assume that a standby system has two independent and identical units: one operating, another on standby. The unit failure rate is 0.006 failures per hour. Calculate the system reliability for a 200-h mission and mean time to failure, if the switching mechanism never fails and the standby unit remains as good as new in its standby mode.
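Example 4 can be sketched numerically under the stated assumptions (identical units, constant failure rate λ, perfect switching, standby units as good as new). With K standby units the system survives as long as no more than K failures occur, which gives the Poisson sum R(t) = Σ (λt)^i exp(-λt) / i! for i = 0..K and MTTF = (K + 1)/λ. The function names and the symbol K are assumptions made for this illustration.

```python
import math


def standby_reliability(failure_rate, mission_time, standby_units):
    """Reliability of one operating unit backed by `standby_units` spares."""
    x = failure_rate * mission_time
    poisson_terms = sum(x**i / math.factorial(i) for i in range(standby_units + 1))
    return math.exp(-x) * poisson_terms


def standby_mttf(failure_rate, standby_units):
    """Each of the K + 1 units contributes 1/lambda in turn."""
    return (standby_units + 1) / failure_rate


unit_rate = 0.006  # failures per hour, from Example 4
r_standby = standby_reliability(unit_rate, 200.0, standby_units=1)
mttf_standby = standby_mttf(unit_rate, standby_units=1)

print(f"R(200 h) = {r_standby:.4f}")   # about 0.6626
print(f"MTTF = {mttf_standby:.2f} h")  # about 333.33 h
```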

Bridge network

Fig.4

for independent units (37)

for identical units (38)

for constant failure rates of units (39)

(40)

Example 5: Assume that five independent and identical units form a bridge configuration. The failure rate of each unit is 0.0002 failures per hour. Calculate the configuration reliability for a 500-h mission.
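Example 5 can be evaluated with the well-known closed form for a bridge of five independent, identical units, R_b = 2R^5 - 5R^4 + 2R^3 + 2R^2. Treating this polynomial as the identical-unit bridge equation of the slides is an assumption, since the equation body is missing from this copy; the function name is illustrative.

```python
import math


def bridge_reliability_identical(unit_reliability):
    """Bridge of five independent, identical units (standard closed form)."""
    r = unit_reliability
    return 2 * r**5 - 5 * r**4 + 2 * r**3 + 2 * r**2


unit_rate = 0.0002  # failures per hour, from Example 5
mission = 500.0     # hours

unit_r = math.exp(-unit_rate * mission)  # constant failure rate assumed
r_bridge = bridge_reliability_identical(unit_r)

print(f"Unit reliability = {unit_r:.4f}")      # about 0.9048
print(f"Bridge reliability = {r_bridge:.4f}")  # about 0.9806
```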

Fig.5

3.4. Reliability evaluation methods

Network reduction approach - Example

Decomposition approach

• used to evaluate the reliability of complex systems, which it decomposes into simpler subsystems by applying conditional probability theory
• the system reliability is obtained by combining the subsystems' reliability measures
• the basis of the approach is the selection of the key unit used to decompose a given network
• the efficiency of the approach depends on the selection of the key unit
• first, it is assumed that the key unit, say k, is replaced by another unit that is 100% reliable (never fails)
• next, it is assumed that the key unit k is completely removed from the network or system

(41)

Example: A bridge network of independent and identical units is shown in Figure 6. The letter R in the figure denotes unit reliability. Obtain an expression for the bridge network reliability by using the decomposition method.

Fig.6

(42)

(43)

(44)

(45)

(46)

(47)
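The decomposition of the bridge network can be sketched as follows. The unit numbering is an assumption, because Figure 6 is not available here: units 1 and 2 form the upper path, units 4 and 5 the lower path, and unit 3 is the bridging (key) unit. The result is cross-checked against a brute-force enumeration of all 2^5 unit states.

```python
from itertools import product


def bridge_by_decomposition(r):
    """Condition on the key (bridging) unit being perfect or removed."""
    # Key unit perfect: (1 parallel 4) in series with (2 parallel 5).
    key_perfect = (1 - (1 - r) ** 2) ** 2
    # Key unit removed: path 1-2 in parallel with path 4-5.
    key_removed = 1 - (1 - r**2) ** 2
    return r * key_perfect + (1 - r) * key_removed


def bridge_by_enumeration(r):
    """Sum the probabilities of all unit states in which a path exists."""
    total = 0.0
    for u1, u2, u3, u4, u5 in product((True, False), repeat=5):
        works = (
            (u1 and u2) or (u4 and u5)
            or (u1 and u3 and u5) or (u4 and u3 and u2)
        )
        if works:
            up = sum((u1, u2, u3, u4, u5))
            total += r**up * (1 - r) ** (5 - up)
    return total


for r in (0.5, 0.8, 0.95):
    print(r, bridge_by_decomposition(r), bridge_by_enumeration(r))
```

Both functions agree with the polynomial 2R^5 - 5R^4 + 2R^3 + 2R^2, which suggests the bridging unit is a sensible choice of key unit: each conditional network is a plain series-parallel structure.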

Delta-star method

• the simplest and a very practical approach to evaluating the reliability of bridge networks
• transforms a bridge network into its equivalent series and parallel form
• the transformation process introduces a small error in the end result

Fig.7

(48) (49) (50) (51) (52) (53) (54) (55) (56)

Example: A bridge network of five independent units with specified unit reliabilities Ri, for i = a, b, c, d, and e, is shown in Figure 8. Calculate the network reliability by using the delta-star method, and also use the specified data in Equation (38) to obtain the bridge network reliability. Compare both results.

Fig.9 Fig.8
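A hedged sketch of the delta-star comparison follows. The node names (S, U, L, T), the unit placement, and the unit reliability of 0.8 are assumptions made for illustration, since Figures 8 and 9 and the specified data are not available in this copy. The transformation equates, for each pair of delta nodes, the reliability of the direct unit in parallel with the two-unit detour to the product of the two star legs joining the same nodes.

```python
import math


def delta_to_star(r_su, r_sl, r_ul):
    """Return star leg reliabilities (leg_s, leg_u, leg_l) for a delta S-U-L."""
    a_su = 1 - (1 - r_su) * (1 - r_sl * r_ul)  # equals leg_s * leg_u
    a_sl = 1 - (1 - r_sl) * (1 - r_su * r_ul)  # equals leg_s * leg_l
    a_ul = 1 - (1 - r_ul) * (1 - r_su * r_sl)  # equals leg_u * leg_l
    leg_s = math.sqrt(a_su * a_sl / a_ul)
    leg_u = math.sqrt(a_su * a_ul / a_sl)
    leg_l = math.sqrt(a_sl * a_ul / a_su)
    return leg_s, leg_u, leg_l


def bridge_by_delta_star(r_su, r_sl, r_ul, r_ut, r_lt):
    """Replace the delta S-U-L by a star, then reduce series/parallel."""
    leg_s, leg_u, leg_l = delta_to_star(r_su, r_sl, r_ul)
    upper = leg_u * r_ut  # star leg to U in series with unit U-T
    lower = leg_l * r_lt  # star leg to L in series with unit L-T
    return leg_s * (1 - (1 - upper) * (1 - lower))


def bridge_exact_identical(r):
    """Closed form for a bridge of five identical, independent units."""
    return 2 * r**5 - 5 * r**4 + 2 * r**3 + 2 * r**2


r = 0.8  # hypothetical unit reliability, identical for all five units
approx = bridge_by_delta_star(r, r, r, r, r)
exact = bridge_exact_identical(r)
print(f"delta-star: {approx:.4f}, exact: {exact:.4f}")
```

With these assumed values the two results differ only in the third decimal place, which illustrates the small error the transformation introduces.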

Parts count method

• a very practical method used during bid proposal and early design phases
• the information required to use this method includes generic part types and quantities, part quality levels, and equipment use environment

Under a single use environment, the equipment failure rate can be estimated by

(57)
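The parts count estimate can be sketched as a weighted sum. The form assumed here is the usual MIL-HDBK-217 parts count expression, in which the equipment failure rate is the sum over generic part types of quantity times generic failure rate times quality factor. The part list and all numerical values below are hypothetical and serve only to show the arithmetic.

```python
def parts_count_failure_rate(parts):
    """Sum of quantity * generic failure rate * quality factor per part type."""
    return sum(
        p["quantity"] * p["generic_rate"] * p["quality_factor"] for p in parts
    )


# Hypothetical bill of materials; rates are in failures per 10^6 hours.
parts = [
    {"name": "resistor", "quantity": 20, "generic_rate": 0.002, "quality_factor": 1.0},
    {"name": "capacitor", "quantity": 10, "generic_rate": 0.004, "quality_factor": 3.0},
    {"name": "integrated circuit", "quantity": 5, "generic_rate": 0.05, "quality_factor": 2.0},
]

equipment_rate = parts_count_failure_rate(parts)
print(f"Equipment failure rate = {equipment_rate:.2f} failures per 10^6 h")
```

In practice the generic failure rates depend on the use environment, so equipment operating in several environments would need one such sum per environment.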

Failure rate estimation of an electronic part

MIL-HDBK-217 is used to estimate the failure rate of electronic parts. This gives a better picture of the actual failure rate of the equipment under consideration than the one obtained through Equation (57). An equation of the following form is used to estimate the failure rates of many electronic parts:

(58)

(59)

Markov method

• a powerful reliability analysis tool
• quite useful for modeling systems with dependent failure and repair modes
• widely used to model repairable systems with constant failure and repair rates
• breaks down for a system having time-dependent failure and repair rates
• a problem may occur in solving a set of differential equations for large and complex systems

The following assumptions are associated with the Markov approach:
• All occurrences are independent of each other.
• The probability of transition from one system state to another in the finite time interval Δt is given by λΔt, where λ is the transition rate (i.e., failure or repair rate) from one system state to another.
• The probability of more than one transition occurrence in time interval Δt from one state to another is very small or negligible (e.g., (λΔt)(λΔt) → 0).

Example: Assume that an engineering system can be in either an operating or a failed state. It fails at a constant failure rate, λ, and is repaired at a constant repair rate, μ. The system state space diagram is shown in Figure 10. The numerals in the box and circle denote the system state. Obtain expressions for the system time-dependent and steady state availabilities and unavailabilities by using the Markov method.
(60) (61)

Fig.10

(62)

(63)

(64)

(65)

(66)

(67)

The system steady state availability and unavailability can be obtained by using any of the following three approaches:

Approach I: Letting time t go to infinity in Equations (66) and (67), respectively.

Approach II: Setting the derivatives of Equations (62) and (63) equal to zero, then discarding any one of the resulting two equations and replacing it with P0 + P1 = 1. The solutions to the resulting equations are the system steady state availability (i.e., A = P0) and unavailability (i.e., UA = P1).

Approach III: Taking Laplace transforms of Equations (62) and (63) and then solving them for P0(s), the Laplace transform of the probability that the system is in the operating state at time t, and P1(s), the Laplace transform of the probability that the system is in the failed state at time t. Multiplying P0(s) and P1(s) by the Laplace transform variable s and then letting s go to zero in sP0(s) and sP1(s) yields the system steady state availability (i.e., A = P0) and unavailability (i.e., UA = P1), respectively.

Applying Approach I to Equations (66) and (67):
(68)

(69)

(70)

(71)
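The two-state Markov example can be checked numerically. The sketch below assumes the standard state equations dP0/dt = -λP0 + μP1 and dP1/dt = λP0 - μP1 with P0(0) = 1, whose well-known solution is P0(t) = μ/(λ+μ) + [λ/(λ+μ)] exp(-(λ+μ)t). The rates used are hypothetical, and a simple Euler integration is included only as an independent cross-check of the closed form.

```python
import math


def availability(lam, mu, t):
    """Time-dependent availability P0(t) of the two-state model."""
    total = lam + mu
    return mu / total + (lam / total) * math.exp(-total * t)


def steady_state_availability(lam, mu):
    """Limit of P0(t) as t goes to infinity (Approach I)."""
    return mu / (lam + mu)


def euler_availability(lam, mu, t_end, dt=0.001):
    """Integrate the state equations step by step, starting in state 0."""
    p0, p1 = 1.0, 0.0
    for _ in range(int(round(t_end / dt))):
        dp0 = -lam * p0 + mu * p1
        dp1 = lam * p0 - mu * p1
        p0 += dp0 * dt
        p1 += dp1 * dt
    return p0


lam, mu = 0.01, 0.1  # hypothetical failure and repair rates, per hour
print(f"A(50 h), closed form = {availability(lam, mu, 50.0):.5f}")
print(f"A(50 h), Euler       = {euler_availability(lam, mu, 50.0):.5f}")
print(f"Steady state A       = {steady_state_availability(lam, mu):.5f}")
```

The unavailability is simply 1 - A in each case, because the two state probabilities always sum to one.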

3.5. Reliability allocation

Reliability allocation is the process of assigning reliability requirements to individual parts or components to achieve the specified system reliability. The reliability allocation problem is not simple and straightforward but quite complex. Some of the associated reasons are as follows:
• the role the component plays in the operation of the system
• component complexity
• the chargeable component reliability with the type of function to be conducted
• the approaches available for accomplishing the given allocation task
• lack of detailed information on many of the above factors in the early design phase

Benefits:
• the relationships between reliabilities of components, subsystems, and systems are clearly understood and developed
• reliability is seriously considered equally with other design parameters such as performance, weight, and cost
• satisfactory design, manufacturing approaches, and test methods are ensured

Hybrid method

• the result of combining two approaches: the similar familiar systems method and the factors of influence method
• the similar familiar systems reliability allocation approach:
  - relies on familiarity with similar systems or subsystems
  - assumes that the reliability and life cycle cost of previous similar designs were adequate
• the factors of influence method is based upon the following factors:
  - Complexity/Time
  - Failure criticality
  - Environment
  - State-of-the-art
• the hybrid method is better than either the similar familiar systems method or the factors of influence method used alone

Failure rate allocation method

• concerned with allocating failure rates to system components when the required system failure rate is known
• assumptions associated with this method:
  - system components form a series configuration
  - system components fail independently
  - time to component failure is exponentially distributed

Using Equation (22):
(72)

(73)

The following steps are associated with this method:

1. Estimate the component failure rates λi for i = 1, 2, 3, ..., n, using past data.
2. Calculate the relative weight, θi, of component i using the failure rate data from the preceding step and the following equation:

(74)

(75)

3. Allocate failure rate to component i using the following relationship:

(76)
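The three steps above can be sketched as follows. The sketch assumes that the relative weight of a component is its estimated failure rate divided by the sum of all estimated failure rates, and that the allocated failure rate is that weight times the required system failure rate; the estimated rates and the requirement used below are hypothetical.

```python
def allocate_failure_rates(estimated_rates, required_system_rate):
    """Split a required series-system failure rate in proportion to past data."""
    total = sum(estimated_rates)
    # Step 2: relative weight of each component (weights sum to 1).
    weights = [rate / total for rate in estimated_rates]
    # Step 3: allocated failure rate of each component.
    allocated = [w * required_system_rate for w in weights]
    return weights, allocated


# Step 1 (hypothetical past data): estimated component failure rates per hour.
estimated = [0.002, 0.003, 0.005]
required = 0.005  # hypothetical required system failure rate per hour

weights, allocated = allocate_failure_rates(estimated, required)
for i, (w, a) in enumerate(zip(weights, allocated), start=1):
    print(f"component {i}: weight = {w:.2f}, allocated rate = {a:.4f} failures/h")
```

Because the components are in series with exponentially distributed times to failure, the allocated rates add up to the required system failure rate.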

