Reliability Prediction Methods: An Overview
The Journal of the Reliability Analysis Center
Third Quarter 1999
Introduction
The intent of this article is to provide a basic overview of exist-
ing reliability prediction methods and some of the issues sur-
rounding their use. The applicability of various methods will be
discussed and a new Reliability Analysis Center methodology
called "PRISM" will be introduced and compared to existing
methods. Mr. William Denson presents a comprehensive
overview of PRISM elsewhere in this Journal.
Purpose of Predictions
Reliability predictions serve several purposes, the primary ones being:
· Evaluating feasibility, that is, determining whether design
reliability goals can be met
· Comparing competing designs
· Identifying potential reliability problems, such as
overstressed or misapplied parts
· Providing input to other reliability and maintainability tasks
about the relative failure frequency of subsystems and
components
Empirically-based tools have commonly been used for electron-
ic system-level reliability predictions; however, some critics
argue that physics-of-failure methods offer a better alternative.
MIL-HDBK-217: Current Status
The empirically based MIL-HDBK-217, Reliability Prediction
of Electronic Equipment, has been widely used over the last three
decades to predict the reliability of most military and some
commercial electronic systems. Until Department of Defense
acquisition reform in the mid-1990s, it was contractually
required on many U.S. Government contracts developing new
electronic systems. Although it remains an active Department of
Defense handbook, no effort has been made in recent years to
update or maintain the methodology, because a major Air
Force reorganization merged the former Rome Laboratory
into the newly formed Air Force Research Laboratory. With
this realignment came the total elimination of all reliability
research that took place at the former Rome Laboratory.
Since the stand-up of the new Air Force
Research Laboratory, preparing activity
responsibility for MIL-HDBK-217 (and sev-
eral other reliability specifications and stan-
dards) has been in a state of limbo, with no
other Government organization willing to
take on the unfunded task of maintaining the
handbook.
Other Empirical Methods
In addition to MIL-HDBK-217, there are a
number of other empirically-based methods
with varying degrees of similarity to MIL-
HDBK-217 models. Some of the models are
unique to a given industry, such as telecom-
munications or automotive. These methods include (Ref. 3-6):
· Bellcore
· British Telecom HRD-5
· Nippon Telegraph and Telephone
· CNET (French)
Physics-of-Failure versus Empirical Methods
During the late 1980s and throughout the 1990s, empirically
based reliability prediction methods in general, and
MIL-HDBK-217 in particular, came under heavy criticism
(Ref. 1) as being inadequate. The criticisms have centered
on the following issues:
· Method is not a good indicator of field reliability
· Temperature cycling is not accounted for
· Method does not reflect new manufacturing trends
· Method does not differentiate good quality and design
practices
· System level factors that influence reliability are penalized
(e.g., transient protection circuits)
· Method is not "science-based"
To replace existing methods, the critics have called for a
more science-based physics-of-failure method to be used.
Historically, this approach has been used for mechanical stress
analysis to ensure a design is of sufficient durability to exceed
the required product life. However, empirically-based methods
have been widely favored for overall electronic system reliability
predictions. This is because of the complexity of electronic systems,
which tend to fail for causes and at times that are impossible
to foresee. Most components selected during a product
design effort have already been designed, tested, fully character-
ized, and are being manufactured on established lines by ven-
dors such as Texas Instruments, Motorola, Intel, etc.; their fun-
damental device design is "perfect" for the useful life of most
electronic systems. This implies that all life limiting failure
mechanisms far exceed the useful operating life of a typical elec-
tronic system, leaving only latent manufacturing defects, com-
ponent variability and misapplication to cause field failure.
Figure 1: Bathtub Curve
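The shape named in the Figure 1 caption can be sketched numerically. The block below is a minimal illustration, assuming the common textbook construction in which the bathtub curve is the sum of a decreasing early-failure hazard, a constant useful-life hazard, and an increasing wear-out hazard. The Weibull form and every parameter value are assumptions made for illustration; none of them comes from this article.

```python
def weibull_hazard(t_hours, shape, scale_hours):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale_hours) * (t_hours / scale_hours) ** (shape - 1.0)


def bathtub_hazard(t_hours):
    """Total hazard rate (failures per hour) as the sum of three regimes.

    All parameter values are hypothetical and chosen only to give the
    familiar shape; they do not describe any real product.
    """
    # shape < 1 gives a hazard that decreases with time (early failures)
    infant_mortality = weibull_hazard(t_hours, shape=0.5, scale_hours=1.0e9)
    # constant "floor" of the curve during useful life
    useful_life = 2.0e-6
    # shape > 1 gives a hazard that increases with time (wear-out)
    wear_out = weibull_hazard(t_hours, shape=5.0, scale_hours=2.0e5)
    return infant_mortality + useful_life + wear_out


for t in (10, 1_000, 50_000, 150_000, 300_000):
    print(f"t = {t:>7} h   hazard = {bathtub_hazard(t):.2e} failures/h")
```

With these assumed values the hazard falls from its early level, flattens near the constant term, and rises again once the wear-out term dominates.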
By Seymour Morris, Reliability Analysis Center
Where one stands on the "perfect component" premise greatly
affects how one views the application of physics-of-failure as a
tool for quantifying system reliability. The differences between
the two approaches are depicted on the classic bathtub curve
shown in Figure 1. Empirical methods predict a failure rate, λ,
for failures that occur at random times during a system's useful
life and are caused by manufacturing defects, assembly
errors, component variability, and customer use variations.
Physics-of-failure methods predict when a single specific failure
mechanism will occur for an individual component due to
wear-out (end-of-life). They assume that the "floor" of the bathtub
curve approaches zero and that system reliability (life) is defined by
the "weakest link" in a multitude of competing failure mechanisms.
For system level reliability predictions, physics-of-failure
is usually not a practical approach since there are numerous
unique failure mechanisms (e.g., electromigration, solder joint
cracking, die bond adhesion strength,
die cracking, bond wire corrosion, etc.)
per electronic device, each requiring a
fairly complex modeling approach. This
process requires very detailed knowledge
of all device material characteristics,
geometry, and applications which may
be unavailable to system designers, or
which may be proprietary.
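The contrast drawn above between the two approaches can be made concrete with a small sketch. It assumes the constant failure rate model that underlies empirical methods, in which part failure rates add to give a system failure rate, and it treats the physics-of-failure view as picking the shortest predicted wear-out life. The part list, quantities, failure rates, and wear-out lives are all hypothetical values invented for illustration.

```python
import math

# Hypothetical parts list: (name, quantity, failure rate per part in
# failures per million hours). Values are illustrative only.
PARTS = [
    ("microcircuit", 10, 0.05),
    ("resistor", 100, 0.002),
    ("capacitor", 40, 0.005),
    ("connector", 4, 0.025),
]

# Hypothetical predicted times to wear-out, in hours, per failure mechanism.
WEAR_OUT_LIFE_HOURS = {
    "electromigration": 4.0e5,
    "solder joint cracking": 1.5e5,
    "bond wire corrosion": 6.0e5,
}


def system_failure_rate_fpmh(parts):
    """Empirical view: sum quantity * rate over all parts (series system)."""
    return sum(quantity * rate for _name, quantity, rate in parts)


def mtbf_hours(failure_rate_fpmh):
    """Mean time between failures for a constant failure rate."""
    return 1.0e6 / failure_rate_fpmh


def reliability(failure_rate_fpmh, mission_hours):
    """Probability of no failure over the mission: R(t) = exp(-lambda * t)."""
    return math.exp(-failure_rate_fpmh * 1.0e-6 * mission_hours)


def weakest_link(wear_out_lives):
    """Physics-of-failure view: the mechanism with the shortest life governs."""
    return min(wear_out_lives.items(), key=lambda item: item[1])


total = system_failure_rate_fpmh(PARTS)
print(f"System failure rate: {total:.3f} failures per million hours")
print(f"MTBF: {mtbf_hours(total):,.0f} hours")
print(f"R(10,000 h): {reliability(total, 10_000):.4f}")
mechanism, life = weakest_link(WEAR_OUT_LIFE_HOURS)
print(f"Life-limiting mechanism: {mechanism} at {life:,.0f} hours")
```

The first calculation needs only a parts list and per-part rates, while the second needs a life model for every mechanism in every device, which is the practical obstacle the paragraph describes.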
A physics-of-failure type of analysis is
most beneficial at the device level where
the device design can be influenced, such
as with newly developed hybrid circuits,
or in investigating an established prob-
lem. It should be considered as a com-
plementary approach to evaluate product
reliability based on part selection,
applied stress, historical part defect rates
and system application. Table 1 summa-
rizes the two approaches and when each
may be appropriate for use.
PRISM: A New Methodology
PRISM is the name of a new empirically-based Reliability Analysis
Center software suite that ties together several tools in a
comprehensive reliability prediction methodology that addresses
many of the shortcomings of traditional methods. The PRISM
concept is to try to account for the myriad of factors that can
influence system-level reliability. The name PRISM comes from
the concept of "focusing" all these factors into a single system
reliability assessment.
The premise of traditional
methods is that the failure rate is primarily determined by com-
ponents comprising the system. As critics of MIL-HDBK-217
and similar methods have often pointed out, increased system
complexity and component quality have resulted in a shift of
system failure causes away from components to more "system-
level" factors including manufacturing processes, design, system
requirements, interface, software and mechanical problems.
Historically, these factors have not been explicitly addressed in
traditional prediction methods. PRISM provides a framework
to address many of these factors, including:
· Electronic part failure rate models ("RACRates") for many
component types. These models include factors for operating
temperature, thermal cycling, power stress, manufacturing
maturity, dormancy, duty cycle, humidity, and other part
application characteristics.
· Electronic parts reliability data (EPRD) and non-electronic
parts reliability data (NPRD). These modules represent 26.1
[Figure 2 plot labels: MTBF (vertical axis) versus Time (horizontal axis); curves: Upper Conf. Level, "True MTBF", Lower Conf. Level; data stages: Paper Analysis, Factory Testing, Initial Field Deployment Data, Field Data]
Table 1: Reliability Prediction Application

Use Empirical Methods:
· When reliability estimates are performed for large, complex products
· When reliability estimates are developed on a quick-turnaround basis
· When there is a need to estimate the relative merits of competing designs
· When there is no way to change the fundamental design of the components
· When the only design flexibility is to select different components or limit applied component stresses

Use Physics-of-Failure Methods:
· When a detailed understanding of life-limiting failure mechanisms is needed
· When new component technologies need to be assessed and no historical data exists
· For detailed component design prior to life testing and qualification
· When design flexibility exists at the component level
· To investigate the root cause of a failure
Figure 2: Prediction Refinement
[Figure labels: Empirical Methods; Deterministic Method]
Table 2: Reliability Prediction Method Characteristics

| Characteristics | PRISM | Physics-of-Failure | MIL-HDBK-217 | Bellcore | British Telecom | Nippon Telegraph and Telephone | French (CNET) |
|---|---|---|---|---|---|---|---|
| Predicts operating part failure rates for a population of devices | Yes | No | Yes | Yes | Yes | Yes | Yes |
| Predicts non-operating part failure rates for a population of devices | Yes | No | No | No | No | No | No |
| Failure rate models account for industry improvement trends (growth factor) | Yes | No | No | No | No | No | Yes |
| Predicts end-of-life | No | Yes | No | No | No | No | No |
| Constant failure rate assumption - Part failure rates can be summed | Yes | No | Yes | Yes | Yes | Yes | Yes |
| Number of environmental categories addressed | Infinite. Can specify temperature extremes, cycling rate and absolute temperatures | Infinite | 14 | 3 | 3 | 3 | 19 |
| Electronic part categories covered | Most with RACRate models. Very extensive with EPRD and NPRD data interface | Not specific. Concept applies to all. | Extensive | Most | Most | Most | Extensive |
| Operating temperature considered in models | Yes | Some | Yes | Yes | Yes, >70°C, microcircuits only | Yes | Yes |
| Temperature cycling explicitly considered in models | Yes | Some | No | No | No | No | No |
| Electrical stress considered in models | Yes | Some | Yes | Yes | No | Yes | Yes |
| Manufacturing control/screening level considered | Yes, through system level process grading factors | No | Yes | Yes | Yes, microcircuits and discrete semiconductors only | Yes | Yes |
| Parts count method available | Yes | No | Yes | Yes | No | No | Yes |
| Parts stress method available | Yes | Models address stress | Yes | Yes | Yes | Yes | Yes |
| Addresses system level design and manufacturing factors | Yes | Some | No | No | No | No | No |
| Provides approach to analyze field data | No | Yes | No | Yes | No | No | No |
| Software addressed | Yes | No | No | No | No | No | No |
| Mechanical parts addressed | Yes | Not specific. Concept applies to all. | No | No | No | No | No |
In Memoriam
David F. Barber
Sept. 12, 1922 - June 28, 1999
The reliability community mourns the loss of David F. Barber, who died of a heart attack on June 28, 1999. From 1962 to his retire-
ment in 1979, Dave was Chief of the Reliability Branch for the Rome Air Development Center (now the Information Directorate of
the Air Force Research Laboratory), where he directed the leading DoD research and development activity in reliability and main-
tainability. He served on various high level government committees, such as the Weapons Systems Effectiveness Industry Advisory
Committee, and was long involved in reliability conferences such as the Reliability Physics Symposium and the Annual Reliability
and Maintainability Symposium. Over the years, he was involved in virtually every conference management position, including
General Chairman of both symposiums.
Born in Utica, NY, in 1922, Dave earned a BA in mathematics from Hamilton College and an MS in meteorology from MIT. He
served as an Air Force Weather Officer from 1942 to 1947. He was an Atmospheric Research Scientist at the Air Force Cambridge
Research Laboratories from 1948 to 1951, a Research and Development Meteorologist at RADC from 1951 to 1953, then director
of various classified programs until 1962, when he was selected as Chief of the Reliability Branch.
Always a strong supporter of his community, Dave served 12 years on the Adirondack Central School Board. He was a past
President of the Oneida-Madison-Herkimer Counties School Board Association, and a past Director of the New York State School
Boards Association. He was a member of the Rome, NY, Rotary Club for 15 years.
On his retirement, Dave used his experience in organizing and managing conferences to create his own consulting company, Scien-
Tech Associates, which has continued to serve the reliability community, helping stage both the Reliability Physics Symposium and
the Annual Reliability and Maintainability Symposium, among others.
Dave was an accomplished golfer. Besides playing, he sometimes reported on major area tournaments for local radio and, once,
served as a substitute announcer for CBS Sports. He was playing golf when he was stricken.
Dave was gregarious and friendly, always willing to help. His contributions will be greatly missed by all. Those who knew him will
miss his camaraderie even more.
and 2.5 trillion part hours of RAC data for electronic and
mechanical parts, respectively. They provide a means to
include devices that are not otherwise addressed by most exist-
ing prediction methods.
· An assessment of processes used in the design and manufacture
of the system, including factors contributing to the following
failure causes: parts, design, manufacturing, system manage-
ment, induced, wearout, no defect found and software. These
processes are graded with a metric that corresponds to the
degree to which actions have been taken to mitigate the occur-
rence of system failure due to these failure causes.
· A software reliability prediction methodology and a means to
factor software into the overall reliability prediction estimate.
· A methodology to refine the reliability estimate as additional
relevant data become available. The approach uses
Bayesian data combination to dynamically adjust the reliability
estimate over time, as shown in Figure 2.
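The last item in the list above, refining a prediction as data accumulate, can be illustrated with a short sketch. The article does not give PRISM's actual algorithm, so the block below substitutes the standard gamma-Poisson conjugate update for a constant failure rate. The function names, the "equivalent hours" weight given to the initial prediction, and all test and field figures are hypothetical.

```python
import math


def gamma_prior_from_prediction(predicted_rate_per_hour, equivalent_hours):
    """Build a Gamma(shape, rate) prior whose mean equals the predicted rate.

    equivalent_hours is an assumed weight: the amount of operating time the
    paper prediction is treated as being worth.
    """
    shape = predicted_rate_per_hour * equivalent_hours
    return shape, equivalent_hours


def update(prior, failures, hours):
    """Conjugate update: add observed failures and observed operating hours."""
    shape, rate = prior
    return shape + failures, rate + hours


def mean_rate(gamma_params):
    """Posterior mean failure rate in failures per hour."""
    shape, rate = gamma_params
    return shape / rate


def relative_spread(gamma_params):
    """Coefficient of variation of a gamma distribution: 1 / sqrt(shape)."""
    shape, _rate = gamma_params
    return 1.0 / math.sqrt(shape)


# Hypothetical stages, loosely following the labels in Figure 2.
estimate = gamma_prior_from_prediction(5.0e-5, 20_000.0)
stages = [
    ("Factory testing", 1, 10_000.0),
    ("Initial field deployment", 2, 70_000.0),
    ("Field data", 10, 400_000.0),
]

print(f"Paper analysis: MTBF ~ {1.0 / mean_rate(estimate):,.0f} h, "
      f"relative spread {relative_spread(estimate):.2f}")
for label, failures, hours in stages:
    estimate = update(estimate, failures, hours)
    print(f"{label}: MTBF ~ {1.0 / mean_rate(estimate):,.0f} h, "
          f"relative spread {relative_spread(estimate):.2f}")
```

The MTBF shown is simply the reciprocal of the posterior mean failure rate. The relative spread shrinks as failures and hours accumulate, which mirrors the narrowing confidence levels labeled in Figure 2.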
Summary
Table 2 provides a high level summary of commonly used empir-
ically-based reliability prediction tools, along with the new
PRISM methodology. Also shown for comparison purposes is the
physics-of-failure approach for component life determination.
References:
1. Morris, S.F. and J.F. Reilly, "MIL-HDBK-217 - A Favorite Target," 1993 IEEE Reliability and Maintainability Symposium Proceedings, pp. 503-509, 1993.
2. MIL-HDBK-217, "Reliability Prediction of Electronic Equipment," 1995.
3. TR-TSY-000332, "Reliability Prediction Procedure for Electronic Equipment," Issue 2, 1988.
4. "Handbook of Reliability Data for Components Used in Telecommunications Systems," British Telecom, Issue 4, 1987.
5. Recueil de Données de Fiabilité du CNET (Collection of Reliability Data from CNET), Centre National d'Études des Télécommunications (National Center for Telecommunications Studies), 1983.
6. Standard Reliability Table for Semiconductor Devices, Nippon Telegraph and Telephone Corporation, 1986.
7. Bowles, J. B., "A Survey of Reliability-Prediction Procedures For Microelectronic Devices," IEEE Transactions on Reliability, Vol. 41, No. 1, 1992.