RAC is a DoD Information Analysis Center Sponsored by the Defense Technical Information Center and Operated by IIT Research Institute
The Journal of the Reliability Analysis Center
Fourth Quarter - 2001

INSIDE
Real-Time Prognostic Condition-Based Maintenance for High Value Systems
Opinion: The Military Services Still Rely on Reliability
Industry News
What's New From the RAC?
Future Events
Tribute
Introduction
The third in a series, this article overviews sever-
al statistical procedures frequently used in relia-
bility modeling and data analysis [1, 2, and 3] and
illustrates the underlying philosophy. Although
statistical "how-tos" are well explained in many
excellent sources [4, 5, 6, and 7], the "whys" and
the basis for them usually are not [8].
The first article discussed random variables (RV)
and their distributions and parameters. The second
article discussed some problems dealing with the
estimation and testing of unknown distribution
parameters based on a random sample. In this third
article, we apply some of these concepts to
modeling and data analysis.
Statistical models are used in reliability because of
the inherent variation in empirical data and the
definition of reliability. Engineers work with data
obtained from, hopefully, random samples. They
need to understand and take advantage of inherent
and unavoidable variability; statistics is the
science that studies variability. In addition, the
concept of reliability is wholly probabilistic.
Statistics and reliability are inextricably interwoven.
In this article, we discuss several statistical mod-
eling procedures. We first discuss data analysis
principles and their practical implementation. For
example, the distribution of data must first be
established, whether the data come from a single
sample or from two or more batches. Then, the
data are tested for potential outliers. We will see
how outliers may be removed from the sample, if
necessary. If there are two or more batches, we
assess whether these can be pooled together (i.e.,
if they come from the same population), or if the
data analysis results must be obtained separately
for each individual batch. Finally, having satis-
factorily determined the underlying distribution,
we apply the appropriate statistical models
(regression, ANOVA) to analyze our data, accord-
ing to our needs and objectives.
Establishing the Underlying
Distribution and Parameters
Any (parametric) statistical result obtained from a
data set depends on the specific distribution
assumed, as well as on the parameters estimated
for the data set in question. It is therefore
important to establish both the underlying
population distribution and its corresponding
parameters. If there is any serious estimation
error in this initial step, everything else that we
do will be wrong, since it will be based on this
initial result.
In the first article of this series, we saw how F(x),
the Cumulative Distribution Function (CDF), and
f(x), the probability density function (pdf), are
related by the equation
$F(x) = \int_{-\infty}^{x} f(t)\,dt$. For the
exponential case, for example:
$$F_\theta(x) = \int_0^x f(t)\,dt = \int_0^x \frac{1}{\theta} \exp(-t/\theta)\,dt = 1 - \exp(-x/\theta)$$
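The pdf-to-CDF relation can be checked numerically. The following is a minimal sketch, assuming an exponential distribution with a hypothetical mean life of 100 hours; the function names are illustrative and do not come from the article. It integrates the pdf from 0 to x with the trapezoid rule and compares the result with the closed form 1 - exp(-x/θ).

```python
import math

def exp_pdf(t, theta):
    """Exponential pdf with mean theta: f(t) = (1/theta) * exp(-t/theta)."""
    return math.exp(-t / theta) / theta

def exp_cdf(x, theta):
    """Closed-form exponential CDF: F(x) = 1 - exp(-x/theta)."""
    return 1.0 - math.exp(-x / theta)

def cdf_by_integration(x, theta, steps=10_000):
    """Approximate F(x) by integrating the pdf from 0 to x (trapezoid rule)."""
    h = x / steps
    total = 0.5 * (exp_pdf(0.0, theta) + exp_pdf(x, theta))
    total += sum(exp_pdf(i * h, theta) for i in range(1, steps))
    return total * h

theta = 100.0  # hypothetical mean life, in hours (an assumption)
x = 150.0      # hypothetical mission time, in hours (an assumption)
closed = exp_cdf(x, theta)
numeric = cdf_by_integration(x, theta)
print(closed, numeric)  # the two values should agree to several decimals
```

The agreement of the two values illustrates that the CDF is simply the accumulated area under the pdf.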
Two types of Goodness of Fit (GoF) tests can be
used to assess the underlying statistical
distribution of the data [8]. Both of these GoF
methods test whether a completely specified
distribution Fθ(x) (one whose parameters are all
known) fits a data set. Such a composite GoF
hypothesis (H0) has several parts: it specifies the
hypothesized distribution function as well as its
parameters.
Statistical Analysis of Reliability Data, Part 3: On
Statistical Modeling of Reliability Data
By: Jorge Luis Romeu, IIT Research Institute

One of the two types compares the actual (observed) number of
sample points with the corresponding expected number, obtained
under the hypothesized pdf, for several data subintervals. An
example of this type is the Chi Square GoF test [6]. The basis for
the Chi Square test can be better understood by superimposing by
eye the hypothesized pdf over the histogram of the data and
assessing how closely they agree. In effect, the Chi Square
procedure apportions the n data points to the data subintervals
according to the proportion of area that the pdf exhibits over each
subinterval. These expected values are then compared with the
actual (observed) numbers of data points in the subintervals. If
the results are close, the fit is acceptable.
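The observed-versus-expected comparison can be sketched in a few lines. This is an illustration only, assuming a fully specified exponential distribution with a hypothetical mean of 100 hours, a made-up sample of 20 failure times, and four subintervals of equal probability; none of these values come from the article.

```python
import math

def exp_cdf(x, theta):
    """Exponential CDF with mean theta."""
    return 1.0 - math.exp(-x / theta)

def chi_square_statistic(data, edges, cdf):
    """Chi Square GoF statistic: sum of (observed - expected)^2 / expected.

    edges holds the left boundary of each subinterval, in increasing
    order; the last subinterval extends to infinity.
    """
    n = len(data)
    stat = 0.0
    for i, lo in enumerate(edges):
        last = (i == len(edges) - 1)
        hi = math.inf if last else edges[i + 1]
        observed = sum(1 for x in data if lo <= x < hi)
        # Expected count = n times the pdf area over the subinterval.
        area = (1.0 if last else cdf(hi)) - cdf(lo)
        expected = n * area
        stat += (observed - expected) ** 2 / expected
    return stat

theta = 100.0  # hypothesized mean life (hypothetical value)
# Four subintervals, each with probability 0.25 under the hypothesis.
edges = [0.0] + [-theta * math.log(1.0 - p) for p in (0.25, 0.50, 0.75)]
# Hypothetical failure times, in hours.
times = [3, 5, 9, 12, 18, 25, 27, 33, 41, 52,
         60, 68, 75, 90, 104, 120, 150, 175, 210, 340]
chi2 = chi_square_statistic(times, edges, lambda x: exp_cdf(x, theta))
print(round(chi2, 3))  # observed counts 7, 5, 4, 4 against 5 expected each
```

The statistic is then compared with a tabulated Chi Square critical value (here with 4 - 1 = 3 degrees of freedom, since no parameter is estimated from the sample); a small value indicates that the histogram and the hypothesized pdf agree closely.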
The second type of GoF test compares vertical distances between
the empirical (Fn) and theoretical (F0) CDFs, evaluated at the
ordered sample points. Examples of this type of GoF test are the
Kolmogorov-Smirnov (K-S) and the Anderson-Darling (A-D) tests
[6]. The basis for K-S or A-D can be better understood by
superimposing by eye the hypothesized CDF over the empirical
cumulative function and then assessing the height differences
between them at the sample points. As before, if the two CDFs are
close, the fit is good. Otherwise, it is a poor fit.
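The vertical-distance idea can be sketched for the K-S case as follows. This is a minimal illustration, assuming a completely specified exponential F0 with a hypothetical mean of 100 hours and a made-up sample; it computes only the K-S distance and leaves the comparison with a tabulated critical value to the reader.

```python
import math

def ks_statistic(data, cdf):
    """Largest vertical distance between the empirical CDF (Fn) and the
    hypothesized CDF (F0), checked at the ordered sample points."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f0 = cdf(x)
        # Fn jumps from (i - 1)/n to i/n at x, so both sides are checked.
        d = max(d, i / n - f0, f0 - (i - 1) / n)
    return d

theta = 100.0  # hypothesized mean life (hypothetical value)
# Hypothetical failure times, in hours.
times = [3, 5, 9, 12, 18, 25, 27, 33, 41, 52,
         60, 68, 75, 90, 104, 120, 150, 175, 210, 340]
d = ks_statistic(times, lambda x: 1.0 - math.exp(-x / theta))
print(d)  # a small distance means the two CDFs are close
```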
Both of these GoF approaches assume that the data come from a
completely specified and continuous distribution (F0) with
known parameter θ. However, both of these approaches have
been extended or approximated for the case when the parameters
are unknown and need to be estimated from the sample, which is
the usual case in practice.
When a composite GoF hypothesis H0 is rejected, however, more
than one alternative or possibility may occur. For example,
assume the hypothesis H0: a data set was drawn from the Normal
distribution, with mean μ and variance σ². Now assume H0 is
rejected. This may occur because the distribution is not Normal,
even when the mean and variance may be the ones stated. It may
also occur because the distribution is indeed Normal, but the
mean, or the variance, or both, are not the ones stated in H0. It
may finally occur that none of the stated assumptions in H0 is
true, i.e., neither the distribution nor its parameters are as
assumed in H0.
On the other hand, it is also important to remember that, when
H0 is not rejected, it just means that we have not found enough
grounds to question its validity (i.e., the assumptions made).
This allows us to assume H0 is correct. The A-D GoF test, for
Normality, for one [1] or several [4] samples, is highly regarded
among univariate GoF tests. Its asymptotic distribution (i.e., for
large sample sizes) has been thoroughly studied. Many statistics
software packages have implemented A-D in analytical and/or
graphical form. K-S and the Chi Square GoF tests are also
excellent, when applicable [4, 6, and 8].
After the distribution of the data set is established, it is screened
for potential outliers. This can be done by using the Maximum
Normed Residual (MNR) test [8], which singles out unusually
high/low observations in a
Normally distributed data set. The outliers uncovered must then
be thoroughly checked for accuracy (clerical errors) and proper
implementation (testing errors). If errors are detected, they must
be corrected or the data point discarded. If no errors are ascer-
tained, the data point should remain in the sample.
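As an illustration of the screening step, the sketch below computes the maximum normed residual, taken here as the largest absolute deviation from the sample mean divided by the sample standard deviation, and reports which observation produced it. The data are hypothetical, and the critical value against which the statistic must be compared is not computed here; it would be looked up in tables such as those in [8].

```python
import statistics

def max_normed_residual(data):
    """Return (MNR statistic, index of the most extreme observation)."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    normed = [abs(x - mean) / s for x in data]
    idx = max(range(len(data)), key=lambda i: normed[i])
    return normed[idx], idx

# Hypothetical property measurements; the last one looks suspicious.
strengths = [10.1, 9.8, 10.3, 9.9, 10.0, 14.2]
mnr, idx = max_normed_residual(strengths)
print(idx, strengths[idx], round(mnr, 3))
```

A flagged observation is only a candidate outlier: as stated above, it is checked for clerical and testing errors and stays in the sample if none are found.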
Three statistical distributions are widely used in reliability and
tested for GoF [3, 7]. The Weibull is one of them; it is often
justified on theoretical grounds (the derivation of extreme values)
and by long practice. The Lognormal is also widely used in
reliability studies. If a RV X is distributed Lognormally, then
Log (X) is distributed Normally, so we can test GoF for
Lognormality by testing Log (X) for Normality (e.g., via A-D or
K-S). Finally, the Exponential is also widely used in reliability,
especially when initial screening and efficient replacement
policies remove the infant mortality and aging elements from the
well-known "bathtub curve" (hazard rate function), leaving only
the quasi-constant "useful life" element. The Exponential is a
special case of the Weibull, with a shape parameter of unity.
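The special-case relation can be verified directly. The sketch below assumes the usual two-parameter Weibull form, F(x) = 1 - exp(-(x/scale)^shape), and a hypothetical scale of 100 hours; with shape 1 the CDF coincides with the Exponential CDF and the hazard rate is constant.

```python
import math

def weibull_cdf(x, shape, scale):
    """Two-parameter Weibull CDF."""
    return 1.0 - math.exp(-((x / scale) ** shape))

def weibull_hazard(x, shape, scale):
    """Weibull hazard rate: (shape/scale) * (x/scale)^(shape - 1)."""
    return (shape / scale) * (x / scale) ** (shape - 1.0)

def exponential_cdf(x, theta):
    """Exponential CDF with mean theta."""
    return 1.0 - math.exp(-x / theta)

theta = 100.0  # hypothetical mean life, in hours (an assumption)
for x in (10.0, 50.0, 200.0):
    # With shape = 1 the two CDFs match and the hazard stays at 1/theta.
    print(x, weibull_cdf(x, 1.0, theta), exponential_cdf(x, theta),
          weibull_hazard(x, 1.0, theta))
```

A shape below 1 gives a decreasing hazard and a shape above 1 an increasing one, which is how the same family can describe the other two regions of the bathtub curve.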
Shape and scale parameters are estimated from the data via ana-
lytical or graphical methods [3, 7] and then tested for GoF. If
none of the previously mentioned distributions fit the data set,
then a nonparametric method may be used [8]. However, this
method is less accurate and also requires larger sample sizes.
When working with a single sample, the described procedures
will be implemented. If working with more than one, the k-sam-
ple A-D GoF test can be implemented to assess the hypothesis
(H0) that all samples (batches) come from the same population
[8]. In the affirmative case, we can pool all the batches into a
single, combined sample from which we obtain the desired
results. If the test rejects H0 then an individual analysis must be
carried out for each batch. Alternatively, we may implement
ANOVA [4, 8] methods.
Finally, if the variable of interest is associated with other meas-
urements, then regression methods [5, 8] can be employed. We
first verify that the regression model assumptions (i.e., inde-
pendence and identically distributed observations, linearity, nor-
mality) are met. If so, we can obtain the model parameter esti-
mates. We must also check the model appropriateness. If the
general linear model is applicable [5] then either the ANOVA or
the regression procedure will provide the desired allowables [6].
These methods are discussed next.
Regression Models
Assume that we have two quantitative measurements: Xi (e.g.,
height) and Yi (e.g., weight). Assume that the pairs Pi = (Xi , Yi),
1 ≤ i ≤ n, constitute the data in our data set. Assume that the
variables X and Y are associated (i.e., functionally related). Perhaps
variable Xi may be easier, faster, cheaper or more accurately
obtained than Yi. Or we may be able to exploit some (X-Y)
functional relationship to estimate the parameters of some (e.g.,
AMSAA-Crow) model of interest. If such is the case then we
can use this (X-Y) functional relation to our advantage and
obtain an improved estimation of Y, through X. This is the idea
behind the use of regression analysis.
A suggested first step is to plot Yi vs. Xi, for each Pi = (Xi , Yi),
1 ≤ i ≤ n (e.g., each person's weight vs. height). If the variables
X and Y are not associated (e.g., there is no association between
a person's height and weight), then the resulting cloud of points
Pi = (Xi ,Yi), will be uniformly and randomly scattered all over
the plane [8]. Draw two lines (one vertical through the average
of the projections over the X-axis; one horizontal, through the
average of the projections over the Y-axis) over the plane. They
divide the plane into four quadrants. Under H0 (i.e., variables X
and Y are not associated) the set of points Pi should be equally
and randomly distributed among the four plane quadrants.
If X and Y are associated, H0 is rejected and the numbers of
points in the quadrants differ. If there is a positive association
(i.e., when X increases/decreases, so does Y) then the points will
tend to cluster in the upper right and lower left quadrants. If
there is a negative association between X and Y, the points will
cluster in the upper left and lower right quadrants.
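The quadrant argument can be tried on a small data set. The sketch below uses hypothetical heights (inches) and weights (pounds); it draws the two lines through the sample averages and counts the points that fall in each quadrant. Points lying exactly on a dividing line are assigned to the upper or right side, which is an arbitrary choice made for this illustration.

```python
import statistics

def quadrant_counts(xs, ys):
    """Count points in the four quadrants defined by the two sample means."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    counts = {"upper_right": 0, "upper_left": 0,
              "lower_left": 0, "lower_right": 0}
    for x, y in zip(xs, ys):
        vertical = "upper" if y >= my else "lower"
        horizontal = "right" if x >= mx else "left"
        counts[f"{vertical}_{horizontal}"] += 1
    return counts

# Hypothetical sample: heights in inches, weights in pounds.
heights = [60, 62, 65, 68, 70, 72, 74]
weights = [115, 120, 140, 155, 165, 180, 195]
counts = quadrant_counts(heights, weights)
print(counts)  # points concentrate in the upper right and lower left
```

With these made-up data all points fall in the upper right and lower left quadrants, the pattern described above for a positive association.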
The indicator "covariance" between X and Y, characterizing
such a relationship, is defined as:
$$\mathrm{Cov}(X, Y) = S_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
where x̄ and ȳ are the two corresponding sample averages. The
covariance indicator is positive when a positive association
between X and Y exists, negative when a negative association
exists, and near zero if no association between the two variables
exists (e.g., their joint variation is not coordinated).
As a measure of association between two variables, covariance
is difficult to interpret because it depends heavily on the units in
which variables X and Y are being measured. For example, the
reader can obtain the sample covariance between height and
weight, first measured in inches/pounds and then in
meters/kilograms, to verify they differ. The correlation
coefficient [4, 5], defined as rxy = Sxy /(Sx Sy) (where Sx and Sy
are the sample standard deviations of X and Y), is a normalized
covariance. Correlation rxy measures the association between X and
Y just as the covariance does. However, the correlation is easier
to interpret, since rxy always lies between −1 and 1.
In addition, rxy is dimensionless. The reader may recalculate
the sample correlation rxy between height and weight, first in
inches/pounds and then in meters/kilograms, and verify how
they now agree. Correlation is also a measure of linear
association between X and Y. That is, if rxy > 0 and close to unity,
there is a linear trend that models the association between X
and Y, with positive slope. If rxy < 0 and close to −1, this linear
trend has a negative slope. If rxy ≈ 0, there is no linear trend
that models the relationship.
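The exercise suggested to the reader can be sketched as follows, using hypothetical heights and weights. The covariance changes when the units change from inches/pounds to meters/kilograms, while the correlation does not.

```python
import math

def sample_cov(xs, ys):
    """Sample covariance Sxy with the (n - 1) divisor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

def sample_corr(xs, ys):
    """Correlation rxy = Sxy / (Sx * Sy)."""
    sx = math.sqrt(sample_cov(xs, xs))
    sy = math.sqrt(sample_cov(ys, ys))
    return sample_cov(xs, ys) / (sx * sy)

# Hypothetical sample: heights in inches, weights in pounds.
heights_in = [60, 62, 65, 68, 70, 72, 74]
weights_lb = [115, 120, 140, 155, 165, 180, 195]
# The same sample in meters and kilograms.
heights_m = [h * 0.0254 for h in heights_in]
weights_kg = [w * 0.45359237 for w in weights_lb]

cov_imperial = sample_cov(heights_in, weights_lb)
cov_metric = sample_cov(heights_m, weights_kg)
r_imperial = sample_corr(heights_in, weights_lb)
r_metric = sample_corr(heights_m, weights_kg)
print(cov_imperial, cov_metric)  # the covariances differ with the units
print(r_imperial, r_metric)      # the correlations agree
```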
It is therefore very useful to obtain an estimate of the slope of
such a linear trend (called the linear regression) and to use it to
obtain a better estimate of Y (the dependent variable) given a
value of X (the predictor). In mathematical terms:
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i ; \quad 1 \le i \le n$$
is the equation for a simple linear regression model. The multiple
regression model is just an extension of this equation to the case
of two or more predictor variables, X1, X2, etc.:
$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_k X_{ik} + \varepsilon_i ; \quad 1 \le i \le n$$
The model error terms εi are independent and identically
distributed Normal, with mean 0 and variance σ². The βj
(0 ≤ j ≤ k) are called regression coefficients. The parameters
(βj, 0 ≤ j ≤ k; σ²) are estimated from the data during the
regression analysis.
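For the simple linear model, the least squares estimates have a closed form. The sketch below is an illustration on hypothetical height/weight data; it assumes the usual estimators (slope = Sxy/Sxx, intercept from the two sample averages, error variance from the residual sum of squares with n - 2 degrees of freedom), and the function names are made up for this example.

```python
def fit_line(xs, ys):
    """Least squares estimates (b0, b1) for Y = b0 + b1 * X + error."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1

def residual_variance(xs, ys, b0, b1):
    """Estimate of the error variance: SSE / (n - 2)."""
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return sse / (len(xs) - 2)

# Hypothetical sample: heights in inches, weights in pounds.
heights = [60, 62, 65, 68, 70, 72, 74]
weights = [115, 120, 140, 155, 165, 180, 195]
b0, b1 = fit_line(heights, weights)
s2 = residual_variance(heights, weights, b0, b1)
print(b0, b1, s2)
print(b0 + b1 * 69)  # point estimate of weight for a height of 69 inches
```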
Note that, given an adequate sample, it is always possible to
obtain an estimate of its distribution and parameters. However,
if a RV, Y, is associated with another RV, X, we can use correla-
tion information to obtain an improved estimation (i.e., one with
a smaller variance) of Y, given X. This is a clear advantage that
statistical modeling of the data introduces.
Linear Regression analysis requires three or more (k ≥ 3) levels
of measurements for the predictor variable X. If there are fewer
levels, we must wait until more data (levels) are gathered.
A regression model exists only if it is statistically significant,
i.e., if its corresponding F-test rejects the null hypothesis H0:
β1 = β2 = ... = βk = 0. For if H0 holds, variable Y (= β0 + ε)
does not depend on the predictors. Not all of the model's
independent variables need to be statistically significant (i.e.,
have βj ≠ 0). For example, some predictor variables (Xj) may be
highly significant (i.e., have a coefficient βj ≠ 0) while others
may be redundant (i.e., not significant, or βj = 0). The overall
regression model (as a whole), however, may still remain
statistically significant. These situations suggest correlation
among predictors and require variable selection methods. Through
such selection methods, redundant predictors are weeded out and the
resulting regression model significantly improves [5].
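The F statistic behind this significance test can be sketched for the single-predictor case (k = 1). The code below is an illustration on hypothetical data; it assumes the standard decomposition of the total sum of squares into regression and error parts, and it stops at the statistic itself, which would then be compared with a tabulated F critical value with k and n - k - 1 degrees of freedom.

```python
def regression_f_statistic(xs, ys):
    """F statistic for H0: slope = 0 in a simple linear regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    ss_total = sum((y - my) ** 2 for y in ys)
    ss_error = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_regression = ss_total - ss_error
    k = 1  # number of predictors
    return (ss_regression / k) / (ss_error / (n - k - 1))

# Hypothetical sample: heights in inches, weights in pounds.
heights = [60, 62, 65, 68, 70, 72, 74]
weights = [115, 120, 140, 155, 165, 180, 195]
f_stat = regression_f_statistic(heights, weights)
print(f_stat)  # a large value leads to rejecting H0 (slope = 0)
```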
If there exist four or more levels of measurements for the
independent variable (X), then it is possible to fit a quadratic
regression model [8] to the data:
$$Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \varepsilon_i ; \quad 1 \le i \le n$$
If two or more regression models are statistically significant, we
must compare them to select the most efficient [5, 8]. The best
regression provides the maximum of information with the mini-
mum of terms. The best model is then retained and used.
Regression models are sensitive to violations of their three sta-
tistical assumptions. These assumptions must be checked before
the model can be correctly used. The analysis of regression
residuals (ei) allows us to verify the regression assumptions [5,
8]. Normality of the data is checked via the K-S or the A-D GoF
test or via graphical methods. If the normality assumption is
rejected, data should be transformed [4] and the regression
model recalculated.
The assumption of equality of variance σ² can also be checked
using statistical tests such as Bartlett's [6] or through graphical
analysis of residuals. If any of these procedures indicate that
variances are not equal, the data should be transformed and the
regression model should be recalculated. Regression models are
based on two procedures. First, an optimization process is used
to select a function that minimizes the sum of squares of the
errors at each data point (Σ εi²). Then, distributional assumptions
(e.g., normality) are imposed on the errors εi. If an invalid
regression model is used (one where the assumptions of independence,
normality, or equality of variance of the εi are not met), then
the test confidence levels and the confidence intervals derived
are no longer those claimed from the model [8].
For example, the point estimator for Yi (weight) given Xi
(height) is valid, because of the optimization part of the regres-
sion procedure. However, if the distributional assumptions of
the model residuals are violated, then the confidence (interval)
estimation for Yi given Xi and its probabilistic statements (e.g.,
that the mean weight for a person six feet tall is, with 90% con-
fidence, between 180 and 220 lbs.) are approximate.
ANOVA Models
We have seen that we can have a single or several data samples
(batches). If data come from the same population then we can
pool the samples and work with the larger combined data set.
However, if not all samples come from the same population,
mixing them would be a mistake. Instead, these separate samples
become bivariate data, because each data point now provides
two pieces of information: one is its property measurement and
the other is its batch or sample number.
ANOVA [4, 5 and 8] is the procedure used to establish whether
k Normal batches, of n elements each, have the same mean or
whether the group means differ. Two different estimates of
the common variance are compared. One estimate is obtained by
combining the variance estimators within groups, the other from
the variance estimator between groups. If all k group means are
equal, then these two variance estimators are close (for both
estimate the same variance parameter), and their ratio is close to
unity. If the group means differ, the ratio of these two variance
estimators departs from unity. The ANOVA model is:
$$y_{ij} = \mu + \alpha_j + \varepsilon_{ij} ; \quad 1 \le i \le n ; \quad 1 \le j \le k$$
where αj is the contribution of the jth sample (group) to the
general mean μ, and εij is the error term, which is distributed
Normally with mean 0 and variance σ². Under H0, all group means are
equal, hence all αj = 0; 1 ≤ j ≤ k. Under H1, at least one αj ≠ 0.
Hence, at least one group has a different mean: μj = μ + αj ≠ μ.
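The comparison of the two variance estimates can be sketched as a one-way ANOVA F ratio. The batches below are hypothetical, and the code is an illustration of the ratio only; the decision would require comparing it with a tabulated F critical value with k - 1 and N - k degrees of freedom.

```python
def anova_f(groups):
    """Ratio of the between-group to the within-group variance estimate."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between groups: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Within groups: spread of the observations around their group mean.
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Three hypothetical batches of five measurements each.
batches = [[10.1, 9.8, 10.3, 9.9, 10.0],
           [10.4, 10.6, 10.2, 10.5, 10.7],
           [9.9, 10.0, 10.2, 9.7, 10.1]]
f_ratio = anova_f(batches)
print(f_ratio)  # a ratio well above unity suggests unequal group means
```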
One crucial ANOVA assumption is that all group or batch
variances σ² are equal. This assumption must be tested before
implementing the ANOVA results. If the test fails (i.e., there is
reason to believe that not all groups have the same variance σ²),
then data transformation or other procedures must be implemented
before, or in lieu of, ANOVA [4].
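One way to test the equal-variance assumption is Bartlett's test, which the article names earlier for the regression case. The sketch below assumes the usual textbook form of Bartlett's statistic (pooled variance, log-variance comparison, and small-sample correction factor); the batches are hypothetical and the critical value, from a Chi Square table with k - 1 degrees of freedom, is not computed here.

```python
import math
import statistics

def bartlett_statistic(groups):
    """Bartlett's statistic for equality of k group variances."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    variances = [statistics.variance(g) for g in groups]
    # Pooled variance: weighted average of the group variances.
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances))
    correction = 1.0 + (sum(1.0 / (n - 1) for n in sizes)
                        - 1.0 / (n_total - k)) / (3.0 * (k - 1))
    return numerator / correction

# Three hypothetical batches; the second is visibly more variable.
batches = [[9.8, 10.1, 10.0, 10.2, 9.9],
           [9.0, 11.2, 10.5, 8.7, 10.9],
           [10.0, 10.3, 9.7, 10.1, 9.9]]
t_stat = bartlett_statistic(batches)
print(t_stat)  # large values cast doubt on the equal-variance assumption
```

The statistic is zero when all sample variances coincide and grows as they diverge. Bartlett's test is itself sensitive to departures from Normality, so it is best applied after the distribution has been established.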
Another important ANOVA consideration is the number of data
points (nj; 1 ≤ j ≤ k) per group. ANOVA works better under
balanced designs (i.e., nj = n), where all k groups are of equal
size n. For example, think of the sample size n as the total
information received. Think of the k groups as k informants and of
the ANOVA test as an assessment procedure that is based on the
information provided by k different informants. Optimally, we
would like to assign equal weight to all informants' contributions,
and not to have to rely too heavily on excess information from
some (potentially biased) informants, at the expense of lacking
information from the others.
In practice, however, samples are often of different sizes [4, 8].
To correct for this problem, we can use the effective sample size
(n′), obtained via the formula:
$$n' = \frac{N - n^*}{k - 1}, \quad \text{where} \quad n^* = \frac{1}{N} \sum_{j=1}^{k} n_j^2 \quad \text{and} \quad N = \sum_{j=1}^{k} n_j$$
When nj = n (i.e., all groups have the same size), then
n* = n′ = n (i.e., the effective size is the group size).
Statistical analysis strives to obtain the most efficient and
unbiased assessment (test) from the data (information).
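The effective sample size formula is easy to check numerically. The sketch below uses hypothetical group sizes; the function name is made up for this illustration.

```python
def effective_sample_size(sizes):
    """Effective size (N - n*) / (k - 1), with n* = sum(nj^2) / N."""
    k = len(sizes)
    n_total = sum(sizes)
    n_star = sum(n * n for n in sizes) / n_total
    return (n_total - n_star) / (k - 1)

balanced = effective_sample_size([5, 5, 5])    # all groups of size 5
unbalanced = effective_sample_size([4, 6, 8])  # hypothetical unequal sizes
print(balanced)    # equals the common group size
print(unbalanced)  # falls between the smallest and largest group sizes
```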
ANOVA's bivariate data are partly categorical (qualitative). Each data
point Pij = (Yij , j) includes the property measurement Yij and its
corresponding sample group j. The group is not quantitative.
When both measurements are quantitative, i.e., when Pi = (Xi ,
Yi), the association between the two variables can be established
in a better way: via regression.
Summary
In three short review articles, we have discussed some of the
main ideas and concepts behind several statistical procedures
used in reliability applications in particular, and in industrial
applications in general.
By stressing statistical thinking over the mechanics of statisti-
cal applications, the practitioner gains a better understanding of
statistics. Understanding will encourage a more frequent and
better use of statistical methods among practitioners and engi-
neers in the field.
Bibliography
1. Anderson, T.W. and D.A. Darling, A Test of Goodness of
Fit, JASA, Vol. 49 (1954), Pages 765-769.
2. Box, G.E.P., W.G. Hunter, and J.S. Hunter, Statistics for
Experimenters, John Wiley and Sons, Inc., New York, NY,
1978.
3. Coppola, Anthony, Practical Statistical Tools for the Reliability
Engineer, Reliability Analysis Center, Rome, NY, 1999.
4. Criscimagna, N., Maintainability Toolkit, Reliability
Analysis Center, Rome, NY, 2000.
5. Dixon, W.J. and F.J. Massey, Introduction to Statistical
Analysis, McGraw-Hill, New York, NY, 1983.
6. Draper, N. and H. Smith, Applied Regression Analysis, John
Wiley and Sons, Inc., New York, NY, 1980.
7. Reliability Analysis Center, Reliability Toolkit: Commercial
Practices Edition, Rome, NY, 1995.
Abstract
Many industries operate high value equipment, often remotely, that
must perform reliably in severe environments. The U.S. Navy (USN)
operates one such system, the submarine Towed Array System (TAS),
which comprises integrated hydraulic, mechanical, electronic and
acoustic subsystems. To maintain this system's capability, the Navy
has stressed conventional approaches to operation and maintenance.
The USN invested in a prognostic Condition Based Maintenance
(CBM) proof of concept for an individual ship TAS by developing the
Thinline Health Monitoring System (THMS). THMS collects real-time
and discrete reliability data, synchronizing these with other historical
information and the TAS's current condition assessment. As a
predictive intelligent code, it uses Bayesian Belief Networks (BBNs) to
extract the full value of real-time data and provide a complete range
of system performance evaluations, from diagnosis to prediction.
Drawing upon the success of THMS, the USN supported expanding this
capability fleet-wide to assess the health of the entire submarine
TAS population. Plans have been developed to build a relational
database, accessible to a geographically separated towed systems
community via the Internet, for interactive analysis and diagnostics.
The methodology described in this paper is directly translatable
to other government and commercial critical systems that cannot
afford either unscheduled or unnecessary maintenance.
Introduction
Conventional approaches to operation and maintenance have been
used for Submarine Towed Array Systems (TAS) to maintain the
system level capability necessary for ship operations. TASs are
mission essential for obtaining acoustic information in support of
a high percentage of critical submarine deployments. By itself, a
TAS is a complex configuration that requires integrated remote
operation of hydraulic, mechanical, electrical, electronic and
acoustic subsystems in a severe ocean environment. It is nearly
impossible to observe the full functioning of each TAS component
during operation. Furthermore, if malfunctions occur at sea,
repairs most often need to be deferred until return to port. Repairs
are frequently costly, and if performed afloat, have the potential to
result in repair-induced failures owing to poor accessibility and an
adverse repair environment. Adopting a prognostic Condition
Based Maintenance (CBM) capability completely alters the TAS
maintenance landscape by monitoring current system conditions
and predicting degradations so that necessary repairs can be com-
pleted in advance under favorable conditions.
In 1999, the U.S. Navy (USN) funded a team effort by Areté
Associates and Life Cycle Engineering (LCE) to use existing
control and signal parameters from an in-service towed system
(both array and handler) in constructing a device for evaluating
and displaying operating system health. At the individual ship
system level, the USN invested in a proof-of-concept Thinline
Health Monitoring System (THMS) for an OA-9070/TB-23 thin-
line towed array baseline configured SSN 688 Class submarine
[1]. The Areté/LCE team produced THMS as a real-time method
for assessing the current condition of the TAS and demonstrated
the ability to dynamically predict future system health. The prin-
cipal elements that support this capability are real-time sensor
inputs, a mature Preventive Maintenance program, and an
embedded Bayesian Belief Network (BBN) intelligent code.
Having demonstrated a prognostic, next generation maintenance
capability for individual systems, the USN then funded CBM
concept development for the fleet population of towed arrays
under the Small Business Innovation Research (SBIR) program.
Integral to this capability is a comprehensive discrete historical
and real-time web-based relational database and a powerful soft-
ware toolbox to permit diagnostic and prognostic information
mining simultaneously to geographically separated users (includ-
ing operators, design engineers, vendors and logisticians). The
inherent object oriented BBN tree framework permits the exten-
sion of the THMS health assessment output to serve as an input
to the overarching BBN population model. A similar approach is
directly applicable to other systems and industries.
Background
The Navy's core maintenance efforts reside in a Reliability
Centered Maintenance (RCM) program practiced at the level of
Preventive Scheduled Maintenance (PSM). In this paper, we
compare the introduction of a high-level CBM capability based on
intelligent software with in-use preventive maintenance (PM) pro-
grams. In this regard, PSM is implemented as a process of devel-
oping a critical and functional list of failures and using default sta-
tistical data to develop a scheduled maintenance program to pre-
vent expected failures. Some naval programs have made the addi-
tional investment to construct age-related failure distributions and
reexamine the PM plan to bring it in line with operational per-
formance. References are expanding to include CBM either as a
reliability subject of its own or covered under the umbrella of
RCM [2]. Here we use the RCM restricted domain of PSM to dis-
tinguish it from a program incorporating a higher-level form of
maintenance using real time probabilities of health and prediction.
PSM uses statistical information derived from historical data to
8. Romeu, J.L. and C. Grethlein, A Practical Guide to Statistical
Analysis of Material Property Data, Advanced Materials and
Processes Technology Information Analysis Center, Rome,
NY, 1999.
9. Sadlon, R., Mechanical Applications in Reliability
Engineering, Reliability Analysis Center, Rome, NY, 1993.
10. Scholz, F.W. and M.A. Stephens, K-Sample Anderson-
Darling Tests, JASA, Vol. 82 (1987), Pages 918-924.
Real-Time Prognostic Condition-Based Maintenance for High
Value Systems
By: Harry Bishop and William Matzelevich, Areté Associates
Edward Rossi, Life Cycle Engineering, Ron Thomas and Meeiyun Hsu