…particular industry is over-represented on the committee. Membership is free, but members must pay any expenses associated with attending meetings. A Membership Application must be completed, and applicants must submit a biography (½-page maximum) with the application. The biography should include qualifications (degrees, experience, etc.), the industry represented, and special interests (e.g., maintainability, human reliability, etc.). Membership application forms can be requested from Patricia Kopp. For more information on Z1 and the Dependability Subcommittee, refer to the US Standards Group web site or contact the Chair of the Dependability Subcommittee, Ned H. Criscimagna, at (301) 918-1526.

New ISO President

The International Organization for Standardization (ISO) has announced that Mr. Mário Gilberto Cortopassi of Brazil took office as the organization's new President on January 1, 2001. Mr. Cortopassi will serve a two-year term.

Mr. Cortopassi is a successful industrialist. His formal training is as a chemist, and he has gained a wealth of experience in the textile and synthetic fiber industries. He has been actively involved in standardization for over 30 years.

Mr. Cortopassi stated in his inaugural message that international standards are more necessary than ever to facilitate business, encourage free trade, and foster progress in society. He singled out standardization, metrology, testing, conformity assessment, and certification as key instruments in achieving business success in a global market.

The new President cited ISO's success in responding to market-driven requirements by modernizing its own processes to deliver standards in a timely and efficient manner. Mr. Cortopassi called for even stronger support for ISO from its constituent members, pointing out that ISO's success greatly contributes to the efficiency of the global marketplace, which in turn extends prosperity to all nations.

Statistical Analysis of Reliability Data, Part 1: Random Variables, Distributions, Parameters, and Data

By: Jorge Luis Romeu, IIT Research Institute

Introduction

Engineers sometimes have problems understanding the basis for the statistical procedures they need when analyzing reliability data. This is not surprising. In many engineering curricula, the study of statistics is limited to one or two courses (3 to 4 credit hours). These courses are usually theoretical, do not address data analysis, and cover a wide range of statistical techniques. Finally, other engineering courses emphasize the physical (deterministic) rather than the stochastic laws governing the processes under analysis.

This article is the first of a series written to provide engineers with a practical understanding of the statistical analysis of reliability data. It discusses random variables, statistical distributions and their parameters, and data collection issues, including the special problem of outlier (or extreme value) detection and treatment. The second article addresses parameter estimation and hypothesis testing, emphasizing goodness-of-fit procedures used to identify and select suitable distributions from a given set of data. In the third article, the concepts from the first two articles are applied to reliability estimation and assessment problems. The fourth article discusses data collection and data quality problems.
Statistical Distributions

Statistics deals with the study of phenomena and processes that (1) yield more than one outcome, and (2) occur in a random fashion [1, 2, 3, 4, 6]. Results of the random processes under observation are called random variables (RVs) and are denoted with a capital letter, say X. Specific outcomes (denoted in lower case) are called events, and the set of all possible RV outcomes is called the sampling space. For example, from the process of rolling two dice and taking their sum, we observe X, the random variable "sum of both dice." Similarly, from the process of life testing we observe X, the random variable "life of the device." In the dice example, the sampling space consists of the integers 2 through 12. An event {X = n} is rolling a given sum, and it occurs with a probability P{X = n} (Figure 1). For the life testing example, the sampling space consists of all positive values of time, and an event {X < t} is observing a life of less than t units (Figure 2).

The graphical pattern of occurrence of such random outcomes (e.g., Figures 1 and 2) provides an intuitive way to understand the meaning of the statistical distribution of an RV. The abscissa of such graphs represents the sampling space of X (all possible outcomes) and the ordinate represents a value proportional to the frequency of occurrence of the outcomes. Such graphs represent the probability density function (pdf) when the sampling space of X is continuous (Figure 2) or the mass function when it is discrete (Figure 1). The area under the curve of the mass/density function is one. The Cumulative Distribution Function (CDF) of an RV is non-decreasing, has a value between zero and one, and is defined for both the mass and density functions as:

F(a) = P{X ≤ a}

where a is any feasible value in the sampling space of X. Hence, all random variables have a distribution, uniquely described by one or more parameters.

The mass/density functions provide an objective, precise way to describe the probability mechanism governing the random process that produces them. For example, contrast the (graphically) flat pattern from rolling an honest die, where the occurrence of any of its six sides is equally likely, with that of the sum of two dice (Figure 1), where a sum of 7 is more likely than a sum of 12, or with the decreasing pattern of the exponential (Figure 2). Such patterns (distributions) can be numerically described by a set of fixed numbers called parameters. In the sum-of-two-dice example, the set (1/36, 2/36, 3/36, ..., 1/36) of frequencies associated with the possible sums uniquely describes its distribution (pattern). In the exponential case, the mean describes it. Statistics is about investigating those distributions and parameters.

In this series of articles, both quantitative and qualitative RVs are addressed. Quantitative RVs are numerical and exhibit the mathematical properties of order and distance. These RVs are said to have a stronger measurement scale level, which allows the implementation of certain statistical methods not always appropriate for qualitative variables [5]. Qualitative RVs (e.g., attributes such as pass/fail) are categorical or, at best, can be ordered.

Statistical distributions can be discrete or continuous, according to whether their corresponding RV sampling space is discrete or continuous. The result of rolling a die is an example of a discrete RV; the life of a device is an example of a continuous RV. Their corresponding graphical patterns yield step or continuous mass/density functions.
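To make these definitions concrete, here is a minimal sketch (Python is used for illustration in this and the following examples) that builds the mass function of the sum of two honest dice, checks that the total probability is one, and evaluates the CDF F(a) = P{X ≤ a}:

```python
from fractions import Fraction

# Mass function of X = sum of two honest dice: count the ways each sum occurs
pmf = {}
for d1 in range(1, 7):
    for d2 in range(1, 7):
        s = d1 + d2
        pmf[s] = pmf.get(s, Fraction(0)) + Fraction(1, 36)

assert sum(pmf.values()) == 1            # the total probability is one

def F(a):
    """CDF: F(a) = P{X <= a}, the cumulative sum of the mass function."""
    return float(sum(p for x, p in pmf.items() if x <= a))

print(float(pmf[7]))    # P{X = 7}  = 6/36, about 0.167
print(float(pmf[12]))   # P{X = 12} = 1/36, about 0.028
print(F(3))             # P{X <= 3} = 3/36, about 0.083
```

The printed values match the P{X = n} row of Figure 1, and because the sampling space is discrete, F(a) is a non-decreasing step function that reaches one at a = 12.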
The probabilities of individual outcomes (e.g., rolling a sum of 2, or observing 3 failures in the field) can be calculated for discrete RVs. The probabilities of ranges (e.g., that a device life is longer than ten hours, or between three and ten hours) can be calculated for continuous RVs. For example, the probability of rolling a sum of three or less (denoted P{X ≤ 3}) is obtained by adding the discrete mass function; the probability of observing a life of less than three hours (denoted P{X < 3}) is obtained by integrating the continuous pdf. These examples illustrate the one-to-one relationship between distributions and their corresponding mass/density functions.

In addition to being discrete or continuous, distributions can be symmetric or skewed, according to whether their mass/density functions are or are not symmetric with respect to one point in their sampling space. Distributions can also be unimodal or multimodal, or have no mode, according to whether their mass/density functions have one or more (local) maximums (modes). The distribution of the RV sum of two dice in Figure 1 is an example of a symmetric, unimodal distribution. Its mean and mode are both 7, about which the distribution is symmetric. The exponential distribution, in turn, is skewed to the right and has no mode (peak).

As may be surmised, the number of statistical distributions that can arise is infinite, posing a difficult practical problem. To deal with it, well-known and thoroughly studied families of statistical distributions have been developed that are easy to manipulate, fit different patterns, and have a small, easy-to-interpret number of parameters. Two examples of discrete families of distributions (and their respective parameters) are the Binomial (with parameters n, the number of trials, and p, the probability of success at any trial) and the Poisson (with rate of occurrence λ). Two examples of continuous distributions are the Normal (with mean μ and standard deviation σ) and the exponential (with mean 1/λ).

Often, the exact distribution of a random process under study is unknown but can be satisfactorily approximated by one of these well-known distribution families by finding a suitable combination of parameters. If we can live with the difference between the exact probability of any event and its approximation, then we work with the latter as if it were the exact distribution. Much statistical work is spent in (1) selecting a specifically well-suited family of distributions, (2) verifying that such a selection is correct, (3) estimating adequate parameters, and (4) deriving probabilistic results with them.

DICE    1   2   3   4   5   6
  1     2   3   4   5   6   7
  2     3   4   5   6   7   8
  3     4   5   6   7   8   9
  4     5   6   7   8   9  10
  5     6   7   8   9  10  11
  6     7   8   9  10  11  12

   n       2     3     4     5     6     7     8     9    10    11    12
P{X = n} 0.028 0.056 0.083 0.111 0.139 0.167 0.139 0.111 0.083 0.056 0.028

(n is the sum of two honest dice; P{X = n} is the probability of the two dice adding up to a particular value, n.)

Figure 1. Dice graphical pattern

Figure 2. Exponential distribution with mean of 10 (pdf f(t) = λe^(-λt), λ = 0.1; mean = 1/λ = 10, median = 6.93)
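As a sketch of how such families are used in practice (assuming the SciPy library is available; the Poisson and Binomial parameter values below are arbitrary illustrations, while the exponential matches Figure 2), a probability is a sum of the mass function in the discrete case and an integral of the density in the continuous case:

```python
import math
from scipy import stats
from scipy.integrate import quad

# Discrete family: P{X <= 3} is a SUM of the mass function.
# Poisson with an (arbitrary) rate of occurrence lambda = 2:
p_discrete = sum(stats.poisson.pmf(k, mu=2.0) for k in range(4))
print(p_discrete, stats.poisson.cdf(3, mu=2.0))        # the two agree

# Continuous family: P{X < 3} is an INTEGRAL of the density.
# Exponential with mean 1/lambda = 10, as in Figure 2:
lam = 0.1
p_integral, _ = quad(lambda t: lam * math.exp(-lam * t), 0.0, 3.0)
print(p_integral, stats.expon.cdf(3.0, scale=1.0 / lam))  # ~0.259 both ways

# Two other well-known families and their parameters:
print(stats.binom.pmf(2, n=10, p=0.1))      # Binomial(n=10, p=0.1): P{X = 2}
print(stats.norm.cdf(1.0, loc=0, scale=1))  # standard Normal: P{X <= 1}
```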
The previous discussion shows that it is important to fully understand the concepts of RVs, their distributions, and the corresponding parameters, because they provide an objective and precise way of describing a random phenomenon under study. Applying these concepts to a data set provides practical and useful probabilistic statements about events of interest, such as "What is the reliability of the device if its mission time is ten hours?" Conversely, a pre-specified probability (e.g., reliability = 0.99) may be required by designers or the procurement office as the performance measure of a device. Samples of such devices may then be drawn and tested for compliance with this requirement.

Distribution Parameters

Parameters are population-fixed values that uniquely characterize and help describe the distribution of an RV (e.g., λ in the exponential distribution). Parameters allow the graphing of the RV's specific mass/density function (outcome patterns). The location, dispersion, shape, scale, and threshold parameters, all of which are widely used in reliability applications, are discussed below.

Location parameters answer the question "Where is the distribution?" A particularly useful subset of the location parameters is the three measures of central tendency: mean, median, and mode. The mean is the outcome located at the center of gravity of the mass/density function graph. The median is the outcome such that half the population scores below (or above) it. The mode is the value where the mass/density function peaks (the most frequent outcome). The mean and median are unique, but multiple modes may coexist (in a multimodal distribution). If a distribution is symmetric and unimodal (e.g., Normal), then the mean, median, and mode coincide. If it is skewed (e.g., exponential), they differ.

If a distribution is skewed (non-symmetric), then one tail is longer than the other, and the mean is less informative than the median and mode. For example, the mean of the RV "household income" may have little meaning if the population consists of several billionaires and millions of landless peasants (it provides little information about the situation). In such a case, (1) the median income level is such that half the population income lies above it and half below it, and (2) the modal income level is the most frequent one, around which there is some population clustering. These two parameters provide more useful and meaningful information about the population income. In addition, if we add (or subtract) a few billionaires to the population, the mean will be affected, whereas the mode and median will be much more resilient to such changes. Such resilience is referred to as the robustness of a parameter and is considered a good quality.

Other location parameters of interest are the quartiles and the percentiles. A percentile is an outcome value within the sampling space of the RV such that a given percent of the population scores a result less than or equal to that outcome. For example, by definition the median is the fiftieth percentile (because 50% of the population scores less than or equal to it). Other important percentiles are the lower (1st) and upper (3rd) quartiles: the values below which 25% and 75% of the population, respectively, lie. Between the 1st and 3rd quartiles lies the half of the population closest to the center (median). The Characteristic Life of the Weibull distribution is an example of a percentile (the 63.2nd) with a well-known engineering interpretation.
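A brief numerical illustration (assuming NumPy; the income figures are invented for the example) shows the central-tendency measures and the robustness point made above: a single extreme value drags the mean but barely moves the median.

```python
import numpy as np

# A small "income" sample (units arbitrary; values invented for illustration)
income = np.array([18, 22, 25, 27, 30, 31, 33, 35, 40, 48], dtype=float)

print(np.mean(income), np.median(income))   # close together for this sample

# Add one "billionaire": the mean jumps, the median barely moves
income2 = np.append(income, 1_000_000.0)
print(np.mean(income2), np.median(income2))

# Quartiles and percentiles of the sample
q1, med, q3 = np.percentile(income, [25, 50, 75])
print(q1, med, q3)   # the median is, by definition, the 50th percentile
```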
Dispersion parameters answer the question "How does the random process vary about some location parameter?" Some well-known dispersion parameters are the variance, the range, and the interquartile range (IQR). The standard deviation is the square root of the variance; in a Normal distribution, the standard deviation is the distance from the mean to the abscissa of the inflection point of the density function. The range is the difference between the maximum and minimum outcomes. The IQR is the difference between the upper and lower quartiles.

Dispersion parameters are used to characterize or compare population variability, and in statistics variability is always associated with risk. If, for example, the means of two positive RVs are the same, their variances can be compared directly. But if the means differ, then an indirect dispersion parameter, such as the Coefficient of Variation (the ratio of the standard deviation to the mean, for a positive RV), is used. Also, as distributions depart from symmetry, the IQR becomes more useful than the variance, for the same reasons that the median and mode become more useful than the mean.

By varying the shape and scale parameters, a specific family of distributions can describe a specific population (i.e., by obtaining a good fit or approximation to the exact RV distribution). A Weibull, for example, can approximate a Normal, or can describe the exponential exactly (a Weibull with shape parameter 1 is an exponential). Another useful parameter is the threshold parameter, which provides a lower bound for the RV's range of possible values; the Weibull [4] is a good example of such a three-parameter distribution. It is also worth noticing that, unlike in the Normal, in most distributions the mean and variance are not themselves density function parameters but are obtained as functions of the shape and scale. Finally, skewness and kurtosis are two parameters that describe a distribution's degree of asymmetry and peakedness. Parameters help visualize the outcome patterns of an RV, which allows us to better understand them.
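The following sketch (again assuming SciPy and NumPy; the sample size and parameter values are illustrative) computes these dispersion parameters for a simulated exponential sample and shows the shape parameter at work:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = stats.expon.rvs(scale=10, size=1000, random_state=rng)

# Dispersion parameters of the sample
var = np.var(sample, ddof=1)
iqr = np.subtract(*np.percentile(sample, [75, 25]))  # upper minus lower quartile
cv = np.std(sample, ddof=1) / np.mean(sample)        # coefficient of variation
print(var, iqr, cv)

# Shape parameter at work: a Weibull with shape c = 1 IS the exponential,
# while a shape near 3.6 gives a nearly symmetric, Normal-looking density
print(stats.weibull_min.cdf(3.0, c=1.0, scale=10.0),
      stats.expon.cdf(3.0, scale=10.0))              # identical values
print(stats.weibull_min.stats(c=3.6, moments="s"))   # skewness close to 0
```

For a positive RV such as the exponential, the population coefficient of variation is exactly one, which is why the sample value hovers near it.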
Extreme Values or Outliers

Data analysis begins with identifying a suitable family of distributions, and its corresponding set of parameters, that accurately characterizes the random phenomenon under study. We can then analyze the distribution's behavior, especially in the tails, where the real action takes place. For it is in the distribution tails where distinct behavior really occurs, a fact particularly important in hypothesis testing. Hypothesis testing allows us to ascertain whether an unusually high or low observation has a reasonable probability of occurrence, or whether such an unusual observation constitutes a rare event under the current model assumptions, signaling a possible anomaly (e.g., some assumptions made are wrong).

An outlier or rare event is defined as an observation (in the tails of the RV range) that occurs with a very small probability. It is incorrect to believe that an outlier is always an erroneous observation or that it should automatically be removed from the sample. In the dice example, the sum 12 occurs with probability 1/36 ≈ 0.028 but may occur at any trial with that probability. We may perform the dice experiment three times in a row and roll three sums of 12 (an event that occurs with probability (1/36)³ ≈ 2.14 × 10^-5, very small but not zero). As another example, if the life of a device is exponentially distributed with a mean (i.e., 1/λ) of 100 hours, we may observe one device that lasts more than 500 hours, although the probability of such an event is only 0.0067. These outlying events seldom occur, but they can, and sometimes do! They may give us grounds to believe either (1) that the dice are loaded, or that the actual mean life of the device is more than 100 hours, or (2) that we have been extremely lucky (or unlucky) and have observed a rare event. The occurrence of a low-probability result raises a red flag but does not prove foul play. What statistics provides is a useful and scientific context in which to analyze such results.

For example, in a particular life test we may observe that a large number of otherwise acceptable devices fail. We observe that in all previous life tests (say 99) of the same device, we did not observe such a high number of failures. Such a result is a rare event (it occurs once in 100 times), and we may be tempted to automatically discard it as an anomaly and assume the information it provides is erroneous. But we may well be discarding very useful information. It may happen that, say, an unusual combination of humidity, temperature, and pressure, one that occurs only once in 100 times, greatly affects the failure mechanism of the device. And it may be that the life test in which we observed such a large number of failures was conducted precisely under such unusual conditions. If, instead of discarding these unusual test results as outliers, we submit them to further laboratory and statistical analyses, we may be able to discover the real reasons behind them.

On the other hand, rare events and outlying observations often result from clerical errors or other unrelated circumstances. Such cases do warrant discarding the unusual observation, because it no longer represents the population under analysis. Only in this case is it proper to remove such elements from the data set.
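A few lines of code (assuming SciPy for the survival function) verify the two tail probabilities quoted above:

```python
import math
from scipy import stats

# Three sums of 12 in a row: independent events, so probabilities multiply
p_twelve = 1.0 / 36.0
print(p_twelve**3)                        # ~2.14e-05, small but not zero

# Exponential life with mean 1/lambda = 100 hours:
# P{X > 500} = exp(-500/100) = exp(-5)
print(math.exp(-5.0))                     # ~0.0067
print(stats.expon.sf(500.0, scale=100.0)) # same value, via the survival function
```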
Data Collection

We have discussed observations of events: data points obtained by gathering information from the population of interest or under study. Such data are the lifeblood of statistical analysis. Hence, the next few paragraphs focus on the important subject of data collection.

We collect a sample of data because we want to study an entire population but do not have the time or means to examine it in its entirety. Yet we want our data analysis results to be valid for the entire population and not just the sample. To extend analysis results from the sample to the population (called extrapolation in statistical terms), the sample must meet several criteria. The sample must be representative of the population. Hence, the sample must be randomly drawn from the entire population of interest, and the sample elements must be independent. A draw is random when every element has the same probability of being selected. Two draws are independent if one result does not, in any way, affect the other.

Finally, data collection is expensive and time consuming. On the one hand, we strive to get as much data (information) as we can afford: the more information we obtain (the larger the sample), the smaller the margin of error and the more precise the estimates. On the other hand, time and budget constraints force us to work with samples much smaller than we might desire. Good statistics helps us extract as much information as possible from these samples, or define the optimal sample size to meet our requirements.

Conclusions and Summary

Statistical analysis is more than the mechanical application of a set of fixed procedures and equations. In fact, many statistical procedures and equations result from systematizing the process of scientific experimentation, developed under certain statistical assumptions and conditions. If such underlying assumptions and conditions (e.g., normality, independence, homogeneity of variances) are not met, then the results obtained from the statistical procedures used are not valid, or they have a different statistical interpretation (i.e., different probabilities of occurrence).

This article and those that follow in the series provide additional insight into the statistical thinking process. By applying statistical thinking to their analyses, engineers will improve their use of statistics as a reliability analysis tool and will extract greater benefits from their data analysis work.

Bibliography

1. Mann, N., R.E. Schafer, and N. Singpurwalla, Methods for Statistical Analysis of Reliability and Life Data, Wiley, New York, 1974.
2. Reliability Analysis Center, Reliability Toolkit: Commercial Practices Edition, Rome, NY, 1994.
3. Rohatgi, V.K., An Introduction to Probability Theory and Mathematical Statistics, Wiley, New York, 1976.
4. Romeu, J.L. and C. Grethlein, Statistical Analysis of Material Property Data, AMPTIAC, Rome, NY, 2000.
5. Romeu, J.L. and S. Gloss-Soler, "Some Measurement Problems Detected in the Analysis of Software Productivity Data and Their Statistical Consequences," Proceedings of the 1983 IEEE COMPSAC Conference, pp. 17-24.
6. Ross, S.M., Introduction to Probability and Statistics for Engineers and Scientists, Wiley, New York, 1987.