Fitting a Normal Using the Anderson-Darling GoF Test
The Anderson-Darling (AD) test is widely used in practice. For example, MIL-HDBKs 5 and 17 [4, 5, and 2] use AD to test for the Normal and Weibull distributions. In this and the next section, we develop two examples using the AD test: first for testing Normality and then, in the next section, for testing the Weibull assumption. To test for Lognormality, log-transform the original data and apply the AD Normality test to the transformed data set.
The AD GoF test for Normality (Reference [5] Section 8.3.4.1) has the functional form:
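\[
AD = \sum_{i=1}^{n} \frac{1-2i}{n} \left\{ \ln F_0\left[Z_{(i)}\right] + \ln\left(1 - F_0\left[Z_{(n+1-i)}\right]\right) \right\} - n \tag{1}
\]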
where F0 is the assumed (Normal) distribution with the assumed or sample estimated parameters (μ, σ); Z(i) is the ith sorted, standardized, sample value; "n" is the sample size; "ln" is the natural logarithm (base e) and subscript "i" runs from 1 to n.
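As a rough illustration, formula (1) can be sketched in Python as follows. This is a minimal sketch rather than the handbook's own code: the names `normal_cdf` and `ad_statistic` are illustrative, and the Normal cumulative probability is evaluated through the standard library's error function.

```python
import math


def normal_cdf(z):
    """Standard Normal cumulative probability, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ad_statistic(sample, mu, sigma):
    """AD statistic of formula (1) for an assumed Normal(mu, sigma)."""
    n = len(sample)
    # Z(i): the sorted, standardized sample values
    z = sorted((x - mu) / sigma for x in sample)
    total = 0.0
    for i in range(1, n + 1):
        lower = math.log(normal_cdf(z[i - 1]))        # ln F0[Z(i)]
        upper = math.log(1.0 - normal_cdf(z[n - i]))  # ln(1 - F0[Z(n+1-i)])
        total += (1 - 2 * i) / n * (lower + upper)
    return total - n
```

The function takes the raw sample together with the assumed or estimated parameters, so the same code serves both cases mentioned above.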
The null hypothesis, that the true distribution is F0 with the assumed parameters, is then rejected (at significance level α = 0.05, for sample size n) if the AD test statistic is greater than the critical value (CV). The rejection rule is:
\[
AD > CV = \frac{0.752}{1 + \dfrac{0.75}{n} + \dfrac{2.25}{n^{2}}}
\]
We illustrate this procedure by testing the tensile strength data in problem 6 of Section 8.3.7 of [5] for Normality. The data set (Table 1) is a small sample of six batches, drawn at random from the same population.
Table 1. Data for the AD GoF Tests
338.7    308.5    317.7    313.1    322.7    294.2
To assess the Normality of the sample, we first obtain point estimates of the assumed Normal distribution parameters: the sample mean and standard deviation (Table 2).
Table 2. Descriptive Statistics of the Prob 6 Data
Variable    N    Mean      Median
Data Set    6    315.82    315.40
Under the Normal assumption, F0 is Normal (μ = 315.8, σ = 14.9).
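The point estimates can be reproduced from the Table 1 values with Python's standard `statistics` module. This is only a sketch; it assumes the standard deviation is the usual sample estimate with an n - 1 divisor, which is consistent with the rounded value of 14.9 quoted above.

```python
import statistics

data = [338.7, 308.5, 317.7, 313.1, 322.7, 294.2]  # Table 1

n = len(data)
mean = statistics.mean(data)      # point estimate of mu
median = statistics.median(data)
stdev = statistics.stdev(data)    # sample standard deviation (n - 1 divisor)

print(f"N = {n}, mean = {mean:.2f}, median = {median:.2f}, stdev = {stdev:.2f}")
# prints: N = 6, mean = 315.82, median = 315.40, stdev = 14.85
```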
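We then evaluate the AD statistic (1) using the data (Table 1), the Normal cumulative probability, and the estimated parameters (Table 2). For the smallest element, using the unrounded estimates 315.82 and 14.85, we have:
\[
F_0\left[Z_{(1)}\right] = F_0\left(\frac{294.2 - 315.82}{14.85}\right) = F_0(-1.456) \approx 0.0727
\]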
Table 3 shows the AD statistic intermediate results that we combine into formula (1). Each component is shown in the corresponding table column, identified by name.
Table 3. Intermediate Values for the AD GoF Test for Normality
i   X       F(Z)       ln F(Z)    n+1-i   F(n+1-i)   1-F(n+1-i)   ln(1-F)
1   294.2   0.072711   -2.62126   6       0.938310   0.061690     -2.78563
2   308.5   0.311031   -1.16786   5       0.678425   0.321575     -1.13453
3   313.1   0.427334   -0.85019   4       0.550371   0.449629     -0.79933
4   317.7   0.550371   -0.59716   3       0.427334   0.572666     -0.55745
5   322.7   0.678425   -0.38798   2       0.311031   0.688969     -0.37256
6   338.7   0.938310   -0.06367   1       0.072711   0.927289     -0.07549
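The columns of Table 3 can be regenerated with a short script, which is a convenient way to check a hand calculation. The sketch below assumes the unrounded sample mean and standard deviation are used as the Normal parameters; the variable names are illustrative.

```python
import math
import statistics

data = [338.7, 308.5, 317.7, 313.1, 322.7, 294.2]  # Table 1
mu, sigma = statistics.mean(data), statistics.stdev(data)


def normal_cdf(z):
    """Standard Normal cumulative probability."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


x = sorted(data)
n = len(x)
F = [normal_cdf((v - mu) / sigma) for v in x]  # F(Z) for each sorted value

rows = []
for i in range(1, n + 1):
    f_i = F[i - 1]   # F(Z(i))
    f_rev = F[n - i] # F(Z(n+1-i)), the probabilities in descending order
    rows.append((i, x[i - 1], f_i, math.log(f_i),
                 n + 1 - i, f_rev, 1.0 - f_rev, math.log(1.0 - f_rev)))

for r in rows:
    print("{:d} {:6.1f} {:.6f} {:9.5f} {:d} {:.6f} {:.6f} {:9.5f}".format(*r))
```

Each printed row corresponds to one row of Table 3, in the same column order.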
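The AD statistic (1) yields a value of 0.1699, which is below the critical value of 0.633 and is therefore non-significant:
\[
AD = 0.1699 < CV = \frac{0.752}{1 + 0.75/6 + 2.25/36} = 0.633
\]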
Therefore, the AD GoF test does not reject the hypothesis that this sample was drawn from a Normal (315.8, 14.9) population, and we can proceed under the assumption that the data are Normal.
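The decision step can be checked in a few lines of Python. The critical-value expression in this sketch is an assumption: it is the small-sample form that reproduces the CV of 0.633 quoted here for n = 6 at the 0.05 level, and `critical_value` is an illustrative name.

```python
def critical_value(n):
    """Critical value at the 0.05 level, adjusted for sample size n (assumed form)."""
    return 0.752 / (1.0 + 0.75 / n + 2.25 / n**2)


ad = 0.1699             # AD statistic reported in the text
cv = critical_value(6)  # six observations
reject = ad > cv        # rejection rule: AD greater than CV

print(f"CV = {cv:.3f}, reject Normality: {reject}")
# prints: CV = 0.633, reject Normality: False
```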
In addition, we present the AD plot and test results from the Minitab software (Figure 2). The availability of software for its calculation is one of the strong advantages of the AD test. Notice how the Minitab graph yields the same AD statistic value and estimates that we obtain in the hand-calculated Table 3. For example, A-Square (= 0.17) is the AD statistic of formula (1). In addition, Minitab provides the GoF test p-value (= 0.88), which is the probability of obtaining an AD statistic at least as large as the one observed when the (assumed) Normality of the data is true. If the p-value is not small (say 0.1 or more), then we can assume Normality. Finally, if the data points (in the Minitab AD graph) show a linear trend, then support for the Normality assumption increases [9].
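When statistical software is not at hand, the p-value can be approximated directly from the AD statistic. The sketch below uses the piecewise approximation commonly attributed to D'Agostino and Stephens for the case where the mean and standard deviation are estimated from the sample. It is an assumption that this matches what any particular package computes, and `ad_normal_p_value` is an illustrative name.

```python
import math


def ad_normal_p_value(ad, n):
    """Approximate p-value of the AD Normality test (estimated mu and sigma)."""
    a = ad * (1.0 + 0.75 / n + 2.25 / n**2)  # small-sample adjusted statistic
    if a < 0.2:
        return 1.0 - math.exp(-13.436 + 101.14 * a - 223.73 * a**2)
    if a < 0.34:
        return 1.0 - math.exp(-8.318 + 42.796 * a - 59.938 * a**2)
    if a < 0.6:
        return math.exp(0.9177 - 4.279 * a - 1.38 * a**2)
    return math.exp(1.2937 - 5.709 * a + 0.0186 * a**2)


p = ad_normal_p_value(0.1699, 6)
print(f"p-value = {p:.2f}")
# prints: p-value = 0.88
```

For this example the approximation agrees with the p-value reported in Figure 2.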
The AD GoF test procedures, applied to this example, are summarized in Table 4.
Finally, if we want to fit a Lognormal distribution, we first take the logarithm of the data and then apply the AD GoF procedure to the transformed data. If the original data are Lognormal, then their logarithms are Normally distributed, and we can use the same AD statistic (1) to test for Lognormality.
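The Lognormal variant amounts to one extra transformation before the Normality calculation. The following is a minimal sketch under that reading; `ad_lognormal` is an illustrative name, and the parameters are estimated from the log-transformed sample.

```python
import math
import statistics


def normal_cdf(z):
    """Standard Normal cumulative probability."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ad_lognormal(sample):
    """AD statistic (1) applied to the natural logarithms of the sample."""
    logs = sorted(math.log(x) for x in sample)  # requires positive data
    n = len(logs)
    mu, sigma = statistics.mean(logs), statistics.stdev(logs)
    F = [normal_cdf((v - mu) / sigma) for v in logs]
    total = sum((1 - 2 * i) / n * (math.log(F[i - 1]) + math.log(1.0 - F[n - i]))
                for i in range(1, n + 1))
    return total - n


ad_log = ad_lognormal([338.7, 308.5, 317.7, 313.1, 322.7, 294.2])
print(f"AD on log-transformed data = {ad_log:.4f}")
```

The resulting statistic is compared with the same critical value as in the Normal case.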
Figure 2. Computer (Minitab) Version of the AD Normality Test
Table 4. Step-by-Step Summary of the AD GoF Test for Normality
Sort Original (X) Sample (Col. 1, Table 3) and standardize: Z = (x - μ)/σ
Establish the Null Hypothesis: assume the Normal (μ, σ) distribution
Obtain the distribution parameters: μ = 315.8; σ = 14.9 (Table 2)
Obtain the F(Z) Cumulative Probability (Col. 2, Table 3)
Obtain the Logarithm of the above: ln[F(Z)] (Col. 3)
Sort Cum-Probs F(Z) in descending order (n - i + 1) (Cols. 4 and 5)
Find the Values of 1- F(Z) for the above (Col. 6)
Find Logarithm of the above: ln[(1-F(Z))] (Col. 7)
Evaluate the test statistic via (1), AD = 0.1699, and the critical value, CV = 0.633
Since AD < CV, do not reject Normality; assume the distribution is Normal (315.8, 14.9)
When available, use the computer software and the test p-value