Numerical Examples
To illustrate these issues, consider Table 2. It shows samples of decreasing size drawn from two populations that have the same 8% defect rate (a mean of 80 defectives per batch of 1000). The standard deviation of the number of defectives per batch of 1000 is 8.6 for the first population and 25 for the second (about three times larger). We want to show the consequences of decreasing the sample size (or observation time) and of increasing the variability of the phenomenon under study (represented here by a larger standard deviation).
Table 2. Measures for Populations from Figures 1 and 2 for Different Sample Sizes
Population 1 (See Figure 1)
Population 2 (See Figure 2)
First notice how, for very large samples (n ≥ 100), the theoretical and estimated Means and Standard Deviations are relatively close. For smaller sample sizes (n ≤ 20), the Standard Deviations, Maximums, Minimums, Quartiles and even the Means fluctuate widely from one sample size to the next. This is one of the problems with small sample sizes: they reduce our confidence in the estimate of the population mean number of defectives per batch, which is our statement of interest (the statement could equally have concerned the reliability of the product, its mean life, or any other population parameter).
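The effect of sample size on the stability of the estimated mean can be sketched with a small simulation. This is only an illustration: it assumes Normally distributed batch counts with mean 80, which may differ from the populations in Figures 1 and 2, and the function name, replicate count and seed are hypothetical choices.

```python
import random
import statistics

def spread_of_sample_means(sigma, n, replicates=200, mu=80.0, seed=0):
    """Standard deviation of the sample mean across repeated samples of size n.

    Assumption: batch counts are Normal(mu, sigma); the source does not
    state the shape of the two populations.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(replicates):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)

for sigma in (8.6, 25.0):
    for n in (5, 20, 100, 1000):
        observed = spread_of_sample_means(sigma, n)
        theoretical = sigma / n ** 0.5  # standard error of the mean
        print(f"sigma={sigma:>4}  n={n:>4}  observed={observed:.2f}  theoretical={theoretical:.2f}")
```

Under these assumptions the sample mean scatters far more for small n, and more for the population with the larger standard deviation.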
Our statement of interest consists of a confidence interval (CI) that includes the population mean number of defectives per batch with high probability (e.g., 95% of the time we obtain such an interval). The confidence statement is, precisely, that the CI will cover the true mean number of defectives per batch at least 95% of the time. This raises two issues: the length of the CI and its probability. These two issues are related by the following equation:

$$\bar{x} \pm H = \bar{x} \pm z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
The formula has four elements: the length of the interval, 2H (or its half-length H, which is added to and subtracted from the sample mean x̄); the confidence level (1 - α); the population standard deviation (σ); and the sample size (n). Here z is the standard Normal percentile corresponding to 1 - α/2.
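As a minimal sketch of how these elements interact, the half-length can be computed for several confidence levels and sample sizes. The function name `half_length` is hypothetical, and the calculation assumes a known σ (here 8.6, the first population's value) and a Normal sampling distribution for the mean.

```python
import math
from statistics import NormalDist

def half_length(sigma, n, confidence=0.95):
    """Half-length H = z * sigma / sqrt(n) of a two-sided CI with known sigma."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # standard Normal percentile
    return z * sigma / math.sqrt(n)

# H grows with the confidence level and shrinks with the sample size.
for confidence in (0.90, 0.95, 0.99):
    for n in (10, 30, 100):
        h = half_length(sigma=8.6, n=n, confidence=confidence)
        print(f"confidence={confidence:.2f}  n={n:>3}  H={h:.2f}")
```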
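Since σ is a fixed (population) characteristic of the process, we can freely choose only two of the remaining three factors: the length (or half-length) of the interval, the confidence level, and the sample size; the third is then determined. For example, let the confidence level be 1 - α = 0.95 and the sample size n = 30. Then, using the sample means from Table 2 and z(0.975) = 1.96, we obtain the two CIs for the respective populations:

Population 1: $\bar{x} \pm 1.96 \times 8.6/\sqrt{30} = \bar{x} \pm 3.08$, giving (77.95, 84.11)
Population 2: $\bar{x} \pm 1.96 \times 25/\sqrt{30} = \bar{x} \pm 8.95$, giving (68.81, 86.71)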
The results have the following meaning: intervals constructed in this way, such as the CI (77.95, 84.11) for the first population and the CI (68.81, 86.71) for the second, include the true mean number of defectives (80) per batch of 1000 items produced 95% of the time. Recall that the second population has a standard deviation about three times larger than that of the first. It could very well happen (once in 20 times, on average) that the sample includes, by chance, several very low numbers and produces a CI that does not contain the true mean number of defectives per batch. Also, for the same confidence level (95%) and sample size (n = 30), the lengths (2H) of the two intervals are very different (6.16 vs. 17.9) because of the different variability (σ = 8.6 or 25) of the two processes.
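The two intervals and the "95% of the time" reading can be checked with a short sketch. The sample means are not quoted in the text, so the code takes each reported interval's midpoint as the sample mean, which is an inference. The coverage simulation further assumes Normally distributed batch counts with true mean 80. All names, the trial count and the seed are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean

Z = NormalDist().inv_cdf(0.975)  # about 1.96
N = 30
TRUE_MEAN = 80.0

# sigma and the interval reported in the text for each population
REPORTED = {
    "Population 1": (8.6, (77.95, 84.11)),
    "Population 2": (25.0, (68.81, 86.71)),
}

def rebuild_interval(sigma, reported_ci, n=N):
    """Recompute the CI around the midpoint of the reported interval."""
    lower, upper = reported_ci
    center = (lower + upper) / 2.0  # inferred sample mean (assumption)
    h = Z * sigma / math.sqrt(n)
    return center - h, center + h, 2.0 * h

def coverage(sigma, n=N, trials=5000, seed=1):
    """Fraction of simulated CIs that contain TRUE_MEAN (Normal data assumed)."""
    rng = random.Random(seed)
    h = Z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        xbar = fmean([rng.gauss(TRUE_MEAN, sigma) for _ in range(n)])
        if xbar - h <= TRUE_MEAN <= xbar + h:
            hits += 1
    return hits / trials

for name, (sigma, ci) in REPORTED.items():
    lo, hi, length = rebuild_interval(sigma, ci)
    print(f"{name}: CI=({lo:.2f}, {hi:.2f})  length={length:.2f}  "
          f"simulated coverage={coverage(sigma):.3f}")
```

Under these assumptions the coverage is close to 95% for both populations, while the interval length differs by roughly the ratio of the two standard deviations.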