Discussion
We can always state, without hesitation (or the need to draw any sample), that the batch mean number of defectives always lies between zero and the batch size. And we can have absolute (i.e., 100%) confidence in this statement. However, this result is of little use, for our interval is so large that it has no practical value. Hence, a high confidence by itself is not enough.
The objective of deriving a CI for an (unknown) parameter is to estimate its value. Let's compare it to throwing a hat over a coin sitting on a table. The CI is the hat and the coin is the parameter. If the hat (CI) is too small, it will not cover the coin (parameter) very often. But when it does, the area where we know the coin lies is small. If instead we use a large Mexican sombrero (large CI), it will cover the coin (parameter) often. But the sombrero (CI) will be so large that the coin (parameter) may be as lost under the hat as before, and we will have gained nothing from deriving such a CI. This is the crucial trade-off in deriving a confidence interval: to find one with a large enough probability of coverage (1 - α) and a small enough length (2H) to be of practical use.
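The hat-and-coin trade-off can be made concrete with a small sketch. It assumes the usual z-based interval for a mean with known standard deviation, whose half-length is H = z·σ/√n. The function name half_length and the values of SIGMA and N are illustrative assumptions, not taken from the text. With the sample size held fixed, asking for more confidence can only be paid for with a wider interval.

```python
from math import sqrt
from statistics import NormalDist


def half_length(confidence, sigma, n):
    """Half-length H of a z-based CI for a mean, assuming sigma is known."""
    alpha = 1.0 - confidence
    # Upper alpha/2 quantile of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * sigma / sqrt(n)


# Illustrative values only (assumptions for this sketch).
SIGMA = 8.6
N = 10

for confidence in (0.80, 0.90, 0.95, 0.99, 0.999):
    h = half_length(confidence, SIGMA, N)
    print(f"confidence {confidence:.3f} -> half-length H = {h:.2f}")
```

Each step up in confidence widens the "hat"; the only way to get both high confidence and a short interval is to increase n.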
Therefore, we must strike a balance between the usefulness of the statement (e.g., the CI length 2H) and the assurance it instills (e.g., confidence level 1 - α). Such a balance depends highly on the variation (standard deviation s) of the phenomenon under study and is achieved by using an adequate sample size (n). As an illustration, reconsider the previous example, but now pre-establishing a FIXED CI half-length (H) of two units, for a confidence level of 95%. This means that the population and sample means will be, 95% of the time, at most two units apart.
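For this case, the sample size (n) required to achieve such a half-length (H = 2), with population σ = 8.6 and confidence level 1 - α = 0.95 (which implies z_{α/2} = 1.96), is obtained by using the equation:
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{H}\right)^{2} = \left(\frac{1.96 \times 8.6}{2}\right)^{2} \approx 71$$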
That is, we need to draw a sample of size n = 71 batches of 1,000 items, and average their respective number of defects per batch. Then, 95% of the time, this average will be, at most, two units away from the true population mean (80) of batch defects.
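As a check on the arithmetic, the sample-size calculation can be sketched in a few lines. This assumes the standard known-σ formula n = (z·σ/H)²; the function name required_sample_size is hypothetical, and the inputs H = 2, σ = 8.6 and 1 - α = 0.95 are the ones quoted in the text.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(half_length, sigma, confidence):
    """Unrounded n for which a z-based CI has the requested half-length."""
    alpha = 1.0 - confidence
    # Upper alpha/2 quantile of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return (z * sigma / half_length) ** 2


# Values from the text: H = 2, sigma = 8.6, confidence level 0.95.
n_exact = required_sample_size(2.0, 8.6, 0.95)

print(f"unrounded n      : {n_exact:.2f}")
print(f"nearest integer  : {round(n_exact)}")
print(f"rounded up       : {ceil(n_exact)}")
```

The unrounded value is slightly above 71, so the figure of 71 quoted here is the nearest integer. Rounding up to 72 would be the conservative choice if the half-length must not exceed two units. Halving H to one unit roughly quadruples the required sample, since n grows with 1/H².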
The restrictions that must be imposed in a statistical analysis to achieve a given confidence level are similar to those imposed to achieve high confidence for investors in the stock market. They include restrictions about stockbroker expertise and experience as well as about the length of stock market observation. In the statistical context, these factors are now replaced by restrictions on sample size, confidence level, interval length, etc.