SECTION TWO - BASIC RELIABILITY CONCEPTS

This section discusses the concept of reliability and its importance as a product characteristic.

2.1 The Definition of Reliability

Succinctly put, reliability is a performance attribute that is concerned with the probability of success and frequency of failures and is defined as:

       The probability that an item will perform its intended function under stated conditions, either for a specified interval or over its useful life.
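
As a hedged illustration of this definition: if one assumes a constant failure rate (an exponential model, which this section does not prescribe), the probability of performing without failure over an interval t is exp(-t / MTBF), where MTBF is the mean time between failures. The function name and the numbers below are hypothetical.

```python
import math


def reliability(mission_hours: float, mtbf_hours: float) -> float:
    """Probability of completing mission_hours without failure.

    Assumption: constant failure rate (exponential model).
    """
    if mission_hours < 0 or mtbf_hours <= 0:
        raise ValueError("mission_hours must be >= 0 and mtbf_hours > 0")
    return math.exp(-mission_hours / mtbf_hours)


# Hypothetical item: 1,000-hour MTBF, 100-hour specified interval.
print(round(reliability(100.0, 1000.0), 4))  # 0.9048
```

Under this assumption, the "stated conditions" of the definition are whatever conditions the MTBF figure was established for; the same item used in a harsher environment would need a different MTBF value.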

2.2 The Importance of Reliability

Reliability is a measure of a product's performance that affects both product function and operating and repair costs. Too often performance is thought of only in terms of speed, capacity, range, and other "normal" measures. However, if a product fails so often that it's seldom available for use, its speed, range, and capacity are not relevant. Reliability is critical to safety and liability as well.

The reliability of a product is a primary factor in determining operating and repair costs, which are partially a function of the number of repairs needed over time, the number of spare parts required, and the number of maintenance personnel necessary. Other factors such as the repair policy (or maintenance concept) affect these costs, but reliability is a significant, and often the principal, factor.

Reliability determines whether or not a product is available to perform its function. A product with perfect reliability (i.e., no failures during the life of the product) would always be available for use. But perfect reliability is difficult to achieve, so even when a "good" level of reliability is achieved, some failures are to be expected. The effects of failures on availability (and cost) can be minimized with a "good" level of maintainability (a measure of how quickly the product can be repaired). Consequently, product reliability and maintainability (R & M) are said to be complementary characteristics, and can be combined to measure the percentage of time that the product is available for use. If the product never failed, the time between failures would be infinite and the availability would be 100%. Or, if the product could be repaired in no time at all, its repair time would be zero and, again, availability would be 100%.
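
One common way to combine the two characteristics, offered here as a sketch rather than as the method this document prescribes, is steady-state availability: uptime divided by uptime plus downtime. The measures MTBF (mean time between failures) and MTTR (mean time to repair), the function name, and the numbers are assumptions introduced for illustration.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the product is available for use.

    Assumption: steady-state availability = MTBF / (MTBF + MTTR),
    counting repair time as the only source of downtime.
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be > 0 and mttr_hours >= 0")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical product: fails every 500 hours on average, 5 hours to repair.
print(round(availability(500.0, 5.0), 4))  # 0.9901

# Zero repair time gives 100% availability regardless of failure frequency.
print(availability(500.0, 0.0))  # 1.0
```

The same availability can be reached by raising MTBF or by lowering MTTR, which is the sense in which reliability and maintainability are complementary.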

2.3 Key Reliability Issues

For any product or system, the key reliability issues from any customer's perspective are:
  • What measures of reliability are important to me?
  • What levels of reliability are necessary to meet my needs?
  • How will I determine if the required levels of reliability have been achieved?
From a supplier's perspective, the issues are:
  • What reliability activities are the most effective for the product or system, such that the reliability program objective is achieved?
  • What reliability design goals are appropriate to ensure that customers' needs are met?
  • What design approaches will be most effective in achieving the required reliability in the expected environmental and usage profiles?
  • What tasks can be effectively used to assess progress towards reliability goals and requirements?
  • What are the most appropriate means to determine if the reliability objective has been achieved?
  • How can the designed-in reliability be retained during manufacturing and operational use, thereby ensuring reliable performance?
In the commercial world, the average consumer is not usually concerned with the second set of issues - they are left to the supplier to confront. If a supplier does a poor job, the customer will go elsewhere for the product. Thus, competition in the marketplace provides a strong incentive to meet the customer's needs. In the defense world, the degree of competition is often less than in the commercial world. If dictated by the nature of the product (e.g., used only by the military), the risks (e.g., very high with unproven technologies being used), and the type of acquisition (e.g., new development), it may be beneficial for the government customer to take an active role in addressing the supplier's issues. Some industrial customers may also benefit from being involved with some of these issues, especially those dealing with measuring progress and determining the achieved level of reliability.

2.4 The Basics of a Reliability Program

The objective of any set of reliability tasks is to design and manufacture a reliable product in a cost effective manner. In order to effectively achieve that result:
  • The customer's reliability expectations and needs should be fully understood.
  • All levels of supplier management should be actively committed to meeting the reliability objective through an appropriate allocation of resources.
  • The attributes of design and manufacturing that impact reliability should be considered as an integral part of the system engineering process.
  • The product should be designed for the intended use environment and the consequences of failure understood.
  • The reliability of the design should be verified to ensure that goals and requirements have been achieved.
Customer's Reliability Expectations and Needs Should be Fully Understood. Adequate levels of reliability are essential to the overall performance of the product. These levels may be expressed by the customer (in which case, they are part of the customer's specification) or determined by the supplier as necessary to compete satisfactorily and to minimize liability. In either case, it is the customer's needs that must be understood and satisfied.

Supplier Management Should be Actively Committed. The ability to successfully achieve the reliability objective is dependent on the demonstrated level of commitment by management, particularly at the upper levels. This commitment can be reinforced by making sure that this objective is an integral part of the corporate technological and business strategy. Demonstrated and emphatic management commitment will help reinforce any "culture change" that may be necessary to implement those actions necessary to achieve the program objective.

Reliability Attributes of Design and Manufacturing Should Be an Integral Part of the System Engineering Process. By making the reliability aspects of design and manufacturing an integral part of the system engineering process, reliability requirements will be addressed concurrently with other performance requirements. In this way, reliability activities will be integrated with other engineering and design tasks, thereby avoiding duplicative effort and making the best use of output information and results. In planning a reliability program, the integration of design, analysis, and other tasks to minimize costs and maximize the use of task results should be explicitly addressed.

Product Should be Designed for the Intended Use Environment. To be reliable, the product should be designed for the environment in which it will be used, and the design should be thoroughly understood. Characterizing the environment is a top priority. The use environment includes all the stresses experienced by a product during packaging, shipping, and handling; storage; operation; and repair/maintenance. It also covers the types of users and the product duty cycles. Understanding the strengths and weaknesses of a design requires that critical failures be analyzed to determine their root causes and product-level effects, and that the design be changed to eliminate or minimize the effects of the failure modes.

Reliability of the Design Should be Verified. Verifying that the reliability objective has been met can be accomplished through testing or analysis. Through testing, the product's design (and the tools used to create that design) can be validated. Testing may uncover unexpected design weaknesses or unsatisfactory performance, and serves as a development tool that provides the feedback needed by engineers to refine their design and revise their analyses. Extensive testing may be unwarranted because of the nature of the product (very simple, or based on prior, proven technology), or prohibitive because it is too expensive. In such cases, analytical means can be used to determine if requirements have been met. Many times, both testing and analysis are used.
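
To show why extensive testing can become expensive, here is a minimal sketch of one standard calculation. It assumes a constant failure rate and a time-terminated test that ends with zero failures, in which case the one-sided lower confidence bound on MTBF is T / (-ln(1 - C)) for total test time T and confidence level C. The function names and numbers are hypothetical, and the source does not state that this particular test plan is the one intended.

```python
import math


def mtbf_lower_bound_zero_failures(total_test_hours: float,
                                   confidence: float) -> float:
    """Lower confidence bound on MTBF after a zero-failure test.

    Assumptions: constant failure rate, time-terminated test, no failures.
    """
    if total_test_hours <= 0 or not 0.0 < confidence < 1.0:
        raise ValueError("need total_test_hours > 0 and 0 < confidence < 1")
    return total_test_hours / -math.log(1.0 - confidence)


def required_test_hours(required_mtbf_hours: float,
                        confidence: float) -> float:
    """Failure-free test time needed to demonstrate required_mtbf_hours."""
    if required_mtbf_hours <= 0 or not 0.0 < confidence < 1.0:
        raise ValueError("need required_mtbf_hours > 0 and 0 < confidence < 1")
    return required_mtbf_hours * -math.log(1.0 - confidence)


# Demonstrating a hypothetical 1,000-hour MTBF at 90% confidence:
print(round(required_test_hours(1000.0, 0.90), 1))  # 2302.6
```

Under these assumptions the required failure-free test time is more than twice the MTBF being demonstrated, which is one reason analysis is often used alongside, or in place of, testing.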

The objective of a sound reliability program should be similar to that for other design characteristics, since reliability is an inherent design characteristic. Attempts to improve the inherent reliability of a product after the design is "frozen" are usually expensive and inefficient. In addition, reliability depends on other factors, most notably how the product or system is actually used and how it is repaired if failures occur. These factors can easily compromise the performance of a "good" design. For example, poorly trained repair personnel can cause product reliability to suffer due to induced failures. So, although suppliers may concentrate on achieving reliability through sound engineering, design, test, and manufacturing, it must be remembered that many post-manufacturing activities should be planned early. These range from strategizing repair policies to establishing failure data tracking systems in order to capitalize on the inherent reliability characteristics of the product.

2.5 Reliability Oriented Tasks

The remaining sections of this Blueprint provide insight into the reliability tasks that may be appropriate for different product development situations. To set the stage for those discussions, the tasks that are common to the reliability discipline must be introduced. Table 1 includes those tasks that have become common practice over the years. They represent an extensive set of activities grouped by the technical nature of the activity. Because the Blueprint format is oriented towards the performance of tasks that serve a reliability purpose, the tasks identified in the table are cross-referenced to the purposes they can address in a reliability program. In the later sections that address each purpose, those referenced tasks will be discussed in greater detail. The intent is to emphasize the purposes that the tasks serve, rather than the tasks themselves.

Table 1: Reliability Tasks
Tasks are grouped by type of activity. Each entry gives the task, its description, and the issues (3.1 through 3.6) to which it is relevant.

DESIGN
  • Critical Item Control. Monitoring in-house and suppliers' activities to reduce the risk to product reliability from items identified as critical. Can include hardware and software. Relevant to issues: 3.1, 3.4, 3.6.
  • Critical Item Identification. Cataloging items that have relatively high impact in determining product reliability. Can include hardware and software. Relevant to issues: 3.3.
  • Derating. Limiting the maximum allowable stresses on a part to a designated value below its rated maximum stress in order to improve its reliability. Relevant to issues: 3.3.
  • Design Reviews. Formal or informal independent evaluation and critique of a design to identify and correct hardware or software deficiencies. Relevant to issues: 3.1, 3.3, 3.4.
  • Environmental Characterization. Determination of the operational stresses the product can be expected to experience. Relevant to issues: 3.2, 3.3, 3.6.
  • Fault Tolerance. Designing alternate means to continue operation when components of a product fail. Relevant to issues: 3.2, 3.3.
  • Parts Application. Using parts under design rules intended to assure that they will operate reliably under the expected operational stresses. Relevant to issues: 3.3.
  • Parts Selection. Choosing parts that will be effective and reliable in the planned application and which should be available at reasonable cost during the product's life. Relevant to issues: 3.3.
  • Supplier Control. Monitoring suppliers' activities to assure that purchased hardware and software will have adequate reliability. Relevant to issues: 3.1, 3.4, 3.6.
  • Thermal Design. Consideration of heat generation and dissipation in the product in order to prevent reliability problems caused by the effects of temperature. Relevant to issues: 3.3.

ANALYSIS
  • Allocations. Translation of product reliability goals into reliability goals for the components making up the product. Relevant to issues: 3.2, 3.3.
  • Design of Experiments (DOE). Systematically determining the impact of process and environmental factors on a desired product parameter, in order to reduce product variability by controlling the factors. Relevant to issues: 3.3, 3.4, 3.6.
  • Dormancy Analysis. Determination of the effects of expected periods of storage or other non-operating conditions on the reliability of the product. Relevant to issues: 3.2, 3.3, 3.4.
  • Durability Assessment. Determination of whether or not the mechanical strength of a product will remain adequate for its expected life. Relevant to issues: 3.2, 3.3, 3.4, 3.5.
  • Failure Modes, Effects & Criticality Analysis (FMECA). Systematically determining the effects of part or software failures on the product's ability to perform its function. This task includes FMEA. Relevant to issues: 3.3, 3.4, 3.5, 3.6.
  • Failure Reporting Analysis & Corrective Action System (FRACAS). A closed-loop system of data collection, analysis and dissemination to identify and correct failures of a product or process. Relevant to issues: 3.3, 3.4, 3.5, 3.6.
  • Fault Tree Analysis (FTA). Using deductive (top-down) logic to determine the possible causes of a defined undesired operational result. Relevant to issues: 3.3, 3.4, 3.5.
  • Finite Element Analysis (FEA). Determining the mechanical stresses present in products through simulation by decomposing the product into simple elements. Relevant to issues: 3.3, 3.4.
  • Life Cycle Planning. Determining reliability (and other) requirements by considering the impact over the expected useful life of the product. Relevant to issues: 3.1, 3.2, 3.3, 3.4, 3.5, 3.6.
  • Modeling & Simulation. Creation of a representation, usually graphical or mathematical, for the expected reliability of a product, and validating the selected model through simulation. Relevant to issues: 3.2, 3.3.
  • Parts Obsolescence. Analysis of the likelihood that changes in technology will make the use of a currently available part undesirable. Relevant to issues: 3.1, 3.3, 3.4, 3.6.
  • Predictions. Estimation of reliability from available design, analysis or test data, or data from similar products. Relevant to issues: 3.2, 3.3, 3.4, 3.5.
  • Repair Strategies. Determination of the most appropriate or cost-effective procedures for restoring operation after a product fails. Relevant to issues: 3.1, 3.3, 3.6.
  • Sneak Circuit Analysis (SCA). Investigation to discover the existence of unintended signal paths in a product. Relevant to issues: 3.3, 3.4, 3.5.
  • Thermal Analysis. Analysis of the heat dissipations, transfer paths and cooling sources to determine if part/product temperatures are consistent with reliability needs. Relevant to issues: 3.2, 3.4.
  • Translations. Determining product design goals (i.e., product reliability) from the user's operational requirements. Relevant to issues: 3.2.
  • Worst Case Circuit Analysis (WCCA). Analysis of the effects of variability in the components of a product on the product's performance. Relevant to issues: 3.3, 3.4, 3.5.

TEST
  • Accelerated Life Test. Testing at high stress levels over compressed time periods to draw conclusions about the reliability of a product under expected operating conditions, based on formulated correlation factors. Relevant to issues: 3.3, 3.4, 3.5.
  • Environmental Stress Screening (ESS). Operating a product under high stress to identify defects (by causing them to become failures) in order to eliminate them before a product is shipped to its user. Relevant to issues: 3.6.
  • Production Reliability Acceptance Test (PRAT). Testing a product during production to assure that its reliability has not degraded. Relevant to issues: 3.5, 3.6.
  • Reliability Demonstration Test (RDT)/Reliability Qualification Test (RQT). Testing a product to demonstrate whether its reliability requirement has been achieved. Relevant to issues: 3.5.
  • Reliability Growth Test (RGT)/Test Analyze and Fix (TAAF). Testing a product to identify reliability deficiencies in order to eliminate their causes. Relevant to issues: 3.4, 3.5.
  • Test Strategy. Determination of the most cost-effective mix of tests for a product. Relevant to issues: 3.1, 3.3, 3.4, 3.5, 3.6.

OTHER
  • Benchmarking. Comparison of a supplier's performance attributes to its competitors' and to the best performance achieved by any supplier in a comparable activity. Relevant to issues: 3.1, 3.2.
  • Statistical Process Control (SPC). Comparing the variability in a product against statistical expectations, to identify any need for adjustment of the production process. Relevant to issues: 3.1, 3.6.
  • Quality Function Deployment (QFD). Capturing the desires of the customer and translating these to tasks needed in the product development program. Relevant to issues: 3.1, 3.2.
  • Market Survey. Determining the needs and wants of potential customers, their probable reaction to potential products, and their level of satisfaction with existing products. Relevant to issues: 3.1, 3.2, 3.6.
  • Inspection. Comparing a product to its specifications, as a quality check. Relevant to issues: 3.1, 3.6.
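
Two of the analysis tasks in Table 1, Predictions and Allocations, can be sketched for the simplest case. The sketch assumes a series product (every component must work for the product to work), independent component failures, and equal apportionment of the goal across components; none of these assumptions comes from the table itself, and the function names and numbers are hypothetical.

```python
import math
from typing import Sequence


def predict_series(component_reliabilities: Sequence[float]) -> float:
    """Prediction: product reliability as the product of component
    reliabilities (assumes a series structure and independence)."""
    if any(not 0.0 <= r <= 1.0 for r in component_reliabilities):
        raise ValueError("each reliability must lie in [0, 1]")
    return math.prod(component_reliabilities)


def allocate_equal(product_goal: float, n_components: int) -> float:
    """Allocation: the reliability each of n identical-share components
    must reach so that their product meets product_goal."""
    if not 0.0 < product_goal <= 1.0 or n_components < 1:
        raise ValueError("need 0 < product_goal <= 1 and n_components >= 1")
    return product_goal ** (1.0 / n_components)


# Hypothetical three-component product.
print(round(predict_series([0.99, 0.98, 0.995]), 4))  # 0.9653

# Component goal needed to meet a 0.95 product goal with three components.
print(round(allocate_equal(0.95, 3), 4))  # 0.983
```

Allocation runs top-down from the product goal to the components, and prediction runs bottom-up from component estimates to the product; comparing the two shows where the design falls short of its goals.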

2.6 Product Program Phases

Each product, from the simplest to the most complex, passes through a sequence of phases during its life cycle. The definitions of the phases vary among commercial companies, and within the military. Table 2 describes the sequence of general phases that will be used in this document to describe a product's life.

Table 2: Product Life Cycle Phases
Concept/Planning
  • Formulate ideas, estimate resources and financial needs
  • Identify risks & requirements
  • Program objective
Design/Development
  • Identify and allocate needs and requirements
  • Propose alternate approaches
  • Design and test the product
  • Develop manufacturing, operating, and repair/maintenance tasks
Production/Manufacturing
  • Refine and implement manufacturing procedures
  • Finalize production equipment
  • Establish quality processes
  • Build & distribute the product
Operation/Repair
  • Implement operating, installation and training procedures
  • Provide repair and maintenance service
  • Repair warranty items
  • Provide for performance feedback
Wearout/Disposal
  • Implement refurbishment and disposal tasks
  • Resolve potential wearout issues

What sometimes distinguishes one phase from the next is a decision milestone, sometimes referred to as a "gate." It represents a point in time where the program can go forward or stop. For many products, the phases may be abbreviated or combined. For example, the Concept/Planning and Design/Development phases may be combined under a compressed schedule for a new product that is simply an update or slightly modified version of an older, proven product. Reliability tasks for this type of program would concentrate only on the differences between the old and the modified product. As a result, the number of engineering tasks would be reduced. It is important to understand that tasks performed in one phase are often the result of the analysis, trade-offs and planning performed in an earlier phase. For example, trade-offs addressing approaches to manufacturing printed circuit boards would be performed during Design/Development, with the implementation of the process decision to follow during the Production/Manufacturing phase.


