TRW Automotive Assesses PRISM® Methodology for Internal Use
By: Mary G. Priore, Parveen S. Goel, Ph.D., and Rachel Itabashi-Campbell
Executive Summary
TRW Automotive (formerly TRW Chassis System Division) in Detroit, Michigan, conducted a 10-month in-house effort to identify improved methodologies for assessing reliability and estimating warranty costs. The effort responded to prohibitive limitations of traditional methodologies, including MIL-HDBK-217F, Reliability Prediction of Electronic Equipment, when applied to automotive electronic systems. The objective was to identify methodologies that go beyond or improve upon classical assessment methods. A key task was the systematic evaluation of RAC's PRISM® methodology for applicability and usefulness in assessing TRW electronic systems. The study consisted of acquiring a thorough understanding of the methodology itself, conducting a PRISM assessment on a system with demonstrated performance, and critically analyzing the PRISM results, including predicted vs. actual failure rates and predicted high failure rate contributors. Results of the effort were positive, with TRW training a team of internal PRISM experts and adopting the PRISM methodology as a starting point in a comprehensive methodology to cover a wide range of electro-mechanical systems.
Background
Traditional reliability assessment methodologies, including MIL-HDBK-217F, Reliability Prediction of Electronic Equipment, possess several well-documented deficiencies that have historically limited their usefulness with regard to automotive electronic systems. Recently, TRW Automotive conducted an in-house effort to identify new and better methodologies for assessing reliability and estimating warranty costs. Objectives of the effort were:
- Go beyond and improve upon classical methods
- Deal with situations when failure data is scarce
- Integrate available soft failure information
- Quantify data available in qualitative format
One component of this effort was the evaluation of RIAC PRISM technology for use in assessing the reliability of TRW's automotive electronic systems.
PRISM was chosen for evaluation because of its departures from traditional reliability prediction methodologies. These departures address weaknesses identified previously by TRW, weaknesses that had made earlier prediction methodologies inappropriate for use on TRW automotive systems. The basic premise of PRISM is that the reliability of a system is the result of the failure potential inherent in each of the components and subassemblies, and of how well the organization mitigates potential failure modes through the product development process, manufacturing control, supplier quality management, maintenance programs, and the robustness of the software. The PRISM capabilities that TRW found most appealing are:
- Accounts for non-component failures by taking into account process-related information
- Includes a software reliability model
- Is capable of integrating a priori knowledge, such as historical data on a similar product
- Handles all available field and test data
- Predictions are focused on most likely outcome instead of worst case
- Prediction results are in calendar hours vs. operating hours
- Models do not penalize commercial components
- Models are updated and kept current
Procedure
To proceed, a pilot case study was developed. For the study, TRW Automotive identified a small electronic subsystem having both hardware and software components for which considerable reliability data and information were available. TRW tasked RIAC to perform a reliability analysis of the system, using PRISM, to demonstrate the methods and usefulness of a PRISM-based analysis and its potential applicability to TRW Automotive systems and equipment. RIAC performed the analysis based on the Bill of Materials, system operational and environmental information, and warranty data provided by TRW.
RIAC applied a detailed PRISM methodology to the subject system, exercising several of PRISM's capabilities, including the incorporation of existing field/warranty data, the software reliability model, and process grading.
The PRISM analysis was conducted first using PRISM models and data alone, then with TRW-supplied warranty data incorporated into the PRISM model using the Bayesian analysis option within PRISM. This option allows for the refinement of an initial PRISM prediction with the incorporation of new data on a system, as it becomes available.
RIAC engineers began the analysis by entering a hierarchical breakdown defining a series configuration structure of the system into PRISM.
The next step involved defining or defaulting the numerous design and environmental conditions such as component and system stress levels and operating environment based upon information provided by TRW. Where available, actual stress/derating information for system components as well as values of rated power for resistors and capacitance for capacitors were used as specified by TRW. Data on actual component temperature rise above board ambient temperature was used as specified by TRW. PRISM default temperature rises were used to define the temperature rise above board ambient temperature for all other components. Application environment and operating profile conditions for the reliability prediction were based on the temperature, vibration, relative humidity, duty cycle and cycling rate either provided to RIAC by TRW, or defaulted.
Next, PRISM process grading was conducted. PRISM process grading is intended to broadly capture the degree to which actions have been taken to mitigate the occurrence of system and component failures due to the internal processes in place dealing with design, manufacturing, management, quality, infant mortality, reliability growth, induced failures, can-not-duplicate failures, wearout and parts. The process grades were determined from TRW responses to a series of PRISM process grade questions. In PRISM, the defaulted values correspond to the average or typical values that can be expected from a good manufacturer and a good design.
Finally, available TRW warranty data was entered into PRISM. PRISM combines available part and system data with the initial prediction by means of Bayesian analysis.
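The article does not spell out the arithmetic behind PRISM's Bayesian option. As a rough illustration of how field data can refine a prior prediction, the sketch below uses a generic gamma-Poisson conjugate update for a constant failure rate. The function name, the pseudo-failure weight of 0.5, and all numeric inputs are hypothetical and are not TRW data or a statement of PRISM's internal algorithm.

```python
import math


def bayes_update(prior_rate, prior_failures, observed_failures, observed_hours):
    """Gamma-Poisson update of a constant failure rate (illustrative only).

    prior_rate        -- initial prediction, failures per calendar hour
    prior_failures    -- pseudo-failure count expressing how much weight
                         the prediction carries (assumed value)
    observed_failures -- failures seen in the field
    observed_hours    -- cumulative calendar hours behind those failures
    """
    if prior_rate <= 0 or prior_failures <= 0:
        raise ValueError("prior_rate and prior_failures must be positive")
    # Equivalent exposure implied by the prior prediction.
    prior_hours = prior_failures / prior_rate
    # Posterior mean of the gamma distribution after observing the data.
    return (prior_failures + observed_failures) / (prior_hours + observed_hours)


# Hypothetical inputs, chosen only to show the behavior.
predicted = 2.0e-6                 # prediction without field data
field_failures = 3
field_hours = 4.0e6
field_only = field_failures / field_hours
updated = bayes_update(predicted, 0.5, field_failures, field_hours)

print(f"prediction alone : {predicted:.3e}")
print(f"field data alone : {field_only:.3e}")
print(f"updated estimate : {updated:.3e}")
```

With little field exposure the updated estimate stays near the prediction; as hours accumulate it moves toward the field-data rate, which is the qualitative behavior reported in Table 1 when warranty data is included.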
PRISM then calculated individual component failure rates as well as the total system failure rate, both with and without the incorporation of TRW warranty data. PRISM output reports included a component Pareto report, which ranks system components according to their percentage contribution to the overall system failure rate, and an assembly breakdown detail report, which lists assembly failure rates and component failure rates by assembly.
Results
Table 1 compares PRISM predicted values with the actual field data, based on the first six months of warranty data. In the table, X represents the failure rate calculated from TRW warranty data excluding induced and no-defect-found failures (corresponding to the PRISM inherent failure rate), and Y represents the failure rate calculated from TRW warranty data with all returns, including induced and no-defect-found failures (corresponding to the PRISM logistic failure rate).
Table 1. PRISM Reliability Summary
| Type of Analysis | Failure Rate Calculation | Failure Rate (Failures per 10^6 Calendar Hours) |
| --- | --- | --- |
| PRISM w/o Warranty Data | PRISM Inherent | 2.3X |
| | PRISM Logistic | 1.8Y |
| PRISM w/ Warranty Data | PRISM Inherent | ~X |
| | PRISM Logistic | ~Y |
| TRW Warranty Data Alone | Data Inherent | X |
| | Data Logistic | Y |
Assumptions: Operating Temperature 31°C, Non-operating Temperature 14°C, Duty Cycle 5%. Failure rates include system level multiplier.
TRW then evaluated the PRISM Pareto results table that provides the rank ordering of predicted part failure rate contributors for the sample system. A sample PRISM Pareto chart is shown in Table 2. Actual part descriptions have been replaced with generic descriptions due to the proprietary nature of the data.
Table 2. Sample PRISM Pareto Chart
Rank Order of Greatest Failure Rate Contributors
| Generic Part Type | Qty. | % Contribution to System Failure Rate |
| --- | --- | --- |
| Non-electrical component | | 29.62 |
| Electrical component, passive | | 23.26 |
| Electrical component, active | | 8.8 |
| Electrical component, active | | 5.86 |
| Electrical component, passive | | 5.31 |
| Electrical component, active | | 5.03 |
| Electrical component, active | | 2.93 |
| Non-electrical component | | 2.93 |
| Electrical component, active | | 1.31 |
| Electrical component, active | | 1.06 |
| Electrical component, active | | 0.73 |
| Electrical component, active | | 0.69 |
| Non-electrical component | | 0.69 |
| Electrical component, active | | 0.59 |
The results identify which devices are expected to be the largest failure rate contributors in the fielded system, as designed. Recommended design and derating guidelines for these components are included within the report. As is commonly the case, approximately 88% of the total predicted failure rate was attributable to 20% of the total parts. While the ordering of top failure rate contributors predicted by PRISM did not exactly match the order seen in the field, TRW field and test results confirmed that some of these parts were indeed among the high contributors in fielded systems. In fact, TRW had already taken action to design out problem parts identified by PRISM by the time the analysis was complete. One notable exception, however, is software. The software was predicted to be one of the greatest failure rate contributors, but to date TRW has not observed any field failure due to this product's software. This is one discrepancy noted by TRW reviewers between the predicted Pareto ranking and the observed field performance of the product.
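The cumulative view behind a Pareto ranking can be sketched in a few lines. The percentages below are the 14 contributions listed in Table 2; the function names are illustrative only. The listed rows sum to roughly 88.8%, which agrees with the "approximately 88%" figure quoted above.

```python
# Percentage contributions to system failure rate, as listed in Table 2.
contributions = [29.62, 23.26, 8.8, 5.86, 5.31, 5.03, 2.93,
                 2.93, 1.31, 1.06, 0.73, 0.69, 0.69, 0.59]


def cumulative(values):
    """Return running totals after sorting contributions largest first."""
    running = 0.0
    totals = []
    for value in sorted(values, reverse=True):
        running += value
        totals.append(round(running, 2))
    return totals


def parts_to_reach(values, threshold):
    """Number of top-ranked parts needed to reach a cumulative threshold (%)."""
    for count, total in enumerate(cumulative(values), start=1):
        if total >= threshold:
            return count
    return None  # threshold exceeds the sum of the listed contributions


cum = cumulative(contributions)
for rank, total in enumerate(cum, start=1):
    print(f"top {rank:2d} parts: {total:6.2f}% of predicted failure rate")
```

Here `parts_to_reach(contributions, 50)` shows that the two top-ranked part types already account for more than half of the predicted system failure rate.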
In the next step, TRW examined PRISM process grading results. PRISM process grading addresses differences in system/assembly design, manufacturing, part procurement, management, and testability processes. It is intended to broadly capture the degree to which actions have been taken to mitigate the occurrence of system and component failure due to these processes. For the TRW pilot system, the process grades were determined from TRW responses to the process grade questions. In PRISM, the defaulted values correspond to the average or typical values that can be expected from a good manufacturer and a good design.
The results from TRW's responses to the process grading questions demonstrate an improvement over the default values for a good design in most categories. Table 3 compares the PRISM default values with those for TRW, with the best improvement in the Manufacturing category. TRW scored average ratings in the areas of wearout and infant mortality. All other scores were above average.
Table 3. Comparison of Process Grade Factors
| Process Grade Category | Default Value | TRW Value | Percentage Difference* (%) |
| --- | --- | --- | --- |
| Design | 0.0941 | 0.0323 | 48.89 |
| Manufacturing | 0.1420 | 0.0134 | 82.75 |
| Part Quality | 0.2430 | 0.1300 | 30.29 |
| System Management | 0.0360 | 0.0133 | 46.04 |
| Can Not Duplicate | 0.2370 | 0.2120 | 5.57 |
| Induced | 0.1410 | 0.0272 | 67.66 |
| Wearout | 0.1080 | 0.1080 | 0.00 |
| Growth | 1.0000 | 0.9290 | 3.68 |
| Infant Mortality | 0.9720 | Default | 0.00 |
*The positive values for percentage difference demonstrate improvement over an averaged/good industry process.
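The article does not state how the Percentage Difference column is computed. The tabulated figures are reproduced, to rounding, by (default − TRW) / (default + TRW) × 100; that formula is inferred here from the numbers in Table 3 and is not taken from PRISM documentation. The sketch below checks it row by row, treating the "Default" entry for Infant Mortality as equal to the default value.

```python
# (default value, TRW value, published percentage difference) from Table 3.
# Infant Mortality is listed as "Default" for TRW, so the default is reused.
grades = {
    "Design":            (0.0941, 0.0323, 48.89),
    "Manufacturing":     (0.1420, 0.0134, 82.75),
    "Part Quality":      (0.2430, 0.1300, 30.29),
    "System Management": (0.0360, 0.0133, 46.04),
    "Can Not Duplicate": (0.2370, 0.2120, 5.57),
    "Induced":           (0.1410, 0.0272, 67.66),
    "Wearout":           (0.1080, 0.1080, 0.00),
    "Growth":            (1.0000, 0.9290, 3.68),
    "Infant Mortality":  (0.9720, 0.9720, 0.00),
}


def pct_difference(default, trw):
    """Inferred formula: difference relative to the sum of the two values."""
    return 100.0 * (default - trw) / (default + trw)


for category, (default, trw, published) in grades.items():
    computed = pct_difference(default, trw)
    print(f"{category:18s} computed {computed:6.2f}  published {published:6.2f}")
```

Note that this is not the more familiar (default − TRW) / default ratio, which would give about 65.7% for Design rather than the published 48.89%.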
TRW's Assessment and Future Work
TRW believes that reliability assessment generates important metrics in the product development process. Reliability metrics are used to evaluate the robustness of the new concept, disclose design deficiencies, and verify the effectiveness of corrective actions taken. The use of reliability metrics, however, is not without challenges due to the typically limited availability of statistical evidence to support predictions in the early phases of the product development process. Compensating somewhat for the limited availability of reliability data is the presumed common knowledge that new engineering designs tend to be evolutionary, rather than revolutionary, in that existing designs are very often modified to arrive at new designs to suit the new requirements. Keeping the evolutionary nature of its products in mind, TRW is working to collect and use reliability information from and knowledge of the predecessor design to derive better and more realistic reliability estimates for future designs.
TRW recognizes the PRISM methodology to be one of the first attempts, if not the first attempt, in industry to systematically account for non-component-related failures. In doing so, the development of a method to quantify qualitative information by way of process grading is noteworthy. PRISM's ability to incorporate prior knowledge, by way of a Bayesian process, fits the evolutionary nature of engineering very well. By integrating deterministic, probabilistic, and empirical approaches to derive system-level, component-level, and software reliability predictions, PRISM has taken reliability and warranty predictions beyond mere statistical calculations.
The PRISM program does not, however, serve as a single answer to what TRW is ultimately looking for. For one thing, the PRISM prediction is static. It is based on a point-estimate, piece-at-a-time approach and is not intended or designed to provide a time-dependent failure rate, which would be a very desirable feature for warranty forecasting.
TRW attributes the weak correlation between the predicted software failure rate and the observed rate mentioned earlier to the fact that PRISM's software robustness model is based strictly on a single quality assessment methodology, whereas the TRW group that participated in this study employs a different process model and set of metrics for assuring its software quality. TRW believes that the self-learning capability of the model in development will help reduce this type of gap.
Secondly, some TRW reviewers of the PRISM methodology would like to see confidence limits around the predicted failure rate. PRISM at this time does not provide them.
Further, several TRW reviewers have taken issue with the way screening is treated in the PRISM methodology. PRISM is programmed so the presence of the end-of-line screening favorably affects the prediction outcomes. This is, however, contrary to what has actually been observed at various facilities.
Finally, PRISM's failure rate models basically follow exponential distributions. TRW has observed that not all electrical components fail exponentially.
Having said that, TRW believes that the PRISM methodology provides a positive starting point for performing reliability assessment on electronic systems, as well as a means to gain design and process insight, when very scarce data exist. It has provided the first step towards developing a comprehensive model for assessing the reliability of TRW's products and systems, a model that will be of great help in future decision-making. Steps have been taken to incorporate the above-mentioned reviewers' feedback into the new model.
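To illustrate what an exponential assumption leaves out, the sketch below compares the constant hazard rate of an exponential model with the time-dependent hazard of a Weibull model. The Weibull distribution is used here only as a common textbook alternative; it is not stated to be the model TRW adopted, and the shape and scale values are hypothetical.

```python
import math


def exponential_hazard(t, rate):
    """Exponential model: the failure rate is the same at every age t."""
    return rate


def weibull_hazard(t, shape, scale):
    """Weibull hazard h(t) = (shape/scale) * (t/scale)**(shape - 1), t > 0."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


scale = 1000.0  # hypothetical characteristic life, in hours
for t in (100.0, 500.0, 2000.0):
    early = weibull_hazard(t, 0.5, scale)    # shape < 1: decreasing hazard
    flat = weibull_hazard(t, 1.0, scale)     # shape = 1: reduces to exponential
    wearout = weibull_hazard(t, 2.0, scale)  # shape > 1: increasing hazard
    print(f"t={t:6.0f} h  early={early:.2e}  flat={flat:.2e}  wearout={wearout:.2e}")
```

A shape parameter below 1 mimics infant mortality, a value above 1 mimics wearout, and a value of exactly 1 recovers the constant failure rate that an exponential model assumes throughout.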
Conclusions
The TRW PRISM project is the beginning of a long-term effort within TRW to develop a comprehensive reliability prediction methodology by integrating existing best practices, knowledge sources, and tools. PRISM has been found to perform as designed. Like any software program, it is sensitive to assumptions; when used appropriately, it can serve as a valuable design tool for facilitating design trade-off decisions, identifying potential high failure rate items, and identifying areas of the manufacturing process where improvements would raise reliability. A PRISM team of engineers at TRW Automotive has attended the RIAC PRISM training and worked closely with RIAC engineers to develop in-house expertise in PRISM. TRW is looking to use PRISM as a starting point for assessing the reliability of electronic control units used on steering systems. Eventually, outcomes of the PRISM prediction will become input data for a comprehensive reliability model currently in development, which will enable reliability assessment of an entire electro-mechanical system.