Robustness Validation Semiconductor 2015
Robustness Validation
of Semiconductor Devices
in Automotive Applications
The revision of this handbook under Section 9.1 explains the application of the decision flow in the Q100/101 annex in more detail. In addition, other improvements from Robustness Validation practice, new tutorials and publications are subject of this revision.
Andreas Preussger
Core Team Leader
RV Group
Editor in Chief 3rd edition
• Robustness Validation for MEMS - Appendix to the Handbook for Robustness Validation of Semiconductor Devices in Automotive Applications (2009).
• Handbook for Robustness Validation of Automotive Electrical/Electronic Modules and content copy: SAE Standard J1211 (2008, under review).
• Automotive Application Questionnaire for Electronic Control Units and Sensors (2006, Daimler, Robert Bosch, Infineon).
Andreas Preussger
Core Team Leader
RV Group
Editor in Chief 2nd edition
Preface (first edition from April 2007)
Can you imagine hiking on a steep mountain trail in the black of night, not knowing how close to the edge of the cliff you are? Would you feel safe?

Electronic components, such as semiconductors, have technical limits that might be very close to the edge of the customer's specification. When this occurs, the semiconductor can malfunction and possibly cause an operational failure of a critical vehicle system.

As in the hiking analogy, wouldn't it be better to know how close the semiconductor actually performs with regard to the specification limits, or better yet, to know that there is a safety zone, or guard band, between the semiconductor's performance and the specification limits?

The basic philosophy behind the Robustness Validation methodology described in this Handbook is to gain knowledge about the size of the guard band by testing the semiconductor to failure, or end-of-life. The goal of Robustness Validation is to achieve lower ppm failure rates by ensuring an adequate guard band between the 'real-life' operating range of the semiconductor and the points at which the semiconductor fails.

The current 'test-to-pass' statistical method used to select and qualify semiconductor devices does not provide information regarding the amount of guard band. This is very similar to hiking in the dark without knowing where the edge of the cliff is.

The safer way is to use the Robustness Validation approach. Please read on.
Foreword (first edition)
The quality of the vehicles we buy and the competitiveness of the automotive industry depend on being able to make quality and reliability predictions. Qualification measures must provide useful and accurate data to provide added value. Increasingly, manufacturers of semiconductor components must be able to show that they are producing meaningful results for the reliability of their products under defined Mission Profiles from the whole supply chain.

Reliability is the probability that a semiconductor component will perform in accordance with expectations for a predetermined period of time in a given environment. To be efficient, reliability testing has to compress this time scale by accelerated stresses to generate knowledge on the time to fail. Meeting any reliability objective requires comprehensive knowledge of the interaction of failure modes, failure mechanisms, the Mission Profile and the design of the product. Ten years ago you could read: "Qualification tests of prototypes must ensure that quality and reliability targets have been reached".

This approach is no longer sufficient to guarantee robust electronic products for a failure-free life of the car, which is the intention of the Zero-Defect-Approach. The emphasis has now shifted from merely the detection of failures to their prevention.

We started this way by introducing screening methods after the product had been produced and had successfully survived a standard qualification. Then the focus shifted to reliability methodologies applied on technology level during development.

Now product qualification again changes from the detection of defects based on predefined sample sizes towards the generation of knowledge by generating failure mechanism specific data, combined with the knowledge from the technology field. Now we can generate real knowledge on the robustness of products.

Qualification focuses on intrinsic topics of products and technologies, requiring only small sample sizes. Defectivity issues now put a big load on monitoring measures, which are now needed to demonstrate manufacturability and the control of extrinsic defects.

This handbook should give guidance to engineers on how to apply Robustness Validation during development and qualification of semiconductor components. It was made possible because many companies, semiconductor manufacturers, component manufacturers (Tier1) and car manufacturers (OEMs) worked together in a joint working group to bring in the knowledge of the complete supply chain.

I would like to thank all teams, organizations and colleagues for actively supporting the Robustness Validation approach.
Andreas Preussger
Core Team Leader
Robustness Validation Group
Editor in Chief 1st revision
Acknowledgement
We would like to thank the team members of various committees and their associates for their
important contributions to the completion of the 1st edition of this handbook. Without their
commitment, enthusiasm, and dedication, the timely compilation of the handbook would not
have been possible.
Representative of ZVEI
Winter, Rolf – ZVEI
Representative of SAE
Michaels, Caroline – SAE International
Representative of JSAE
Wakiya, Tadashi – Tokai Rika Co., Ltd
Nakaguro, Kunio – Nissan
Narumi, Kenji – Tram
Petersen, Frank – Elmos Semiconductor
Schilde, Bernd – Brose Fahrzeugteile
Schmidt, Ernst – BMW
Senske, Wilhelm – Daimler Chrysler
Takasu, Yuji – Tokai Rika
Unger, Walter – Daimler Chrysler
Vanzeveren, Vincent – Melexis
Wilson, Peter – On Semiconductor
Wulfert, Friedrich-Wilhelm – Freescale Semiconductor
We would like to thank the other members of the RV Forum for their
contribution to the 3rd revision
Representative of ZVEI:
Winter, Rolf – ZVEI
Content
1. Introduction
2. Scope
6. Technology Development
7. Product Development
9.5 Characterization Plan
9.5.1 Process Characterization
9.5.2 Device (Semiconductor Component) Characterization
9.5.3 Production Part Lot Variation Characterization
9.6 Sample Size and Basic Statistics
12. Improvement
12.1 Stress Set-up Review
12.2 Mission Profile Review
12.3 Application Review
12.4 Screening Strategy
12.5 Design for Reliability (DfR)
12.6 Technology/Design Solution
13. Monitoring
13.1 Planning
15. Examples
15.1 Examples of the Lack of or Poor Qualification
15.1.1 Delamination between Mould Compound and Die/Lead Frame
15.1.2 Qualification of a New Leadframe Finish
15.1.3 Via-Problems in Semiconductor Component Metallization
15.2 Integrated Capacitor Design
15.3 Requirement Temperature Cycles
15.4 Power Electronics Design
15.4.1 Typical Construction of a Power MOS Device
15.4.2 Physics of Failure
15.4.3 Impact of Die Attach Degradation on Thermal Management of a Power MOS
15.4.4 Degradation Model
15.4.5 Design for Lifetime Tools
15.4.6 Impact on Design of the Application and Impact on Component Selection Step by Step Approach
1. Introduction
In 2006 members of SAE International Automotive Electronic Systems Reliability Standards Committee, ZVEI (German Electrical and Electronic Manufacturers' Association), AEC (Automotive Electronics Council) and JSAE (Japanese Society of Automotive Engineers) formed a joint task force and published the first version of the Robustness Validation Handbook (RVHB) together with an update of the corresponding SAE document (SAE Recommended Practice J1879, General Qualification and Production Acceptance Criteria for Integrated Circuits in Automotive Applications), which was a content copy of the Robustness Validation Handbook.

This RVHB provides the automotive electronics community with a common qualification methodology to demonstrate acceptable reliability. The Robustness Validation approach requires testing the component to failure, or end-of-life (EOL), avoiding invalid failure mechanisms, and evaluation of the Robustness Margin between the outer limits of the customer specification and the actual performance of the component.

Since then the principles defined in this handbook have been applied in modules, systems and other application areas. For details see Section 19.
2. Scope
This document will primarily address intrinsic reliability of electronic components for use in automotive electronics. Where practical, methods of extrinsic reliability detection and prevention will also be addressed. The current handbook primarily focuses on integrated circuit subjects, but can easily be adapted for use in discrete or passive device qualification with the generation of a list of failure mechanisms relevant to those components. Semiconductor device qualification is the main scope of the current handbook.

Other procedures addressing extrinsic defects are particularly mentioned in the monitoring chapter. When striving for the target of Zero Defects in component manufacturing and product use, it is strongly recommended to apply this handbook. If the handbook is adopted as a standard, the term 'shall' will represent a binding requirement.

This document does not relieve the supplier of the responsibility to assure that a product meets the complete set of its requirements.
3. Definition of Robustness Validation
Robustness Validation (RV) is a process to demonstrate the robustness of a semiconductor component under a defined Mission Profile. RV represents an approach to qualification and validation that is based on knowledge of failure mechanisms and relates to specific Mission Profiles. The knowledge gained by applying this approach leads to improvement that extends beyond the component and its manufacturing process under consideration. RV contains great potential for re-use, which contributes in its entirety to a significant increase in quality and reliability, time to market and reduction of costs. Last but not least, this will result in improvement of the competitiveness of all involved participants from the value adding chain.

A Mission Profile defines the conditions of use for the component in the intended application (see Section 5). The Mission Profile establishes the basis for the RV approach, providing necessary additional information that is not described in the datasheet. Experience shows that a simple passing on of specifications down the supply chain is inadequate for and incapable of capturing the necessary information. Rather, an interactive process including the entire value chain is needed to achieve a common understanding of and a mutual agreement on the requirements, which is a key factor for success of a project. This interactive process has to be started in the early concept and definition phase of the project. Cross-functional and inter-company communication across the entire value chain shall, therefore, be established as good practice.
Table 3.1 Illustrates the Meaning of RV by Contrasting Positive (IS) and Negative (IS NOT) Statements
A process to gain knowledge of the failure A process to gain knowledge of where the
mechanisms of a semiconductor component functional
4. Robustness Validation Basics
This methodology is based on three key components:
• Knowledge of the conditions of use (Mission Profile, see Section 5)
• Knowledge of the failure mechanisms and failure modes and the possible interactions between different failure mechanisms
• Knowledge of acceleration models for the failure mechanisms needed to define and assess accelerated tests.

The RV Flow (Figure 4.1) is part of the development process. It starts with the transfer of the Mission Profile from the module level to the level of the semiconductor component. For details of this transfer, see Section 5. The process ends with release for mass production and definition of the related monitoring plan.
[Figure 4.1: RV Flow (flowchart). Recoverable labels: Development; Start with Mission Profile; Module Application; Technology; Product Spec (7); Technology Spec (6); Design Rules (6); Tech FMEA/Risk Assessment; decision "Spec Covers Prod Appl." (y/n); Knowledge Matrix; decision "Robustness Sufficient" (n: Improvement (12)).]
4.3 Robustness Diagrams

Results of RV can be represented by the use of Robustness Diagrams.

The Commodity Component Robustness Diagram, shown in Figure 4.2, represents the first use of a robustness diagram, and is initiated at the conclusion of the finalization of the Mission Profile. At this point, the Semiconductor Component Supplier investigates whether the Mission Profile requirement can be achieved by using the relevant commodity device.

Figure 4.2 provides such a pictorial representation for two parameters, A and B, which have a certain relationship, such as voltage and temperature. Many parameters may be simple enough to plot one-dimensionally. The red box represents the area of the application's specification, which the commodity component must meet or exceed. The light blue area represents the commodity component's actual performance. The Robustness Margin is the distance between any point of application specification and the point of failure of the commodity component, taking into account all variations of the product and the application's environment. The failure could result in different failure modes X, Y, Z, depending on the values of the parameters A and B. A robust component is a component that is able to maintain all the required characteristics under the conditions of use over the lifecycle without degradation to out-of-spec values.

The Commodity Component Robustness Diagram should be reviewed with the customer to demonstrate the actual robustness of the component when developing the application FMEA.

The Application-Specific Component Robustness Diagram, shown in Figure 4.3, represents the second use of a robustness diagram and is initiated at the conclusion of the RV Stress Test. At this point, the Component Supplier demonstrates to his customer the robustness of the semiconductor component to exceed the application specification requirement.
[Robustness diagram figure. Recoverable labels: axes Parameter A and Parameter B; Customer Application Spec; Semiconductor Component Specification; Component Capability; Robustness Margin; Failure Mode Y; Failure Mode Z.]
Figure 4.3 Application-Specific Component Robustness Diagram.
The IC specification for parameters A and B can be represented by a box (in red/Figure 4.3) that displays the minimum and maximum allowed values. Naturally, the range of parameter values for a certain application must lie within this box. However, the specification limit does not imply that the product will fail at this point. RV identifies the point of failure for the values of (A, B). The line connecting all points of failure gives the component capability as shown by the light blue area. When any point (Ai, Bj) lies outside the component capability, a failure criterion related to A, B or both parameters is violated and the semiconductor component fails. The type of failure mechanism that causes the failure depends on the parameter values and can vary along this component capability curve. Examples for parameters A and B are given in Table 4.2.
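The geometric idea of a margin between a specification box and measured points of failure can be illustrated with a short sketch. It is not the handbook's evaluation procedure: it assumes that both parameters have been normalized to comparable units, that the specification is an axis-aligned box, and that the margin is taken as the smallest Euclidean distance from any failure point to that box. The function names and all numbers are hypothetical.

```python
import math


def distance_to_spec_box(point, box):
    """Euclidean distance from a failure point (a, b) to an axis-aligned spec box.

    box is (a_min, a_max, b_min, b_max). The result is 0.0 when the failure
    point lies inside the box, i.e. when there is no margin at all.
    """
    a, b = point
    a_min, a_max, b_min, b_max = box
    da = max(a_min - a, 0.0, a - a_max)  # overshoot along parameter A
    db = max(b_min - b, 0.0, b - b_max)  # overshoot along parameter B
    return math.hypot(da, db)


def robustness_margin(failure_points, box):
    """Smallest distance between the spec box and any measured failure point."""
    return min(distance_to_spec_box(p, box) for p in failure_points)


# Hypothetical, normalized values for illustration only.
spec_box = (0.0, 10.0, 0.0, 5.0)
failure_points = [(12.0, 3.0), (13.0, 9.0), (4.0, 8.5)]
margin = robustness_margin(failure_points, spec_box)
print(f"Smallest margin to the spec box: {margin}")
```

In practice the parameters have different units (for example volts and degrees Celsius), so the choice of normalization determines what a distance means and would need to be agreed between supplier and customer.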
Parameter A Parameter B
4.4 Difference between RV Approach and Stress Test Driven Qualification Standards

The stresses address multiple failure mechanisms, and the test itself is considered passed when NO stress-relevant failure occurs. Particular business fields usually require specific stress recipes, prescribed by standards specific to each of them, which in most cases provoke single failures of an extrinsic defect nature. In the end, these failures are mostly neither systematic nor relevant for the real application, and only very few intrinsic defects with relevance to the actual service life of the component are triggered. Investigating the failures triggered by these generic tests usually requires substantial failure analysis effort and mostly yields root cause information of little or no importance for the component's actual service life. Both the effectiveness and the efficiency of stress test driven qualification may therefore be questionable.

On the other hand, the RV approach requires wear-out studies on specifically chosen tests that promote specific intrinsic failure modes, and it provides significant amounts of failure mechanism specific information. Detailed studies of the failure mechanisms triggered in this way, and of their activation energies, accumulate valuable knowledge on relevant failures. This in turn forms the basis for the Robustness Assessment and supports the calculation of the actual Robustness Margin relevant to the component's application-specific Mission Profile.

Thus, all the accumulated knowledge generated through the testing requested by RV represents added value, and the owning organization is invited to re-use it as often as required.

Development activity is now required to generate a failure mechanism risk assessment and a stress methodology that is able to characterize the failure mechanisms.

In stress-based standards, all tests have fixed stress conditions over a predefined period of time [5]. Only a few of the stress tests really focus on single failure mechanisms. The sample sizes are selected as a compromise between failure mechanism detection and the economies of testing and material sets. Stress time is typically chosen to address the anticipated design life of the part based on acceleration models for temperature, voltage, and humidity using mean acceleration factors. As an example, temperature acceleration is typically addressed by an 'average' activation energy of Ea = 0.7 eV, while the spectrum of failure mechanisms ranges from -0.2 eV to 3.3 eV. Depending on the dominating failure mechanism, the use of average values for Ea could result in misleading interpretations of stress test results. The information gleaned from these tests, while comforting when detecting Zero Defects, may be misleading to the customer. This is because, if no failures are generated:
• The actual robustness of the product is NOT known.
• Acceleration factors are NOT measured.
• There is no proof that the intended failure mechanisms have been triggered.
• The dominant failure mechanism may not be sufficiently accelerated to demonstrate the lifetime requirements.

In the past, this approach helped the customer to compare products from different suppliers and to generate a large database of stress test results performed under identical conditions. As the robustness was not known, the quality, reliability and Robustness Margins could not be improved effectively, or may even have been unintentionally reduced. Some examples for which traditional stress-test methodologies have been unable to detect subsequent field issues are described in Section 15.1.
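The sensitivity of temperature acceleration to the assumed activation energy, discussed in this section, can be illustrated with a small sketch. It assumes the commonly used Arrhenius model, AF = exp((Ea/k) · (1/T_use − 1/T_stress)), which this section does not name explicitly. The use temperature of 55 °C, the stress temperature of 150 °C and the Ea values other than 0.7 eV and -0.2 eV are hypothetical choices for illustration.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_af(ea_ev, t_use_c, t_stress_c):
    """Arrhenius acceleration factor between a use and a stress temperature.

    ea_ev is the activation energy in eV; temperatures are given in deg C
    and converted to kelvin internally.
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use_k - 1.0 / t_stress_k))


# Hypothetical temperatures; only 0.7 eV and -0.2 eV are taken from the text.
T_USE_C = 55.0
T_STRESS_C = 150.0

for ea in (-0.2, 0.3, 0.7, 1.0):
    af = arrhenius_af(ea, T_USE_C, T_STRESS_C)
    print(f"Ea = {ea:+.1f} eV -> acceleration factor = {af:.3g}")
```

Under this model a mechanism with a negative activation energy is decelerated at elevated temperature (AF below 1), so a high-temperature test sized for Ea = 0.7 eV would say little about it.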
4.5 Failure Mechanism
Extrinsic failures, on the other hand, are random in nature and a large sample size is needed to characterize the critical part of the distribution.

Defect density related failures are typical examples for the last group. Therefore, the sample size must be chosen depending on the type of failure to be addressed by a specific test and the failure rate target to be demonstrated.

On the other hand, it is the main task of the IC design to ensure the expected semiconductor robustness by addressing all known intrinsic failure mechanisms and, wherever possible, the particular manufacturing process disturbances as well, through the accurate application of appropriately developed and engineered design rules and simulation tools integrated in the design flow.
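How strongly the required sample size depends on the failure rate target can be sketched with the standard zero-failure (success-run) binomial relation: to show with confidence C that the failing fraction is at most p when no failures are observed, n must satisfy (1 − p)^n ≤ 1 − C. This relation is a general statistical result and not a rule quoted from this handbook; the function name and the example targets are hypothetical.

```python
import math


def zero_failure_sample_size(max_fail_fraction, confidence):
    """Smallest n such that zero failures in n parts demonstrates, at the
    given confidence, a failing fraction no larger than max_fail_fraction.

    Derived from (1 - p)**n <= 1 - C, i.e. n >= ln(1 - C) / ln(1 - p).
    """
    if not (0.0 < max_fail_fraction < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("both arguments must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - max_fail_fraction))


# Hypothetical targets for illustration.
n_coarse = zero_failure_sample_size(0.10, 0.90)   # 10 % failing fraction
n_ppm = zero_failure_sample_size(100e-6, 0.90)    # 100 ppm failing fraction
print(n_coarse, n_ppm)
```

The jump from tens of parts to tens of thousands illustrates why random, defect-related failures call for large samples or monitoring data, while wear-out mechanisms that affect every part can be characterized on small samples.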
5. Mission Profile / Vehicle Requirements
As mentioned in the previous section, knowledge of the actual conditions of use of the semiconductor device under investigation in the overall system represents one of the key components of RV. The RV process for any relevant component shall always start with the generation of the Mission Profile based on its actual conditions of use in the environment of the current and next higher level of the component hierarchy. The supplier of the semiconductor component will develop a set of profile assumptions based on market research and/or interactions with customers to capture the majority of user application scenarios. The generation of the Mission Profile for the component in question is a detailed, back-and-forth communication process across the entire value adding chain on each detail of the actual Conditions of Use in the chosen application. The primary and overarching objective is to ensure the requested/expected quality and reliability over the entire service life of the final product of the OEM. Therefore, the best known practice shall be established: to agree mutually and in good faith on the best technical, reliable and cost-saving trade-off for the actual realization, in order to ensure competitiveness and the necessary margin for each of the involved partners.

The ideal flow for the generation of these conditions of use is illustrated in Figure 5.1. Starting from the Mission Profile for the vehicle (such as a car or truck), the corresponding high-level requirements are defined. These requirements are then transferred from the different system levels, module level, and electronic control unit to the level of the semiconductor component (see Figure 5.1).

As mentioned before, this shall not be a one-directional process along the chain, but rather an interactive, iterative, agile communication up and down the entire supply chain as specification development proceeds. In this way the requirements become clearer step by step and shall be finally and mutually concluded by all involved parties at the point of freezing the specification. This also applies to the Mission Profile.

Examples of the contents of a Mission Profile on ECU level can be found in the paper 'Automotive Application Questionnaire for Electronic Control Units and Sensors', published by ZVEI [9].
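[Figure 5.1: flow for the generation of the conditions of use. Recoverable labels: levels Vehicle, System, Sub System, ECU, Semiconductor Component; arrows Requirements, Specification, Validation, Results; milestones Freeze of Specification, Freeze of Design.]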
The Mission Profile represents the collection of all relevant environmental load/stress and conditions of use to which a component will be exposed during its full life cycle.
Life cycle is defined as the time period between the completion of the manufacturing process of the semiconductor component and the end of life of the vehicle.
The Mission Profile includes:
• Transport
• Storage
• Processing
• Operations in the intended application
Each of the profile items listed above can occur more than once. It is not state-of-the-art methodology to replace field application conditions by specific stress conditions. A stress test plan cannot replace the Mission Profile.
A specific example of lifetime prediction that could be made based on Mission Profile is shown in reference 13.

5.1 Commodity Products vs. ASICs

In the case of commodity products, these Mission Profiles are usually defined without a specific user (as there is in the case of an ASIC), based on the intended customer base and applications. This case is similar to the case of an ASIC; the difference being that the input does not come directly from the customer but instead from internal sources (such as marketing and product definition). The definition of Mission Profiles for commodity products requires information and experience by the semiconductor supplier for certain applications. Contents of the Mission Profile shall be documented for communication to users.

5.2 Conditions of Use

The conditions of use are affected by various parameters, such as service life or mounting location. The following section provides an overview of the conditions of use and the corresponding requirements.
In the same way, a new evaluation is required if the conditions of use change for a current component; for instance, if this component shall be used in a new application.
In the following text, aspects of the Mission Profile are discussed in more detail.

5.3 Vehicle Service Life

The most general data concerns the vehicle service life. This comprises information on
• Service Lifetime
The total lifetime of the car.
• Mileage
The total number of miles/kilometers that the car is assumed to be driven during its service life.
• Engine On Time
The amount of time that the engine and component are switched on (key-on time) and operational during the service lifetime.
• Engine Off Time
The amount of time that the engine is switched off while several applications are running (such as the radio on).
• Non-operating time
The amount of time remaining by subtracting engine-on and engine-off time from the total service lifetime.

An example of this kind of data is given in Table 5.1 below.
Note:
There are applications that operate continuously during ‘non-operating’ time (such as theft protection, alarm system).
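The definition of non-operating time as the remainder of the service lifetime can be written out as a small calculation. The sketch below uses hypothetical figures (15 years of service life, 12,000 h engine-on time, 3,000 h engine-off time with applications running); they are placeholders and are not taken from Table 5.1.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 h, leap days ignored


def non_operating_hours(service_life_years, engine_on_hours, engine_off_hours):
    """Non-operating time = total service lifetime - engine-on - engine-off time."""
    total_hours = service_life_years * HOURS_PER_YEAR
    remaining = total_hours - engine_on_hours - engine_off_hours
    if remaining < 0:
        raise ValueError("engine-on plus engine-off time exceeds the service lifetime")
    return remaining


# Hypothetical vehicle service life data for illustration only.
hours = non_operating_hours(service_life_years=15,
                            engine_on_hours=12_000,
                            engine_off_hours=3_000)
print(f"Non-operating time: {hours} h")
```

Even with generous operating hours, the non-operating share dominates the service lifetime, which is why components that stay powered during this time need it reflected in their Mission Profile.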
5.4 Environmental Conditions and Stress/Load Factors

The environmental conditions can be classified into four main categories as listed below:

5.5 Thermal Conditions

• Seasonal/daily variation of outside temperature and extremes
• Ambient temperature inside ECU
• Junction temperature

5.6 Electrical Conditions

• Voltage
• Current
• Energy (transients)
• Electric field
• Magnetic field

5.7 Mechanical Conditions

• Vibration
• Shock
• External load, such as pressure or tensile forces

5.8 Other Conditions

5.9 Thermal Conditions

The various levels of component integration require a clear understanding and definition of the meaning of the temperature under consideration. Figure 5.2 indicates the locations of different possible points for temperature measurement for different levels of integration.

The temperature measurement locations at the points defined in Figure 5.2 can be used to describe the thermal conditions in the ECU and the semiconductor components. The temperatures are defined as follows:
TVehicle Mounting Location Ambient: Temperature at 1 cm distance from the ECU package.
TECU Package: Temperature at the ECU package.
TECU Ambient: Temperature of the free air inside the ECU.
TECU PCB: Temperature on the PC board.
TComp. Case: Temperature at the component case surface.
TComp.Pins: Temperature at the component pins.
TJunction: Junction temperature of the semiconductor component (or substrate).

Thermal conditions include information about these temperatures.

[Figure 5.2: locations of temperature measurement points. Recoverable labels: 1 cm; TComp.Package; TEEM Internal; TComp.Pins; TJunction.]
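Two of the temperatures defined above, the component case temperature and the junction temperature, are often related through a steady-state thermal resistance model: the junction sits above the case by the dissipated power times the junction-to-case thermal resistance. The sketch below assumes this simple model, which is not given in this section; the symbol R_th(JC), the function name and the numbers are hypothetical.

```python
def junction_temperature_c(t_case_c, power_w, r_th_jc_k_per_w):
    """Steady-state junction temperature estimate in deg C.

    t_case_c         : component case temperature in deg C
    power_w          : power dissipated in the device in W (self-heating)
    r_th_jc_k_per_w  : junction-to-case thermal resistance in K/W
    """
    return t_case_c + power_w * r_th_jc_k_per_w


# Hypothetical operating point for illustration only.
t_j = junction_temperature_c(t_case_c=105.0, power_w=2.0, r_th_jc_k_per_w=5.0)
print(f"Estimated junction temperature: {t_j} deg C")
```

The estimate only holds in steady state; short power pulses require a transient thermal impedance, and the case temperature itself depends on mounting and cooling.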
Actual component temperature depends not only on the outside temperature, but is heavily dependent on the way of mounting (such as proximity to power devices) and the way of cooling (for example, air flow, heat sinks, etc.). Electrical operation of the device itself leads to an additional active heating of the device, which must be taken into account.

[...] result in the same failures as vibration but are different from the ones stimulated by mechanical stress due to temperature cycling [14]. For specific components, such as sensors, mechanical loads – such as pressure – are inherent in their intended use.

5.12 Other Conditions
6. Technology Development
Technology Development is the activity that creates a process flow and design rules; in most cases, this is in combination with a cell library. Details are described in Section 3 (Process) of the RV Manual. The input for this process is created from the Mission Profile of the products or generic applications, which are planned to be produced with that technology. It is documented in the Technology Specification. A basic part of the qualification of a technology is the characterization of its variability.

To improve the time to market, some new technology development uses a new product as test vehicle. In this case, both qualifications are performed in parallel. A multidisciplinary team approach shall be used to link the two parallel development flows and to check their progress. Risk management at the design and technology levels shall lead the qualification process.

The design rules are defined based on process line capability, elementary device simulations, reliability evaluations, and historical experience. The design rules must be validated by characterization and reliability testing of library elements or specifically designed test structures. Worst case and marginal structures should be considered as well as process variations. The results of these validations are part of the RV result for each product manufactured on the evaluated technology. The same generic validation procedure should be used for technology levels as for products. Suggestions for design strategies related to identified potential failure mechanisms should be extracted from the Knowledge Matrix (see Section 16).

During the pre-production phase, product reliability and characterization shall specially focus on the risks identified by risk assessments (FMEA) during product and technology developments. Data collection and analysis validate the processability of the technology.

Prior to technology development projects, the reliability knowledge must be developed in reliability methodology projects. These projects should focus on:
• New materials (such as metal gates)
• New application areas
• New process recipes
• New transistor designs (such as FinFet)
• New device elements (such as solenoids)

Deliverables of methodology projects could be:
• Physical degradation models
• Phenomenological models in cases where the degradation physics is not known
• Model parameters for new materials or technologies
• Spectrum of failure mechanisms for new materials and technologies

After qualification has been achieved, the development phase ends with the readiness for high volume production. Major deliverables at this point in time are:
• Fully documented POR
• Evaluated monitoring plan (see Section 13)
• Evaluated control plan
• SPC operational, including evaluated control limits
• Process and Product FMEA or DRBFM
• Evaluated and qualified design library
7. Product Development
With the exception of pilot products for development of new technologies, products are usually developed using already qualified technologies and libraries. Re-use of qualified elements shall be extensively encouraged. Previous production data concerning the technology to be used, including production reject analysis, shall be inserted in the Knowledge Matrix. Risk assessment should be focused on differences between new product and products already in production.

The development flow starts with a planning phase in which detailed plans are generated and validated, including the necessary resources. Experiences from previous product developments should be taken into account.

Validated design rules, libraries, and simulation models should be singled out. Suggestions for design solutions related to identified potential failure mechanisms should be extracted from the Knowledge Matrix.

Design reviews ensure that the design meets the requirements in an effort to catch errors before they become defects in the design. For risk analysis DRBFM could be a very helpful approach. Simulations, preliminary test vehicle characterization, and preliminary reliability results such as pre-qualification data allow validation of the design concept. Risk and robustness assessment shall be regularly reviewed taking these results into account. More rigorously accelerated stress testing can be used to find the ‘weakest links’ in early development phase.

If the measured robustness is below expectation, there are several possible reactions (see Section 12).

The results of the characterization are used to finalize the data sheet and set up the testing required, assuring that all devices produced comply with the functional requirements established for the application. It should be noted that the characterization activities, as a whole or in part, might go through various iterations before they reach the final stage. The number of iterations depends on the device maturity and the findings from bench testing and especially application testing by the user.

From lessons learned and best practices, it is believed that joint user-supplier emphasis on several key development areas will help achieve best application performance. It is therefore expected that extended development tasks will be a normal part of a supplier’s process and be defined and executed according to their internal processes. Those key areas are defined below.
8. Potential Risks and Failure Mechanisms
The Mission Profile of an electronic component and the manufacturing technology used constitutes the basis for identification of potential risks to fail in the application together with the potential failure mechanisms. The decision base and the result of this risk assessment should be documented for further reporting. The Knowledge Matrix provides a database to support this risk assessment process.

8.1 The Knowledge Matrix

The Knowledge Matrix is a publicly accessible database containing data on the current state of knowledge of failure mechanisms. Extended versions could exist based on company specific data; some of this data may be confidential.

Weblink to the Knowledge Matrix:
The Knowledge Matrix can be found on the website at
http://www.sae.org/standardsdev/robustnessvalidation/km.htm
or
http://www.zvei.org/RobustnessValidation
under ‘Device Level’

In this database, every failure mechanism is described with the following information:
• Name of the failure mechanism.
• Typical cause of the failure mechanism.
• Typical effect of the failure mechanism (considered at the product level of the electronic component).
• Material(s) affected by the failure mechanism.
• The method to detect the failure.
• The parameter to characterize the failure mechanism.
• Characteristics of the product and application known to calculate reliability figures.
• Design of a structure to characterize the failure mechanism.
• Methods to prevent the failure mechanism by design or preventive methods during fabrication.
• Optimum stress method to stimulate the failure mechanism.
• Acceleration model for the failure mechanism.
• Reference describing the physical degradation model of the failure mechanism.
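The information recorded per failure mechanism maps naturally onto a simple record type. The following is a minimal sketch in Python; the class name `KnowledgeMatrixEntry` and all field names are illustrative assumptions and are not part of the Knowledge Matrix itself. The sample values are taken from the gate oxide hard breakdown entry discussed in Section 8.2.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KnowledgeMatrixEntry:
    """One failure mechanism record (field names are illustrative only)."""
    name: str                        # name of the failure mechanism
    cause: List[str]                 # typical causes
    effect: str                      # typical effect at product level
    material: str                    # material affected
    detection_method: str            # method to detect the failure
    characterization_parameter: str  # parameter to characterize the mechanism
    affecting_conditions: List[str]  # e.g. A (area), V (voltage), T (temperature)
    test_structure: str = ""         # structure used for characterization
    prevention: str = ""             # preventive methods in design/fabrication
    stress_method: str = ""          # optimum stress method
    acceleration_model: str = ""     # acceleration model
    reference: str = ""              # reference for the degradation model


entry = KnowledgeMatrixEntry(
    name="GOX hard BD",
    cause=["contamination", "local GOX thinning", "mobile ions"],
    effect="G short",
    material="SiO2<=4nm",
    detection_method="IG leak",
    characterization_parameter="IG leak",
    affecting_conditions=["A", "V", "T"],
    test_structure="transistor array",
    acceleration_model="E-model",
)
```

Fields that are not yet known for an entry simply keep their empty default, which makes gaps in the risk assessment easy to spot.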
8.2 How to Use the Knowledge Matrix

To prepare the qualification plan, the potential risk and failure mechanisms must be identified. Selecting valid fail mechanisms from the Knowledge Matrix requires a review of the entire Knowledge Matrix based on previous qualification efforts and anything new for the part to be considered. The cause and the failure column could contribute some ideas that could help to make this list of failure mechanisms as complete as possible. To check whether requirements are affected, the effect column, which gives information about the effect at the product level, should be taken into account. The application column delivers additional information about whether certain failure mechanisms are relevant because they are accelerated by certain environmental conditions, like temperature or voltage. Before the failure mechanism is chosen for the risk list, it should be determined if it is related to only a specific material.

For the project at hand, make a list of applicable known potential failure mechanisms using the matrices for each semiconductor group:
• Technology/process (supported by PFMEA)
• Device (supported by DFMEA)
• Assembly/package (supported by DFMEA)
• Application/environment

To complete the list with additional potential failures, check the following topics:

What is new (compared to the most similar process available, for instance)?

Technology/Process
• Process step (etch, deposition, etc.)
• Material

Device
• New circuit configuration
• New voltage/current levels
• New element (such as a capacitor)

Design
• New structure
• New layout
• Feature size (e. g. from 90 to 65 nm)

Specification
• New parameters (AC, DC, timing)
• Changed parameters (limits, extremes)

Application Environment
• Determine the new environmental stress for the application.
• Determine how each stress/combination of stresses affects the device.

For each additional failure mechanism determine
• The characteristics/elements in accordance with the various categories.
• As a minimum: the reliability test that would stimulate/precipitate the failure mechanism.
• Determine if it is possible to accelerate the additional failure mechanism without introducing new failure mechanisms, which would be unexpected under normal use conditions.
1. Find the failure mechanism related to the failure cause or Affecting Operating Conditions
voltage and select the subsystem chip.
No  Subsystem  Material  Failure mechanism  Failure cause  Failure mode  Detection method  Character of degradation  Affecting operating conditions
37 chip SiO2 additional charges mobile ions Vth shift causing spec weak comp. spec Vth shift V, T,
violation violation after stress
44 chip poly Si NVM charge loss SILC bit flip or retention Vfh Cell ? Vfh V
ESD fails
74 chip High k dielectrics Gate dielectric hard BD surface roughness leak increase & G short IG leak IG leak A, V, T
contamination
ESD
lattice defects
charge trapping
local GOX thinning
variation of oxide
thickness
mobile ions
dielectric defectivity
51 chip SiO2<=4nm GOX hard BD surface roughness G short IG leak IG leak A, V, T
contamination
ESD
lattice defects
charge trapping
local GOX thinning
variation of oxide
thickness
mobile ions
dielectric defectivity
52 chip SiO2<=4nm GOX hard BD surface roughness G short IG leak IG leak A, V, T
contamination
high E-field
lattice defects
pinholes
charge trapping
local GOX thinning
mobile ions
dielectric defectivity
ESD
66 chip SiO2<=4nm GOX soft BD surface roughness leak increase IG leak IG leak A, V, T
contamination
high voltage
lattice defects
charge trapping
local GOX thinning
variation of oxide
thickness
mobile ions
dielectric defectivity
ESD
75 chip SiO2<=4nm hot carrier injection variation of oxide ID, gm, Vth changes ID subthreshold PMOS IDS V (VDS,
(HCI) field induced thickness (increase or decrease slope vs. VDS VGS);
injection and trapping variation in work depending on channel vs. VGS T; f
of electrons in gate function length) characteri-
oxide near drain region and dopant profile zation
of device line edge roughness
76 chip PMOS gate hot carrier injection variation of oxide ID, gm reduction ID subthreshold NMOS IDS V (VDS,
dielectric (HCI) field induced thickness Vth increase slope vs. VDS VGS);
injection and trapping variation in dopant vs. VGS T; f
of electrons in gate profile characteri-
oxide near drain region line edge roughness zation
of device
61 chip NMOS gate IMD/ILD hard BD contamination, G short IG leak IG leak A, V, T
dielectric CU-diffusion
high E-field
charge trapping
local oxide thinning
mobile ions
ESD
line edge roughness
89 chip IMD, ILD Metal residues causing metal scratch, litho increased leakage defect inspection leakage V
latent defects defect current current
77 chip Cu, AlCu(Si) NBTI, charge trapping process induced or increase in absolute Vfh PMOS IDS V (VDS,
preexisting traps value of Vth vs. VDS VGS);
variation of oxide degradation of vs. VGS T; f;
thickness mobility characteri- duty cycle
variation in dopant zation
profile
surface roughness
93 chip PMOS gate PBTI, charge trapping process induced or Increase in absolute Vfh NMOS IDS V (VGS,
dielectric pre-existing traps value of Vth, decrease vs. VDS, VGD);
esp. nitrided variation of oxide in ld vs. VGS T; f,
oxides thickness characteri- duty cycle
NMOS gate variation in dopant zation
dielectric; profile
esp. nitrided surface roughness
oxides
2. One failure mechanism to be taken into account is gate oxide hard breakdown.

4. The potential effect on IC level is a gate-substrate short.

   Failure mode
   G short

5. The characteristic for detection and characterization is the same: the gate leakage current.

   Detection Method    Character of Degradation
   IG leak             IG leak

6. The extrapolation from test to product/application level must be done for the voltage, the temperature and the area, which means that temperature and gate-oxide area are the other two limiting factors for gate oxide reliability.

   Affecting Operating Conditions
   A, V, T

7. The optimum design of the test structure is a transistor array. For this test structure, the failure criterion of the gate leakage current must be specified.

   Stress Method
   transistor array or capacitor

9. At this point in time, an overview of all failure modes triggered by TDDB stress can be generated. The sum of all these aspects gives a full picture of the coverage of the failure mechanisms for the qualification plan.

10. For this particular example the physical model describing oxide breakdown is the percolation model, and the acceleration model to be used should be the E-model if the gate oxide thickness is less than 4 nm. For additional details, see references in the Knowledge Matrix. Figure 8.1 illustrates how a cumulative failure distribution measured on a test structure must be transformed to the condition in the semiconductor component.

   Ref (Stress Method)    Acceleration Model     Ref (accel. model in JEP122F)
   JP001                  Percolation E-model    5.1.2.1
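The selection in steps 1 and 2 of this example amounts to filtering the matrix by an affecting operating condition. A minimal sketch, assuming Python and a heavily abridged copy of the excerpt (only the entry number, the failure mechanism and the affecting conditions of four rows, all of subsystem "chip"); the function name `mechanisms_affected_by` is hypothetical.

```python
# Abridged rows from the matrix excerpt: (No, failure mechanism, conditions).
# A = area, V = voltage, T = temperature.
ROWS = [
    (37, "additional charges", {"V", "T"}),
    (51, "GOX hard BD", {"A", "V", "T"}),
    (66, "GOX soft BD", {"A", "V", "T"}),
    (89, "metal residues causing latent defects", {"V"}),
]


def mechanisms_affected_by(rows, condition):
    """Return (No, mechanism) for every row accelerated by `condition`."""
    return [(no, mech) for no, mech, conds in rows if condition in conds]


voltage_related = mechanisms_affected_by(ROWS, "V")
area_related = mechanisms_affected_by(ROWS, "A")
```

Every mechanism returned for "V" is a candidate for the risk list; the remaining columns of the matrix are then consulted for each candidate as in steps 4 to 10.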
Figure 8.1 Extrapolation of Failure Distribution
[Figure: Weibull plot of ln(-ln(1 - F)) versus log(time), where F is the cumulative failure density. The measured intrinsic dielectric failure distribution (t63% stress) is transferred to use conditions (t63% use) by voltage, area and temperature extrapolation; a statistical extrapolation then gives F at t_life for the target lifetime (e.g. 10 y).]
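The extrapolation chain of Figure 8.1 can be sketched numerically. The following is a minimal illustration in Python, assuming a Weibull failure distribution, area scaling of the characteristic life with the Weibull slope, and the E-model in its commonly cited form AF = exp(γ·(E_stress − E_use))·exp(Ea/kB·(1/T_use − 1/T_stress)). All function names and all numeric values (γ, Ea, β, fields, areas, times) are hypothetical placeholders, not data from this handbook.

```python
import math

K_B = 8.61733e-5  # Boltzmann's constant in eV/K


def e_model_af(e_stress, e_use, gamma, ea, temp_stress_k, temp_use_k):
    """Voltage (field) and temperature acceleration between stress and use."""
    field_term = math.exp(gamma * (e_stress - e_use))
    thermal_term = math.exp(ea / K_B * (1.0 / temp_use_k - 1.0 / temp_stress_k))
    return field_term * thermal_term


def scale_t63_for_area(t63_test, area_test, area_product, beta):
    """Area extrapolation: a larger oxide area shortens the characteristic life."""
    return t63_test * (area_test / area_product) ** (1.0 / beta)


def weibull_cdf(t, t63, beta):
    """Cumulative failure fraction F(t) for characteristic life t63 and slope beta."""
    return 1.0 - math.exp(-((t / t63) ** beta))


# Hypothetical example values (fields in MV/cm, gamma in cm/MV, areas in cm^2).
beta = 1.5
t63_stress = 100.0  # hours, measured on the test structure
af = e_model_af(e_stress=10.0, e_use=5.0, gamma=3.0, ea=0.6,
                temp_stress_k=423.15, temp_use_k=358.15)
t63_use = scale_t63_for_area(t63_stress * af, area_test=1e-4,
                             area_product=1e-2, beta=beta)
f_at_life = weibull_cdf(10 * 8760.0, t63_use, beta)  # F at a 10 year target
```

The order of the steps does not matter for the result, since each extrapolation is a multiplicative shift of t63% on the log(time) axis.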
8.3.4 Limits of Application Range of Test Methods

Stress tests could be restricted to:
• Certain technologies
• Certain materials
• Certain parameter ranges

Example Helium Fine Leak Test:
• Designed to evaluate hermeticity of MEMS packages
• Perfect for metallic seals
• For polymer sealed packages of no use due to absorption properties of polymers

8.3.5 Limited Resources for Reliability Evaluation

Resources for reliability evaluation are limited because high level experts and test equipment are needed. On the other hand, the project schedule limits the available time for these activities. The time when information for the production decision has to be available is defined by the market and is not related to the complexity of the problem.
Therefore resources have to be concentrated on the most critical issues, preferably during the early phase of development. Activities which do not generate information have to be avoided. The trade-off between residual risk, costs and time-to-market has to be found for every product.

• Stress test conditions have to be developed.
• Analysis technologies have to be developed. The frequency of introducing new technologies stays constant or might increase in the future.
• The time for implementing the results of the new reliability methodology has to be used more efficiently to reach the targets.
• The resources have to be focused.

8.3.7 Limited Knowledge on Models and Failure Mechanisms

Keep in mind that the qualification statement is statistical in nature:
• Extrapolation from stress to operating conditions
• The qualification statement describes the situation at a certain point in time
• Defects and maverick phenomena on low failure level have to be covered by containment activities

RV performed correctly generates the basic information to achieve ppm levels, but qualification cannot demonstrate these levels statistically (see also Section 9.5).
A lot of progress has been made to understand the physics behind the failures, but a continuous effort is needed.
9. Creation of the Qualification Plan
Each Qualification Plan consists of three basic elements:
• Characterization plan (Section 9.5)
• Reliability test plan (Section 9.2)
• Demonstration of manufacturability (Section 9.5.1)

A basic consideration of how to select the appropriate qualification strategy is described in Section 9.1, using a flow created for AEC Q100/101.

9.1 Relation to AEC-Q100/101 Stress Test Conditions and Durations

Note:
Direct references from AEC Q100 are in Italic. Similar statements can be found in AEC Q101.

In the early phase of a development project a decision has to be made on the appropriate qualification strategy and the standard to be applied.

“Two flow charts are available to facilitate both Tier 1 and Semiconductor Component Supplier in a reliability capability assessment:
• The flow chart in figure 9.3, describes the process at Semiconductor Component Supplier to assess whether a new component can be qualified according to AEC-Q100/101.
• The flow chart in appendix E, describes (for details see Handbook for Robustness Validation of Automotive Electrical/Electronic Modules, ZVEI)
  • (1) the process at Tier 1 to assess whether a certain electronic component fulfills the requirements of the mission profile of a new Electronic Control Unit (ECU), and
  • (2) the process at Semiconductor Component Manufacturer to assess whether an existing component qualified according to AEC-Q100/101 can be used in a new application.

In summary, the flow charts result in the following three clear possible conclusions:
A] AEC-Q100/101 test conditions do apply.
B] Mission Profile specific test conditions may apply.
C] Robustness Validation may be applied with detailed alignment between Tier1 and Semiconductor Component Manufacturer.

In addition, not shown in the flow charts, the expected end of life failure probability may be an important criterion. Regarding failure probabilities, the following points should be considered:
• No fails in 231 devices (77 devices from 3 lots) are applied as pass criteria for the major environmental stress tests in AEC Q100/101. This represents an LTPD (Lot Tolerance Percent Defective) = 1, meaning a maximum of 1 % failures at 90 % confidence level.
• This sample size is sufficient to identify intrinsic design, construction and/or material issues affecting performance.
• This sample size is NOT sufficient or intended for process control or PPM evaluation. Manufacturing variation failures are kept under control by proper process controls and/or screens such as described in AEC-Q001, -Q002.
• Three lots are used as a minimal assurance of some process variation between lots. A monitoring process has to be installed to keep process variations under control.
• Sample sizes are limited by part and test facility costs, qualification test duration and limitations in batch size per test.

A detailed description of flow chart steps is given below (numbers refer to these specific flow chart steps).”

9.1.1 Basic Assessment

1.1 Items to consider in constructing a Mission Profile Assessment:
• Type of application
• Requirements of service life and usage
• Environmental conditions / Mounting location
• Construction of the ECU
• Power Dissipation of ECU and components
• Reliability requirements in terms of lifetime and related failure probabilities
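The pass criterion of zero fails in 231 devices quoted in Section 9.1 can be checked with a short calculation. This is a minimal sketch in Python, assuming a simple binomial model in which zero observed failures in n devices bound the defective fraction p by (1 − p)^n = 1 − confidence; the function names are hypothetical.

```python
import math


def max_defective_fraction(sample_size, confidence):
    """Upper bound on the defective fraction after zero failures in the sample."""
    return 1.0 - (1.0 - confidence) ** (1.0 / sample_size)


def sample_size_for(fraction, confidence):
    """Zero-failure sample size needed to demonstrate `fraction` or better."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fraction))


p_231 = max_defective_fraction(231, 0.90)   # roughly 1 %, i.e. LTPD = 1
n_100ppm = sample_size_for(100e-6, 0.90)    # devices needed for 100 ppm
```

Under this model, demonstrating 100 ppm at 90 % confidence would take more than 20,000 devices without a single failure, which illustrates why the qualification sample size is not intended for PPM evaluation.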
A structured analysis of the mission profile will identify potential reliability risks at an early stage of the development cycle, so that these risks can be addressed by appropriate component selection and validation.

1.2 Translation of ECU mission profile to component mission profiles, taking different loading on component level into account. An overview of loads during component life cycle is given in figure 9.1. For details see section 5.

[Figure 9.1: Component’s Life Cycle (assembly, vehicle service life)]

1.3 Performing a ‘basic calculation’ facilitates the assessment via a high level check of the criticality of the mission profile in a given application. These calculations enable the translation from the component mission profile to an equivalent qualification test duration under specified conditions. The decision to be made here is strategic.
• Choose ‘no’ if it is already known that the product is marginal or critical.
• Choose ‘yes’ for an uncritical product, e. g. one with references to already qualified products that is not at the extremes of its specification.
The possibility to perform the qualification according to AEC Q100/Q101 or determine if additional testing and/or data is required, because the combination of this component in this application is critical or marginal, determines the next step. The choice is:
• Yes: It is critical or marginal, requiring further analysis as to what new tests/conditions/data are required for qualification
• No: The standard AEC Q100/Q101 requirements are appropriate for this application/component combination

A: Conclusion: perform qualification according to AEC Q100/Q101 test conditions
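Loading: Thermal
• Mission profile input: $t_u$ (use time) and $T_u$ (use temperature) from example in annex F
• Stress conditions: $T_t$ = 150 °C (junction temperature [...])
• Acceleration model: Arrhenius, $A_f = \exp\left[\frac{E_a}{k_B}\left(\frac{1}{T_u} - \frac{1}{T_t}\right)\right]$, $t_t = t_u / A_f$
• Model parameters: $E_a$ = 0.7 eV (activation energy; 0.7 eV [...] mechanism and range from -0.2 to 1.4 eV); $k_B = 8.61733 \times 10^{-5}$ eV/K (Boltzmann’s Constant)
• Calculated test duration: $t_t$ = 1695 h (test time)

Loading: Thermo-mechanical
• Mission profile input: $n_u$ = 54,750 cls (number of engine on/off cycles over 15 years of use)
• Stress conditions: $\Delta T_t$ = 205 °C (thermal cycle temperature change in test environment: [...])
• Acceleration model: Coffin Manson, $A_f = \left(\frac{\Delta T_t}{\Delta T_u}\right)^m$, $n_t = n_u / A_f$
• Model parameters: m = 4 (Coffin Manson exponent; 4 is a presumed value and to be used for cracks in hard metal alloys, actual values depend [...])
• Calculated test duration: $n_t$ = 1034 cls (number of cycles in test)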
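The basic calculation for the two loadings can be sketched as follows. This is a minimal illustration in Python of the Arrhenius and Coffin Manson relations; the function names and the three-bin temperature profile are hypothetical placeholders and do not reproduce the annex F example, so the results are not expected to match the durations quoted for the handbook's own mission profile.

```python
import math

K_B = 8.61733e-5  # Boltzmann's constant in eV/K


def arrhenius_af(ea_ev, temp_use_c, temp_test_c):
    """A_f = exp[(Ea/kB) * (1/Tu - 1/Tt)], temperatures converted to K."""
    t_use_k = temp_use_c + 273.15
    t_test_k = temp_test_c + 273.15
    return math.exp(ea_ev / K_B * (1.0 / t_use_k - 1.0 / t_test_k))


def equivalent_test_hours(profile, ea_ev, temp_test_c):
    """Sum t_u / A_f over all (temperature in °C, hours) bins of the profile."""
    return sum(hours / arrhenius_af(ea_ev, temp_c, temp_test_c)
               for temp_c, hours in profile)


def coffin_manson_test_cycles(n_use, dt_test, dt_use, m):
    """n_t = n_u / A_f with A_f = (dT_t / dT_u) ** m."""
    return n_use / (dt_test / dt_use) ** m


# Hypothetical mission profile: (junction temperature in °C, hours at it).
profile = [(40.0, 5000.0), (85.0, 2500.0), (125.0, 500.0)]
t_test = equivalent_test_hours(profile, ea_ev=0.7, temp_test_c=150.0)
```

Because the activation energy enters exponentially, repeating the calculation for the lowest and highest plausible Ea of the mechanism in question is a quick way to see how sensitive the test duration is to that assumption.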
9.1.2 Mission Profile Validation on Component Level

2.1 The recommended base for assessing the critical failure mechanism(s) is the Robustness Validation Knowledge Matrix or JEP122. Risk assessment should be performed covering at least the following main considerations:
• New materials or interfaces
• New design or production techniques
• Critical use conditions

Methods for risk assessment could be FMEA (AIAG Manual), Risk Assessment, FTA or similar.

2.2 In case acceleration models are in use in the company or known from the literature, they can be taken to perform lifetime calculations. Experiments, simulation, or literature study can be used to create such acceleration models. Sufficient acceleration may be impossible due to limiting physical boundary conditions. In such a case minimum stress times should be defined to demonstrate sufficient robustness margin (e. g., based on change or degradation of any electrical or physical properties during or after stress and the impact on the specific application).

2.3 The acceleration model is used to calculate the acceleration factor for the standard stress condition. This in turn gives the calculated minimum required stress time tCALC (in h or number of cycles) to demonstrate reliability without failures. Two examples for thermal and thermo-mechanical loading are described in table 9.2. It is assumed that the failure mechanisms listed in column 4 are critical for the intended application with the mission profile described in column 2. Column 3 gives a proposed stress test and condition. The calculated test duration in the last column refers to this stress condition. The standard test condition has to be adapted to these test times. An example for an additional test, which is not covered by standards like AEC Q100, is listed in table 9.3. It should be done in addition to standard testing.

2.4 A comparison with the standard qualification duration tSTAND is to be made. In case tSTAND > tCALC + tSM, the component is assumed to be not critical/marginal. The robustness margin tSM has to be defined based on the application and customer requirements. Assessment of criticality shall include the accumulated failure probability until end-of-life. Criteria for a decision shall include not only test conditions and durations as compared to the standard, but also coverage of critical failure mechanisms by the tests. Such coverage considerations include applicability of assumptions used in calculating the stress conditions, such as variation of the activation energy for different failure mechanisms. Beyond that it has to be assessed whether particular failure mechanisms are addressed by the standard test method at all. A case in point is active cycling of power devices, which is not adequately addressed by standard qualification tests. In addition, specific requirements regarding fail probabilities may not be covered by standard test procedures. MIM capacitors, for instance, are known to fail due to extrinsic defects. A requirement of, e. g., less than 100 ppm for extrinsic failures is not covered by standard tests and sample sizes. The assessment should be aligned with Tier1.

2.5 In case the component standard qualification is not sufficient the supplier may define additional tests on product level or change the test conditions to close the gap between Q100/101 coverage and mission profile requirement.

2.6 The possibility to create additional data (Flow Chart 1) or show that additional data is not critical/marginal determines the next step. The choice is
• Yes: It is possible to find additional tests on product level or to change the test conditions to close the gap between Q100/101 coverage and mission profile requirement.
• No: It is not possible to demonstrate the fulfillment of the reliability requirements according to mission profile by a test on product level.
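The comparison in step 2.4 and the choice in step 2.6 can be expressed as a small decision helper. This is a minimal sketch in Python; the function names are hypothetical, and the mapping of a "No" in step 2.6 to conclusion C is an assumption made here for illustration, since that branch of the flow chart is not spelled out in the text. The value 1695 h is the calculated thermal test duration quoted in the example table; the other numbers are placeholders.

```python
def standard_duration_sufficient(t_stand, t_calc, t_sm):
    """Step 2.4: not critical/marginal when tSTAND > tCALC + tSM."""
    return t_stand > t_calc + t_sm


def qualification_conclusion(t_stand, t_calc, t_sm,
                             mechanisms_covered, extra_tests_possible):
    """Return 'A', 'B' or 'C' (assumed mapping, see the note above)."""
    if mechanisms_covered and standard_duration_sufficient(t_stand, t_calc, t_sm):
        return "A"  # standard AEC-Q100/101 test conditions apply
    if extra_tests_possible:
        return "B"  # mission profile specific test conditions
    return "C"      # Robustness Validation aligned with Tier 1


# Hypothetical standard duration of 1000 h and robustness margin of 200 h.
result = qualification_conclusion(t_stand=1000.0, t_calc=1695.0, t_sm=200.0,
                                  mechanisms_covered=True,
                                  extra_tests_possible=True)
```

The `mechanisms_covered` flag stands for the coverage assessment required in step 2.4: a sufficient duration alone does not justify conclusion A if a critical failure mechanism is not addressed by the standard test at all.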
B: Conclusion: Testing must be performed according to mission profile specific test conditions. This means that standard tests are used with adjusted test times and different test conditions, e. g. higher temperature.
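Columns: Loading; Mission Profile Input; Stress Conditions; Critical failure mechanism; Acceleration Model (all temperatures in K); Model Parameters; Calculated Test Duration

Loading: Thermal
• Mission profile input: $t_u$ and $T_u$ from example in annex F
• Stress conditions: $T_t$ = 150 °C (junction temperature in test [...])
• Critical failure mechanism: Lifted glassivation
• Acceleration model: Arrhenius, $A_f = \exp\left[\frac{E_a}{k_B}\left(\frac{1}{T_u} - \frac{1}{T_t}\right)\right]$, $t_t = t_u / A_f$
• Model parameters: $E_a$ = 0.42 eV
• Calculated test duration: $t_t$ = 2860 h (test time)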
Table 9.3 Example for an Additional Test Required by Specific Application
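Columns: Loading; Mission Profile Input; Stress Conditions; Critical failure mechanism; Acceleration Model (all temperatures in K); Model Parameters; Calculated Test Duration

Loading: Thermo-mechanical
• Mission profile input: $n_u$ = 11,000 cls (number of cold starts over 15 years); $\Delta T_u$ = 80 °C (average thermal cycle temperature change in use environment)
• Stress conditions: $\Delta T_t$ = 165 °C (thermal cycle temperature [...] -40 °C to 125 °C), Test: TC
• Critical failure mechanism: 2nd level (board level) solder joint fatigue
• Acceleration model: Norris-Landzberg, $A_f = \left(\frac{\Delta T_t}{\Delta T_u}\right)^a \left(\frac{t_t}{t_u}\right)^b \exp\left[c\left(\frac{1}{T_{max,u}} - \frac{1}{T_{max,s}}\right)\right]$, $n_t = n_u / A_f$. Modification according to Pan N. et al, Proc. SMTA, 2005
• Model parameters: a = 2.65, b = 0.136, c = 2185
• Calculated test duration: $n_t$ = 865 cycles (number of cycles in test)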
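The Norris-Landzberg relation of Table 9.3 can be evaluated with a few lines of code. This is a minimal sketch in Python using the parameters a, b and c from the table; the function name is hypothetical, and reading t_t and t_u as dwell times per cycle is an assumption, since the table excerpt does not define them. The dwell times and maximum temperatures of the handbook's example are not given in the text, so the quoted 865 cycles is not reproduced here.

```python
import math


def norris_landzberg_af(dt_test, dt_use, t_test, t_use,
                        tmax_use_k, tmax_test_k,
                        a=2.65, b=0.136, c=2185.0):
    """A_f = (dT_t/dT_u)^a * (t_t/t_u)^b * exp[c * (1/Tmax,u - 1/Tmax,s)]."""
    return ((dt_test / dt_use) ** a
            * (t_test / t_use) ** b
            * math.exp(c * (1.0 / tmax_use_k - 1.0 / tmax_test_k)))


# Temperature-swing term alone (equal dwell times and equal Tmax assumed).
af_swing_only = norris_landzberg_af(165.0, 80.0, 1.0, 1.0, 398.15, 398.15)

# Hypothetical use condition with a lower maximum temperature than in test.
af_example = norris_landzberg_af(165.0, 80.0, 1.0, 1.0,
                                 tmax_use_k=358.15, tmax_test_k=398.15)
n_test = 11000 / af_example  # n_t = n_u / A_f
```

Comparing `af_swing_only` with `af_example` shows how much of the acceleration comes from the temperature swing and how much from the maximum temperature term.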
9.1.4 Application Note
[Flow chart: Determination of reliability test criteria for a new component based on mission profile requirements of intended application. Process at Component Manufacturer (CM): assess whether a new component can be qualified according to AEC-Q100 test conditions. Responsibility columns: Tier 1, CM.
Basic Assessment: 1.1 Determine Mission Profile on ECU level; 1.2 Determine mission profile of the component including loading; 1.3 Basic calculation?
Mission Profile Validation on Component Level (Robustness Validation on Component Level): reached from 1.3 via "No"; 2.1 Determine critical Failure Mechanisms.]
Figure 9.3 Flow Chart 1 – Reliability Test Criteria for New Component
9.2 Reliability Test Plan

All failure mechanisms that have been identified as potential risks must be addressed by reliability data. Information already existing from previous investigations or data from the development work could be used to confirm low risk levels (see also Section 6). The applicability of generic data must be demonstrated. Some types of input to the Qualification Plan could be extracted from the knowledge database:
• The test structure that could be used for reliability characterization. Circuits, sub circuits, library elements, or the complete semiconductor component should be considered as the appropriate test structure. Criteria such as availability or analysability should also be taken into account.
• The stress method that could be used to address and accelerate the failure mechanism.

Special attention must be given to the failure rate of the specific test structure. There is no generic rule about the manner in which this number is calculated from the failure rate target of the product because of the potential difference in the failure paretos. If, for example, one failure mechanism dominates the failure rate of the product, the assumption that more than 50 % of the product failure rate may be due to this dominant mechanism is reasonable. However, if several failure mechanisms have comparable failure rates, the product failure rate must be distributed among them.

For the assessment of reliability, the Qualification Plan shall contain the following information for every stress test:
• Targeted failure mechanism(s), including an explanation of relevance (give rationale if the typical failure mechanisms are rated as irrelevant).
• Acceleration model used.
• Vehicle (= test structure): The test structure must be representative of or related to the product design and the application conditions that the product may experience in the field. This may require detailed documentation.
• Stress method.
• Stress conditions (parameterization of the stress test): Stress conditions must be optimized with respect to the failure mechanism to be addressed.
• Sample size or number of lots: Qualification shall provide statistically valid data for the demonstration of intrinsic failure mechanisms. Failure rates in the range of ppm at the product level cannot be demonstrated in the qualification phase.
• Parameter for characterization (P) of the test structure.
• Method of failure analysis for characterization, if needed.
• Fail criteria (Pfail) or acceptance criteria.
• Readout times or intervals and criteria for the end of test.

In certain instances, reliability validation may also require verification at the ECU level. This can only be accomplished by the user of the component and requires agreement and cooperation between the manufacturer of the component and the user.

An example and template that includes these elements in a qualification plan is shown in Appendix A. For every reliability characterization, a target value is needed as a gate to separate the failure case from the expected performance according to its requirement. This target value is applied to the parameter P that is used for characterization of the degradation during stress. In some cases, Ptarget is not directly defined as a requirement. In such cases, the relationship between the requirement and P must be known. The target value could be a lower or upper limit or a range and shall include the relevant tolerances.
The characterization column of the Knowledge Matrix indicates which parameter should be measured during stress to generate the degradation over time. To calculate the lifetime under stress conditions, a fail criterion must be defined that is related to the requirements or the spec values. Examples for degradation parameters are:
• Leakage current for gate oxide related failure mechanisms.
• Resistance for electro- and stress migration.
• Transistor parameters (such as threshold voltage, drain current or transconductance).

Acceleration may be limited due to items such as competing failure mechanisms or the intrinsic robustness of the system or design. An insufficient number of failures may occur during economically acceptable test duration (for example, due to physical boundary conditions). There is no generic solution that fits all cases, but potential options are:
• Choose more sensitive failure criteria and correlate the results.
• Increase the sample size and stop the test after the first part of the failure distribution has been measured.
• When only a portion of the distribution fails, the statistical solution for lognormal distributions is described in JESD 37; for other cases, see e. g. [15].
• If there is no failure, one could make the following assumptions and estimate the lifetime of the failure mechanism:
  -- Use a known model and typical parameters (such as for lognormal distribution).
  -- Assume that the first device under test fails right at the end of stress time.

Before performing many expensive and time-consuming qualification tests, it should be determined whether data are already available that demonstrate the robustness of the semiconductor component with respect to a certain failure mechanism. These generic data could have been generated by testing an object different from the one under discussion, but the data may be valid for a group of objects. An object could be a semiconductor component, a package, or a wafer or package technology.

These groups of objects, called qualification families, could consist of wafer technologies or parts of it, assembly technologies or parts of it, packages, or semiconductor components with similar functions, specifications or application conditions. The relevance of the application of generic data must be supported by other documents or data.

A qualification family will be defined by its manufacturing attributes (material, site and processes).
Examples:
• A die family will be defined by its wafer fab attributes.
• A package family will be defined by its assembly attributes only.

Family definitions, test results and the applicability of those must be clearly communicated to the customer.

Example for one failure mechanism: electromigration
A certain functionality required by the customer of an electronic component can be achieved only if the product has a certain complexity. The minimum feature size of the technology could be defined from the required complexity. This minimum feature size is associated with a maximum current density in interconnects, which together with a corresponding lifetime and failure rate, is a reliability target for the reliability qualification of the technology. The failure mechanism related to current density is electromigration. The failure criterion is defined by the maximum resistance change tolerated by the design of the product. In this case, the critical parameter for reliability qualification is ΔP = ΔR. By applying reduced current densities to a design, the target failure rate could be reduced or the lifetime could be prolonged by keeping the failure rate constant.
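For the zero-failure case mentioned among the options above, the two assumptions (a known model with typical parameters, and a first failure right at the end of the stress time) lead to a simple conservative estimate. This is a minimal sketch in Python; it assumes a Weibull distribution for brevity, although the text names the lognormal distribution as one possible model, and it uses Benard's median-rank approximation to assign a cumulative failure fraction to the first failure. The function names and the numeric values are hypothetical.

```python
import math


def first_failure_fraction(sample_size):
    """Benard's median-rank estimate of F for the first failure out of n."""
    return (1.0 - 0.3) / (sample_size + 0.4)


def conservative_t63(stress_time, sample_size, beta):
    """Characteristic life under stress if the first device failed at stress_time.

    Assumes a Weibull distribution with a typical (assumed) slope beta.
    """
    f1 = first_failure_fraction(sample_size)
    return stress_time / (-math.log(1.0 - f1)) ** (1.0 / beta)


# Hypothetical case: 77 devices survive 1000 h of stress, assumed slope of 2.
t63_stress = conservative_t63(stress_time=1000.0, sample_size=77, beta=2.0)
```

The estimate is a lower bound on the characteristic life under stress and still has to be transferred to use conditions with the acceleration model of the failure mechanism.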
9.3 Definition of a Qualification Family
9.3.1 Wafer Fab

All semiconductor components using the same technology, process and materials
with common major elements (such as 90 nm effective channel length, Cu
metallization, intermetal dielectric material, shallow trench isolation) are
categorized as one qualification family and are qualified by association when
one family member is successfully qualified. Family re-qualification is
required when the process or material is changed significantly (major process
changes). Typical considerations for wafer fab process descriptions are:
design rules, lithography technique, metallization, polysilicon, dielectrics
etc.

9.3.2 Assembly Processes

The processes for plastic and hermetic package technologies must be considered
and qualified separately. All semiconductor components using the same process
and materials, with common major elements (such as biphenyl mold compound,
Alloy 42 leadframe material, Pb-free lead plating), are categorized as one
qualification family and are qualified by association when one family member
is successfully qualified. Family re-qualification is required when the
process or material is changed significantly (major process changes). Typical
considerations for assembly process descriptions are: leadframe, die attach,
package material, bonding, external lead finish etc.

9.4 Qualification Envelope

For ASICs, the alignment of requirement with the specification of technology
and semiconductor component can be done by direct correlation. For cases in
which a broad spectrum of applications must be addressed, an adapted approach
for generating the qualification plan must be applied. Taking the worst-case
values for each specified characteristic separately is one way to create this
envelope (specification X in Figure 9.4). In many cases, however, this
procedure leads to an overly conservative specification, unnecessarily
increasing the cost of development and qualification. It is advantageous to
define the envelope in more detail to generate an efficient solution
(specification R). Figure 9.4 illustrates the difference between the two
approaches for two spec parameters A and B.

The more efficient approach fulfils the same requirement from Applications 1
and 2 and prevents over-engineering.

[Figure 9.4: axes Parameter A and Parameter B, showing Appl. 1 and Appl. 2
enclosed by Specification X and by Specification R]

In cases where specific applications are not known, it becomes a strategic
decision as to how to define the spec area of the semiconductor component and
the corresponding qualification envelope. The trade-off between costs,
Robustness Margin, and spec area must be found based on generic market
information. In this case, the robustness of the semiconductor component must
be measured with respect to this generic specification, so that the robustness
for an intended application
can be calculated in the process of choosing a semiconductor component for a
specific module design. This means, for instance, that the same product could
fit into a safety relevant application with lower spec values and high
robustness and fit into an uncritical application with a higher spec value and
lower Robustness Margin.

9.5 Characterization Plan

The Characterization Plan should include the plan for material and testing to
ensure the functionality of the product over all production variations and all
temperature and voltage ranges. Testing should be at both spec limits and
beyond (as appropriate) to determine the margins. The plan should include
process variability (that is, corner lot details) to show the range of
production material. Data should be statistically summarized to show the Cpk
of each parameter.

Any data from previous characterizations is also useful. The purpose of this
is to make sure that the design and production are capable of maintaining
specified Cpk values for all specified parameters.

9.5.1 Process Characterization

Note:
The user shall not exceed agreed-upon specification limits under any
circumstances. Characterization beyond specification limits is for information
on robustness only. If at any time a part is found to operate, during the
application, beyond the agreed-upon limits, this requires agreement by both
parties, especially on legal and financial indemnification to the supplier on
the part of the user.

Process variability characterization may take the form of process corner
matrix lots containing groups of wafers that have one or more process steps
varied by plus or minus several sigmas. If process variations cannot be
produced in a dedicated manner, test samples may be selected by using SPC data
to identify wafers or test samples near or beyond the limits. The chosen
process steps should be known to directly correlate to specific functions or
electrical parameters affecting the performance of the device in a given
application. The packaged devices from this material can then be assembled
into an automotive system to understand which region of the process space
yields the best performing devices. This type of characterization should be
performed for new supplier semiconductor component designs that have little or
no relation to previous designs, the first automotive application of an
existing semiconductor component design, or a more demanding automotive
application for an existing design. The characterization requirements may be
reduced if appropriate data exists from other project(s).

1. Determine relevant failure modes of the semiconductor component that can
   affect the application.
2. Correlate these failure modes with the corresponding process parameters
   that affect them.
3. Determine if there is independence between the failure modes and the
   corresponding process parameter. If not, a design-of-experiments may need
   to be performed.
4. Assign statistically significant sample sizes to each process corner split
   when performing electrical testing.
5. Record variable electrical parameter data over extremes of temperature,
   voltage, frequency, and/or loading. The variables data for each parametric
   test should include a summary of mean, standard deviation, minimum,
   maximum, Cpk and upper and lower spec limits for each process corner and
   extreme sampling.

These variability considerations should be supported by appropriate
simulations of critical sub circuits. Typical characterization plans may
include the demonstration of process variability. This may be in the form of a
corner lot plan including device parameters (threshold voltage Vt, effective
channel length Leff, etc.) to be varied and the expected effect on
semiconductor component performance. The yield values shall be assessed with
respect to the target yield, and potential yield detractors should be used as
a starting point for continuous improvement planning.
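The parametric summary asked for in the characterization steps (mean, standard deviation, minimum, maximum, Cpk and spec limits) can be sketched as follows. This is a minimal illustration, not a prescribed method: the function name `summarize_parameter`, the threshold-voltage readings and the spec limits are hypothetical, and Cpk is taken in its usual form, the distance from the mean to the nearer spec limit divided by three standard deviations.

```python
import statistics

def summarize_parameter(values, lsl, usl):
    """Summary statistics for one parametric test at one process corner."""
    mean = statistics.mean(values)
    sigma = statistics.stdev(values)  # sample standard deviation (n - 1)
    # Cpk: distance from the mean to the nearer spec limit, in units of 3 sigma.
    cpk = min(usl - mean, mean - lsl) / (3.0 * sigma)
    return {
        "mean": mean, "stdev": sigma,
        "min": min(values), "max": max(values),
        "lsl": lsl, "usl": usl, "cpk": cpk,
    }

# Hypothetical threshold-voltage readings (V) and spec limits for one corner.
vt_readings = [0.48, 0.50, 0.52, 0.49, 0.51, 0.50, 0.47, 0.53]
summary = summarize_parameter(vt_readings, lsl=0.40, usl=0.60)
print(summary)
```

Running the same summary per process corner and per temperature/voltage extreme gives one row per condition, which makes it easy to see where the Cpk drops first.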
9.5.2 Device (Semiconductor Component) Characterization

The characterization conditions depend on the component under consideration.
For digital circuits, typical characterization plans may include:
• Min/Max operating parameters: tests that find min/max conditions for supply
  voltage, bus timing, frequency, etc.
• Margin testing: voltage will be characterized to find the potential fail
  levels (I/O levels).
• Current level characterization: measure all leakage currents (Idd etc.) to
  determine margins to the specification.
  -- Power supply current level characterization: measure Iddq and ΔIddq to
     determine static test pattern and power supply current margins to the
     spec.
  -- Leakage current level characterization: measure all leakage currents.
     Measuring leakage as a condition of connecting Vcc/GND directly to CMOS
     gate and determining the spec limit, in order to detect potential
     defects, helps to improve robustness.
• PLL characterization (stabilization time, lock, jitter).
• Oscillator parametric tests.

9.5.3 Production Part Lot Variation Characterization

Production parts from a centered process can be characterized over
temperature, voltage, frequency and loading to understand their inherent
variance. Devices from specific parametric extremes can then be assembled into
a system to observe if the process centering yields weak regions in the
process space. This type of characterization can also be used if process
corner matrix lot characterization proves to be too expensive or time
consuming for the anticipated benefit in yield and performance.

1. Decide on a statistically significant sample size if the entire lot is not
   to be tested.
2. Record variable electrical parameter data over extremes of temperature,
   voltage, frequency, and/or loading for each part tested. The variables data
   for each parametric test may include a summary of mean, standard deviation,
   minimum, maximum, Cpk and upper and lower spec limits for each extreme
   sampling.
3. If attribute data is to be taken, electrical tests over extremes of
   temperature, voltage, frequency and/or loading may result in shmoo plots
   diagramming the functional parameter space within which the part will
   operate.

9.6 Sample Size and Basic Statistics

Typically, sample sizes available for qualification are small compared to the
failure target which should be achieved under high volume production of the
electronic component. A Zero-Fail at stress strategy is only able to
demonstrate that catastrophic problems are not expected to happen. This can
easily be seen in Table 9.4, which lists the number of test devices needed to
demonstrate a certain failure level with 90 % confidence, assuming that no
failure is detected after the test. To give an example: 0 failures out of 77
samples at 90 % CL demonstrates a failure level of 30,000 dpm.

Table 9.4 Required Sample Size for Different Failure Targets Assuming 90 % CL

Samples    Failure target (dpm)
4          100,000
231        10,000
462        5,000
2304       1,000
4606       500
23027      100
115130     20
232600     10

A failure distribution as shown in Figure 9.5 with 50 failed devices out of 77
samples could be used for extrapolating down to the ppm target level at a
specific stress test time.
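The relationship behind a zero-fail sample size table can be sketched as follows. This is an illustrative calculation under stated assumptions, not the handbook's own derivation: the function names are hypothetical, failures are assumed independent with a constant failure fraction p, and two variants are shown, the exact binomial condition (1 - p)^n <= 1 - CL and the small-p approximation n ≈ -ln(1 - CL) / p. The approximation reproduces the 0/77 example for 30,000 dpm; results can differ by a sample or so depending on which variant and which rounding convention is used.

```python
import math

def zero_fail_sample_size(target_dpm, confidence=0.90, exact=False):
    """Samples needed, with zero failures, to demonstrate a failure fraction
    no worse than target_dpm at the given confidence level."""
    p = target_dpm / 1e6
    if exact:
        # Binomial condition: (1 - p)^n <= 1 - confidence
        n = math.log(1.0 - confidence) / math.log(1.0 - p)
    else:
        # Small-p approximation: n * p >= -ln(1 - confidence)
        n = -math.log(1.0 - confidence) / p
    return math.ceil(n)

def demonstrated_dpm(samples, confidence=0.90):
    """Upper bound on the failure fraction (in dpm) after 0 fails in `samples`."""
    return (1.0 - (1.0 - confidence) ** (1.0 / samples)) * 1e6

n_approx = zero_fail_sample_size(30_000)             # approximation
n_exact = zero_fail_sample_size(30_000, exact=True)  # exact binomial
bound_77 = demonstrated_dpm(77)
print(n_approx, n_exact, round(bound_77))
```

The calculation makes the limitation of a zero-fail strategy concrete: demonstrating single-digit or double-digit dpm levels this way would need sample sizes far beyond what a qualification can provide.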
Figure 9.5 Example of a Weibull Probability Plot for 50/77 Failed Devices
Including 90 % Confidence Curve
[Weibull probability plot: CDF F(t) / % (1.00E-4 to 99.00) versus Time (a.u.)
(0.1 to 1000.0), with the end of test marked]
Note:
The RV Method generates statistical knowledge
which could be used for better risk analysis.
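Extrapolating a measured failure distribution down to a ppm target level can be sketched with a simple Weibull fit. The sketch below is one possible approach under stated assumptions, not the method used to produce Figure 9.5: it uses median rank regression with Bernard's approximation, treats the unfailed units as suspended after the last failure, and does not compute the confidence curve. The function names and the synthetic data (shape 2, scale 100 a.u.) are hypothetical.

```python
import math

def weibull_mrr(failure_times, sample_size):
    """Fit Weibull shape (beta) and scale (eta) by median rank regression.

    Units that survived the test (sample_size - len(failure_times)) are
    treated as suspended at the end of test, after the last failure.
    """
    times = sorted(failure_times)
    xs, ys = [], []
    for i, t in enumerate(times, start=1):
        f = (i - 0.3) / (sample_size + 0.4)      # Bernard's median rank
        xs.append(math.log(t))
        ys.append(math.log(-math.log(1.0 - f)))  # Weibull plot ordinate
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta = sxy / sxx                              # slope of the fitted line
    eta = math.exp(x_mean - y_mean / beta)        # time at F = 63.2 %
    return beta, eta

def time_at_fraction(beta, eta, fraction):
    """Stress time at which the fitted CDF reaches `fraction` (e.g. 100e-6)."""
    return eta * (-math.log(1.0 - fraction)) ** (1.0 / beta)

# Synthetic example: 50 failures out of 77, placed exactly on a Weibull line.
true_beta, true_eta, n_total, n_fail = 2.0, 100.0, 77, 50
demo_times = [
    true_eta * (-math.log(1.0 - (i - 0.3) / (n_total + 0.4))) ** (1.0 / true_beta)
    for i in range(1, n_fail + 1)
]
beta, eta = weibull_mrr(demo_times, n_total)
t_100ppm = time_at_fraction(beta, eta, 100e-6)
print(beta, eta, t_100ppm)
```

With real data the points scatter around the line, and the extrapolation to ppm levels is only as good as the assumption that a single failure mechanism with a single Weibull slope is present.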
10. Stress and Characterization
The stress tests must be performed according to the requirements specified in
the Qualification Plan. The equipment must satisfy the requirements with
respect to the stress test parameters as defined in the Qualification Plan,
and the tolerances of the parameters must be known. The reference column of
the Knowledge Matrix contains a detailed description of the method and how to
perform the test.
Characterization data must be completely logged for all readouts. The test at
readout must comprise the full program. There must be no stop-on-first-fail,
and parameter values must not be substituted by error log values. The latter
should only help to identify problems during testing. The parameter values
will be needed for further drift/fail analysis. Critical parameters may be
monitored continuously during the entire characterization procedure in order
to react quickly and in a focused way in case of failures. The measurement
frequency must be adapted to the level of acceleration.

Note:
It is not useful to log parameters not related to the applied stress for all
readouts.

Note:
In some cases, only the time to (catastrophic) fail is recorded. For example,
temperature cycling (TC) often leads to catastrophic failures, such as an
electrical failure due to cracking, where the degradation itself (crack
initiation and propagation) cannot be observed. In such cases, only the time
to fail, t_f, is available.

The output of the qualification test shall be documented with the parameters
listed:
• P = P(t), the change of the parameter over the time of stress if there is a
  continuous degradation, or
• P_i = P(t_i), if there are discrete readout intervals, denoted by the
  subscript i.
• Fail distribution (TTF) under stress condition.
• Confirmation is needed that the intended failure mechanism has really been
  activated. Different failures that may occur but are not correlated to the
  addressed failure mechanism must receive special attention and must be
  treated separately in the lifetime/risk assessment. Such failure mechanisms
  may show up as irregularities in the degradation curve or failure
  distribution, such as bimodality.
• Model used to convert results from stress test to lifetime at use
  conditions.
• Other factors that must be taken into account, such as duty cycle.

Based on this information, the lifetime per failure mechanism can be
calculated:
• Lifetime under stress conditions t_stress
• Lifetime under use conditions t_use
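How a lifetime under stress conditions is converted into a lifetime under use conditions depends on the acceleration model for the failure mechanism in question. As a minimal sketch, assuming a purely temperature-activated mechanism described by an Arrhenius model and a simple duty-cycle correction, the conversion could look like this. The activation energy, temperatures, stress lifetime and duty cycle are hypothetical values chosen for illustration, and the function names are not taken from the handbook.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K

def arrhenius_af(ea_ev, t_use_c, t_stress_c):
    """Acceleration factor between use and stress temperature (deg C inputs)."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use_k - 1.0 / t_stress_k))

def lifetime_at_use(t_stress_hours, acceleration_factor, duty_cycle=1.0):
    """Convert a lifetime under stress into calendar lifetime at use conditions.

    duty_cycle is the fraction of calendar time the device is actually stressed.
    """
    return t_stress_hours * acceleration_factor / duty_cycle

# Hypothetical inputs: Ea = 0.7 eV, 55 deg C use, 125 deg C stress, 10 % duty cycle.
af = arrhenius_af(ea_ev=0.7, t_use_c=55.0, t_stress_c=125.0)
t_use = lifetime_at_use(t_stress_hours=1000.0, acceleration_factor=af, duty_cycle=0.1)
print(round(af, 1), round(t_use))
```

Other mechanisms need other models (for example voltage or temperature-cycle acceleration), so the model actually used should always be documented alongside the result.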
11. Robustness Assessment
The robustness assessment must be done separately for each identified failure
mechanism using the Knowledge Matrix when the potential risks and failure
mechanisms for this qualification were assessed. Failure mechanisms that were
not identified but did occur in the qualification will also be assessed for
robustness. The robustness assessment is done by compiling a robustness
diagram and comparing stress test and characterization data to the
requirements.

11.1 Lifetime as a Function of Stress Value

During reliability qualification, the reliability characteristic P has been
measured over time as a function of the stress value S_i (see Figure 11.1).
Examples for such degradation curves could be:
• Resistance degradation at various current densities and temperatures
• Degradation of the 1-level vs. read/write cycles of non-volatile memories
• Degradation of the small signal gain of a common source amplifier at various
  stress voltages
• Degradation of the output current of a current source at various stress
  temperatures

Figure 11.1 Reliability Characteristic as a Function of Time
[Plot: P = P(t) for stress values S1 < S2 < S3, with the failure criterion
P_target]

Whenever the degradation curve reaches the failure criterion P_target, the
lifetime t_P(S_i) corresponding to the stress value S_i is determined.

A basic guideline giving details on how to generate and use a failure
distribution has been published by ZVEI in 2012 (see Appendix D3, How to
measure lifetime for Robustness Validation - step by step).

11.2 Determine Boundary of the Safe Operating Area

The boundary of the safe operating area can be calculated from the stress
lifetime values t_P(S_i) (see Figure 11.2). Stress/time values below the
measured curve do not result in a failure; values on and above the curve will
result in a failure.

The stress-lifetime curve can be extrapolated to use conditions if the
acceleration model is known (see Knowledge Matrix). If the model is not
available from the Knowledge Matrix or relevant standards, one should apply
the method described in JESD91A (Method for Developing Acceleration Models for
Electronic Component Failure Mechanisms) [2].
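Determining the lifetime at which a degradation curve reaches the failure criterion can be sketched for the case of discrete readouts. This is a minimal illustration assuming linear interpolation between the two readouts that bracket the criterion and a characteristic that increases with damage; the function name, the readout values and the 10 % criterion are hypothetical.

```python
def lifetime_from_readouts(readouts, p_target):
    """Return the time at which the degradation P(t) first reaches p_target.

    readouts: list of (t_i, P_i) pairs sorted by time, P increasing with damage.
    Uses linear interpolation between the two readouts bracketing p_target and
    returns None if the failure criterion was not reached during the test.
    """
    if not readouts:
        return None
    if readouts[0][1] >= p_target:
        return readouts[0][0]
    for (t0, p0), (t1, p1) in zip(readouts, readouts[1:]):
        if p0 < p_target <= p1:
            return t0 + (p_target - p0) * (t1 - t0) / (p1 - p0)
    return None

# Hypothetical readouts (hours, resistance shift in %) for stress values S1 < S2 < S3.
curves = {
    "S1": [(0, 0.0), (250, 2.0), (500, 4.0), (1000, 8.0)],
    "S2": [(0, 0.0), (250, 4.0), (500, 8.0), (1000, 16.0)],
    "S3": [(0, 0.0), (250, 8.0), (500, 16.0), (1000, 32.0)],
}
P_TARGET = 10.0  # assumed failure criterion: 10 % shift
lifetimes = {s: lifetime_from_readouts(c, P_TARGET) for s, c in curves.items()}
print(lifetimes)
```

A stress value whose curve never reaches the criterion (S1 in this example) yields no lifetime point, which is one reason why testing to failure gives more information for the stress-lifetime curve than a fixed-duration pass/fail test.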
Figure 11.3 Robustness Target Definition
[Plot labels: P, P_robust, Point of Use, Robustness Required, Robustness
Target Boundary]

An example for such a type of robustness target could be the maximum allowed
leakage current calculated based on the maximum allowed quiescent current over
a use time of ten years of operation. The robustness target could be found by
defining the robust value for the leakage current after 10 years of operation
and by defining the operation time without violation of the leakage current
target. Values between these robustness targets could be found by
interpolation.

If the measured robustness curve (blue curve in Figure 11.4) is outside the
target area (black boundary in Figure 11.4), the robustness per failure
mechanism is sufficient. The blue bar defines the measured Robustness Margin
with respect to a specific failure mechanism and a target value (see also
Figure 4.2).
In general, robustness of a parameter means that the parameter is less
sensitive to changes in the statistics of the input parameters (conditions of
use, process variations, etc.). Therefore, it is useful to introduce a
deviation parameter σ to the RIF. Furthermore, because reliability lifetimes
are often Weibull or log-normal distributed, ln(t) is used, which is
(approximately) normally distributed. In analogy with Cpk, an RIF of a
reliability parameter is defined by
12. Improvement
If the measured robustness is less than the target, there are several possible
reactions. Some of the measures could be applied before the robustness
measurement in the development phase. In all cases, this is the preferred
approach. The following options for improvement are not sorted by priority or
effectiveness. Each option must be checked to determine which measure is the
most effective and the most efficient and therefore has to receive top
priority.

12.1 Stress Set-up Review

When the evaluated robustness does not match the targeted level, the first
step (least expensive) consists of a review of the set-up and the data.

The following points should be checked:
• Confirmation of data:
  -- Are the stress conditions correctly defined (e.g. to avoid overstress
     that stimulates irrelevant failure mechanisms)? Is the equipment
     calibrated?
  -- Are the measurement conditions under control?
• Review of sample selection:
  -- Are the samples representative for the production?
  -- Were weak engineering samples selected for test?
  -- Are the already stressed samples identified?
• Feasibility of failure mode:
  -- Is there a good matching between targeted and obtained failure mode?
  -- What is the model used?
  -- How are the model parameters used for the extrapolation chosen?
• Review of stress method:
  -- Is there a good matching between applied stress and targeted failure
     mechanism?
  -- Did the test comply with the appropriate industry standard test method
     (e.g., JEDEC, IEC)?
• etc.
If the insufficient robustness is confirmed, improvement measures must be
defined.

12.2 Mission Profile Review

A more detailed review of the Mission Profile, especially for the critical
topics with low robustness values, should be performed to identify safety
margins that have been added due to the lack of knowledge or data. The result
of this activity could be a newly defined point of use. A tool like FMEA could
support this activity to quantify the risks associated with critical topics.

12.3 Application Review

The robustness of a product is reflected in the application. Improvements
could be made by alignment of the system design with the user application
through co-engineering activities between the supplier and the user.
Design-related issues could be addressed if the supplier and the user are
involved and understand the use application and requirements. Also, both
design teams can become educated on proper device use and specification:
• Check if a robustness target is required in detail.
• More robust system design.
• Part de-rating.

12.4 Screening Strategy

The screening strategy should be adapted to the failure rate target. It should
address the failure mechanisms identified during the reliability
qualification. The failure mechanism Knowledge Matrix could be used in the
adapted stress method.
The screening strategy should be based on:
• Deployment of Part Average Testing to detect and eliminate the outlier
  devices.
• Deployment of Statistical Yield Analysis and Statistical Bin Limit to
  separate the abnormal wafers.
• Standardization of test programs: definition of a basic test program
  template, based on frontend technology, block functionality, product and
  application specificities.
If burn-in is used to reduce failure rates, it must be demonstrated by a
burn-in study that the relevant failure mechanism is really addressed.
In some cases, the weak parts are related to certain locations on the edge of
the wafer. If this is the case, the wafer edge exclusion zone can be changed
to solve the problem [10].
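The idea behind outlier screening of the Part Average Testing kind can be sketched as follows. This is an illustrative sketch under stated assumptions rather than a normative procedure: limits are set at the median plus or minus k robust sigmas, the robust sigma is estimated from the interquartile range divided by 1.35, and k = 6 is assumed as the default. The function names and the leakage readings are hypothetical; the applicable guideline and the supplier's test flow define the actual limits.

```python
import statistics

def pat_limits(values, k=6.0):
    """Robust outlier limits: median +/- k * robust sigma (from the quartiles)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    robust_mean = statistics.median(values)
    robust_sigma = (q3 - q1) / 1.35  # IQR of a normal distribution is ~1.35 sigma
    return robust_mean - k * robust_sigma, robust_mean + k * robust_sigma

def find_outliers(values, k=6.0):
    """Indices of devices whose reading lies outside the robust limits."""
    low, high = pat_limits(values, k)
    return [i for i, v in enumerate(values) if v < low or v > high]

# Hypothetical leakage readings (uA); the last device is within spec but atypical.
readings = [1.00, 1.01, 0.99, 1.02, 0.98, 1.00, 1.01, 0.99, 1.00, 1.50]
low_limit, high_limit = pat_limits(readings)
outliers = find_outliers(readings)
print(low_limit, high_limit, outliers)
```

Using the median and quartiles instead of the mean and standard deviation keeps the limits from being widened by the very outliers the screen is meant to catch.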
12.5 Design for Reliability (DfR)

DfR is a powerful approach to prevent reliability problems in the application
by early application of design measures. Therefore, the measures listed below
should have been applied already during the design phase. Depending on the
reasons for insufficient robustness, the enhancement of these measures should
be discussed in this phase. Potential measures are:
• Redundant design.
• Combined with an adequate simulation tool and the data of the RV, the weak
  links in the design could be identified and mitigated. Examples are
  redundant vias and broadened lines in the interconnect part of the
  semiconductor component.
• Reliability simulation.
• Simulation should be performed again using the RV results to improve the
  accuracy of the simulation data.
• Part de-rating.

12.6 Technology/Design Solution

Technology solutions are the set of smart solutions (coupling process, design,
and application) able to resolve the RV gap. A comprehensive list cannot be
provided here as these solutions require a case-by-case definition, but
examples could be:
• Junction temperature watchdog. This device monitors the product operating
  junction temperature and is able to activate a low-power mode with reduced
  functionality when the junction temperature is above a defined limit.
• Lowering voltage for voltage driven risks (pushing the process to its
  capability limit).
• Multiplying or removing critical elements, like decoupling capacitors, if
  they do not meet the Mission Profile requirements.
• Critical element redundancy and switch capability: The critical element
  (like a capacitor) is controlled via a defined circuitry able to switch to a
  new capacitor if the first one fails.
• Use chip parasitic structures to clamp or bypass critical stress conditions.
• Use stress relief packages to absorb and limit damages related to mechanical
  constraints if present in the application.
• Tighten process limit.
• Chip redesign to address robustness issues.

If there is more than one option to improve the robustness, a trade-off must
be found. The solution depends on the weight of these factors:
• Cost
• Schedule
• Quality
• Performance
When robustness improvement cannot be achieved by the means indicated above,
the project situation and the related risks must be reviewed between the
customer and the semiconductor component supplier.
The preferred strategy is to perform iterations in the definition of the
product mission critical elements in order to offer a more effective trade-off
between risks, performance, development timing, development costs,
semiconductor component cost, and end-user requirements.
A typical example of such a situation would be if the current prevention
techniques do not enable the expected target in the Mission Profile to be met.
A potential trade-off in such a situation is to increase the silicon size, and
costs, to add error detection and correction circuitry, for instance for
embedded DRAM specific failure modes.
Before implementing the solution for improving robustness, the proposed
solution must be reviewed with respect to several aspects:
• Is the expected improvement good enough with respect to the robustness
  target?
• Does the solution influence the robustness with respect to other failure
  mechanisms?
• What is the implementation risk (probability that the implementation
  fails)?
In most cases, the solution of a problem of low robustness generates some
basic knowledge. This could be:
• New design strategies.
• Better understanding of degradation or extrapolation models.
• Better understanding of Mission Profiles.
This knowledge should be fed back to the corresponding knowledge base to be
made available for the next generation of projects. New failure mechanisms or
their model description should be used to update the Knowledge Matrix (see
Section 8.1).

Note:
Continuous improvement to achieve the concept of ‘Zero Defects’ [6], while
important and comprehensive in scope, is outside the purview of this document.
13. Monitoring
There will be occasions on which the supplier will want to implement a change
to a qualified product in production to improve the product, throughput,
manufacturing capacity and/or cost. While there are a number of industry
standards and individual user and supplier requirements for qualifying these
changes [3, 4, 5], these are outside the scope of this document. Suffice it to
say that the user and supplier will need to agree on an RV Plan for changes
using the concepts outlined in this document. In case of changes in the
product or new application conditions, the monitoring plan must be reviewed.
The section defines documentation contents as well as communication paths
along the value chain for clear understanding and agreement among the
partners. Focus is set on the basic relationship between the semiconductor
component and the ECU manufacturer. Special cases, such as direct
communication from Tier2 to OEM or intermediate steps from Tier1 to OEM
regarding component aspects, need to be described and contracted individually
at the beginning of the cooperation.

Reporting differentiates between Commodity Components and ASICs.

14.1 Content, Structure

14.2 Documents for Communication, Handouts and General Remarks

The contact partners and addresses or functional entities in the organizations
of the involved parties must be agreed and documented at the start of the
project. Restrictions on information that is considered to be confidential
must be clearly identified and documented in detail at project start.

Generally, the sharing of confidential information is regulated and described
by a Non-Disclosure Agreement between the supplier and the customer.
15. Examples
15.1 Examples of the Lack of or Poor Qualification
15.1.1 Delamination between Mould Compound and Die/Lead Frame

This example shows that end-of-life testing is needed in order to correctly
assess the risk of potential delamination between the mould compound and
die/lead frame.

After being used for two years with 2 million parts in a safety critical
application without displaying any prior trouble, the semiconductor component
suddenly displayed a sharp increase of catastrophic failures due to lifted
bond wires.

Temperature cycling using standard stress test qualification procedures did
not show any failed parts, which means that the material in use in the field
would have passed qualification because the standard stress test resulted in
no fails after the required stress time. This demonstrates the weakness of the
standard method [5] because either the automotive application was not covered
by the stress condition or the acceleration factors were different from the
ones used in the standard. Both weaknesses would have become obvious if
end-of-life testing as required by RV had been chosen.

Figure 15.1 Failure Distribution of Liquid-Liquid-Temperature Cycling
[Plot: Failure Rate [%] (0 to 80) versus Temperature-Cycles (3k, 6k) for the
series Qualified and New]

The new mould compound, which should solve the problem, had been selected by
using liquid-liquid temperature shock, a highly accelerated stress test
sometimes used during development, which allows relative judgment of
robustness related to an already qualified reference status. The test ran
until the parts started to fail (end-of-life), so that the deviation in the
behavior of the weak parts compared with parts that had been used previously
without failures became obvious: The weaker parts/lots with field failures
showed a significantly earlier end-of-life behavior than the older good ones
that showed sufficient lifetime results during qualification. The failure
analysis traced the origin of the ‘improvement’ to a modification of the mould
compound that yielded a lower adhesion to die/lead frame.

Thus, by using end-of-life data, this wear-out mechanism became clearly
visible (see Figure 15.1).

The new compound was checked against the end-of-life data again and could be
released in a short time based on relative assessment of lifetime with respect
to already existing qualification data that had demonstrated its robustness in
an earlier qualification. Even though the physics of failure are not fully
understood, EOL data can be used to compare different materials.

This particular problem also shows the urgent need for specific monitoring
tests to detect changes in the robustness before they become conspicuous
problems in the field.

15.1.2 Qualification of a New Leadframe Finish

The qualification results of a new low-cost leadframe finish did not show any
electrical failures with tested parts (45/0). The stress tests had been
performed according to a stress test driven standard with preconditioning
followed by temperature cycling. So, the decision was made to change to the
new material according to this ‘positive’ go/no-go result, because intrinsic
wear-out problems seemed to be excluded by the tested sample. Further analysis
was not done, because there
seemed to be no indication of any different result compared to the zero fails
from the original qualification performed at the first qualification with the
standard frame finish.

But additional investigations done by another party with only 18 parts as part
of pre-qualification work, using end-of-life tests, revealed that an
unexpected metallurgy caused increased weakness of the internal stitch bond
joints of the bond wires to the leadframe, with a sharp increase in failure
rate due to electrically open bonds after some additional stress. It could be
shown that degradation had already started at the end of the test time
required by the stress test driven standard without causing an electrically
detectable fail.
From the data of the end-of-life test, the potential failure rate in an
automotive application was calculated to be in the range of 5,000 ppm.

End-of-life testing avoided a catastrophic field situation by generating
Robustness Data with critical limits by using a small sample but testing until
failure, which other approaches [5] were not able to reveal previously.

15.1.3 Via-Problems in Semiconductor Component Metallization

After the release of a microcontroller based on a standard qualification [5],
the failure rate in the field approached 1000 ppm. The failure analysis showed
open vias in the IC metallization. Such opens could be caused by:
• Via formulation problem (existence of a void underneath via)
• Via hole over etching (pass through the barrier metal)
• Barrier metal formulation problem (existence of void side wall of via)

Figure 15.2 Open by Void Under a Via
[Cross-section labels: via, circuit wiring, intermetal dielectric, void]
Source: Infineon Technologies AG

Those via problems (see Figure 15.2) lead to timing delays of transistor
switching and signal. As a result, the following problems in memories
occurred:
• Timing delay of memory.
• Read problems of memory data.
• Data bus timing delay.
• Miscalculation due to data timing delay.

This commercial IC was sold in millions outside of the automotive industry
with no field complaints reported.
A comparison of the Mission Profile for which the IC was designed (commercial
high volume, low cost) and the automotive profile showed a severe mismatch in
the life expectancy of the metallization between both profiles. The weak point
was the submicron via process capability, which could not cover the automotive
profile.
Conventional stress testing did not detect this weak point because the sample
sizes were insufficient to detect an occurrence of 1000 ppm. But end-of-life
analysis on test structures confirmed the missing capability for the
automotive profile. The following IC was modified by using a redundant double
via technology in areas with sparsely populated vias and with a sufficient
safety margin between the process capability and the requirement in the
automotive application. As a result, no more failures happened. Because this
weak point is extrinsic in nature, it cannot be detected by qualification if
it has a low failure rate, as in this example. In this case, the main focus
should be on monitoring measures.

Generally, the countermeasure activities for these problems were both
reduction of particle density and strengthened process control. For example:
1. Process control of the critical Cpk values.
2. Test program:
   • Sufficient test coverage and Iddq test blocks.
   • Test criteria validation.
   • Operation margin validation under the condition of specification limit
     frequency and power supply voltage.
3. Electrical characteristics:
   The control limit should be set after the RV of static/dynamic electrical
   characteristics variation for each device terminal.
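Why a conventional qualification sample is unlikely to reveal a 1000 ppm weakness can be illustrated with a short calculation. This is a sketch under simplifying assumptions: each part is independently defective with the given probability, every defective part in the sample would actually fail during the test, and the sample sizes of 77 and 231 parts are hypothetical values chosen for illustration rather than figures from this example.

```python
def detection_probability(defect_ppm, sample_size):
    """Probability of seeing at least one failing part in the sample."""
    p = defect_ppm / 1e6
    return 1.0 - (1.0 - p) ** sample_size

p_detect_77 = detection_probability(1000, 77)    # one lot of 77 parts
p_detect_231 = detection_probability(1000, 231)  # three lots of 77 parts
print(round(p_detect_77, 3), round(p_detect_231, 3))
```

Even with three lots the chance of catching the weak point stays well below one in four, which is consistent with the recommendation to address such extrinsic weaknesses through monitoring rather than qualification alone.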
15.2 Integrated Capacitor Design
Some devices need capacitors with large capacitances. They are implemented as large area capacitances using a specific dielectric as the isolator and are implemented into the interconnect part of the device structure. The reliability assessment using the knowledge base uses the following steps:
1. The following data must be generated within requirements management:
• Voltage at the capacitor
• Junction temperature during use
• Effective time of use (duty cycle)
• Capacitor area needed
• Safety margin for intrinsic breakdown
2. The potential risk and failure mechanism for capacitors is the hard breakdown of the dielectric.
3. The information for the Qualification Plan can be extracted from the knowledge base:
• Test structure design: large area capacitor.
• Parameter for characterization: leakage current.
• Degradation model
-- Percolation model
• Acceleration model
-- Linear E-field for voltage
-- Arrhenius for temperature
-- Area scaling with Poisson distribution
4. Tests must be performed according to the description in JP001.01. The basic model data are usually measured during technology development and qualification. For a new product, these numbers could be used to calculate the expected intrinsic failure rate for the capacitor area planned including the intended safety margin, typically in terms of lifetime. Robustness areas could be defined for:
• Capacitor area vs. lifetime (E, T const)
• Capacitor area vs. electrical field (T, lifetime const)
• Electrical field vs. lifetime (A, T const)
-- The extrinsic fail distribution is dominated by defects. Therefore, especially for large area structures, the defect density must be kept under control in production by monitoring. First, data to guarantee a certain upper level for the extrinsic distribution could be generated during reliability qualification stress testing. Monitoring of extrinsic defects should be part of the monitoring plan.

15.3 Requirement Temperature Cycles

Different points of use in the car have different requirements with respect to the number and the ΔT of temperature cycles. The following steps describe how to cover this topic within RV:
1. Requirements management delivers the Mission Profile. Example of a typical IC for an air-conditioning system:
9000 cycles with ΔT = 110 °C
7500 cycles with ΔT = 80 °C
3000 cycles with ΔT = 70 °C
2. From the Knowledge Matrix, we learn that a certain set of failure modes, such as broken bond wire and solder fatigue, is generated by this stress.
3. The most effective test structure to evaluate this failure mode is the assembled IC. For extrapolation from stress to operating conditions, the Coffin-Manson model is used.
4. Acceleration factor: $AF = (\Delta T_{stress} / \Delta T_{use})^{p}$
5. If the Coffin-Manson constant p is known from experiments, it is possible to calculate the required number of cycles at a certain ΔT_stress (here -40 °C to +140 °C, ΔT = 180 °C), which are needed to cover the three conditions from 1. This number plus the safety margin is the target. T_max and T_min should not be in a parameter range where new or not application relevant failure mechanisms may be activated. Table 15.1 gives information about how the use conditions are transferred into stress condition for two failure modes (the assumption is that these numbers have been measured for the material and process used):
• Al wire bond fracture: p = 5
• Delamination: p = 4.2
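As an illustration of steps 4 and 5 in section 15.3, the following minimal Python sketch converts the air-conditioning mission profile into an equivalent number of cycles at ΔT = 180 °C. The function and variable names are assumptions made for this sketch, and the safety margin mentioned in step 5 is not included because the handbook gives no value for it.

```python
# Sketch only: Coffin-Manson conversion of a cycle profile to test cycles.
# AF = (dT_stress / dT_use) ** p, as in step 4 of section 15.3.

def equivalent_test_cycles(profile, delta_t_stress, p):
    """Return the number of cycles at delta_t_stress that causes the same
    Coffin-Manson damage as the whole profile.

    profile: list of (cycles, delta_t_use) tuples, delta_t in K or degC swing.
    """
    total = 0.0
    for cycles, delta_t_use in profile:
        acceleration_factor = (delta_t_stress / delta_t_use) ** p
        total += cycles / acceleration_factor
    return total


# Mission profile from step 1 (air-conditioning IC example).
AIRCON_PROFILE = [(9000, 110.0), (7500, 80.0), (3000, 70.0)]
DELTA_T_STRESS = 180.0  # -40 degC to +140 degC

# Exponents quoted in step 5 for the two failure modes.
for mode, p in [("Al wire bond fracture", 5.0), ("Delamination", 4.2)]:
    n_test = equivalent_test_cycles(AIRCON_PROFILE, DELTA_T_STRESS, p)
    # The safety margin still has to be added on top of this number.
    print(f"{mode}: about {n_test:.0f} cycles at dT = {DELTA_T_STRESS:.0f} K")
```

The lower exponent leads to the larger cycle count, so in this sketch the failure mode with p = 4.2 would set the test length.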
Table 15.1 Example of OEM Vehicle Mission Profile Parameters – (High Level)
Figure 15.5 Cracks in the Die Attach Due to Accumulated Damage during Thermal Cycling [17]
The accumulation of damage during thermal cycling can result in significant delamination between the die backside and the lead frame as shown in Figure 15.5.

Figure 15.6 Deterioration of Thermal Response as Function of the Heating Time for a New and an Aged Device [18]
For soft solder die attach, low cycle fatigue by cyclic accumulation of plastic deformation is a widely known degradation mechanism [19]. The number of cycles to fail (CTF) can be expressed as in formula (15.1), where A_0 is a constant and q is the Coffin-Manson exponent, which is specific to the degradation mechanism and typically material dependent. For ductile materials such as soft solder materials the Coffin-Manson exponent ranges from 1 < q < 3 [20].

$CTF = A_0 \, (\Delta T)^{-q}$ (15.1)
$N_2 = N_1 \, (\Delta T_2 / \Delta T_1)^{-q}$ (15.2)

With formula (15.1) the acceleration can be calculated and expressed as formula (15.2), where N_1 represents the number of temperature cycles at temperature swing ΔT_1 and N_2 the equivalent number of cycles at ΔT_2.

By knowing the expected temperature cycling load of the component in the application, the equivalent stress can be expressed in numbers of temperature cycles under qualification test conditions. By knowing the level of degradation for this temperature cycling load and its impact on the component's characteristics, a lifetime assessment can be performed.

15.4.5 Design for Lifetime Tools

In order to support the hardware engineer regarding the selection of an appropriate power MOS for a specific application, a tool would make sense which can deal with the number of thermal cycles expected in the application, and calculates the expected deterioration of the thermal response compared to new devices (thermal response curve as shown in Figure 15.6).

15.4.6 Impact on Design of the Application and Impact on Component Selection

The Step by Step Approach

a) Mission Profile consideration on application level: Evaluation of thermal cycling load on component level.
Output: Table with ΔT and amount of expected cycles in the field.

b) Calculation of equivalent damage due to the environmental load. Calculate the equivalent amount of qualification cycles for each ΔT and add these values to determine the equivalent damage based on equation (15.2) using the appropriate coefficient q.

c) Provide the thermal response curve for this damage or aging level.

d) Check if the thermal response of the aged component is still suitable for the application.

e) If not suitable, select a component with a better thermal response in initial state and go to b).

Responsibilities and deliverables:
a), d), e) hardware designer of the application
b) and c) component supplier
Appendix A – Knowledge Matrix
A template that can be used for reporting purposes is available on the ZVEI homepage.
http://www.zvei.org/RobustnessValidation under ‘Implementation’.
Appendix C – Terms, Definitions and Abbreviations
Appendix D – References and Additional Reading
D.1 References
D.3 Other Robustness Validation Documents
• Knowledge Matrix is published on ZVEI and SAE homepage (yearly update, currently 4th version under review)
• Robustness Validation for MEMS – Appendix to the Handbook for Robustness Validation of
Semiconductor Devices in Automotive Applications, 2009
• Handbook for Robustness Validation of Automotive Electrical/Electronic Modules / content
copy: SAE Standard J1211 (2008, under review)
• Automotive Application Questionnaire for Electronic Control Units and Sensors (2006, Daimler, Bosch, Infineon)
• Pressure Sensor Qualification beyond AEC Q 100 (2008, IFX: S. Vasquez-Borucki)
• Robustness Validation Manual – How to use the Handbook in product engineering (2009, Robustness Validation Forum)
• How to measure lifetime for Robustness Validation - step by step, ZVEI 2012
• H.-P. Hoenes (Fairchild Semiconductor), ‘Powering the Future – Design and Simulation: The
Key to high Reliability’, electronica automotive conference, 2006
• A. Gottschalk (RELNETyX), ‘Considerations of a Robustness Validation Procedure’, electronica
automotive conference, 2006
• A. Preussger (Infineon Technologies), ‘Robustness Validation – Improved IC Qualification for
Automotive Applications’, electronica automotive conference, 2006
• O. Mende (Audi), ‘General Conditions for the successful Integration of Semiconductor Devices in Automotive Electronics’, electronica automotive conference, 2006
• E. Schmidt (BMW), ‘Targets and Requirements of Robust Automotive Electronics’, electronica
automotive conference, 2006
• H. Keller and A. Preussger, ‘Robustness Validation’, Tutorial at ESREF, 2006
• A. Preussger, ‘Qualification Strategies’ Tutorial at IRW, 2006
• W. Kanert, ‘Qualification Strategies in the Age of Zero Defect’, AEC Reliability Workshop, 2006
• W. Kanert, ‘Robustness Validation – A Short Introduction’, FIEV, 2006
• W. Kanert, R. Rongen, and Z. Liang, ‘Robustness Validation – an Introductory Tutorial’, Proceedings of AEC Reliability Workshop 2013
• W. Kanert and R. Rongen, ‘Robustness Validation and Failure Rates’, Proceedings of AEC Reliability Workshop 2014
• W. Kanert, ‘Robustness Validation – A physics of failure based approach to qualification’, Microelectronics Reliability 54 (2014), 1648-1654
• Position Paper (R. Winter)
• Z. Liang, R. Kho, B. Bollig, R. Rongen, and F. Kuper, ‘Reliability Robustness Indicator Figure and its Application’, Proceedings of AEC Reliability Workshop 2012
D.6 Stress Test Standards
Appendix E – Relation to AEC-Q100/101 for already
qualified electronic components
Flow chart 2 (fig E1) describes (for details see Handbook for Robustness Validation of Automotive Electrical/Electronic Modules, ZVEI)
• (1) the process at Tier 1 to assess whether a certain electronic component fulfills the requirements of the mission profile of a new Electronic Control Unit (ECU); and
• (2) the process at Semiconductor Component Manufacturer to assess whether an existing component qualified according to AEC-Q100/101 can be used in a new application.

A detailed description of flow chart steps is given below (numbers refer to these specific flow chart steps).

E.1 Assessment on ECU level

1.1 Items to consider in constructing a Mission Profile Assessment:
• Type of application
• Requirements of service life and usage
• Environmental conditions / Mounting location
• Construction of the ECU
• Power Dissipation of ECU and components
• Reliability requirements in terms of lifetime and related failure probabilities

A structured analysis of the mission profile will identify potential reliability risks at an early stage of the development cycle, so that these risks can be addressed by appropriate component selection and validation.

1.2 Translation of ECU mission profile to component mission profiles, taking different loading on component level into account. Loads could be caused by assembly, shipping, storage, operation or environment. Vehicle service life is typically split into operating and non-operating parts.

1.3 Performing a ‘basic calculation’ facilitates the mission profile assessment via a high level check of the suitability of a component (or list of components) for the given application. These calculations enable the translation from the component mission profile to equivalent qualification test duration under specified conditions. The decision to be made here is strategic.

• Choose ‘no’ if it is already known that the product is marginal or critical.
• Choose ‘yes’ for an uncritical product, e.g. one with references to already qualified products and not at the extremes of its specification.

1.4 By applying the ‘basic calculation’, the mission profile is translated into an equivalent stress with the same conditions as the qualification standard test. Commonly accepted acceleration models and parameters are used and can be taken from the literature and/or standards (e.g. JEP122). Examples are given in table 9.1.

1.5 This calculated stress duration t_CALC (in hours or number of cycles) has to be compared to the standard qualification duration t_STAND, taking a safety margin t_SM into account.

1.6 In case t_STAND > t_CALC + t_SM, the component is assumed to be not critical/marginal. The safety margin t_SM has to be defined based on the application and customer requirements; there are no standardized rules for this. Assessment of criticality shall include the probability of failure until end-of-life.

E.2 Mission Profile Validation on Component Level

2.1 The recommended base for assessing the critical failure mechanism(s) is the Robustness Validation Knowledge Matrix or JEP122. Risk assessment should be performed covering at least the following main considerations:

• New materials or interfaces
• New design or production techniques
• Critical use conditions
Methods for risk assessment could be FMEA (AIAG, …), Risk Assessment, FTA or similar.

2.2 In case acceleration models are in use in the company or known from the literature, they can be taken to perform lifetime calculations. Experiments, simulation, or literature study can be used to create such acceleration models. Sufficient acceleration may be impossible due to limiting physical boundary conditions. In such a case minimum stress times should be defined to demonstrate sufficient robustness margin (e.g., based on change or degradation of any electrical or physical properties during or after stress and the impact on the specific application).

2.3 The acceleration model is used to calculate the acceleration factor for the standard stress condition. This in turn gives the calculated minimum required stress time t_CALC (in hrs or number of cycles) to demonstrate reliability without failures.

2.4 A comparison with the standard qualification duration t_STAND is to be made. In case t_STAND > t_CALC, the component is assumed to be not critical/marginal. The robustness margin t_SM has to be defined based on the application and customer requirements. Assessment of criticality shall include the accumulated failure probability until end-of-life. Criteria for a decision shall include not only test conditions and durations as compared to the standard, but also coverage of critical failure mechanisms by the tests. Such coverage considerations include applicability of assumptions used in calculating the stress conditions, such as variation of the activation energy for different failure mechanisms. Beyond that, it has to be assessed whether particular failure mechanisms are addressed by the standard test method at all. A case in point is active cycling of power devices, which is not adequately addressed by standard qualification tests. In addition, specific requirements regarding fail probabilities may not be covered by standard test procedures. MIM capacitors, for instance, are known to fail due to extrinsic defects. A requirement of, e.g., less than 100 ppm for extrinsic failures will not be covered by standard tests and sample sizes.

2.5a Based on the calculations in 2.3 the mission profile is more severe than the AEC-Q100 test conditions. In this case test conditions have to be defined which are equivalent to or more severe than the mission profile. This can either be a longer test time or harsher test conditions such as higher temperatures, both based on the acceleration models evaluated under 2.2. This means that more severe test conditions are applied for the product qualifications than the conditions defined in the AEC Q100.

2.5b If qualification data with the required test conditions are not available, it has to be decided if the qualification test data can be generated by the component supplier with reasonable effort.

2.6 Once the test conditions are defined, the technical feasibility has to be evaluated. On product level there are limitations regarding maximum temperatures, voltages or currents which can be applied. Therefore the level of lifetime acceleration is limited. In case a reasonable product qualification scenario cannot be identified, another approach has to be applied. In this case the capability of the technology and the design of a component have to be evaluated specifically based on the technology lifetime characterization according to the Robustness Validation process.

B: Testing must be performed according to mission profile specific test conditions.

E.3 Robustness Validation on Component Level

3.1 In case the capability of a component for a given mission profile cannot be achieved up to step 2.6, the use of more capable alternative components should be considered. If alternative components with higher robustness are not available, either the mission profile has to be adapted or the technological capability of the component has to be assessed applying the robustness validation approach.

C: Testing must be performed according to mission profile requirements following the Robustness Validation strategy with focus on
critical failure mechanisms. The test plan shall be aligned between CM and Tier 1.

Process for qualification plan generation based on mission profile:
a) List all conditions (operating, non-operating, production or transport) with all relevant parameters such as temperature, temperature cycles, humidity or others, and the corresponding time for which each condition applies.
b) For a temperature mission profile all periods with identical temperatures are summed up. An example for one condition of a temperature mission profile is given in line 1 of table E.1. For a predefined stress condition each of the conditions can be quantified using the acceleration model, here the Arrhenius model. The result is the equivalent stress time.
c) For a temperature cycling and a humidity-temperature example the calculation is shown in line 2 and 3 using the Coffin-Manson and Hallberg-Peck model.
Columns: Loading | Mission Profile Input | Stress Test | Stress Conditions | Acceleration Model (all temperatures in K, not in °C) | Model Parameters | Calculated Test Duration

Line 1
Loading: One operational mode
Mission Profile Input: t_u = 2,000 h (single mode operating use); T_u = 87 °C (single mode related junction temperature in use environment)
Stress Test: High Temperature Operating Life (HTOL)
Stress Conditions: T_t = 125 °C (junction temperature in test environment)
Acceleration Model: Arrhenius, $A_f = \exp\left[\frac{E_a}{k_B}\left(\frac{1}{T_u} - \frac{1}{T_t}\right)\right]$, $t_t = t_u / A_f$. Also applicable for High Temperature Storage Life (HTSL) and NVM Endurance, Data Retention Bake, & Operational Life (EDR).
Model Parameters: E_a = 0.7 eV (activation energy; 0.7 eV is a presumed value, actual values depend on failure mechanism and range from -0.2 to 1.4 eV); k_B = 8.61733 x 10^-5 eV/K (Boltzmann's Constant)
Calculated Test Duration: t_t = 232 h (test time)

Line 2
Loading: Thermomechanical
Mission Profile Input: n_u = 54,750 cls (number of engine on/off cycles over 15 yr of use); ΔT_u = 76 °C (average thermal cycle temperature change in use environment)
Stress Test: Temperature Cycling (TC)
Stress Conditions: ΔT_t = 205 °C (thermal cycle temperature change in test environment: -55 °C to 150 °C)
Acceleration Model: Coffin Manson, $A_f = \left(\frac{\Delta T_t}{\Delta T_u}\right)^{m}$, $n_t = n_u / A_f$. Also applicable for Power Temperature Cycle (PTC).
Model Parameters: m = 4 (Coffin Manson exponent; 4 is a presumed value and to be used for cracks in hard metal alloys, actual values depend on failure mechanisms and range from 1 for ductile to 9 for brittle materials)
Calculated Test Duration: n_t = 1034 cls (number of cycles in test)

Line 3
Loading: Humidity & Temperature
Mission Profile Input: t_u = 3,000 hr (engine off time over 15 yr of use); RH_u = 91 % (average relative humidity in use environment); T_u = 27 °C (average temperature in use environment)
Stress Test: Temperature Humidity Bias (THB)
Stress Conditions: RH_t = 85 % (relative humidity in test environment); T_t = 85 °C (ambient temperature in test environment)
Acceleration Model: Hallberg-Peck, $A_f = \left(\frac{RH_t}{RH_u}\right)^{p} \cdot \exp\left[\frac{E_a}{k_B}\left(\frac{1}{T_u} - \frac{1}{T_t}\right)\right]$, $t_t = t_u / A_f$. Also applicable for Highly Accelerated Steam Test (HAST) and Unbiased Humidity Steam Test (UHST). See notes.
Model Parameters: p = 3 (Peck exponent, 3 is a presumed value and to be used for bond pad corrosion); E_a = 0.8 eV (activation energy; 0.8 eV is a presumed value); k_B = 8.61733 x 10^-5 eV/K (Boltzmann's Constant)
Calculated Test Duration: t_t = 24.5 h
Table E1 Examples for calculating test durations based on single conditions from Mission Profile
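The three calculations in Table E1, together with the comparison from step 1.6, can be retraced with a short Python sketch. The function names are assumptions made for this illustration. The standard duration of 1000 h and the safety margin of 100 h used in the last lines are placeholder values chosen only to show the comparison, not values taken from this handbook.

```python
import math

K_B = 8.61733e-5  # Boltzmann's constant in eV/K, value as given in Table E1


def kelvin(celsius):
    """Convert degC to K; the models need absolute temperatures."""
    return celsius + 273.15


def arrhenius_af(ea_ev, t_use_c, t_test_c):
    """Arrhenius acceleration factor between use and test temperature."""
    return math.exp(ea_ev / K_B * (1.0 / kelvin(t_use_c) - 1.0 / kelvin(t_test_c)))


def coffin_manson_af(m, delta_t_use, delta_t_test):
    """Coffin-Manson acceleration factor for temperature cycling."""
    return (delta_t_test / delta_t_use) ** m


def hallberg_peck_af(p, ea_ev, rh_use, rh_test, t_use_c, t_test_c):
    """Hallberg-Peck acceleration factor for humidity plus temperature."""
    return (rh_test / rh_use) ** p * arrhenius_af(ea_ev, t_use_c, t_test_c)


def is_not_critical(t_stand, t_calc, t_sm):
    """Step 1.6: the standard duration must exceed t_CALC plus the margin."""
    return t_stand > t_calc + t_sm


# Inputs taken from the three lines of Table E1.
htol_hours = 2000 / arrhenius_af(0.7, 87, 125)
tc_cycles = 54750 / coffin_manson_af(4, 76, 205)
thb_hours = 3000 / hallberg_peck_af(3, 0.8, 91, 85, 27, 85)

print(f"HTOL: {htol_hours:.0f} h, TC: {tc_cycles:.0f} cycles, THB: {thb_hours:.1f} h")

# Placeholder values (assumptions): 1000 h standard duration, 100 h margin.
print("HTOL not critical:", is_not_critical(1000, htol_hours, 100))
```

With the presumed parameters from the table, the sketch yields roughly 232 h, 1034 cycles and 24.6 h. The last figure differs slightly from the 24.5 h quoted in the table, presumably because of rounding.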
[Figure E1 (flow chart): Assessment of available qualification data against a specific mission profile. The Yes/No branch connections are not recoverable from the extracted text.]

Process at Tier 1: Assess whether a certain electronic component fulfills the requirements of the mission profile of a new ECU.
Process at Component Manufacturer (CM): Assess whether an existing component qualified according to AEC-Q101 test conditions fulfills the requirements of the mission profile of a new application.
Responsibility: Tier 1; Tier 1 + CM; CM.

Assessment on ECU level:
1.1 Determine Mission Profile on ECU level
1.2 Determine mission profile of the component including loading
1.3 Basic calculation? (Yes/No)
1.4 Calculate test duration with standard acceleration models for standard tests and test conditions
1.6 Critical/Marginal? (Yes/No)

Mission Profile Validation on Component Level:
2.1 Determine critical Failure Mechanisms
2.2 Determine acceleration models
2.3 Calculate test duration with selected acceleration models for standard tests and test conditions
2.5a Additional data available? (Yes/No)
2.5b Additional data can be created? (Yes/No)

Robustness Validation on Component Level:
C: Perform Robustness Validation with detailed alignment between SCM and Tier 1 on mission profile and/or critical failure mechanisms*

Outcomes:
A: AEC-Q101 test conditions exceed / are sufficient*
B: Mission Profile validated*

* Note that this flow chart does not cover the consideration of AEC acceptance criteria based on LTPD-sampling plan (max. of 1% failures allowed) versus the Robustness Validation failure rate extrapolation.
Appendix F – From Mission Profile to Test Condition
(an example)
In this example, the distribution of temperature stress over the lifetime (the temperature mission profile of the component) is shown in figure F1.
[Figure F1: Temperature mission profile of the component; axis scale 1000 to 7000]
A conservative approach for a failure mechanism associated with high temperature would be to link each bin to its maximum temperature. In this case this would result in the values given in table F1.
Using the Arrhenius equation, the contribution to a stress time at 150 °C can then be calculated for each of the four temperatures (see table F2).
The resulting cumulative stress time for 150 °C is 2857 h for failure mechanism 1 and 1209 h for failure mechanism 2. These stress times represent the load for all four temperatures under use conditions.
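The summation behind such a cumulative stress time can be sketched in Python as follows. Because tables F1 and F2 are not reproduced here, the bins below are purely hypothetical placeholders and do not reproduce the 2857 h or 1209 h quoted above. The function name and the two activation energies are likewise assumptions made for this illustration.

```python
import math

K_B = 8.61733e-5  # Boltzmann's constant in eV/K


def equivalent_stress_hours(bins, ea_ev, t_stress_c):
    """Sum the Arrhenius-equivalent stress time at t_stress_c.

    bins: list of (hours, bin_max_temperature_degC); each bin is linked to
    its maximum temperature, which is the conservative choice.
    """
    t_stress = t_stress_c + 273.15
    total = 0.0
    for hours, t_use_c in bins:
        t_use = t_use_c + 273.15
        af = math.exp(ea_ev / K_B * (1.0 / t_use - 1.0 / t_stress))
        total += hours / af
    return total


# Hypothetical mission profile bins (NOT the values of table F1).
HYPOTHETICAL_BINS = [(6000, 40), (2000, 80), (1000, 110), (200, 150)]

# Two assumed activation energies standing in for two failure mechanisms.
for ea in (0.7, 1.0):
    hours = equivalent_stress_hours(HYPOTHETICAL_BINS, ea, 150)
    print(f"Ea = {ea} eV: about {hours:.0f} h at 150 degC")
```

A bin that already sits at the stress temperature contributes its full duration, while colder bins contribute less as the activation energy grows.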
The range of activation energies for typical temperature related failure mechanisms is between -0.2 eV and 1.4 eV. Figure F2 demonstrates the effect of activation energies on acceleration factors that are used for life and stress time calculation.
[Figure F2: Acceleration Factor AF (logarithmic axis, 0.1 to 1,000,000) versus Activation Energy Ea (eV, -0.4 to 1.6), for stress at 150 °C and use at 48 °C]
It should be noted that the acceleration factor from 48 °C to 150 °C changes by 7 orders of magnitude for the range of typical activation energies. For acceleration from 108 °C to 150 °C the change is still 2 orders of magnitude. This means that wrong assumptions for the critical failure mechanism and related activation energy could result in a severely wrong lifetime forecast.
Picture credits: Cover: © ArchMen – Fotolia.com