IGBT Failure
CONTENTS
1. Foreword
2. A brief outline of semiconductor device reliability
2.1. Change in failure rates of semiconductor devices
2.2. Failure factors of semiconductor power modules
2.3. Heat-fatigue phenomenon in semiconductor modules for electric power
2.3.1. Heat stress model during module actuation
2.3.2. Fault mechanism with power cycle and thermal cycle
2.3.2.1. Power-cycle life fault mechanism
2.3.2.2. Thermal-cycle fault mechanism
3. Quality-guaranteeing activities
3.1. Mass production procedures
3.2. Environmental control
3.3. Periodic inspection of manufacturing equipment, instrumentation and maintenance control
3.4. Material purchasing control
3.5. Manufacturing process control
3.6. Intermediate and final inspections
3.7. Quality information
4. Reliability Tests
4.1. Reliability Test Method
1. Foreword
Power modules, semiconductor devices for electric power use, were launched on the market in the late 1970s as BIP-type modules (transistors, thyristors, etc.) built around bipolar-type semiconductor chips, and again in the early 1980s as MOS-type modules (IGBT etc.) built around MOS-type semiconductor chips. Today they are widely used in home electric appliances such as air-conditioners, refrigerators and washing machines, and in industrial applications such as inverters, servo drives, UPS, and electric and electronic peripherals. Over the same period, device reliability has increased rapidly along with improvements in semiconductor technology. Typically, equipment that demands high reliability requires a semiconductor device failure rate of 10 to 100 FIT (1 FIT = 1 × 10^-9/hour). To realize that level of reliability, the reliability of the semiconductor itself must naturally be improved. It is equally important that the equipment design and working conditions are matched to the semiconductor's characteristics and reliability under the various stresses applied to it. It is often observed that semiconductor devices manufactured in the same way show markedly different failure rates in the market because of a weakness in machine design or a difference in usage. Here we introduce our company's reliability test results and quality-guaranteeing activities for semiconductor devices, together with typical machine-design and usage problems that must be taken into consideration and cases of faults that actually reached the market.
2. A brief outline of semiconductor device reliability
2.1. Change in failure rates of semiconductor devices
In general, the failure rate of electrical equipment and parts follows the so-called bathtub curve shown in Fig. 1: after an early failure period, it passes through an incidental failure period before reaching the wear-out failure period.
From this failure-rate curve, selecting a semiconductor device for use in equipment requires considering three points: the failure rate in the early failure period, the failure rate in the incidental failure period, and the usable life. The use of the equipment, the influence and spread of a device fault, and the preventive maintenance system must also be considered. In general, the failure-rate curve of semiconductor devices resembles Fig. 1 (b) and tends to decrease over time. Put another way, even after the failure rate has fallen and stabilized in the incidental failure period, the fault distribution still has the shape of early failure. The change of the failure rate of a semiconductor device over time is shown in Fig. 2: the failure rate is high just after manufacture and is reduced by aging and debugging. For semiconductor devices that require high reliability, high-temperature aging and electrical aging are used for this purpose. Because the failure rate of semiconductor devices declines over time, increasing the reliability of the equipment requires attention to the early-failure factors, especially major failures such as open circuits and shorts. Next, during assembly conditioning and aging at the equipment maker, the rate of major (heavy) failures within that period is 0.1%. If the rate greatly exceeds this value, there is a problem in the device, circuit design, assembly process or test process, and the cause must be investigated. If it is left uncorrected, it can show up as frequent faults in the market. Caution is necessary when this failure rate is high, because the failure rate during assembly conditioning and aging often correlates with the market failure rate.
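The three regions of the bathtub curve are often modeled with a Weibull hazard function, whose shape parameter decides whether the failure rate falls, stays constant or rises with time. The sketch below is illustrative only: the Weibull model, the function name `weibull_hazard` and every numerical value are assumptions and are not taken from the data behind Fig. 1 or Fig. 2.

```python
def weibull_hazard(t_hours: float, shape: float, scale_hours: float) -> float:
    """Instantaneous failure rate h(t) = (shape/scale) * (t/scale)**(shape - 1)."""
    if t_hours <= 0 or shape <= 0 or scale_hours <= 0:
        raise ValueError("time, shape and scale must all be positive")
    return (shape / scale_hours) * (t_hours / scale_hours) ** (shape - 1.0)


# Hypothetical shape values: <1 early failure, 1 incidental failure, >1 wear-out.
REGIONS = (("early failure", 0.5), ("incidental failure", 1.0), ("wear-out", 3.0))

for label, shape in REGIONS:
    rates = [weibull_hazard(t, shape, 10_000.0) for t in (100.0, 1_000.0, 10_000.0)]
    print(label, ["%.2e" % r for r in rates])
```

With a shape parameter below 1 the computed rate keeps falling as time increases, which matches the declining pattern described for semiconductor devices.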
When the equipment comes on the market, the stress level declines and the failure rate decreases dramatically, usually to between several FIT and several hundred FIT. To achieve this, the device must be used with some design margin: generally it is desirable to operate below 50 to 60% of the maximum voltage rating and below 70 to 80% of the maximum junction temperature rating. It must also be remembered that the way the device is used, the circuitry around it and the environmental conditions (the various stresses applied, etc.) affect reliability as well.
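The derating guideline above can be checked mechanically during design. The following is a minimal sketch, assuming hypothetical ratings and operating values; the class name `DeratingCheck` is invented for illustration, and taking the junction temperature percentage on the Celsius scale is an assumption about how the guideline is meant to be read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeratingCheck:
    name: str
    applied: float       # worst-case value in the application
    rated_max: float     # maximum rating from the data sheet
    max_fraction: float  # allowed fraction of the rating, e.g. 0.60 for 60%

    @property
    def usage(self) -> float:
        """Fraction of the maximum rating actually used."""
        return self.applied / self.rated_max

    @property
    def ok(self) -> bool:
        return self.usage <= self.max_fraction


# Hypothetical example: 1200 V / 150 degC device.
checks = [
    DeratingCheck("VCE [V]", applied=650.0, rated_max=1200.0, max_fraction=0.60),
    DeratingCheck("Tj [degC]", applied=125.0, rated_max=150.0, max_fraction=0.80),
]

for c in checks:
    print(f"{c.name}: {c.usage:.0%} of rating (limit {c.max_fraction:.0%}) ok={c.ok}")
```

In this example the voltage margin is acceptable while the junction temperature exceeds the 80% guideline, so the cooling design would need to be revisited.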
As stated above, equipment reliability design means weighing performance and reliability against economy when selecting a device. It is not easy to attain high performance, high reliability and economy all at once, so a balance must be struck. Selecting a semiconductor device that balances performance, reliability and economy for the equipment is therefore an important matter for the user to study.
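To relate the FIT figures quoted in this chapter to field experience, the definition 1 FIT = 1 × 10^-9/hour can be applied directly. The sketch below assumes a constant failure rate (the incidental failure period); the function names and the fleet size are illustrative assumptions.

```python
FIT = 1e-9  # failures per device-hour, per the definition in the foreword


def expected_failures(fit: float, devices: int, hours: float) -> float:
    """Expected number of failures for a fleet, assuming a constant failure rate."""
    return fit * FIT * devices * hours


def mtbf_hours(fit: float) -> float:
    """Mean time between failures of one device at a constant failure rate."""
    if fit <= 0:
        raise ValueError("fit must be positive")
    return 1.0 / (fit * FIT)


# Hypothetical fleet: 10,000 devices running for one year (8,760 hours).
for fit in (10.0, 100.0):
    print(f"{fit:.0f} FIT -> {expected_failures(fit, 10_000, 8_760):.2f} failures/year, "
          f"MTBF {mtbf_hours(fit):.1e} h")
```

At 100 FIT such a fleet would be expected to see roughly nine failures per year, and roughly one per year at 10 FIT.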
Fig. 1. Failure rate versus time, curves (a) and (b)
Fig. 2. Change of the failure rate of a semiconductor device over time (axes: failure rate, time)
O-A-B-C: early failure period (factory); C-D: early failure period (field); D-E: incidental failure period (field); E-F: wear-out failure period (field); O-A-B-C-D: debugging term
2.2. Failure factors of semiconductor power modules
When a device is returned as faulty from the market or from the equipment-assembly conditioning evaluation and a fault analysis is performed, the device may turn out to be a good product, the cause may lie in the way the device was used or in the environmental conditions, or there may be a defect in the device itself. Using the IGBT module as an example, the fault factors are listed below:
Mismatch between device characteristics and machine-side circuitry and manufacturing conditions
  Overvoltage
    VCE overvoltage (between collector and emitter)
      Turn-off surge voltage
      Increase in bus voltage
      Control signal anomaly
      External noise (lightning surge)
      Measurement defect
    VGE overvoltage (between gate and emitter)
      Static electricity
      Gate drive circuit anomaly
      Gate oscillation
      High voltage applied
      External surge
  Overheating (excess current, overload)
    Heat dissipation design defect
    Short (lack of dead time, improper control signal, etc.)
    Excess current
    Lack of gate voltage
    Gate wire open
    Abnormal increase in switching frequency
    Decline in switching speed
    Lack of heat dissipation
    Junction heat fatigue
  Insulation defect (ceramic crack, internal solder re-melt)
  Cooling fan anomaly (abnormal stress)
  Excess voltage
IGBT chip manufacturing defect
  Pattern defect (by foreign matter, etc.)
  Surface preparation defect (dopant ion)
Module manufacturing defects
  Wire-bond junction defect
  Insulation/base board junction defect (solder, etc.)
  Internal electrode solder defect
  Metallization defect
Among the factors above, one that determines the usable life is heat-fatigue failure inside the module, either at the junction between the chip and the wire or at the solder junction between the insulation substrate and the base plate. The next section describes the heat-fatigue phenomenon and cases of failure.
2.3. Heat-fatigue phenomenon in semiconductor modules for electric power
2.3.1. Heat stress model during module actuation
Fig. 3 shows the two actuation patterns of the heat stress model during module actuation. When selecting a module, both its usable life and the machine design must be considered.
Actuation mode 1: the actuation pattern shows few changes in case temperature (base plate temperature) but frequent changes in junction temperature (known as PC life or power cycle life).
Actuation mode 2: the actuation pattern shows relatively gentle temperature changes at system actuation and shutdown (known as thermal cycle life).
Fig. 3. Actuation patterns of the heat stress model: junction temperature (Tj) and case temperature (Tc) versus time during ON/OFF operation
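The two actuation modes can be told apart by comparing the swing in junction temperature with the swing in case temperature over a load profile. The sketch below uses the steady-state relation Tj = Tc + P × Rth(j-c) and ignores thermal capacitance; the thermal resistance, power loss and temperature values are hypothetical, as are the function names.

```python
from typing import List, Sequence, Tuple


def junction_temperature(tc_c: float, power_w: float, rth_jc: float) -> float:
    """Steady-state junction temperature from case temperature and power loss."""
    return tc_c + power_w * rth_jc


def temperature_swings(profile: Sequence[Tuple[float, float]],
                       rth_jc: float) -> Tuple[float, float]:
    """Return (junction swing, case swing) for a list of (Tc [degC], P [W]) samples."""
    tj: List[float] = [junction_temperature(tc, p, rth_jc) for tc, p in profile]
    tc_values = [tc for tc, _ in profile]
    return max(tj) - min(tj), max(tc_values) - min(tc_values)


RTH_JC = 0.2  # K/W, hypothetical

# Mode 1: case temperature steady, load switched on and off repeatedly.
mode1 = [(80.0, 0.0), (80.0, 150.0), (80.0, 0.0), (80.0, 150.0)]
# Mode 2: system start-up from a cold state, then shutdown.
mode2 = [(25.0, 0.0), (80.0, 150.0), (25.0, 0.0)]

for name, profile in (("mode 1", mode1), ("mode 2", mode2)):
    d_tj, d_tc = temperature_swings(profile, RTH_JC)
    print(f"{name}: junction swing {d_tj:.0f} K, case swing {d_tc:.0f} K")
```

A profile with a large junction swing but almost no case swing points to power-cycle life as the limiting factor, while a large case swing points to thermal-cycle life.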
2.3.2. Fault mechanism with power cycle and thermal cycle
2.3.2.1. Power-cycle life fault mechanism
Fig. 4 shows the structure of a typical power module. Module actuation changes the junction temperature, and the difference in linear expansion coefficients between the aluminum wire and the silicon chip produces stress at their junction, so that a crack appears at the junction interface. The crack grows until the wire separates completely. In applications such as inverters, where the module case temperature changes comparatively gently but the junction temperature changes frequently, power-cycle failure must be taken into consideration during machine design. Fig. 5 shows a photograph of a wire junction separated by power cycling. Fig. 6 shows the results of our company's power-cycle life test on module products (power-cycle life curve).
2.3.2.2. Thermal-cycle fault mechanism
As the structure in Fig. 4 indicates, the difference in linear expansion coefficients between the base plate and the insulation substrate produces stress strain in the solder layer between them. This occurs in actuation patterns such as system actuation and shutdown, where the power module's case temperature (Tc) changes comparatively gently but over a large range. When this stress is repeated, a crack appears in the solder, and when the crack reaches the area under the power chip, thermal resistance increases and thermal failure can result. In the end, because of the increased thermal resistance, Tj increases, power-cycle endurance decreases, and the module reaches the end of its power-cycle life in the wire-separation mode. Fig. 7 shows a photograph of a solder-layer crack between the insulation substrate and the base plate caused by thermal cycling. Fig. 8 shows the results of our company's thermal-cycle life test on module products.
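The size of the mismatch that drives both mechanisms can be estimated to first order as the difference in linear expansion coefficients multiplied by the temperature swing. The sketch below uses typical handbook expansion coefficients, which are assumptions and not values from this document; the alumina substrate is also an assumption, since the substrate material is not stated here.

```python
# Typical linear expansion coefficients in 1/K (handbook values, assumed).
ALPHA = {
    "aluminum": 23e-6,
    "silicon": 2.6e-6,
    "copper": 17e-6,
    "alumina": 7e-6,  # assumed ceramic substrate material
}


def mismatch_strain(alpha_a: float, alpha_b: float, delta_t_k: float) -> float:
    """First-order free-expansion mismatch strain between two joined materials."""
    return abs(alpha_a - alpha_b) * delta_t_k


# Wire bond: aluminum wire on silicon chip, hypothetical 100 K junction swing.
wire_strain = mismatch_strain(ALPHA["aluminum"], ALPHA["silicon"], 100.0)
# Solder layer: copper base plate under the substrate, hypothetical 50 K case swing.
solder_strain = mismatch_strain(ALPHA["copper"], ALPHA["alumina"], 50.0)

print(f"wire/chip mismatch strain:      {wire_strain:.2e}")
print(f"base/substrate mismatch strain: {solder_strain:.2e}")
```

The estimate ignores geometry, solder creep and constraint effects, so it indicates only why a larger temperature swing or a larger coefficient difference shortens life.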
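Fig. 4. Structure of a typical power module (wire, silicon chip, insulation substrate with copper foil on both sides, solder layer, base plate)
Fig. 5. Junction separation caused by power cycling (aluminum wire, silicon chip)
Fig. 6. Power-cycle life curve (cycle axis: 10,000 to 1,000,000,000)
Fig. 7. Solder-layer crack caused by thermal cycling (insulation substrate, crack, silicon chip, copper base plate, aluminum wire)
Fig. 8. Thermal-cycle life curve (cycle axis: 10,000 to 1,000,000)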
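Life curves of this kind are commonly approximated by a Coffin-Manson-type power law, N = A × ΔT^(-n), where N is the number of cycles to failure and ΔT is the temperature swing. The sketch below fits that law through two points and interpolates between them. The power-law form is an assumed model, and the two data points are hypothetical; they are not read from Fig. 6 or Fig. 8.

```python
import math
from typing import Tuple


def fit_power_law(p1: Tuple[float, float], p2: Tuple[float, float]) -> Tuple[float, float]:
    """Fit N = A * dT**(-n) through two (dT [K], cycles) points; return (A, n)."""
    (dt1, n1), (dt2, n2) = p1, p2
    if min(dt1, dt2, n1, n2) <= 0 or dt1 == dt2:
        raise ValueError("points must be positive with distinct temperature swings")
    exponent = math.log(n1 / n2) / math.log(dt2 / dt1)
    coefficient = n1 * dt1 ** exponent
    return coefficient, exponent


def cycles_to_failure(delta_t_k: float, coefficient: float, exponent: float) -> float:
    """Estimated cycles to failure at a given temperature swing."""
    return coefficient * delta_t_k ** (-exponent)


# Hypothetical points standing in for values read off a measured life curve.
A, n = fit_power_law((50.0, 1.0e6), (100.0, 3.0e4))

for swing in (50.0, 75.0, 100.0):
    print(f"swing {swing:.0f} K -> about {cycles_to_failure(swing, A, n):.3g} cycles")
```

In practice the two points would be taken from the measured curve for the module in question, and the fit should not be extrapolated far outside the tested range.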
3. Quality-guaranteeing activities
The quality, price, delivery time and service of a product are all important, but for as long as the product exists the most important are the quality of the product and continuing service to its users. In the semiconductor industry, the quality levels required of products are very high. At the same time, mass production relies on very advanced technology, such as the process control capability required in the wafer process and the minute work required in the assembly process, and this demands a high level of quality control. A brief description of the quality-guaranteeing activities is given below.
3.1. Mass production procedures
From trial production in development, through trial mass production, until mass production, a series of type-approval tests is carried out for performance and reliability, along with an examination of the design drawings. Fig. 9 shows the quality-guarantee system from development through mass production. The next chapter provides information on the reliability tests and the reliability confirmation performed in the type-approval tests.
3.2. Environmental control
In the semiconductor industry, the environment greatly affects product quality, so control limits are established and dust, moisture and temperature are precisely monitored and controlled. The same measures are applied to the gas and water used in the factory.
3.3. Periodic inspection of manufacturing equipment, instrumentation and maintenance control
The semiconductor industry is a manufacturing industry, and control of the manufacturing equipment and instrumentation is important in device production. Regular checks and maintenance are performed to prevent loss of accuracy, faults, etc., in the equipment.
3.4. Material purchasing control
Incoming materials are strictly analyzed and inspected, using spectrum analysis etc., based on receipt inspection standards.
After sufficient sample examination has been performed to confirm the quality of the supplied materials and all problems have been solved, formal delivery begins. Due consideration is also given to quality control in the supplier's manufacturing process.
3.5. Manufacturing process control
The conditions that greatly influence quality, such as the purity of demineralized water, atmosphere, furnace temperature and gas flow rate, are measured by instruments mounted at the respective points, automatically recorded, and checked by engineers using check sheets. Moreover, process results that have a larger effect on characteristics, such as depth of diffusion and surface concentration, are recorded and used as control data for the working conditions. In addition, to assure stable quality, data are acquired and controlled for the assembly processes, such as bonding pressure and bond strength in the wire-bond process.
3.6. Intermediate and final inspections
Our policy on intermediate and final inspections is as follows. With the aim of reducing variation, maintaining and improving quality, and thus producing better products, data on the product's quality characteristics, such as appearance, dimensions, structure and electrical characteristics, are fed back to the preceding manufacturing stages. Intermediate inspections include wafer tests and sampling inspections in the assembly process. All are carried out as two independent checks: one by the manufacturing section, under the motto of building quality in during the manufacturing process, and one by the quality-control section. These independent checks make it possible to check points that are hard to discover in the finished product. After product completion, the final inspection is conducted as a finished-goods inspection, in which electrical properties and external appearance are inspected.
In addition, from the viewpoint of the customer who will use the product, before the product is stored it is sampled again and subjected to quality-guaranteeing inspections of external appearance, electrical characteristics and reliability to confirm its total performance and quality. The quality of warehoused products is also strictly checked for each lot. Fig. 9 shows a plan of the quality-guaranteeing system described above.
3.7. Quality information
Various data on quality, such as inspection results and customer information, are compiled mainly in the quality-guaranteeing section and rapidly sent to the related sections, including the manufacturing department, for the continued improvement of quality. Furthermore, to modernize data control, a rational and effective computerized quality control system has been adopted.
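As an illustration of how recorded process data might be screened by a computerized system, the sketch below derives control limits from baseline readings and flags later readings that fall outside them. The mean ± 3 sigma rule, the function names and the furnace temperature values are all assumptions; the document does not state how its control limits are set.

```python
import statistics
from typing import List, Sequence, Tuple


def control_limits(baseline: Sequence[float], k: float = 3.0) -> Tuple[float, float]:
    """Lower and upper control limits as mean -/+ k sample standard deviations."""
    mean = statistics.fmean(baseline)
    sigma = statistics.stdev(baseline)
    return mean - k * sigma, mean + k * sigma


def out_of_control(readings: Sequence[float],
                   limits: Tuple[float, float]) -> List[Tuple[int, float]]:
    """Return (index, value) for every reading outside the control limits."""
    low, high = limits
    return [(i, x) for i, x in enumerate(readings) if not (low <= x <= high)]


# Hypothetical furnace temperature log in degC.
baseline = [1000.2, 999.8, 1000.1, 999.9, 1000.0, 1000.3, 999.7]
limits = control_limits(baseline)
flagged = out_of_control([1000.1, 1001.5, 999.9], limits)

print(f"limits: {limits[0]:.2f} to {limits[1]:.2f} degC")
print("flagged readings:", flagged)
```

Only the second reading falls outside the limits here; in a real line such a flag would trigger the check-sheet review described in 3.5.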
Fig. 9. Quality-guarantee system from development through mass production. Labels: stage, market, sales, manufacturing, quality assurance, production control. Items shown: preparation of specifications and instructions, decision of mass production, initial flow control, customer, order, production, production planning, qualification test, manufacture, purchasing of materials, wafer process, environmental control, document control, notice of control change, problem treatment (research, correction, prevention), quality data, upgrading, fault analysis, data acquisition, packing, delivery, flow of information, QC training.
4. Reliability Tests
4.1. Reliability Test Method
Mitsubishi semiconductor devices can be used with confidence because they are built to high levels of reliability: precise quality control throughout the design and manufacturing processes, along with quality-guaranteeing inspections of every production lot, assures high reliability. Various reliability tests are performed to check the level of reliability. In this chapter, an example of the tests for a representative type of power module is introduced, and the content of the tests is shown in Table 1. Mitsubishi semiconductor devices are tested for reliability in accordance with the standards of the Japan Electronics and Information Technology Industries Association (JEITA). (Related standard: IEC.)
Table 1. Mitsubishi power module reliability tests
Test Item                          Test Conditions
Environmental Testing
Thermal Shock
Temperature Cycle                  -40°C (60 min.) ~ 125°C (60 min.), 10 cycles
Vibration                          Condition B: 10~500 Hz/15 min., 10 G, 6 hours
                                   Condition A: 260±5°C, 10±1 sec., use of rosin-type flux
                                   Condition A: 235±5°C, 5±0.5 sec., use of rosin-type flux
Mounting Torque                    M8: 8.83~10.8 N·m, 10±1 sec.; M6: 2.94~4.5 N·m, 10±1 sec.

High Temperature Storage           Ta=125°C, 1000 hours
Low Temperature Storage            Ta=-40°C, 1000 hours
Moisture Resistance                Condition B: Ta=60°C, RH=90%, 1000 hours
High Temperature Reverse Bias      Ta=125°C, VCE = max. rating voltage × 0.85 V, VGE=0V, 1000 hours
High Temperature Gate Bias         Ta=125°C, VCE=20V, VGE=0V, 1000 hours
Intermittent Operation             Tc=50°C, 5000 cycles (Tj=100°C)
VCE=600V; -IC=15A; IC=15A, VD=VDB=15V; VCE (sat): U.S.L. × 1.2; Input=On; VSC: L.S.L. × 0.9, U.S.L. × 1.1; VD=15V; UVD, UVDB: L.S.L. × 0.9, U.S.L. × 1.1; Trip
Dielectric strength: short-circuit all terminals and apply AC 2500 V for 1 min. between them and the external heat dissipation fins. Criterion: dielectric breakdown.
Note: U.S.L.: Upper Specification Limit; L.S.L.: Lower Specification Limit

Test items: High Temperature Storage, Low Temperature Storage, Moisture Resistance, High Temperature Reverse Bias, Intermittent Operation

VCE=1700V (rated voltage), Tj=25°C/125°C: U.S.L. × 2.0
IGES: VGE=±20V, Tj=25°C: U.S.L. × 2.0
VEC: -IC=1200A (rated current): U.S.L. × 1.2
VCE (sat): IC=1200A (rated current), VG=15V: U.S.L. × 1.2
VGE (th): VCE=10V, IC=120mA (rated current × 10^-4): L.S.L. × 0.8, U.S.L. × 1.2; I.V.D. × 1.5, I.V.D. × 0.5
Dielectric strength: short-circuit all terminals and apply AC 4000 V (rms) for 1 min. between them and the external heat dissipation fins (34 class: 4000V; above 50 class: 6000V). Criterion: no dielectric breakdown.
Note: U.S.L.: Upper Specification Limit; L.S.L.: Lower Specification Limit
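The pass/fail criteria above are stated relative to the specification limits, for example 1.2 times the U.S.L. for VCE (sat) and 0.8 times the L.S.L. for VGE (th). The sketch below shows how a post-test measurement could be judged against such criteria, and how the High Temperature Reverse Bias voltage of Table 1 follows from the rated voltage. The function names and the data-sheet limits used in the example are hypothetical.

```python
from typing import Optional


def within_criteria(measured: float,
                    usl: Optional[float] = None,
                    lsl: Optional[float] = None,
                    usl_factor: float = 1.0,
                    lsl_factor: float = 1.0) -> bool:
    """True if the measurement lies within the scaled specification limits."""
    if usl is not None and measured > usl * usl_factor:
        return False
    if lsl is not None and measured < lsl * lsl_factor:
        return False
    return True


def htrb_bias_voltage(rated_vce: float) -> float:
    """Reverse-bias test voltage: maximum rating voltage x 0.85 (Table 1)."""
    return rated_vce * 0.85


# Hypothetical data-sheet limits, for illustration only.
VCE_SAT_USL = 3.0                  # V
VGE_TH_LSL, VGE_TH_USL = 4.5, 7.5  # V

print(within_criteria(3.4, usl=VCE_SAT_USL, usl_factor=1.2))
print(within_criteria(3.7, usl=VCE_SAT_USL, usl_factor=1.2))
print(within_criteria(3.5, usl=VGE_TH_USL, lsl=VGE_TH_LSL,
                      usl_factor=1.2, lsl_factor=0.8))
print(f"bias for a 1700 V device: {htrb_bias_voltage(1700.0):.0f} V")
```

With these hypothetical limits, a VCE (sat) of 3.4 V after the test is still acceptable, while 3.7 V, or a VGE (th) of 3.5 V, counts as a failure.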