DK1758 ch19
I. INTRODUCTION
As device design rules continue to shrink and die sizes grow, the control of particulate
contamination on wafer surfaces has become more and more important in semiconductor
manufacturing. It has been found that defects caused by particles adhering to the wafer
surface were responsible for more than 80% of the yield loss of very-large-scale integrated
circuits (VLSIs)(1). Although particulate contaminants could be introduced at any point
during the wafer manufacturing and fabricating processes, particles generated within
process equipment are the most frequent cause. Not only mechanical operations (e.g.,
valve movement, wafer handling, shaft rotating, pumping, and venting) but also the
wafer processing operation (e.g., chemical and physical reactions) can produce particles.
Since the production of a device involves numerous processes and takes many days,
it would be too late to look for the defects and their sources at the end of the process.
Currently, defect metrology is carried out at critical process steps using both blanket
(unpatterned) and patterned wafers. There is a dispute about whether defect control
using blanket wafers is necessary. The opposing argument states that, in addition to the
problem associated with the individual process step that can be identified using either blanket or patterned wafers, metrology on blanket wafers does not reveal any of the problems related to the integration of the processes and, therefore, should be eliminated.
There are several additional factors to be considered, however. Inspection speed, cost, and
sensitivity (the smallest size of particle detectable) are all better on blanket wafers. In
addition, though a problem may be originally identified on a patterned (production) wafer, to isolate which tool and the root cause within that tool requires many partition tests using blanket wafers (either Si monitor wafers, for mechanical tests, or full film wafers, for process tests) to be performed. Finally, the specifications for particle adders in a tool/process and the statistical monitoring (baselining) of that specification are carried
out on blanket wafers. Therefore, to ensure high yield in mass production, blanket wafer
monitoring is absolutely necessary and will not go away any time soon.
The light-scattering-based technique for defect detection has been used for inspection of unpatterned wafers for more than two decades. For a perfectly smooth and flat surface, the light reflects specularly. In the presence of a surface contaminant or surface roughness, the specular light is disturbed, and, as a result, scattered light is produced. For a given incident light source, the amount of light scattered by the surface and the contaminant depends on the physical and optical properties of each.
In a surface scanner, a narrow beam of light illuminates and is scanned over the surface. For a perfectly smooth and flat surface with no surface contaminant, the light reflects specularly (i.e., the angle of reflected light equals the angle of incident light, Figure 1a). In the presence of a surface defect, which can be a particle on the surface, surface roughness, or subsurface imperfection, the specular light is disturbed and, as a result, a portion of the incident light is scattered away from the specular direction (Fig. 1b and c). In the specular direction, an observer sees the region causing the scattering as a dark object on the bright background (bright field). In contrast, away from the specular direction the observer sees a bright object on the dark background (dark field). Both bright-field and dark-field techniques are used for wafer inspection, but the latter gives a better detection sensitivity for small defects. A photomultiplier tube (PMT) is used to collect the scattered light in the dark field. The amount of light scattered from the surface imperfections, which depends on the physical and optical properties of these imperfections, is measured as the light-scattering cross section, Csca, measured in square micrometers.
The study of light scattering by particles can be dated back to the 19th century. The general theory of light scattering by aerosols was developed in 1908 by Gustav Mie (2). It gives the intensity of light (I) scattered at any angle, θ, by a sphere with a given size parameter (the ratio of the perimeter of the sphere to the wavelength of the incident light, πd/λ) and complex refractive index (m) that is illuminated by light of intensity I0 (W/cm²) and wavelength λ. Basically, the relationship between the light-scattering cross section and the particle size can be divided into three regimes according to the particle size (d) relative to the wavelength of incident light (λ). For particles much smaller than the wavelength of incident light (i.e., d < 0.1λ), Rayleigh-scattering theory applies. In this case the light-scattering cross section for light from a particle with a complex refractive index of m is
Csca = (2π^5 d^6 / 3λ^4) |(m^2 − 1)/(m^2 + 2)|^2    for d ≪ λ    (1)
For particles larger than 0.1λ, the Mie equations must be used to determine the angular distribution of scattered light. For a sphere suspended freely in a homogeneous medium, the intensity of the unpolarized scattered light at a distance R in the direction θ from the center of the sphere is then given by

I(θ) = I0 λ^2 (i1 + i2) / (8π^2 R^2)    for d ≈ λ    (2)

where i1 and i2 are the Mie intensity functions for the perpendicular and parallel polarizations. For particles much larger than the wavelength, the cross section approaches the geometric limit

Csca = π d^2 / 2    for d ≫ λ    (3)
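As a numeric illustration of Eqs. (1) and (3), the following sketch evaluates the Rayleigh and geometric-limit cross sections. The 488-nm wavelength and the refractive index of 1.59 (typical for PSL) are illustrative assumptions, not values taken from the text.

```python
import math

def rayleigh_csca(d_um, wavelength_um, m):
    """Rayleigh scattering cross section, Eq. (1), valid for d << lambda.

    d_um: sphere diameter in micrometers
    wavelength_um: illumination wavelength in micrometers
    m: (complex) refractive index of the sphere
    Returns the cross section in square micrometers.
    """
    lorentz = (m ** 2 - 1) / (m ** 2 + 2)
    return (2 * math.pi ** 5 * d_um ** 6) / (3 * wavelength_um ** 4) * abs(lorentz) ** 2

def geometric_csca(d_um):
    """Large-particle limit, Eq. (3): Csca = pi d^2 / 2 for d >> lambda."""
    return math.pi * d_um ** 2 / 2

# Example: a 0.1-um PSL sphere (m ~ 1.59, assumed) at 488 nm (assumed laser line)
csca = rayleigh_csca(0.1, 0.488, 1.59)
```

Note the steep d^6 dependence in the Rayleigh regime: doubling the diameter increases the scattered signal 64-fold, which is why sensitivity falls off so quickly below about 0.1 μm.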
In the case of a particle adhering on the surface of a wafer, the scattered field becomes extremely complicated due to the complex boundary conditions of the surface. The amount of light scattered from the surface depends not only on the size, shape, refractive index (i.e., composition), and orientation of the particle, but also on the profile (e.g., thickness, surface roughness) and refractive index (i.e., composition) of the surface. The light-scattering system for a surface alone was solved exactly by Fresnel. Approximations that combine these two theories to solve the complex system composed of a particle and a surface have been developed by researchers (3,4). Details on these developments are beyond the scope of this review.
For a real wafer, the surface is neither perfectly smooth nor without any surface defects. Some examples of surface imperfections that diminish the yield are particles, crystal-originated particles (COPs), microroughness, ions, heavy metals, organic or inorganic layers, and subsurface defects. The light scattered from each of these imperfections is different. The signal generated by general surface scatter (e.g., roughness, surface residues) is not at a discrete position, as is the case for a particle, but rather is a low-frequency signal, observed across the affected regions of the wafer. This low-frequency signal is often referred to as haze. Figure 2 illustrates a signal collected along a scan line (this is part of the "review mode" operation of the scanner, in which a detected defect is examined in more detail). The signal from a discrete defect sits on top of the background haze. The minimum detectable size (MDS) is limited by the variation of this haze background ("noise"), which is statistical, plus effects of varying microroughness. It corresponds to a signal-to-noise ratio (S/N) of about 3:1.
Figure 2 Signal, background, and noise: light-scattering signal in a scan across a light point defect
(LPD) (review mode).
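A minimal sketch of how an LPD can be separated from haze in a review-mode scan line: estimate the local background and noise around each sample, then apply an S/N threshold. The median/standard-deviation estimator and the window size are illustrative choices, not the scanner's actual algorithm.

```python
import statistics

def find_lpds(scan, window=5, snr_threshold=3.0):
    """Flag light point defects (LPDs) in a 1-D scan-line signal.

    The local haze background is estimated as the median of the signal in a
    window around each sample (excluding the sample itself); samples whose
    excess over the background exceeds snr_threshold times the local noise
    (standard deviation) are flagged. This mirrors the picture of Fig. 2:
    a discrete defect peak sitting on a slowly varying haze background.
    """
    n = len(scan)
    lpds = []
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        neighborhood = scan[lo:i] + scan[i + 1:hi]
        background = statistics.median(neighborhood)
        noise = statistics.pstdev(neighborhood) or 1e-12  # avoid divide-by-zero
        if (scan[i] - background) / noise >= snr_threshold:
            lpds.append(i)
    return lpds
```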
A ray of light behaves like an electromagnetic wave and can be polarized perpendicularly (S) or parallel (P) to the incident plane, which is the plane through the direction of propagation of light and the surface normal. Figure 3 plots reflectivity against the thickness of an overlayer on a substrate (in this case oxide on Si) for S and P polarization. On a bare surface, or one with only a few angstroms of native oxide, S polarization has a maximum in reflectivity and therefore a minimum in scattering. P polarization behaves in the opposite manner. Therefore for rough bare surfaces, the S polarization channel can be used to suppress substrate scattering and hence to enhance sensitivity to particle defects. For specific overlayer thicknesses, destructive interference between surface and overlayer/substrate reflectivity reverses the S and P behavior (see Fig. 3).
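The S/P behavior of Fig. 3 can be reproduced qualitatively with the standard single-film Fresnel formulas. This is a sketch, not any scanner's actual optical model; the film index (1.46, roughly SiO2), the Si substrate index at 488 nm, and the 70° angle of incidence are all assumed values for illustration.

```python
import cmath
import math

def film_reflectance(thickness_nm, wavelength_nm=488.0, aoi_deg=70.0,
                     n_film=1.46, n_sub=4.37 - 0.08j, n_amb=1.0):
    """Reflectance of a single transparent film on a substrate for S and P
    polarization, using the standard thin-film Fresnel formula
    r = (r12 + r23 e^{2i*beta}) / (1 + r12 r23 e^{2i*beta}).

    Returns (R_s, R_p). Defaults (oxide on Si at 488 nm, 70 deg) are assumed.
    """
    th1 = math.radians(aoi_deg)
    s1, c1 = math.sin(th1), math.cos(th1)
    c2 = cmath.sqrt(1 - (n_amb * s1 / n_film) ** 2)   # cos(theta) in the film
    c3 = cmath.sqrt(1 - (n_amb * s1 / n_sub) ** 2)    # cos(theta) in the substrate
    beta = 2 * math.pi * n_film * thickness_nm * c2 / wavelength_nm

    def stack(r12, r23):
        phase = cmath.exp(2j * beta)
        r = (r12 + r23 * phase) / (1 + r12 * r23 * phase)
        return abs(r) ** 2

    # S (perpendicular) Fresnel coefficients at each interface
    rs12 = (n_amb * c1 - n_film * c2) / (n_amb * c1 + n_film * c2)
    rs23 = (n_film * c2 - n_sub * c3) / (n_film * c2 + n_sub * c3)
    # P (parallel) Fresnel coefficients
    rp12 = (n_film * c1 - n_amb * c2) / (n_film * c1 + n_amb * c2)
    rp23 = (n_sub * c2 - n_film * c3) / (n_sub * c2 + n_film * c3)
    return stack(rs12, rs23), stack(rp12, rp23)
```

At zero thickness the formula collapses to the bare ambient/substrate interface, where S reflectivity is high (and hence S-channel scattering low), consistent with the behavior described above; as the oxide grows, interference lowers the S reflectivity.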
There is a variety of commercial particle scanners available, but only a few are in common use. There is an intrinsic difference in requirements for detection of defects on patterned versus unpatterned wafers. For patterned wafers it is of paramount importance to recognize defects that are violations of the pattern itself. To do this requires complex image processing.
A. KLA-Tencor Surfscans
1. 6200 Series
The KLA-Tencor Surfscan 6200 series (Figure 4a), which replaced the older 4000 and 5000
series, illuminates the wafer surface using a normal incident laser beam. The incident laser
beam is circularly polarized and is swept across the wafer by an oscillating mirror (170 Hz)
in the X direction, while the wafer is being held and transported by a vacuum puck in the
Y direction. The combination of these two motions provides a raster scan of the entire
wafer surface. With the illuminated spot and detector on its focal lines, a cylindrical mirror
of elliptical cross section is placed above the wafer to redirect scattered light to the
detector. The collection angle of the 6200 series is limited by the physical build of this
mirror and is illustrated in Fig. 4b. After loading, each wafer undergoes two scanning
cycles. The prescan (fast) is for haze measurement and edge detection, while the second
scan (slow) is for surface defect detection and edge/notch determination. Any defects
scattering light above a defined threshold of intensity are categorized as light point defects (LPDs). The 6200 series is designed for smooth surfaces, such as bare silicon, epitaxial silicon, oxides, and nitrides. For rough surfaces, such as polymers and metals, the sensitivity of the 6200 degrades severely. The reason for this can be traced to the scattering geometry. Normal incidence, and the detector's collecting over a wide solid angle, constitute an efficient arrangement for collecting scattering from the surface roughness, thereby decreasing the signal/noise ratio, i.e., the ratio of particle scattering to surface background scattering.
In the 6400 series, the laser strikes the wafer at a grazing angle of incidence, and several combinations of incident and collection polarization are available. One combination is the most sensitive to surface roughness and is used for smooth surfaces, such as bare silicon wafers. The C-U polarization is for oxide, nitride, and other moderate surfaces. The S-S polarization is the least sensitive to surface roughness and is suitable for rough surfaces, such as metal. However, because of the fixed low-/side-angle collection optics, the collection angle varies depending on the position of the laser spot with respect to the detector (Fig. 5b). As a result, a sphere located on the right-hand side of the wafer could appear twice the size as when located on the left-hand side of the wafer. Furthermore, the grazing incident angle exacerbates the variation of the size and shape of the laser spot as the beam sweeps across the wafer. The end result of these configurational differences from the 6200 series is that (a) on smooth surfaces the sensitivity (i.e., minimum detectable size) is poorer, (b) on rough surfaces the sensitivity is better, and (c) the sizing accuracy is much more uncertain and dependent on the orientation of the defect (unless it is spherical).
The basic version of the SP1 (the so-called SP1-classic) is equipped with a normal-incidence laser beam. Similar to the 6200 series, this is mainly for smooth-surface inspection in dark field. Differential interference contrast (DIC) technology is used in the bright-field detection. Any surface feature with an appreciable slope change can be detected as a phase point defect (PPD). This bright-field DIC channel can detect large defects, about 10 μm and larger, such as mounds, dimples, and steps. An additional oblique beam (70° from the surface normal) is provided in the SP1-TBI version to improve sensitivity on rough surfaces.
2. KLA AIT
The AIT is a patterned-wafer defect detection tool employing low-angle incident light and
two PMT detectors at a low collection angle (Figure 8). It has tunable S, P, and circular polarization of incident and collected light to allow several combinations of polarization settings. The AIT also has a tunable aperture and a programmable spatial filter for capability on patterned wafers. It appears that the AIT is being used in industry for both patterned and unpatterned defect detection. This is not because of any advantage over the 6000 or SP1 series in sensitivity or accuracy on unpatterned wafers, but because it is fast enough and versatile enough to do both jobs fairly well.
3. KLA 2100
The 2100 series has been the Cadillac of scanners for a number of years for patterned-
wafer inspection. Unlike all the other tools, it is a true optical imaging system (not
scattering). It was very expensive (maybe four or five times the cost of the 6000 tools), but gave accurate image shape and size information, with close to 100% capture rate (i.e., it did not miss defects), but only down to about 0.2–0.3 μm. Because it scans the whole wafer in full imaging mode, it is also slow. In addition to its use for patterned wafers, it is used as a reference, or verification, tool to check the reliability and accuracy of the bare-wafer scanners.
Wafer inspection is the first and crucial step toward defect reduction. When a particle/defect excursion is observed during a semiconductor manufacturing process, the following questions are asked: How many defects are on the surface? Are they real? How big are they? What are they? Where do they come from? How can they be eliminated? Wafer inspection provides the foundation for any subsequent root-cause analysis. Therefore, the performance of the inspection tool itself must be well characterized.
A. Variation of Measurements
First of all, the variation of the measurements made by a well-trained operator under a
well-controlled environment must be acceptable. The counting variance, the variation of
defect counts, has been used as an indication of the reliability of a wafer inspection tool.
However, since the counting variance is in direct proportion to the average number of defect counts, it is not a fair measure of the reliability of the measurement. The coefficient of variation (CoV), which takes repeatability and reproducibility into account, is a better measure of the uncertainty of the measurements of a given inspection tool. The coefficient of variation is defined as
CoV = √(σRPT² + σRPD²) / [(μRPT + μRPD)/2]    (5)

where μRPT, σRPT and μRPD, σRPD are the average count and corresponding standard
deviations of the repeatability test and reproducibility test, respectively. The repeatability
is normally determined by the accepted industry approach of measuring a given test wafer
continuously 30 times without any interruption. The reproducibility test can be carried out
either in the short term or the long term. The short-term reproducibility is determined by
measuring the test wafer 30 times continuously, loading/unloading the wafer between
scans, while the long-term reproducibility is obtained by measuring the test wafer regularly
once every day for 30 days.
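Computing the coefficient of variation (Eq. 5) from a repeatability run and a reproducibility run is straightforward; this sketch pools the two variances and normalizes by the mean of the two average counts.

```python
import statistics

def coefficient_of_variation(repeat_counts, repro_counts):
    """Coefficient of variation, Eq. (5), from two runs of LPD counts:
    a repeatability test (no wafer handling between scans) and a
    reproducibility test (wafer loaded/unloaded between scans).
    """
    var_rpt = statistics.pvariance(repeat_counts)
    var_rpd = statistics.pvariance(repro_counts)
    mean_rpt = statistics.mean(repeat_counts)
    mean_rpd = statistics.mean(repro_counts)
    # combined standard deviation relative to the grand mean count
    return (var_rpt + var_rpd) ** 0.5 / ((mean_rpt + mean_rpd) / 2)
```

For example, two 30-scan runs averaging 100 LPDs with combined standard deviation of 4 counts would give CoV = 0.04, i.e., the roughly 4% measurement variation reported later in Sec. VI.B.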
Due to their well-defined spherical shape and commercial availability in calibrated sizes down to very small values (0.06 μm), polystyrene latex (PSL) spheres have been used
to specify the performance of wafer inspection tools. However, real-world defects are
hardly spherical and exhibit very different light-scattering behavior than PSL spheres. A
low variation of measurements of PSL spheres, therefore, does not guarantee the same
result for measurements of real defects. Custom testing is needed for characterizing the
performance of the tool for typical defects of interest. This is done whenever the metrology
tool performance becomes an issue.
B. Sizing
Defects must be sized in a reproducible and repeatable manner, since it is the total number of defects larger than a certain size that is most commonly used as a specification in a wafer process. Such a number is meaningless unless the cut size (threshold) of the metrology tool used, the surface scanner, can be precisely stated. The sizing is achieved by calibrating the light-scattering response (e.g., the cross section, in μm²) against defects of known size. Since PSL spheres can be obtained in known graduated sizes over the particle size ranges of interest, they have become the standard for this calibration.
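The calibration just described amounts to inverting a measured curve of cross section versus PSL diameter. The sketch below does this by log-log interpolation; the calibration points are fabricated for illustration only, since real curves are tool- and film-specific and must be measured (cf. Fig. 10).

```python
import math

# Hypothetical calibration points: PSL diameter (um) vs. measured cross
# section (um^2) on a given film. These numbers are illustrative, not real.
CAL_D_UM = [0.1, 0.2, 0.3, 0.5, 1.0]
CAL_CSCA = [1e-5, 6e-4, 6e-3, 1e-1, 2.0]

def psl_equivalent_size(csca, d_pts=CAL_D_UM, c_pts=CAL_CSCA):
    """Convert a measured cross section to a 'PSL equivalent size' by
    piecewise log-log interpolation along the calibration curve."""
    if not (c_pts[0] <= csca <= c_pts[-1]):
        raise ValueError("cross section outside calibrated range")
    logc = math.log(csca)
    for i in range(len(c_pts) - 1):
        lo, hi = math.log(c_pts[i]), math.log(c_pts[i + 1])
        if lo <= logc <= hi:
            frac = (logc - lo) / (hi - lo)
            ld = math.log(d_pts[i]) + frac * (math.log(d_pts[i + 1]) - math.log(d_pts[i]))
            return math.exp(ld)
```

This makes concrete why a separate calibration (and recipe) is needed per film: the same cross section maps to a different diameter on each calibration curve.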
Since real defects are neither polystyrene latex nor (usually) spheres, it must be kept in mind that the reported size of a real defect using this industry-standard calibration approach represents a "PSL equivalent size" and does not give real sizes. We will refer to this again several times, but note here that it is currently considered more important to the industry that the sizing be repeatable, with a good precision, than that it be accurate; that is, the goal is that for a given, non-PSL, defect, everyone gets the same "PSL equivalent size." It is absolutely essential that the proper calibration curve for the surface scanner be used and that, for each change in material, thickness of material, or even change in process recipe, a new calibration be done and a new recipe set. Without this, total particle counts will be under- or overestimated, and, at the limit of sensitivity, haze will be confused with the real particle counts.
None of the foregoing procedures, which are aimed at giving reproducible results, addresses the issue that real defects are not PSL spheres, so their true sizes differ from their "PSL equivalent" sizes. There are, in principle, three ways of relating the "PSL equivalent size" to true size, but in practice only one is used: that is to review the defects by SEM (or optical microscopy if the defect is large enough). From our long experience of doing this on the 6200 series (see Sec. III.A.1 and Chapter 20, Brundle and Uritsky), we can give some rough guides for that tool. If the defect is a particle in the 0.2–1-μm range, three-dimensional in shape, and with no strong microroughness on it, the "PSL equivalent" size is often correct within a factor of 2. For particles smaller than 0.2 μm or much larger than 1 μm, the discrepancy can be considerably greater.

Figure 10 Calibration curves for bare silicon and 2000-Å oxide substrates on the 6200.
C. Capture Rate
An absolute definition of capture rate would specify 100% when all of the particles of interest are detected. Now obviously, if you are working close to the MDS there is a sizable probability of missing a significant fraction. (The very term minimum detection size implies calibration with PSL spheres and, therefore, a discussion of capture rates of PSL spheres.) One moves significantly higher than the MDS threshold (i.e., to the S/N ratio of 3:1 suggested by KLA-Tencor for the 6200) to avoid such statistical variations; i.e., you move to a higher bin size to create a high level of precision. In this situation, what reasons are there for the capture rate of PSL spheres to be less than 100%? Other than the small percentage of variability discussed in Sec. V.B, which is basically due to counting electronics, there should be only one fundamental reason that is significant: the area defined by the laser spot size and the raster of the laser beam means PSL spheres that are close enough together will be counted as one LPD event, reducing the capture rate. The amount below 100% depends on the distribution density of the PSL spheres. In actual use we will see later that there seem to be instrumental factors not under the control of the user that also affect capture rate significantly.
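The coincidence-loss mechanism described above can be explored with a small Monte Carlo sketch: spheres deposited at random are merged into one LPD event whenever they fall within one laser spot of an already-counted event. The greedy merge rule and the 100-μm effective spot size are simplifying assumptions, not a model of any particular scanner.

```python
import random

def simulated_capture_rate(n_spheres, wafer_mm=200.0, spot_um=100.0, seed=0):
    """Monte Carlo sketch of coincidence loss: returns reported LPD events
    divided by the number of spheres actually deposited."""
    rng = random.Random(seed)
    half = wafer_mm * 1000 / 2  # wafer radius in micrometers
    pts = []
    while len(pts) < n_spheres:
        x = rng.uniform(-half, half)
        y = rng.uniform(-half, half)
        if x * x + y * y <= half * half:  # keep points on the wafer
            pts.append((x, y))
    events = []
    for x, y in pts:
        # greedy clustering: a sphere within spot_um of an existing event
        # is absorbed into that event instead of being counted separately
        for ex, ey in events:
            if (x - ex) ** 2 + (y - ey) ** 2 <= spot_um ** 2:
                break
        else:
            events.append((x, y))
    return len(events) / n_spheres
```

At realistic PSL deposition densities the loss is tiny, which supports the statement that coincidence alone cannot explain large capture-rate shortfalls; the loss only becomes severe when the deposit is very dense relative to the spot size.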
How is the capture rate determined? A common procedure used to be to use the KLA 2100 imaging scanner as the benchmark, i.e., with the assumption that it has a 100% capture rate. This works only for real sizes larger than 0.2 μm. At smaller sizes there is no practical way other than relying on the PDS (particle deposition system) to know how many PSL spheres are deposited into a spot and then to check this with SEM direct searching and compare the total found to that found by the scanner.
Once one moves away from PSL spheres, capture rate has a significantly different meaning and is rarely an absolute term, since real defects may scatter very differently from PSL spheres. For instance, a physical 1-μm defect with a 0.1-μm "PSL equivalent size" in one model scanner may have a very different value in another model because of strong angular effects not present for PSL spheres. The practical definition of capture rate, then, becomes comparative rather than absolute. For example, CMP microscratches of a certain character (see later) are detected effectively in a particular SP1-TBI mode, but only to about
D. False Counts
False counts are a way of checking, when there is doubt, whether LPDs represent real defects. What causes false counts? Remember that the recipe has been set with the detection threshold at a level (S/N = 3:1, for instance) such that there is confidence there are no false counts in the smallest bin used (the threshold defines that size). Given this, the only way a significant number of false counts can occur (industry likes there to be less than 5%) is if the haze and/or noise level, N, increases. This can happen if, for the particular wafer in question, the surface roughness has increased (either across the whole wafer or in patches). All this is saying is that the recipe used is now inappropriate for this wafer (or patch on the wafer), and there will be doubt about the validity of the number of defects detected in the smallest bin size. Such doubt usually arises when an apparent particle count rises without any changes having occurred in the process being monitored.
The first step in establishing whether there are false counts is to revisit LPDs in the smallest bin size in review mode (e.g., Fig. 2) and establish whether each S/N is greater than 3. If S/N is lower than 3, the LPD must be revisited by optical imaging or SEM review to establish whether it is genuine (optical imaging will be appropriate only for a large minimum bin size). If nothing is detectable, the LPD is considered a false count (or a nuisance defect). If the number of false counts is found to be too high, the recipe for the scanner has to be changed, increasing the threshold, and the wafer rescanned. Now the minimum bin size being measured will be larger, but the number of LPDs detected in it will be reproducible.
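The review-and-rethreshold decision just described can be sketched as a small helper. The 5% limit and the S/N = 3 criterion come from the text; the function itself is hypothetical.

```python
def smallest_bin_is_valid(lpd_snrs, max_false_fraction=0.05, snr_min=3.0):
    """Review-mode check: LPDs in the smallest bin whose S/N falls below
    snr_min are candidate false counts. If their fraction exceeds
    max_false_fraction, the recipe threshold should be raised and the
    wafer rescanned; returns True when the bin passes."""
    false = sum(1 for snr in lpd_snrs if snr < snr_min)
    return false / len(lpd_snrs) <= max_false_fraction
```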
E. Defect Mapping
Defect mapping is important at three different levels, with increasing requirements for
accuracy. First, the general location of defects on the wafer (uniform distribution, center,
edge, near the gate valve, etc.) can give a lot of information on the possible cause. Second,
map-to-map comparison is required to decide what particles are adders in a particular
process step (pre- and postmeasurement). Finally, to review defects in a SEM, or any other analytical tool, based on a light-scattering defect file requires X, Y coordinates of sufficient accuracy to be able to re-find the particles.
Targeting error is the difference between the reported and the actual X, Y coordinate values, with respect to some known frame of reference. For the general distribution of defects on a wafer, this is of no importance. For map-to-map comparison, system software is used to compare pre- and postcoordinates and declare whether there is a match (i.e., it is not an adder). However, if the targeting error is outside the value set for considering it a match (often the case), a wrong conclusion is reached. Here we are talking of errors of up to a few hundred micrometers, arising from the following sources.
Random error due to spatial sampling of the scattered light. This type of error arises from the digital nature of the measurement process. As the laser spot sweeps across the wafer surface, particles and other surface defects scatter light away from the beam. This scattered-light signal is present at all times that the laser spot is on the wafer surface. In order to process the scattered-light signals efficiently, the signal is digitized in discrete steps along the scan direction. Unless the defect is directly under the center of the laser spot at the time a sample is made, there will be error in the defined coordinates of this defect. Depending on the laser spot size and the sampling steps, this type of error can be as much as 50 μm.
Error due to the lead screw nonuniformity. It is assumed that the wafer translation
under the laser beam is linear in speed. However, random and systematic errors
exist, due to the imperfection of the lead screw, and will be integrated over the
travel distance of the lead nut. The contribution of this type of error depends
on the wafer diameter and the tolerance of peak-to-peak error of the lead
screw.
Error due to the sweep-to-sweep alignment (6200 and 6400 series). The 6200 and 6400
series use a raster-scanned laser spot to illuminate the surface contaminant. In
order to keep the total scan time as short as possible, data is collected on
sweeps moving from right to left across the wafer and from left to right on
the next consecutive pass. To align between consecutive sweeps, a set of two
high-scattering ceramic pins is used to turn the sampling clock on and off.
Random errors of as much as twice the size of the sampling step could occur
if sweeps are misaligned.
Error due to the edge/notch detection. When the start scan function of the 6200 or 6400 series is initiated, the wafer undergoes a prescan for edge detection as well as the haze measurement. The edge information is gathered by a detector below the wafer. The edge information is stored every 12 edge points detected. The distance between successive edge points could be as much as a few hundred microns. Since curve fitting is not used to remove the skew between sweeps, a systematic error on edge detection results. Similar systematic errors exist for the SP1 series. The center of the wafer is defined by the intersection of two perpendicular bisectors from tangent lines to the leftmost edge and the bottom-most edge. Once the center of the wafer is found, the notch location is searched for. The notch location error can be significantly affected by the shape of the notch, which can be quite variable. The edge information (the center of the wafer) and the notch location (the orientation of the wafer) are what tie the defect map to the physical wafer.
Error due to the alignment of the laser spot to the center of rotation. This type of
systematic error applies only for a tool with a stationary laser beam, such as the
SP1 series.
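The map-to-map comparison discussed earlier in this section (deciding which defects are adders) can be sketched as a nearest-unmatched-neighbor test. The 250-μm tolerance is an illustrative assumption; in practice it must be chosen larger than the combined targeting error of the two maps, or adders will be over-counted.

```python
def find_adders(pre_map, post_map, tol_um=250.0):
    """Classify post-process defects as adders or matches of pre-existing
    defects. A post defect within tol_um of a not-yet-matched pre defect
    is declared a match; everything else is an adder.
    pre_map/post_map: lists of (x, y) coordinates in micrometers.
    """
    unmatched_pre = list(pre_map)
    adders = []
    for px, py in post_map:
        for i, (qx, qy) in enumerate(unmatched_pre):
            if (px - qx) ** 2 + (py - qy) ** 2 <= tol_um ** 2:
                del unmatched_pre[i]  # consume the matched pre defect
                break
        else:
            adders.append((px, py))
    return adders
```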
The final mapping error is a combination of all of the types of errors just described and can be characterized by a first-order error and a second-order error. The first-order error is the offset of defects after alignment of the coordinate systems. The second-order error (or point-to-point error) is the error of the distance between two defects after correction of the misalignment of the coordinate systems. The first-order error has not received as much attention as its counterpart. However, the success of map-to-map comparison depends not only on point-to-point accuracy but also on the first-order mapping accuracy. If a defect map has a first-order error of as much as 700 μm, which it can have at the edge of a wafer using the SP1, the software routines for map-to-map comparison will fail, even if the second-order point-to-point accuracy is very good.
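The first-order/second-order decomposition can be illustrated as follows. This sketch handles only a pure translation offset; a real alignment routine would also correct rotation and scale, so treat it as a conceptual example only.

```python
def decompose_mapping_error(measured, reference):
    """Split mapping error into a first-order offset (the mean translation
    of the whole map) and the residual point-to-point errors that remain
    after that offset is removed.
    measured/reference: parallel lists of (x, y) coordinates in micrometers.
    Returns ((dx, dy), residuals).
    """
    n = len(measured)
    dx = sum(mx - rx for (mx, _), (rx, _) in zip(measured, reference)) / n
    dy = sum(my - ry for (_, my), (_, ry) in zip(measured, reference)) / n
    residuals = [((mx - dx - rx) ** 2 + (my - dy - ry) ** 2) ** 0.5
                 for (mx, my), (rx, ry) in zip(measured, reference)]
    return (dx, dy), residuals
```

A map shifted rigidly by 700 μm has a large first-order error but zero point-to-point error, which is exactly the failure mode described above: the comparison software fails on the offset even though relative positions are perfect.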
In this section, practical applications for the use of KLA/Tencor scanners (6200, 6400,
SP1, SP1-TBI) are given. The purpose of these studies is to evaluate the performance of
the Surfscans and establish a better understanding of these tools so as to better interpret
particle measurement data, and to establish their most reliable mode of usage.
B. Repeatability
To understand the measurement variation of the KLA-Tencor Surfscan 6200 and SP1-TBI on real-world particles, a 200-mm (for SFS 6200) and a 300-mm (for SP1-TBI) bare silicon wafer contaminated by environmental particles were used for the repeatability and reproducibility tests on two SFS 6200s and one SP1-TBI. The repeatability test was done by scanning the test wafer 30 times continuously without any interruption. The reproducibility test was 30 continuous measurements, with loading/unloading of the wafer between measurements. Both high-throughput and low-throughput modes were used in this study to determine the variation due to the throughput settings. The results (Tables 1 and 2) indicate that about 4% of the LPD counts were due to the measurement variation of the instruments, which is acceptable. No contamination trend was observed during the course of this experiment. Although the throughput setting had no apparent effect on the measurement variation, it significantly affected the number of defects captured. Whereas, for the PSL spheres of a specific size in Sec. V.A, there was a maximum 5% effect, here both 6200s showed a 20–25% capture rate loss at high throughput (Table 1). The greater effect is probably because there is a distribution of PSL equivalent sizes present, many being close to the threshold level. Inadequate electronic compensation in high throughput pushes these to a lower PSL equivalent size, and many fall below the detection limit.
Table 1 Average LPD Counts, Repeatability, and Reproducibility for Two SFS 6200s

For the SP1-TBI, more than half of the total particles are missed in high-throughput mode (Table 2)! Figure 12 shows the size distribution, as measured in low- and high-throughput modes. Clearly most of the particles in the lowest bin sizes are lost. The high-throughput mode involves scanning about seven times faster, with an automatic gain compensation. From Sec. V.A, this clearly worked for PSL spheres of a fixed (0.155-μm) size, but it does not work here. Again, PSL equivalent size is being pushed into bins below the cutoff threshold. This dramatic change in behavior (no capture rate loss for 0.155-μm PSL spheres in high-/low-throughput modes, compared to a 50% loss for real environmental contaminants) points to the need to be careful of assuming that apparently well-defined PSL calibrations map to real particle behavior. Use of the low-throughput setting is strongly recommended for measurement using the SP1-TBI. Several other SP1-TBI users have also come to this conclusion.
Figure 12 Size distribution reported by the SP1-TBI under high and low throughput.
6200A 93 73 81 51
6200B 33 20 53 26
6220 59 41 Ð
6400 133 177 60 32
SP1-classic 104 60 98 55
SP1-TBI (normal) 40 20 39 21
SP1-TBI (oblique) 38 14 38 22
mapping accuracy, an XY standard wafer (which is different from the one used in previous
work) was loaded into the wafer cassette with an arbitrary orientation. A dummy scan was
carried out at high throughput, and the wafer was unloaded, with notch up (0 ). The wafer
was then reloaded and unloaded with the notch rotated 30 . This was repeated six times,
rotating each time. A second set of measurements was repeated on another day, but in this
set the robot was allowed to ``initialize'' the system ®rst; that is, the robot runs through a
procedure to optimize alignment of the stage, etc. The coordinate ®les in both sets of
measurements were then corrected for coordinate misalignment (i.e., the 1st-order correc-
tion was eliminated) to bring them into alignment. After this was done, a consistent error
(20 10 mm) was found on all maps (Figure 15a). This indicates that wafer orientation
does not affect the point-to-point (i.e., the second-order) accuracy. However, the size of
the ®rst-order errors clearly showed a strong dependence on the wafer loading orientation
(Fig. 15b). Data also showed that the mapping error was least when the wafer was loaded
with notch up (0 ). Figure 16 shows the XY positions of the geometric center of all marks
for various wafer loading orientations. Notice that the geometric centers move counterclockwise around a center point and that the rotation angle of these geometric centers is similar to the increment of the loading orientation (~30°). This indicates that the first-order mapping error was likely dominated by misalignment of the wafer center to the rotation center of the stage. The calculated centers of rotation were (99989, 99864) and (99964, 99823) for the measurements made on 12/7/99 and 12/16/99, respectively; a variation of about 50 µm in radius was observed between these calculated rotation centers. The angular offset of the patterns of the geometric centers suggests that the alignment of the wafer center to the rotation center depends on the initial wafer loading orientation as well.
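A rotation center like the ones quoted above can be estimated by least-squares fitting a circle to the geometric centers measured at each loading orientation. The following is a minimal sketch using the algebraic (Kåsa) circle fit; the coordinates are illustrative, not the chapter's actual data:

```python
import numpy as np

def fit_circle_center(points):
    """Algebraic (Kasa) least-squares circle fit.

    Solves  a*x + b*y + c = x^2 + y^2  in a least-squares sense;
    the center is (a/2, b/2).  Data are centered first to keep the
    normal equations well conditioned at large stage coordinates.
    Returns (cx, cy, radius).
    """
    pts = np.asarray(points, dtype=float)
    mean = pts.mean(axis=0)
    x, y = (pts - mean).T                      # centered coordinates
    A = np.column_stack([x, y, np.ones_like(x)])
    a, b, c = np.linalg.lstsq(A, x**2 + y**2, rcond=None)[0]
    cx, cy = a / 2.0, b / 2.0
    r = np.sqrt(c + cx**2 + cy**2)
    return cx + mean[0], cy + mean[1], r

# Illustrative check: geometric centers at 30-degree steps (like the
# successive loads) on a circle of radius 120 um around (99980, 99850)
theta = np.radians(np.arange(0, 360, 30))
pts = np.column_stack([99980 + 120 * np.cos(theta),
                       99850 + 120 * np.sin(theta)])
cx, cy, r = fit_circle_center(pts)
```

The fitted radius also gives a direct measure of how far the wafer center sits from the stage rotation center for each loading orientation.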
Figure 17 shows the total rotation needed to correct the misalignment of the coordinate system for each map. The similar trend shown by the two sets of measurements suggests that the notch measurements were repeatable. The consistent offset (~0.035°) was the result of a change (an improvement) in the misalignment of the rotation center produced by the robot initialization in the second run. Additional measurements were repeated several times for the same loading orientation. No significant difference was found between measurements made at the same time for a given loading orientation. The variation in total rotation among measurements with the same orientation indicates the uncertainty of the notch measurement; this is equivalent to about 17 µm at the edge of a 200-mm wafer loaded with notch up (0°).
To summarize, although the point-to-point accuracy was not affected by the orientation of the wafer loaded into the cassette, the first-order mapping accuracy of the SP1-TBI showed a strong dependence on the wafer loading orientation, and the error was smallest when the wafer was loaded with notch up (0°).
Figure 17 Total rotation needed to correct misalignment of the SP1 coordinate system (0° denotes that the wafer was loaded with notch up).
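The first-order correction applied to the coordinate files amounts to fitting and removing the best rigid translation plus rotation between the reported map and a reference map; whatever residual remains is the point-to-point (second-order) error. A minimal 2D Procrustes/Kabsch-style sketch, with hypothetical point lists rather than the chapter's data:

```python
import numpy as np

def remove_first_order(reported, reference):
    """Fit and remove the best rigid rotation + translation (the
    1st-order error) mapping `reported` onto `reference`.  Returns the
    corrected map and the residual (2nd-order) errors."""
    P = np.asarray(reported, dtype=float)
    Q = np.asarray(reference, dtype=float)
    Pm, Qm = P.mean(axis=0), Q.mean(axis=0)
    Pc, Qc = P - Pm, Q - Qm                      # remove translation
    # Kabsch: rotation from the SVD of the 2x2 cross-covariance matrix
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    d = np.sign(np.linalg.det(U @ Vt))           # guard against reflection
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    aligned = Pc @ R.T + Qm                      # rotated, re-translated map
    return aligned, Q - aligned
```

After this correction, a consistent residual such as the 20 ± 10 µm found above is what the scanner's second-order accuracy actually looks like.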
D. Sizing Accuracy
The sizing performance of a surface inspection tool is generally specified by the manufacturer on the basis of polystyrene latex (PSL) spheres on bare silicon wafers. However, real-world defects on a wafer surface are usually not perfect spheres; they can be irregular chunks, flakes, bumps, voids, pits, or scratches. We measured the size of electron-beam-etched pits using all Surfscans of interest. For the 6XY0 series, the reported size for these pits was consistent across the wafer surface (Figure 18h), though the 6400 significantly underestimated the size, by 70%. Pit sizes reported by the SP1 series with normal illumination, the SP1-classic (Fig. 18e) and the SP1-TBI DCN (Fig. 18f) and DNN (Fig. 18g) channels, were strongly dependent on the location of the pits with respect to the wafer center (i.e., the rotation center). Similar to the 6400, the SP1-TBI with oblique illumination (Figure 18f, DCO) underestimated the pit size by 70%. Table 4 summarizes the averaged LPD sizes for the electron-beam-etched pits measured by all Surfscans of interest. For comparison, the sizes of 0.72-µm and 0.155-µm PSL spheres measured by these Surfscans are also listed in Table 4.
In contrast to the case of pits, the 6400 and the SP1-TBI oblique overestimated the PSL spheres, by 40% and 30%, respectively. This is due to the oblique incidence, but the magnitude of the discrepancy will depend strongly on the type and size of the defect.
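Because the bias differs by tool and by defect type, a reported PSL-equivalent size can only be rescaled with a per-tool, per-defect calibration factor. The sketch below uses factors taken loosely from the over/underestimates quoted above; they are illustrative, not calibration data:

```python
# Assumed bias ratios (reported / actual), loosely from the text:
#   pits, 6400 or SP1-TBI oblique: underestimated by ~70%  -> ratio ~0.3
#   PSL spheres, 6400:             overestimated by ~40%   -> ratio ~1.4
#   PSL spheres, SP1-TBI oblique:  overestimated by ~30%   -> ratio ~1.3
BIAS = {
    ("6400", "pit"): 0.3,
    ("6400", "psl"): 1.4,
    ("sp1-tbi-oblique", "pit"): 0.3,
    ("sp1-tbi-oblique", "psl"): 1.3,
}

def corrected_size_um(reported_um, tool, defect_type):
    """Rescale a reported PSL-equivalent size by the assumed bias ratio."""
    return reported_um / BIAS[(tool, defect_type)]

# e.g., a pit reported at 0.2 um by the 6400 would be closer to ~0.67 um
estimate = corrected_size_um(0.2, "6400", "pit")
```

The point of the sketch is only that the correction is defect-type dependent; without knowing what the defect is, no single rescaling is valid.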
The strong variation of size with radius found for etch pits with the SP1 and SP1-TBI may be connected to the fact that the rotational speed of the wafer changes as the wafer translates under the laser beam (in an attempt to keep the dwell time per unit area constant).
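Keeping the dwell time per unit area constant on a spiral scan means keeping the tangential speed under the beam constant, so the spin rate must fall as 1/r as the beam moves outward. A quick sketch of this relation (the numbers are illustrative, not tool specifications):

```python
import math

def spin_rate_rpm(radius_mm, tangential_speed_mm_s):
    """Spin rate needed for constant linear (tangential) speed: w = v / r."""
    omega_rad_s = tangential_speed_mm_s / radius_mm
    return omega_rad_s * 60.0 / (2.0 * math.pi)

# For a fixed tangential speed, the spin rate near the center (r = 5 mm)
# must be 20x the rate near the edge of a 200-mm wafer (r = 100 mm):
ratio = spin_rate_rpm(5.0, 500.0) / spin_rate_rpm(100.0, 500.0)
```

A large swing in spin rate across the wafer is one plausible route by which the detected signal for a given defect could acquire a radial dependence.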
CMP (chemical-mechanical polishing) utilizes a chemically active and abrasive slurry, composed of a solid–liquid suspension of submicron particles in an oxidizing solution. Filters are used to control the particle size of the abrasive component of the slurry. Over time, and with agitation, the colloids tend to agglomerate and form aggregates that are sufficiently large to scratch the wafer surface during polishing. These micron-scale scratches (microscratches) are often missed because of their poor light-scattering nature. The fundamental difference in light-scattering behavior between particles and microscratches can be seen in their scattering patterns (Figure 19).
Figure 19 Light-scattering pattern for a 0.2-µm PSL sphere on a silicon substrate. The gray scale corresponds to the intensity of the scattered light (white is hot; dark is cool).
In one case, a large number of LPDs in the 0.2-µm range were detected by the SP1 (sum of both narrow and wide channels) but missed by the 6200. We suspected the SP1 was detecting microscratches that the 6200 was missing. The wafer was re-examined on the SP1-TBI under normal illumination, with both the narrow and wide channels set at a 0.2-µm threshold. The DN/DW size ratio was plotted against frequency of occurrence (Figure 23). Two lobes appear, with the separating value at a ratio of about 2.5. The expectation is that the lobe with a ratio below 2.5 represents particles and that the lobe with a ratio above 2.5 represents microscratches. To verify this, 65 defects in the smallest (0.2-µm) size bin were reviewed with the Ultrapointe confocal laser microscope. All were real defects, and 57 of the 65 were indeed microscratches. The physical size (6 µm in length) observed by the confocal microscope far exceeds the PSL-equivalent size of 0.2 µm.
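The DN/DW discrimination described above reduces to a simple ratio threshold. A minimal sketch follows; the 2.5 cutoff is the empirical separator from the text, while the sample sizes are made up for illustration:

```python
def classify_defect(dn_size_um, dw_size_um, cutoff=2.5):
    """Classify a defect from its narrow/wide channel size ratio.

    Per the two-lobe histogram in the text, a DN/DW ratio above the
    empirical cutoff (~2.5) flags a likely microscratch; a ratio below
    it flags a likely particle.  Returns (label, ratio).
    """
    ratio = dn_size_um / dw_size_um
    label = "microscratch" if ratio > cutoff else "particle"
    return label, ratio

# Hypothetical LPDs from the two channels:
label, ratio = classify_defect(0.9, 0.2)   # ratio 4.5 -> microscratch
```

In practice the cutoff would need to be re-derived for each tool and recipe, since it depends on the angular collection ranges of the channels.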
To summarize this section: by using all the available channels in the SP1-TBI (different angular scattering regions), it is possible to derive signatures for defects that are physically very different, e.g., particles versus pits or microscratches. It remains to be seen whether the distinction is sufficient to be useful for particles with less dramatic differences, though we have also been able to easily distinguish small COPs from very small particles this way. Other scanners with multiple channels available, such as the Applied Excite or the ADE tool, can also perform this type of discrimination.
VI. CONCLUSIONS
In this chapter we have tried to give a summary of the principles behind using light scattering for particle scanners, a description of the important parameters and operations involved in the practical use of scanners, and a discussion of the caveats to be aware of. It is very easy to get data using particle scanners, and just as easy to misinterpret that data without expertise and experience in how the scanners actually work. Because particle requirements in the industry are at the limit of current scanner capability (e.g., sizing accuracy, minimum detectable size, accuracy and reproducibility in particle counting), it is very important that scanners be operated with a full knowledge of the issues involved.
We have also presented the results of some of our efforts to delineate, in a more quantitative manner, some of the important characteristics and limitations of the particular set of scanners we use (the 6200, 6400, SP1, and SP1-TBI).
In the future we expect several developments in the use of particle scanners. First, there is a push toward integrating scanners into processing tools. Since such a scanner has to address only the particular process at hand, it need not be at the forefront of all general capabilities; it can be thought of more as a rough metrology check, and when a problem is flagged, a higher-level stand-alone tool comes into play. Second, there should be greater activity toward obtaining a particle "signature" through the design and use of scanners with multiple channels (normal or oblique incidence, different angular regions of detection) to exploit the differences in scattering patterns from different types of defects. Third, it is likely that there will still be a push toward better sensitivity (i.e., smaller detectable particle size). This, however, is ultimately limited by the microroughness of the surface and so will be restricted primarily to supersmooth Si monitor wafers. The issue of improved targeting accuracy will be driven by customers' increasing need to subsequently review defects in SEMs or other analytical tools. Either the scanner targeting accuracy must improve (primarily through elimination of first-order errors), or on-board dark-field microscopes will have to be added to the SEMs (as in the Applied SEMVision, for example), or a stand-alone optical bench (such as the MicroMark 5000) must be used to update the scanner files to a 5-µm accuracy in a fast, automated manner.
Finally, the issue of always working in "PSL-equivalent sizes" must be addressed. Everyone is aware that scanners do not provide real sizes for real defects and that the error can be very different for different types of defects. The push toward greater sensitivity, then, is more a matter of "we want to detect smaller" than of any sensible discussion of what sizes are important to detect.
ACKNOWLEDGMENTS
The authors would like to acknowledge stimulating discussions with many of our colleagues, in particular Pat Kinney of MicroTherm, and with Professor Dan Hirleman and his group (Arizona State and now Purdue University).
REFERENCES
1. T Hattori. In: KL Mittal, ed. Particles on Surfaces: Detection, Adhesion, and Removal. New York: Marcel Dekker, 1995, pp 201–217.
2. HC van de Hulst. Light Scattering by Small Particles. New York: Dover, 1981.
3. K Nahm, W Wolfe. Applied Optics 26:2995–2999, 1987.
4. PA Bobbert, J Vlieger. Physica 137A:213, 1986.
5. HE Bennett. Scattering characteristics of optical materials. Optical Engineering 17(5), 1978.
6. Y Uritsky, H Lee. In: DN Schmidt, ed. Contamination Control and Defect Reduction in Semiconductor Manufacturing III. 1994, pp 154–163.
7. P-F Huang, YS Uritsky, PD Kinney, CR Brundle. Enhanced sub-micron particle root cause analysis on unpatterned 200 mm wafers. Submitted to SEMICON WEST 99 Conference: Symposium on Contamination-Free Manufacturing for Semiconductor Processing, 1999.
8. F Passek, R Schmolke, H Piontek, A Luger, P Wagner. Microelectronic Engineering 45:191–196, 1999.
9. T Quinteros, B Nebeker, R Berglind. Light Scattering from 2-D Surfaces: A New Numerical Tool. DDA 99 Final Report, TR 350, 1999.