

VLSI IMPLEMENTATION OF A HIGH-SPEED DELTA-SIGMA ANALOG TO DIGITAL CONVERTER

A Thesis Presented to The Faculty of the Russ College of Engineering and Technology Ohio University

In Partial Fulfillment of the Requirement for the Degree Master of Science

By Zheng Chen November, 1997

THIS THESIS ENTITLED "VLSI IMPLEMENTATION OF A HIGH-SPEED DELTA-SIGMA ANALOG TO DIGITAL CONVERTER" by Zheng Chen has been approved

for the School of Electrical Engineering and Computer Science and the Russ College of Engineering and Technology

Janusz Starzyk, Professor School of Electrical Engineering and Computer Science

Warren K. Wray, Dean Russ College of Engineering and Technology


ACKNOWLEDGEMENTS
I sincerely thank my thesis advisor, Dr. Janusz Starzyk, for his many hours of generous assistance. Not only did Dr. Starzyk recommend many useful references, but he also contributed his insights on many theoretical issues. I will remember the refreshed feeling I got each time I discussed with him a difficult problem that had been troubling me for several days. That feeling gave me new energy and guided me in the right direction. Special thanks go to my family and my dear friend, Ling Zhu, as well as his parents, whose support gave me spiritual strength in pursuing my ambition and during my stay in the United States. I am also very thankful to Dr. Voula Georgopoulos, Dr. Xuefeng Fang at IDT Inc., and Mr. David Zar at Washington University in St. Louis, for their help with digital filters, switched-capacitor circuits, and MOSIS design kits, respectively. Finally, thanks to all my friends, who assisted me upon request and were very helpful in their suggestions.


TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

Chapter
1 INTRODUCTION
2 FUNDAMENTAL OF DELTA-SIGMA DATA CONVERTERS
   2.1 Overview of Various A/D Converter Types
   2.2 Basic Concepts of Delta-Sigma A/D Conversion
      2.2.1 A/D Conversion Terms
      2.2.2 Oversampling Without Noise Shaping
      2.2.3 Noise-Shaping Oversampling (ΔΣ Modulation)
   2.3 Architectures of ΔΣ A/D Converters
      2.3.1 Higher-Order Single-Stage ΔΣ Converters
      2.3.2 Multi-stage (Cascade) ΔΣ Converters
      2.3.3 Multi-bit ΔΣ Converters
   2.4 Digital Filters for ΔΣ A/D Converters
      2.4.1 Basic Principles of Decimation
      2.4.2 Multistage Implementation
      2.4.3 Filter Structures
3 ANALOG MODULATOR DESIGN
   3.1 A Cascaded Multibit ΔΣ Modulator
      3.1.1 ΔΣ Modulation at Low Oversampling Ratios
      3.1.2 Linear Analysis of the Modulator
   3.2 Modified Modulator with Interstage Coupling
      3.2.1 Interstage Coupling
      3.2.2 Multibit Quantizer
      3.2.3 Modified Integrators
   3.3 VLSI Implementation
      3.3.1 Top-Level Structure of the Modulator
      3.3.2 Switched-Capacitor Integrator
      3.3.3 Wide-Swing Constant-Transconductance Bias Circuit
      3.3.4 Fully Differential Folded-Cascode Opamp
      3.3.5 Regenerative Track-and-Latch Comparator
      3.3.6 3-bit Flash ADC and 3-bit DAC
      3.3.7 Simulation
4 DIGITAL DECIMATOR DESIGN
   4.1 Digital Filter Design
      4.1.1 Fourth-Order Comb (sinc^4) Filter
      4.1.2 Half-band Filters
   4.2 Digital Filter Structures
      4.2.1 Two's Complement Arithmetic
      4.2.2 Pipelining Structure for Comb Filter
      4.2.3 Parallel and Pipelining Structure for Half-Band Filters
   4.3 VHDL Implementation
      4.3.1 Programming and Simulation
      4.3.2 Synthesis
5 CONCLUSION
   5.1 Summary
   5.2 Future Work

BIBLIOGRAPHY

Appendix
A MOSIS SCMOS PROCESS
   A.1 SCN08HP Technology and Parameter File
   A.2 SPICE Model File
B VHDL PROGRAMS
   B.1 8-to-3 Encoder
   B.2 Error Cancellation Logic
   B.3 Comb Filter
   B.4 Half-Band FIR Filters
   B.5 Structure Modeling for Digital Circuit
C MATLAB PROGRAMS
   C.1 Herrmann Estimation of FIR Filter Order
   C.2 Frequency Response of sinc^k Comb Filters
   C.3 Two-Stage Half-Band Filter Design
   C.4 Half-Band Filters with Rounded Coefficients
LIST OF TABLES

2.1 Different A/D Converter Types.
3.1 Transistors in the Bias Circuit.
3.2 Transistors in the Opamp Circuit.
3.3 Clock Signals Defined in the Simulator.
4.1 Design Specifications of the Decimator.
4.2 Nonzero Coefficients of the 11-tap Half-Band Filter.
4.3 Nonzero Coefficients of the 165-tap Half-Band Filter.
4.4 Decimal Equivalents of Numbers 0.11 to 1.00.
4.5 Periodical Test Patterns for Digital Filters.
4.6 Global Cell Usage Statistics for Digital Filters.
4.7 Hierarchy Statistics for Digital Filters.

LIST OF FIGURES

2.1 Block diagram of pipelined A/D converter.
2.2 A 4-bit interpolating A/D converter.
2.3 A ΔΣ modulator and its linear model.
2.4 A first-order ΔΣ modulator.
2.5 A second-order ΔΣ modulator.
2.6 Spectral density of noise for some different noise-shaping modulation.
2.7 Chain of integrators with distributed feedback and distributed feedforward inputs.
2.8 Block diagram of cascaded modulator.
2.9 Simplified linear model of a ΔΣ ADC with nonlinearity errors.
2.10 (a) Leslie-Singh structure. (b) An equivalent representation.
2.11 A ΔΣ A/D converter system.
2.12 Decimation by a factor M.
2.13 FIR structures for decimation by a factor M. (a) Direct form. (b) Efficient direct form.
2.14 Symmetrical FIR structure for decimation by a factor M.
3.1 Block diagram of a third-order cascaded multibit ΔΣ A/D converter.
3.2 A cascaded multibit ΔΣ ADC with interstage coupling.
3.3 Modified structure of the third-order cascaded multibit ΔΣ ADC. Note that the Y2p is defined in Equation 3.6.
3.4 Top-level schematic of the modulator.
3.5 The A-to-D block in Figure 3.4.
3.6 A parasitic-insensitive switched-capacitor integrator.
3.7 The CLOCK GENERATOR block in Figure 3.4.
3.8 The INTE1 block in Figure 3.5.
3.9 The BIAS block in Figure 3.4.
3.10 Conventional (a) and wide-swing (b) cascode current mirrors. Ibias typically is set to the nominal or maximum input current, Iin.
3.11 The OPAMP block in Figure 3.8.
3.12 The CMFB block in Figure 3.8.
3.13 The QUANTIZER 1B block in Figure 3.5.
3.14 The QUANTIZER 3B block in Figure 3.5.
3.15 The BUFF block shown in Figure 3.14.
3.16 The CMPR block shown in Figure 3.14.
3.17 Block diagram of a 3-bit differential DAC.
3.18 Bias voltages from simulation of the circuit shown in Figure 3.9.
3.19 Open-loop frequency response of the opamp shown in Figure 3.11 with 200fF capacitive load.
3.20 Open-loop transient analysis of the opamp shown in Figure 3.11 with 200fF capacitive load.
3.21 Simulation results of the modulator shown in Figure 3.5. (a) Outputs of the integrators and 1-bit quantizer.
3.22 Simulation results of the modulator shown in Figure 3.5. (b) Outputs of the 3-bit quantizer.
4.1 Block diagram of digital filters.
4.2 Frequency response of sinc^4 filter with a decimation ratio of 6.
4.3 Frequency responses of 11-tap and 165-tap half-band filters.
4.4 Frequency responses of 11-tap half-band filter with rounded coefficients.
4.5 Frequency responses of 165-tap half-band filter with rounded coefficients.
4.6 Structure for error cancellation logic.
4.7 Structures for the fourth-order comb filter.
4.8 Structure for the 11-tap half-band filter.
4.9 Structure for the 165-tap half-band filter.
4.10 Simulation results of the modulator shown in Figure 3.5 with dc input of 1/3.
4.11 Simulation results of the digital circuits modeled in Appendix B.5.
5.1 Layout for the opamp shown in Figure 3.11.

Chapter 1 INTRODUCTION
The emergence of powerful digital signal processors for telecommunication and multimedia applications implemented in CMOS VLSI technology creates the need for high-resolution analog-to-digital (A/D) converters that can be integrated in fabrication technologies optimized for digital circuits and systems. However, the same scaling of VLSI technology that makes possible the continuing dramatic improvements in digital signal processor performance also severely constrains the dynamic range available for implementing the interfaces between the digital and analog representation of signals.

Oversampled A/D converters based on ΔΣ (delta-sigma) modulation combine sampling at rates well above the Nyquist rate with negative feedback and digital filtering in order to exchange resolution in time for that in amplitude. These converters are especially insensitive to circuit imperfections and component mismatch, and therefore provide a means of exploiting the enhanced density and speed of scaled digital VLSI circuits so as to avoid the difficulty of implementing complex analog circuit functions within a limited analog dynamic range.

A ΔΣ modulator consists of an analog filter and a coarse quantizer enclosed in a feedback loop. Together with the filter, the feedback loop acts to attenuate the quantization noise at low frequencies while emphasizing the high-frequency noise. Since the analog signal is sampled at a frequency much greater than the Nyquist rate, and the quantization noise in the low-frequency signal band can be shaped to the high-frequency range, a conversion with high resolution may be achieved by removing out-of-band quantization noise with a digital low-pass filter operating on the output of the ΔΣ modulator.

ΔΣ data converters have been successfully applied in many low-frequency fields, such as digital audio and ISDN systems, while efforts are dedicated to developing new ΔΣ structures for higher-frequency applications. A new structure [1] has been proposed for conversion rates up to 2 MHz. The purpose of this thesis is to implement a ΔΣ A/D converter system exploiting that proposed structure in a 0.8-μm CMOS technology, so that it can achieve 12-bit resolution and a 2.08 MHz conversion rate, working at a 50 MHz clock for a 1 MHz signal band.

Chapter 2 discusses the concepts and structures of ΔΣ modulation as well as digital filters for decimation of the sampling rate. A general introduction to A/D converters is also given in Chapter 2. Chapter 3 contains the design of the analog ΔΣ modulator. This includes sections on the new structure, the VLSI implementation of the structure, and SPICE simulation results. Many analog building blocks, such as the folded-cascode opamp, comparator, flash A/D converter, and switched-capacitor integrator, are designed at the transistor level. Chapter 4 presents the design of the digital filters in Matlab, VHDL modeling and simulation, and synthesis of standard-cell integrated circuits. All of the VHDL code and Matlab programs are included in the appendices. A conclusion chapter discusses the future layout design.

Chapter 2 FUNDAMENTAL OF DELTA-SIGMA DATA CONVERTERS


Analog-to-digital interfaces are becoming increasingly important as translators between state-of-the-art digital signal processing (DSP) systems and the stubbornly analog outside world. With strict demands for higher accuracy in signal processing, these interface stages must also become more precise. In particular, the data converters, both analog-to-digital (A/D) and digital-to-analog (D/A), must be at least as accurate as the overall precision of the DSP systems. At the same time, as both the feature sizes and bias voltages of very-large-scale integrated (VLSI) systems decrease, the accuracy and dynamic range of analog components are reduced, making the fabrication of monolithic high-resolution data converters more difficult.

Compared with conventional Nyquist-rate converters, oversampling converters, where the sampling and processing rate is much higher than the Nyquist rate, relax the requirements placed on the analog circuitry at the expense of more complicated digital circuitry. This trade-off becomes more desirable for modern VLSI technologies. In order to achieve extra resolution and lower the oversampling ratio (OSR), noise-shaping techniques are usually used in oversampling converters, which is commonly referred to as delta-sigma (ΔΣ) modulation [2].¹

In this chapter, an overview of various A/D converters is presented first. Then the basics of oversampling converters and architectures for ΔΣ modulators are discussed. Finally, digital filters are described for decimating the modulated signal. Since this work is dedicated to a ΔΣ A/D converter (ADC), the issues specific to ΔΣ D/A converters (DAC) will not be covered here. Interested readers may refer to [3, Chapter 12] for details of the ΔΣ DAC.

¹ Delta-sigma modulation is also sometimes referred to as sigma-delta modulation.

2.1 Overview of Various A/D Converter Types


Analog-to-digital converters can be classified in a number of ways. For example, we may classify them into two categories, Nyquist-rate and oversampling [4, Chapter 10], on the basis of the rate at which they sample the input signal. If the bandwidth of the signal is $f_0$, then the sampling rate $f_s$ has to be at least the Nyquist rate $2f_0$ in order to avoid aliasing. In the conventional Nyquist-rate A/D converter, the sampling rate $f_s$ of the analog input signal is the same as the output data rate. Hence, a one-to-one correspondence exists between analog input samples and digital output words. In the oversampling A/D converter, by contrast, $f_s$ is much higher than the Nyquist rate, and the oversampling ratio $\mathrm{OSR} = f_s/(2f_0)$ is usually in the range from 16 to 512. Thus, every output word of the oversampling data converter is found as a weighted average of many consecutive analog input samples.

Since the oversampling A/D converter will be thoroughly discussed in the rest of this chapter, this section mainly discusses some types of Nyquist-rate A/D converters. We may further classify these Nyquist-rate A/D converters into three categories: low-to-medium speed, medium speed, and high speed (Table 2.1) [5, 6]. For comparison, the oversampling converter is also listed in the table.² It should be noted that the speed and resolution ranges in Table 2.1 are only rough estimates. The classification criteria may need to be expanded based on the actual design. A brief look at these Nyquist-rate converters may give us an idea of how they work and where they might fit:

² The conversion rate is equivalent to that of a Nyquist-rate converter.

Table 2.1: Different A/D Converter Types.

Classification             Type of A/D Converter                            Samples per second   Resolution (bits)
Low-to-Medium Speed,       Integrating (dual-slope)                         10 to 30             12 to 24
High Accuracy              Oversampling (delta-sigma)                       10^3 to 10^6         12 to 22
Medium Speed,              Sampling (successive approximation)              10^4 to 10^6         8 to 16
Medium Accuracy
High Speed,                Parallel (flash)                                 10^6 to 10^9         6 to 16
Low-to-Medium Accuracy     Subranging (two-step, multistep or pipelined)    10^5 to 10^7         8 to 16
                           Interpolating (non-folding, folding)             10^6 to 10^8         8 to 12
Integrating converter: This is the simplest and a popular approach for realizing high-accuracy data conversion on very slow-moving signals. The basic function of a quantizer is to electronically define a range of input values, subdivide that range into a set of subregions (i.e., $2^n$ levels for n-bit resolution), and then decide within which subregion the input sample lies. A dual-slope integrating converter achieves this by first performing an integration of an unknown input signal ($-V_{in}$) in a fixed time period $T_1$, followed by the integration of a known reference signal of opposite polarity ($V_{ref}$) to examine how long it takes to decay the integration to zero ($T_2$). The time ratio $T_2/T_1$ can be used to determine the $V_{in}$ level. Converters of this kind are widely used in measurement instruments such as voltage or current meters because of their simplicity and insensitivity to hardware imperfections. However, the conversion speed is quite slow. For example, the worst-case conversion speed occurs when $V_{in}$ equals $V_{ref}$. In this case, the conversion is equivalent to going from the lowest level to the highest level and then to the lowest level again, which needs $2^{n+1}$ clock cycles.
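The two integration phases can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the function name `dual_slope_convert` is hypothetical, the integrator has unit gain and is updated once per clock cycle, signal magnitudes are used instead of the opposite polarities of the real circuit, and the fixed period $T_1$ is taken to be $2^n$ clock cycles.

```python
def dual_slope_convert(v_in, v_ref, n_bits):
    """Idealized dual-slope conversion; returns (count, total_clock_cycles).

    Assumptions (not from the thesis): unit-gain integrator updated once
    per clock, and a fixed first phase of T1 = 2**n_bits clock cycles.
    """
    if not 0.0 <= v_in <= v_ref:
        raise ValueError("v_in must lie in [0, v_ref]")

    t1_cycles = 2 ** n_bits

    # Phase 1: integrate the unknown input for the fixed period T1.
    integrator = 0.0
    for _ in range(t1_cycles):
        integrator += v_in

    # Phase 2: integrate the known reference in the opposite direction
    # and count the clock cycles (T2) until the integrator returns to zero.
    t2_cycles = 0
    while integrator > 0.0:
        integrator -= v_ref
        t2_cycles += 1

    # T2 / T1 = v_in / v_ref, so the T2 count is the digital result.
    return t2_cycles, t1_cycles + t2_cycles


if __name__ == "__main__":
    print(dual_slope_convert(0.5, 1.0, 8))   # half scale
    print(dual_slope_convert(1.0, 1.0, 8))   # worst case: v_in == v_ref
```

In this model the worst case, with the input equal to the reference, takes $2^n$ cycles in each phase, which is the $2^{n+1}$ total quoted above.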

Sampling converter: This is so far the most popular fully integrated solution for moderate-performance systems. It works by applying a "binary search" algorithm to successively approximate the input sample with the output of a digital-to-analog converter (DAC), whose input is also the ADC's output, and tests each bit instead of each level as in the integrating converter. A binary search divides the search space in two each time, so the desired input level can be found in n steps for a set of $2^n$ levels. Specifically, in the first clock cycle, the most significant bit (MSB) of the DAC's input is set to '1' so that the output of the DAC equals half of $V_{ref}$. The comparison determines whether the MSB should be kept at '1' or not. In the second clock cycle, the bit next to the MSB is determined, followed by the next bit and so on until the least significant bit (LSB) is determined. Thus, a successive-approximation converter requires n clock cycles to complete an n-bit conversion. The cost for this speed-up is that a high-speed DAC with precision on the order of the conversion itself is required.

Parallel converter: In parallel conversion, all subregions or levels are examined simultaneously using one comparator per subregion. It is therefore the highest-speed A/D converter, the so-called flash converter. The result can be obtained on the order of one clock cycle, but the cost in hardware area and power consumption is much greater than for successive approximation, especially for higher resolutions (e.g., 8-bit resolution results in $2^8 - 1 = 255$ comparators!). A 3-bit flash ADC will be designed as a part of this thesis work.

Subranging converter: These are very similar to flash converters, except that they perform two or more substeps (or multisteps) instead of requiring a complete conversion in one cycle. A multistep converter is also referred to as a pipelined converter. These converters utilize a series of cascaded blocks in which a low-resolution A/D converter is used to estimate the signal, and a D/A converter and an analog subtraction block remove this estimate from the signal. A remainder voltage is passed on to the next stage, where the process is repeated. A block diagram is shown in Figure 2.1 [4, Chapter 9]. The low-resolution A/D converter inside each cascade block is usually a flash converter. These converters can potentially achieve a throughput rate comparable with that of parallel converters, but at much lower hardware cost at large resolutions. It is straightforward to see that subranging converters save a lot of comparators over their flash counterparts. For example, an 8-bit two-step converter needs only $2^5 - 2 = 30$ comparators if each step involves a 4-bit flash converter, a big reduction compared to the 255 comparators required for a parallel converter.

Figure 2.1: Block diagram of pipelined A/D converter.
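The successive-approximation search and the comparator counts quoted above can be checked with a short sketch. The function names are hypothetical, the DAC and comparator are assumed ideal, and the two-step count assumes an even resolution split into two equal flash stages.

```python
def sar_convert(v_in, v_ref, n_bits):
    """Successive approximation: one comparison per bit, MSB first."""
    code = 0
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)               # tentatively set this bit to '1'
        v_dac = v_ref * trial / (1 << n_bits)   # ideal DAC output for the trial code
        if v_in >= v_dac:                       # comparator decision
            code = trial                        # keep the bit at '1'
    return code


def flash_comparators(n_bits):
    """One comparator per decision threshold: 2**n - 1."""
    return 2 ** n_bits - 1


def two_step_comparators(n_bits):
    """Two flash stages of n/2 bits each (assumes n_bits is even)."""
    return 2 * flash_comparators(n_bits // 2)


if __name__ == "__main__":
    print(sar_convert(0.3, 1.0, 8))      # 8 comparisons for an 8-bit result
    print(flash_comparators(8))          # 255
    print(two_step_comparators(8))       # 30
```

The first trial code has only the MSB set, so the first DAC output is half of the reference, matching the description of the first clock cycle.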

Interpolating converter: A serious problem that flash converters face is the large number of opamps attached to the input signal, which introduce such high input capacitance that a very big and power-hungry buffer has to be added to drive the signal. Interpolating converters can alleviate this problem by reducing the number of input amplifiers by an interpolating factor of k. For example, in a 4-bit interpolating A/D converter (Figure 2.2) [5, Chapter 13], only 4 input amplifiers may be used instead of 15 in a flash converter (i.e., an interpolating factor of 4). These input amplifiers behave as linear amplifiers near their threshold voltages but are allowed to saturate once their differential inputs become moderately large. As a result, noncritical latches (as simple as an inverter) need only determine the sign of the amplifier outputs (i.e., above or below the threshold, which is near the midpoint of the two logic levels). Other subregions are interpolated between adjacent outputs of these amplifiers (e.g., 3 interpolative levels between each two adjacent amplifiers in Figure 2.2). The output of all the latches is encoded by the digital logic circuit. A folding architecture [7] is often used to further reduce the large number of latches in the above interpolating converters. A folding A/D converter works like a two-step converter while not requiring an accurate D/A converter.

Other low-speed and low-accuracy A/D converters are not discussed here, since the major emphasis in this thesis is on advanced DSP applications in telecommunication. All of the above ADCs except the integrating converter have been successfully used in most telecommunication applications. Perhaps the most significant development in telecommunication ADCs in recent years has come from the oversampling delta-sigma approach.

Figure 2.2: A 4-bit interpolating A/D converter.

2.2 Basic Concepts of Delta-Sigma A/D Conversion
As mentioned above, oversampling delta-sigma converters are more tolerant of imprecision in analog circuits because they take advantage of complicated and fast digital circuits. The second advantage of delta-sigma converters is that they simplify the requirements placed on the analog anti-aliasing filters for ADCs and smoothing filters for DACs. Furthermore, a sample-and-hold is usually not required at the input of the oversampling ADC. Therefore, they have become popular in recent years, at least for medium- to low-speed applications such as high-fidelity digital audio [8, 9, 10], digital telephony [11, 12, 13], and instrumentation. Future applications in digital video and digital radar systems are imminent as faster technologies become available.

2.2.1 A/D Conversion Terms


Before detailed concepts of ΔΣ modulation are discussed, some commonly used terms describing the performance of data converters need to be reviewed.

Resolution: The number of bits of resolution refers to the smallest analog input level that causes a change in the digital output word. Thus, an n-bit resolution implies that the converter can resolve $2^n$ distinct analog levels.

LSB (Least Significant Bit): One LSB equals the full-scale input voltage divided by $2^n$, where n is the resolution in bits. For example, with input full scale = 5 V and resolution = 10 bits, 1 LSB = $5/2^{10}$ V = 4.9 mV.

Offset Error: For an A/D converter, the offset error is defined as the relative deviation of the actual $V_{0\ldots01}$ from 1/2 LSB, or mathematically,

\[ E_{off} = \frac{V_{0\ldots01}}{V_{LSB}} - \frac{1}{2} \]

in units of LSBs, where $V_{0\ldots01}$ refers to the input voltage at which the output data becomes 0...01.

Gain Error: This is also called the full-scale error, which refers to the difference between the actual input that produces a full-scale output word and the ideal voltage when the offset error has been reduced to zero. Or mathematically,

\[ E_{gain} = \frac{V_{1\ldots11} - V_{0\ldots01}}{V_{LSB}} - \left(2^n - 2\right) \]

in units of LSBs, where $V_{1\ldots11}$ refers to the input voltage at which the output data becomes 1...11.
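As a worked illustration of these definitions, the sketch below computes the LSB size and the offset and gain errors from two transition voltages. It assumes the usual reading of the definitions, $E_{off} = V_{0\ldots01}/V_{LSB} - 1/2$ and $E_{gain} = (V_{1\ldots11} - V_{0\ldots01})/V_{LSB} - (2^n - 2)$; the function names and the example voltages are hypothetical.

```python
def lsb_voltage(full_scale, n_bits):
    """Size of one LSB in volts for an n-bit converter."""
    return full_scale / 2 ** n_bits


def offset_error_lsb(v_first, v_lsb):
    """Offset error in LSBs; v_first is the input at which the code becomes 0...01."""
    return v_first / v_lsb - 0.5


def gain_error_lsb(v_last, v_first, v_lsb, n_bits):
    """Gain error in LSBs; v_last is the input at which the code becomes 1...11."""
    return (v_last - v_first) / v_lsb - (2 ** n_bits - 2)


if __name__ == "__main__":
    # Example from the text: 5 V full scale, 10 bits -> about 4.9 mV per LSB.
    print(lsb_voltage(5.0, 10))

    # Hypothetical ideal 3-bit converter with 8 V full scale (1 V per LSB):
    # transitions at 0.5 V and 6.5 V give zero offset and zero gain error.
    v_lsb = lsb_voltage(8.0, 3)
    print(offset_error_lsb(0.5, v_lsb), gain_error_lsb(6.5, 0.5, v_lsb, 3))
```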

Differential Nonlinearity (DNL) Error: In an ideal converter, each analog step size is equal to 1 LSB. After the offset and gain errors have been removed, DNL error is defined as the variation in analog step size away from 1 LSB. Thus, a converter with a maximum DNL of 0.5 LSB has its step size varying from 0.5 LSB to 1.5 LSB. For an ADC, when the DNL reaches 1 LSB or greater, there will be missing output codes.

Integral Nonlinearity (INL) Error: The INL is the worst-case deviation from a straight line between the zero and full-scale endpoints of the converter's transfer response, expressed in LSBs. An ADC is guaranteed not to have any missing codes if the maximum INL error is less than 0.5 LSB.

Accuracy: The absolute accuracy of a converter is defined to be the difference between the expected and actual transfer responses, which includes the offset, gain, and nonlinearity errors. The relative accuracy is the accuracy after the offset and gain errors have been removed, which is usually referred to as the maximum INL error. Accuracy can be expressed as a percentage error of full-scale value, as the effective number of bits, or as a fraction of an LSB. For example, in order to achieve 1/2 LSB integral linearity, an 8-bit accuracy implies $1/2^8 \approx 0.4\%$ matching, while a 12-bit accuracy implies $1/2^{12} \approx 0.025\%$ matching. Note that a converter may have 12-bit resolution with only 10-bit accuracy, or 10-bit resolution with 12-bit accuracy. An accuracy greater than the resolution means that the converter's transfer response is very precisely controlled.

Signal-to-Noise Ratio (SNR): This is the ratio of the original input signal power to the background noise power, or equivalently, the rms ratio of the input signal amplitude to the noise amplitude. For an ideal ADC with a sinusoidal input, the SNR related to the resolution n is

\[ \mathrm{SNR}_{rms} = 6.02\,n + 1.76 \ \mathrm{dB} \tag{2.1} \]

Dynamic Range: The dynamic range of a converter is usually specified as the ratio of the rms value of the maximum amplitude of the sinusoidal input signal to the rms output noise plus the distortion measured when the same sinusoid is present at the output. The rms output noise plus distortion is obtained by first eliminating the sinusoid from the measured output. Dynamic range can also be expressed as an effective number of bits using the relationship presented in Equation 2.1. Essentially, this is an indication of how far it is possible to go below the full-scale input signal without hitting noise and/or distortion.
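The relationship in Equation 2.1 can be used in both directions: from a resolution to the ideal SNR, and from a measured SNR or dynamic range back to an effective number of bits. A minimal sketch follows; the function names are hypothetical.

```python
def ideal_snr_db(n_bits):
    """Ideal SNR in dB of an n-bit ADC with a full-scale sinusoidal input (Equation 2.1)."""
    return 6.02 * n_bits + 1.76


def effective_bits(snr_db):
    """Effective number of bits obtained by inverting Equation 2.1."""
    return (snr_db - 1.76) / 6.02


if __name__ == "__main__":
    print(ideal_snr_db(12))       # about 74 dB for 12-bit resolution
    print(effective_bits(62.0))   # a hypothetical measured 62 dB is about 10 bits
```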

2.2.2 Oversampling Without Noise Shaping


The notion of using artificially high sampling rates and 1-bit quantization to achieve high-resolution A/D conversion at a lower rate has been of interest ever since delta modulation [14] was first proposed in 1946. Extra dynamic range and SNR can be obtained by spreading the quantization noise power over a larger frequency range.

Quantization Noise
Quantization of amplitude and sampling in time are at the heart of all digital modulators. Periodic sampling at rates more than twice the signal bandwidth need not introduce distortion, but quantization does. A quantizer can be modeled as adding quantization error $e(n)$ to input $x(n)$ to generate output $y(n)$, i.e., $y(n) = x(n) + e(n)$, where n refers to the n-th sample. The quantization error is the difference between the input and output values, which is bounded by $\pm\Delta/2$, where $\Delta$ equals the difference between two adjacent quantization levels, i.e., 1 LSB. The error $e(n)$ is completely defined by the input $x(n)$, but if $x(n)$ is very active, $e(n)$ can be approximated as an independent random number uniformly distributed between $\pm\Delta/2$. Thus we can treat the quantization error as white noise e with power:

\[ P_e = \frac{1}{\Delta} \int_{-\Delta/2}^{+\Delta/2} e^2 \, de = \frac{\Delta^2}{12} \tag{2.2} \]

and it is independent of the sampling frequency, $f_s$. Also, the spectral density of e is white (i.e., constant over frequency), and all of its power folds into the frequency band $\pm f_s/2$ (a two-sided representation of frequencies). Then the spectral density of the sampled noise is given by

\[ E(f) = \sqrt{\frac{P_e}{f_s}} = \frac{\Delta}{\sqrt{12}} \sqrt{\frac{1}{f_s}} \tag{2.3} \]
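The white-noise approximation behind the quantization noise power $\Delta^2/12$ can be checked numerically. The sketch below is an illustration only: it assumes a uniform rounding quantizer and a random, "very active" input spread uniformly over $[-1, 1]$, and all names are hypothetical.

```python
import random


def quantize(x, delta):
    """Uniform quantizer with step size delta (rounds to the nearest level)."""
    return delta * round(x / delta)


def measured_noise_power(delta, n_samples=200_000, seed=0):
    """Mean-square quantization error for a random input in [-1, 1]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.uniform(-1.0, 1.0)
        e = quantize(x, delta) - x      # e(n) = y(n) - x(n)
        total += e * e
    return total / n_samples


if __name__ == "__main__":
    delta = 2.0 / 2 ** 6                # 64 steps across the [-1, 1] range
    print(measured_noise_power(delta))  # measured mean-square error
    print(delta ** 2 / 12)              # predicted noise power
```

For a slowly varying or dc input the error is no longer independent of the input, so the agreement seen here should not be expected in general.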

Oversampling Advantage

Oversampling occurs when the signals of interest are bandlimited to $f_0$ yet the sampling rate is at $f_s$, where $f_s$ is much larger than $2f_0$. As defined in the sections above, the oversampling ratio is $\mathrm{OSR} = f_s/(2f_0)$. After quantization, since the signals of interest are all below $f_0$, the quantized signal along with noise is filtered by

\[ |H(f)| = \begin{cases} 1, & |f| \le f_0 \\ 0, & f_0 < |f| \le f_s/2 \end{cases} \tag{2.4} \]

This filter eliminates quantization noise together with any other image signals greater than $f_0$. While the in-band signal power is the same as the original input signal power, the quantization noise power is reduced to

\[ P_{e0} = \int_{-f_s/2}^{f_s/2} E^2(f)\,|H(f)|^2 \, df = \int_{-f_0}^{f_0} E^2(f) \, df = \frac{2f_0}{f_s} P_e = \frac{P_e}{\mathrm{OSR}} \tag{2.5} \]

Therefore, each doubling of OSR (i.e., sampling at twice the rate) decreases the in-band noise power by one-half or, equivalently, 3 dB, increasing the resolution by only 0.5 bits according to Equation 2.1.
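The arithmetic behind the 3 dB and 0.5 bit per octave figures can be sketched as follows. The in-band noise power is taken as $P_e/\mathrm{OSR}$, and the conversion from dB to bits uses the 6.02 dB-per-bit slope of Equation 2.1; the function names are hypothetical.

```python
import math


def inband_noise_power(p_e, osr):
    """In-band quantization noise power after ideal low-pass filtering."""
    return p_e / osr


def snr_gain_db(osr):
    """SNR improvement in dB from plain oversampling (no noise shaping)."""
    return 10.0 * math.log10(osr)


def extra_bits(osr):
    """Resolution gained, using 6.02 dB per bit from Equation 2.1."""
    return snr_gain_db(osr) / 6.02


if __name__ == "__main__":
    for osr in (2, 4, 16, 256):
        print(osr, round(snr_gain_db(osr), 2), round(extra_bits(osr), 2))
```

Gaining even 4 bits this way needs an OSR of 256, which is why plain oversampling is usually combined with noise shaping.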

2.2.3 Noise-Shaping Oversampling (ΔΣ Modulation)


Figure 2.3: A ΔΣ modulator and its linear model. (a) Modulator: the input x(n) passes through a subtractor, an analog filter, and a quantizer to produce y(n), with a D/A converter in the feedback path. (b) Linear model: the filter is H(z) and the quantizer is replaced by an additive error e(n).

A more efficient oversampling quantizer is the noise-shaped ΔΣ modulator shown in Figure 2.3(a). Although most present ΔΣ converters make use of 1-bit quantizers (i.e., only two output levels) due to the inherent linearity between two levels, a general discussion may be addressed to multilevel quantizers. By analyzing the linear model shown in Figure 2.3(b) as having two independent inputs $x(n)$ and $e(n)$, which is an approximation for an active input signal $x(n)$, we can derive a signal transfer function, $\mathrm{STF}(z)$, and a noise transfer function, $\mathrm{NTF}(z)$, as

$$\mathrm{STF}(z) \equiv \frac{Y(z)}{X(z)} = \frac{H(z)}{1 + H(z)} \tag{2.6}$$

$$\mathrm{NTF}(z) \equiv \frac{Y(z)}{E(z)} = \frac{1}{1 + H(z)} \tag{2.7}$$

Note that the zeros of $\mathrm{NTF}(z)$ are equal to the poles of $H(z)$. In other words, when $H(z)$ goes to infinity, $\mathrm{NTF}(z)$ goes to zero. In the frequency domain, the output signal can be written as the combination of the input signal and the noise signal:

$$Y(z) = \mathrm{STF}(z)\,X(z) + \mathrm{NTF}(z)\,E(z) \tag{2.8}$$

Figure 2.4: A first-order ΔΣ modulator.


To noise-shape the quantization noise in a useful manner, the magnitude of $H(z)$ should be large enough over the frequency band of interest to make $\mathrm{NTF}(z)$ near zero and $\mathrm{STF}(z)$ near unity. Thus, the quantization noise is reduced over the signal bandwidth while the signal itself is largely unaffected. The feedback does not reduce the high-frequency noise, since there is little loop gain at high frequency. However, the out-of-band noise can be removed by additional low-pass digital filters.
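As a numerical illustration of Equations 2.6 and 2.7, the sketch below evaluates STF and NTF on the unit circle for a given loop filter. The discrete-time integrator $H(z) = 1/(z - 1)$ is taken here purely as an example of a filter with very high low-frequency gain; the function names are hypothetical.

```python
import cmath
import math

def stf_ntf(loop_filter, z):
    """Return (STF, NTF) of the linear model for a loop filter H(z)."""
    h = loop_filter(z)
    return h / (1 + h), 1 / (1 + h)

def integrator(z):
    """Example loop filter: discrete-time integrator H(z) = 1/(z - 1)."""
    return 1 / (z - 1)

for f_rel in (0.001, 0.01, 0.1, 0.5):          # frequency as a fraction of fs
    z = cmath.exp(2j * math.pi * f_rel)
    stf, ntf = stf_ntf(integrator, z)
    print(f"f/fs={f_rel:5.3f}  |H|={abs(integrator(z)):9.3f}  "
          f"|STF|={abs(stf):.3f}  |NTF|={abs(ntf):.4f}")
```

Where the loop gain is large (low frequencies), |NTF| is close to zero while |STF| stays at unity; near f_s/2 the loop gain is small and the noise is not attenuated.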

First-Order Noise Shaping

To realize first-order noise shaping, $\mathrm{NTF}(z)$ should have a zero at dc (i.e., $z = e^{j\omega T} = 1$, where $\omega = 0$) so that the quantization noise is high-pass filtered. Since the zeros of $\mathrm{NTF}(z)$ are equal to the poles of $H(z)$, $H(z)$ should have a pole at $z = 1$, resulting in the choice of a discrete-time integrator:

$$H(z) = \frac{1}{z - 1} \tag{2.9}$$

A block diagram for such a choice is shown in Figure 2.4. The signal goes first through a subtractor and then an accumulator, corresponding to Δ and Σ, respectively. From a time-domain point of view, the feedback forces the average value of the quantized output $y(n)$ to equal the average value of the input $x(n)$, so that the average of the integrator's input $x(n) - y(n)$ equals zero (otherwise, the infinite dc gain would amplify the difference to infinity). This configuration is similar to an ideal opamp connected for unity closed-loop gain. From Equations 2.6, 2.7, and 2.9, we have $\mathrm{STF}(z) = z^{-1}$ and $\mathrm{NTF}(z) = 1 - z^{-1}$, corresponding to a one-clock-period delay for the signal and a first-order high-pass filter for the noise. The spectral density of the modulation noise may be expressed as

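$$E'(f) = E(f)\,|\mathrm{NTF}(z)| = E(f)\,|1 - z^{-1}| = E(f)\,|1 - e^{-j\omega T}| = 2\sqrt{P_e T}\,\sin\!\left(\frac{\omega T}{2}\right) \tag{2.10}$$

where $\omega = 2\pi f$, $T = 1/f_s$, and the relation between $E(f)$ and $P_e$ is defined in Equation 2.3. Hence, the in-band noise power can be calculated as

$$P_e' = \int_{-f_0}^{f_0} E'^2(f)\,df \approx P_e\,\frac{\pi^2}{3}\left(\frac{1}{\mathrm{OSR}}\right)^{3} \tag{2.11}$$

for $f_0 \ll f_s$ (i.e., $\mathrm{OSR} \gg 1$). The final SNR for a sinusoidal input signal can be given [3, Chapter 2] as:

$$\mathrm{SNR}_{\mathrm{rms}} = 6.02n + 1.76 - 5.17 + 30\log(\mathrm{OSR}) \tag{2.12}$$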

We see here that doubling the OSR reduces the in-band noise by 9 dB, providing 1.5 bits of extra effective resolution for the first-order ΔΣ modulator.
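The time-domain behavior described above, in which the feedback forces the average of the output to follow the input, can be reproduced with a short behavioral simulation. The sketch below is not a circuit model; it assumes a dc input, DAC levels of -1 and +1, and the delaying integrator of Equation 2.9, and the function name is hypothetical.

```python
def first_order_dsm(x_dc, n_samples):
    """Behavioral first-order 1-bit modulator with H(z) = 1/(z - 1).

    x_dc is a constant input in (-1, 1); the DAC levels are assumed
    to be -1 and +1, and the quantizer threshold is 0.
    """
    u = 0.0                      # integrator state
    bits = []
    for _ in range(n_samples):
        y = 1.0 if u >= 0.0 else -1.0   # 1-bit quantizer
        bits.append(y)
        u += x_dc - y                    # accumulate the error x(n) - y(n)
    return bits

x_dc = 0.3
out = first_order_dsm(x_dc, 10_000)
avg = sum(out) / len(out)
print(f"input = {x_dc}, average of 1-bit output = {avg:.4f}")
```

Because the integrator state stays bounded, the accumulated error between input and output is bounded as well, so the averaging error shrinks in proportion to the number of samples averaged.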

Second-Order Noise Shaping


A major shortcoming of the simple ΔΣ system shown in Figure 2.4 is that it may generate low-frequency, and thus in-band, tones for special values of the input $x(n)$. To see how this may occur, assume for the moment that we are using a 1-bit quantizer and a 1-bit DAC in the system, that the two output levels of the 1-bit DAC are 0 V and 1 V, and that the input $x(n)$ is a constant voltage of $k/m$ volts, where $k$ and $m$ are relatively prime integers and $k < m$. Then it is possible to have a periodic DAC output pattern with each period $m$ pulses long and containing $k$ pulses of value 1 V and $m - k$ pulses of 0 V. The average value of the DAC output will then equal $x(n)$, and hence the loop can settle into a steady-state oscillation under these conditions. If $m$ is sufficiently large that $m > f_s/f_0 = 2\,\mathrm{OSR}$, then the fundamental component $f_s/m$ of the oscillation will fall into the signal band, and it may therefore appear as a sinusoidal tone of considerable amplitude in the final data output, which is often called pattern noise [15].

One way to avoid the pattern noise is to inject extra noise, called dither, into the loop, although this also reduces the dynamic range of the system [16]. Another way is to reduce the correlation between $x(n)$ and $e(n)$ by using a higher-order loop filter, which also results in more selective noise shaping and thus in an improved SNR. The modulator shown in Figure 2.5 realizes second-order noise shaping by using two cascaded integrators. For this modulator, the signal transfer function can be derived as $\mathrm{STF}(z) = z^{-1}$, corresponding to a one-clock-cycle delay, and the noise transfer function is $\mathrm{NTF}(z) = (1 - z^{-1})^2$, a second-order high-pass filter. Calculations similar to those for the first-order case in Section 2.2.3 give the noise power and the SNR for a sinusoidal signal as:

$$P_e' \approx P_e\,\frac{\pi^4}{5}\left(\frac{1}{\mathrm{OSR}}\right)^{5} \tag{2.13}$$

$$\mathrm{SNR}_{\mathrm{rms}} = 6.02n + 1.76 - 12.9 + 50\log(\mathrm{OSR}) \tag{2.14}$$

Figure 2.5: A second-order ΔΣ modulator.

Figure 2.6: Spectral density of noise E(f) for different noise-shaping modulators (no noise shaping, first order, and second order), plotted from 0 to f_s with f_0 and f_s/2 marked.
Therefore, doubling the OSR improves the SNR of a second-order modulator by 15 dB or, equivalently, gives an extra resolution of 2.5 bits/octave. A comparison of the noise spectral densities of zero-, first-, and second-order noise-shaping modulators is shown in Figure 2.6. The noise power over the signal band (i.e., from 0 to $f_0$) decreases as the noise-shaping order increases. However, the out-of-band noise increases for the higher-order modulators. Furthermore, the generation of in-band tones is largely prevented by the added feedback path, which reduces the dependence of the quantization noise on the recent values of the input signal and thus decreases their correlation. In fact, the feedback signal added to the input of the second integrator acts as a dither signal.
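The pattern-noise mechanism described for the first-order loop (a dc input of k/m volts with DAC levels of 0 V and 1 V) can be reproduced exactly with rational arithmetic. The following is a behavioral sketch under assumptions that are not taken from the thesis: a quantizer threshold of 1/2, zero initial integrator state, and the hypothetical function name `first_order_pattern`.

```python
from fractions import Fraction

def first_order_pattern(k, m, n_samples):
    """1-bit output of a first-order loop for a constant input k/m.

    DAC levels are 0 and 1; the quantizer threshold is assumed to be 1/2.
    Fractions keep the integrator state exact, so the periodicity of the
    output is not blurred by floating-point rounding.
    """
    x = Fraction(k, m)
    u = Fraction(0)              # integrator state
    half = Fraction(1, 2)
    out = []
    for _ in range(n_samples):
        y = 1 if u >= half else 0
        out.append(y)
        u += x - y               # integrate the difference x(n) - y(n)
    return out

pattern = first_order_pattern(3, 8, 24)
print(pattern[:8], "ones per period:", sum(pattern[:8]))
```

For the input 3/8 the output repeats every 8 samples and each period contains 3 ones, so the output spectrum contains a line at f_s/8 and its harmonics rather than noise-like content.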

2.3 Architectures of ΔΣ A/D Converters


The world of ΔΣ converters can be roughly divided into the following camps: single-bit single-stage low-order designs, single-bit single-stage high-order designs, multi-stage (cascaded) designs with feedforward error cancellation, and multibit noise shapers. The systems shown in Figures 2.4 and 2.5 are in the first category as long as 1-bit quantizers are used. They have guaranteed stability [3, Chapter 3] with little restriction on input range, and their circuit design is simple. However, they cannot achieve high SNR with low-to-medium oversampling ratios.

2.3.1 Higher-Order Single-Stage ΔΣ Converters


In principle, arbitrary higher-order loop filters can be used, configured in a cascade structure like that in Figure 2.5. When a modulator has $L$ loops and is not overloaded, linear analysis [16] shows that the spectral density and in-band power of the modulation noise are

$$E'(f) = \sqrt{P_e T}\left(2\sin\frac{\omega T}{2}\right)^{L} \tag{2.15}$$

$$P_e' \approx P_e\,\frac{\pi^{2L}}{2L+1}\left(\frac{1}{\mathrm{OSR}}\right)^{2L+1} \tag{2.16}$$

This noise falls by $3(2L + 1)$ dB for every doubling of the OSR, providing $L + 1/2$ extra bits of resolution. However, the stability of the loop becomes precarious for loop filters of order $L > 2$. Linearized analysis, which results in $\mathrm{NTF}(z) = (1 - z^{-1})^L$, is not a reliable predictor of stability, since the 1-bit quantizer is a grossly nonlinear element whose equivalent gain varies abruptly with the value of its input. It is shown in [3, Chapter 4] that for guaranteed stability the equivalent quantizer gain must be high. With a fixed output amplitude, this is achieved only if the quantizer input is small. To ensure this, the maximum input amplitude must be restricted to a fairly low level, which in turn reduces the dynamic range. In practice, the loop filter has to be carefully designed, and the stability may be signal dependent. The coefficients of the filter must be obtained from discrete-time simulation rather than from a linear model. Another challenging aspect of stability analysis lies in establishing a "safe" input range, which is also related to the choice of the filter.

Figure 2.7: Chain of integrators with distributed feedback and distributed feedforward inputs. (Feedforward coefficients b_1 to b_5, feedback coefficients a_1 to a_5, five integrators 1/(z - 1).)

There are many loop topologies for building a high-pass filter for the modulation noise. One [17] uses a chain of integrators with distributed feedback and distributed feedforward inputs, as shown in Figure 2.7. This fifth-order modulator has the noise transfer function and signal transfer function

$$\mathrm{NTF}(z) = \frac{1}{1 - \dfrac{a_5}{z-1} - \dfrac{a_4}{(z-1)^2} - \cdots - \dfrac{a_1}{(z-1)^5}} = \frac{(z-1)^5}{D(z)} \tag{2.17}$$

$$\mathrm{STF}(z) = \frac{b_1 + b_2(z-1) + \cdots + b_5(z-1)^4}{D(z)} \tag{2.18}$$

where $D(z) = (z-1)^5 - a_5(z-1)^4 - \cdots - a_1$. The basic idea of adding $D(z)$ is to introduce poles that flatten the high-frequency portion of $\mathrm{NTF}(z)$, so that the quantizer gain does not become so low as to cause instability.
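The in-band noise expression of Equation 2.16 for pure differentiation noise shaping can be checked numerically. The sketch below integrates the shaped noise density over the signal band, using the two-sided density $E^2(f) = P_e/f_s$ of Equation 2.3 with $f_s$ normalized to 1, and compares the result with the closed-form approximation. The function names and the midpoint-rule integration are choices made for this illustration only.

```python
import math

def inband_noise_exact(order, osr, pe=1.0, steps=20000):
    """Integrate pe * |1 - exp(-j*2*pi*f)|^(2*order) over [-f0, f0], fs = 1."""
    f0 = 1.0 / (2.0 * osr)
    df = 2.0 * f0 / steps
    total = 0.0
    for i in range(steps):
        f = -f0 + (i + 0.5) * df                  # midpoint rule
        total += pe * (2.0 * math.sin(math.pi * f)) ** (2 * order) * df
    return total

def inband_noise_approx(order, osr, pe=1.0):
    """Closed-form approximation of Equation 2.16, valid for OSR >> 1."""
    return pe * math.pi ** (2 * order) / (2 * order + 1) / osr ** (2 * order + 1)

for order in (1, 2, 3):
    exact = inband_noise_exact(order, 64)
    approx = inband_noise_approx(order, 64)
    drop_db = 10.0 * math.log10(inband_noise_approx(order, 64)
                                / inband_noise_approx(order, 128))
    print(f"L={order}  exact={exact:.3e}  approx={approx:.3e}  "
          f"noise drop per octave={drop_db:.2f} dB")
```

The drop per octave printed for each order is 3.01(2L + 1) dB, matching the 9 dB and 15 dB figures quoted earlier for the first- and second-order loops.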

2.3.2 Multi-stage Cascade ΔΣ Converters


An alternative structure for realizing higher-order noise-shaping converters, which is free of the stability problems associated with the higher-order single-stage converters described above, is the multi-stage or cascade architecture shown in Figure 2.8. The overall ΔΣ modulator consists of a cascade of several lower-order single-loop modulators, each with its own quantizer. Each single-loop modulator in the cascade converts the quantization error of the preceding modulator. The errors of all but the last single-loop modulator are then digitally canceled. Guaranteed stability is achieved by using first- and/or second-order loops in a feedforward, as opposed to feedback, configuration. Since the design in this thesis is also a cascaded modulator, a detailed analysis will be given in the next chapter. It should be mentioned here that a major disadvantage of the cascaded structure is that exact cancellation of the error $e_1(n)$ requires accurate matching of the analog transfer functions $H_1(z)$ and $H_2(z)$ to the digital functions implemented by the DSP system. If these conditions are not exactly satisfied, then unfiltered or poorly filtered noise due to $e_1(n)$ will leak into the output data $y(n)$, and the SNR degrades sharply.

Figure 2.8: Block diagram of cascaded modulator.


2.3.3 Multi-bit ΔΣ Converters


As mentioned before, one-bit ΔΣ modulators employ a 1-bit internal DAC whose inherent linearity does not require precision component matching. This relaxed requirement on analog components is very attractive for modern VLSI technologies. However, it is easy to see from Equations 2.12 and 2.14 that employing multibit quantizers in the modulators shown in Figures 2.4 and 2.5 can increase the SNR by 6 dB per additional bit. This is because $\Delta$ in Equation 2.2, the level spacing of the quantizer, decreases by a factor of 2 per bit, so that the noise power $P_e$ decreases by a factor of 4 per bit. Equivalently, a multibit ΔΣ coder can achieve resolution comparable to that of a single-bit modulator at a lower sample rate, which is a significant advantage in applications requiring high bandwidth, such as digital video. The lower clock rate possible with multibit modulators may also decrease power consumption in the digital circuitry. Finally, a multibit quantizer is a better approximation to a linear amplifier than a single-bit one; hence, the stability properties are better and the agreement between the behavior predicted by linear theory and the actual performance is improved. Therefore, multibit ΔΣ converters are gaining popularity [18, 1, 19].

The multibit internal A/D converter must be a parallel (flash) type circuit, since stability and noise cancellation allow only one clock period for conversion. On the other hand, ADC nonlinearity merely increases the quantization noise somewhat and is suppressed by the noise-shaping process. By contrast, any nonlinearity of the internal feedback DAC directly affects the output signal. This can be seen from the analysis of the linear model shown in Figure 2.9, where $a(n)$ represents the errors caused by the deviation of the quantizer thresholds from their ideal values, i.e., the ADC nonlinearity, and $d(n)$ represents the errors due to the internal DAC nonlinearity.

Figure 2.9: Simplified linear model of a ΔΣ ADC with nonlinearity errors.

For noise shaping to occur, the gain of $H(z)$ must be large at low frequencies, as we saw in Section 2.2.3. Therefore, both $a(n)$ and $e(n)$ are reduced by this large gain when referred back to the input $x(n)$. However, $d(n)$ resides in the feedback path, so the ultimate linearity of $y(n)$ is no better than the linearity of the N-bit internal DAC. For example, 16-bit resolution of a multibit ΔΣ converter can only be achieved if the internal DAC has an accuracy of $1/2^{16} \approx 0.0015\%$. A direct realization of such a high-precision converter (e.g., with laser-trimmed resistors) is very expensive and defeats the virtue of using oversampling.

There are several alternative techniques [3, Chapter 8] for reducing the effects of DAC nonlinearity. However, a novel structure using two quantizers (Figure 2.10), proposed by Leslie and Singh [18], is easier to implement. One of the two quantizers is a single-bit circuit contained in a ΔΣ loop, which includes a single-bit DAC in the feedback path. Since this DAC plays the key role in determining the linearity of the modulator, its inherent linearity is used to full advantage. An added path with a second, multibit quantizer is used to convert and cancel the large quantization error generated by the single-bit quantizer. This scheme is conceptually similar to the cascade technique. Analysis of the system of Figure 2.10(b) gives the following expression for the output signal:

$$Y = H_1 V_1 + H_2 V_2 = \mathrm{STF}\,(H_1 + H_2)\,X + (H_1\,\mathrm{NTF} + H_2\,\mathrm{NTF} - H_2)\,E_1 + H_2 E_2$$

Here, STF and NTF are the signal and noise transfer functions, respectively, of the 1-bit loop; $E_1$ and $E_2$ are the quantization errors of the 1-bit and N-bit ADCs, respectively, in the z-domain. An appropriate choice of $H(z)$, $H_1(z)$, and $H_2(z)$ can cancel the large error $E_1$ by satisfying the condition

$$\frac{H_1}{H_2} = \frac{1}{\mathrm{NTF}} - 1 \tag{2.19}$$

It is usually advantageous to choose STF and the overall signal transfer function $\mathrm{STF}\,(H_1 + H_2)$ both as delays of $k$ clock periods, i.e., $z^{-k}$. Then the design equations become

$$H_1 = 1 - \mathrm{NTF}, \qquad H_2 = \mathrm{NTF}, \qquad Y = z^{-k}X + \mathrm{NTF}\,E_2$$

In practice, condition 2.19 cannot be exactly satisfied because of inaccuracy in the analog NTF. This inaccuracy generally makes it useless to choose a value of N larger than 3 to 5 bits.
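The cancellation condition and its sensitivity to analog inaccuracy can be illustrated numerically. The sketch below evaluates the coefficient of $E_1$ in the output expression on the unit circle, with the digital filters chosen as $H_1 = 1 - \mathrm{NTF}$ and $H_2 = \mathrm{NTF}$. Two things here are assumptions made only for illustration: the second-order NTF, and the modeling of analog inaccuracy as a leaky integrator with pole at $1 - \epsilon$.

```python
import cmath
import math

def ntf_digital(z):
    """Ideal NTF assumed by the digital filters (illustrative second order)."""
    return (1 - 1 / z) ** 2

def ntf_analog(z, eps):
    """Assumed analog NTF with a small pole error eps (leaky integrators)."""
    return (1 - (1 - eps) / z) ** 2

def e1_leakage(z, eps):
    """Coefficient of E1 in Y: H1*NTF + H2*NTF - H2, with NTF the analog one."""
    h1 = 1 - ntf_digital(z)
    h2 = ntf_digital(z)
    ntf = ntf_analog(z, eps)
    return h1 * ntf + h2 * ntf - h2

z = cmath.exp(2j * math.pi / 32)        # evaluation point at f = fs/32
for eps in (0.0, 0.001, 0.01):
    print(f"eps={eps:5.3f}  |E1 coefficient|={abs(e1_leakage(z, eps)):.2e}")
```

With a perfect match the coefficient vanishes; with a mismatch it equals the difference between the analog and digital NTFs, so the leaked first-stage error eventually dominates over the finer quantization error of the N-bit converter.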

2.4 Digital Filters for ΔΣ A/D Converters


As seen in previous sections, the techniques for ΔΣ A/D converters rely heavily on digital signal processing: digital filters follow the analog modulator to remove the shaped out-of-band noise, reduce the sampling rate to the Nyquist rate, and convert the 1-bit or several-bit data words into high-resolution sample words. The block diagram of a complete ΔΣ A/D converter system is shown in Figure 2.11. In this system, $x(n)$ and $y(n)$ run at the oversampling rate, while the decimated output $y(m)$ runs at the Nyquist rate. The process of converting a signal from a given rate to a different rate is called sampling-rate conversion. In a ΔΣ A/D converter, the process of reducing the sampling rate is called decimation, while in a ΔΣ D/A converter, the process of increasing the sampling rate is called interpolation.

Figure 2.10: (a) Leslie-Singh structure. (b) An equivalent representation.

2.4.1 Basic Principles of Decimation
The process of decimation in the digital domain can be viewed as a linear filtering operation followed by down-sampling, as shown in Figure 2.12. The input signal $x(n)$ is characterized by the sampling rate $f_s$, and the output signal $y(m)$ is characterized by $f_s/M$, where $M$ is an integer factor. The spectrum of $x(n)$, $X(\omega)$, is assumed to be nonzero in the frequency interval $0 \le |\omega| \le \pi$ or, equivalently, $|f| \le f_s/2$. To avoid aliasing [20], the bandwidth of $x(n)$ must first be reduced to $f_{max} = f_s/(2M)$ or, equivalently, $\omega_{max} = \pi/M$; then the band-limited signal $w(n)$ can be down-sampled by simply discarding $M - 1$ out of every $M$ samples to produce the output $y(m)$.

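Figure 2.11: A ΔΣ A/D converter system. (The analog input x(t), bandlimited to f_0 by an analog low-pass filter and S/H, is sampled at 2Mf_0 to give x(n); the analog delta-sigma modulator produces the 1-bit stream y(n) at 2Mf_0; the digital low-pass decimation filter reduces the rate by M to give the 16-bit PCM output w(m) at 2f_0.)

Figure 2.12: Decimation by a factor M. (The input x(n) at rate f_s is filtered by h(n) to give w(n), which the down-sampler reduces by M to give y(m) at rate f_s/M.)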


The filter is a low-pass filter characterized by the impulse response $h(n)$ and a frequency response $H_M(\omega)$, which ideally satisfies the condition

$$H_M(\omega) = \begin{cases} 1, & |\omega| \le \pi/M \\ 0, & \text{otherwise} \end{cases}$$

Thus the filter eliminates the spectrum of $X(\omega)$ in the range $\pi/M < \omega < \pi$. Of course, the implication is that only the frequency components of $x(n)$ in the range $|\omega| \le \pi/M$ are of interest in further processing of the signal. In practice, the digital filter, while specified at the high sampling rate $f_s$, is actually implemented at the low rate $f_s/M$. This is shown by the relationship

$$y(m) = w(mM) = \sum_{k=0}^{\infty} h(k)\,x(mM - k)$$

Only one out of every $M$ samples of the filter output $w(n)$ needs to be computed, i.e., the convolution of $x(n)$ with $h(n)$ is evaluated only at $n = mM$.
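The saving expressed by the relationship above can be seen by comparing a filter-then-discard decimator with one that evaluates the convolution only at the retained instants. The sketch below uses a short FIR response and integer test data chosen arbitrarily for illustration; both function names are hypothetical.

```python
def decimate_direct(x, h, M):
    """Filter at the high rate f_s, then keep every M-th sample of w(n)."""
    w = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
         for n in range(len(x))]
    return w[::M]

def decimate_efficient(x, h, M):
    """Compute y(m) = sum_k h(k) x(mM - k) only at the retained instants."""
    n_out = (len(x) + M - 1) // M
    return [sum(h[k] * x[m * M - k] for k in range(len(h)) if m * M - k >= 0)
            for m in range(n_out)]

x = list(range(1, 21))        # arbitrary test input, 20 samples
h = [1, 2, 3, 2, 1]           # arbitrary short FIR impulse response
M = 4

y_direct = decimate_direct(x, h, M)
y_fast = decimate_efficient(x, h, M)
print(y_direct)
print(y_fast)
```

Both routines return the same samples, but the second performs the multiply-accumulate work once per output sample instead of once per input sample, a reduction by the factor M.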

2.4.2 Multistage Implementation


An important criterion in the design of an A/D or D/A converter is the efficiency with which the decimator or interpolator operation can be implemented. This efficiency is directly related to the type, the order, and the architecture of the digital filter used in the implementation. In practical applications, the decimation factor $M$ is much larger than unity (i.e., $\mathrm{OSR} \gg 1$), so a single-stage implementation (i.e., only one filter to do the decimation) requires a very high-order filter and may be computationally inefficient. However, a multistage implementation, which is a cascade of $J$ decimators with $M = \prod_{i=1}^{J} M_i$ and the sampling rate at the output of the $i$-th stage $F_i = F_{i-1}/M_i$, for $i = 1, 2, \ldots, J$, can save a great deal of computation and is thus used in most ΔΣ converters.

In a multistage implementation, to ensure that no aliasing occurs in the overall decimation process, each filter stage should be designed to avoid aliasing within the frequency band of interest (e.g., from 0 to $2f_0$). Let us define the passband cutoff frequency $F_{pc} = f_0$ and the stopband cutoff frequency $F_{sc} = F_s/(2M)$ for the overall decimator (in a practical filter design, a transition band has to be allowed, i.e., $F_{sc} > F_{pc}$). Then aliasing in the band $0 \le F \le F_{sc}$ is avoided by selecting the frequency bands of each filter stage as

$$\begin{aligned} \text{passband:}\quad & 0 \le F \le F_{pc} \\ \text{transition band:}\quad & F_{pc} \le F \le F_i - F_{sc} \\ \text{stopband:}\quad & F_i - F_{sc} \le F \le F_{i-1}/2 \end{aligned} \tag{2.20}$$

The length (or order) of an FIR filter may be estimated from one of the well-known formulas given in the literature [20]:

$$N = \frac{D_\infty(\delta_p, \delta_s) - f(\delta_p, \delta_s)\,(\Delta f)^2}{\Delta f} + 1 \tag{2.21}$$

$$D_\infty(\delta_p, \delta_s) = \left[0.005309(\log\delta_p)^2 + 0.07114\log\delta_p - 0.4761\right]\log\delta_s - \left[0.00266(\log\delta_p)^2 + 0.5941\log\delta_p + 0.4278\right]$$

$$f(\delta_p, \delta_s) = 11.012 + 0.51244\,(\log\delta_p - \log\delta_s)$$

$$\Delta f = \frac{F_{sc} - F_{pc}}{F_s}$$

where $\delta_p$ and $\delta_s$ represent the passband ripple and stopband ripple, respectively. It is easy to see that in a single-stage implementation the transition bandwidth becomes small relative to $F_s$ (usually $\Delta f \ll 1$), leading to excessively large filter orders and long word-length requirements in the decimator. In a multistage implementation, by contrast, each stage's $\Delta f$, i.e., $(F_i - F_{sc} - F_{pc})/F_{i-1}$, is not so small, so a relatively low-order filter can be used. The total order of all the cascaded filters may be much smaller than the order of the single filter. An example will illustrate this point in Section 4.1.

Although, in principle, an arbitrary number of stages may be used, practical considerations sometimes lead to the conclusion that a two-stage design is best. A further careful consideration of the first-stage filter design can relax the filter requirement to a multiple-stopband filter design [3, Chapter 13], such as a comb or sinc$^K$ filter, which saves more computation than a conventional FIR design.
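The order estimate of Equation 2.21 is easy to code, which makes the single-stage versus multistage comparison concrete. In the sketch below all numerical values (sampling rate, cutoff frequencies, ripples, and the split of M = 16 into 8 and 2) are hypothetical and are not the design values of this thesis; for simplicity the same ripple specification is applied to every stage, although a real design would tighten the passband ripple per stage.

```python
import math

def d_inf(dp, ds):
    """D_infinity(delta_p, delta_s) of Equation 2.21 (base-10 logarithms)."""
    lp, ls = math.log10(dp), math.log10(ds)
    return ((0.005309 * lp ** 2 + 0.07114 * lp - 0.4761) * ls
            - (0.00266 * lp ** 2 + 0.5941 * lp + 0.4278))

def f_ripple(dp, ds):
    """f(delta_p, delta_s) of Equation 2.21."""
    return 11.012 + 0.51244 * (math.log10(dp) - math.log10(ds))

def fir_length(dp, ds, delta_f):
    """Estimated FIR length N for a normalized transition width delta_f."""
    return (d_inf(dp, ds) - f_ripple(dp, ds) * delta_f ** 2) / delta_f + 1

def stage_lengths(fs, factors, f_pc, f_sc, dp, ds):
    """Estimated length of each stage, with band edges as in Equation 2.20."""
    lengths = []
    f_in = fs
    for m_i in factors:
        f_out = f_in / m_i
        delta_f = (f_out - f_sc - f_pc) / f_in   # stage transition width
        lengths.append(fir_length(dp, ds, delta_f))
        f_in = f_out
    return lengths

# Hypothetical specification: Fs = 32, M = 16, so Fsc = Fs/(2M) = 1; Fpc = 0.8.
single = stage_lengths(32.0, [16], 0.8, 1.0, 0.01, 0.001)
two = stage_lengths(32.0, [8, 2], 0.8, 1.0, 0.01, 0.001)
print("single stage:", [round(n) for n in single])
print("two stages:  ", [round(n) for n in two], "total", round(sum(two)))
```

For this specification the two-stage cascade needs far fewer taps in total than the single filter, and the longer of the two filters runs at the reduced intermediate rate.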

2.4.3 Filter Structures


In principle, any number of classical filter design techniques can be applied to build the decimator or interpolator. However, because of the multirate and/or multistage considerations, many of these classical techniques are often ruled out in favor of designs that can take better advantage of the above multirate criteria and achieve a more effective design. These considerations often lead to the use of architectures that minimize the coefficient word lengths, eliminate the need for dedicated high-speed parallel multipliers, or reduce the memory storage requirements.

FIR and IIR Filters


FIR (Finite Impulse Response) and IIR (Infinite Impulse Response) filters are the two basic types of digital filters. In many applications, such as high-quality digital audio, linear phase is important; hence, FIR filtering is used exclusively. In voiceband telephony applications, however, linear phase is not required, so IIR filters may be used as intermediate-stage decimators because they typically require a lower order than FIR filters, at the expense of higher internal word lengths for coefficients. In my work, FIR filters are designed because linear phase is assumed to be required. In principle, the simplest realization of a decimator is the direct-form FIR structure shown in Figure 2.13(a), with system function

$$H(z) = \sum_{k=0}^{N-1} h(k)\,z^{-k} \tag{2.22}$$

where $h(k)$ is the unit sample response of the FIR filter. Any of the standard, well-known FIR filter design techniques (e.g., window methods, the Parks-McClellan algorithm [21]) may be used to obtain the coefficients.

Although the direct-form FIR filter realization is simple, it is also very inefficient. The inefficiency results from the fact that the down-sampling process keeps only one out of every $M$ samples at the output of the filter. Consequently, only one out of every $M$ possible output values needs to be computed. The logical solution is to embed the down-sampling operation within the filter, as illustrated in Figure 2.13(b). Additional reduction in computation can be achieved by exploiting the symmetry of $h(k)$, as shown in Figure 2.14.

Figure 2.13: FIR structures for decimation by a factor M. (a) Direct form. (b) Efficient direct form.

Figure 2.14: Symmetrical FIR structure for decimation by a factor M.

sinc^K Filters

The transfer function of a sinc$^K$ decimation filter has the general form

$$H(z) = \left[\frac{1}{M}\sum_{i=0}^{M-1} z^{-i}\right]^{K} = \left[\frac{1}{M}\,\frac{1 - z^{-M}}{1 - z^{-1}}\right]^{K} \tag{2.23}$$

and its frequency response is therefore

$$|H(e^{j\omega})| = \left|\frac{1}{M}\,\frac{\sin(\omega M/2)}{\sin(\omega/2)}\right|^{K} = \left|\frac{\mathrm{sinc}(\omega M/2)}{\mathrm{sinc}(\omega/2)}\right|^{K} \tag{2.24}$$

where $\omega = 2\pi f/f_s$. It has $M/2$ spectral zeros if $M$ is even, or $\lceil M/2 \rceil - 1$ zeros if $M$ is odd, at frequencies that are multiples of the decimated sampling frequency $\omega_d$. These zeros correspond to multiple notches, or stopbands, which are wide enough to cover the required stopband widths of the decimator specification mentioned in Section 2.4.2. Therefore, this class of filters can be applied to the first stage of a multistage decimator. Note that the sinc$^K$ filter is actually a cascade of $K$ averaging filters and that its impulse response is finite, so it is an FIR filter. In addition, its impulse response coefficients are symmetric (in fact, those of each averaging stage are all equal to unity, apart from the $1/M$ scaling), so that no multiplications need to be implemented in hardware, and it is also a linear-phase filter. A sinc$^4$ filter design will be presented in Chapter 4.
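The properties claimed for the sinc$^K$ filter can be checked with a short script that builds the impulse response as a cascade of $K$ length-$M$ averaging filters and compares its spectrum with Equation 2.24. This is a sketch for illustration only; the values of M and K and the function names are arbitrary choices.

```python
import cmath
import math

def sinc_k_response(f_rel, M, K):
    """|H(e^{jw})| of Equation 2.24, with w = 2*pi*f_rel and f_rel = f/fs."""
    w = 2.0 * math.pi * f_rel
    if abs(math.sin(w / 2.0)) < 1e-12:
        return 1.0                      # limiting value at multiples of fs
    return abs(math.sin(w * M / 2.0) / (M * math.sin(w / 2.0))) ** K

def sinc_k_impulse(M, K):
    """Unscaled impulse response: K cascaded length-M moving sums."""
    h = [1]
    for _ in range(K):
        h = [sum(h[j] for j in range(len(h)) if 0 <= n - j < M)
             for n in range(len(h) + M - 1)]
    return h

h = sinc_k_impulse(4, 2)
print("impulse response (M=4, K=2):", h)

# Compare the spectrum of the impulse response with the closed form.
w = 2.0 * math.pi * 0.03
spectrum = abs(sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(h))) / 4 ** 2
print(spectrum, sinc_k_response(0.03, 4, 2))
```

The coefficients are small integers and symmetric, and the response is null at multiples of the decimated rate f_s/M, which are exactly the bands that would alias onto the baseband after down-sampling.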

Half-Band Filters
Half-band filters are characterized by the constraints that their passband and stopband ripples are the same (i.e., $\delta_p = \delta_s$) and that the cutoff frequencies are symmetrical around $\pi/2$ such that

$$\omega_p + \omega_s = \pi \tag{2.25}$$

The frequency response of this class of filters exhibits odd symmetry around $\pi/2$, and the impulse response $h(n)$ is zero for all even values of $n$ except $n = 0$, i.e., almost half of the filter's coefficients are zero. Therefore, these filters can be implemented with about half the number of multiplications needed for an arbitrary FIR filter design. They are appropriate for sampling-rate conversion ratios of 2:1 [22]. From Equation 2.20, we may see that if half-band filters are used as the intermediate decimators, aliasing will occur in the final transition band from $F_{pc}$ to $F_{sc}$. This transition-band aliasing is a very useful and important technique that leads to large savings in filter complexity, especially in the last stage of the decimator, as we will see from the design in Chapter 4.
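The zero-coefficient property of half-band filters can be seen in a simple windowed-sinc example. This is not the design method used in this thesis; the Hamming window, the filter length, and the function names are assumptions made only to produce a small, checkable instance.

```python
import math

def halfband_taps(half_len):
    """Windowed-sinc half-band taps h(n), n = -half_len..half_len (Hamming)."""
    taps = []
    for n in range(-half_len, half_len + 1):
        ideal = 0.5 if n == 0 else math.sin(math.pi * n / 2.0) / (math.pi * n)
        window = 0.54 + 0.46 * math.cos(math.pi * n / half_len)
        taps.append(ideal * window)
    return taps

def zero_phase_response(taps, w):
    """A(w) = sum_n h(n) cos(w n) for taps centered on n = 0."""
    half_len = (len(taps) - 1) // 2
    return sum(c * math.cos(w * (i - half_len)) for i, c in enumerate(taps))

taps = halfband_taps(7)
for n, c in zip(range(-7, 8), taps):
    print(f"h({n:2d}) = {c: .6f}")

w = 0.3 * math.pi
print(zero_phase_response(taps, w) + zero_phase_response(taps, math.pi - w))
```

Every even-indexed tap other than h(0) is zero to within rounding, so only the odd-indexed taps require multipliers, and the responses at w and at pi - w sum to unity, which is the odd symmetry about pi/2 described above.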


Chapter 3 ANALOG MODULATOR DESIGN


In commercial applications, oversampling ΔΣ converters have been successfully used for high-resolution signal acquisition in voice-band, digital audio, and ISDN fields, which have signal bands of less than 100 kHz. The next significant telecommunication application for ΔΣ converters will be digital video, with bandwidths of less than 10 MHz. It is obvious that oversampling ratios in the range of 64 to 512, typically used in oversampling converters, are impractical for such high-speed applications. Alternative modulator structures are needed that work at low oversampling ratios to achieve moderate (10 to 12 bit) resolution while maintaining the relative simplicity of the analog circuit design. Dr. Brian Brandt [1], [3, Chapter 7] has implemented such a structure for 12-bit, 2.1-MHz conversion with OSR = 24 in a 1-μm CMOS technology, which is so far the fastest ΔΣ ADC. In my thesis work, an equivalent modulator is re-designed in a 0.8-μm CMOS technology with a few modifications of the analog circuits.

3.1 A Cascaded Multibit ΔΣ Modulator


The goal of this modulator design is to reduce the oversampling ratio to as low as 16 while maintaining a dynamic range of 74 dB or, equivalently, a resolution of 12 bits, for conversion rates above 1 MHz.

3.1.1 ΔΣ Modulation at Low Oversampling Ratios


The dynamic range of a ΔΣ modulator employing pure differentiation noise transfer functions depends on the oversampling ratio OSR, the order of noise shaping $L$, and the internal quantizer resolution $N$, according to [23]

$$\mathrm{DR} = \frac{3}{2}\,\frac{2L+1}{\pi^{2L}}\,(2^N - 1)^2\,\mathrm{OSR}^{2L+1} \tag{3.1}$$

Figure 3.1: Block diagram of a third-order cascaded multibit ΔΣ A/D converter. (a) Modulator structure: a second-order stage with a 1-bit quantizer followed by a first-order stage with an N-bit quantizer, combined by the error cancellation logic. (b) Linear model with error sources E_1(z), E_2(z), and E_D(z).

At a given OSR, the dynamic range may be extended by increasing $L$ or by increasing $N$. Because of the term $\mathrm{OSR}^{2L+1}$ in Equation 3.1, the effectiveness of increasing the order of noise shaping diminishes significantly as the oversampling ratio is reduced. In contrast, the effectiveness of increasing the quantizer resolution is independent of the oversampling ratio and is therefore particularly attractive for applications at low oversampling ratios. For example, when a 4-bit quantizer is used instead of a 1-bit quantizer, the dynamic range increases by nearly $20\log(2^4 - 1) \approx 24$ dB. However, as mentioned in Section 2.3.3, the linearity and resolution of a modulator based on a multibit quantizer are limited by the precision of the multibit D/A converter. To reduce this dependence on DAC linearity, several methods [3, Chapter 8], including element swapping and digital calibration, have been proposed. An alternative method is described here.
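The trade-off between noise-shaping order and quantizer resolution in Equation 3.1 can be explored numerically. The sketch below only evaluates the formula (with base-10 logarithms for the dB conversion); the parameter combinations printed are illustrative and the function name is hypothetical.

```python
import math

def dynamic_range_db(order, n_bits, osr):
    """Dynamic range of Equation 3.1, expressed in dB."""
    dr = (1.5 * (2 * order + 1) / math.pi ** (2 * order)
          * (2 ** n_bits - 1) ** 2 * osr ** (2 * order + 1))
    return 10.0 * math.log10(dr)

for osr in (16, 64):
    gain_order = dynamic_range_db(3, 1, osr) - dynamic_range_db(2, 1, osr)
    gain_bits = dynamic_range_db(2, 4, osr) - dynamic_range_db(2, 1, osr)
    print(f"OSR={osr:3d}  L: 2 -> 3 gives {gain_order:5.1f} dB,  "
          f"N: 1 -> 4 gives {gain_bits:5.1f} dB")
```

The gain from a higher order shrinks as the OSR is lowered, whereas the gain from a 4-bit quantizer stays at 20 log(15), about 23.5 dB, at every oversampling ratio.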

3.1.2 Linear Analysis of the Modulator


The cascaded multibit modulator shown in Figure 3.1(a) avoids sensitivity to the DAC precision by placing the multibit quantizer in the final stage of a third-order cascaded modulator. The more critical first-stage quantizer has only two analog output levels and is therefore inherently linear. The modulator consists of a second-order stage with a 1-bit quantizer followed by a first-order stage with a multibit quantizer. The input to the second stage is the difference between the output and the input of the first-stage quantizer, i.e., the quantization error of the first stage. Shown in Figure 3.1(b) is a linear approximation of this modulator, wherein the quantizers are modeled by signal-independent additive error sources, while the integrators are represented by their transfer functions in the z-domain. The z-transform of the output of the first stage is

$$Y_1(z) = z^{-1}X(z) + (1 - z^{-1})^2 E_1(z) \tag{3.2}$$

where $E_1(z)$ models the quantization noise of the 1-bit quantizer. The input to the second stage is $E_1(z)$, and the transform of the second-stage output is

$$Y_2(z) = z^{-1}\left[E_1(z) - E_D(z)\right] + (1 - z^{-1})\,E_2(z) \tag{3.3}$$

where $E_2(z)$ models the quantization noise of the N-bit quantizer and $E_D(z)$ models the nonlinearity error of the N-bit DAC. The digital error cancellation logic combines the outputs from the two stages according to

$$Y(z) = z^{-1}Y_1(z) - (1 - z^{-1})^2\,Y_2(z) \tag{3.4}$$

so as to cancel the quantization error $E_1(z)$ of the first stage in the final output:

$$Y(z) = z^{-2}X(z) + z^{-1}(1 - z^{-1})^2 E_D(z) - (1 - z^{-1})^3 E_2(z) \tag{3.5}$$

Thus, ideally the quantization error of the first stage is canceled and the quantization error of the second stage is attenuated in the passband by third-order shaping. Also, because $E_2(z)$ originates from a multibit quantizer, the dynamic range is improved according to Equation 3.1. More importantly, the nonlinearity error $E_D(z)$ is attenuated in the passband by second-order shaping, so that the cascaded multibit modulator is much more tolerant of DAC nonlinearity than the single-stage modulator. Simulation shows that, to achieve 12-bit dynamic range, this modulator requires a DAC accuracy of only 6 bits (i.e., about 1.5% accuracy).
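The algebra leading from Equations 3.2 to 3.4 to Equation 3.5 can be verified by evaluating the linear model at a point on the unit circle. In the sketch below, the transform values assigned to X, E_1, E_2, and E_D are arbitrary complex numbers chosen only to exercise the expressions, and the function name is hypothetical.

```python
import cmath
import math

def cascade_output(z, X, E1, E2, ED):
    """Output of the ideal linear model of the cascaded modulator."""
    zi = 1 / z
    y1 = zi * X + (1 - zi) ** 2 * E1           # Equation 3.2
    y2 = zi * (E1 - ED) + (1 - zi) * E2        # Equation 3.3
    return zi * y1 - (1 - zi) ** 2 * y2        # Equation 3.4

z = cmath.exp(2j * math.pi / 32)               # evaluation point at f = fs/32
X, E2, ED = 1.0, 0.3 - 0.1j, 0.05j             # arbitrary illustrative values

y_a = cascade_output(z, X, 0.0, E2, ED)        # no first-stage error
y_b = cascade_output(z, X, 5.0 + 2.0j, E2, ED) # large first-stage error
expected = z ** -2 * X + z ** -1 * (1 - 1 / z) ** 2 * ED - (1 - 1 / z) ** 3 * E2

print(abs(y_a - y_b), abs(y_a - expected))
print("ED gain:", abs((1 - 1 / z) ** 2), " E2 gain:", abs((1 - 1 / z) ** 3))
```

The output is the same with and without the first-stage error, confirming the ideal cancellation, and the printed gains show the second-order attenuation of the DAC error and the third-order attenuation of the second-stage quantization error at this in-band frequency.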

3.2 Modified Modulator with Interstage Coupling


In a practical design of the cascaded modulator, some subunity gain factors have to be added to the integrators in Figure 3.1 in order to avoid overload at the second stage and to maintain the largest signal range allowed by the power supplies. Also, trade-offs have to be considered in determining the resolution $N$ of the multibit quantizer and in simplifying the circuit design. Brandt experimented with several alternative choices and concluded that the measured difference in dynamic range among them was negligible [1]. Therefore, the choice with the simplest circuit design is used in this thesis work.

Figure 3.2: A cascaded multibit ΔΣ ADC with interstage coupling. (Interstage coupling coefficients 0 and 1/4; error cancellation logic Y = z^{-1}Y_1 - (1 - z^{-1})^2 Y_{2p}, with Y_{2p} = 4Y_2 + z^{-1}Y_1.)

3.2.1 Interstage Coupling


It is preferable that both stages in the modulator have the same input range, which is defined by the two levels of the 1-bit DAC or by the two outermost levels of the N-bit DAC, in order to utilize the maximum signal swing allowed by the power supplies and also to reduce the number of required voltage references. If unity gains are used for the summation nodes in Figure 3.1(b), however, the large amplitude of the quantization error produced by the second-order first stage would overload the input range of the second stage, and therefore the input range of the first stage would have to be reduced. To reduce the signal range at the input to the second stage while keeping the same input range for both stages, interstage coupling coefficients should be added around the interstage summation node. An empirical selection of the interstage coupling coefficients results in the modified version of the modulator shown in Figure 3.2. In this configuration, the interstage subtraction node is actually not needed at all because of the coefficient 0, so the complexity of the modulator is reduced. The gain coefficient 1/4 reduces the output of the second integrator to one-fourth to avoid overloading the input range of the second stage. Because of these two coefficients, $Y_2(z)$ in Equation 3.4 must be replaced by

$$Y_{2p}(z) = 4Y_2(z) + z^{-1}Y_1(z) \tag{3.6}$$

so as to prevent the overall output $Y(z)$ from containing $E_1(z)$. The trade-offs concerning the selection of appropriate coupling coefficients are presented in [1, 24].

3.2.2 Multibit Quantizer

In principle, the multibit quantizer resolution $N$ can be increased indefinitely to improve the dynamic range. However, in practical implementations, increasing the resolution of the multibit quantizer reduces the second-stage quantization noise and thereby increases the sensitivity to uncanceled quantization noise from the first stage. The cancellation of $E_1(z)$ depends on the noise shaping performed in the first stage precisely matching the shaping provided by the error cancellation logic. In practice, the first-stage noise shaping deviates from $(1 - z^{-1})^2$ because of circuit nonidealities, such as gain errors from capacitor mismatch and finite dc gain in the opamps, so that $E_1(z)$ cannot be totally canceled. Thus, increasing the resolution $N$ of the multibit quantizer in the second stage increases the sensitivity to gain errors. At increasing levels of gain error, the performance of the modulator becomes dominated by uncanceled first-stage noise, and less benefit is derived from increasing the second-stage quantizer resolution. The resolution can be tailored to the expected capacitor matching of the fabrication process. For a moderate 2% gain error margin, the benefit of a 4-bit quantizer over a 3-bit quantizer does not justify doubling the size of the second-stage quantizer. Hence, a 3-bit quantizer was chosen for the second stage.

3.2.3 Modified Integrators


Figure 3.3 shows the approximate structure of the practical modulator, where a 3-bit quantizer is used in the second stage. Two modifications of the modulator in Figure 3.1b are evident in the practical structure. First, both integrators in the first stage include delays in their forward paths, as well as gain factors of one-half at their inputs. Thus, the transfer function of both first-stage integrators is

$$H(z) = \frac{1}{2}\,\frac{z^{-1}}{1 - z^{-1}} \qquad (3.7)$$

An extraction of the modified architecture from the ideal structure results in a configuration with an attenuation of 0.5 preceding the first integrator and a gain of 2 at the input of the second integrator [25]. However, since the second integrator is followed immediately by a single-threshold 1-bit quantizer, its gain can be adjusted arbitrarily without impairing the performance of the modulator. Hence, the output of the first stage given in Equation 3.2 is changed only slightly to include an additional delay $z^{-1}$ preceding $X(z)$. Moreover, simulations reveal that this configuration reduces the signal range required at the outputs of the first-stage integrators to about 1.7 times the modulator's input range.

Figure 3.3: Modified structure of the third-order cascaded multibit ΔΣ ADC. Note that $Y_{2p}$ is defined in Equation 3.6.
[Block diagram: a first stage with two integrators $\frac{1}{2}\frac{z^{-1}}{1 - z^{-1}}$, a 1-bit ADC, and a 1-bit DAC producing $Y_1(z)$; a second stage with one integrator, a 3-bit ADC, and a 3-bit DAC producing $Y_2(z)$; error cancellation logic $Y = z^{-1}Y_1 - (1 - z^{-1})^2 Y_{2p}$.]

The second modification present in Figure 3.3 is that the input to the second stage is simply the output of the second integrator in the first stage. The combination of this simple interstage coupling and the 1/2 gain factors in the first-stage integrators implements the configuration shown in Figure 3.2. In practical implementations, the modulator's performance may be degraded by integrator leakage resulting from the finite dc gain of the operational amplifiers, which introduces a subunity factor for $z^{-1}$ in the denominator of Equation 3.7. Simulations show that approximately 60 dB of dc gain is required to prevent performance degradation and maintain a 12-bit dynamic range.
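The effect of integrator leakage on Equation 3.7 can be illustrated with a short behavioral sketch. It assumes the leaky integrator is modeled as $H(z) = \frac{1}{2} z^{-1} / (1 - \alpha z^{-1})$ and uses the common first-order approximation $\alpha \approx 1 - (C_s/C_f)/A_0$ for an opamp with dc gain $A_0$. Both the approximation and the function names are assumptions made here for illustration; they are not taken from the thesis.

```python
def leaky_integrator(x, alpha=1.0, gain=0.5):
    """Delaying integrator y[n] = alpha*y[n-1] + gain*x[n-1].

    alpha = 1 reproduces Equation 3.7; alpha < 1 models leakage.
    """
    y = [0.0] * len(x)
    for n in range(1, len(x)):
        y[n] = alpha * y[n - 1] + gain * x[n - 1]
    return y


def pole_from_opamp_gain(a0, cs_over_cf=0.5):
    """First-order estimate of the pole location for opamp dc gain a0 (assumed model)."""
    return 1.0 - cs_over_cf / a0


def dc_gain(alpha, gain=0.5):
    """DC gain of the leaky integrator; only finite when alpha < 1."""
    return gain / (1.0 - alpha)


if __name__ == "__main__":
    alpha = pole_from_opamp_gain(1000.0)   # 60 dB of opamp dc gain
    print("pole location:", alpha)
    print("integrator dc gain:", dc_gain(alpha))
    print("ideal ramp after 10 samples:", leaky_integrator([1.0] * 11)[10])
```

Under this model a 60-dB opamp limits the integrator's own dc gain to roughly 1000 instead of infinity, which is the mechanism by which leakage lets low-frequency quantization noise through.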

Figure 3.4: Top-level schematic of the modulator.

3.3 VLSI Implementation

The analog modulator is implemented in a 0.8-µm CMOS technology supplied by MOSIS (see Appendix A) through full-custom design methods. The performance objective is a Nyquist conversion rate of 2.1 MHz and a dynamic range of 12 bits while operating with a 50-MHz clock and from a single 5-V power supply. This section discusses the design issues for all of the building blocks in the modulator. All of the circuit diagrams were generated with Design Architect, the schematic capture tool in the Mentor Graphics EDA system installed in the VLSI Lab.
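The performance targets can be related to each other with two lines of arithmetic. The sketch below assumes the oversampling ratio of 24 used for the decimator in Chapter 4 and the textbook estimate of 6.02N + 1.76 dB for the dynamic range of an ideal N-bit converter; the function names are illustrative.

```python
def nyquist_rate_hz(f_clk_hz, oversampling_ratio):
    """Output (Nyquist) conversion rate for a given clock and oversampling ratio."""
    return f_clk_hz / oversampling_ratio


def ideal_dynamic_range_db(bits):
    """Textbook dynamic range of an ideal N-bit quantizer with a sine-wave input."""
    return 6.02 * bits + 1.76


if __name__ == "__main__":
    print("Nyquist rate (MHz):", nyquist_rate_hz(50e6, 24) / 1e6)
    print("12-bit dynamic range (dB):", ideal_dynamic_range_db(12))
```

With these assumptions the 50-MHz clock gives an output rate of about 2.08 MHz, and a 12-bit dynamic range corresponds to roughly 74 dB.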

3.3.1 Top-Level Structure of the Modulator


Figure 3.4 shows the topmost-level structure of the whole analog circuit and its digital interface. The inputs consist of the two differential signals X+ and X-, the clock signal CLK, the input reference voltages VREF+ and VREF-, the common-mode input voltage VCMI, and the common-mode output voltage VCMO for the fully differential operational amplifiers used in the A-to-D block. The outputs are the 1-bit quantized data Y1 and the 3-bit quantized data Y2[2:0]. The BIAS block generates four bias voltages used by the opamps, and the CLOCK GENERATOR block generates the clocks for the switched-capacitor circuits. The PRIORITY ENCODER block is actually a simple 8-to-3 encoder, which is generated in VHDL (see Appendix B.1) by LogicLib, a VHDL library in Design Architect.

Figure 3.5: The A-to-D block in Figure 3.4.

Figure 3.6: A parasitic-insensitive switched-capacitor integrator.
[Schematic: input Vin, sampling capacitor Cs, feedback capacitor Cf, switches P1 to P4, and output Vout.]

The A-to-D block has almost the same topology as that in Figure 3.3, as may be seen in Figure 3.5. The first stage in Figure 3.5 consists of two integrators (INTE1 blocks) with gain factors of 1/2 and a comparator (QUANTIZER 1B block) that serves as the 1-bit ADC. The 1-bit DAC and the gain factor are actually implemented inside the INTE1 blocks. The second stage consists of a single integrator with unity gain (INTE2 block) and a 3-bit quantizer (QUANTIZER 3B block), which serves as both the 3-bit ADC and the 3-bit DAC. The QUANTIZER 3B block generates 8-bit data words, while the QUANTIZER 1B block generates complementary 1-bit outputs.
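The role of the PRIORITY ENCODER block, which turns the 8-bit word from the QUANTIZER 3B block into the 3-bit output Y2[2:0], can be sketched behaviorally as follows. This is only an illustration in Python, not the VHDL of Appendix B.1; the bit ordering (index i standing for Y(i)) and the handling of an all-zero word are assumptions.

```python
def priority_encode_8to3(word):
    """Encode an 8-bit word into 3 bits, giving priority to the highest set bit.

    word[i] stands for bit Y(i) of the quantizer output (assumed ordering).
    Returns the tuple (b2, b1, b0).
    """
    if len(word) != 8:
        raise ValueError("expected an 8-bit word")
    for level in range(7, -1, -1):
        if word[level]:
            return ((level >> 2) & 1, (level >> 1) & 1, level & 1)
    # No bit set: reported as level 0 (an assumption, not from the thesis).
    return (0, 0, 0)


if __name__ == "__main__":
    one_hot = [0, 0, 0, 0, 0, 1, 0, 0]   # 1-out-of-8 code with Y(5) high
    print(priority_encode_8to3(one_hot))
```

For a valid 1-out-of-8 word the priority rule never matters; it only decides the result if more than one bit is high at once.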

3.3.2 Switched-Capacitor Integrator


The integrators used in ΔΣ modulators are usually switched-capacitor (SC) circuits because of their compatibility with VLSI CMOS processes and their accurate frequency responses, which are determined by capacitance ratios. Before the SC integrators in Figure 3.5 are analyzed, a brief review of the SC integrator is given. An SC circuit is realized with basic building blocks such as opamps, capacitors, switches, and nonoverlapping clocks. Figure 3.6 shows a simple realization of a parasitic-insensitive single-ended integrator [5, Chapter 10], which consists of a single-ended opamp, a sampling capacitor $C_s$, a feedback capacitor $C_f$, and four switches controlled by the clock signals P1 to P4. The four clock signals are grouped into two nonoverlapping clocks. There are two categories of SC integrators. In the first case, P1 and P4 as well as P2 and P3 are identical, so the integrator has the noninverting delaying transfer function

$$H(z) \equiv \frac{V_{out}(z)}{V_{in}(z)} = \frac{C_s}{C_f}\,\frac{z^{-1}}{1 - z^{-1}} \qquad (3.8)$$

Figure 3.7: The CLOCK GENERATOR block in Figure 3.4.

In the second case, P1 and P2 as well as P3 and P4 are identical, so the integrator has the inverting delay-free transfer function

$$H(z) \equiv \frac{V_{out}(z)}{V_{in}(z)} = -\frac{C_s}{C_f}\,\frac{1}{1 - z^{-1}} \qquad (3.9)$$
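The two transfer functions translate directly into difference equations, which makes the delay and the sign easy to see. The sketch below assumes ideal components and zero initial charge; the function name and its arguments are illustrative.

```python
def sc_integrate(vin, gain=0.5, delaying=True):
    """Ideal switched-capacitor integrator with gain = Cs/Cf.

    delaying=True  implements Equation 3.8: vout[n] = vout[n-1] + gain*vin[n-1]
    delaying=False implements Equation 3.9: vout[n] = vout[n-1] - gain*vin[n]
    """
    vout = []
    prev_out = 0.0
    prev_in = 0.0
    for sample in vin:
        if delaying:
            current = prev_out + gain * prev_in   # noninverting, one-sample delay
        else:
            current = prev_out - gain * sample    # inverting, delay-free
        vout.append(current)
        prev_out = current
        prev_in = sample
    return vout


if __name__ == "__main__":
    impulse = [1.0, 0.0, 0.0, 0.0]
    print("Eq. 3.8:", sc_integrate(impulse, delaying=True))
    print("Eq. 3.9:", sc_integrate(impulse, delaying=False))
```

An impulse at the input therefore appears one sample later with positive sign in the first case and immediately with negative sign in the second.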

Note that the gain factors of both transfer functions are set by the ratio of $C_s$ to $C_f$, which can be precisely controlled in a VLSI CMOS process. The bottom plate of each capacitor depicted in Figure 3.6 is designated by a folded line. This distinction is important because integrated capacitors are generally not symmetrical, and there is a larger parasitic capacitance to the substrate from the bottom plate than from the top plate. The capacitors should be connected such that the bottom plate is driven, either directly or through a switch, by a voltage source or by the output of the opamp. This arrangement causes the parasitic capacitances to have the least effect on the operation of the circuit. Substrate noise coupling is also reduced by this arrangement. To reduce charge-injection effects in SC circuits, only n-channel devices should be used for the switches connected to ground or virtual ground, and the clocks should be arranged so as to turn off P2 and P4, which are near the virtual ground node of the opamp, slightly ahead of their counterparts. A simple scheme [26] is used to generate such a two-phase nonoverlapping clock, consisting of a sampling phase and an integration phase, as shown in Figure 3.7. The outputs C1 and C2 are a pair of nonoverlapping clocks, while the outputs C1A and C2A are advanced slightly, by two inverter delays, relative to C1 and C2, respectively. The outputs C1N and C2N are the complements of C1 and C2, respectively.

Figure 3.8: The INTE1 block in Figure 3.5.

Now we are ready to present the integrator design at the transistor level, as shown in Figure 3.8. This integrator [27] has a gain factor of 0.5 defined by the ratio of the sampling capacitance (200 fF) to the feedback capacitance (400 fF). The only difference between the INTE2 and INTE1 blocks is that the feedback capacitance in INTE2 is also 200 fF. Note that the bottom plate of the capacitors is denoted with a + symbol. In Figure 3.8, fully differential circuitry is used because it has superior power supply noise rejection compared to a single-ended design and also provides twice the output swing for a given supply voltage [3, Chapter 11]. The OPAMP block and the CMFB block form a fully differential operational amplifier, which will be discussed later. The common-mode output voltage of the opamp is determined by VCMO, while the common-mode input voltage is defined by VCMI. These two voltages can be different. All of the switches, except for the two switches directly connected to the inputs VIN+ and VIN-, are realized by minimum-gate-length NMOS FETs (W/L = 1.6 µm/0.8 µm) so as to minimize parasitics while maintaining good conduction for a low enough analog ground VCMI. However, full CMOS transmission gates are used for the switches coupled to the signal input, since they must conduct over a wide range of voltages.

The INTE1 block realizes both an integrator and a 1-bit DAC. To understand this, first consider only the signal path from VIN+ to the input of the OPAMP block. This signal path has the same configuration as that shown in Figure 3.6 in the case that P1 and P4 correspond to the identical clocks C1 and C1A while P2 and P3 correspond to the identical clocks C2 and C2A. Therefore, the transfer function of this integrator is defined by Equation 3.8. Recalling the discussion of Equations 3.8 and 3.9, it is easy to see that the upper VREF path is noninverting while the lower VREF path is inverting, which may correspond to positive VREF and negative VREF, respectively. During phase C1, the top capacitor charges to VREF while the lower capacitor charges to ground. During phase C2, the inputs X and XN shown in Figure 3.8, which are the complementary digital outputs of a comparator (i.e., a 1-bit ADC), decide which summing junction is to receive a positive reference charge and which is to receive a negative reference charge. If X is high and XN is low, then a reference charge is taken from the lower summing junction, while a reference charge is delivered to the upper summing junction. If X is low, the charge delivery is swapped. This circuit realizes the 1-bit DAC while avoiding modulation of the reference voltage [3, Chapter 11].
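At the behavioral level, the two INTE1 blocks and the 1-bit comparator form the second-order first stage of Figure 3.3, and the combined integrator/DAC action can be sketched in a few lines. The model below is idealized: signals are normalized to a reference of ±1, the integrators follow Equation 3.7 exactly, and the comparator has no offset or hysteresis. These normalizations and all names are assumptions made for illustration, not a description of the transistor-level circuit.

```python
def first_stage(x, vref=1.0):
    """Ideal second-order 1-bit modulator with two delaying integrators of gain 1/2.

    x    : input samples, normalized so that full scale is +/- vref
    Returns (bits, peak), where bits are the +1/-1 decisions and peak is the
    largest integrator output magnitude seen during the run.
    """
    u1 = 0.0   # output of the first integrator
    u2 = 0.0   # output of the second integrator
    bits = []
    peak = 0.0
    for sample in x:
        feedback = vref if u2 >= 0.0 else -vref      # 1-bit ADC followed by 1-bit DAC
        bits.append(1 if feedback > 0 else -1)
        # Both integrators update from the previous state (delaying integrators).
        u1, u2 = u1 + 0.5 * (sample - feedback), u2 + 0.5 * (u1 - feedback)
        peak = max(peak, abs(u1), abs(u2))
    return bits, peak


if __name__ == "__main__":
    bits, peak = first_stage([0.3] * 4096)
    print("mean of the bit stream:", sum(bits) / len(bits))
    print("largest integrator output:", peak)
```

For a dc input the running mean of the bit stream tracks the input, which is the defining property of the feedback loop; the peak value gives a rough feel for the integrator swing that the circuit must accommodate.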

3.3.3 Wide-Swing Constant-Transconductance Bias Circuit


The first problem that I encountered in the opamp design was to determine proper bias voltages to make the opamp work in the region where moderate gain can be achieved. In Brandt's original design, no bias circuitry was designed and conventional cascode current mirrors were used. In this thesis work, a constant-transconductance bias circuit is designed that exploits wide-swing cascode current mirrors [5, Chapter 6], as shown in Figure 3.9. A conventional cascode current mirror is shown in Figure 3.10a. Although its output impedance is increased to $r_{ds4}\,r_{ds2}\,g_{m4}$, where $g_{m4}$ is the transconductance of transistor Q4 and $r_{ds}$ is the drain-source resistance of a transistor, a cascode current mirror reduces the maximum output-signal swing: the minimum allowed voltage for $V_{out}$ is $V_{tn}$ (the threshold voltage of an NMOS FET, approximately 0.7 V) greater than $2V_{eff}$, where $V_{eff}$ is the minimum drain-source voltage (typically around 0.2–0.25 V) needed to keep a transistor working in the saturation region. This loss of signal swing is a serious disadvantage for modern VLSI technologies.

Figure 3.9: The BIAS block in Figure 3.4.

An alternative circuit that does not reduce the signal swing so much while keeping a high output impedance, often called the wide-swing cascode current mirror, is shown in Figure 3.10b. The minimum allowable output voltage only needs to be greater than $(n+1)V_{eff}$. With n chosen as unity, this current mirror can guarantee that all of the transistors are in the saturation region even when $V_{out}$ drops to as little as 0.4–0.5 V. In practical designs, the lengths of Q2 and Q3 should be minimized in order to maximize the frequency response, as their gate-source capacitances are the most significant capacitances contributing to high-frequency poles. However, Q1 and Q4 should be chosen to have longer gate lengths, typically twice the minimum channel length, to reduce detrimental short-channel effects. Also, the aspect ratio of Q5, $(W/L)_5$, should be smaller than the size given in Figure 3.10b in order to offset the body effect of Q1 and Q4. All of these practical considerations have been taken into account in the bias circuit design shown in Figure 3.9.

Figure 3.10: Conventional (a) and wide-swing (b) cascode current mirrors. $I_{bias}$ typically is set to the nominal or maximum input current, $I_{in}$.
[Schematic labels: in (b) the aspect ratios are $(W/L)/(n+1)^2$ for Q5, $(W/L)/n^2$ for Q3 and Q4, and $W/L$ for Q1 and Q2; $I_{out} = I_{in}$ in both panels.]

In Figure 3.9, the width and length of all the transistors are denoted in units of lambda. For instance, Q1 has a length of 4 lambda and a width of 25 lambda. Table 3.1 lists the W/L ratios (µm/µm) of all the transistors for a MOSIS 0.8-µm process. The n-channel wide-swing cascode current mirror consists of transistors Q1–Q4, along with the diode-connected biasing transistor Q5. The output current comes from Q1, and the bias current is actually derived from the bias loop via Q10 and Q11. Similarly, the p-channel wide-swing cascode current mirror is realized by Q6–Q9, along with the diode-connected Q14, which has a bias current derived from the bias loop via Q12 and Q13. The output current is the drain current of Q6.

Table 3.1: Transistors in the Bias Circuit.

Transistor                  W/L (µm/µm)
Q8, Q7, Q11, Q21            24/0.8
Q9, Q6, Q10, Q20            24/1.6
Q3, Q12, Q15, Q16, Q17      10/0.8
Q1, Q4, Q13                 10/1.6
Q2                          40/0.8
Q14                         6/1.6
Q5                          2.4/1.6
Q19                         4.8/1.6
Q18                         2/20

Transistor transconductances are perhaps the most important parameters in opamps that must be stabilized. This stabilization can be achieved by using the resistor $R_B$ and setting $(W/L)_2 = 4(W/L)_3$ as shown in Figure 3.9, so that the transistor transconductances are matched to the conductance of $R_B$. Specifically, the transconductance of Q3 is set to

$$g_{m3} = \frac{1}{R_B} \qquad (3.10)$$

Since all transistor currents, including those in the opamps, are derived from the same biasing network and the ratios of the currents depend mainly on geometry, all other transconductances are also stabilized, so that for all n-channel transistors

$$g_{mi} = \sqrt{\frac{(W/L)_i\,I_{Di}}{(W/L)_3\,I_{D3}}}\; g_{m3} \qquad (3.11)$$

and for all p-channel transistors

$$g_{mi} = \sqrt{\frac{\mu_p}{\mu_n}\,\frac{(W/L)_i\,I_{Di}}{(W/L)_3\,I_{D3}}}\; g_{m3} \qquad (3.12)$$

Therefore, this kind of bias circuit is called a constant-transconductance bias circuit. The bias loop does have the problem that at start-up it is possible for the current to be zero in all transistors, in which case the circuit would remain in this stable state forever. Thus, it is necessary to include start-up circuitry, consisting of Q15–Q18, to inject currents into the bias loop in the case that all currents in the loop are zero. Once the loop starts up, this start-up circuitry is disabled by turning off Q15 and Q16. Note that Q18 has a much higher impedance than Q17 when the latter is turned on. A simple current mirror consisting of Q19–Q21 is added to supply a bias voltage for setting the common-mode output level of the opamps.
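The headroom and transconductance relations used in this section are simple enough to evaluate numerically. The sketch below assumes the typical values quoted in the text ($V_{eff}$ = 0.25 V, $V_{tn}$ = 0.7 V), the aspect ratios of Q2 and Q3 from Table 3.1, and the $R_B$ = 2.4 kΩ used later in the simulations; the function names and the defaults for the current and mobility ratios are illustrative assumptions.

```python
import math


def min_vout_conventional(v_eff, v_tn):
    """Minimum output voltage of a conventional cascode mirror: 2*Veff + Vtn."""
    return 2.0 * v_eff + v_tn


def min_vout_wide_swing(v_eff, n=1):
    """Minimum output voltage of the wide-swing cascode mirror: (n + 1)*Veff."""
    return (n + 1) * v_eff


def gm3_from_rb(r_b_ohm):
    """Equation 3.10: transconductance of Q3 set by the bias resistor."""
    return 1.0 / r_b_ohm


def gm_scaled(gm3, wl_i, wl_3, current_ratio=1.0, mobility_ratio=1.0):
    """Equations 3.11 and 3.12: scale gm3 to another transistor.

    current_ratio  = I_Di / I_D3
    mobility_ratio = 1 for n-channel devices, mu_p / mu_n for p-channel devices
    """
    return math.sqrt(mobility_ratio * wl_i * current_ratio / wl_3) * gm3


if __name__ == "__main__":
    print("conventional cascode headroom (V):", min_vout_conventional(0.25, 0.7))
    print("wide-swing cascode headroom (V):", min_vout_wide_swing(0.25))
    gm3 = gm3_from_rb(2400.0)
    print("gm3 (A/V):", gm3)
    # Q2 (40/0.8) relative to Q3 (10/0.8) at equal drain current.
    print("gm2 (A/V):", gm_scaled(gm3, 40 / 0.8, 10 / 0.8))
```

With these numbers the wide-swing mirror needs about 0.5 V of headroom instead of about 1.2 V, and the 4:1 aspect-ratio choice makes the transconductance of Q2 twice that of Q3 at equal current.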

3.3.4 Fully Differential Folded-Cascode Opamp


The operational amplifier used in the integrators is the most critical element of the modulator. The design of a proper opamp involves a great amount of theoretical calculation and empirical simulation to achieve high speed and sufficient gain. In ΔΣ modulators, if the opamp is fast enough (i.e., it has a small enough settling time constant $\tau = 1/(\beta\omega_t)$, where $\beta$ is the feedback factor and $\omega_t$ is the unity-gain frequency of the opamp), a small sampling period T (i.e., a high sampling frequency and oversampling ratio) may be used while still allowing the circuit to settle completely. Usually, the unity-gain frequency of the opamp should be at least five times greater than the sampling frequency (e.g., $T \ge 5\tau$) to ensure that the gain error introduced by incomplete settling of the integrator is less than $e^{-T/\tau} = e^{-5} \approx 0.7\%$. The amplifier should also have a reasonable amount of low-frequency open-loop gain, so that the distortion of the amplifier is reduced. Since opamps in a ΔΣ modulator usually drive only capacitive loads, it is possible to realize an opamp with only a single high-impedance node at the output while keeping all internal nodes at relatively low impedance, in order to maximize the speed.

Given these design criteria, a good choice for the opamp in the switched-capacitor integrators of a ΔΣ modulator is a fully differential folded-cascode opamp, as shown in Figure 3.11 [28]. The aspect ratios of all transistors in the opamp are listed in Table 3.2. The basic idea of the folded-cascode opamp is to apply cascode transistors to the input differential pair but to use transistors opposite in type from those used in the input stage. For example, the differential-pair transistors Q1 and Q2 are p-channel transistors in Figure 3.11, whereas the cascode transistors Q9 and Q10 are n-channel transistors, which are biased by VB3 generated by the bias circuit shown in Figure 3.9. The p-channel transistors Q3–Q8 form three cascode current sources biased by VB1 and VB2 from the bias circuit to provide bias currents and active loads. Note that PMOS wide-swing cascode current mirrors are used here and the common-mode currents are set as $I_{D1} = I_{D2} = I_{D9} = I_{D10}$.

Figure 3.11: The OPAMP block in Figure 3.8.

Table 3.2: Transistors in the Opamp Circuit.

Transistor      W/L (µm/µm)
Q4              120/0.8
Q3              120/1.6
Q6, Q8          60/0.8
Q5, Q7          60/1.6
Q1, Q2          100/0.8
Q9, Q10         50/1.6
Q11, Q12        50/0.8

The dc gain of this opamp is determined by the product of the input transconductance $g_{m1}$ and the output impedance $r_o$ (i.e., the impedance looking into the drains of Q7 and Q10), and $r_o$ is quite high because of the use of cascode techniques. A small-signal analysis gives the dc gain as

$$A_v = -g_{m1} r_o = -g_{m1}\left[\, g_{m10}\, r_{ds10}\,(r_{ds12} \parallel r_{ds1}) \parallel g_{m7}\, r_{ds7}\, r_{ds8} \,\right] \qquad (3.13)$$

where the transconductances $g_{mi}$ can be estimated from Equations 3.11 and 3.12 and the drain-source resistances $r_{dsi}$ are typically on the order of $10^2$ kΩ. The cascode opamp is compensated by the load capacitance at its output nodes (not shown in Figure 3.11). For mid-band and high frequencies, the load capacitance $C_L$ dominates, and the unity-gain bandwidth of the amplifier is determined by the ratio of the transconductance of an input device to $C_L$ in a single dominant pole response, i.e.,

$$\omega_t = \frac{g_{m1}}{C_L} \qquad (3.14)$$

From this equation, it seems that the opamp could be made arbitrarily fast by increasing the width and bias current of the input devices to increase $g_{m1}$. However, to maintain stability, all nondominant poles must occur at frequencies higher than the unity-gain frequency. Therefore, the speed of the opamp is limited by the location of the first nondominant pole, which is determined by the ratio of the transconductance of the cascode NMOS device to the total capacitance on its source (i.e., $g_{m10}/C_{s10}$).

Note that the bias voltage VB4 in Figure 3.11 does not come directly from the bias circuit shown in Figure 3.9; instead, it comes from the common-mode feedback (CMFB) circuitry shown in Figure 3.12. One drawback of using fully differential opamps is that a CMFB circuit must be added to establish the common-mode (i.e., average) output voltage. Ideally, it keeps this common-mode voltage fixed, preferably close to halfway between the power-supply voltages, even when large differential signals are present. Without it, the common-mode voltage is left to drift, since the common-mode loop gain is not typically large enough to control its value. The performance requirements on the CMFB circuitry are not nearly as stringent as those for the main opamp, because the signal of interest is the difference between the main opamp outputs.

Figure 3.12: The CMFB block in Figure 3.8.

Figure 3.13: The QUANTIZER 1B block in Figure 3.5.

Figure 3.12 shows a CMFB circuit that is suitable only for switched-capacitor circuits. The connections between the CMFB circuit and the opamp are depicted in Figure 3.8. It may be seen that the nonswitched capacitors CAP1 and CAP2 form a voltage divider to generate the average of the opamp output voltages, and the switched capacitors CAP3 and CAP4 transfer corrective charges onto CAP1 and CAP2 during phase 1 (C1). The bias voltage VB is chosen to set the common-mode output voltage of the opamp around VCMO (2.5 V in this case). CMOS transmission gates are used to realize the switches connected to the outputs of the opamp, in order to accommodate a wider signal swing. The switched capacitors (25 fF) are set to one-quarter the size of the nonswitched capacitors (100 fF) so as to avoid common-mode offset voltages and opamp overload [5, Chapter 6].
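The settling and bandwidth relations used in the opamp design can be checked numerically. The sketch below assumes single-pole settling, so the residual error after a time T is $e^{-T/\tau}$, and evaluates Equation 3.14 for the 200-fF load and 450-MHz unity-gain frequency reported in the simulation section. The function names are illustrative, and the resulting transconductance is only an order-of-magnitude estimate under these assumptions.

```python
import math


def settling_error(t_over_tau):
    """Residual fractional error of a single-pole system after T/tau time constants."""
    return math.exp(-t_over_tau)


def gm_for_unity_gain(f_t_hz, c_load_farad):
    """Equation 3.14 rearranged: gm1 = omega_t * C_L with omega_t = 2*pi*f_t."""
    return 2.0 * math.pi * f_t_hz * c_load_farad


def db_to_linear(gain_db):
    """Convert a voltage gain in dB to a linear ratio."""
    return 10.0 ** (gain_db / 20.0)


if __name__ == "__main__":
    print("error after 5 time constants:", settling_error(5.0))
    print("gm1 for 450 MHz into 200 fF (A/V):", gm_for_unity_gain(450e6, 200e-15))
    print("53 dB as a linear gain:", db_to_linear(53.0))
```

Five time constants leave an error of about 0.7%, and a unity-gain frequency of 450 MHz into 200 fF corresponds to an input transconductance of a little under 0.6 mA/V.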

3.3.5 Regenerative Track-and-Latch Comparator


The second major building block in a ΔΣ modulator is the comparator, which quantizes a signal in the loop and provides the output of the modulator. Since the comparator appears after the loop gain block and before the output terminal, nonidealities associated with it are shaped by the loop in the same way as the quantization noise it produces. Therefore, the performance of the modulator is relatively insensitive to offset and hysteresis (i.e., the tendency of a comparator to stay in its previous state when it should toggle to the other state) in the first-stage comparator. A fast regenerative latch without preamplification, as shown in Figure 3.13, has been used to implement the first-stage comparator. In this latch the cross-coupled devices, Q1–Q2 and Q3–Q4, are strobed at their drains, rather than at their sources, to eliminate backgating effects and promote faster regeneration [5, Chapter 7]. During the track phase the clock input C1 is low, which disables the regeneration while nodes A and B are pulled high. Nodes R and S, which are the inputs to the RS latch constructed from two NOR gates, are consequently both low. When the comparator is strobed by bringing C1 high (latch phase), nodes A and B are released and the positive feedback produced by the cross-coupling of transistors Q1–Q2 and Q3–Q4 regenerates the analog input signal into a full-scale digital signal.

3.3.6 3-bit Flash ADC and 3-bit DAC


The modulator performance is also very tolerant of nonlinearity and hysteresis in the second-stage 3-bit ADC because their effects are attenuated in the baseband by third-order noise shaping. The design of this ADC is complicated by the fact that it must compare the differential output voltage of the second-stage integrator to seven differential reference voltages. Because the clock frequency is high and the comparisons should be completed in one clock cycle, a 3-bit flash A/D converter [29] is used in this implementation, as shown in Figure 3.14. It should be noted that a modified 3-bit differential DAC is also included in Figure 3.14 in order to share one resistor string with the ADC. The resistor string divides the reference into equal segments that serve as thresholds and is followed by a bank of comparators (i.e., COMP blocks). Since the outputs of the second-stage integrator (VIN+ and VIN- in Figure 3.14) need to drive seven comparators, which present a large load, source followers (i.e., BUFF blocks) are added to buffer the loading. Equivalent buffers are placed between the resistor string and the comparator bank to compensate for the gain error and nonlinearity introduced by the source followers. The source follower is shown in Figure 3.15, where the active load is biased by VB1 and VB2 from the bias circuit.

Figure 3.14: The QUANTIZER 3B block in Figure 3.5.

Figure 3.15: The BUFF block shown in Figure 3.14.

The COMP block is a switched-capacitor circuit combined with the regenerative track-and-latch comparator, as shown in Figure 3.16. During phase 1 (C1), the two capacitors are charged to unique differential voltages derived from the resistor string, and during phase 2 (C2), the left sides of the capacitors are driven by the outputs of the second-stage integrator. Note that CMOS transmission gates are connected to the signals for a wider swing. The comparator is strobed at the end of C2. As the overall output of the 3-bit flash ADC, a 1-out-of-8 code is produced by the AND gates in Figure 3.14.

Figure 3.16: The CMPR block shown in Figure 3.14.

The modulator's tolerance of nonlinearity in the second-stage 3-bit DAC permits the use of a simple differential tapped resistor string for its implementation [1], as illustrated in Figure 3.17. In order to utilize the 1-bit DAC structure shown in Figure 3.8, however, I modified a normal differential DAC by cutting off the lower half of the DAC and adding combinational logic to control the 1-bit DAC, as shown in the left portion of Figure 3.14. In Figure 3.17, after a pair of taps from the resistor string, centered on the average of $V_{ref+}$ and $V_{ref-}$, is selected to complete the D/A conversion, the outputs Out+ and Out- are fed into the upper and lower summing junctions of the integrator, respectively. In my design, only one tap from the upper resistor string is selected, and the output X decides whether VR should be delivered to or taken from the summing junctions of the integrator shown in Figure 3.8. For instance, when any of Y4–Y7 in Figure 3.14 is high, X goes high and VR is taken from the lower summing junction and delivered to the upper summing junction of the integrator.
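The signal flow through the 3-bit quantizer, from seven comparators to the 1-out-of-8 code and the polarity signal X, can be sketched behaviorally. The example assumes seven equally spaced thresholds across a normalized range of ±v_ref and ideal comparators; the exact threshold voltages, the function name, and the bit ordering are assumptions for illustration, not values taken from the schematic.

```python
def flash_3bit(v_diff, v_ref=1.0):
    """Behavioral 3-bit flash quantizer.

    Returns (thermometer, one_hot, x):
      thermometer : outputs of the seven comparators, lowest threshold first
      one_hot     : 1-out-of-8 code Y(0)..Y(7) formed by AND-ing adjacent comparators
      x           : 1 when any of Y(4)..Y(7) is high, else 0
    """
    # Seven thresholds at -3/4, -1/2, ..., +3/4 of v_ref (assumed spacing).
    thresholds = [(k - 4) * v_ref / 4.0 for k in range(1, 8)]
    thermometer = [1 if v_diff > th else 0 for th in thresholds]
    # Pad with a constant 1 below and a constant 0 above, then detect the 1 -> 0 edge.
    padded = [1] + thermometer + [0]
    one_hot = [padded[i] & (1 - padded[i + 1]) for i in range(8)]
    x = 1 if any(one_hot[4:8]) else 0
    return thermometer, one_hot, x


if __name__ == "__main__":
    for v in (-0.9, 0.1, 0.9):
        thermometer, one_hot, x = flash_3bit(v)
        print(v, thermometer, one_hot, x)
```

Each valid comparator pattern produces exactly one high bit, and X simply reports whether that bit lies in the upper half of the code, which is the information the modified DAC needs to steer VR.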

Figure 3.17: Block diagram of a 3-bit differential DAC.
[Schematic: a resistor string between Vref+ and Vref- whose taps are selected by Y(0) to Y(7) to drive the outputs Out+ and Out-.]

3.3.7 Simulation


All of the above circuits were simulated with AccuSim, the Mentor Graphics analog simulator, after schematic capture. Three types of analysis may be applied to the circuits in the simulator: dc, ac, and transient analysis. The simulations use SPICE MOSFET models (see Appendix A.2) measured from a specific process run. Since the clock generator in Figure 3.7 is composed of logic gates from a MOSIS standard cell library and the library does not have SPICE models for analog simulation, I could not simulate this digital circuit together with the analog circuits. Instead, I simulated the switched-capacitor circuits by adding pulse sources to emulate the clocks. All of the pulse signals have the same period of 20 ns (i.e., a frequency of 50 MHz), a pulse width of 8 ns, rise and fall times of 1 ns, and a pulse range of 0–5 V. Table 3.3 lists the phase differences of the six pulse signals.

Table 3.3: Clock Signals Defined in the Simulator.

Clocks      Pulse Delay (ns)
C1, C2N     0
C2, C1N     10
C1A         19
C2A         9

Figure 3.18 shows the transient analysis results of the bias circuit in Figure 3.9 when $R_B$ is chosen to be 2.4 kΩ so that the current-mirror output current is about 30 µA. Simulations reveal that the value of $R_B$ is not very critical for the stability of the circuit, and a ±3% deviation (5-bit accuracy) of $R_B$ is tolerable for maintaining enough open-loop low-frequency gain in the opamp. Under these bias voltages, the opamp in Figure 3.11 has the open-loop frequency response shown in Figure 3.19 with a 200-fF capacitive load on each of its outputs. The ac analysis of the opamp circuit shows a dc gain of 53 dB and a unity-gain frequency of 450 MHz, which are acceptable for the present application. Figure 3.20 shows the transient analysis results of the opamp circuit driven by a 1-MHz fully differential sinusoidal input of 1 mV centered at 2.5 V. The outputs of the opamp have a common-mode voltage of 2.5 V, a gain of 48 dB, and a phase shift of about 40°. The bias current flowing through Q3 is about 143 µA, less than the ideal value of 150 µA, because of the finite output impedance of the current source formed by Q3 and Q4 in Figure 3.11.

Figure 3.18: Bias voltages from simulation of the circuit shown in Figure 3.9.
[Plot: drain currents of Q1 and Q3 and bias voltages VB1 to VB4 versus time.]

Figure 3.19: Open-loop frequency response of the opamp shown in Figure 3.11 with 200fF capacitive load.
[Plot: magnitude (dB) and phase (deg) versus frequency (Hz).]

Figure 3.20: Open-loop transient analysis of the opamp shown in Figure 3.11 with 200fF capacitive load.
[Plot: source current of Q3 and voltages VOUT+, VOUT-, VIN+, and VIN- versus time.]

The transient analysis of the whole modulator is very time-consuming because the circuits contain around 500 devices (transistors, capacitors, and resistors). A 10-microsecond transient simulation takes more than 24 hours to complete on a Sun SPARC 20 workstation. To simulate the circuit shown in Figure 3.5, the bias circuit is combined with it to supply the bias voltages. The differential sinusoidal input signals are set at a frequency of 1 MHz and a magnitude of 2 V with a common-mode input voltage VCMI = 1 V. The two reference voltages, V+ and V-, are set to 2 V and 0 V, respectively. As mentioned before, the common-mode output voltage VCMO is 2.5 V. In Figure 3.21, the positive (upper) signals of the fully differential structure are shown to illustrate the simulation results. From the bottom, they correspond to the input signal, the output of the first integrator, the output of the second integrator, the output of the 1-bit quantizer, and the output of the third integrator. We may see that a 2-V differential input causes a slight overload of the third integrator. If the common-mode input voltage is set to 2.5 V, however, even a 1-V input will cause serious overload of the second integrator. Figure 3.22 shows the 1-out-of-8 code produced by the 3-bit flash ADC.

Figure 3.21: Simulation results of the modulator shown in Figure 3.5. (a) Outputs of the integrators and 1-bit quantizer.
[Plot: five voltage traces, including POSIN and Y1, versus time.]

Figure 3.22: Simulation results of the modulator shown in Figure 3.5. (b) Outputs of the 3-bit quantizer.
[Plot: voltages Y2(0) to Y2(7) versus time.]
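The pulse sources of Table 3.3 can be checked for the intended nonoverlap and for the 1-ns advance of C1A and C2A with a small model. The sketch assumes trapezoidal pulses (linear 1-ns edges, 8-ns flat top, 20-ns period) and works in integer picoseconds to avoid rounding at the edges; the waveform model and all names are assumptions for illustration, not the simulator's own source definitions.

```python
PERIOD_PS = 20_000   # 20 ns period (50 MHz)
WIDTH_PS = 8_000     # flat-top pulse width
EDGE_PS = 1_000      # rise and fall time
V_HIGH = 5.0         # pulse range 0 to 5 V

# Pulse delays from Table 3.3, in picoseconds.
DELAY_PS = {
    "C1": 0, "C2N": 0,
    "C2": 10_000, "C1N": 10_000,
    "C1A": 19_000,
    "C2A": 9_000,
}


def pulse(name, t_ps):
    """Voltage of the named clock at time t_ps for the assumed trapezoidal waveform."""
    tau = (t_ps - DELAY_PS[name]) % PERIOD_PS
    if tau < EDGE_PS:
        return V_HIGH * tau / EDGE_PS
    if tau < EDGE_PS + WIDTH_PS:
        return V_HIGH
    if tau < 2 * EDGE_PS + WIDTH_PS:
        return V_HIGH * (1 - (tau - EDGE_PS - WIDTH_PS) / EDGE_PS)
    return 0.0


def overlap(a, b, step_ps=10):
    """True if clocks a and b are ever above 0 V at the same sampled instant."""
    return any(pulse(a, t) > 0 and pulse(b, t) > 0
               for t in range(0, PERIOD_PS, step_ps))


if __name__ == "__main__":
    print("C1 and C2 overlap:", overlap("C1", "C2"))
    print("C1A and C2A overlap:", overlap("C1A", "C2A"))
    print("at 9 ns: C1A =", pulse("C1A", 9_000), "V, C1 =", pulse("C1", 9_000), "V")
```

In this model C1 and C2 never conduct at the same time, and C1A has already returned to 0 V at 9 ns while C1 is only beginning to fall, which reproduces the early turn-off of the switches near the virtual ground.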


Chapter 4 DIGITAL DECIMATOR DESIGN


The original implementation of the high-speed ΔΣ ADC by Brandt processed the digital outputs of the analog modulator on a workstation to perform the error cancellation and decimation filtering. In practical ΔΣ ADC designs, however, the digital circuits have to be implemented in hardware to realize real-time data conversion. In this chapter, the digital error cancellation logic and decimation filters are designed in Matlab, described in VHDL (Very-high-speed-integrated-circuit Hardware Description Language), and synthesized into standard-cell integrated circuits.

4.1 Digital Filter Design


The digital circuit blocks shown in Figure 4.1 are chosen to implement the error cancellation and sampling decimation. In Figure 4.1, the two digital outputs of the analog modulator described in the last chapter are fed into a digital cancellation logic, which is defined by Equations 3.4 and 3.6, or revised as

$$Y(z) = (2z^{-2} - z^{-3})\,Y_1(z) - 4(1 - z^{-1})^2\,Y_2(z) \qquad (4.1)$$

where $Y_1(z)$ and $Y_2(z)$ are the 1-bit and 3-bit outputs in Figure 3.4, respectively. In the present design, it is desired to decimate the data words of $Y(z)$ from a sampling rate of 50 MHz to one of 50/24 = 2.08 MHz with a baseband extending from zero to 1 MHz. A linear-phase requirement results in the exclusive use of FIR filters for the decimator. Table 4.1 lists the practical specifications of the decimator. A single-stage implementation of the decimator will result in a filter size of 3050 taps (coefficients) as per Equation 2.21 (see Appendix C.1) with $F_s$ = 50 MHz and a

[Figure 4.1 block diagram: 3rd-order Delta-Sigma Modulator, Error Cancellation Logic, T_sinc(z) (sinc^4 FIR), H_1(z) (half-band), H_2(z) (half-band); rate labels f = 50 MHz, 50/6 MHz, 50/12 MHz, 50/24 MHz with width labels 1, 13, 13, 12.]

Figure 4.1: Block diagram of digital filters.


computation rate of 3050 × 50/24 × 10^6 = 6.355 × 10^9 multiplications per second. However, if a three-stage decimator is designed as shown in Figure 4.1, the FIR filters will have lower orders and require less computation. The first 6:1 converter reduces the sampling rate from 50 MHz to 8.33 MHz with a transition region of 1 to 7.29 MHz as per Equation 2.20. Thus, the first filter requires a filter size of 20 taps and a computation rate of 1.67 × 10^8 multiplications per second. Similarly, the two 2:1 converters that reduce the sampling rate from 8.33 MHz to 4.17 MHz and further to 2.08 MHz, with transition bands of 1 to 3.13 MHz and 1 to 1.04 MHz, respectively, have filter sizes of 8 and 255 taps and computation rates of 3.3 × 10^7 and 5.31 × 10^8 multiplications per second. Therefore, the three-stage decimator has a combined filter size of 283 taps and a total computation rate of 7.31 × 10^8 multiplications per second, i.e., about 11 times and 9 times smaller than those of the single-stage decimator, respectively. To further reduce the computation rate of the three-stage decimator, special

Table 4.1: Design Specifications of the Decimator.

Passband cutoff frequency $F_{pc}$     1 MHz
Stopband cutoff frequency $F_{sc}$     50/48 = 1.04 MHz
Passband ripples $\delta_p$            0.01
Stopband ripples $\delta_s$            0.001

types of FIR filters, such as comb filters and half-band filters, may be applied to implement the three filters, as we mentioned in Section 2.4.3.

4.1.1 Fourth-Order Comb (sinc^4) Filter

As Candy [30] showed, comb filters with response $\mathrm{sinc}^k(\omega)$ are appropriate for decimating ΔΣ modulation down to four times the Nyquist rate, and the order of the comb filter should be at least one larger than that of the noise-shaping modulator in order to prevent excessive aliasing of out-of-band noise into the baseband. Therefore, in the present design, where the oversampling ratio is 24 and the quantization noise is third-order shaped, the first-stage comb filter shown in Figure 4.1 should be fourth-order with a decimation factor of 6 or, equivalently, as per Equation 2.23, have the transfer function

$$H(z) = \left[\frac{1}{6}\,\frac{1 - z^{-6}}{1 - z^{-1}}\right]^4 \qquad (4.2)$$

Simulations in Matlab [31] (see Appendix C.2) give the frequency response of this comb filter in Figure 4.2. The upper Bode diagram in Figure 4.2 shows three notches in the region of 0 to $\pi$ (i.e., 0 to 25 MHz) which are wide and deep enough to cover the stopband width of 2.08 MHz required by the decimator. The lower Bode diagram enlarges the frequency response in the region of 0 to $\pi/12$, where the in-band signal loss is less than 1 dB, so that a compensation filter is not necessary. The computation rate of this sinc^4 filter depends on the structure adopted to implement it. If the cascaded structure described in Section 4.2 is used, no multiplications are required at all, and the filter has a computation rate of 2.33 × 10^8 additions/subtractions per second.
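The in-band loss and notch positions quoted for the sinc^4 filter can be checked numerically. The following is a minimal sketch, not the Matlab script of Appendix C.2; it evaluates the magnitude of Equation 4.2 on the unit circle, and the function names are chosen here for illustration only.

```python
import math

FS = 50e6   # modulator sampling rate in Hz (from the text)
M = 6       # decimation factor of the first stage
K = 4       # order of the comb filter


def comb_magnitude(f_hz, fs=FS, m=M, k=K):
    """Magnitude of [(1/m)(1 - z^-m)/(1 - z^-1)]^k at frequency f_hz."""
    x = math.pi * f_hz / fs
    if abs(math.sin(x)) < 1e-15:      # limit value at dc and multiples of fs
        return 1.0
    return abs(math.sin(m * x) / (m * math.sin(x))) ** k


def to_db(magnitude):
    """Convert a linear magnitude to decibels."""
    return 20.0 * math.log10(magnitude)


droop_db = to_db(comb_magnitude(1e6))   # loss at the 1 MHz baseband edge
notch = comb_magnitude(FS / M)          # first notch, at 50/6 = 8.33 MHz

print(f"loss at 1 MHz: {droop_db:.2f} dB")
print(f"magnitude at 8.33 MHz: {notch:.1e}")
```

Under these assumptions the loss at the 1 MHz band edge stays below 1 dB, in line with the statement that no compensation filter is needed.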

4.1.2 Half-band Filters


The purpose of the filters following the first-stage comb filter is to remove any higher-frequency input signal (in effect, a sharp anti-aliasing filter) before the signal is

[Figure 4.2 plots: "Comb Filter Frequency Response", magnitude (dB) versus frequency (Hz); upper panel 0 to 2.5 × 10^7 Hz, lower panel 0 to 2.5 × 10^6 Hz.]

Figure 4.2: Frequency response of the sinc^4 filter with a decimation ratio of 6.


downsampled to the final Nyquist rate. In other words, while the comb filter is good at filtering out the quantization noise, it is not sharp enough to act as a reasonable anti-aliasing filter for input signals slightly above the baseband frequency. Instead of using arbitrary FIR filters to realize this anti-aliasing filter, a few half-band FIR filters can be used in order to save about half of the multiplications. As mentioned in Section 2.4.3, half-band filters have equal passband and stopband ripples as well as equal passband and stopband widths. Such filters naturally have the property that approximately half of the filter coefficients are exactly zero. Hence

Table 4.2: Nonzero Coefficients of the 11-tap Half-Band Filter.

Coefficient    Ideal        Rounded      Binary
b1 = b11        0.012218     0.012207     0.00000011001
b3 = b9        -0.062581    -0.062500    -0.00010000000
b5 = b7         0.300962     0.300781     0.01001101000
b6              0.500014     0.500000     0.10000000000
* Other coefficients b2 = b4 = b8 = b10 = 0.

the number of multiplications in implementing such filters is half of that needed for a general linear-phase design. The half-band filter is appropriate only for sampling rate changes of 2 to 1. Thus, the final two filters in Figure 4.1 are best implemented as half-band filters. For a two-stage half-band implementation, both filters require the same passband and stopband ripple

$$\delta = \min\left(\frac{\delta_p}{2},\ \delta_s\right) = 0.001 \qquad (4.3)$$

where $\delta_p$ and $\delta_s$ are listed in Table 4.1. For the two half-band filters, the stopband cutoff frequencies are 50/12 - 1 = 3.17 MHz and 50/24 - 1 = 1.08 MHz, respectively. Note that the final stopband cutoff frequency is larger than $F_{sc}$ defined in Table 4.1. In other words, the final half-band filter allows aliasing into the transition region $F_{pc}$ to $F_{sc}$. The respective filter sizes, rounded up to the next odd number as per Equation 2.21, are approximately 11 taps and 165 taps. Therefore, the computation rates of the two filters are about 2.3 × 10^7 and 1.72 × 10^8 multiplications per second, respectively. If four additions or subtractions are taken as equivalent, on average, to one multiplication, the total computation rate of the comb and half-band filters is about 2.53 × 10^8 multiplications per second, i.e., approximately 3 times smaller than that of conventional FIR filters.
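The computation rates quoted in this section follow from simple arithmetic: taps times output sample rate, halved for half-band filters. The sketch below reproduces that arithmetic; the helper name and the variable names are illustrative and do not come from the thesis appendices.

```python
FS = 50e6  # modulator sampling rate in Hz


def mults_per_second(taps, output_rate_hz, half_band=False):
    """Multiplications per second of an FIR decimator stage.

    A half-band filter needs about half of the multiplications because
    roughly every second coefficient is zero.
    """
    rate = taps * output_rate_hz
    return rate / 2 if half_band else rate


# Single-stage decimator: 3050 taps evaluated at the 50/24 MHz output rate.
single_stage = mults_per_second(3050, FS / 24)

# Three conventional FIR stages: 20, 8 and 255 taps.
three_stage = (mults_per_second(20, FS / 6)
               + mults_per_second(8, FS / 12)
               + mults_per_second(255, FS / 24))

# Comb filter: four integrators at 50 MHz plus four differentiators at 50/6 MHz.
comb_additions = 4 * FS + 4 * FS / 6

# Comb plus two half-band filters, counting four additions as one multiplication.
special = (comb_additions / 4
           + mults_per_second(11, FS / 12, half_band=True)
           + mults_per_second(165, FS / 24, half_band=True))

print(f"single stage : {single_stage:.3e}")
print(f"three stages : {three_stage:.3e}")
print(f"comb adds    : {comb_additions:.3e}")
print(f"comb + HBFs  : {special:.3e}")
print(f"saving factor: {three_stage / special:.2f}")
```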

Table 4.3: Nonzero Coefficients of the 165-tap Half-Band Filter.

Coefficient    Ideal        Rounded      Binary
b2 = b164       0.000641     0.000488     0.000000000010
b4 = b162      -0.000301    -0.000244    -0.000000000001
b6 = b160       0.000371     0.000244     0.000000000001
b8 = b158      -0.000450    -0.000488    -0.000000000010
b10 = b156      0.000540     0.000488     0.000000000010
b12 = b154     -0.000641    -0.000488    -0.000000000010
b14 = b152      0.000755     0.000733     0.000000000011
b16 = b150     -0.000883    -0.000977    -0.000000000100
b18 = b148      0.001025     0.000977     0.000000000100
b20 = b146     -0.001183    -0.000977    -0.000000000100
b22 = b144      0.001357     0.001953     0.000000001000
b24 = b142     -0.001550    -0.001953    -0.000000001000
b26 = b140      0.001763     0.001953     0.000000001000
b28 = b138     -0.001997    -0.001953    -0.000000001000
b30 = b136      0.002255     0.002197     0.000000001001
b32 = b134     -0.002537    -0.002441    -0.000000001010
b34 = b132      0.002847     0.002930     0.000000001100
b36 = b130     -0.003188    -0.002930    -0.000000001100
b38 = b128      0.003561     0.003906     0.000000010000
b40 = b126     -0.003970    -0.003906    -0.000000010000
b42 = b124      0.004421     0.004395     0.000000010010
b44 = b122     -0.004917    -0.004883    -0.000000010100
b46 = b120      0.005464     0.005859     0.000000011000
b48 = b118     -0.006070    -0.005859    -0.000000011000
b50 = b116      0.006744     0.007812     0.000000100000
b52 = b114     -0.007497    -0.007812    -0.000000100000
b54 = b112      0.008343     0.008301     0.000000100010
b56 = b110     -0.009302    -0.009766    -0.000000101000
b58 = b108      0.010398     0.010742     0.000000101100
b60 = b106     -0.011665    -0.011719    -0.000000110000
b62 = b104      0.013150     0.013672     0.000000111000
b64 = b102     -0.014917    -0.015625    -0.000001000000
b66 = b100      0.017066     0.017578     0.000001001000
b68 = b98      -0.019746    -0.019775    -0.000001010001
b70 = b96       0.023198     0.023438     0.000001100000
b72 = b94      -0.027841    -0.027832    -0.000001110010
b74 = b92       0.034466     0.035156     0.000010010000
b76 = b90      -0.044769    -0.046875    -0.000011000000
b78 = b88       0.063157     0.062500     0.000100000000
b80 = b86      -0.105800    -0.105469    -0.000110110000
b82 = b84       0.318209     0.318359     0.010100011000
b83             0.500008     0.500000     0.100000000000
* Other coefficients b1 = b3 = ... = b165 = 0.

[Figure 4.3 plots: "The First Half Band Filter" and "The Second Half Band Filter"; upper panels magnitude (dB) versus frequency (Hz), lower panels magnitude versus normalized frequency.]

Figure 4.3: Frequency responses of the 11-tap and 165-tap half-band filters.


An n-tap digital FIR filter may be expressed as

$$H(z) = \sum_{k=1}^{n} b_k\, z^{-(k-1)} \qquad (4.4)$$

where the n coefficients $b_k$ are designed to meet given performance specifications on the filter. There are a few algorithms [20] for determining suitable values of $b_k$ for linear-phase FIR filters. In this thesis work, the two half-band filters are designed by using the Parks-McClellan algorithm, which applies the Remez exchange algorithm and Chebyshev approximation theory to find optimal equiripple filters. The

[Figure 4.4 plots: panels "Log Magnitude" (dB versus frequency in Hz), "Normalized Magnitude" (versus normalized frequency), "Passband Ripples" and "Stopband Ripples".]

Figure 4.4: Frequency responses of the 11-tap half-band filter with rounded coefficients.
11-tap and 165-tap half-band filters are simulated in Matlab (see Appendix C.3), and Figure 4.3 shows the ideal frequency responses of the designed filters. The upper diagrams in Figure 4.3 depict the log magnitude responses in dB, while the lower ones show the normalized magnitude responses versus normalized frequency (i.e., 1 corresponds to $\pi$) for both the desired filters (dashed curves) and the designed filters (dotted curves).

[Figure 4.5 plots: panels "Log Magnitude" (dB versus frequency in Hz), "Normalized Magnitude" (versus normalized frequency), "Passband Ripples" and "Stopband Ripples".]

Figure 4.5: Frequency responses of the 165-tap half-band filter with rounded coefficients.

The ideal nonzero coefficients of the 11-tap filter are listed in the Ideal column of Table 4.2, while those of the 165-tap filter are listed in Table 4.3. The coefficients are symmetrically distributed around $b_{(n+1)/2}$ and almost half of them are zero. Since the coefficients have to be represented in binary in order to be implemented in digital hardware, the rounded coefficients and their binary equivalents are also listed in Tables 4.2 and 4.3 for comparison. The word-lengths of the fixed-point coefficients are set to 11 bits and 12 bits for the 11-tap and 165-tap filters, respectively, due

to the overall 12-bit resolution of the system, though optimal selection of the wordlength [32, Chapter 11] is not such a simple problem. The rounded coefficients were obtained through empirical iterative simulations (see Appendix C.4), using as few 1's in each coefficient as possible in order to reduce the amount of computation while maintaining a tolerable deviation from the ideal frequency responses. Figures 4.4 and 4.5 depict the resulting frequency responses of the 11-tap and 165-tap filters, respectively. The lower diagrams in Figures 4.4 and 4.5 show the passband and stopband ripples, which are typically smaller than 0.01. Simulations reveal that the closer a coefficient is to the central value 0.5 (i.e., b6 and b83 in Tables 4.2 and 4.3, respectively), the greater the effect of its accuracy on the frequency response. Hence the coefficients near the central one should be rounded as precisely as possible. Sometimes, however, a more accurate value of a specific pair of coefficients may have a negative effect on the frequency response. For instance, b78 and b88 in Table 4.3 can be made more accurate by setting the second-to-last bit to 1, but the frequency response in the stopband then degrades by about 10 dB. Also, a more precise value may have only a negligible effect while needing many more 1's in the binary number, which correspond to more shifts and additions in hardware. For example, b64 and b102 should be -0.0000001111101 instead of -0.000001 as shown in Table 4.3, but the additional hardware introduced by the five extra 1's is not worth the small resulting improvement in the frequency response. In Table 4.3, many coefficients far away from b83 are just coarsely rounded without seriously impairing the performance of the 165-tap filter.
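The relation between a binary coefficient, its decimal value and its hardware cost can be illustrated with a short sketch. It converts the binary strings of Table 4.2 back to decimal and counts their 1's, each of which costs one shift-and-add term. This is only an illustration of the bookkeeping, not the iterative search of Appendix C.4; the function names are hypothetical.

```python
def binary_fraction(text):
    """Convert a signed binary fraction string such as '-0.0001' to a float."""
    sign = -1.0 if text.startswith("-") else 1.0
    digits = text.lstrip("+-").split(".")[1]
    return sign * sum(2.0 ** -(i + 1) for i, bit in enumerate(digits) if bit == "1")


def count_ones(text):
    """Number of 1 bits, i.e. the number of shifted terms to be added."""
    return text.count("1")


# name: (binary coefficient, ideal coefficient), copied from Table 4.2
TABLE_4_2 = {
    "b1": ("0.00000011001", 0.012218),
    "b3": ("-0.00010000000", -0.062581),
    "b5": ("0.01001101000", 0.300962),
    "b6": ("0.10000000000", 0.500014),
}

for name, (binary, ideal) in TABLE_4_2.items():
    rounded = binary_fraction(binary)
    print(f"{name}: rounded={rounded:.6f} error={rounded - ideal:+.6f} "
          f"ones={count_ones(binary)}")
```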

4.2 Digital Filter Structures


To realize the transfer functions described above in hardware, proper structures and word-lengths for the computations have to be determined first, depending on the type of machine arithmetic used. In this digital implementation, fixed-point two's complement arithmetic is used in order to simplify the hardware design and, most importantly, to avoid overflow.

Table 4.4: Decimal Equivalents of Numbers 0.11 to 1.00.

Binary      Decimal equivalent (fourths)
number      Signed-magnitude    1's complement    2's complement
0.11         3                   3                 3
0.10         2                   2                 2
0.01         1                   1                 1
0.00         0                   0                 0
1.11        -3                  -0                -1
1.10        -2                  -1                -2
1.01        -1                  -2                -3
1.00        -0                  -3                -4

4.2.1 Two's Complement Arithmetic


In fixed-point arithmetic, the numbers are usually assumed to be proper fractions. A binary proper fractional number is obtained by setting the binary point between the first and second bits. The first (i.e., the most significant) bit is reserved for the sign of the number. The positive sign is 0, while the negative sign is 1. For example, the positive binary number +0.010 is simply represented as 0.010. However, the negative binary number -0.010 may have three forms: 1.010, 1.101, and 1.110, depending on whether signed-magnitude, one's complement, or two's complement arithmetic is adopted. Table 4.4 lists the possible 3-bit binary numbers and their decimal equivalents in the three arithmetic systems. The signed-magnitude and 1's complement systems have two representations for zero, whereas the 2's complement system has only one. On the other hand, -1 is represented in the 2's complement system but not in the other two. Two's complement arithmetic can be realized in the simplest hardware of the three, because the carry-out at the sign bit can be ignored in 2's complement addition. This wrap-around property is essential to the proper operation of the comb filter [33], as we will mention later. Therefore, 2's complement arithmetic is used in the whole digital system design.
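The three number systems compared above can be made concrete with a small decoder that reproduces the columns of Table 4.4. This is a sketch written for illustration; values are returned in units of the least significant bit (fourths for 3-bit words), and the negative zero of the first two systems appears as plain 0 because Python integers have a single zero.

```python
def decode(bits, system):
    """Decode a bit string whose first bit is the sign bit.

    Returns the value as an integer number of LSBs.
    """
    sign = bits[0]
    magnitude = int(bits[1:], 2)
    n = len(bits) - 1                       # number of magnitude bits
    if sign == "0":
        return magnitude
    if system == "signed-magnitude":
        return -magnitude
    if system == "ones-complement":
        return -((2 ** n - 1) - magnitude)  # invert all magnitude bits
    if system == "twos-complement":
        return magnitude - 2 ** n           # sign bit has weight -2**n
    raise ValueError(f"unknown system: {system}")


WORDS = ["011", "010", "001", "000", "111", "110", "101", "100"]
for system in ("signed-magnitude", "ones-complement", "twos-complement"):
    print(system, [decode(word, system) for word in WORDS])
```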

[Figure 4.6 diagram: inputs Y1 and Y2, z^-1 delay elements, and adder/subtractor nodes.]

Figure 4.6: Structure for error cancellation logic.


Before choosing structures and wordlengths for the decimation filters, we have to determine the length of the registers in the error cancellation logic shown in Figure 4.1 and defined by Equation 4.1. Since in the analog modulator the 3-bit Y2 is actually scaled to one-quarter (i.e., a right shift of two bits) of the full signal range of the 1-bit Y1, 5-bit binary numbers should be used to represent Y1. Thus, the two levels of Y1, 1 and 0, correspond to 0.1111 and 1.0000 in binary (15/16 and -1 in decimal), respectively, while the eight levels of Y2 correspond to the binary numbers in the range 0.0011 to 1.1100 (3/16 to -1/4 in decimal), obtained by adding two more bits at the left that are identical to the leftmost bit of the 3-bit word of Y2. For example, 010 of Y2 is converted to 0.0010, and 101 becomes 1.1101. Figure 4.6 shows a simple and direct structure for realizing the error cancellation logic. Note that the $4(1 - z^{-1})^2$ term in Equation 4.1 is accomplished by two cascaded differentiators, each with a transfer function of $2(1 - z^{-1})$.
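As a behavioral illustration of Equation 4.1 and of the sign extension of Y2, the following sketch evaluates the cancellation logic as a difference equation on integer sequences. It is not the VHDL of Appendix B.2; samples before n = 0 are assumed to be zero, and the function names are chosen here for illustration.

```python
def sign_extend(bits, width):
    """Extend a two's complement bit string to `width` bits by repeating its sign bit."""
    return bits[0] * (width - len(bits)) + bits


def error_cancel(y1, y2):
    """Time-domain form of Equation 4.1.

    y[n] = 2*y1[n-2] - y1[n-3] - 4*(y2[n] - 2*y2[n-1] + y2[n-2])
    Both input lists must have the same length.
    """
    def at(seq, n):
        return seq[n] if n >= 0 else 0      # assumed zero initial state

    return [2 * at(y1, n - 2) - at(y1, n - 3)
            - 4 * (at(y2, n) - 2 * at(y2, n - 1) + at(y2, n - 2))
            for n in range(len(y1))]


print(sign_extend("010", 5), sign_extend("101", 5))
# A constant input: the Y1 path has unity dc gain, the Y2 path has zero dc gain.
print(error_cancel([1] * 8, [3] * 8))
```

The constant-input case shows the expected behavior of the second-order difference on Y2: once the delays are filled, a dc value on Y2 no longer reaches the output.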

4.2.2 Pipelining Structure for Comb Filter


A way to realize the comb filter is to rewrite Equation 4.2 as

$$H(z) = \left(\frac{1}{1 - z^{-1}}\right)^4 \left(1 - z^{-6}\right)^4 \frac{1}{6^4} \qquad (4.5)$$

[Figure 4.7 diagrams: (a) four integrators (z^-1) followed by four differentiators (z^-M) with 13-bit IN and 13-bit OUT at 50/6 MHz; (b) integrators at 50 MHz (high clock rate) and differentiators (z^-1) at 50/6 MHz (low clock rate); (c) as (b), with delaying integrators.]

Figure 4.7: Structures for the fourth-order comb filter.


and thus realize it by cascading four integrators (IIR) followed by four cascaded differentiators (FIR) and a dumper, as shown in Figure 4.7(a). However, the disadvantage of this architecture is that one has to implement 6 × 4 shift registers for the differentiators. An attractive alternative is depicted in Figure 4.7(b), where the dumper (a switch working at the lower frequency) is moved between the IIR and FIR parts, so that $z^{-6}$ can be replaced by $z^{-1}$, resulting in less memory. Since the integrators work at 50 MHz in this design, the performance requirement of the non-delaying integrators (i.e., four additions per cycle) places a heavy burden on their design. Instead, they can be replaced by four delaying integrators [34], as shown in Figure 4.7(c), which simply introduce four clock delays while allowing one addition per cycle. This pipelining structure for comb filters is more attractive for

[Figure 4.8 diagram: 13-bit IN, registers working at 50/6 MHz (delays T and 2T), coefficients b(1), b(3), b(5), b(6), latches working at 50/12 MHz, 13-bit OUT.]

Figure 4.8: Structure for the 11-tap half-band filter.


applications with higher speed. A point to note here is that at first glance it appears as though the recursive integrators may overflow due to a dc input. Fortunately, it can be proven [33] that overflow errors can be avoided by using a modulo arithmetic system with wrap-around characteristic everywhere in the system. A sufficient condition for the filter to work correctly is to choose a modulus larger than $(M^K + 1) \times N$, where M is the decimation factor, K is the order of the comb filter, and N is the dynamic range of the input signal. In the present design, by taking a modulus $2^{13} > (6^4 + 1) \times 5$, all calculations can be performed by ordinary 2's complement operators without carry handling. Thus, the wordlength of the decimator system is set to 13 bits.
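The wrap-around argument can be checked with a small behavioral model. The sketch below follows the arrangement of Figure 4.7(b) (non-delaying integrators, dumper, differentiators), keeps every register modulo 2^13, and compares the result with an exact-integer reference. The input range of -2 to 2 and the unnormalized gain of 6^4 are assumptions made for this illustration, not values taken from the thesis code.

```python
def to_signed(value, bits):
    """Interpret a residue modulo 2**bits as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


def cic_decimate(samples, m=6, k=4, bits=13):
    """k integrators, dump every m-th sample, k differentiators, all modulo 2**bits."""
    mask = (1 << bits) - 1
    integ = [0] * k     # integrator registers (high clock rate)
    delay = [0] * k     # differentiator registers (low clock rate)
    out = []
    for n, x in enumerate(samples):
        acc = x & mask
        for i in range(k):
            integ[i] = (integ[i] + acc) & mask   # carry-out is simply dropped
            acc = integ[i]
        if n % m == m - 1:                       # dumper
            for i in range(k):
                diff = (acc - delay[i]) & mask
                delay[i] = acc
                acc = diff
            out.append(to_signed(acc, bits))
    return out


def direct_decimate(samples, m=6, k=4):
    """Reference: exact convolution with (1 + z^-1 + ... + z^-(m-1))^k, then decimation."""
    taps = [1]
    for _ in range(k):
        taps = [sum(taps[j - i] for i in range(m) if 0 <= j - i < len(taps))
                for j in range(len(taps) + m - 1)]
    return [sum(t * samples[n - j] for j, t in enumerate(taps) if n - j >= 0)
            for n in range(m - 1, len(samples), m)]


dc_input = [2] * 60     # the later integrators overflow 13 bits many times
print(cic_decimate(dc_input))
print(direct_decimate(dc_input))
```

Although the internal integrator registers wrap around repeatedly for the dc input, the outputs agree with the exact reference as long as the true output fits in the 13-bit signed range.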

4.2.3 Parallel and Pipelining Structure for Half-Band Filters


Since half-band filters are symmetrical FIR filters, the efficient structure shown in Figure 2.14 can be used to realize the 11-tap and 165-tap half-band filters. Figure 4.8 indicates the structure of the 11-tap filter, with the absolute values of the coefficients shown in it. A two-period delay is used between the nonzero coefficients because every second coefficient is zero, except for the middle coefficient b6, as listed in Table 4.2. The negative coefficient b3 is realized by performing a positive multiplication followed by a subtraction. All of the additions are performed in parallel, i.e., every summation node represents a 13-bit adder or subtractor. The registers denoted by T work at 50/6 MHz, while the calculations may work at 50/12 MHz because of the decimation factor of 2. To alleviate the performance requirements, a pipelining configuration is also used, in which a layer of latches separates the calculations into two stages as shown in Figure 4.8. The latches work at 50/12 MHz. All of the multiplications are also performed in parallel via many shifters and adders to avoid the use of a shared multiplier. This parallel and pipelining structure is not area-efficient, but it maximizes the speed of the calculations, which is critical for modern high-speed DSP applications.

[Figure 4.9 diagram: 13-bit IN, registers working at 50/12 MHz (delays T and 2T), 41 adders, 42 multipliers with coefficients b(2), b(4), ..., b(82), b(83), 20 subtractors and 1 adder, 21 latches working at 50/24 MHz, 20 adders in 5 stages, 13-bit OUT.]

Figure 4.9: Structure for the 165-tap half-band filter.

A similar structure for the 165-tap filter is shown in Figure 4.9. In this structure, more parallel adders and pipelining latches are needed, and the parallel multiplications consume a large portion of the silicon area. Hence the binary coefficients should be rounded with as few 1's as possible, because each 1 means a 13-bit adder. Due to the overall 12-bit resolution, the final output of this filter is truncated to 12 bits by ignoring the rightmost bit of the 13-bit word.
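The shift-and-add multiplication and the folded symmetric structure can be sketched in a few lines. The bit positions below are read from the binary coefficients of Table 4.2; the ordering of shifts and additions and the truncation behavior are assumptions of this illustration and may differ from the hardware of Figure 4.8.

```python
def shift_add(x, one_positions):
    """Multiply integer x by a binary fraction using shifts and additions only.

    one_positions lists the positions of the 1 bits after the binary point;
    Python's >> on integers is an arithmetic (sign-preserving) right shift.
    """
    return sum(x >> p for p in one_positions)


# Positions of the 1 bits in the rounded coefficients of Table 4.2.
B1 = (7, 8, 11)      # 0.00000011001
B3 = (4,)            # 0.0001, magnitude only; applied with a subtraction
B5 = (2, 5, 6, 8)    # 0.01001101
B6 = (1,)            # 0.1


def half_band_11(x, n):
    """One output sample of the 11-tap filter at index n (requires n >= 10).

    Symmetric taps are added first, so each coefficient pair needs
    only one shift-and-add multiplication.
    """
    return (shift_add(x[n] + x[n - 10], B1)
            - shift_add(x[n - 2] + x[n - 8], B3)
            + shift_add(x[n - 4] + x[n - 6], B5)
            + shift_add(x[n - 5], B6))


print(shift_add(4096, B5))               # 4096 * 0.30078125
print(half_band_11([2048] * 11, 10))     # dc input, gain close to 1
```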

4.3 VHDL Implementation


The functionality of the digital filters is first described at the register transfer level (RTL) in VHDL, and finally implemented in standard-cell integrated circuits. The VHDL programs (see Appendices B.2 to B.5) are coded, compiled, and simulated within Mentor Graphics QuickHDL, a VHDL compiler and simulator. The compiled code is then fed to a synthesizer, AutoLogic II, to obtain the actual hardware from a MOSIS 0.8-µm standard cell library.

4.3.1 Programming and Simulation


VHDL is an acronym for VHSIC Hardware Description Language, and VHSIC stands for Very High Speed Integrated Circuit. This language was originally developed as a result of a Department of Defense (DOD) project and is now an IEEE standard (IEEE Std 1076-1993) for describing digital designs. In this section, only those elements of the language that are used in my programs are explained. A detailed introduction to VHDL may be found elsewhere [35]. Since the synthesizer AutoLogic II supports only a subset of the standard VHDL syntax [36], many features of the standard language are not applicable, so the code is not as compact as it could be.


Error Cancellation Logic


Appendix B.2 gives the VHDL code for the error cancellation logic defined by Equation 4.1. The library command makes two design libraries, ieee and arithmetic, visible in this design unit. The ieee library contains the package std_logic_1164, which is an IEEE standard (IEEE Std 1164-1993) and defines a nine-value logic type and its associated overloaded functions and other utilities. Although the ieee design library is not part of the VHDL language standard, it is extensively used in digital designs. The arithmetic design library and its std_logic_arith package, which are developed by Mentor Graphics and associated with Mentor tools, contain synthesizable functions for 2's complement arithmetic and other types of arithmetic operations. This library makes the coding more concise in the present calculation-oriented application. The use commands following the library command make visible all items declared in the two packages, std_logic_1164 and std_logic_arith. After the library and package declarations, an entity named e_cancel is declared. The entity declaration describes the external view of the entity, for example, the input signals y1, y2, and clk in e_cancel. A generic in the entity declaration declares a constant object multi_bit with a default value of 4, which may be overridden if a different wordlength is needed, while leaving the rest of the code unchanged. An entity is modeled using an entity declaration and at least one architecture body. The architecture body contains the internal description of the entity, as either a set of interconnected components that represents the structure of the entity, or a set of concurrent or sequential statements that represents the behavior of the entity. In the architecture behav of entity e_cancel, a behavioral model is described via a combination of concurrent and sequential statements. At the beginning of the architecture, some internal signals between the primary inputs and outputs are declared.

The signals y1_d0 and y2_d0 are used to latch the inputs y1 and y2, because the analog modulator does not have registers at its output interface. The signals y1_d1 to y1_d3 are the outputs of the three delay elements for y1 as shown in Figure 4.6. The signal y2_d1 is the output of the delay element in the first differentiator, while the signal sum1d is that in the second differentiator. The first three concurrent statements in the architecture define all of the calculations involved in this logic. The signal sum1 is the output of the first differentiator, while the signal sum2 combines all other summation outputs. Note that the gain factor of 2 is accomplished by a left shift of 1 bit via the sla (shift left arithmetic) operator. The process statement contains sequential statements that describe the functionality of a portion of the entity in sequential terms. This process is triggered by a rising edge on the clock signal clk. In this process, the 1-bit input y1 is extended to the 5-bit y1_d2 by converting 1 to 01111 and 0 to 10000, while the 3-bit input y2 is extended to the 5-bit y2_d0 with a gain of 2. The primary output y is a latched version of sum2.

Comb Filter
The VHDL program for the fourth-order comb filter is given in Appendix B.3. Actually, this program can serve as a general model for any comb filter by applying different values to the generic objects: the input wordlength d, the order of the comb filter k, and the output wordlength l. Two clock signals, clk1 and clk2, are used in this filter at the original sampling rate and the decimated sampling rate, respectively. A new data type named sinc_adder is defined as an unconstrained array of another constrained array of signed. Four internal signals are declared as this kind of array of arrays. The signal sum_d(0) is actually a 13-bit version of the 5-bit input x, while the other four elements of the sum_d array are the outputs of the delay elements in the integrators as shown in Figure 4.7(c). The signal diff(0) is the output of the dumper, while the other diff elements contain the results from the summation nodes of the cascaded differentiators. The four sum signals result from the four adders of the cascaded integrators. The array diff_d is used to store the results of diff(0) to diff(3). All eight summation nodes in Figure 4.7(c) are generated by the for statement labeled gk. The clocked process is triggered by rising edges on both clk1 and clk2. The memory elements in the integrators work at clk1, while those in the differentiators work at clk2. Note that a reset signal rst is used by the memory elements in the integrators, because these registers have to be initially set to known states since they are configured in feedback paths. Otherwise, the initial unknown states would be iterated and propagated forever. The registers in the differentiators do not need to be initially reset, due to their feedforward configuration. Between the two stages of memory elements, diff(0) and y, the maximum combinational delay is four times the delay of a 13-bit adder, which should be less than the clk2 period. This constraint can be satisfied in the 0.8-µm technology at the present speed. If higher speed is required, delays should be introduced into the differentiators.

Half-Band Filters
In Appendix B.4, two different architectures, hbf11 and hbf165, which correspond to the 11-tap and 165-tap half-band FIR filters, respectively, share a common entity declaration for the entity fir, since the two filters have similar external interfaces. In the architecture hbf11, the register file x_d for storing the input x works at clk1, while the latches sum_d(i) and the output y work at clk2, whose frequency should be half of that of clk1 for half-band filters. It should be noted that all the multiplications by the coefficients listed in Table 4.2 are accomplished by shift-and-add operations. Each '1' in a coefficient corresponds to a right shift operation via the shift operator sra (shift right arithmetic). The maximum combinational delay in the 11-tap filter is three times the delay of a 13-bit adder. Similarly, the maximum combinational delay in the 165-tap filter, which is introduced by the assignment statement for y, is five times the delay of a 13-bit adder. However, these constraints are not the speed bottleneck of the whole digital circuit, since both filters work at low sampling frequencies compared to the error cancellation logic.

Finally, all the above design entities and architectures are compiled and stored in the default working library work, and then used as components in the structural model of the whole digital design given in Appendix B.5. In the architecture struc of the entity filter, four components labeled i1 to i4 are directly instantiated with the entity-architecture pairs described above. A clock divider is modeled in the block clk_gen to generate three decimated clock signals from the original clock clk, which may come from the C 2N signal in Figure 3.7.

Simulation
To simulate the VHDL models at the register transfer level (RTL), digital stimuli have to be derived from the outputs of the analog modulator. However, the complicated digital waveforms shown in Figures 3.21 and 3.22 can hardly be modeled as regular and periodic test patterns. Figure 4.10 shows the digital outputs of the modulator when a differential dc signal of 0.666 V, with a reference range of 0 to 2 V, is applied to the input. A periodic pattern may be observed on the output Y1, while the other output Y2 appears more irregular. In order to simplify the digital simulation, I pick the sequences from 1.48 µs to 1.72 µs in Figure 4.10 as the periodic stimulus applied to the digital circuit. Table 4.5 lists the test patterns for the inputs x1 and x2 of the filter entity. It should be noted that if a high on x1 means the decimal number +1 and a low means -1, then the average of the test sequence on x1 is +1/3, corresponding to the analog dc input 0.666/2 ≈ 1/3.

The RTL simulation is performed in QuickHDL by applying the above test patterns along with clock and reset patterns. A portion of the simulation results is

[Figure 4.10 plot: analog traces /Y1, /Y2(3), /Y2(4), /Y2(5), /Y2(6) (voltage, V) versus time t (s), from 1.3e-06 to 2e-06.]

Figure 4.10: Simulation results of the modulator shown in Figure 3.5 with dc input of 1/3.

[Figure 4.11 waveform listing: signals /x1, /clk, /rst, /x2, /y, /y1, /y2, /y3, /x, /clk1, /clk2, /clk3, /count; Entity: filter, Architecture: struc, Mon Jun 23 18:56:23 1997; time axis from 43750 ns to 44 us.]

Figure 4.11: Simulation results of the digital circuits modeled in Appendix B.5.

Table 4.5: Periodical Test Patterns for Digital Filters.

cycle   1    2    3    4    5    6    7    8    9    10   11   12
x1      1    1    0    1    1    0    1    1    0    1    0    1
x2      101  110  110  100  101  101  100  101  101  011  011  011

shown in Figure 4.11. The signals clk1 to clk3 are clocks obtained by dividing clk by 6, 12, and 24. The signal x is the output of the error cancellation logic, while the signals y1 to y3 are the outputs of the comb, 11-tap, and 165-tap filters, respectively. Note that y3 shows meaningful results only after the simulation has run for 43700 ns. This is because the registers in feedforward configurations are not initially reset to known states. The primary output y is a 12-bit truncated version of y3. The output is 1.10100111100 in binary and -0.345 in decimal, approximately an inversion of 1/3 with an offset error. The inversion is probably introduced by the interface between the analog modulator and the digital filters, while the offset error is due to the inaccurate test patterns in Table 4.5.
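The two numbers compared in this simulation, the average of the x1 test pattern and the decimal value of the 12-bit output word, can be verified with a short calculation. The helper below is written for illustration and is not part of the VHDL test bench.

```python
from fractions import Fraction


def twos_complement_fraction(bits):
    """Value of a two's complement fraction written as 'S.FFF...' (sign bit first)."""
    raw = bits.replace(".", "")
    value = int(raw, 2)
    if raw[0] == "1":
        value -= 1 << len(raw)
    return Fraction(value, 1 << (len(raw) - 1))


X1 = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1]       # x1 row of Table 4.5

# A high on x1 counts as +1 and a low as -1.
average = Fraction(sum(1 if bit else -1 for bit in X1), len(X1))
output = twos_complement_fraction("1.10100111100")

print("average of x1:", average)
print("filter output:", output, "=", float(output))
```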

4.3.2 Synthesis
The compiled VHDL code is now ready to be synthesized into an ASIC (Application-Specific Integrated Circuit) or FPGA (Field-Programmable Gate Array) design. The synthesis process in Mentor Graphics AutoLogic II involves generic netlist synthesis, area optimization, and timing performance optimization [37] based on a destination technology. A cell library is needed to map the synthesized generic structure into a netlist of logic gates. Here we use the 0.8-μm standard cell library from MOSIS, scn08hp_lib. The optimized netlist may be stored in a suitable format, such as EDIF, EDDM, or GENIC, to facilitate further gate-level simulation and layout design. It takes about 6 hours for a Sun SPARC 20 workstation to finish the synthesis and area optimization, with low effort on logic factoring, for the architecture struc of entity filter. To illustrate how large the resulting circuit is, Table 4.6 gives


Table 4.6: Global Cell Usage Statistics for Digital Filters.

Cost in Area Units: 10³ μm²

Cell Name               Instance Count   Cost per Cell   Subtotal
GFL_LIB:FALSE                       47
SCN08HP_LIB:and02                   81            0.76      61.16
SCN08HP_LIB:ao012                    8            0.94       7.54
SCN08HP_LIB:ao013                    3            1.13       3.40
SCN08HP_LIB:ao022                  145            1.13     164.14
SCN08HP_LIB:ao023                    7            1.32       9.25
SCN08HP_LIB:aoi012                 465            0.76     351.07
SCN08HP_LIB:aoi022                  14            0.94      13.20
SCN08HP_LIB:dff0                  2709            2.27    6135.89
SCN08HP_LIB:inv01                 3957            0.38    1491.79
SCN08HP_LIB:mux0                     6            1.32       7.93
SCN08HP_LIB:nand02                1220            0.57     690.52
SCN08HP_LIB:nand03                   1            0.76       0.76
SCN08HP_LIB:nand04                   1            0.94       0.94
SCN08HP_LIB:nor02                   12            0.57       6.79
SCN08HP_LIB:nor03                    1            0.76       0.76
SCN08HP_LIB:oa012                    2            0.94       1.89
SCN08HP_LIB:oa013                    1            1.13       1.13
SCN08HP_LIB:oa014                    1            1.32       1.32
SCN08HP_LIB:oa023                    1            1.32       1.32
SCN08HP_LIB:oai012                1756            0.76    1325.78
SCN08HP_LIB:oai013                  16            0.94      15.09
SCN08HP_LIB:oai014                   1            1.13       1.13
SCN08HP_LIB:oai022                  28            0.94      26.40
SCN08HP_LIB:oai023                   7            1.13       7.92
SCN08HP_LIB:oai0113                  1            1.13       1.13
SCN08HP_LIB:or02                     4            0.76       3.02
SCN08HP_LIB:xnor0                  146            1.32     192.87
SCN08HP_LIB:xor0                  1094            1.32    1445.17
Total:                           11735                   11969.30


Table 4.7: Hierarchy Statistics for Digital Filters.


Cost in Area Units: 10³ μm²

Inst   Cell               local prim       local module       global prim
name   name               count    cost    count     cost     count     cost
       filter.struc          35    32.08       4  11937.22    11735  11969.30
i1     e_cancel.behav        53    69.64                         53     69.64
i2     decimator.opt        325   394.04       8    377.39      847    771.43
i3     fir.hbf11            426   553.49      10    434.26      945    986.75
i4     fir.hbf165          5039  6529.63      88   3579.79     9855  10109.42

the statistical report for the standard cells used in the circuit, while Table 4.7 shows how large each of the four major components is. From Table 4.6 we may see that the circuit consists of 11.7 K gates consuming about 12 mm² of silicon area. If the routing area of a standard-cell ASIC design is estimated as 40% of the total silicon area, this digital core circuit will cost about 20 mm² of silicon in total. It is typical for a transistor in this library to cost about 120 μm² of area with overhead. Thus, the total number of transistors in this circuit is about 0.1 million. The majority of gates used are D flip-flops (dff0), inverters (inv01), XOR gates (xor0), and OAI gates (oai012), which are the major building blocks for registers and adders.

In Table 4.7, a primitive (prim) corresponds to a standard cell or gate, while a module means an adder or a subtractor consisting of many primitives. For example, the 11-tap half-band filter i3 contains 426 primitives (mainly dff0 and inv01) as well as 10 modules, which are constructed from the other 945 - 426 = 519 primitives (mainly xor0 and oai012). It should be noted that the 165-tap FIR filter consumes roughly 85% of the total resources, while the comb filter costs less than 7% of the area.

One or two more area optimizations may be performed to reduce silicon consumption, but the effect will be small because this design is mainly constructed from registers and adders, which are already optimal. Timing optimizations, which are usually performed after area optimizations and spend additional area to prevent violations of timing constraints, may not be needed because of the pipelined structures inside the circuits.
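The area and transistor estimates above follow from a few lines of arithmetic. The sketch below reproduces them in Python using only numbers taken from Tables 4.6 and 4.7 and from the text; the variable names are hypothetical, and the 40% routing share and 120 μm² per transistor are the text's own estimates, not measured values.

```python
CELL_AREA_UNITS = 11969.30       # total from Table 4.6, in units of 10^3 um^2
ROUTING_FRACTION = 0.40          # routing share of the total silicon area (estimate)
AREA_PER_TRANSISTOR_UM2 = 120.0  # typical area per transistor with overhead (estimate)

cell_area_um2 = CELL_AREA_UNITS * 1e3
cell_area_mm2 = cell_area_um2 / 1e6  # 1 mm^2 = 10^6 um^2
# If routing takes 40% of the total, the cells occupy the remaining 60%.
total_area_mm2 = cell_area_mm2 / (1.0 - ROUTING_FRACTION)
transistors = cell_area_um2 / AREA_PER_TRANSISTOR_UM2

# Share of the cell area per component, from the global cost column of Table 4.7.
global_cost = {
    "e_cancel": 69.64,
    "decimator": 771.43,
    "fir.hbf11": 986.75,
    "fir.hbf165": 10109.42,
}
share = {name: cost / CELL_AREA_UNITS for name, cost in global_cost.items()}

print(f"cell area  : {cell_area_mm2:.2f} mm^2")
print(f"total area : {total_area_mm2:.1f} mm^2")
print(f"transistors: {transistors:.0f}")
for name, fraction in share.items():
    print(f"{name:11s}: {100 * fraction:.1f} %")
```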


Chapter 5 CONCLUSION
This thesis dealt with the design of a ΔΣ analog-to-digital converter for high-speed applications, using 0.8-μm CMOS VLSI technology from MOSIS. A complete ΔΣ A/D converter combines an analog ΔΣ modulator with a digital decimator to form a mixed-mode signal processing system. This work took on the challenge of designing such a mixed-signal system of analog CMOS circuits and digital signal processing filters.

5.1 Summary
A published ΔΣ modulator structure [1] was implemented in VLSI. This modulator is an interesting structure because it can be applied to high-speed data conversion, whereas most commercial ΔΣ data converters have been used for low-speed applications such as digital telephony and digital audio. The work was to migrate the published modulator to a faster technology with the necessary modifications, and to complete the ΔΣ ADC system by creating a new DSP architecture for the high-speed application.

Based on the specifications and interface of the published ΔΣ modulator, a new digital decimator was designed and verified first. There are many ways to implement a decimator; a three-stage linear-phase structure was chosen in this work. The digital filters realizing the decimator are a fourth-order comb filter with a decimation factor of 6, an 11-tap half-band FIR filter, and a 165-tap half-band FIR filter. These filters were generated and simulated in Matlab. They were then modeled in VHDL, verified with an RTL simulator, and synthesized into a standard-cell circuit. Mentor Graphics tools were used throughout the design procedure.
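The sample rates along the three-stage decimator can be tabulated with a short sketch. It assumes a 50-MHz modulator rate (taken from the title of reference [1]) and a decimation factor of 2 for each half-band stage, which is consistent with the clock division ratios of 6, 12, and 24 reported for the digital circuit; the names below are hypothetical.

```python
MODULATOR_RATE_MHZ = 50.0  # assumed from the title of reference [1]
STAGES = [
    ("fourth-order comb filter", 6),
    ("11-tap half-band FIR filter", 2),   # assumed decimation by 2
    ("165-tap half-band FIR filter", 2),  # assumed decimation by 2
]

rate = MODULATOR_RATE_MHZ
total_decimation = 1
rates = []
for name, factor in STAGES:
    rate /= factor
    total_decimation *= factor
    rates.append((name, rate))
    print(f"{name:30s} output rate {rate:6.3f} MHz")

print(f"total decimation factor: {total_decimation}")
```

Under these assumptions the final output rate is 50/24 MHz, slightly above 2 MHz.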

A common way to implement a digital filter is to store the coefficients in ROM (read-only memory) and perform the multiplications in a dedicated multiplier. In this work, however, a pipelined, parallel, and multiplierless structure was used to implement the half-band filters, since this structure might be more feasible for higher-speed applications when finer VLSI technologies are available. The synthesized digital circuit contains about 0.1 million transistors and consumes about 20 mm² of silicon area in 0.8-μm standard-cell technology.

The most difficult part of this work was the analog modulator design, not only because analog IC design is a kind of "empirical art", but also because the original paper proposing the high-speed ΔΣ modulator omitted many details, which left some uncertainty in the design. For example, when the analog circuit was initially simulated with a differential input of 3 V and the common-mode voltage set at 0 V, the 3-bit output Y2 just bounced between two levels. This problem was solved by redefining the input range as 0 to 2 V while keeping the integrators from overloading.

Although most of the original analog circuits were mapped into the 0.8-μm technology without structural modifications, some new features were added. A wide-swing bias circuit was added, and the original opamp design was modified to fit the wide-swing structure. Many simulations were therefore performed to find proper bias voltages for the opamp's operation. A new structure published in the 1990s was incorporated into the original switched-capacitor integrator to realize a combined 1-bit DAC for better performance, and the 3-bit DAC was also modified to take advantage of this upgrade. The final analog circuit contains about 0.5 K transistors, capacitors, and resistors.

The modulator was analyzed in Mentor's analog simulator AccuSim II, though a mixed-mode simulator is strongly needed to simulate the whole converter.

So far, the front-end design of this ΔΣ converter, i.e., the design entry and pre-layout simulation, has been done. The back-end design, i.e., the layout design and post-layout simulation, is left for future work. To estimate the difficulty of full-custom layout design, the opamp block was laid out as shown in Figure 5.1. This layout has passed all DRC (design rule checking) and LVS (layout versus schematic) checks. Some analog layout design considerations in this layout are discussed in the next section.

Finally, some highlights of this work may be listed as follows:

- A published ΔΣ modulator was migrated to a faster, 0.8-μm CMOS process.
- A three-stage decimator was designed to complete the ΔΣ A/D converter.
- Some new analog circuits, such as the wide-swing constant-transconductance bias circuit and the switched-capacitor 1-bit DAC, were added to the original modulator design.
- Parallel and pipelined structures were implemented in VHDL for the filters.
- Both full-custom bottom-up and semi-custom top-down IC design approaches were used in designing this mixed-signal system.

5.2 Future Work


It is well known that designing a layout is a tedious, time-consuming, and error-prone task. For a purely digital design, many mature CAD tools are available to facilitate and even automate the design for a smaller and/or faster layout. For analog or mixed analog-digital designs, however, accuracy has to be carefully taken into account in the layout design phase ([4, Chapter 11], [5, Chapter 2], [3, Chapter 11]). Thus, many special techniques for analog layout, such as an interdigitized arrangement for differential pairs, a serpentine arrangement for resistors, and


Figure 5.1: Layout for the opamp shown in Figure 3.11.

common-centroid structure for capacitors, make only "manual" layout design feasible in some circumstances. An SDL (schematic-driven layout) procedure in ICstation, a Mentor Graphics tool for IC layout design, facilitated the layout process of Figure 5.1, because the device generator available in the current MOSIS design kit can automatically fold large transistors into smaller ones with shared diffusion regions, as shown in Figure 5.1. However, the interconnections of these folded transistors had to be routed manually. For example, the two transistors Q1 and Q2 in Figure 3.11, which make up a differential pair, were physically implemented in the row of transistors in the middle of Figure 5.1 and connected in an interdigitized fashion to minimize the offset caused by mismatch. The serpentine metal1 (blue layer) path in the middle row, which has the same stipple pattern as that of signals VOUT+ and VOUT-, and the poly (red layer) combs of signals VIN+ and VIN-, had to be carefully drawn "by hand". The total area of this opamp layout is about 85 × 105 μm².

The current 0.8-μm process file is not suitable for analog layout. For example, Appendix A.1 gives the crossover capacitance between the metal1 and poly layers as just 0.0104 fF/λ² (i.e., 0.065 fF/μm² with λ = 0.4 μm), while a typical capacitance between poly and electrode for analog linear capacitors is about 0.5 fF/λ². Also, the sheet resistance of n-diffusion (0.368 Ω per square in Appendix A.1) is much smaller than a typical resistance of 2 to 3 Ω per square. Hence a more accurate and complete process file is needed for analog layout design.

The layout for the analog modulator should be designed in a full-custom manner, while the digital circuit may be laid out automatically. The physical standard-cell library may be linked to the layout tool ICblock so that the placement and routing of the cells used in the schematic can be done by ICblock. However, clock skew problems may be encountered if the clock signals are not distributed in a balanced way.
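The unit conversion between the lambda-based values of the process file and physical units is easy to get wrong by a factor of ten, so a small check may help. The sketch below assumes λ = 0.4 μm, as stated for the scn08hp process in Appendix A.1; the helper name `per_lambda2_to_per_um2` is hypothetical.

```python
LAMBDA_UM = 0.4  # lambda of the scn08hp process, from Appendix A.1


def per_lambda2_to_per_um2(value_per_lambda2: float, lambda_um: float = LAMBDA_UM) -> float:
    """Convert a per-lambda^2 quantity into a per-um^2 quantity."""
    return value_per_lambda2 / (lambda_um ** 2)


# Metal1-to-poly crossover capacitance from the process file, in fF per lambda^2.
m1_poly_ff_per_um2 = per_lambda2_to_per_um2(0.0104)
print(f"{m1_poly_ff_per_um2:.3f} fF/um^2")  # 0.065 fF/um^2

# Opamp layout footprint quoted in the text.
opamp_area_um2 = 85 * 105
print(f"{opamp_area_um2} um^2")  # 8925 um^2
```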

After the layout design, the physical circuit should be simulated in a transistor-level simulator, such as Mentor's Lsim. To facilitate the testing of the manufactured chips, a fault coverage analysis should be performed for the test vectors.


Bibliography
[1] B. P. Brandt and B. A. Wooley. "A 50-MHz multi-bit sigma-delta modulator for 12-b 2-MHz A/D conversion". IEEE J. Solid-State Circuits, 26:1746-1756, Dec. 1991.
[2] H. Inose, Y. Yasuda, and J. Murakami. "A telemetering system code modulation - delta-sigma modulation". IRE Trans. Space Elect. Telemetry, SET-8:204-209, Sept. 1962.
[3] S. R. Norsworthy, R. Schreier, and G. C. Temes, editors. "Delta-Sigma Data Converters: theory, design, and simulation". IEEE Press, New York, 1996.
[4] J. E. Franca and Y. Tsividis, editors. "Design of Analog-Digital VLSI Circuits for Telecommunications and Signal Processing". Prentice Hall, Englewood Cliffs, New Jersey, 1994.
[5] D. A. Johns and K. Martin. "Analog Integrated Circuit Design". John Wiley & Sons Inc., New York, 1996.
[6] K. M. Daugherty. "Analog-to-Digital Conversion". McGraw-Hill Inc., New York, 1994.
[7] J. van Valburg and R. J. van de Plassche. "An 8-b 650-MHz folding ADC". IEEE J. Solid-State Circuits, 27:1662-1666, Dec. 1992.
[8] K. Matsumoto, E. Ishii, K. Yoshitate, K. Amano, and R. W. Adams. "An 18-b oversampling A/D converter for digital audio". ISSCC Dig. Tech. Pap., pages 202-203, Feb. 1988.
[9] R. W. Adams. "Design and implementation of an audio 18-bit analog-to-digital converter using oversampling techniques". J. Audio Eng. Soc., 34:153-166, Mar. 1986.
[10] D. R. Welland, B. P. Del Signore, E. J. Swanson, T. Tanaka, K. Hamashita, S. Hara, and K. Takasuka. "A stereo 16-bit delta-sigma A/D converter for digital audio". J. Audio Eng. Soc., 37:476-486, June 1989.
[11] V. Friedman, D. M. Brinthaupt, D.-P. Chen, T. W. Deppa, J. P. Elward, E. M. Fields Jr., J. W. Scott, and T. R. Viswanathan. "A dual-channel voiceband PCM codec using ΣΔ modulation technique". IEEE J. Solid-State Circuits, SC-24:274-280, Apr. 1989.
[12] B. H. Leung, R. Neff, P. R. Gray, and R. W. Brodersen. "Area-efficient multichannel oversampled PCM voice-band coder". IEEE J. Solid-State Circuits, SC-23:1351-1357, Dec. 1988.
[13] L. Longo and M. Copeland. "A 13-bit ISDN-band oversampled ADC using two-stage third order noise shaping". IEEE Proc. Custom IC Conf., pages 21.2.1-21.2.4, Jan. 1988.
[14] R. Steele. "Delta Modulation Systems". John Wiley & Sons Inc., New York, 1975.
[15] J. C. Candy and O. J. Benjamin. "The structure of quantization noise from sigma-delta modulation". IEEE Trans. Commun., COM-29:1316-1323, Sept. 1981.
[16] J. C. Candy. "A use of double integration in sigma-delta modulation". IEEE Trans. Commun., COM-33:249-258, Mar. 1985.
[17] R. W. Adams, P. F. Ferguson, A. Ganesan, S. Vincelette, A. Volpe, and R. Libert. "Theory and practical implementation of a fifth-order sigma-delta A/D converter". J. Audio Eng. Soc., 39:515-528, July 1991.
[18] T. C. Leslie and B. Singh. "An improved sigma-delta modulator architecture". IEEE Proc. ISCAS'90, 1:372-375, May 1990.
[19] A. Hairapetian and G. C. Temes. "A dual-quantization multi-bit sigma-delta A/D converter". IEEE Proc. ISCAS'94, 5:437-440, May 1994.
[20] J. G. Proakis and D. G. Manolakis. "Digital Signal Processing: principles, algorithms, and applications". Maxwell MacMillan Int., New York, 1992.
[21] A. V. Oppenheim. "Discrete-time Signal Processing". Prentice Hall, Englewood Cliffs, N.J., 1989.
[22] R. E. Crochiere and L. R. Rabiner. "Interpolation and decimation of digital signals - A tutorial review". Proc. IEEE, pages 300-331, Mar. 1981.
[23] B. Brandt. "Oversampled Analog-to-Digital Conversion". PhD thesis, Stanford University, 1991.
[24] L. A. Williams and B. A. Wooley. "Third-order cascaded sigma-delta modulators". IEEE Trans. Circuits Syst., 38:489-498, May 1991.
[25] B. E. Boser and B. A. Wooley. "The design of sigma-delta modulation analog-to-digital converters". IEEE J. Solid-State Circuits, 23:1298-1308, Dec. 1988.
[26] K. Martin. "Improved circuits for the realization of switched-capacitor filters". IEEE Trans. Circuits and Syst., CAS-27:237-244, Apr. 1980.
[27] P. Ferguson Jr., A. Ganesan, and R. Adams. "An 18-b 20-kHz dual DS A/D converter". ISSCC Dig. Tech. Papers, pages 68-69, Feb. 1991.
[28] T. Choi et al. "High-frequency CMOS switched-capacitor filters for communications application". IEEE J. Solid-State Circuits, pages 652-664, Dec. 1983.
[29] S. H. Lewis and P. R. Gray. "A pipelined 5-Msample/s 9-bit analog-to-digital converter". IEEE J. Solid-State Circuits, pages 954-962, Dec. 1987.
[30] J. C. Candy. "Decimation for sigma delta modulation". IEEE Trans. Commun., pages 72-76, Jan. 1986.
[31] The MathWorks Inc. "The Student Edition of MATLAB: Version 4 User's Guide", 1995.
[32] A. Antoniou. "Digital Filters: analysis, design, and applications". McGraw-Hill, New York, 1993.
[33] S. Chu and C. S. Burrus. "Multirate filter designs using comb filters". IEEE Trans. Circuits and Sys., pages 913-924, Nov. 1984.
[34] E. Dijkstra, O. Nys, C. Piguet, and M. Degrauwe. "On the use of modulo arithmetic comb filters in sigma delta modulators". IEEE Proc. ICASSP'88, pages 2001-2004, Apr. 1988.
[35] J. Bhasker. "A VHDL Primer: Revised Edition". Prentice Hall, Englewood Cliffs, NJ, 1994.
[36] Mentor Graphics Corp. "The VHDL Style Guide For AutoLogic II", 1995.
[37] Mentor Graphics Corp. "Synthesizing with AutoLogic II", 1996.


Appendix A MOSIS SCMOS PROCESS


MOSIS is a public organization that supplies ASIC fabrication services to IC designers worldwide. To apply the MOSIS scalable-CMOS (SCMOS) technologies within Mentor Graphics tools, the Mentor Design Kit (MDK) for MOSIS is needed for synthesis, layout, and simulation. The current version of MDK installed in the VLSI Lab is V1.7, which consists of SCMOS technology files, SCMOS standard cell libraries, and documentation. In this thesis work, I chose a 0.8-μm CMOS technology to implement the circuits.

A.1 SCN08HP Technology and Parameter File

The following is the technology and parameter file for the 0.8-μm SCMOS process available at Hewlett-Packard. This process is a CMOS N-well, 3-metal technology named scn08hp with lambda equal to 0.4 μm.

ICtrace Device Definitions
DEVICE C CAP POLY ELECTRODE POS NEG 0 0
device rdp dpres psrcdrn psrcdrn 0.32
device rdn dnres nsrcdrn nsrcdrn 0.368

ICextract Rule Definitions
capacitance order POLY METAL1 METAL2 METAL3 direct
capacitance order NSRCDRN PSRCDRN POLY METAL1 METAL2 METAL3 mask
capacitance intrinsic METAL3 0.0027 0
capacitance intrinsic METAL2 0.0035 0
capacitance intrinsic METAL1 0.0073 0
capacitance intrinsic POLY 0.0137 0
capacitance intrinsic PSRCDRN 0.0891 0.0257 mask
capacitance intrinsic NSRCDRN 0.048 0.0644 mask
capacitance crossover METAL3 METAL2 0.0048 0 0
capacitance crossover METAL3 METAL1 0.0024 0 0
capacitance crossover METAL3 POLY 0.00208 0 0
capacitance crossover METAL2 METAL1 0.00448 0 0
capacitance crossover METAL2 POLY 0.00336 0 0
capacitance crossover METAL1 POLY 0.0104 0 0
resistance sheet METAL3 0.008 0
resistance sheet METAL2 0.011 0
resistance sheet METAL1 0.011 0
resistance sheet POLY 0.336 0
resistance sheet PSRCDRN 0.32 0 mask
resistance sheet NSRCDRN 0.368 0 mask
resistance connection METAL1 POLY 0.48 0
resistance connection METAL1 METAL2 0.22 0
resistance connection METAL2 METAL3 0.16 0
resistance connection METAL1 PSRCDRN 0.44 0 mask
resistance connection METAL1 NSRCDRN 0.52 0 mask

A.2 SPICE Model File


For SPICE simulation in Mentor's analog simulator AccuSim, a SPICE model file generated after a recent run was downloaded from the MOSIS web site, http://www.isi.edu/mosis/vendors/hp-cmos26g. The SPICE3 models for NMOS and PMOS are as follows:

* FET models for MOSIS HP 0.8 micron process on 3/24/97
.MODEL n NMOS PHI=0.700000 TOX=1.6600E-08 XJ=0.200000U TPG=1
+ VTO=0.7184 DELTA=6.3430E-01 LD=9.0910E-10 KP=1.2240E-04
+ UO=588.4 THETA=1.3630E-01 RSH=2.6960E+01 GAMMA=0.6249
+ NSUB=5.0910E+16 NFS=7.0710E+11 VMAX=1.9510E+05 ETA=3.5430E-02
+ KAPPA=1.2400E-01 CGDO=1.9000E-10 CGSO=1.9000E-10
+ CGBO=3.9199E-10 CJ=3.0900E-04 MJ=0.9500 CJSW=6.3000E-10
+ MJSW=0.33100 PB=0.700000
.MODEL p PMOS PHI=0.700000 TOX=1.6600E-08 XJ=0.200000U TPG=-1
+ VTO=-0.8434 DELTA=1.1570E+00 LD=9.4480E-10 KP=3.2971E-05
+ UO=158.5 THETA=1.2470E-01 RSH=5.0970E+01 GAMMA=0.4820
+ NSUB=3.0290E+16 NFS=5.9090E+11 VMAX=2.2220E+05 ETA=2.5980E-02
+ KAPPA=8.3010E+00 CGDO=1.9000E-10 CGSO=1.9000E-10
+ CGBO=4.0554E-10 CJ=5.5200E-04 MJ=0.4910 CJSW=1.83000E-10
+ MJSW=0.42500 PB=0.90000


Appendix B VHDL PROGRAMS


B.1 8-to-3 Encoder
-- Written by LL_to_VHDL at Thu Mar 20 21:42:03 1997
-- Parameterized Generator Specification to VHDL Code
-- LogicLib generator called: PRIORITY_ENCODER
-- Passed Parameters are:
--   tinst_name = encoder
--   parameters are: Priority = MSB, W = 8, WO = 3
--   program_values = (0:000 1:001 2:010 3:011 4:100 5:101 6:110 7:111 d:000)
--   Rows = 1

library IEEE, ARITHMETIC;
use IEEE.std_logic_1164.all;
use ARITHMETIC.std_logic_arith.all;

-- encoder Entity Description
entity encoder is
  port (
    DIN  : in  std_logic_vector(7 downto 0);
    DOUT : out std_logic_vector(2 downto 0)
  );
end encoder;

-- encoder Architecture Description
architecture rtl of encoder is
begin
  PRIORITY_ENCODER_Process: process (DIN)
    type enc_table is array (0 to 7) of std_logic_vector(2 downto 0);
    constant prio_table : enc_table :=
      ("000", "001", "010", "011", "100", "101", "110", "111");
    variable enc_out : std_logic_vector(2 downto 0);
  begin
    enc_out := "000";
    -- scan from bit 0 up to bit 7 so that the highest set bit wins
    chk_input: for x in DIN'reverse_range loop
      if DIN(x) = '1' then
        enc_out := prio_table(x);
      end if;
    end loop chk_input;
    -- Assign output
    DOUT <= enc_out;
  end process PRIORITY_ENCODER_Process;
end rtl;
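The behavior of the MSB-priority encoder can be mirrored in a few lines of Python for quick checking of test vectors. This is a behavioral sketch only, not generated from the VHDL; the function name `priority_encode` is hypothetical.

```python
def priority_encode(din: int, width: int = 8) -> int:
    """Return the index of the most significant set bit of din, or 0 if none is set."""
    out = 0
    # Scan upward from bit 0, as the loop over DIN'reverse_range does in the
    # VHDL listing, so that a higher set bit overwrites a lower one.
    for x in range(width):
        if (din >> x) & 1:
            out = x
    return out


print(priority_encode(0b00101000))  # 5
print(priority_encode(0b00000000))  # 0
```

Note that an all-zero input and an input with only bit 0 set both encode to 000, as in the listing.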


B.2 Error Cancellation Logic


-- Error cancellation logic for 1-bit Y1 and 3-bit Y2
-- Y(z) = (2z^-2 - z^-3) Y1(z) - 4(1 - z^-1)^2 Y2(z)

LIBRARY ieee, arithmetic;
use ieee.std_logic_1164.all;
use arithmetic.std_logic_arith.all;

entity e_cancel is
  generic (multi_bit: integer := 4);
  port (
    y1:  in  std_logic := '0';
    clk: in  bit := '0';
    y2:  in  signed(multi_bit - 2 downto 0) := (others => '0');
    y:   out signed(multi_bit downto 0)
  );
end e_cancel;

architecture behav of e_cancel is
  signal y1_d1, y1_d0: std_logic := '0';
  signal y2_d1, y2_d0, sum1, sum1_d: signed(multi_bit downto 0) := (others => '0');
  signal sum2, y1_d2, y1_d3, temp: signed(multi_bit downto 0) := (others => '0');
begin
  sum1 <= y2_d0 - y2_d1;
  temp <= sum1 sla 1;
  sum2 <= (y1_d2 sla 1) - y1_d3 - (temp - sum1_d);

  process (clk)
  begin
    if clk = '1' and clk'event then
      y1_d0 <= y1;
      -- sign-extend y2 by one bit and multiply by 2
      y2_d0 <= y2(y2'left) & y2 & '0';
      y1_d1 <= y1_d0;
      y2_d1 <= y2_d0;
      -- expand the 1-bit y1 into a signed word
      for i in y1_d2'left - 1 downto 0 loop
        y1_d2(i) <= y1_d1;
      end loop;
      y1_d2(y1_d2'left) <= not y1_d1;
      y1_d3 <= y1_d2;
      sum1_d <= temp;
      y <= sum2;
    end if;
  end process;
end architecture behav;
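The transfer function in the header comment corresponds to the difference equation y[n] = 2 y1[n-2] - y1[n-3] - 4 (y2[n] - 2 y2[n-1] + y2[n-2]). The sketch below evaluates that ideal equation in Python; it is an illustration of the transfer function only, with hypothetical names, and does not model the extra pipeline registers or the bit-level encoding of y1 in the VHDL listing.

```python
def error_cancel(y1, y2):
    """Evaluate Y(z) = (2z^-2 - z^-3) Y1(z) - 4 (1 - z^-1)^2 Y2(z) sample by sample."""

    def at(seq, i):
        # Samples before the start of the sequence are taken as zero.
        return seq[i] if 0 <= i < len(seq) else 0

    out = []
    for n in range(len(y1)):
        first_stage = 2 * at(y1, n - 2) - at(y1, n - 3)
        second_stage = 4 * (at(y2, n) - 2 * at(y2, n - 1) + at(y2, n - 2))
        out.append(first_stage - second_stage)
    return out


# Impulse on y1 shows the (2z^-2 - z^-3) term.
print(error_cancel([1, 0, 0, 0, 0], [0, 0, 0, 0, 0]))  # [0, 0, 2, -1, 0]
# Impulse on y2 shows the -4 (1 - z^-1)^2 term.
print(error_cancel([0, 0, 0, 0, 0], [1, 0, 0, 0, 0]))  # [-4, 8, -4, 0, 0]
```

A constant y2 contributes nothing after the first two samples, since the double difference of a constant is zero.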

B.3 Comb Filter

-- Decimator using sinc4 filter
-- internal word length l >= lg(d * n^k) + 1
-- d = input word length, n = decimation ratio, k = order of sinc

LIBRARY ieee, arithmetic;
use ieee.std_logic_1164.all;
use arithmetic.std_logic_arith.all;

entity decimator is
  generic (d: integer := 5; k: integer := 4; l: integer := 13);
  port (
    clk1, clk2, rst: in bit := '0';
    x: in  signed(d - 1 downto 0) := (others => '0');
    y: out signed(l - 1 downto 0)
  );
end decimator;

architecture opt of decimator is
  type sinc_adder is array (integer range <>) of signed(l - 1 downto 0);
  signal sum_d, diff: sinc_adder(0 to k);
  signal sum, diff_d: sinc_adder(0 to k - 1);
begin
  -- sign-extend the input to the internal word length
  sum_d(0)(x'left downto 0) <= x;
  sum_d(0)(sum_d(0)'left downto x'left + 1) <= (others => x(x'left));

  -- k accumulators and k differentiators
  gk: for i in 0 to k - 1 generate
    sum(i) <= sum_d(i) + sum_d(i + 1);
    diff(i + 1) <= diff(i) - diff_d(i);
  end generate gk;

  process (clk1, clk2)
  begin
    -- accumulators run at the input rate
    if clk1 = '1' and clk1'event then
      if rst = '1' then
        for i in 0 to k loop
          sum_d(i) <= (others => '0');
        end loop;
      else
        for i in 1 to k loop
          sum_d(i) <= sum(i - 1);
        end loop;
      end if;
    end if;
    -- differentiators run at the decimated rate
    if clk2 = '1' and clk2'event then
      diff(0) <= sum_d(k);
      for i in 0 to k - 1 loop
        diff_d(i) <= diff(i);
      end loop;
      y <= diff(k);
    end if;
  end process;
end architecture opt;

B.4 Half-Band FIR Filters


-- 11 taps and 165 taps half band FIR filters

library ieee, arithmetic;
use ieee.std_logic_1164.all;
use arithmetic.std_logic_arith.all;

entity fir is
  generic (l: integer := 13);
  port (
    clk1, clk2: in bit := '0';
    x: in  signed(l - 1 downto 0) := (others => '0');
    y: out signed(l - 1 downto 0)
  );
end fir;

architecture hbf11 of fir is
  type data_bus is array (integer range <>) of signed(l - 1 downto 0);
  signal x_d: data_bus(0 to 10);
  signal sum: data_bus(0 to 2);
  signal sum_d: data_bus(0 to 3);
begin
  shift: process (clk1, clk2)
  begin
    -- delay line runs at the input rate
    if clk1 = '1' and clk1'event then
      for i in 1 to 10 loop
        x_d(i) <= x_d(i - 1);
      end loop;
    end if;
    -- shift-and-add products are registered at the decimated rate
    if clk2 = '1' and clk2'event then
      sum_d(0) <= (sum(0) sra 7) + (sum(0) sra 8) + (sum(0) sra 11);
      sum_d(1) <= sum(1) sra 4;
      sum_d(2) <= (sum(2) sra 2) + (sum(2) sra 5) + (sum(2) sra 6) + (sum(2) sra 8);
      sum_d(3) <= x_d(5) sra 1;
      y <= sum_d(0) - sum_d(1) + sum_d(2) + sum_d(3);
    end if;
  end process shift;

  x_d(0) <= x;

  -- add the symmetric tap pairs before scaling
  add: for i in 0 to 2 generate
    sum(i) <= x_d(2 * i) + x_d(10 - 2 * i);
  end generate add;
end architecture hbf11;
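The shift amounts in architecture hbf11 define the rounded filter coefficients as sums of powers of two. The following sketch rebuilds the 11 taps with exact rational arithmetic and checks their symmetry and dc gain; the helper name `shifts` is hypothetical, and the tap values are read directly from the sra amounts in the listing.

```python
from fractions import Fraction


def shifts(*amounts):
    """Sum of 2**-k for each right-shift amount k (a multiplierless coefficient)."""
    return sum(Fraction(1, 2 ** k) for k in amounts)


c0 = shifts(7, 8, 11)    # applied to x_d(0) + x_d(10)
c2 = -shifts(4)          # applied to x_d(2) + x_d(8), subtracted in the output sum
c4 = shifts(2, 5, 6, 8)  # applied to x_d(4) + x_d(6)
c5 = shifts(1)           # center tap, applied to x_d(5)

taps = [c0, 0, c2, 0, c4, c5, c4, 0, c2, 0, c0]
dc_gain = sum(taps)

print([float(t) for t in taps])
print(dc_gain, float(dc_gain))  # 1025/1024 1.0009765625
```

Every second tap away from the center is zero, which is the half-band property that lets the hardware skip those products.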

architecture hbf165 of fir is
  type data_bus is array (integer range <>) of signed(l - 1 downto 0);
  signal x_d: data_bus(0 to 163);
  signal sum: data_bus(0 to 40);
  signal sum_d: data_bus(0 to 20);
begin
  shift: process (clk1, clk2)
  begin
    if clk1 = '1' and clk1'event then
      for i in 1 to 163 loop
        x_d(i) <= x_d(i - 1);
      end loop;
    end if;
    if clk2 = '1' and clk2'event then
      sum_d(0) <= (sum(0) sra 11) - (sum(1) sra 12);
      sum_d(1) <= (sum(2) sra 12) - (sum(3) sra 11);
      sum_d(2) <= (sum(4) sra 11) - (sum(5) sra 11);
      sum_d(3) <= (sum(6) sra 11) + (sum(6) sra 12) - (sum(7) sra 10);
      sum_d(4) <= (sum(8) sra 10) - (sum(9) sra 10);
      sum_d(5) <= (sum(10) sra 9) - (sum(11) sra 9);
      sum_d(6) <= (sum(12) sra 9) - (sum(13) sra 9);
      sum_d(7) <= (sum(14) sra 9) + (sum(14) sra 12)
                  - ((sum(15) sra 9) + (sum(15) sra 11));
      sum_d(8) <= (sum(16) sra 9) + (sum(16) sra 10)
                  - ((sum(17) sra 9) + (sum(17) sra 10));
      sum_d(9) <= (sum(18) sra 8) - (sum(19) sra 8);
      sum_d(10) <= (sum(20) sra 8) + (sum(20) sra 11)
                   - ((sum(21) sra 8) + (sum(21) sra 10));
      sum_d(11) <= (sum(22) sra 8) + (sum(22) sra 9)
                   - ((sum(23) sra 8) + (sum(23) sra 9));
      sum_d(12) <= (sum(24) sra 7) - (sum(25) sra 7);
      sum_d(13) <= (sum(26) sra 7) + (sum(26) sra 11)
                   - ((sum(27) sra 7) + (sum(27) sra 9));
      sum_d(14) <= (sum(28) sra 7) + (sum(28) sra 9) + (sum(28) sra 10)
                   - ((sum(29) sra 7) + (sum(29) sra 8));
      sum_d(15) <= (sum(30) sra 7) + (sum(30) sra 8) + (sum(30) sra 9)
                   - (sum(31) sra 6);
      sum_d(16) <= (sum(32) sra 6) + (sum(32) sra 9)
                   - ((sum(33) sra 6) + (sum(33) sra 8) + (sum(33) sra 12));
      sum_d(17) <= (sum(34) sra 6) + (sum(34) sra 7)
                   - ((sum(35) sra 6) + (sum(35) sra 7) + (sum(35) sra 8) + (sum(35) sra 11));
      sum_d(18) <= (sum(36) sra 5) + (sum(36) sra 8)
                   - ((sum(37) sra 5) + (sum(37) sra 6));
      sum_d(19) <= (sum(38) sra 4)
                   - ((sum(39) sra 4) + (sum(39) sra 5) + (sum(39) sra 7) + (sum(39) sra 8));
      -- center tap b(83) = 2^-1 of Appendix C.4 acts on x_d(82)
      sum_d(20) <= (x_d(82) sra 1) + (sum(40) sra 2) + (sum(40) sra 4)
                   + (sum(40) sra 8) + (sum(40) sra 9);
      y <= sum_d(0) + sum_d(1) + sum_d(2) + sum_d(3) + sum_d(4) + sum_d(5)
           + sum_d(6) + sum_d(7) + sum_d(8) + sum_d(9) + sum_d(10) + sum_d(11)
           + sum_d(12) + sum_d(13) + sum_d(14) + sum_d(15) + sum_d(16)
           + sum_d(17) + sum_d(18) + sum_d(19) + sum_d(20);
    end if;
  end process shift;

  x_d(0) <= x;

  -- add the symmetric tap pairs before scaling
  add: for i in 0 to 40 generate
    sum(i) <= x_d(2 * i + 1) + x_d(163 - 2 * i);
  end generate add;
end architecture hbf165;

B.5 Structure Modeling for Digital Circuit


-- digital filter for delta-sigma converter

LIBRARY ieee, arithmetic;
use ieee.std_logic_1164.all;
use arithmetic.std_logic_arith.all;

entity filter is
  generic (multi_bit: integer := 4; l: integer := 12; n: integer := 24);
  port (
    x1: in std_logic := '0';
    clk, rst: in bit := '0';
    x2: in  signed(multi_bit - 2 downto 0) := (others => '0');
    y:  out signed(l - 1 downto 0)
  );
end filter;

architecture struc of filter is
  signal y1, y2, y3: signed(l downto 0) := (others => '0');
  signal x: signed(multi_bit downto 0) := (others => '0');
  signal clk1, clk2, clk3: bit := '0';
  signal count: integer range 0 to n - 1 := 0;
begin
  -- modulo-n counter that drives the divided clocks
  counter: process (clk)
  begin
    if clk = '1' and clk'event then
      if rst = '1' then
        count <= 0;
      else
        if count = n - 1 then
          count <= 0;
        else
          count <= count + 1;
        end if;
      end if;
    end if;
  end process counter;

  -- clk1, clk2, and clk3 divide clk by n/4, n/2, and n
  clk_gen: block
  begin
    clk1 <= '1' when (count = 0 or count = n / 4 or count = n / 2 or count = 3 * n / 4) else '0';
    clk2 <= '1' when (count = 0 or count = n / 2) else '0';
    clk3 <= '1' when count = 0 else '0';
  end block clk_gen;

  i1: entity work.e_cancel(behav) port map (x1, clk, x2, x);
  i2: entity work.decimator(opt) port map (clk, clk1, rst, x, y1);
  i3: entity work.fir(hbf11) port map (clk1, clk2, y1, y2);
  i4: entity work.fir(hbf165) port map (clk2, clk3, y2, y3);

  -- drop the least significant bit of y3
  y <= y3(l downto 1);
end architecture struc;


Appendix C MATLAB PROGRAMS


C.1 Herrmann Estimation of FIR Filter Order

% Formula proposed by Herrmann for approximating the length
% of the linear phase FIR filter.
function m = herr(sf, pc, sc, pr, sr)
% sf: sampling frequency
% pc: passband cutoff frequency
% sc: stopband cutoff frequency
% pr: passband ripple
% sr: stopband ripple
x = (sc - pc)/sf;
d = (0.005309*log10(pr)^2 + 0.07114*log10(pr) - 0.4761)*log10(sr) ...
    - (0.00266*log10(pr)^2 + 0.5941*log10(pr) + 0.4278);
f = 11.012 + 0.51244*(log10(pr) - log10(sr));
m = (d - f*x^2)/x + 1;
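For readers without Matlab, the same estimate can be written in Python. This is a direct transcription of the function above with the same constants; the name `herrmann_length` and the ripple values in the usage lines are illustrative assumptions, not the specifications used in the thesis.

```python
import math


def herrmann_length(sf, pc, sc, pr, sr):
    """Estimate the length of a linear-phase FIR filter (Herrmann's formula).

    sf: sampling frequency, pc: passband cutoff, sc: stopband cutoff,
    pr: passband ripple, sr: stopband ripple (ripples as linear magnitudes).
    """
    x = (sc - pc) / sf  # normalized transition width
    lp, ls = math.log10(pr), math.log10(sr)
    d = (0.005309 * lp ** 2 + 0.07114 * lp - 0.4761) * ls \
        - (0.00266 * lp ** 2 + 0.5941 * lp + 0.4278)
    f = 11.012 + 0.51244 * (lp - ls)
    return (d - f * x ** 2) / x + 1


# Illustrative values only: halving the transition width roughly doubles the length.
print(herrmann_length(1.0, 0.2, 0.3, 0.01, 0.001))
print(herrmann_length(1.0, 0.2, 0.25, 0.01, 0.001))
```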

C.2 Frequency Response of sinc^k Comb Filters

% frequency response for a sinc^k comb filter
function [yy, h, ny, pd] = comb(m, k, f, r)
% m = decimation ratio, k = order of sinc, f = frequency in Hz
% r = oversampling ratio

% construct the coefficient vector for -z^(-m) + 1
x = 1:m+1;
x(1) = -1;
x(m+1) = 1;
for i = 2:m
  x(i) = 0;
end
% the coefficient vector for -z^(-1) + 1
y = -1:2:1;
% the coefficient vectors for (-z^(-m) + 1)^k and (m(-z^(-1) + 1))^k
nd = conv(x, x);   % function conv multiplies two polynomials.
dd = conv(y, y);
for i = 1:k-2
  nd = conv(nd, x);
  dd = conv(dd, y);
end
nd = fliplr(nd);   % flip vectors left-right so that these vectors may be
dd = fliplr(dd);   % used in function freqz.
dd = dd*m^k;
% frequency response of a digital filter given the numerator
% and denominator coefficients in vectors nd and dd.
[h, p] = freqz(nd, dd, 512, f);   % frequency response between 0 and f/2.
ny = 0:f/(512*r):f/r;   % 512-point frequencies in baseband of Nyquist rate.
pd = freqz(nd, dd, ny, f);   % frequency response in range of 0 to f/r.
h = 20*log10(abs(h));   % magnitude in dB.
pd = 20*log10(abs(pd));
yy = 1:512;
for i = 1:512
  yy(i) = (i-1)*f/1024;   % frequency-axis vector.
end
% plot the Bode diagrams.
subplot(2,1,1);
plot(yy, h);
title('Comb Filter Frequency Response');
grid on;
ylabel('Magnitude (dB)');
xlabel('Frequency (Hertz)');
subplot(2,1,2);
plot(ny, pd);
title('Magnitude Decrease in Passband');
grid on;
ylabel('Magnitude (dB)');
xlabel('Frequency (Hertz)');
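The magnitude that the Matlab function obtains from polynomial coefficients also has a closed form, |sin(πfm/fs) / (m sin(πf/fs))|^k, which is convenient for spot checks. The sketch below uses only the Python standard library; the function name `comb_magnitude` is hypothetical, and the 50-MHz sampling rate and 1-MHz band edge in the usage lines are assumptions taken from the title of reference [1] and from Appendix C.3.

```python
import math


def comb_magnitude(freq_hz, fs_hz, m, k):
    """Magnitude of ((1 - z^-m) / (m (1 - z^-1)))^k at frequency freq_hz."""
    if freq_hz == 0:
        return 1.0  # limit of the ratio at dc
    w = math.pi * freq_hz / fs_hz
    return abs(math.sin(m * w) / (m * math.sin(w))) ** k


fs = 50e6  # assumed modulator sampling rate
droop = comb_magnitude(1e6, fs, m=6, k=4)
print(f"magnitude at 1 MHz: {droop:.4f} ({20 * math.log10(droop):.2f} dB)")
print(f"magnitude at fs/6 : {comb_magnitude(fs / 6, fs, m=6, k=4):.3e}")
```

The second line shows the null at fs/m, where the comb filter suppresses the noise that would alias to dc after decimation by m.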

C.3 Two-Stage Half-Band Filter Design


Figure 4.3 is generated by calling the following function half_band in Matlab as half_band(1, 50/6, 10, 164), where 1 corresponds to the 1 MHz passband cutoff frequency, 50/6 indicates the sampling rate of 50/6 = 8.33 MHz for the input signal of the half-band filters, and 10 and 164 are the orders of the 11-tap and 165-tap filters, respectively.

% Calculation of coefficients for two cascaded half band filters
% fp = pass band in MHz, fs = initial sampling frequency in MHz
% n1 = order of the first filter (an n-tap filter is of order n-1)
% n2 = order of the second filter
% n1 and n2 should be even numbers for an odd number of taps (coefficients)
function [b1, b2] = half_band(fp, fs, n1, n2)
f1 = [0 2*fp/fs 1-2*fp/fs 1];   % pairs of frequency points between 0 and 1
m = [1 1 0 0];   % the desired magnitude response at the points specified in f
f2 = [0 4*fp/fs 1-4*fp/fs 1];
b1 = remez(n1, f1, m);   % remez generates equiripple FIR filter
b2 = remez(n2, f2, m);
[h1, w1] = freqz(b1, 1, 512, fs*10^6);   % frequency response
[h2, w2] = freqz(b2, 1, 512, fs*10^6/2);
h1 = 20*log10(abs(h1));   % calculating in dB
h2 = 20*log10(abs(h2));
% plot the Bode diagrams
subplot(2,2,1);
y = 1:512;
for i = 1:512
  y(i) = (i-1)*fs*10^6/1024;
end
plot(y, h1);
title('The First Half Band Filter');
grid on;
axis([0 fs/2*10^6 -110 10]);
ylabel('Magnitude (dB)');
xlabel('Frequency (Hertz)');
subplot(2,2,2);
for i = 1:512
  y(i) = (i-1)*fs*10^6/2048;
end
plot(y, h2);
title('The Second Half Band Filter');
grid on;
axis([0 fs/4*10^6 -110 10]);
ylabel('Magnitude (dB)');
xlabel('Frequency (Hertz)');

[h1, w1] = freqz(b1, 1, 512);
[h2, w2] = freqz(b2, 1, 512);
% plot the normalized diagrams
subplot(2,2,3);
plot(f1, m, '-', w1/pi, abs(h1), '.');   % dashed curve is the ideal response.
grid on;
axis([0 1 -0.1 1.1]);
ylabel('Magnitude');
xlabel('Frequency');
subplot(2,2,4);
plot(f2, m, '-', w2/pi, abs(h2), '.');   % dotted curve is the designed response.
grid on;
axis([0 1 -0.1 1.1]);
ylabel('Magnitude');
xlabel('Frequency');

C.4 Half-Band Filters with Rounded Coe cients


Figure 4.4 is generated by calling hbf11, 50 6 , while Figure 4.5 is generated by calling hbf21, 50 12 . The rounded coe cients are listed in Table 4.2.
% Frequency response of 11-tap half band filter with rounded coefficients
% fp = pass band in MHz, fs = sampling frequency in MHz
function b1 = hbf1(fp, fs)
f1 = [0 2*fp/fs 1-2*fp/fs 1];
m = [1 1 0 0];
% construct vector for rounded coefficients
b1 = [2^-7+2^-8+2^-11 0 -2^-4 0 2^-2+2^-5+2^-6+2^-8 0.5 ...
      2^-2+2^-5+2^-6+2^-8 0 -2^-4 0 2^-7+2^-8+2^-11];
[h1, w1] = freqz(b1, 1, 512, fs*10^6);
h1 = 20*log10(abs(h1));         % take logarithm of the magnitude
subplot(2,2,1);
y = 1:512;
for i = 1:512
  y(i) = (i-1)*fs*10^6/1024;
end
plot(y, h1); title('Log Magnitude'); grid on;
axis([0 fs/2*10^6 -100 10]);
ylabel('Magnitude (dB)'); xlabel('Frequency (Hertz)');
subplot(2,2,3);
plot(y, h1); title('Passband Ripples'); grid on;
axis([0 fp*10^6 -0.015 0.015]);   % passband (0, fp) MHz
ylabel('Magnitude (dB)'); xlabel('Frequency (Hertz)');
[h1, w1] = freqz(b1, 1, 512);
subplot(2,2,2);
plot(f1, m, '-', w1/pi, abs(h1), '.'); title('Normalized Magnitude'); grid on;
axis([0 1 -0.1 1.1]);
ylabel('Magnitude'); xlabel('Frequency');
subplot(2,2,4);
plot(f1, m, '.', w1/pi, abs(h1), '-'); title('Stopband Ripples'); grid on;
axis([1-2*fp/fs 1 -0.0005 0.0015]);   % stopband from 1-fp/(fs/2) to 1
ylabel('Magnitude'); xlabel('Frequency');
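Because every rounded coefficient is a sum of powers of two, the 11-tap filter can be sanity-checked with plain arithmetic, without any filter-design toolbox. The following is a minimal sketch of such a check; the variable names taps_int, dc_gain, nyq_gain and is_symmetric are introduced here for illustration and do not appear in the thesis.

```matlab
% Rounded 11-tap half-band coefficients, as listed in hbf1.
b1 = [2^-7+2^-8+2^-11, 0, -2^-4, 0, 2^-2+2^-5+2^-6+2^-8, 0.5, ...
      2^-2+2^-5+2^-6+2^-8, 0, -2^-4, 0, 2^-7+2^-8+2^-11];

% Scale by 2^11 (the smallest term used is 2^-11) to see integer taps.
taps_int = b1 * 2^11;      % [25 0 -128 0 616 1024 616 0 -128 0 25]

% Gain at zero frequency is the plain sum of the taps.
dc_gain = sum(b1);

% Gain at half the sampling frequency is the alternating-sign sum.
nyq_gain = sum(b1 .* (-1).^(0:10));

% A linear-phase FIR filter has a symmetric impulse response.
is_symmetric = isequal(b1, fliplr(b1));

fprintf('DC gain = %.10f, Nyquist gain = %.10f\n', dc_gain, nyq_gain);
```

The DC gain comes out at 2050/2048 and the gain at half the sampling frequency at 2/2048, so the rounding leaves the passband close to unity and the far end of the stopband close to zero.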

% frequency responses of the 165-tap half-band filter with rounded coefficients
% fp = pass band in MHz, fs = sampling frequency in MHz
function b = hbf2(fp, fs)
f1 = [0 2*fp/fs 1-2*fp/fs 1];
m = [1 1 0 0];
b = remez(164, f1, m);
for i = 1:2:165
  b(i) = 0;
end
% rounded coefficients
b(2)=2^-11; b(4)=-2^-12; b(6)=2^-12; b(8)=-2^-11; b(10)=2^-11; b(12)=-2^-11; b(14)=2^-11+2^-12;
b(16)=-2^-10; b(18)=2^-10; b(20)=-2^-10; b(22)=2^-9; b(24)=-2^-9; b(26)=2^-9; b(28)=-2^-9;
b(30)=2^-9+2^-12; b(32)=-(2^-9+2^-11); b(34)=2^-9+2^-10; b(36)=-(2^-9+2^-10);
b(38)=2^-8; b(40)=-2^-8; b(42)=2^-8+2^-11; b(44)=-(2^-8+2^-10); b(46)=2^-8+2^-9; b(48)=-(2^-8+2^-9);
b(50)=2^-7; b(52)=-2^-7; b(54)=2^-7+2^-11; b(56)=-(2^-7+2^-9); b(58)=2^-7+2^-9+2^-10; b(60)=-(2^-7+2^-8);
b(62)=2^-7+2^-8+2^-9; b(64)=-2^-6; b(66)=2^-6+2^-9; b(68)=-(2^-6+2^-8+2^-12);
b(70)=2^-6+2^-7; b(72)=-(2^-6+2^-7+2^-8+2^-11); b(74)=2^-5+2^-8; b(76)=-(2^-5+2^-6);
b(78)=2^-4; b(80)=-(2^-4+2^-5+2^-7+2^-8); b(82)=2^-2+2^-4+2^-8+2^-9; b(83)=2^-1;
for i = 2:2:82
  b(166-i) = b(i);
end
[h1, w1] = freqz(b, 1, 512, fs*10^6);
h1 = 20*log10(abs(h1));
subplot(2,2,1);
y = 1:512;
for i = 1:512
  y(i) = (i-1)*fs*10^6/1024;
end
plot(y, h1); title('Log Magnitude'); grid on;
axis([0 fs/2*10^6 -100 10]);
ylabel('Magnitude (dB)'); xlabel('Frequency (Hertz)');
subplot(2,2,3);
plot(y, h1); title('Passband Ripples'); grid on;
axis([0 fp*10^6 -0.1 0.1]);
ylabel('Magnitude (dB)'); xlabel('Frequency (Hertz)');
[h1, w1] = freqz(b, 1, 512);
subplot(2,2,2);
plot(f1, m, '-', w1/pi, abs(h1), '.'); title('Normalized Magnitude'); grid on;
axis([0 1 -0.1 1.1]);
ylabel('Magnitude'); xlabel('Frequency');
subplot(2,2,4);
plot(f1, m, '.', w1/pi, abs(h1), '-'); title('Stopband Ripples'); grid on;
axis([1-2*fp/fs 1 -0.005 0.015]);
ylabel('Magnitude'); xlabel('Frequency');
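As a rough plausibility check on the rounded values, the taps nearest the centre of the 165-tap filter can be compared with the ideal half-band impulse response sin(pi*n/2)/(pi*n), where n is the distance from the centre tap. The sketch below does this for three taps. It assumes that a leading minus sign applies to the whole sum of powers of two (for example b(80) = -(2^-4+2^-5+2^-7+2^-8)); that grouping is inferred, not stated in the listing. Only approximate agreement should be expected, since the thesis filter is an equiripple design with rounded coefficients and not a truncated ideal response.

```matlab
% Three taps nearest the centre (index 83), written as sums of powers of
% two. The grouping under the leading minus sign is an assumption.
b82 = 2^-2 + 2^-4 + 2^-8 + 2^-9;        % one sample from the centre
b80 = -(2^-4 + 2^-5 + 2^-7 + 2^-8);     % three samples from the centre
b78 = 2^-4;                             % five samples from the centre

n       = [1 3 5];                      % distance from the centre tap
rounded = [b82 b80 b78];
ideal   = sin(pi*n/2) ./ (pi*n);        % ideal half-band impulse response
err     = rounded - ideal;

% One row per tap: distance, rounded value, ideal value, difference.
fprintf('%d  %.6f  %.6f  %.1e\n', [n; rounded; ideal; err]);
```

With this grouping the three taps alternate in sign and decay in magnitude in the same way as the ideal response, which is what one would expect of a half-band lowpass filter.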
