18-791 Lecture #17 Introduction To The Fast Fourier Transform Algorithm Richard M. Stern

The document discusses the fast Fourier transform (FFT) algorithm. It begins by outlining the basic workings of the radix-2 decimation-in-time FFT algorithm. It then shows how the FFT provides significant computational savings over direct computation of the discrete Fourier transform (DFT) by reducing the operation count from O(N^2) to O(NlogN). Specifically, it decomposes the DFT into smaller DFTs using the Cooley-Tukey algorithm, represented via a signal flowgraph. This decomposition process is continued until only basic 2-point DFTs or "butterflies" remain.


18-791 Lecture #17

INTRODUCTION TO THE
FAST FOURIER TRANSFORM ALGORITHM

Richard M. Stern

Department of Electrical and Computer Engineering


Carnegie Mellon University
Pittsburgh, Pennsylvania 15213
Phone: +1 (412) 268-2535
FAX: +1 (412) 268-3890
rms@cs.cmu.edu
http://www.ece.cmu.edu/~rms
October 24, 2005
Introduction

 Today we will begin our discussion of the family of algorithms
known as “Fast Fourier Transforms” (FFTs), which have revolutionized
digital signal processing
 What is the FFT?
– A collection of “tricks” that exploit the symmetry of the DFT calculation to
make its execution much faster
– The speedup increases with DFT size
 Today we will outline the basic workings of the simplest
formulation, the radix-2 decimation-in-time algorithm
 Thursday we will discuss some of the variations and extensions
– Alternate structures
– Non-radix-2 formulations
Introduction, continued

 Some dates:
– ~1805 - algorithm first described by Gauss
– 1965 - algorithm rediscovered (not for the first time) by Cooley and
Tukey
 In 1967 (spring of my freshman year), calculation of an 8192-point
DFT on the top-of-the-line IBM 7094 took …
– ~30 minutes using conventional techniques
– ~5 seconds using FFTs

Measures of computational efficiency

 Could consider
– Number of additions
– Number of multiplications
– Amount of memory required
– Scalability and regularity
 For the present discussion we’ll focus mostly on the number of
multiplications as a measure of computational complexity
– Multiplications are more costly than additions for fixed-point processors
– They have the same cost as additions for floating-point processors, but the
numbers of the two operations are comparable

Computational Cost of Discrete-Time Filtering

 Convolution of an N-point input with an M-point unit sample
response …
 Direct convolution:

y[n] = \sum_k x[k] \, h[n-k]

– Number of multiplies ≈ MN
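As a concrete reference point, direct convolution can be sketched in a few lines (the function name `direct_convolve` is illustrative, not from the lecture); the nested loops make the ≈ MN multiply count explicit.

```python
def direct_convolve(x, h):
    """Direct convolution y[n] = sum_k x[k] h[n-k]; about len(x)*len(h) multiplies."""
    N, M = len(x), len(h)
    y = [0.0] * (N + M - 1)
    for n in range(N + M - 1):
        # only k with 0 <= k < N and 0 <= n-k < M contribute
        for k in range(max(0, n - M + 1), min(n + 1, N)):
            y[n] += x[k] * h[n - k]
    return y
```

For example, `direct_convolve([1, 2, 3], [1, 1])` returns `[1.0, 3.0, 5.0, 3.0]`.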

Computational Cost of Discrete-Time Filtering

 Convolution of an N-point input with an M-point unit sample
response …
 Using transforms directly:

X[k] = \sum_{n=0}^{N-1} x[n] e^{-j 2\pi k n / N}

– Computation of an N-point DFT requires N^2 multiplies
– Each convolution requires three DFTs of length N+M-1 plus an
additional N+M-1 complex multiplies, or

3(N+M-1)^2 + (N+M-1)

– For N ≈ M, for example, the computation is O(N^2)

Computational Cost of Discrete-Time Filtering

 Convolution of an N-point input with an M-point unit sample
response …
 Using overlap-add with sections of length L:
– N/L sections, 2 DFTs per section of size L+M-1, plus additional multiplies
for the DFT coefficients, plus one more DFT for h[n]:

\frac{2N}{L}(L+M-1)^2 + \frac{N}{L}(L+M-1) + (L+M-1)^2

– For very large N, the computation is still proportional to M^2
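A minimal overlap-add sketch (names and structure assumed, with a direct per-section convolution standing in for the per-section DFT processing the slide is costing out): the input is cut into length-L sections, each section is filtered, and the length L+M-1 section outputs are overlapped and summed.

```python
def overlap_add(x, h, L):
    """Filter x with h by processing length-L sections and summing the overlaps."""
    M = len(h)
    y = [0.0] * (len(x) + M - 1)
    for start in range(0, len(x), L):
        block = x[start:start + L]
        # per-section convolution; in the FFT-based method this is where the
        # forward DFTs, coefficient multiplies, and inverse DFT would go
        for n in range(len(block) + M - 1):
            for k in range(max(0, n - M + 1), min(n + 1, len(block))):
                y[start + n] += block[k] * h[n - k]
    return y
```

The result matches direct convolution of the whole input; only the cost per output sample changes.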

The Cooley-Tukey decimation-in-time algorithm

 Consider the DFT algorithm for N an integer power of 2, N = 2^\nu

X[k] = \sum_{n=0}^{N-1} x[n] W_N^{nk} = \sum_{n=0}^{N-1} x[n] e^{-j 2\pi nk/N}, \quad W_N = e^{-j 2\pi/N}

 Create separate sums for even and odd values of n:

X[k] = \sum_{n \; \mathrm{even}} x[n] W_N^{nk} + \sum_{n \; \mathrm{odd}} x[n] W_N^{nk}

 Letting n = 2r for n even and n = 2r+1 for n odd, we obtain

X[k] = \sum_{r=0}^{(N/2)-1} x[2r] W_N^{2rk} + \sum_{r=0}^{(N/2)-1} x[2r+1] W_N^{(2r+1)k}

The Cooley-Tukey decimation-in-time algorithm

 Splitting indices in time, we have obtained

X[k] = \sum_{r=0}^{(N/2)-1} x[2r] W_N^{2rk} + \sum_{r=0}^{(N/2)-1} x[2r+1] W_N^{(2r+1)k}

 But W_N^2 = e^{-j 2\pi \cdot 2/N} = e^{-j 2\pi/(N/2)} = W_{N/2}, so W_N^{2rk} = W_{N/2}^{rk} and W_N^{(2r+1)k} = W_N^k W_{N/2}^{rk}

 So …

X[k] = \sum_{r=0}^{(N/2)-1} x[2r] W_{N/2}^{rk} + W_N^k \sum_{r=0}^{(N/2)-1} x[2r+1] W_{N/2}^{rk}

(N/2-point DFT of x[2r])  (N/2-point DFT of x[2r+1])
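The even/odd split maps directly onto a recursive implementation (a sketch; N is assumed to be a power of 2): compute the two N/2-point DFTs G[k] and H[k] and combine them as X[k] = G[k] + W_N^k H[k], using the periodicity of G and H for the upper half of the bins.

```python
import cmath

def fft(x):
    """Radix-2 decimation-in-time FFT for len(x) a power of 2."""
    N = len(x)
    if N == 1:
        return list(x)
    G = fft(x[0::2])   # N/2-point DFT of the even-indexed samples x[2r]
    H = fft(x[1::2])   # N/2-point DFT of the odd-indexed samples x[2r+1]
    X = [0j] * N
    for k in range(N // 2):
        W = cmath.exp(-2j * cmath.pi * k / N)   # twiddle factor W_N^k
        X[k] = G[k] + W * H[k]
        X[k + N // 2] = G[k] - W * H[k]         # since W_N^(k + N/2) = -W_N^k
    return X
```

Each level of the recursion halves the DFT size, which is exactly the decomposition the flowgraphs on the following slides depict.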

Savings so far …

 We have split the DFT computation into two halves:

X[k] = \sum_{n=0}^{N-1} x[n] W_N^{nk}
     = \sum_{r=0}^{(N/2)-1} x[2r] W_{N/2}^{rk} + W_N^k \sum_{r=0}^{(N/2)-1} x[2r+1] W_{N/2}^{rk}

 Have we gained anything? Consider the nominal number of multiplications for N = 8:
– The original form produces 8^2 = 64 multiplications
– The new form produces 2(4^2) + 8 = 40 multiplications
– So we’re already ahead … let’s keep going!

Signal flowgraph notation

 In generalizing this formulation, it is most convenient to adopt
a graphic approach …
 Signal flowgraph notation describes the three basic DSP
operations:
– Addition: x[n] and y[n] enter a node; x[n] + y[n] leaves it
– Multiplication by a constant: a branch with gain a takes x[n] to a·x[n]
– Delay: a branch labeled z^{-1} takes x[n] to x[n-1]

Signal flowgraph representation of 8-point DFT

 Recall that the DFT is now of the form X[k] = G[k] + W_N^k H[k]
 The DFT in (partial) flowgraph notation:

Continuing with the decomposition …

 So why not break up into additional DFTs? Let’s take the
upper 4-point DFT and break it up into two 2-point DFTs:

The complete decomposition into 2-point DFTs

Now let’s take a closer look at the 2-point DFT

 The expression for the 2-point DFT is:

X[k] = \sum_{n=0}^{1} x[n] W_2^{nk} = \sum_{n=0}^{1} x[n] e^{-j 2\pi nk/2}

 Evaluating for k = 0, 1 we obtain

X[0] = x[0] + x[1]
X[1] = x[0] + e^{-j 2\pi \cdot 1/2} x[1] = x[0] - x[1]

which in signal flowgraph notation looks like …

 This topology is referred to as the basic butterfly
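In code, the 2-point DFT is nothing more than a sum and a difference (the helper name `butterfly` is mine, not the lecture's):

```python
def butterfly(x0, x1):
    """2-point DFT: X[0] = x0 + x1, X[1] = x0 - x1; no multiplies needed."""
    return x0 + x1, x0 - x1
```

For example, `butterfly(3, 1)` returns `(4, 2)`.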

The complete 8-point decimation-in-time FFT

Number of multiplies for N-point FFTs

 Let N = 2^\nu, where \nu = \log_2(N)
 (\log_2(N) columns) × (N/2 butterflies/column) × (2 mults/butterfly),
or approximately N \log_2(N) multiplies
Comparing processing with and without FFTs

 “Slow” DFT requires N^2 mults; FFT requires N \log_2(N) mults
 Filtering using FFTs requires 3(N \log_2(N)) + 2N mults
 Let \alpha_1 = N \log_2(N) / N^2 and \alpha_2 = [3(N \log_2(N)) + N] / N^2

N       α₁      α₂
16      .25     .8124
32      .156    .50
64      .0935   .297
128     .055    .171
256     .031    .097
1024    .0097   .0302

Note: 1024-point FFTs accomplish speedups of roughly 100 for DFTs and 30 for filtering!
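The ratio columns can be reproduced with a couple of one-liners (a sketch; note that the α₂ values in the table match the 3N log₂(N) + N form used in the definition of α₂):

```python
import math

def alpha1(N):
    """FFT-based DFT multiplies relative to the N**2 of the direct DFT."""
    return N * math.log2(N) / N**2

def alpha2(N):
    """FFT-based filtering (three FFTs plus N products) relative to N**2."""
    return (3 * N * math.log2(N) + N) / N**2
```

For example, `alpha1(16)` gives 0.25 and `alpha2(16)` about 0.81; at N = 1024 the ratios are roughly 0.0098 and 0.030, i.e. speedups near 100 and 30.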
Additional timesavers: reducing multiplications
in the basic butterfly

 As we derived it, the basic butterfly has branch gains W_N^r and
W_N^{r + N/2}
 Since W_N^{N/2} = -1, the two factors differ only in sign, so we can
reduce the computation by a factor of 2 by premultiplying by W_N^r
alone, leaving branch gains of just 1 and -1
Bit reversal of the input

 Recall the first stages of the 8-point FFT, and consider the binary
representation of the indices of the input:

0 → 000
4 → 100
2 → 010
6 → 110
1 → 001
5 → 101
3 → 011
7 → 111

 If these binary indices are reversed bit-for-bit, we get the binary
sequence representing 0, 1, 2, 3, 4, 5, 6, 7
 Hence the indices of the FFT inputs are said to be in bit-reversed
order
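The bit-reversed input ordering can be generated directly (function name assumed); for N = 8 it reproduces the index list above.

```python
def bit_reversed_order(N):
    """Indices 0..N-1 with their log2(N)-bit binary representations reversed."""
    nu = N.bit_length() - 1                        # nu = log2(N) for N a power of 2
    def rev(n):
        return int(format(n, f"0{nu}b")[::-1], 2)  # reverse the nu-bit string
    return [rev(n) for n in range(N)]
```

For example, `bit_reversed_order(8)` returns `[0, 4, 2, 6, 1, 5, 3, 7]`, the input order of the 8-point decimation-in-time FFT.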

Some comments on bit reversal

 In the implementation of the FFT that we discussed, the input
is bit reversed and the output is developed in natural order
 Some other implementations of the FFT have the input in
natural order and the output bit reversed (to be described
Thursday)
 In some situations it is convenient to implement filtering
applications by:
– Using FFTs with input in natural order and output in bit-reversed order
– Multiplying the frequency coefficients together (in bit-reversed order)
– Using inverse FFTs with input in bit-reversed order and output in natural order
 Computing in this fashion means we never have to compute the bit
reversal explicitly
Summary

 We developed the structure of the basic decimation-in-time FFT
 Use of the FFT algorithm reduces the number of multiplies
required to perform the DFT by a factor of more than 100 for
1024-point DFTs, with the advantage increasing with DFT size
 Next time we will consider inverse FFTs, alternate forms of the
FFT, and FFTs for DFT sizes that are not an integer power of 2

