This is a heavily modified implementation of FFTPack [1,2], with the following advantages:
N*log(N)
, because Bluestein's algorithm [3] is used for these cases.Twiddle factor computation:
[0; pi/4]
for higher accuracysincospi()
is used, which actually computes sin(x)
and (cos(x)-1)
.n
sin/cos pairs are required, the adjusted sincospi()
is only called 2*sqrt(n)
times; the remaining values are obtained by evaluating the angle addition theorems in a numerically accurate way.Parallel invocation:
Efficient codelets are available for the factors:
Larger prime factors are handled by somewhat less efficient, generic routines.
For lengths with very large prime factors, Bluestein's algorithm is used, and instead of an FFT of length n
, a convolution of length n2 >= 2*n-1
is performed, where n2
is chosen to be highly composite.
[1] Swarztrauber, P. 1982, Vectorizing the Fast Fourier Transforms (New York: Academic Press), 51 [2] https://www.netlib.org/fftpack/ [3] https://en.wikipedia.org/wiki/Chirp_Z-transform