
Theoretical Review of FFT Implementations for Digital Signal Processors

Abstract

The discrete Fourier transform (DFT) is one of the most important tools in digital signal processing, and the
Fast Fourier Transform (FFT) is a powerful algorithmic optimization of the DFT. The world is moving quickly from
analog to digital and, in essence, the FFT strives to support that shift. Although the DFT and the FFT produce the
same outputs, the FFT is organized to eliminate redundant calculations. Several algorithms have been developed to
improve the computation time of the FFT; the overall aim remains the same, namely to reduce the number of complex
operations. This paper sheds light on different implementations through which the efficiency of the FFT can be
improved in order to design more powerful signal processors.

1. Introduction
1.1. Fourier Series
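A Fourier series is a representation of a periodic function as a sum of sines and cosines. For a function f(x)
of period 2π (a period assumed here for concreteness),

$$ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos nx + b_n \sin nx \right) $$

Solving for the coefficients gives

$$ a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx \, dx, \qquad b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx \, dx $$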
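As a numerical check of the coefficient formulas, the following minimal sketch approximates a_n and b_n for a square wave. It assumes a period of 2π and the standard normalization by 1/π; the function names and the sample count are illustrative choices, not part of the original paper.

```python
import math

def fourier_coefficients(f, n, samples=20000):
    """Approximate a_n and b_n of a 2*pi-periodic f with the midpoint rule on [-pi, pi]."""
    h = 2.0 * math.pi / samples
    a = 0.0
    b = 0.0
    for j in range(samples):
        x = -math.pi + (j + 0.5) * h  # midpoint of the j-th subinterval
        a += f(x) * math.cos(n * x)
        b += f(x) * math.sin(n * x)
    return a * h / math.pi, b * h / math.pi

def square_wave(x):
    """Odd square wave on [-pi, pi]: +1 for x >= 0, -1 otherwise (illustrative test signal)."""
    return 1.0 if x >= 0 else -1.0

a1, b1 = fourier_coefficients(square_wave, 1)
print(a1, b1)
```

For this odd square wave the cosine coefficients vanish and b_1 should come out close to 4/π, which is the known analytic value.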

1.2. Fourier Series Transform


The process of decomposing a signal into its individual frequency components is known as a Fourier transform.
Applications include audio processing, in which individual sounds in a recording are picked out using this
transform.
1.3. Discrete Fourier Transform
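Given a sequence of N samples f(n), indexed by n = 0..N-1, the discrete Fourier transform (DFT) is defined
as F(k), where k = 0..N-1:

$$ F(k) = \sum_{n=0}^{N-1} f(n) \, e^{-i 2\pi k n / N} $$

The F(k) are often called the 'Fourier coefficients' or 'harmonics'. (Some texts include a 1/N normalization
factor in this definition.)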
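The definition can be evaluated directly, as in the minimal sketch below. It assumes the common unnormalized convention with a negative exponent, F(k) = Σ f(n) e^{-i2πkn/N}; the function name `dft` is illustrative.

```python
import cmath

def dft(f):
    """Directly evaluate F(k) = sum_n f(n) * exp(-2*pi*i*k*n/N) for k = 0..N-1.

    This costs on the order of N^2 complex multiply-adds.
    """
    N = len(f)
    return [
        sum(f[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

# A constant sequence has all of its content in the k = 0 term.
F = dft([1.0, 1.0, 1.0, 1.0])
print([round(abs(v), 9) for v in F])
```

Each of the N outputs needs a sum over N inputs, which is the quadratic cost that FFT algorithms avoid.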


1.4. Fast Fourier Transform
An FFT computes the DFT and produces exactly the same result as evaluating the DFT definition directly; the
most important difference is that an FFT is much faster. The DFT is defined by the formula:

$$ X_k = \sum_{n=0}^{N-1} x_n \, e^{-i 2\pi k n / N}, \qquad k = 0, \ldots, N-1 $$

where x_0, ..., x_{N−1} are complex numbers.
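To give a rough sense of "much faster", the sketch below compares nominal complex-multiplication counts: N^2 for direct evaluation against (N/2) log2 N for a radix-2 FFT. These are textbook counts that include trivial multiplications by ±1, and the function names are illustrative.

```python
def direct_dft_multiplies(n):
    """Nominal complex multiplications for evaluating the DFT definition directly."""
    return n * n

def radix2_fft_multiplies(n):
    """Nominal complex multiplications for a radix-2 FFT: N/2 butterflies in each of log2(N) stages."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    stages = n.bit_length() - 1  # exact log2 for a power of two
    return (n // 2) * stages

for n in (8, 1024):
    print(n, direct_dft_multiplies(n), radix2_fft_multiplies(n))
```

For N = 1024 this gives 1,048,576 multiplications for the direct method against 5,120 for the radix-2 FFT.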


2. Problem Analysis
2.1. Complexity Bounds
Lower bounds on the complexity of FFTs, along with their exact operation counts, remain open questions
in signal processing. Although today's computers have robust caching mechanisms and optimized process
scheduling, the arithmetic operation count of an FFT is still important. It has not been firmly established
whether FFTs in fact require Ω(N log N) or more operations. Analysis of these bounds has so far focused on
the ordinary complex-data case because it is the simplest, but complex-data FFTs are closely related to
real-data FFTs, so results for one bear directly on the other.
2.2. Approximation & Accuracy
The trade-off between approximation error and speed is another area of analysis associated with FFT
algorithms. It can be illustrated with Guo and Burrus' wavelet-based approximate FFT, which exploits
sparse inputs/outputs more efficiently than an exact FFT can. If the data are sparse, with only K of the N
Fourier coefficients nonzero, the complexity can be reduced to O(K log(N) log(N/K)). Another
computational issue linked to FFT algorithms is accuracy. In fixed-point arithmetic, the finite-precision
errors accumulated by FFT algorithms are more severe, and rescaling is required at each intermediate
stage of decompositions such as Cooley-Tukey.
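The per-stage rescaling mentioned above can be sketched with a single radix-2 butterfly on Q15 fixed-point values. This is a minimal illustration under assumptions that are not from the paper: the Q15 format, the round-to-nearest multiply, and the halving of both outputs are all choices made here, and the names `q15_mul` and `scaled_butterfly` are hypothetical.

```python
Q = 15                        # Q15 format: real value = integer / 2**15
INT16_MAX = (1 << 15) - 1     # largest value a signed 16-bit word can hold

def q15_mul(a, b):
    """Multiply two Q15 integers and round to nearest, returning a Q15 integer."""
    return (a * b + (1 << (Q - 1))) >> Q

def scaled_butterfly(a, b, w):
    """Radix-2 butterfly on Q15 complex pairs (re, im) with both outputs halved.

    Computes (a + w*b) / 2 and (a - w*b) / 2. Halving at every stage keeps the
    output magnitudes from growing when the input magnitudes are below one.
    """
    ar, ai = a
    br, bi = b
    wr, wi = w
    tr = q15_mul(br, wr) - q15_mul(bi, wi)   # real part of w*b
    ti = q15_mul(br, wi) + q15_mul(bi, wr)   # imaginary part of w*b
    top = ((ar + tr) >> 1, (ai + ti) >> 1)
    bottom = ((ar - tr) >> 1, (ai - ti) >> 1)
    return top, bottom

half = 1 << 14                # 0.5 in Q15
one = INT16_MAX               # closest Q15 value to 1.0
top, bottom = scaled_butterfly((half, 0), (half, 0), (one, 0))
print(top, bottom)
```

In this example the unscaled sum 16384 + 16384 = 32768 would not fit in a signed 16-bit word, while the halved result does. The cost of the scaling is one bit of precision lost per stage.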
3. Design Requirement, Specifications & Proposed Solutions
There are multiple ways to decompose an FFT, of which Radix-2 is the simplest. However, it has been
shown that the Radix-4 FFT has a fair advantage in encrypted-domain implementations. In fact, for large
transforms Radix-4 is about 20% more efficient than Radix-2. Nonetheless, Radix-2 and Radix-4 are the
most common FFTs. Radix-8 is rarely used because its higher complexity and hardware cost yield only a
slight gain in overall efficiency. Some illustrations for Radix-2 and Radix-4 FFTs:
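As one concrete illustration of the radix-4 case, the sketch below splits a length-N input into four interleaved subsequences and combines their transforms with a four-point butterfly. It is a recursive decimation-in-time version written for readability, assuming N is a power of 4; it is not the paper's implementation, and the name `fft_radix4` is illustrative.

```python
import cmath

def fft_radix4(x):
    """Recursive decimation-in-time radix-4 FFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return list(x)
    if n == 0 or n % 4:
        raise ValueError("length must be a power of 4")
    q = n // 4
    # Transforms of the four subsequences x[r], x[r+4], x[r+8], ... for r = 0..3.
    subs = [fft_radix4(x[r::4]) for r in range(4)]
    out = [0j] * n
    for k in range(q):
        # Apply the twiddle factors W_N^(r*k) to the four partial results.
        a, b, c, d = (cmath.exp(-2j * cmath.pi * r * k / n) * subs[r][k] for r in range(4))
        # Four-point butterfly: the only remaining factors are 1, -1, i and -i.
        out[k] = a + b + c + d
        out[k + q] = a - 1j * b - c + 1j * d
        out[k + 2 * q] = a - b + c - d
        out[k + 3 * q] = a + 1j * b - c - 1j * d
    return out

X = fft_radix4([complex(v) for v in range(16)])
print(abs(X[0]))
```

The four-point butterfly needs no general multiplications, since multiplying by ±i only swaps and negates real and imaginary parts. This is where the saving over radix-2 comes from.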
3.1. Common-Factor FFTs
Also called Cooley-Tukey FFTs, common-factor FFTs are the most common class of FFTs. The
factors of N used in the decomposition share common factor(s). Radix‐r and mixed‐radix are the
two categories of common-factor FFTs. For radix-r, N = r^k and the same radix-r butterfly is used in
every stage; for mixed-radix, N is not necessarily of the form r^k and the radices of the component
butterflies are not all equal.

Data flow diagram for N=8: a decimation-in-time radix-2 FFT breaks a length-N DFT into two
length-N/2 DFTs followed by a combining stage consisting of many size-2 DFTs called "butterfly"
operations (so-called because of the shape of the data-flow diagrams).
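The decimation-in-time radix-2 decomposition described above can be sketched recursively as follows. This is a minimal illustration that favors clarity over speed (practical implementations are usually iterative and in-place); it assumes the length is a power of two, and the name `fft_radix2` is illustrative.

```python
import cmath

def fft_radix2(x):
    """Recursive decimation-in-time radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])   # length-N/2 DFT of the even-indexed samples
    odd = fft_radix2(x[1::2])    # length-N/2 DFT of the odd-indexed samples
    half = n // 2
    out = [0j] * n
    for k in range(half):
        # Butterfly: a size-2 DFT on even[k] and the twiddled odd[k].
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out

X = fft_radix2([1, 0, 0, 0, 0, 0, 0, 0])
print([round(abs(v), 9) for v in X])
```

For N = 8 the recursion is three levels deep with four butterflies per level, which corresponds to the three stages of the data-flow diagram.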
3.2. Prime-Factor FFTs
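The transform length must be a product of numbers that are relatively prime. The advantage of these
algorithms is the absence of W_N twiddle-factor multiplications. Their drawback is irregular sorting of
the input and output data and irregular addressing for the butterflies. Prime-factor FFTs consist of a
re-indexing of the input and output arrays which, when substituted into the DFT, yields a 2-dimensional
DFT. Suppose that N = N1N2, where N1 and N2 are relatively prime. The re-indexing of the input n and
output k can then be written as:

$$ n = n_1 N_2 + n_2 N_1 \bmod N, \qquad k = k_1 N_2^{-1} N_2 + k_2 N_1^{-1} N_1 \bmod N $$

where N_1^{-1} denotes the modular multiplicative inverse of N_1 modulo N_2 (and vice versa for N_2^{-1}),
with n_1, k_1 = 0..N_1-1 and n_2, k_2 = 0..N_2-1. Substituting this re-indexing into the DFT formula, we get

$$ X_{k_1 N_2^{-1} N_2 + k_2 N_1^{-1} N_1} = \sum_{n_1=0}^{N_1-1} \left( \sum_{n_2=0}^{N_2-1} x_{n_1 N_2 + n_2 N_1} \, e^{-i 2\pi n_2 k_2 / N_2} \right) e^{-i 2\pi n_1 k_1 / N_1} $$

The inner and outer sums are DFTs of size N2 and N1, respectively.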
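The re-indexing can be checked numerically with the sketch below. It assumes one standard index mapping for the prime-factor algorithm, n = n1·N2 + n2·N1 mod N on the input and k = k1·(N2⁻¹)·N2 + k2·(N1⁻¹)·N1 mod N on the output, where the inverses are taken modulo N1 and N2 respectively. The small DFTs are evaluated directly for simplicity, and the names `small_dft` and `prime_factor_fft` are illustrative. The three-argument `pow` with a negative exponent requires Python 3.8 or later.

```python
import cmath
from math import gcd

def small_dft(x):
    """Direct DFT, used here for the short inner and outer transforms."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def prime_factor_fft(x, n1, n2):
    """Length n1*n2 DFT via prime-factor re-indexing; n1 and n2 must be coprime."""
    n = n1 * n2
    if len(x) != n or gcd(n1, n2) != 1:
        raise ValueError("need len(x) == n1 * n2 with n1 and n2 relatively prime")
    inv_n1 = pow(n1, -1, n2)   # inverse of n1 modulo n2
    inv_n2 = pow(n2, -1, n1)   # inverse of n2 modulo n1
    # Inner DFTs of size n2, one for each value of the index a = n_1.
    inner = [small_dft([x[(a * n2 + b * n1) % n] for b in range(n2)])
             for a in range(n1)]
    out = [0j] * n
    for k2 in range(n2):
        # Outer DFT of size n1; note that no twiddle factors are applied in between.
        outer = small_dft([inner[a][k2] for a in range(n1)])
        for k1 in range(n1):
            out[(k1 * inv_n2 * n2 + k2 * inv_n1 * n1) % n] = outer[k1]
    return out

x = [complex(v, -v % 4) for v in range(15)]
X = prime_factor_fft(x, 3, 5)
print(abs(X[0] - sum(x)))
```

The only work besides the small DFTs is the index arithmetic, which is the irregular addressing mentioned above.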
3.3. Other FFTs
3.3.1. Split‐radix FFTs apply when N = p^k, where p is a small prime number and k is a positive integer; this
method can be more efficient than standard radix‐p FFTs. Butterfly for SRFFT algorithm:
3.3.2. The Winograd Fourier Transform Algorithm (WFTA) is a type of prime-factor algorithm based on DFT
building blocks that use a highly efficient convolution algorithm; it requires many additions but
only order-N multiplications.
3.3.3. The Goertzel DFT is not considered a normal FFT because its computational complexity is still of
order N^2. It allows a subset of the DFT's N output terms to be calculated efficiently.
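The Goertzel approach to computing a single output term can be sketched as follows. This is a minimal version of the standard second-order recurrence, assuming the unnormalized DFT convention with a negative exponent; the function name `goertzel` is illustrative.

```python
import cmath
import math

def goertzel(x, k):
    """Return the single DFT term X_k of the sequence x using Goertzel's recurrence."""
    n = len(x)
    omega = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(omega)
    s_prev, s_prev2 = 0.0, 0.0
    for sample in x:
        # One real multiplication per sample when the input is real.
        s = sample + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # A single complex multiplication at the end recovers X_k.
    return cmath.exp(1j * omega) * s_prev - s_prev2

signal = [math.cos(2.0 * math.pi * 2 * n / 8) for n in range(8)]
print(abs(goertzel(signal, 2)))
```

Each output term costs on the order of N operations, so computing M terms costs on the order of N·M. This is attractive when only a few terms are needed and matches the order-N^2 cost when all N are computed.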
4. Conclusion
Several research areas have to be addressed in the future to extend FFT research to emerging standards
and applications. Although the major area of application for FFTs remains digital signal processing, they
are also used extensively in the aerospace industry, energy management systems, image processing, etc.
Thus, the challenges related to the computational efficiency of FFTs remain the focus of ongoing research
in this field.

