4 Fast Fourier Transform (FFT)

Principle

The Fast Fourier Transform (FFT) is a versatile tool for signal analysis. The general idea, in terms of electronic signals, is to de-compose a given signal (in the time domain) into sine-shaped components (in the frequency domain). Each component has a frequency, a phase, and an amplitude. The inverse operation is adding all components and will return a signal that is, in a theoretical limit, identical to the original input signal. By providing the parameters of the components, FFT can serve as a filtering tool: It can isolate a periodic signal in a noisy environment and quantify amplitudes and frequencies, detect frequency shifts of a signal, or distinguish amplitudes in high and low-frequency bands for controlling your disco lights.

Several libraries are available that implement FFT and could run on AVR devices. Here, kissFFT was chosen because it focuses on simplicity and compactness: It is easy to integrate into an AVR project, it is implemented in C, and has a useful licensing model (BSD-3-Clause). On the other hand, there are no windowing functions (aside from the implicit rectangular window when picking data blocks out of a stream) and no analysis features such as peak localization - this is up to the application. You may prefer different criteria for your library or even implement your own FFT functionality, but the general upshots from this document are also valid there.

Example

In this section, we examine the implementation and the example code we have created.

The code shows how to run FFT and measure the speed of execution. For simplicity, we use a hard-coded input signal for the FFT algorithm:

 const int16_t sinewave[1024] = { ... }; 
The signal is the same periodic signal as used in the Infinite Impulse Response (IIR) Filters section. It has these properties:
  • Sine wave with fundamental frequency f0 = 440 Hz
  • Sampling rate fS = 44.1 kHz
  • DC offset
  • Strong harmonic overtones (”fuzz” effect, deliberately added)

The nfft parameter selects the sample length in kissFFT, i.e., how many data points are used for the FFT operation. It also determines the number of frequency bins (nfft / 2) in the operation’s output and hence, the width of each bin (fS/nfft). Consequently, nfft affects the speed of the filter.

In the example, we select:

 int nfft = 800;
and therefore:
  • Number of bins nfft / 2 = 400
  • Bin width = 2fS / nfft ≈ 110 Hz

The resulting bin width will work well with the known fundamental frequency of 440 Hz of the example - but for speed optimization, use values for nfft that are powers of two (e.g., 64 or 256).

For real signals in the time domain (just like a sequence of ADC conversions provides), the kissFFT library recommends this function:
    
    kiss_fftr(cfg, cpx_in , cpx_out);      // The actual FFT operation 
where cfg = kiss_fftr_alloc(nfft, 0, 0, 0) is a required set of configuration parameters, cpx_in is the input data (real) and cpx_out is the complex output data.
Note: The kissFFT library provides other functions that may have complex input. For this reason, the input variable name has the prefix cpx. Nonetheless, the input signal is a sequence of ADC sampling values and hence, comprises only real numbers.

Now the application can proceed and evaluate specific information in cpx_out[n] for its purposes. In our case, we calculate the power spectrum pwr, send it to the PC using the USART peripheral, and plot it on the PC using the Data Visualizer:

First, calculate pwr:
        //Calculating the power spectrum
        pwr = sqrt(cpx_out[n].r * cpx_out[n].r + cpx_out[n].i * cpx_out[n].i);
For each frequency bin n, the absolute value of the complex result vector cpx_out[n] is calculated from its real and imaginary components, cpx_out[n].r and cpx_out[n].i, respectively.

When the device is connected to a PC, the MPLAB® Data Visualizer can be used for signal analysis. The following figure shows the FFT of the aforementioned input signal. The upper plot displays the variable values, as sent by the USART: The real and the imaginary components of cpx_out[n] share one scale. Next is the iteration counter in yellow, followed by the power spectrum in red.

Figure 4-1. FFT of a Sine Wave with Low Amplitude
The presentation of the power spectrum over “USART transmission time” (red) is not the most useful since it is difficult to pinpoint exact peak locations. The MPLAB Data Visualizer’s XY Plot feature can plot the power spectrum over the frequency bin number using the iteration counter as X-value, see the lower plot in the figure above (blue). Keeping in mind that one bin is approximately 110 Hz in width, we can identify and quantify peak locations and amplitudes:
  • Bin 0 is defined by the kissFFT library to contain the DC offset of the signal (clipped for legibility)
  • Bin 4 contains the fundamental, around 440 Hz
  • The natural harmonics can be identified:
    • 2nd harmonic at bin 8
    • (There is no 3rd harmonic component at bin 12)
    • 4th harmonic at bin 16
    • 5th harmonic at bin 20
  • The “fuzzy” look of the original signal is caused by the dominant components in bins 24 and 28, representing the sixth and seventh harmonics at around 2.65 kHz and 3.09 kHz, respectively
  • Even higher harmonics and high-frequency noise components can be identified, for example, the tenth harmonic at bin 40

Performance and Properties

In comparison to other filter techniques, FFT is not easy on CPU load: depending on nfft (and the application’s requirements), the device’s CPU can be busy for a substantial time.

When the actual device is accessible, the duration of one FFT can be measured by observing the output level of a digital output pin with an oscilloscope or a logic analyzer. In the example, the Curiosity Nano board LED pin can be toggled:
    	
PORTD.OUTSET = PIN6_bm; // Make PD6 output logic high
kiss_fftr(cfg, cpx_in , cpx_out);      // The actual FFT operation
PORTD.OUTCLR = PIN6_bm; // Make PD6 output logic low
The measured duration can be converted to ‘number of FFT per MHz’ to compare performance or scale it with clock frequency. This method is recommended for applications where timing and CPU load are critical.
When the device or the required tools are not accessible, you can use MPLAB X to measure the speed of the algorithm on a simulated AVR device:
  1. Select “Simulator” as the connected tool in the project properties.
  2. Add suitable breakpoints right before and after the function call kiss_fftr(...); .
  3. Find the Stopwatch window (Window → Debugging → Stopwatch).
  4. Run the debugger.
  5. When the debugger halts at a breakpoint, the Stopwatch window will display the number of virtual clock cycles used since the previous breakpoint or start and the current breakpoint.
This approach provides a quick and easy estimation over the upper theoretical performance limit.

The following table lists measured performance done on the Curiosity Nano.

Table 4-1. Cycle Times - FFT filter
nfftCycles Per FFT
45,300
812,000
1630,100
3267,000
64161,000
128348,000
256810,000
5121,750,000

As an example, an FFT with nfft=64 uses about 161 kilocycles per FFT, corresponding to 6.2 FFT per MHz. Consequently, a device running at a core frequency of 20 MHz could perform twenty times as many, i.e., 124 FFT per second. If the device uses an external crystal with only 32.768 kHz, the same FFT operation takes apporximately five seconds.

Note:

The maximum nfft value is depending on the actual memory configuration of the device and the memory requirements of the rest of the application.

For speed optimization, use values for nfft that are powers of two (e.g., 64 or 256).

Conclusion and Use Cases

We have shown that an AVR device can perform Fast Fourier Transform using a third-party library (kissFFT). The FFT can provide valuable information other digital filtering techniques can’t, but at the cost of CPU cycles: It depends on the application’s requirements and details whether the FFT can be run continuously or not. Alternatively, an FFT can only occasionally be used to optimize parameters for other, faster filters.

Another point to keep in mind is the virtual precision, as suggested by the power spectrum, vs. the actual accuracy of the measurement:

One aspect is the quality of the input sample data and the time base used: The oscillator providing the time base can suffer a temperature drift of several percent. This is less of a problem for self-contained applications for signal analysis, especially when there is a short time between sampling and usage of the FFT. Though, it is relevant when interacting with external devices that have their own time base.

The other aspect is the width of a frequency bin, determined by the signal’s sampling rate and nfft: In the example above, the power spectrum was well suited to identify higher harmonics. On the other hand, even when exploiting the tenth harmonic, we can not determine whether the fundamental component was actually tuned to 440 Hz or, musically speaking, de-tuned.