Fourier/resynthesis

Overlap-add Resynthesis

With ideal analysis parameters and infinite-precision arithmetic, one can reassemble the original waveform perfectly from a succession of IDFTs by means of overlap-add (OA) resynthesis. The basic idea is to perform an inverse FFT for each frame and then sum the overlapping frames. The following procedure describes the analysis/OA-resynthesis process in the phase vocoder.
  1. Read M samples of the input signal x into a local buffer memory. M is the frame length, and the time advance or hop size in samples from one frame to the next is H.
  2. Multiply the samples in the frame by an analysis window w[m] of length M.
  3. Extend the windowed frame on both sides to obtain a zero-padded frame of length N, where N is a power of two larger than M. N/M is the zero-padding factor.
  4. Apply an FFT of length N to the frame to obtain the spectrum. Each frequency channel k of the analysis is referred to as a bin.
  5. Convert each frequency bin from rectangular to polar form to obtain the magnitude (absolute value) and phase. Differentiate the phase to obtain instantaneous frequency. It is customary to interpolate the amplitude, phase, and frequency trajectories from one hop to the next. The phase is usually discarded at this point, but it can be regenerated as needed as the integral of the instantaneous frequency. (Step 5 is usually referred to as phase vocoder analysis.)
  6. Apply any desired modifications to the analysis data, including time scaling, pitch transposition, formant modification, et cetera.
  7. Apply an inverse FFT to obtain a time waveform for each frame.
  8. If the phase spectrum was edited, apply a resynthesis window to the output of the inverse FFT.
  9. Reconstruct the final output by overlapping and adding the output frames.
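
The nine steps above can be sketched in Python with NumPy. This is only a minimal illustration under stated assumptions, not the procedure from the book: the function name `stft_ola_roundtrip`, the periodic Hann window, the default values of M, H and N, and the final division by the summed squared window are all choices made for this example. Step 6 is left empty, so the output should match the input wherever frames fully overlap.

```python
import numpy as np

def stft_ola_roundtrip(x, M=512, H=128, N=1024):
    """Analyse x frame by frame and resynthesise it by overlap-add.

    M: frame length, H: hop size, N: FFT length (all illustrative defaults).
    """
    w = np.hanning(M + 1)[:M]          # periodic Hann window (assumed choice)
    pad = (N - M) // 2                 # zeros added on the left side
    y = np.zeros(len(x))
    norm = np.zeros(len(x))            # running sum of w*w for normalisation

    for start in range(0, len(x) - M + 1, H):
        frame = x[start:start + M] * w                      # steps 1-2
        padded = np.pad(frame, (pad, N - M - pad))          # step 3
        spectrum = np.fft.fft(padded)                       # step 4
        mag = np.abs(spectrum)                              # step 5 (polar form)
        phase = np.angle(spectrum)
        # step 6: modifications of mag/phase would go here
        out = np.fft.ifft(mag * np.exp(1j * phase)).real    # step 7
        out = out[pad:pad + M] * w                          # step 8 (always applied here)
        y[start:start + M] += out                           # step 9
        norm[start:start + M] += w * w

    covered = norm > 1e-8              # avoid dividing by ~0 at the edges
    y[covered] /= norm[covered]
    return y

# Usage: with no modification, the interior of the signal is recovered.
rng = np.random.default_rng(0)
x = rng.standard_normal(4096)
y = stft_ola_roundtrip(x)
max_error = np.max(np.abs(x[512:-512] - y[512:-512]))
```

Dividing by the accumulated squared window is one common way to make the overlapping frames sum to unity; with a window and hop size that already satisfy that condition, a fixed scale factor would do instead.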

Assessment of Overlap-add Resynthesis

Overlap-add resynthesis is a rather delicate operation in the sense that modifications made in step 6 can easily affect the quality of the resynthesis process. In particular, if the sum of the overlapping windows does not add up to a constant, then a form of modulation will be heard at the frame rate, that is, the sampling rate divided by the hop size H. Indeed, any additive or multiplicative transformations that disturb the perfect summation criterion cause side effects (Allen and Rabiner 1977). The effects are particularly noticeable in rapidly varying parts of a sound, such as attacks and decays. In general, transformations using OA resynthesis work best for constant or slowly changing sounds.
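
The summation criterion can be checked numerically for a given window and hop size. The sketch below is an illustration only; the helper name `window_overlap_ripple` and the choice of a periodic Hann window are assumptions made for this example. It overlap-adds copies of the window and reports how far the sum deviates from a constant in the steady-state region.

```python
import numpy as np

def window_overlap_ripple(w, H, n_frames=32):
    """Relative peak-to-peak ripple of the overlap-added window sum."""
    M = len(w)
    total = np.zeros(H * (n_frames - 1) + M)
    for i in range(n_frames):
        total[i * H:i * H + M] += w
    steady = total[M:-M]               # ignore the fade-in and fade-out
    return (steady.max() - steady.min()) / steady.mean()

M = 512
hann = np.hanning(M + 1)[:M]           # periodic Hann window (assumed choice)

ripple_good = window_overlap_ripple(hann, M // 2)      # sum is constant
ripple_bad = window_overlap_ripple(hann, 3 * M // 4)   # sum fluctuates once per hop
```

With a hop of M/2 the shifted Hann windows sum to a constant, so the ripple is at the level of rounding error. With a hop of 3M/4 the sum rises and falls once per hop, which is the amplitude modulation described above.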

Curtis Roads, "The Computer Music Tutorial"

One step that Curtis Roads does not seem to mention in the 'FFT recipe' in the Appendix of "The Computer Music Tutorial" is a circular shift of the windowed, zero-padded frame before the FFT, which puts the frame into zero-phase condition.
Because the centre of the windowed frame does not sit at index 0 of the FFT buffer, the FFT sees the frame as delayed, which adds a linear phase term (a group delay) to every bin. To avoid it, perform a circular shift before the FFT so that the centre of the window lands on index 0, and perform the inverse circular shift (by the same amount) after the inverse FFT.
Skipping this step does not matter if one only wishes to alter the magnitudes of the bins. But transformations that affect the phase, performed without the circular shift, will result in a 'bad' resynthesised sound.
I have found it difficult to locate a mathematical explanation of why this circular shift is needed. A starting point can be found at http://ccrma.stanford.edu/~jos/filters/Filters_Preserving_Phase.html .
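
The effect of the circular shift can be seen in a small NumPy experiment. This is a sketch under assumptions: the helper names `zero_phase_buffer` and `undo_zero_phase` are invented for this example, and an odd-length symmetric Hann window is used so that it has a single centre sample. Placed with its centre at index 0, the window has a purely real spectrum (zero phase); placed at the start of the buffer, it does not.

```python
import numpy as np

def zero_phase_buffer(frame, N):
    """Zero-pad frame to length N with its centre sample at index 0."""
    M = len(frame)
    half = M // 2
    buf = np.zeros(N)
    buf[:M - half] = frame[half:]      # centre and right half go to the start
    buf[N - half:] = frame[:half]      # left half wraps around to the end
    return buf

def undo_zero_phase(buf, M):
    """Inverse circular shift: recover the length-M frame from the buffer."""
    half = M // 2
    return np.concatenate((buf[len(buf) - half:], buf[:M - half]))

M, N = 255, 1024
w = np.hanning(M)                      # symmetric, odd length (assumed choice)

shifted = np.fft.fft(zero_phase_buffer(w, N))
unshifted = np.fft.fft(np.pad(w, (0, N - M)))

imag_shifted = np.max(np.abs(shifted.imag))      # close to zero
imag_unshifted = np.max(np.abs(unshifted.imag))  # clearly non-zero (linear phase)
restored = undo_zero_phase(zero_phase_buffer(w, N), M)
```

Both spectra have the same magnitudes; only the phase differs, which is why the shift is irrelevant for magnitude-only processing.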
