1*3f1979aaSAndroid Build Coastguard Worker# PFFFT: a pretty fast FFT and fast convolution with PFFASTCONV 2*3f1979aaSAndroid Build Coastguard Worker 3*3f1979aaSAndroid Build Coastguard Worker## TL;DR 4*3f1979aaSAndroid Build Coastguard Worker 5*3f1979aaSAndroid Build Coastguard WorkerPFFFT does 1D Fast Fourier Transforms, of single precision real and 6*3f1979aaSAndroid Build Coastguard Workercomplex vectors. It tries do it fast, it tries to be correct, and it 7*3f1979aaSAndroid Build Coastguard Workertries to be small. Computations do take advantage of SSE1 instructions 8*3f1979aaSAndroid Build Coastguard Workeron x86 cpus, Altivec on powerpc cpus, and NEON on ARM cpus. The 9*3f1979aaSAndroid Build Coastguard Workerlicense is BSD-like. 10*3f1979aaSAndroid Build Coastguard Worker 11*3f1979aaSAndroid Build Coastguard Worker 12*3f1979aaSAndroid Build Coastguard WorkerPFFASTCONV does fast convolution (FIR filtering), of single precision 13*3f1979aaSAndroid Build Coastguard Workerreal vectors, utilizing the PFFFT library. The license is BSD-like. 14*3f1979aaSAndroid Build Coastguard Worker 15*3f1979aaSAndroid Build Coastguard WorkerPFDSP contains a few other signal processing functions. 16*3f1979aaSAndroid Build Coastguard WorkerCurrently, mixing and carrier generation functions are contained. 17*3f1979aaSAndroid Build Coastguard WorkerIt is work in progress - also the API! 18*3f1979aaSAndroid Build Coastguard WorkerThe fast convolution from PFFASTCONV might get merged into PFDSP. 19*3f1979aaSAndroid Build Coastguard Worker 20*3f1979aaSAndroid Build Coastguard Worker 21*3f1979aaSAndroid Build Coastguard Worker## Why does it exist: 22*3f1979aaSAndroid Build Coastguard Worker 23*3f1979aaSAndroid Build Coastguard WorkerI was in search of a good performing FFT library , preferably very 24*3f1979aaSAndroid Build Coastguard Workersmall and with a very liberal license. 25*3f1979aaSAndroid Build Coastguard Worker 26*3f1979aaSAndroid Build Coastguard WorkerWhen one says "fft library", FFTW ("Fastest Fourier Transform in the 27*3f1979aaSAndroid Build Coastguard WorkerWest") is probably the first name that comes to mind -- I guess that 28*3f1979aaSAndroid Build Coastguard Worker99% of open-source projects that need a FFT do use FFTW, and are happy 29*3f1979aaSAndroid Build Coastguard Workerwith it. However, it is quite a large library , which does everything 30*3f1979aaSAndroid Build Coastguard Workerfft related (2d transforms, 3d transforms, other transformations such 31*3f1979aaSAndroid Build Coastguard Workeras discrete cosine , or fast hartley). And it is licensed under the 32*3f1979aaSAndroid Build Coastguard WorkerGNU GPL , which means that it cannot be used in non open-source 33*3f1979aaSAndroid Build Coastguard Workerproducts. 34*3f1979aaSAndroid Build Coastguard Worker 35*3f1979aaSAndroid Build Coastguard WorkerAn alternative to FFTW that is really small, is the venerable FFTPACK 36*3f1979aaSAndroid Build Coastguard Workerv4, which is available on NETLIB. A more recent version (v5) exists, 37*3f1979aaSAndroid Build Coastguard Workerbut it is larger as it deals with multi-dimensional transforms. This 38*3f1979aaSAndroid Build Coastguard Workeris a library that is written in FORTRAN 77, a language that is now 39*3f1979aaSAndroid Build Coastguard Workerconsidered as a bit antiquated by many. FFTPACKv4 was written in 1985, 40*3f1979aaSAndroid Build Coastguard Workerby Dr Paul Swarztrauber of NCAR, more than 25 years ago ! And despite 41*3f1979aaSAndroid Build Coastguard Workerits age, benchmarks show it that it still a very good performing FFT 42*3f1979aaSAndroid Build Coastguard Workerlibrary, see for example the 1d single precision benchmarks 43*3f1979aaSAndroid Build Coastguard Worker[here](http://www.fftw.org/speed/opteron-2.2GHz-32bit/). It is however not 44*3f1979aaSAndroid Build Coastguard Workercompetitive with the fastest ones, such as FFTW, Intel MKL, AMD ACML, 45*3f1979aaSAndroid Build Coastguard WorkerApple vDSP. The reason for that is that those libraries do take 46*3f1979aaSAndroid Build Coastguard Workeradvantage of the SSE SIMD instructions available on Intel CPUs, 47*3f1979aaSAndroid Build Coastguard Workeravailable since the days of the Pentium III. These instructions deal 48*3f1979aaSAndroid Build Coastguard Workerwith small vectors of 4 floats at a time, instead of a single float 49*3f1979aaSAndroid Build Coastguard Workerfor a traditionnal FPU, so when using these instructions one may expect 50*3f1979aaSAndroid Build Coastguard Workera 4-fold performance improvement. 51*3f1979aaSAndroid Build Coastguard Worker 52*3f1979aaSAndroid Build Coastguard WorkerThe idea was to take this fortran fftpack v4 code, translate to C, 53*3f1979aaSAndroid Build Coastguard Workermodify it to deal with those SSE instructions, and check that the 54*3f1979aaSAndroid Build Coastguard Workerfinal performance is not completely ridiculous when compared to other 55*3f1979aaSAndroid Build Coastguard WorkerSIMD FFT libraries. Translation to C was performed with [f2c]( 56*3f1979aaSAndroid Build Coastguard Workerhttp://www.netlib.org/f2c/). The resulting file was a bit edited in 57*3f1979aaSAndroid Build Coastguard Workerorder to remove the thousands of gotos that were introduced by 58*3f1979aaSAndroid Build Coastguard Workerf2c. You will find the fftpack.h and fftpack.c sources in the 59*3f1979aaSAndroid Build Coastguard Workerrepository, this a complete translation of [fftpack]( 60*3f1979aaSAndroid Build Coastguard Workerhttp://www.netlib.org/fftpack/), with the discrete cosine transform 61*3f1979aaSAndroid Build Coastguard Workerand the test program. There is no license information in the netlib 62*3f1979aaSAndroid Build Coastguard Workerrepository, but it was confirmed to me by the fftpack v5 curators that 63*3f1979aaSAndroid Build Coastguard Workerthe [same terms do apply to fftpack v4] 64*3f1979aaSAndroid Build Coastguard Worker(http://www.cisl.ucar.edu/css/software/fftpack5/ftpk.html). This is a 65*3f1979aaSAndroid Build Coastguard Worker"BSD-like" license, it is compatible with proprietary projects. 66*3f1979aaSAndroid Build Coastguard Worker 67*3f1979aaSAndroid Build Coastguard WorkerAdapting fftpack to deal with the SIMD 4-element vectors instead of 68*3f1979aaSAndroid Build Coastguard Workerscalar single precision numbers was more complex than I originally 69*3f1979aaSAndroid Build Coastguard Workerthought, especially with the real transforms, and I ended up writing 70*3f1979aaSAndroid Build Coastguard Workermore code than I planned.. 71*3f1979aaSAndroid Build Coastguard Worker 72*3f1979aaSAndroid Build Coastguard Worker 73*3f1979aaSAndroid Build Coastguard Worker## The code: 74*3f1979aaSAndroid Build Coastguard Worker 75*3f1979aaSAndroid Build Coastguard Worker### Good old C: 76*3f1979aaSAndroid Build Coastguard WorkerThe FFT API is very very simple, just make sure that you read the comments in `pffft.h`. 77*3f1979aaSAndroid Build Coastguard Worker 78*3f1979aaSAndroid Build Coastguard WorkerThe Fast convolution's API is also very simple, just make sure that you read the comments 79*3f1979aaSAndroid Build Coastguard Workerin `pffastconv.h`. 80*3f1979aaSAndroid Build Coastguard Worker 81*3f1979aaSAndroid Build Coastguard Worker### C++: 82*3f1979aaSAndroid Build Coastguard WorkerA simple C++ wrapper is available in `pffft.hpp`. 83*3f1979aaSAndroid Build Coastguard Worker 84*3f1979aaSAndroid Build Coastguard Worker 85*3f1979aaSAndroid Build Coastguard Worker### Git: 86*3f1979aaSAndroid Build Coastguard WorkerThis archive's source can be downloaded with git including the submodules: 87*3f1979aaSAndroid Build Coastguard Worker``` 88*3f1979aaSAndroid Build Coastguard Workergit clone --recursive https://github.com/hayguen/pffft.git 89*3f1979aaSAndroid Build Coastguard Worker``` 90*3f1979aaSAndroid Build Coastguard Worker 91*3f1979aaSAndroid Build Coastguard WorkerWith `--recursive` the submodules for Green and Kiss-FFT are also fetched, 92*3f1979aaSAndroid Build Coastguard Workerto use them in the benchmark. You can omit the `--recursive`-option. 93*3f1979aaSAndroid Build Coastguard Worker 94*3f1979aaSAndroid Build Coastguard WorkerFor retrieving the submodules later: 95*3f1979aaSAndroid Build Coastguard Worker``` 96*3f1979aaSAndroid Build Coastguard Workergit submodule update --init 97*3f1979aaSAndroid Build Coastguard Worker``` 98*3f1979aaSAndroid Build Coastguard Worker 99*3f1979aaSAndroid Build Coastguard Worker 100*3f1979aaSAndroid Build Coastguard Worker## CMake: 101*3f1979aaSAndroid Build Coastguard WorkerThere's now CMake support to build the static libraries `libPFFFT.a` 102*3f1979aaSAndroid Build Coastguard Workerand `libPFFASTCONV.a` from the source files, plus the additional 103*3f1979aaSAndroid Build Coastguard Worker`libFFTPACK.a` library. Later one's sources are there anyway for the benchmark. 104*3f1979aaSAndroid Build Coastguard Worker 105*3f1979aaSAndroid Build Coastguard Worker 106*3f1979aaSAndroid Build Coastguard Worker## Origin: 107*3f1979aaSAndroid Build Coastguard WorkerOrigin for this code is Julien Pommier's pffft on bitbucket: 108*3f1979aaSAndroid Build Coastguard Worker[https://bitbucket.org/jpommier/pffft/](https://bitbucket.org/jpommier/pffft/) 109*3f1979aaSAndroid Build Coastguard Worker 110*3f1979aaSAndroid Build Coastguard Worker 111*3f1979aaSAndroid Build Coastguard Worker## Comparison with other FFTs: 112*3f1979aaSAndroid Build Coastguard Worker 113*3f1979aaSAndroid Build Coastguard WorkerThe idea was not to break speed records, but to get a decently fast 114*3f1979aaSAndroid Build Coastguard Workerfft that is at least 50% as fast as the fastest FFT -- especially on 115*3f1979aaSAndroid Build Coastguard Workerslowest computers . I'm more focused on getting the best performance 116*3f1979aaSAndroid Build Coastguard Workeron slow cpus (Atom, Intel Core 1, old Athlons, ARM Cortex-A9...), than 117*3f1979aaSAndroid Build Coastguard Workeron getting top performance on today fastest cpus. 118*3f1979aaSAndroid Build Coastguard Worker 119*3f1979aaSAndroid Build Coastguard WorkerIt can be used in a real-time context as the fft functions do not 120*3f1979aaSAndroid Build Coastguard Workerperform any memory allocation -- that is why they accept a 'work' 121*3f1979aaSAndroid Build Coastguard Workerarray in their arguments. 122*3f1979aaSAndroid Build Coastguard Worker 123*3f1979aaSAndroid Build Coastguard WorkerIt is also a bit focused on performing 1D convolutions, that is why it 124*3f1979aaSAndroid Build Coastguard Workerprovides "unordered" FFTs , and a fourier domain convolution 125*3f1979aaSAndroid Build Coastguard Workeroperation. 126*3f1979aaSAndroid Build Coastguard Worker 127*3f1979aaSAndroid Build Coastguard WorkerVery interesting is [https://www.nayuki.io/page/free-small-fft-in-multiple-languages](https://www.nayuki.io/page/free-small-fft-in-multiple-languages). 128*3f1979aaSAndroid Build Coastguard WorkerIt shows how small an FFT can be - including the Bluestein algorithm, but it's everything else than fast. 129*3f1979aaSAndroid Build Coastguard WorkerThe whole C++ implementation file is 161 lines, including the Copyright header, see 130*3f1979aaSAndroid Build Coastguard Worker[https://github.com/nayuki/Nayuki-web-published-code/blob/master/free-small-fft-in-multiple-languages/FftComplex.cpp](https://github.com/nayuki/Nayuki-web-published-code/blob/master/free-small-fft-in-multiple-languages/FftComplex.cpp) 131*3f1979aaSAndroid Build Coastguard Worker 132*3f1979aaSAndroid Build Coastguard Worker## Dependencies / Required Linux packages 133*3f1979aaSAndroid Build Coastguard Worker 134*3f1979aaSAndroid Build Coastguard WorkerOn Debian/Ubuntu Linux following packages should be installed: 135*3f1979aaSAndroid Build Coastguard Worker 136*3f1979aaSAndroid Build Coastguard Worker``` 137*3f1979aaSAndroid Build Coastguard Workersudo apt-get install build-essential gcc g++ cmake 138*3f1979aaSAndroid Build Coastguard Worker``` 139*3f1979aaSAndroid Build Coastguard Worker 140*3f1979aaSAndroid Build Coastguard Workerfor benchmarking, you should have additional packages: 141*3f1979aaSAndroid Build Coastguard Worker``` 142*3f1979aaSAndroid Build Coastguard Workersudo apt-get install libfftw3-dev gnuplot 143*3f1979aaSAndroid Build Coastguard Worker``` 144*3f1979aaSAndroid Build Coastguard Worker 145*3f1979aaSAndroid Build Coastguard Workerrun the benchmarks with `./bench_all.sh ON` , to include benchmarks of fftw3 .. 146*3f1979aaSAndroid Build Coastguard Workermore details in README of [https://github.com/hayguen/pffft_benchmarks](https://github.com/hayguen/pffft_benchmarks) 147*3f1979aaSAndroid Build Coastguard Worker 148*3f1979aaSAndroid Build Coastguard Worker 149*3f1979aaSAndroid Build Coastguard Worker## Benchmark results 150*3f1979aaSAndroid Build Coastguard Worker 151*3f1979aaSAndroid Build Coastguard WorkerThe benchmark results are stored in a separate git-repository: 152*3f1979aaSAndroid Build Coastguard WorkerSee [https://github.com/hayguen/pffft_benchmarks](https://github.com/hayguen/pffft_benchmarks). 153*3f1979aaSAndroid Build Coastguard Worker 154*3f1979aaSAndroid Build Coastguard WorkerThis is to keep the sources small. 155*3f1979aaSAndroid Build Coastguard Worker 156