The FFT converts a signal from the time domain (amplitude over time) to the frequency domain (amplitude over frequency). Add sinusoidal components below and observe both representations side by side.
Each component is a sine wave at a given frequency and amplitude. The signal you hear (and see in the time domain) is the sum of all components.
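As a rough illustration of that sum, the sketch below builds a short signal from a list of (frequency, amplitude) pairs. The function name, the 8000 Hz sample rate, and the two example components are assumptions chosen for the example, not values taken from this page.

```python
import math

def synthesize(components, sample_rate=8000, num_samples=64):
    """Sum sine waves given as (frequency_hz, amplitude) pairs.

    sample_rate and num_samples are illustrative defaults (assumptions).
    """
    signal = []
    for n in range(num_samples):
        t = n / sample_rate  # time of sample n, in seconds
        signal.append(sum(a * math.sin(2 * math.pi * f * t) for f, a in components))
    return signal

# Two hypothetical components: 440 Hz at amplitude 1.0 and 880 Hz at amplitude 0.5.
signal = synthesize([(440.0, 1.0), (880.0, 0.5)])
```

Each entry of `signal` is one time-domain sample. Its absolute value can never exceed the sum of the component amplitudes.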
FFT size controls the frequency resolution. A larger FFT size gives more frequency bins and sharper peaks, but it analyzes a longer window of samples, so the display responds more slowly to changes in the signal. The number of frequency bins is always FFT size / 2.
Frequency resolution = sample rate / FFT size, in Hz per bin
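The relationships between sample rate, FFT size, bin count, and bin width can be written as a few small helpers. This is a minimal sketch: the function names and the 48000 Hz sample rate are assumptions for illustration only.

```python
def frequency_resolution(sample_rate, fft_size):
    """Width of one frequency bin, in Hz."""
    return sample_rate / fft_size

def bin_count(fft_size):
    """Number of frequency bins shown for a given FFT size."""
    return fft_size // 2

def bin_frequency(k, sample_rate, fft_size):
    """Frequency (Hz) at the start of bin k."""
    return k * sample_rate / fft_size

# Assumed example values: 48000 Hz sample rate, FFT size 2048.
resolution = frequency_resolution(48000, 2048)
bins = bin_count(2048)
```

Doubling the FFT size halves `resolution` and doubles `bins`, which is the trade-off described above.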
This is the raw signal: amplitude vs. time. It is the sum of all the sine wave components above.
The FFT decomposes the signal into its frequency components. Each bar shows how much energy exists at that frequency. You should see peaks at the frequencies of the active components above.
The Discrete Fourier Transform (DFT) computes the correlation of the input signal with sine and cosine waves at the frequency of each bin. The FFT is simply a fast algorithm for computing the DFT of an N-sample input in O(N log N) instead of O(N^2) time.
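The correlation view of the DFT can be sketched directly as the slow O(N^2) computation. The code below is a naive DFT, not an FFT. The function name and the 2/N scaling (chosen so a sine of amplitude A landing exactly on a bin reads about A) are assumptions made for this example.

```python
import math

def dft_magnitudes(signal):
    """Naive O(N^2) DFT: correlate the signal with a cosine and a sine per bin.

    Returns magnitudes for bins 0 .. N/2 - 1. The 2/N scaling is an
    illustrative choice; it overstates the DC bin (k = 0) by a factor of two.
    """
    n = len(signal)
    mags = []
    for k in range(n // 2):
        # Correlation with the cosine wave at bin k (real part).
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        # Correlation with the sine wave at bin k (imaginary part, sign ignored).
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        mags.append(2 * math.hypot(re, im) / n)
    return mags

# A 32-sample sine that completes exactly 4 cycles, so it lands on bin 4.
test_signal = [math.sin(2 * math.pi * 4 * i / 32) for i in range(32)]
mags = dft_magnitudes(test_signal)
```

The nested loop over bins and samples is what makes this quadratic. An FFT produces the same values by reusing intermediate results.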
Key relationships:
Try changing the FFT size and notice how the peaks become sharper or broader. Try moving two component frequencies close together and increasing the FFT size to see them resolve into separate peaks.
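The same experiment can be approximated offline. The sketch below analyzes two nearby tones at two FFT sizes and counts the distinct peaks in each spectrum. It uses a naive DFT for self-containment, and the sample rate (8192 Hz), the tone frequencies (1024 Hz and 1088 Hz), and the peak threshold are all assumptions picked so the bin arithmetic stays simple.

```python
import math

SAMPLE_RATE = 8192        # assumed, so bin widths are whole numbers of Hz
FREQS = (1024.0, 1088.0)  # two hypothetical components, 64 Hz apart

def two_tone(num_samples):
    """Sum of two unit-amplitude sines sampled at SAMPLE_RATE."""
    return [
        sum(math.sin(2 * math.pi * f * i / SAMPLE_RATE) for f in FREQS)
        for i in range(num_samples)
    ]

def magnitude_spectrum(signal):
    """Naive DFT magnitudes for bins 0 .. N/2 - 1 (scaled by 2/N)."""
    n = len(signal)
    mags = []
    for k in range(n // 2):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        mags.append(2 * math.hypot(re, im) / n)
    return mags

def count_peaks(mags, rel_threshold=0.5):
    """Count local maxima that exceed rel_threshold times the largest magnitude."""
    floor = rel_threshold * max(mags)
    return sum(
        1
        for k in range(1, len(mags) - 1)
        if mags[k] > floor and mags[k] > mags[k - 1] and mags[k] > mags[k + 1]
    )

# FFT size 64 gives 128 Hz per bin: the 64 Hz spacing is narrower than one bin.
peaks_small = count_peaks(magnitude_spectrum(two_tone(64)))
# FFT size 256 gives 32 Hz per bin: the tones sit two bins apart.
peaks_large = count_peaks(magnitude_spectrum(two_tone(256)))
```

Under these assumed values, the smaller size merges the two tones into one peak while the larger size separates them. With other frequencies or thresholds, spectral leakage may change the count.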