This is essentially a 12-point sinc resampler, unless it's resampling to a rate
higher than the output, at which point it will vary between 12 and 24 points
and do anti-aliasing to avoid/reduce frequencies going over nyquist.
Code provided by Christopher Fitzgerald.
SSE uses reverse ordering, such that component 0 is the last in memory.
_mm_load_* and _mm_loadu_*, and the corresponding stores, do not change the
memory ordering.