It's the only implementation currently, so there's no point to having it stored
as a function pointer in the filter struct. Even if there were SIMD versions,
it'd be a global selection, not per-instance.
Since it was merely acting as an extension of it anyway, with the second delay
line tap (for late reverb) copying attenuated samples to the decorrelator line
that was being tapped off of. Just extend the delay line and offset the
decorrelator taps to be relative to the late reverb tap.
Since it's accumulating multiple HRIRs for two output speakers, it seems to be
a better option to preserve the amplitude of the high-frequency decoder instead
of increasing it, and reduce the amplitude of the low-frequency decoder to
compensate.
14 in total, an 8-point cube and a 6-point diamond shape, to help improve sound
localization a bit. Incurs no real extra CPU cost once the IRs are built.
Less than ideal since documentations warn it may not list 'neon' even if it's
really supported. However, the "proper" APIs to check for NEON extensions don't
seem to exist in my toolchain.
Ideally the band-pass should probably happen closer to output, like gain is.
However, doing that would require 16 filters (4 early + 4 late channels, each
with a low-pass and high-pass filter), compared to the two needed to do it on
input.
The combined source and listener gains now can't exceed a multiplier of 16
(~24dB). This is to avoid mixes getting out of control with large volume
boosts, which reduces the effective precision given by floating-point.
This appears to be how Creative's Windows drivers handle it, and is necessary
for at least the Windows version of UT2k4 (otherwise it tries to play a source
while suspended, checks and sees it's stopped, then kills it before it's given
a chance to start playing).
Consequently, the internal properties it gets mixed with are determined by what
the source properties are at the time of the play call, and the listener
properties at the time of the suspend call.
This does not change alDeferUpdatesSOFT, which will still hold the play state
change until alProcessUpdatesSOFT.
Note that this now also causes all playing sources to update when an effect
slot is updated. This is a bit wasteful, as it should only need to re-update
sources that are using the effect slot (and only when a relevant property is
changed), but it's good enough. Especially with deferring since all playing
sources are going to get updated on the process call anyway.
This allows us to not have to play around with trying to avoid duplicate state
pointers, since the reference count will ensure they're deleted as appropriate.
The only caveat is that the mixer is not allowed to decrement references, since
that can cause the object to be freed (which the mixer code is not allowed to
do).
The source's voice holds a copy of the last properties it received, so listener
updates can make sources recalculate internal properties from that stored copy.
The CalcEvIndices and CalcAzIndices methods were dependent on the FPU being in
round-to-zero mode, which is not the case for panning initialization. And since
we just need the closest index and don't need to lerp between them, it's better
to just directly calculate the index with rounding.
Using all the HRIRs seems to have problems with volume balancing, due in part
to HRTF data sets not having uniform enough measurements for a simple decoder
matrix to work (and generating a proper one that would work better is not that
easy). This still maintains the benefits of decoding ambisonics directly to
HRTF, namely that it only needs to filter the 4 ambisonic channels and can use
more optimized HRTF filtering methods on those channels. It can also be
improved further with frequency-dependent processing baked into the generated
coefficients, incurring no extra run-time cost for it.
Last time this attempted to average the HRIRs according to their contribution
to a given B-Format channel as if they were loudspeakers, as well as averaging
the HRIR delays. The latter part resulted in the loss of the ITD (inter-aural
time delay), a key component of HRTF.
This time, the HRIRs are averaged similar to above, except instead of averaging
the delays, they're applied to the resulting coefficients (for example, a delay
of 8 would apply the HRIR starting at the 8th sample of the target HRIR). This
does roughly double the IR length, as the largest delay is about 35 samples
while the filter is normally 32 samples. However, this is still smaller the
original data set IR (which was 256 samples), it also only needs to be applied
to 4 channels for first-order ambisonics, rather than the 8-channel cube. So
it's doing twice as much work per sample, but only working on half the number
of samples.
Additionally, since the resulting HRIRs no longer rely on an extra delay line,
a more efficient HRTF mixing function can be made that doesn't use one. Such a
function can also avoid the per-sample stepping parameters the original uses.