This is somewhat awkward to support, but the best effort we can do here
is to analyze various Load/Store opcodes and deduce the ideal overall
alignment based on this. This is not a 100% perfect solution, but should
be correct for any reasonable use case.
Also fix various nitpicks with BDA support while I'm at it.
Add test shader for new functionality.
Add legacy test reference shader for unrelated buffer-bitcast
test, that doesn't seem to have been added previously.