Storage was in place already, so mostly just dealing with bitcasts and
constants.
Simplies some of the bitcasting logic, and this exposed some bugs in the
implementation. Refactor to use correct width integers with explicit bitcast opcodes.
Structs are aligned as you would expect in MSL (maximum member
alignment), and it is not minimum 16 bytes like in std140.
Also rename the dummy "pad" members to a reserved naming scheme.
Apparently we didn't use those yet. MSL seems to be able to alias struct
types and variable types to a degree, so that's why it has escaped
testing until now.
If not enough components are provided in the shader,
the shader MSL compiler throws an error rather than make components
undefined. This hurts portability, so we need to add explicit padding
here.
This is a fairly fundamental change on how IDs are handled.
It serves many purposes:
- Improve performance. We only need to iterate over IDs which are
relevant at any one time.
- Makes sure we iterate through IDs in SPIR-V module declaration order
rather than ID space. IDs don't have to be monotonically increasing,
which was an assumption SPIRV-Cross used to have. It has apparently
never been a problem until now.
- Support LUTs of structs. We do this by interleaving declaration of
constants and struct types in SPIR-V module order.
To support this, the ParsedIR interface needed to change slightly.
Before setting any ID with variant_set<T> we let ParsedIR know
that an ID with a specific type has been added. The surface for change
should be minimal.
ParsedIR will maintain a per-type list of IDs which the cross-compiler
will need to consider for later.
Instead of looping over ir.ids[] (which can be extremely large), we loop
over types now, using:
ir.for_each_typed_id<SPIRVariable>([&](uint32_t id, SPIRVariable &var) {
handle_variable(var);
});
Now we make sure that we're never looking at irrelevant types.
This allows shaders to declare and use pointer-type variables. Pointers
may be loaded and stored, be the result of an `OpSelect`, be passed to
and returned from functions, and even be passed as inputs to the `OpPhi`
instruction. All types of pointers may be used as variable pointers.
Variable pointers to storage buffers and workgroup memory may even be
loaded from and stored to, as though they were ordinary variables. In
addition, this enables using an interior pointer to an array as though
it were an array pointer itself using the `OpPtrAccessChain`
instruction.
This is a rather large and involved change, mostly because this is
somewhat complicated with a lot of moving parts. It's a wonder
SPIRV-Cross's output is largely unchanged. Indeed, many of these changes
are to accomplish exactly that! Perhaps the largest source of changes
was the violation of the assumption that, when emitting types, the
pointer type didn't matter.
One of the test cases added by the change doesn't optimize very well;
the output of `spirv-opt` here is invalid SPIR-V. I need to file a bug
with SPIRV-Tools about this.
I wanted to test that variable pointers to images worked too, but I
couldn't figure out how to propagate the access qualifier properly--in
MSL, it's part of the type, so getting this right is important. I've
punted on that for now.
Just like OpAccessChain we need to make use of the meta information
available to use from access_chain_internal as we can extract a packed
vector or transposed vector from a composite, not just memory load.
A block name cannot alias with any name in its own scope,
and it cannot alias with any other "global" name.
To solve this, we need to complicate the name cache updates a little bit
where we have a "primary" namespace and "secondary" namespace.
This is required to avoid relying on complex sub-expression elimination
in compilers, and generates cleaner code.
The problem case is if a complex expression is used in an access chain,
like:
Composite comp = buffer[texture(...)];
vec4 a = comp.a + comp.b + comp.c;
Before, we did not have common subexpression tracking for
OpLoad/OpAccessChain, so we easily ended up with code like:
vec4 a = buffer[texture(...)].a + buffer[texture(...)].b + buffer[texture(...)].c;
A good compiler will optimize this, but we should not rely on it, and
forcing texture(...) to a temporary also looks better.
The solution is to add a vector "implied_expression_reads", which works
similarly to expression_dependencies. We also need an extra mechanism in
to_expression which lets us skip expression read checking and do it
later. E.g. for expr -> access chain -> load, we should only trigger
a read of expr when using the loaded expression.
Avoids certain cases of variance between translation units by forcing
every dependent expression of a store to be temporary.
Should avoid the major failure cases where invariance matters.
Don't use `addsat()`/`subsat()`; that'll erroneously flag cases where
the sum is exactly the maximum integer value, or the difference is
exactly 0. Also, correct the condition for the `select()` function; it's
basically `mix()` with a boolean factor.
(What was I *thinking*?)