This only affects the builtin when it is used, and not when it's passed
to a function. It's a lot cleaner than the way I was doing it before.
Remove the `to_expression()` hack.
In SPIR-V, builtin integral vectors can be either signed or unsigned,
but in MSL they're always unsigned. Unfortunately, the MSL spec forbids
implicit conversions between vector types--even if the corresponding
scalar types would implicitly convert. If you try, the result is a
cryptic error message such as:
```
program_source:37:60: error: cannot convert between vector values of different size ('int4' (aka 'vector_int4') and 'vector_uint4' (vector of 4 'unsigned int' values))
float4 r3 = as_type<float4>((as_type<int4>(r0) * gl_LocalInvocationID.xyyy) + as_type<int4>(r2));
~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~
```
Therefore, uses of these builtins must be explicitly cast, since the
rest of the binary likely assumes that the builtin is of its declared
type.
Two varyings (vertex outputs/fragment inputs) might have the same
location but be in different components--e.g. the compiler may have
packed what were two different varyings into a single varying vector.
Giving both varyings the same `[[user]]` attribute won't work--it may
yield unexpected results, or flat out fail to link. We could eventually
pack such varyings into a single vector, but that would require us to
handle the case where the varyings are different types--e.g. a `float`
and a `uint` packed into the same vector. For now, it seems most
prudent to give them unique `[[user]]` locations and let Apple's
compiler work out the best way to pack them.
This roughly matches their semantics in SPIR-V and MSL. For `FMin`,
`FMax`, and `FClamp`, and the Metal functions `fast::min()`,
`fast::max()`, and `fast::clamp()`, the result is undefined if any
operand is NaN. For the 'N' operations and their corresponding MSL
`precise::` functions, the result is consistent with IEEE 754 (first
non-NaN wins; result is NaN if all operands are NaN).
We can only do this with 32-bit floats, though, because Metal only
provides these variants for `float`. `half` only has one variant of
these functions that is presumably consistent with IEEE 754. I guess
that's OK; the SPIR-V spec only says that `F{Min,Max,Clamp}` are
undefined for NaNs. Performance might suffer, though.
The SPIR-V spec says that these check if the operands either are
unordered or satisfy the given condition. So that's just what we'll do,
using Metal's `isunordered()` stdlib function. Apple's optimizers ought
to be able to collapse that to a single unordered compare.
When the name of an alias global variable collides with a global
declaration, MSL would emit inconsistent names, sometimes with the
naming fix, sometimes without, because names were being tracked in two
separate meta blocks. Fix this by always redirecting parameter naming to
the original base variable as necessary.
MSL would force thread const& which would not work if the input argument
came from a different storage class.
Emit proper non-reference arguments for such values.
Support flattening StorageOutput & StorageInput matrices and arrays.
No longer move matrix & array inputs to separate buffer.
Add separate SPIRFunction::fixup_statements_in & SPIRFunction::fixup_statements_out
instead of just SPIRFunction::fixup_statements.
Emit SPIRFunction::fixup_statements at beginning of functions.
CompilerMSL track vars_needing_early_declaration.
Pass global output variables as variables to functions that access them.
Sort input structs by location, same as output structs.
Emit struct declarations in order output, input, uniforms.
Regenerate reference shaders to new formats defined by above.
Update SPIRV-Tools/glslang commits.
Use vulkan1.1 environment for testing.
Found new "errors" in SPIRV-Tools, so disable validation on those shaders
for now.
Certain patterns with OpVectorShuffle (and probably others) will cascade
to so large, that they can cause OOM. After we have observed
force_recompile, don't spend unnecessary memory emitting code which will
never be used.
Normally, temporary declaration must dominate any use of it,
so we generally did not need to analyze the CFG for these variables,
but there is an edge case where you have an inliner doing:
do {
create_temporary;
break;
} while(0);
use_temporary;
The inside of the loop dominates the outer scope, but we cannot emit
code like this in GLSL, so make sure we hoist these temporaries outside
the "loop".