By default, the matrix would be declared as mediump, causing precision
issues. Need to dispatch to two separate functions since GLSL does not
support overload based on precision.
Undef values may be of struct type and may be used in constants.
Therefore, they must be interleaved with constants and types.
Fixes the rest of the Vulkan CTS test
`dEQP-VK.spirv_assembly.instruction.compute.opundef.undefined_spec_constant_composite`.
(Please excuse the churn in the reference output; it's an inevitable
result of this change.)
Fixes the CTS test
`dEQP-VK.spirv_assembly.instruction.compute.opundef.undefined_constant_composite`
and helps with another,
`dEQP-VK.spirv_assembly.instruction.compute.opundef.undefined_spec_constant_composite`.
Unfortunately, fixing the latter requires another change.
Some Metal devices have a bug where storage resources can still be
written to even if the fragment is discarded. This is obviously a bug in
Metal, but bothering Apple to fix it will only fix it for newer
versions; therefore, a workaround is needed for older versions. I have
made this an option so that, in case the bug is ever fixed, the
workaround can be disabled.
This workaround is simple: if a fragment shader may discard its fragment
and writes to a storage resource, a variable representing the
`HelperInvocation` built-in is created and passed to all functions. The
flag is checked on all resource writes; writes do not occur when
`HelperInvocation` is `true`. This relies on the earlier workaround to
update `HelperInvocation` when the fragment is discarded.
Fixes at least 3 failures in the CTS.
It is possible to pass unsigned integers to `OpSMulExtended`. In that
case, we want to do a signed multiply with sign extension, so make sure
the operands are forced to be interpreted as signed.
This was an oversight on my part when I added these instructions.
Fixes the CTS test
`dEQP-VK.spirv_assembly.instruction.compute.signed_op.uint_smulextended`.
Some Metal devices have a bug where `simd_is_helper_thread()` won't
return true after a fragment has been discarded. We can work around this
by manually setting `gl_HelperInvocation` upon discarding a fragment.
This is fairly unintrusive, so it is enabled by default. I've made it an
option so that, when the bug is fixed, we can disable it.
The array mechanism breaks DXC which needs to observe that all
components have been written.
Uninitialized outputs will be undefined. Resort to simple vector
instead.
This op creates a new composite constant with one element replaced. So,
we reconstruct the `SPIRConstant` for the composite constant, but with
one of the IDs replaced. Constant initializer lists are memoized for
when the result of a `CompositeInsert` is used in another
`CompositeInsert`.
(I wanted to add a test case for GLSL as well, but for two things:
1. `glslang` in Vulkan mode chokes on the first constant array,
insisting that its initializer needs to be a constant. [Bug in
glslang?]
2. The declarations for the buffers used by the shader aren't emitted,
regardless of whether Vulkan mode is enabled.)
Fixes five tests under
`dEQP-VK.spirv_assembly.instruction.*.opspecconstantop.vector_related`.
MSL inherits the behavior of C where arithmetic on small types are
implicitly converted to int. SPIR-V does not have this behavior, so make
sure that arithmetic results are handled correctly.
restrict was supported, but it broke in MSL 3.0. __restrict works on all
versions, so opt for that instead.
Also check for RestrictPointer decoration and refactor to_restrict() to
not take optional parameter to make it more obvious when implied space
character is added.
In tessellation shaders, we call
`add_plain_member_variable_to_interface_block()` on composite types,
since we are using buffers for I/O and can use nested structs/arrays
here. In those cases, we need to make sure the next location is
incremented by the total amount consumed by the entire composite.
Fixes six more tests in the CTS, under
`dEQP-VK.tessellation.user_defined_io.per_vertex_block.*`.
Flattening doesn't play well with dynamic indices. In this case, it's
better to leave it as an array of structs.
(I wanted to do this for named blocks generally. Trouble is, the builtin
`gl_out` block is *also* a named block...)
Fixes six more CTS tests, under
`dEQP-VK.tessellation.user_defined_io.per_patch_block_array.*`.
Using vertex-style stage input is complex, and it doesn't support
nesting of structures or arrays. By using raw buffer input instead, we
get this support "for free," and everything becomes much simpler.
Arguably, this is the way I should've done this in the first place.
Eventually, I'd like to make this the default, and then remove the
option altogether. (And I still need to do that with
`multi_patch_workgroup`...)
Should help fix 66 tests in the Vulkan CTS, under the following trees:
- `dEQP-VK.pipeline.*.interface_matching.*`
- `dEQP-VK.tessellation.user_defined_io.*`
- `dEQP-VK.clipping.user_defined.*`