The check_discard code was too annoying to deal with,
and there is no requirement to return anything meaningful.
The Vulkan spec states that values returned by atomic instructions are undefined.
To date, all released Apple Silicon GPUs incorrectly interpret the
gradient vectors when sampling a cube texture. Specifically, they ignore
one of the three partial derivatives in each gradient depending on the
selected major axis, and they expect the remaining derivatives to be
partially transformed.
h/t @lexaknyazev for the code used in the `spvGradientCube()` function.
Fixes 8 tests under `dEQP-VK.glsl.texture_functions.texturegrad.*`.
Argument buffers can contain multiple runtime arrays if they have fixed
lengths as specified by the binding API. The regression had assumed that each
runtime array is in a separate argument buffer with an undefined array length.
- Add CompilerMSL::is_var_runtime_size_array() to also test whether the
array length was set via CompilerMSL::add_msl_resource_binding().
- Fix an MSL compile syntax failure in a test case when an acceleration
structure is the first entry point function argument (unrelated).
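The runtime-array check described above can be sketched roughly as follows. This is a minimal, standalone illustration, not the SPIRV-Cross implementation: `ShaderArray`, `BindingCounts`, and `app_array_size()` are hypothetical stand-ins, and only the name `is_var_runtime_size_array()` comes from the change itself.

```cpp
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical stand-in for a shader array variable (not a SPIRV-Cross type).
struct ShaderArray
{
	uint32_t desc_set;
	uint32_t binding;
	bool runtime_sized; // declared with an open length in the shader
};

// Hypothetical table: (set, binding) -> descriptor count supplied by the app.
// A missing entry or a count of 0 means the app did not fix the length.
using BindingCounts = std::map<std::pair<uint32_t, uint32_t>, uint32_t>;

// Descriptor count the app supplied for this binding, or 0 if none.
inline uint32_t app_array_size(const ShaderArray &arr, const BindingCounts &counts)
{
	auto it = counts.find(std::make_pair(arr.desc_set, arr.binding));
	return it == counts.end() ? 0u : it->second;
}

// Treat the array as runtime-sized only if the shader leaves the length open
// AND the binding API did not provide a fixed length for it.
inline bool is_var_runtime_size_array(const ShaderArray &arr, const BindingCounts &counts)
{
	return arr.runtime_sized && app_array_size(arr, counts) == 0;
}
```

With a check of this shape, a runtime array whose length the app has fixed can share an argument buffer with other such arrays.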
Metal 3.1 introduced a regression that causes an infinite recursion
crash during Metal's analysis of an entry point input structure that itself
contains internal recursion. This patch works around this by replacing the
recursive input declaration with an alternate variable of type void*, and
then casting to the correct type at the top of the entry point function.
- Add CompilerMSL::Options::replace_recursive_inputs to enable
replacing recursive input.
- Add Compiler::type_contains_recursion() to determine if a struct
contains internal recursion, and add custom Decorations to mark
such structs, to short-cut future similar checks.
- Replace recursive input struct declarations with void*,
and emit a recast to correct type at top of entry function.
- Add unit test.
- In Compiler::type_is_top_level_block(), remove the hardcoded reference to the
spirv_cross namespace, as it interferes with configurable namespaces (unrelated).
- When determining need for arg buffer padding, use the descriptor count provided
by the app, rather than the shader, to determine the number of slots consumed,
as the shader may only be accessing part, or even one element, of the array.
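The recursion check and its short-cut can be sketched as a cached depth-first search. This is a simplified illustration under stated assumptions: `TypeGraph` and `RecursionChecker` are hypothetical, the cache stands in for the custom decorations mentioned above, and a type counts as recursive here if any type cycle is reachable through its members.

```cpp
#include <cstdint>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified type graph: each struct type ID maps to the struct
// type IDs reachable through its members (pointers already followed).
using TypeGraph = std::unordered_map<uint32_t, std::vector<uint32_t>>;

class RecursionChecker
{
public:
	explicit RecursionChecker(const TypeGraph &graph)
	    : graph_(graph)
	{
	}

	// True if a type cycle is reachable from type_id through its members.
	bool type_contains_recursion(uint32_t type_id)
	{
		std::unordered_set<uint32_t> on_path;
		return visit(type_id, on_path);
	}

private:
	bool visit(uint32_t id, std::unordered_set<uint32_t> &on_path)
	{
		// Reaching a type that is already on the current path closes a cycle.
		if (on_path.count(id))
			return true;

		// Short-cut: reuse an earlier answer (plays the role of a decoration).
		auto cached = cache_.find(id);
		if (cached != cache_.end())
			return cached->second;

		on_path.insert(id);
		bool recursive = false;
		auto it = graph_.find(id);
		if (it != graph_.end())
		{
			for (uint32_t member : it->second)
			{
				if (visit(member, on_path))
				{
					recursive = true;
					break;
				}
			}
		}
		on_path.erase(id);

		cache_[id] = recursive;
		return recursive;
	}

	const TypeGraph &graph_;
	std::unordered_map<uint32_t, bool> cache_;
};
```

An input struct for which this returns true would be the candidate for the void* replacement and the recast at the top of the entry function.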
Normally, I wouldn't have bothered with this, given that we already
support the Vulkan 1.1 subgroup functionality, but a client asked for
the legacy extensions.
Previously, if a constant without DecorationSpecId occurred before the constant with DecorationSpecId 0, it would be inserted into the unique_func_constants map with key 0 (the default return value from get_decoration). This prevented us from ever emitting the declaration with [[function_constant(0)]], which produced some bizarre MSL compilation errors.
Instead, we now only insert into unique_func_constants when we know we're going to emit a [[function_constant(…)]] line.
This regressed in 41007cdc7d.
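A minimal sketch of the corrected insertion rule, assuming a simplified model: `SpecConstant` and `collect_unique_func_constants()` are hypothetical names, and the explicit `has_spec_id` flag stands in for checking the decoration before reading its value.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of a constant; has_spec_id is false when the constant
// carries no SpecId decoration (spec_id is then just a default of 0).
struct SpecConstant
{
	std::string name;
	bool has_spec_id;
	uint32_t spec_id;
};

// Map each function-constant ID to the first constant that will actually be
// emitted with a [[function_constant(id)]] attribute.
inline std::map<uint32_t, std::string> collect_unique_func_constants(const std::vector<SpecConstant> &constants)
{
	std::map<uint32_t, std::string> unique;
	for (const auto &c : constants)
	{
		// Skip undecorated constants so that their default ID of 0 cannot
		// claim the slot that belongs to the real SpecId 0 constant.
		if (!c.has_spec_id)
			continue;
		unique.emplace(c.spec_id, c.name); // emplace keeps the first claimant
	}
	return unique;
}
```

In this model, an undecorated constant listed before the SpecId 0 constant no longer takes key 0.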
Handling native array types is not really feasible since we need to fuse
the variable declaration with the type declaration.
This is feasible in something like variable_decl, but for plain SSA
pointers, this breaks down.
If we have emitted block IO lowering at the end of a vertex shader, we
will end up using the wrong name. Forcing a v_ prefix does not solve any
actual problems since the identifier already has to be valid.
It is possible in SPIR-V to declare multiple specialization constants
with the same constant ID. The most common cause of this in GLSL is
defining a spec constant, then declaring the workgroup size to use that
spec constant by its ID. But, MSL forbids defining multiple function
constants with the same function constant ID. So, we must only emit one
definition of the actual function constant (with the
`[[function_constant(id)]]` attribute); but we can point the other
variables at this one definition.
Fixes three tests in the Vulkan CTS under
`dEQP-VK.compute.basic.max_local_size_*`.
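The "one definition, other variables point at it" rule can be sketched as below. This is a simplified illustration: `SpecConstant` and `emit_spec_constants()` are hypothetical, the type is fixed to `uint`, and the emitted MSL omits details such as default values that a real backend would need.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical model: a specialization constant with its constant ID.
struct SpecConstant
{
	std::string name;
	uint32_t spec_id;
};

// Emit one MSL declaration per constant. Only the first constant seen for a
// given ID gets the [[function_constant(id)]] attribute; later constants with
// the same ID are declared as aliases of that first definition.
inline std::vector<std::string> emit_spec_constants(const std::vector<SpecConstant> &constants)
{
	std::map<uint32_t, std::string> first_with_id;
	std::vector<std::string> lines;
	for (const auto &c : constants)
	{
		auto it = first_with_id.find(c.spec_id);
		if (it == first_with_id.end())
		{
			first_with_id.emplace(c.spec_id, c.name);
			lines.push_back("constant uint " + c.name + " [[function_constant(" +
			                std::to_string(c.spec_id) + ")]];");
		}
		else
		{
			lines.push_back("constant uint " + c.name + " = " + it->second + ";");
		}
	}
	return lines;
}
```

For two constants `a` and `b` that share ID 10, this yields an attributed declaration for `a` and `constant uint b = a;` for `b`.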
Some Metal devices have a bug with depth array textures using comparison
with explicit LoD, where the LoD given will be biased by some amount.
For these devices, we can use a gradient instead, which does not exhibit
this problem. As with the fragment demote workaround, this is only
expected to be needed until the bug is fixed in Metal.
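One way a gradient equivalent to an explicit LoD could be derived is sketched below. It assumes the usual isotropic LoD rule (LoD is log2 of the longest gradient measured in texels) and passes the same gradient for both derivatives; `Vec2`, `lod_to_gradient()`, and `implied_lod()` are hypothetical helpers, and this is not confirmed to be the exact expression the workaround emits.

```cpp
#include <cmath>

struct Vec2
{
	float x;
	float y;
};

// Gradient in normalized texture coordinates that, when used for both the
// x and y derivatives, should select the requested LoD. The 0.5 offset
// cancels the sqrt(2) length of a (s, s) texel-space vector.
inline Vec2 lod_to_gradient(float lod, float width, float height)
{
	float scale = std::exp2(lod - 0.5f);
	return { scale / width, scale / height };
}

// LoD implied by a gradient under the assumed rule: scale to texel space,
// take the vector length, then log2.
inline float implied_lod(Vec2 grad, float width, float height)
{
	float tx = grad.x * width;
	float ty = grad.y * height;
	return std::log2(std::sqrt(tx * tx + ty * ty));
}
```

Round-tripping a LoD through `lod_to_gradient()` and `implied_lod()` should return approximately the original value, which is the property the gradient path relies on.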