The standard `matrix` type in MSL lacked a constructor in the
`threadgroup` AS. This means that it was impossible to declare a
`threadgroup` variable that contains a matrix. This appears to have been
an oversight that was corrected in macOS 13/Xcode 14 beta 4. This
workaround continues to be required, however, for older systems.
To avoid changing interfaces unnecessarily (which shouldn't be a problem
regardless because the old and new types take up the same amount of
storage), only do this for structs if the struct is positively
identified as being used for workgroup storage.
I'm entirely aware this is inconsistent with the way packed matrices are
handled. One of them should be changed to match the other. Not sure
which one.
Fixes 23 CTS tests under `dEQP-VK.memory_model.shared`.
- Assign ulongn physical type to buffer pointers in short arrays
when array stride is larger than pointer size.
- Support GL_EXT_buffer_reference_uvec2 casting
buffer reference pointers to and from uvec2 values.
- When packing structs, include structs inside physical buffers.
- Update mechanism for traversing pointer arrays when calculating type sizes.
- Added unit test shaders.
This commit modifies sprintf calls to fix the following compilation
errors using XCode 14.0 beta on MacOS:
> 'sprintf' is deprecated: This function is provided for compatibility
> reasons only. Due to security concerns inherent in the design of
> sprintf(3), it is highly recommended that you use snprintf(3) instead.
HLSL is very fussy about fallthrough in switch blocks, so promote
Unreachable blocks to breaks if they are inside a switch construct.
Some false positives are possible in weird multi-break scenarios, but
this is benign.
Speculate that we can modify the SSA value in-place. As long as it is
not used after the modify, this is fine.
Also need to make sure we don't attempt to RMW something that is
impossible to modify.
GLSL and RelaxedPrecision are quite different in what they affect.
RelaxedPrecision affects operations, while this is merely implied in
GLSL based on inputs.
This leads to situations where we have to promote mediump inputs to
highp, and the simplest approach is to force highp temporaries for
inputs which are consumed in a highp context. For completeness, we also
demote RelaxedPrecision inputs to mediump variables.
PHI is handled by copying the PHI into a temporary.
We have to be very careful with hoisted temporaries, since the child
temporary will not be analyzed up-front. We inherit the hoisted-ness
state and emit the hoisted child temporary as necessary. When faking the
temporaries with OpCopyObject, we make sure to block any variable
hoisting.
Hoisting children of PHI variables is fine, since PHIs are not hoisted with
the same framework as other temporaries.
This fixes a regression introduced by 93b0dc7, where all volatile variables
were not allowed to be forwarded. This doesn't work well for volatile memory
object variables like images or buffer blocks, because it forces local variables
to be defined, which is unnecessary, and sometimes results in wrong types.
This patch restricts volatile builtin variables (eg. HelperInvocation) from
being forwarded, but allows other volatile variables to be forwarded, as before.
Just like we try to fixup struct names for block types, inner structs
can be "anonymous" structs. HLSL codegen from DXC tends to emit this,
and emitting dummy struct names tends to break GL linkage on some
drivers.
Makes codegen from typical D3D emulation SPIR-V more readable.
Also makes cross compilation with NotEqual more sensible.
It's very rare to actually need the strict NaN-checks in practice.
Also, glslang now emits UnordNotEqual by default it seems, so give up
trying to assume OrdNotEqual. Harmonize for UnordNotEqual as the sane
default.
Clang added -Wunqualified-std-cast-call in
https://reviews.llvm.org/D119670, which warns on unqualified std::move
and std::forward calls. This change qualifies these calls to allow the
project to build on HEAD Clang -Werror.
The patterns where we force temporary due to invalid/overused expression -> recompile
should be seen as making forward progress, and there are very rare scenarios where
these recompiles can cascade into many loops.
Refactor this style of logic into a new function which is equivalent to handle_invalid_expression().
Need this to be context sensitive, since array of block-like struct is
template, but struct of block-like array is C-style.
Also, test a mix and match, so we have constant array of block-like
struct with array inside. :v
Introduces an idea of a recompilation making forward progress.
There are some extreme edge cases where we need more than 3 loops, but
only allow this in specific circumstances where we can reason about
forward progress being made.
In cases where we know the size of the vector and the index at compile
time, we can check if it's accessing in bounds and rely in undefined
behavior otherwise.
Signed-off-by: Sebastián Aedo <saedo@codeweavers.com>