Now we added block.cases_32bit as requested and we only parse if the
remaining ops are a multiple of 2. None of them are mutable because we
return a reference of them depending of the op.condition width.
Signed-off-by: Sebastián Aedo <saedo@codeweavers.com>
We don't need to keep track of the type itself, only its width since the
type check of the OpSwitch can be done at runtime. This also avoids
creating a dangling reference.
Signed-off-by: Sebastián Aedo <saedo@codeweavers.com>
Moving out the logic from the parser as requested because it's sensitive
to try to keep the parsing the most simple process as said.
For that, the load_types is now tracked in the ParsedIR, which can be
accessed in the Compiler struct. The switch cases are fixed in the CFG
stage since that's the point where the nullptr is deref.
Signed-off-by: Sebastián Aedo <saedo@codeweavers.com>
Certain shaders where functions have a *ton* of merging control flow
will end up with exponential time complexity to figure out parameter
preservation semantics.
The trivial fix to make it O(1) again is to terminate recursive traversal early if we've seen
the path before. Simple oversight :(
Fairly minor differences, so can keep them side by side without too much
effort. NV support is effectively deprecated now however.
- Add OpConvertUToAccelerationStructureKHR
- Ignore/Terminate ray is now a terminator in KHR, but a call in NV.
- Fix some bugs with reportIntersection.
Subsequent stages can legally attempt to read from these variables,
which causes compilation failure.
Always make sure we emit user outputs in vertex shaders if they are
active in the entry point.
New in MSL 2.3 is a template that can be used in the place of a scalar
type in a stage-in struct. This template has methods which interpolate
the varying at the given points. Curiously, you can't set interpolation
attributes on such a varying; perspective-correctness is encoded in the
type, while interpolation must be done using one of the methods. This
makes using this somewhat awkward from SPIRV-Cross, requiring us to jump
through a bunch of hoops to make this all work.
Using varyings from functions in particular is a pain point, requiring
us to pass the stage-in struct itself around. An alternative is to pass
references to the interpolants; except this will fall over badly with
composite types, which naturally must be flattened. As with
tessellation, dynamic indexing isn't supported with pull-model
interpolation. This is because of the need to reference the original
struct member in order to call one of the pull-model interpolation
methods on it. Also, this is done at the variable level; this means that
if one varying in a struct is used with the pull-model functions, then
the entire struct is emitted as pull-model interpolants.
For some reason, this was not documented in the MSL spec, though there
is a property on `MTLDevice`, `supportsPullModelInterpolation`,
indicating support for this, which *is* documented. This does not appear
to be implemented yet for AMD: it returns `NO` from
`supportsPullModelInterpolation`, and pipelines with shaders using the
templates fail to compile. It *is* implemeted for Intel. It's probably
also implemented for Apple GPUs: on Apple Silicon, OpenGL calls down to
Metal, and it wouldn't be possible to use the interpolation functions
without this implemented in Metal.
Based on my testing, where SPIR-V and GLSL have the offset relative to
the pixel center, in Metal it appears to be relative to the pixel's
upper-left corner, as in HLSL. Therefore, I've added an offset 0.4375,
i.e. one half minus one sixteenth, to all arguments to
`interpolate_at_offset()`.
This also fixes a long-standing bug: if a pull-model interpolation
function is used on a varying, make sure that varying is declared. We
were already doing this only for the AMD pull-model function,
`interpolateAtVertexAMD()`; for reasons which are completely beyond me,
we weren't doing this for the base interpolation functions. I also note
that there are no tests for the interpolation functions for GLSL or
HLSL.
In some cases, we need to get a literal value from a spec constant op.
Mostly relevant when emitting buffers, so implement a 32-bit integer
scalar subset of the evaluator. Can be extended as needed to support
evaluating any specialization constant operation.
When inside a loop, treat any read of outer expressions to happen
multiple times, forcing a temporary of said outer expressions.
This avoids the problem where we can end up relying on loop-invariant code motion to happen in the
compiler when converting optimized shaders.
When loading and storing array types which belong to buffer objects, we
need to treat these values as not being value types. Also, need to
handle array load/store from/to more address space combinations.
We had output dependent on complex_continue being set, but setting that
flag was dependent on unordered_set declaration order. Make it invariant
to ordering and change the implementation so it knows about the new
temporary hoisting for access chains.
Some fallout where internal functions are using stronger types.
Overkill to move everything over to strong types right now, but perhaps
move over to it slowly over time.