Commit Graph

330 Commits

Author SHA1 Message Date
Hans-Kristian Arntzen
99ae0d32e9 MSL: Handle array with component when we cannot rely on user() attrib.
In these cases, we emit one variable per location, and so we must
flatten stuff.
2021-05-21 13:46:33 +02:00
Hans-Kristian Arntzen
e47a30e807 Honor NoContraction qualifier.
We'll need to force a temporary and mark it as precise.
MSL is a little weird here, but we can piggyback on top of the invariant
float math option here to force fma() operations everywhere.
2021-05-07 12:59:47 +02:00
Hans-Kristian Arntzen
82a77e534e MSL: Use proper array for quad tess levels.
We need to handle loads from array as well, so the float4 hack doesn't
work.
2021-04-23 14:12:00 +02:00
Hans-Kristian Arntzen
ae9ca7d73c MSL: Fix copy of arrays to/from stage IO variables.
Need to take into account effective storage classes and whether or not
we target stage IO blocks since native arrays are conditionally enabled.
2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
682a227f4b MSL: Make builtin argument type declaration context sensitive.
Sometimes we'll need array template, sometimes not 🤷.
2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
5826298697 MSL: Handle CullDistance better. 2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
23da445bd4 MSL: Emit multiple threadgroup slices for multi-patch.
Multiple patches can run in the same workgroup when using multi-patch
mode, so we need to allocate enough storage to avoid false sharing.
2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
e32c474911 MSL: Handle masking of TESC IO block members. 2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
40f628f49c MSL: Add test for complex control point outputs. 2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
46c48ee6b5 MSL: Rewrite how IO blocks are emitted in multi-patch mode.
Firstly, never flatten inputs or outputs in multi-patch mode.
The main scenario where we do need to care is Block IO.
In this case, we should only flatten the top-level member, and after
that we use access chains as normal.

Using structs in Input storage class is now possible as well. We don't
need to consider per-location fixups at all here. In Vulkan, IO structs
must match exactly. Only plain vectors can have smaller vector sizes as
a special case.
2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
ff3f5bcba5 MSL: Handle masking of builtin control points. 2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
ae7bb41ef4 MSL: Test that we can mask location writes in TESC. 2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
ba93b6518d MSL: Fix masking of vertex block outputs. 2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
857295a9ab MSL: Add tests for masking with --for-tess. 2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
43b6ea2c9a MSL: Remove position mask tests. They will fail compilation. 2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
50a6bc058a MSL: Force builtin arrays for builtin array types.
Handles argument_decl() correctly.
2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
88b54f5dab MSL: Add tests for vertex output masking. 2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
0ad12a0036 MSL: Always return [[position]] when required. 2021-02-15 12:57:37 +01:00
Hans-Kristian Arntzen
5d82d32e0f Roll dependencies. 2021-01-08 10:41:51 +01:00
Chip Davis
fd738e3387 MSL: Adjust FragCoord for sample-rate shading.
In Metal, the `[[position]]` input to a fragment shader remains at
fragment center, even at sample rate, like OpenGL and Direct3D. In
Vulkan, however, when the fragment shader runs at sample rate, the
`FragCoord` builtin moves to the sample position in the framebuffer,
instead of the fragment center. To account for this difference, adjust
the `FragCoord`, if present, by the sample position. The -0.5 offset is
because the fragment center is at (0.5, 0.5).

Also, add an option to force sample-rate shading in a fragment shader.
Since Metal has no explicit control for this, this is done by adding a
dummy `[[sample_id]]` which is otherwise unused, if none is already
present. This is intended to be used from e.g. MoltenVK when a
pipeline's `minSampleShading` value is nonzero.

Instead of checking if any `Input` variables have `Sample`
interpolation, I've elected to check that the `SampleRateShading`
capability is present. Since `SampleId`, `SamplePosition`, and the
`Sample` interpolation decoration require this cap, this should be
equivalent for any valid SPIR-V module. If this isn't acceptable, let me
know.
2020-11-23 10:30:24 -06:00
Jan Sikorski
f0239bce05 MSL: extract global variables from subgroup ballot operations
Fixes #1513.
2020-11-09 11:23:01 +01:00
Chip Davis
aca9b6879a MSL: Support pull-model interpolation on MSL 2.3+.
New in MSL 2.3 is a template that can be used in the place of a scalar
type in a stage-in struct. This template has methods which interpolate
the varying at the given points. Curiously, you can't set interpolation
attributes on such a varying; perspective-correctness is encoded in the
type, while interpolation must be done using one of the methods. This
makes using this somewhat awkward from SPIRV-Cross, requiring us to jump
through a bunch of hoops to make this all work.

Using varyings from functions in particular is a pain point, requiring
us to pass the stage-in struct itself around. An alternative is to pass
references to the interpolants; except this will fall over badly with
composite types, which naturally must be flattened.  As with
tessellation, dynamic indexing isn't supported with pull-model
interpolation. This is because of the need to reference the original
struct member in order to call one of the pull-model interpolation
methods on it. Also, this is done at the variable level; this means that
if one varying in a struct is used with the pull-model functions, then
the entire struct is emitted as pull-model interpolants.

For some reason, this was not documented in the MSL spec, though there
is a property on `MTLDevice`, `supportsPullModelInterpolation`,
indicating support for this, which *is* documented. This does not appear
to be implemented yet for AMD: it returns `NO` from
`supportsPullModelInterpolation`, and pipelines with shaders using the
templates fail to compile. It *is* implemeted for Intel. It's probably
also implemented for Apple GPUs: on Apple Silicon, OpenGL calls down to
Metal, and it wouldn't be possible to use the interpolation functions
without this implemented in Metal.

Based on my testing, where SPIR-V and GLSL have the offset relative to
the pixel center, in Metal it appears to be relative to the pixel's
upper-left corner, as in HLSL. Therefore, I've added an offset 0.4375,
i.e. one half minus one sixteenth, to all arguments to
`interpolate_at_offset()`.

This also fixes a long-standing bug: if a pull-model interpolation
function is used on a varying, make sure that varying is declared. We
were already doing this only for the AMD pull-model function,
`interpolateAtVertexAMD()`; for reasons which are completely beyond me,
we weren't doing this for the base interpolation functions. I also note
that there are no tests for the interpolation functions for GLSL or
HLSL.
2020-11-05 11:57:45 -06:00
Chip Davis
547c29f7bb MSL: Allow Bias and Grad arguments with comparison on Mac in MSL 2.3.
I kept the code to replace constant zero arguments, because `Bias` and
`Grad` still have some problems on desktop GPUs.

`Bias` works on AMD GPUs. `Grad` does not. Both work on Intel. Still
needs testing on NV. It will definitely work with Apple GPUs.
2020-10-30 11:14:59 -05:00
Chip Davis
d48d2a95c7 MSL: Allow post-depth coverage on Mac in MSL 2.3.
It's still only supported on Apple GPUs, but Macs will have those soon.
2020-10-27 22:07:01 -05:00
Chip Davis
064ed448b9 MSL: Don't remove periods from swizzle buffer index exprs. 2020-10-20 17:47:40 -05:00
Chip Davis
5845e009ea MSL: Handle Offset and Grad operands for 1D-as-2D textures. 2020-10-15 12:51:00 -05:00
Chip Davis
3e6010d8c5 MSL: Don't use a bitcast for tessellation levels in tesc shaders.
`half` cannot be bitcasted to `float`, because the two types are not the
same size. Use an expanding cast instead.

We were already doing this for stores to the tessellation levels; why I
didn't also do this for loads is beyond me.
2020-10-14 18:35:59 -05:00
Chip Davis
7a5d0d6b29 MSL: Add missing interlock handling to atomic image buffers. 2020-10-13 11:44:17 -05:00
Hans-Kristian Arntzen
fab6ad234e
Merge pull request #1486 from cdavis5e/atomic-image-argument-buffer
MSL: Support atomic access to images from argument buffers.
2020-10-13 12:55:43 +02:00
Chip Davis
9cafea6cf8 MSL: Support atomic access to images from argument buffers.
This was not added when Epic contributed atomic image support.

Fixes #1484.
2020-10-13 02:37:18 -05:00
Chip Davis
2219c4a392 MSL: Support SPV_EXT_demote_to_helper_invocation for MSL 2.3.
MSL 2.3 has everything needed to support this extension on all
platforms. The existing `discard_fragment()` function was given demote
semantics, similar to Direct3D, and the `simd_is_helper_thread()`
function was finally added to iOS.

I've left the old test alone. Should I remove it in favor of these?
2020-10-13 00:25:32 -05:00
Chip Davis
4cf840ee7b MSL: Support layered input attachments.
These need to use arrayed texture types, or Metal will complain when
binding the resource. The target layer is addressed relative to the
Layer output by the vertex pipeline, or to the ViewIndex if in a
multiview pipeline. Unlike with the s/t coordinates, Vulkan does not
forbid non-zero layer coordinates here, though this cannot be expressed
in Vulkan GLSL.

Supporting 3D textures will require additional work. Part of the problem
is that Metal does not allow texture views to subset a 3D texture, so we
need some way to pass the base depth to the shader.
2020-09-02 09:18:25 -05:00
Chip Davis
cab7335e64 MSL: Don't set the layer for multiview if the device doesn't support it.
Some older iOS devices don't support layered rendering. In that case,
don't set `[[render_target_array_index]]`, because the compiler will
reject the shader in that case. The client will then have to unroll the
render pass manually.
2020-09-01 19:30:28 -05:00
Le Hoang Quyen
ab8eb70af1 Fix #1445: MSL: Enclose args when convert distance(a,b) to abs(a-b) 2020-08-13 21:16:08 +08:00
Chip Davis
3347b1076d MSL: Fix handling of matrices and structs in the output control point array.
Prior to this point, we were treating them as flattened, as they are in
old-style tessellation control shaders, and still are for structs in
new-style shaders. This is not true for outputs; output composites are
not flattened at all. This semantic mismatch broke a Vulkan CTS test.
It should now pass.
2020-08-03 17:18:18 -05:00
Tomek Ponitka
18f23c47d9 Enabling setting a fixed sampleMask in Metal fragment shaders.
In Metal render pipelines don't have an option to set a sampleMask
parameter, the only way to get that functionality is to set the
sample_mask output of the fragment shader to this value directly.
We also need to take care to combine the fixed sample mask with the
one that the shader might possibly output.
2020-07-24 11:19:46 +02:00
Chip Davis
688c5fcbda MSL: Add support for processing more than one patch per workgroup.
This should hopefully reduce underutilization of the GPU, especially on
GPUs where the thread execution width is greater than the number of
control points.

This also simplifies initialization by reading the buffer directly
instead of using Metal's vertex-attribute-in-compute support. It turns
out the only way in which shader stages are allowed to differ in their
interfaces is in the number of components per vector; the base type must
be the same. Since we are using the raw buffer instead of attributes, we
can now also emit arrays and matrices directly into the buffer, instead
of flattening them and then unpacking them. Structs are still flattened,
however; this is due to the need to handle vectors with fewer components
than were output, and I think handling this while also directly emitting
structs could get ugly.

Another advantage of this scheme is that the extra invocations needed to
read the attributes when there were more input than output points are
now no more. The number of threads per workgroup is now lcm(SIMD-size,
output control points). This should ensure we always process a whole
number of patches per workgroup.

To avoid complexity handling indices in the tessellation control shader,
I've also changed the way vertex shaders for tessellation are handled.
They are now compute kernels using Metal's support for vertex-style
stage input. This lets us always emit vertices into the buffer in order
of vertex shader execution. Now we no longer have to deal with indexing
in the tessellation control shader. This also fixes a long-standing
issue where if an index were greater than the number of vertices to
draw, the vertex shader would wind up writing outside the buffer, and
the vertex would be lost.

This is a breaking change, and I know SPIRV-Cross has other clients, so
I've hidden this behind an option for now. In the future, I want to
remove this option and make it the default.
2020-07-23 17:59:54 -05:00
Chip Davis
5281d9997e MSL: Fix up input variables' vector lengths in all stages.
Metal is picky about interface matching. If the types don't match
exactly, down to the number of vector components, Metal fails pipline
compilation. To support pipelines where the number of components
consumed by the fragment shader is less than that produced by the vertex
shader, we have to fix up the fragment shader to accept all the
components produced.
2020-06-16 14:50:30 -05:00
Le Hoang Quyen
9ddfe6db6d Fix #1359: MSL: If the packed type is scalar, don't emit "pack_" prefix.
Scalar type is already packed in metal.
2020-05-06 00:43:34 +08:00
Hans-Kristian Arntzen
ebf463674d MSL: Allow removing clip distance user varyings.
Only safe if user knows that subsequent shader stage will not read clip
distance.
2020-04-20 09:58:40 +02:00
Chip Davis
96f7008aa8 MSL: Force disabled fragment builtins to have the right name.
DXVK emits SPIR-V where fragment shader builtins have names derived from
DXBC assembly, e.g. `oDepth` for `FragDepth`. When we declared the
disabled output, we used this name, but when referencing it, we
continued to use the GLSL name. This breaks compilation.
2020-04-15 19:25:18 -05:00
Chip Davis
495e48de44 MSL: Only disable output variables in fragment shaders.
Forgot to do this in #1319.

Fixes #1322.
2020-04-15 12:14:57 -05:00
Chip Davis
b29f83c383 MSL: Add options to control emission of fragment outputs.
Like with `point_size` when not rendering points, Metal complains when
writing to a variable using the `[[depth]]` qualifier when no depth
buffer be attached. In that case, we must avoid emitting `FragDepth`,
just like with `PointSize`.

I assume it will also complain if there be no stencil attachment and the
shader write to `[[stencil]]`, or it write to `[[color(n)]]` but there
be no color attachment at n.
2020-04-13 15:29:11 -05:00
Hans-Kristian Arntzen
d91e134500 MSL: Add native array test for composite array initialization. 2020-02-24 13:34:51 +01:00
Hans-Kristian Arntzen
20b28f72fa MSL: Reinstate workaround for returning arrays. 2020-02-24 13:04:10 +01:00
Hans-Kristian Arntzen
c9d4f9cd74 MSL: Add a workaround path to force native arrays for everything. 2020-02-24 12:47:14 +01:00
Chip Davis
fedbc35315 MSL: Support inline uniform blocks in argument buffers.
Here, the inline uniform block is explicit: we instantiate the buffer
block itself in the argument buffer, instead of a pointer to the buffer.
I just hope this will work with the `MTLArgumentDescriptor` API...

Note that Metal recursively assigns individual members of embedded
structs IDs. This means for automatic assignment that we have to
calculate the binding stride for a given buffer block. For MoltenVK,
we'll simply increment the ID by the size of the inline uniform block.
Then the later IDs will never conflict with the inline uniform block. We
can get away with this because Metal doesn't require that IDs be
contiguous, only monotonically increasing.
2020-01-24 18:51:24 -06:00
Hans-Kristian Arntzen
a3fe9756d2 MSL: Support ClipDistance as an input stage variable.
MSL does not support this, so we have to emulate it by passing it around
as a varying between stages. We use a special "user(clipN)" attribute
for this rather than locN which is used for user varyings.
2019-12-02 13:19:42 +01:00
Hans-Kristian Arntzen
b85ab5f5ff MSL: Fix automatic binding allocation for image atomic buffers.
The Primary decoration was used by the atomic buffer, causing the
texture binding to be potentially overlapping with other resources.
2019-11-28 11:07:44 +01:00
Dan Sinclair
d409210ee5 Move all .invalid shaders into no-opt folders. 2019-11-05 13:19:19 -05:00