Commit Graph

15 Commits

Author SHA1 Message Date
Chip Davis
688c5fcbda MSL: Add support for processing more than one patch per workgroup.
This should hopefully reduce underutilization of the GPU, especially on
GPUs where the thread execution width is greater than the number of
control points.

This also simplifies initialization by reading the buffer directly
instead of using Metal's vertex-attribute-in-compute support. It turns
out the only way in which shader stages are allowed to differ in their
interfaces is in the number of components per vector; the base type must
be the same. Since we are using the raw buffer instead of attributes, we
can now also emit arrays and matrices directly into the buffer, instead
of flattening them and then unpacking them. Structs are still flattened,
however; this is due to the need to handle vectors with fewer components
than were output, and I think handling this while also directly emitting
structs could get ugly.

Another advantage of this scheme is that the extra invocations needed to
read the attributes when there were more input than output points are
now no more. The number of threads per workgroup is now lcm(SIMD-size,
output control points). This should ensure we always process a whole
number of patches per workgroup.

To avoid complexity handling indices in the tessellation control shader,
I've also changed the way vertex shaders for tessellation are handled.
They are now compute kernels using Metal's support for vertex-style
stage input. This lets us always emit vertices into the buffer in order
of vertex shader execution. Now we no longer have to deal with indexing
in the tessellation control shader. This also fixes a long-standing
issue where if an index were greater than the number of vertices to
draw, the vertex shader would wind up writing outside the buffer, and
the vertex would be lost.

This is a breaking change, and I know SPIRV-Cross has other clients, so
I've hidden this behind an option for now. In the future, I want to
remove this option and make it the default.
2020-07-23 17:59:54 -05:00
Dan Sinclair
d409210ee5 Move all .invalid shaders into no-opt folders. 2019-11-05 13:19:19 -05:00
Dan Sinclair
e5af41255c Only run spirv-opt if the spirv is valid.
This CL updates the test runner to only run spirv-opt if the generated
SPIR-V is valid. If validation is skipped it's possible to hit aborts
and other memory errors in the optimizer as it assumes the SPIR-V is
valid.
2019-11-05 11:00:49 -05:00
Hans-Kristian Arntzen
fa011f8547 MSL: Declare arrays with proper type wrapper.
Need to construct with value type spvUnsafeArray<T, N>({ elem0, elem1 })
to make array initialization work in complex scenarios.
2019-10-26 17:57:34 +02:00
Lukas Hermanns
c236ca4572 Moved all UE4 test shaders into 'shaders-ue4/' folder. 2019-10-23 17:39:05 -04:00
Lukas Hermanns
7ad0a84778 Updates for pull request #1162 2019-09-24 14:35:25 -04:00
Lukas Hermanns
cb3ecb9e1b Updated reference Metal shaders. 2019-09-17 15:11:19 -04:00
Lukas Hermanns
0be20cd933 Renamed new test shaders to fit the naming convention in SPIRV-Cross. 2019-09-16 10:33:45 -04:00
Mark Satterthwaite
564cb3c08d Update the Metal shaders to account for changes in the shader compilation. 2019-09-11 15:06:05 -04:00
Thomas Roughton
91b2f34a3d Update tests to account for all non-entry-point functions being inlined 2019-08-30 09:39:06 +12:00
Hans-Kristian Arntzen
9436cd3036 MSL: Deal with array copies from and to threadgroup. 2019-08-27 13:18:01 +02:00
Michael Barriault
d6754c5713 Fix tests for device->constant address space change in MSL tessellation control shader generation. 2019-04-10 18:37:04 +01:00
Chip Davis
a43dcd7b99 MSL: Return early from helper tesc invocations.
Return after loading the input control point array if there are more
input points than output points, and this was one of the helper
invocations spun off to load the input points. I was hesitant to do this
initially, since the MSL spec has this to say about barriers:

> The `threadgroup_barrier` (or `simdgroup_barrier`) function must be
> encountered by all threads in a threadgroup (or SIMD-group) executing
> the kernel.

That is, if any thread executes the barrier, then all threads must
execute it, or the barrier'd invocations will hang. But, the key words
here seem to be "executing the kernel;" inactive invocations, those that
have already returned, need not encounter the barrier to prevent hangs.
Indeed, I've encountered no problems from doing this, at least on my
hardware. This also fixes a few CTS tests that were failing due to
execution ordering; apparently, my assumption that the later, invalid
data written by the helpers would get overwritten was wrong.
2019-02-24 12:17:47 -06:00
Chip Davis
8095434dc4 MSL: Drop stores to nonexistent tess levels.
In SPIR-V, there are always two inner levels and four outer levels, even
if the input patch isn't a quad patch. But in MSL, due to requirements
imposed by Metal, only one inner level and three outer levels exist when
the input patch is a triangle patch. We must explicitly ignore any write
to the nonexistent second inner and fourth outer levels in this case.
2019-02-20 09:11:24 -06:00
Chip Davis
eb89c3a428 MSL: Add support for tessellation control shaders.
These are transpiled to kernel functions that write the output of the
shader to three buffers: one for per-vertex varyings, one for per-patch
varyings, and one for the tessellation levels. This structure is
mandated by the way Metal works, where the tessellation factors are
supplied to the draw method in their own buffer, while the per-patch and
per-vertex varyings are supplied as though they were vertex attributes;
since they have different step rates, they must be in separate buffers.

The kernel is expected to be run in a workgroup whose size is the
greater of the number of input or output control points. It uses Metal's
support for vertex-style stage input to a compute shader to get the
input values; therefore, at least one instance must run per input point.
Meanwhile, Vulkan mandates that it run at least once per output point.
Overrunning the output array is a concern, but any values written should
either be discarded or overwritten by subsequent patches. I'm probably
going to put some slop space in the buffer when I integrate this into
MoltenVK to be on the safe side.
2019-02-07 08:51:22 -06:00