169 Commits

Author SHA1 Message Date
Hans-Kristian Arntzen
cea934c03f MSL: Test that we can capture cull distance to buffer. 2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
23da445bd4 MSL: Emit multiple threadgroup slices for multi-patch.
Multiple patches can run in the same workgroup when using multi-patch
mode, so we need to allocate enough storage to avoid false sharing.
2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
c9946296dd MSL: Fix initialization of masked threadgroup variables. 2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
fb1f295aaf
Merge pull request #1635 from KhronosGroup/fix-1627
Handle edge cases in OpCopyMemory.
2021-03-09 10:21:35 +01:00
Hans-Kristian Arntzen
4ca06c7278 Handle edge cases in OpCopyMemory.
Implement this by synthesizing an OpLoad/OpStore pair instead.
2021-03-08 14:15:27 +01:00
Hans-Kristian Arntzen
aea6d29aa8 MSL: Add test for logical subgroup arith ops. 2021-03-08 12:57:37 +01:00
Hans-Kristian Arntzen
97796e0609 MSL: Deal with pointer-to-pointer qualifier ordering. 2021-02-26 13:37:14 +01:00
Hans-Kristian Arntzen
621884d709
Merge pull request #1622 from KhronosGroup/fix-1619
MSL: Handle load and store to TessLevel array in TESC.
2021-02-17 20:46:06 +01:00
Hans-Kristian Arntzen
85704f70bc MSL: Handle load and store to TessLevel array in TESC.
More edge cases ... :(
2021-02-17 13:26:08 +01:00
Hans-Kristian Arntzen
ce552f4f91 MSL: Gracefully assign automatic input locations to builtin attributes. 2021-02-17 12:29:19 +01:00
Hans-Kristian Arntzen
daddbd4078 MSL: Fixup type when using tessellation levels in TESC functions.
Need to rewrite array size depending on execution mode.
2021-02-15 13:28:11 +01:00
Hans-Kristian Arntzen
ea02a0c03a Check entry point variables in is_hidden_variables.
Need to be careful not to emit globals we're not supposed to.
2021-01-22 13:53:22 +01:00
Hans-Kristian Arntzen
893a011299 MSL: Fix various bugs with framebuffer fetch on macOS and argument buffers.
Introduce a helper to make it clearer if a resource can be
considered for argument buffers or not.
2021-01-08 10:19:18 +01:00
Hans-Kristian Arntzen
3136e34215 MSL: Always use input_attachment_index for framebuffer fetch binding.
--msl-decoration-binding would end up overriding the input attachment
index to binding which is very unexpected and broken.
2021-01-08 10:17:42 +01:00
Hans-Kristian Arntzen
03ee71e86c Add test for pure initializer gl_FragDepth.
Tests that the builtin is considered active.
2021-01-07 15:32:15 +01:00
Hans-Kristian Arntzen
014b3bc5ea MSL: Make sure initialized output builtins are considered active. 2021-01-07 15:32:13 +01:00
Hans-Kristian Arntzen
a4a9b53b5b MSL: Always enable Outputs in vertex stages.
Subsequent stages can legally attempt to read from these variables,
which causes compilation failure.

Always make sure we emit user outputs in vertex shaders if they are
active in the entry point.
2021-01-07 11:24:47 +01:00
Hans-Kristian Arntzen
fa76d01203 MSL: Only consider builtin variables if they are part of IO interface. 2021-01-07 10:50:29 +01:00
Hans-Kristian Arntzen
df4f8ef8fe MSL: Emit correct initializer for tessellation control points. 2021-01-05 15:16:49 +01:00
Hans-Kristian Arntzen
ad3e1584f9 MSL: Handle initializers for tess levels. 2021-01-05 13:25:50 +01:00
Hans-Kristian Arntzen
a1c784f002 More robust handling of initialized output builtin variables. 2021-01-04 19:12:43 +01:00
Hans-Kristian Arntzen
9a304fe931 Handle output IO block initializers more robustly. 2021-01-04 19:04:10 +01:00
Hans-Kristian Arntzen
ddb3c65648 Handle reserved identifiers for functions.
gl_ identifiers are already handled by fixups, so remove redundant code.
2021-01-04 10:00:12 +01:00
Hans-Kristian Arntzen
c4ff129fe3 MSL: Handle reserved identifiers for entry point.
We only considered invalid names, and overwrote the alias for the
function. The correct fix is to replace illegal names early, do the
reserved fixup, then copy back alias to entry point name.
2021-01-04 09:40:11 +01:00
Hans-Kristian Arntzen
a11c4780d0 GLSL: Emit nonuniformEXT in correct place for late-combined samplers.
Need to emit nonuniformEXT(sampler2D()) since constructor expressions in
Vulkan GLSL do not propagate the nonuniform qualifier.
2020-12-07 13:00:15 +01:00
comex
c80cbde7aa spirv_msl: Don't add fixup hooks for builtin variables if they're unused.
This is necessary to avoid invalid output because of how implicit
dependencies on builtins work.

For example, the fixup for `BuiltInSubgroupEqMask` initializes the
variable based on `builtin_subgroup_invocation_id_id`, a field storing
the ID for a variable with decoration `BuiltInSubgroupLocalInvocationId`.
This could be either a variable that already exists in the input
(spirv_msl.cpp:300) or, if necessary, a newly created one
(spirv_msl.cpp:621).  In both cases, though,
`builtin_subgroup_invocation_id_id` is only set under the condition
`need_subgroup_mask || needs_subgroup_invocation_id`.
`need_subgroup_mask` is true if any of the `BuiltInSubgroupXXMask` are
set in `active_input_builtins`.

Normally, if the program contains `BuiltInSubgroupEqMask`,
`Compiler::ActiveBuiltinHandler` will set it in `active_input_builtins`.
But this only happens if the variable is actually used, whereas
`fix_up_shader_inputs_outputs` loops over all variables in the program
regardless of whether they're used.

If `BuiltInSubgroupEqMask` is not used,
`builtin_subgroup_invocation_id_id` is never set, but before this patch
the fixup hook would try to use it anyway, producing MSL that references
a nonexistent variable named `_0`.

Avoid this by changing `fix_up_shader_inputs_outputs` to skip builtins
which are not set in `active_input_builtins` or
`active_output_builtins`.  And add a test case.
2020-11-25 13:41:12 -05:00
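
The skip-unused-builtins rule described in c80cbde7aa can be sketched as follows. This is a minimal illustration, not the code in spirv_msl.cpp: `BuiltIn`, `Variable`, `kBuiltInCount` and `builtins_needing_fixup` are hypothetical stand-ins, and `std::bitset` takes the place of whatever type backs `active_input_builtins`.

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; not the SPIRV-Cross types.
constexpr std::size_t kBuiltInCount = 2;
enum BuiltIn : unsigned
{
    SubgroupEqMask = 0,
    SubgroupLocalInvocationId = 1
};

struct Variable
{
    BuiltIn builtin;
};

// Return the builtins that should get a fixup hook: only those marked active.
// A loop without the active check would install hooks for every declared builtin.
inline std::vector<BuiltIn> builtins_needing_fixup(const std::vector<Variable> &vars,
                                                   const std::bitset<kBuiltInCount> &active)
{
    std::vector<BuiltIn> hooked;
    for (const Variable &var : vars)
    {
        if (!active.test(var.builtin))
            continue; // declared but unused: its implicit dependencies were never set up
        hooked.push_back(var.builtin);
    }
    return hooked;
}
```

With an empty active set, a declared `SubgroupEqMask` yields no hooks, so nothing can reference an ID that was never assigned.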
Chip Davis
68908355a9 MSL: Expand subgroup support.
Add support for declaring a fixed subgroup size. Metal, like Vulkan with
`VK_EXT_subgroup_size_control`, allows the thread execution width to
vary depending on factors such as register usage. Unfortunately, this
breaks several tests that depend on the subgroup size being what the
device says it is. So we'll fix the subgroup size at the size the device
declares. The extra invocations in the subgroup will appear to be
inactive. Because of this, the ballot mask builtins are now ANDed with
the active subgroup mask.

Add support for emulating a subgroup of size 1. This is intended to be
used by Vulkan Portability implementations (e.g. MoltenVK) when the
hardware/software combo provides insufficient support for subgroups.
Luckily for us, Vulkan 1.1 only requires that the subgroup size be at
least 1.

Add support for quadgroup and SIMD-group functions which were added to
iOS in Metal 2.2 and 2.3. This will allow clients to take advantage of
expanded quadgroup and SIMD-group support in recent Metal versions and
on recent Apple GPUs (families 6 and 7).

Gut emulation of subgroup builtins in fragment shaders. It turns out
codegen for the SIMD-group functions in fragment wasn't implemented for
AMD on Mojave; it's a safe bet that it wasn't implemented for the other
drivers either. Subgroup support in fragment shaders now requires Metal
2.2.
2020-11-20 15:55:49 -06:00
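
The ballot masking mentioned in 68908355a9 can be illustrated with a small sketch. It assumes a 32-lane ballot word and that the active invocations occupy the low lanes; both are assumptions made for the example, and the function names are hypothetical rather than anything emitted by SPIRV-Cross.

```cpp
#include <cstdint>

// Hypothetical: mask with the low `active_size` bits set.
// Assumes active lanes are contiguous from lane 0 and active_size is in [1, 32].
inline uint32_t active_subgroup_mask(uint32_t active_size)
{
    // Shifting a 32-bit value by 32 is undefined, so handle the full case separately.
    return active_size >= 32 ? 0xffffffffu : ((1u << active_size) - 1u);
}

// AND the raw ballot with the active mask so lanes beyond the real
// execution width always read as inactive.
inline uint32_t masked_ballot(uint32_t raw_ballot, uint32_t active_size)
{
    return raw_ballot & active_subgroup_mask(active_size);
}
```

For example, if only 8 invocations actually run in a subgroup declared as 32 wide, a raw ballot of all ones is reduced to `0xff`.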
Hans-Kristian Arntzen
db13762297 MSL: Fix regression in image gather handling.
It was not always possible to get backing variable for a late-combined
image sampler.
2020-11-06 16:21:30 +01:00
Chip Davis
c20d5945a2 MSL: Allow framebuffer fetch on Mac in MSL 2.3.
Another Apple GPU feature that will now be supported on Apple Silicon
Macs.
2020-10-29 10:50:59 -05:00
Hans-Kristian Arntzen
f65f259ab7 MSL: Do not use component::x gather for depth2d textures. 2020-10-26 10:18:17 +01:00
Chip Davis
1264e2705e MSL: Cast broadcast booleans to ushort.
Metal doesn't support broadcasting or shuffling boolean values, but we
can work around that by casting it to `ushort`, then casting it back to
`bool`. I used `ushort` instead of `uint` because 16-bit values give
better throughput on Apple GPUs.
2020-10-23 21:55:46 -05:00
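
The cast-around-broadcast workaround in 1264e2705e can be modeled on the CPU as below. This is only a sketch: a subgroup is represented as a `std::array` of per-lane values, and `broadcast_u16` is a hypothetical stand-in for a broadcast primitive that accepts 16-bit integers but not booleans.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a broadcast that only supports 16-bit integers:
// every lane receives the value held by `lane`.
template <std::size_t N>
uint16_t broadcast_u16(const std::array<uint16_t, N> &lanes, std::size_t lane)
{
    return lanes[lane];
}

// Broadcast a boolean by widening to a 16-bit integer, broadcasting,
// then narrowing the result back to bool.
template <std::size_t N>
bool broadcast_bool(const std::array<bool, N> &lanes, std::size_t lane)
{
    std::array<uint16_t, N> widened{};
    for (std::size_t i = 0; i < N; ++i)
        widened[i] = static_cast<uint16_t>(lanes[i]);
    return static_cast<bool>(broadcast_u16(widened, lane));
}
```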
Chip Davis
781367d083 MSL: Support vectors with OpGroupNonUniformAllEqual.
This was not tested here in SPIRV-Cross. Predictably, it broke when I
tried it in the CTS.
2020-10-23 21:55:46 -05:00
Hans-Kristian Arntzen
3360daa6f3 MSL: Fix OpCompositeInsert and OpVectorInsertDynamic.
Need to take care of unpacked RHS expressions.
2020-09-02 10:27:39 +02:00
Chip Davis
688c5fcbda MSL: Add support for processing more than one patch per workgroup.
This should hopefully reduce underutilization of the GPU, especially on
GPUs where the thread execution width is greater than the number of
control points.

This also simplifies initialization by reading the buffer directly
instead of using Metal's vertex-attribute-in-compute support. It turns
out the only way in which shader stages are allowed to differ in their
interfaces is in the number of components per vector; the base type must
be the same. Since we are using the raw buffer instead of attributes, we
can now also emit arrays and matrices directly into the buffer, instead
of flattening them and then unpacking them. Structs are still flattened,
however; this is due to the need to handle vectors with fewer components
than were output, and I think handling this while also directly emitting
structs could get ugly.

Another advantage of this scheme is that the extra invocations needed to
read the attributes when there were more input than output points are
now no more. The number of threads per workgroup is now lcm(SIMD-size,
output control points). This should ensure we always process a whole
number of patches per workgroup.

To avoid complexity handling indices in the tessellation control shader,
I've also changed the way vertex shaders for tessellation are handled.
They are now compute kernels using Metal's support for vertex-style
stage input. This lets us always emit vertices into the buffer in order
of vertex shader execution. Now we no longer have to deal with indexing
in the tessellation control shader. This also fixes a long-standing
issue where if an index were greater than the number of vertices to
draw, the vertex shader would wind up writing outside the buffer, and
the vertex would be lost.

This is a breaking change, and I know SPIRV-Cross has other clients, so
I've hidden this behind an option for now. In the future, I want to
remove this option and make it the default.
2020-07-23 17:59:54 -05:00
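
The workgroup sizing rule stated in 688c5fcbda, lcm(SIMD-size, output control points), can be checked with a short sketch. The helper names are hypothetical and the code only illustrates the arithmetic; it is not taken from SPIRV-Cross.

```cpp
#include <cstdint>

// Greatest common divisor by the Euclidean algorithm.
inline uint32_t gcd_u32(uint32_t a, uint32_t b)
{
    while (b != 0)
    {
        uint32_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Threads per workgroup = lcm(SIMD size, output control points).
// Both arguments are assumed to be non-zero.
inline uint32_t threads_per_workgroup(uint32_t simd_size, uint32_t output_control_points)
{
    return simd_size / gcd_u32(simd_size, output_control_points) * output_control_points;
}

// Because the thread count is a multiple of the control point count,
// this division is always exact: a whole number of patches per workgroup.
inline uint32_t patches_per_workgroup(uint32_t simd_size, uint32_t output_control_points)
{
    return threads_per_workgroup(simd_size, output_control_points) / output_control_points;
}
```

With a SIMD size of 32 and 3 output control points, this gives 96 threads and 32 patches per workgroup; with 4 control points it gives 32 threads and 8 patches.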
Hans-Kristian Arntzen
fa5b206d97 MSL: Workaround broken vector -> scalar access chain in MSL.
On MSL, the compiler refuses to allow access chains into a normal vector type.
What happens in practice instead is a read-modify-write where a vector type is
loaded, modified and written back.

The workaround is to convert a vector into a pointer-to-scalar before
the access chain continues to add the scalar index.
2020-07-06 10:03:44 +02:00
Hans-Kristian Arntzen
e1600d4df8 MSL: Use input attachment index directly for resource index fallback. 2020-07-06 09:49:46 +02:00
Hans-Kristian Arntzen
ace4d25222 MSL: Add test case for constructing struct with non-value-type array. 2020-06-18 12:55:59 +02:00
Hans-Kristian Arntzen
7314f51a32 MSL: Deal with loading non-value-type arrays. 2020-06-18 12:46:39 +02:00
Hans-Kristian Arntzen
02db4c1f16 MSL: Add tests for array copies in and out of buffers. 2020-06-18 11:59:02 +02:00
Hans-Kristian Arntzen
0ebb88cc39 MSL: Redirect member indices when buffer has been sorted by Offset.
If a buffer rewrites its Offsets, all member references to that struct
are invalidated and must be redirected. Do so in to_member_reference;
there might be other places where this is needed, so fix as required.
SPIR-V code relying on this is somewhat questionable, but seems to be
in-spec.
2020-04-30 11:48:53 +02:00
Hans-Kristian Arntzen
d7d630a0b7
Merge pull request #1347 from KhronosGroup/fix-1343
Implement OpAtomicLoad/OpAtomicStore.
2020-04-27 15:29:21 +02:00
Hans-Kristian Arntzen
9b7140e2ba Implement OpAtomicLoad/OpAtomicStore.
Need some emulation on GLSL/HLSL, fix bug with atomic store on MSL.
2020-04-27 12:11:46 +02:00
Hans-Kristian Arntzen
6ef47d6657 MSL: Fix case where subpassInput is passed to leaf functions. 2020-04-27 11:29:21 +02:00
Hans-Kristian Arntzen
5e5d1c27ce GLSL: Support f16x2 <-> f32 bitcast.
There is no native formulation, so introduce a concept of a "complex"
bitcast to handle odd-ball cases which have no native unary operation.
2020-04-21 23:27:33 +02:00
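
What an f16x2 <-> f32 bitcast means at the bit level can be sketched on the CPU as follows. This does not show the GLSL that 5e5d1c27ce emits. It assumes a 32-bit IEEE-754 `float` and that the first half component occupies the 16 least significant bits (the `packHalf2x16` convention); `Half2Bits` and both functions are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>

// Raw bit patterns of two 16-bit floats; `lo` is the first component.
struct Half2Bits
{
    uint16_t lo;
    uint16_t hi;
};

// Reinterpret two 16-bit patterns as one 32-bit float, without converting values.
inline float bitcast_f16x2_to_f32(Half2Bits halves)
{
    uint32_t packed = static_cast<uint32_t>(halves.lo) | (static_cast<uint32_t>(halves.hi) << 16);
    float result;
    static_assert(sizeof(result) == sizeof(packed), "float is assumed to be 32 bits wide");
    std::memcpy(&result, &packed, sizeof(result));
    return result;
}

// The inverse: split the 32 bits of a float into two 16-bit patterns.
inline Half2Bits bitcast_f32_to_f16x2(float value)
{
    uint32_t packed;
    std::memcpy(&packed, &value, sizeof(packed));
    return Half2Bits{ static_cast<uint16_t>(packed & 0xffffu), static_cast<uint16_t>(packed >> 16) };
}
```

No single unary operation does this in one step, which is why the commit treats it as a "complex" bitcast: the two halves have to be packed into an integer first and only then reinterpreted.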
Hans-Kristian Arntzen
f8592ecdfc MSL: Deal correctly with initializers on Private variables.
Do not attempt to defer declaration. It would happen to work in most
cases, but the edge case is where the first thing that happens to a
variable is being OpStore'd into.
2020-04-21 11:20:49 +02:00
Hans-Kristian Arntzen
17ad62eea4 MSL: Support edge case with DX layout in scalar block layout.
DX may emit ArrayStride and MatrixStride of 16, but the size of the
object does not align with that, and DX expects to pack other members
inside its last member.

The workaround is to emit array size/col/row one less than we expect and
rely on padding to carve out a "dead zone" for the last member.
2020-04-20 15:29:24 +02:00
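
One reading of the 17ad62eea4 workaround, sketched with plain C++ structs. The layout is an assumed example (a `float a[2]` with ArrayStride 16 followed by a `float b` at offset 20), the type and member names are hypothetical, and `float` is assumed to be 4 bytes with 4-byte alignment. It is not the MSL that SPIRV-Cross generates.

```cpp
#include <cstddef>
#include <cstdint>

// Size of an array whose last element is NOT padded out to the stride,
// so the next member may start inside what would be its padding.
constexpr uint32_t dx_array_size(uint32_t count, uint32_t stride, uint32_t element_size)
{
    return (count - 1) * stride + element_size;
}

// One array element padded to the 16-byte stride.
struct PaddedFloat
{
    float value;
    char pad[12];
};

// Emitting PaddedFloat a[2] would occupy 32 bytes and push b past offset 20.
// Declaring one element fewer and padding manually keeps b where the
// original layout expects it.
struct Emitted
{
    PaddedFloat a[1];  // array declared one element short
    char dead_zone[4]; // bytes [16, 20): storage belonging to the last element a[1]
    float b;           // lands at offset 20
};

static_assert(sizeof(PaddedFloat) == 16, "element must match the 16-byte stride");
static_assert(offsetof(Emitted, b) == dx_array_size(2, 16, sizeof(float)),
              "b must start right after the unpadded last element");
```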
Hans-Kristian Arntzen
c7b75a8fe6 MSL: Do not use base expression with PhysicalTypeID OpCompositeExtract.
Similar reasoning as packed expressions.
2020-04-07 18:25:44 +02:00
Hans-Kristian Arntzen
d9d3359ffb MSL: Deal with cases where builtin is implicitly needed, declared, but unused.
We need to make sure any builtins which are declared and unused are
emitted as active variables.
2020-04-03 12:50:21 +02:00
Hans-Kristian Arntzen
3cb6aeb480 MSL: Fix access chain for deep struct hierarchy on array of buffers. 2020-03-31 14:17:29 +02:00
Hans-Kristian Arntzen
b8905bbd95 Add support for forcefully zero-initialized variables.
Useful to better support certain platforms which require all variables
to be initialized to something.
2020-03-26 13:38:27 +01:00
Hans-Kristian Arntzen
30343f3e95 MSL: Reintroduce workaround for constant arrays being passed by value. 2020-02-24 13:22:52 +01:00
Hans-Kristian Arntzen
af787a8a79
Merge pull request #1264 from KhronosGroup/msl-argument-buffer-persist
MSL: Add support for force-activating IAB resources.
2020-01-16 14:44:23 +01:00
Hans-Kristian Arntzen
c3bd136df1 MSL: Add support for force-activating IAB resources.
Important for ABI compatibility on MSL in certain cases.
2020-01-16 11:12:06 +01:00
Hans-Kristian Arntzen
f79c1e2fed Deal with illegal names in types as well.
- Fixes issue with clip_distance flattening in MSL where member to
  flatten from would come from to_member_name, where it should have used
  the builtin name directly. This member name was modified by this patch
  and broke clip distance test shaders.

- Some cleanups with ir.meta, use ir.find_meta instead to not create
  unnecessary hashmap nodes.
2020-01-16 10:34:49 +01:00
Hans-Kristian Arntzen
2bbb012e9c MSL: Deal with sign on wave min/max. 2020-01-09 12:35:18 +01:00
Hans-Kristian Arntzen
c024e24d45 MSL: Deal with padded fragment output + Component decoration. 2020-01-07 17:02:12 +01:00
Hans-Kristian Arntzen
93f3265fe0 MSL: Deal with packing vectors for vertex input/fragment output. 2020-01-07 14:14:31 +01:00
Hans-Kristian Arntzen
7a69d764b0 MSL: Add trivial tests for Component decoration.
Verifies that Component decoration is honored for vertex outputs and
fragment inputs.
2020-01-07 11:36:51 +01:00
Hans-Kristian Arntzen
8bef6ff167 Add test shader for OpCopyLogical with packing/unpacking. 2020-01-06 12:44:18 +01:00
Hans-Kristian Arntzen
9012a39b60 Basic implementation of OpCopyLogical. 2020-01-06 11:47:26 +01:00
Hans-Kristian Arntzen
d9afa9e238 MSL: Fix unpack_expression from column of padded matrix. 2019-11-07 11:35:07 +01:00
Hans-Kristian Arntzen
d4ca91f6c2 Move .invalid. test shaders to the more appropriate subfolders. 2019-11-06 10:40:37 +01:00
Dan Sinclair
d409210ee5 Move all .invalid shaders into no-opt folders. 2019-11-05 13:19:19 -05:00
Hans-Kristian Arntzen
3b5c4c7316 Implement constant empty struct correctly on all backends.
MSL actually supports empty structs, so enable that path as well.
2019-10-26 16:10:11 +02:00
Lukas Hermanns
c236ca4572 Moved all UE4 test shaders into 'shaders-ue4/' folder. 2019-10-23 17:39:05 -04:00
Lukas Hermanns
2482ff708c Merge remote-tracking branch 'upstream/master' 2019-10-14 11:06:15 -04:00
Hans-Kristian Arntzen
07e9501ae1 MSL: Fix regression with OpCompositeConstruct from std140 float[].
Simple fix, just need to use to_unpacked_expression rather than to_expression here to
deal with this.
2019-10-11 11:21:43 +02:00
Lukas Hermanns
0853bcaee1 Disabled spvUnsafeArray<> type for packed vectors and added test cases for those arrays. 2019-10-09 17:59:47 -04:00
Hans-Kristian Arntzen
a0c13e4ee8 Do not consider aliased struct types if the master is not a block.
It is possible for a shader to declare two plain struct types which
simply share the same OpName without there being an implicit
value/buffer alias relationship.

For to_member_name(), make sure to use the type alias master when
resolving member names. The member name may be different in a type alias
master if the SPIR-V is being intentionally difficult.
2019-10-07 10:52:16 +02:00
Ryan Harrison
cf1bf1c6ae Update external/ to SPIR-V 1.5
Rolled the hashes used for glslang, SPIRV-Tools, and SPIRV-Headers to
HEAD, which includes the update to 1.5.

Added passing '--amb' to glslang, so I didn't have to explicitly set
bindings in the large number of test shaders that currently don't set
them and which glslang now considers invalid.

Marked all shaders that no longer pass spirv-val as .invalid.
2019-09-18 16:04:27 -04:00
Hans-Kristian Arntzen
0286442906 Add test case for interlocks in control flow. 2019-09-04 13:10:32 +02:00
Hans-Kristian Arntzen
65e48ca5ea Add interlock test for split functions doing begin/end. 2019-09-04 12:26:34 +02:00
Hans-Kristian Arntzen
261b46982a Deal with complex interlock cases in GLSL. 2019-09-04 12:18:04 +02:00
Hans-Kristian Arntzen
63a770ed5c Add test shader for simple case of interlocked callstack. 2019-09-04 11:56:19 +02:00
Hans-Kristian Arntzen
9436cd3036 MSL: Deal with array copies from and to threadgroup. 2019-08-27 13:18:01 +02:00
Hans-Kristian Arntzen
b3305799a8 Deal correctly with sign on bitfield operations.
Need a lot of special purpose implementation functions for these.
2019-08-26 11:36:36 +02:00
Hans-Kristian Arntzen
abb345d0b3 MSL: Deal with Modf/Frexp where output is access chain to scalar.
This is not allowed as we cannot take mutable reference to a
vec.{x,y,z,w}. We only care about scalar since entire vectors are fine.
2019-07-26 11:02:38 +02:00
Hans-Kristian Arntzen
18bcc9b790 Do not disable temporary forwarding when we suppress usage tracking.
This subtle bug removed any expression validation for trivially swizzled
variables. Make usage suppression a more explicit concept rather than
just hacking off forwarded_temporaries.

There is some fallout here with loop generation since our expression
invalidation is currently a bit too naive to handle loops properly.
The forwarding bug masked this problem until now.

If part of the loop condition is also used in the body, we end up
reading an invalid expression, which in turn forces a temporary to be
generated in the condition block, not good. We'll need to be smarter
here ...
2019-07-23 19:18:44 +02:00
Hans-Kristian Arntzen
8ba0507a6d Add another test for unpacking without load forwarding. 2019-07-23 17:14:59 +02:00
Hans-Kristian Arntzen
1ece67a050 Look at pointee type when unpacking expressions.
We might be unpacking in OpLoad, so don't want any pointer types from
access chains creeping in.
2019-07-23 17:07:15 +02:00
Hans-Kristian Arntzen
ebe109d91d Deal correctly with non-forwarded packed loads.
Need to unpack the expression if we're not forwarding.
2019-07-23 16:25:19 +02:00
Hans-Kristian Arntzen
79f533b662 Test CompositeInsert/Extract/VectorShuffle on packed vectors. 2019-07-23 15:44:35 +02:00
Hans-Kristian Arntzen
5582145549 Add test for array of scalar struct. 2019-07-23 15:30:03 +02:00
Hans-Kristian Arntzen
5c1cb7accf Recursively pack struct types when we find scalar packed structs. 2019-07-23 15:24:53 +02:00
Hans-Kristian Arntzen
0f10601f27 Test matrix multiplies in more complex scenarios. 2019-07-23 12:12:24 +02:00
Hans-Kristian Arntzen
978253c804 Test implicit packing of struct members. 2019-07-23 12:04:15 +02:00
Hans-Kristian Arntzen
fc741596d4 Add tests for struct padding and self-alignment. 2019-07-23 11:46:34 +02:00
Hans-Kristian Arntzen
7277c7ac46 Use to_unpacked_row_major_expression to unify row-major in MSL/GLSL. 2019-07-23 11:36:54 +02:00
Hans-Kristian Arntzen
47a18b9f1b Simplify row-major matrix/vector multiplies. 2019-07-23 10:56:57 +02:00
Hans-Kristian Arntzen
d584d833fa Test array of std140 vectors. 2019-07-23 10:38:32 +02:00
Hans-Kristian Arntzen
6224199c76 Add struct size padding tests. 2019-07-23 10:30:37 +02:00
Hans-Kristian Arntzen
82c819ee6c Add test for CompositeExtract from row-major loaded vector. 2019-07-22 16:32:22 +02:00
Hans-Kristian Arntzen
d7a5303cf2 Add test for split access chain into row-major matrix. 2019-07-22 16:28:05 +02:00
Hans-Kristian Arntzen
172185016f MSL: Add std140 and scalar matrix layouts. 2019-07-22 11:30:03 +02:00
Hans-Kristian Arntzen
6471236652 MSL: Add std430 matrix access test. 2019-07-22 11:23:06 +02:00
Hans-Kristian Arntzen
c7eda1bce9 Test glsl.std450 more exhaustively.
Make sure to test everything with scalar as well to catch any weird edge
cases.

Not all opcodes are covered here, just the arithmetic ones. FP64 packing
is also ignored.
2019-07-17 11:53:05 +02:00
Hans-Kristian Arntzen
932ee0e328 Deal correctly with return sign of bitscan operations. 2019-07-12 10:57:56 +02:00
Hans-Kristian Arntzen
c365cc1b43 Deal with OpPhi and case fallthrough.
This is quite complex since we cannot flush Phi inside the case labels,
we have to do it outside by emitting a lot of manual branches ourselves.

This should be extremely rare, but we need to handle this case.
2019-06-21 13:38:23 +02:00
Hans-Kristian Arntzen
8b236f24f1 Fix infinite loop when OpAtomic* temporaries are used in other blocks.
We made the mistake of registering a dependency on the atomic variable
even if the atomic result was forced to a temporary. There is no need to
register reads from atomic variables like this as we always force atomic
results to a temporary and argument read/writes do not need to be
tracked.
2019-04-24 09:33:39 +02:00
Hans-Kristian Arntzen
9ae91c2d1e Deal with mismatched signs in S/U/F conversion opcodes. 2019-04-10 14:03:58 +02:00