Commit Graph

1002 Commits

Author SHA1 Message Date
Chip Davis
547c29f7bb MSL: Allow Bias and Grad arguments with comparison on Mac in MSL 2.3.
I kept the code to replace constant zero arguments, because `Bias` and
`Grad` still have some problems on desktop GPUs.

`Bias` works on AMD GPUs. `Grad` does not. Both work on Intel. Still
needs testing on NV. It will definitely work with Apple GPUs.
2020-10-30 11:14:59 -05:00
Hans-Kristian Arntzen
439b666829 GLSL: Fix nonuniformEXT injection.
Needs to consider that other expressions might be using brackets as well
...
2020-10-30 14:11:16 +01:00
Hans-Kristian Arntzen
541a801fed
Merge pull request #1514 from cdavis5e/msl-mac-framebuffer-fetch
MSL: Allow framebuffer fetch on Mac in MSL 2.3.
2020-10-30 08:09:41 +01:00
Yuwen Wu
c8a43876c7
added metal keyworld: "level" (#1501)
* added metal keyworld: "level"

* added more metal keywords

* updated test case.
2020-10-30 08:07:25 +01:00
Chip Davis
c20d5945a2 MSL: Allow framebuffer fetch on Mac in MSL 2.3.
Another Apple GPU feature that will now be supported on Apple Silicon
Macs.
2020-10-29 10:50:59 -05:00
Hans-Kristian Arntzen
78c6d2d628
Merge pull request #1509 from cdavis5e/mac-post-depth-coverage
MSL: Allow post-depth coverage on Mac in MSL 2.3.
2020-10-29 09:50:44 +01:00
Hans-Kristian Arntzen
08e49bfd67
Merge pull request #1508 from KhronosGroup/fix-1507
Handle case where block is loop header, continue AND break block.
2020-10-28 16:04:14 +01:00
Chip Davis
d48d2a95c7 MSL: Allow post-depth coverage on Mac in MSL 2.3.
It's still only supported on Apple GPUs, but Macs will have those soon.
2020-10-27 22:07:01 -05:00
Hans-Kristian Arntzen
542d460364 Handle case where block is loop header, continue AND break block. 2020-10-27 12:29:08 +01:00
Hans-Kristian Arntzen
e47561a28b GLSL: Support a workaround for loading row-major matrices.
On AMD Windows OpenGL, it has been reported that we need to load
matrices via a wrapper function.
2020-10-27 12:07:09 +01:00
Hans-Kristian Arntzen
f65f259ab7 MSL: Do not use component::x gather for depth2d textures. 2020-10-26 10:18:17 +01:00
Chip Davis
1264e2705e MSL: Cast broadcast booleans to ushort.
Metal doesn't support broadcasting or shuffling boolean values, but we
can work around that by casting it to `ushort`, then casting it back to
`bool`. I used `ushort` instead of `uint` because 16-bit values give
better throughput on Apple GPUs.
2020-10-23 21:55:46 -05:00
Chip Davis
065b5bda3c MSL: Mask ballots passed to Ballot bit ops.
Only the least *n* bits are significant, where *n* is the subgroup size.
The Vulkan CTS actually checks this.

The `FindLSB` tests weren't actually failing, but I masked that anyway,
in case there's some corner case the CTS is missing.
2020-10-23 21:55:46 -05:00
Chip Davis
781367d083 MSL: Support vectors with OpGroupNonUniformAllEqual.
This was not tested here in SPIRV-Cross. Predictably, it broke when I
tried it in the CTS.
2020-10-23 21:55:46 -05:00
Chip Davis
6ccb902462 MSL: Correct definitions of subgroup ballot mask variables.
`SubgroupEqMask` had a fencepost error that gave wrong values for
invocation ID 32.

For `SubgroupGeMask` and `SubgroupGtMask`, I forgot to shift the values
from `extract_bits()` up so that the mask is in the correct position.
Using `insert_bits()` instead should fold these two operations into one.

`SubgroupLtMask` and `SubgroupLeMask` were already correct.
2020-10-23 21:54:55 -05:00
Chip Davis
064ed448b9 MSL: Don't remove periods from swizzle buffer index exprs. 2020-10-20 17:47:40 -05:00
Chip Davis
5845e009ea MSL: Handle Offset and Grad operands for 1D-as-2D textures. 2020-10-15 12:51:00 -05:00
Chip Davis
3e6010d8c5 MSL: Don't use a bitcast for tessellation levels in tesc shaders.
`half` cannot be bitcasted to `float`, because the two types are not the
same size. Use an expanding cast instead.

We were already doing this for stores to the tessellation levels; why I
didn't also do this for loads is beyond me.
2020-10-14 18:35:59 -05:00
Chip Davis
21d38f74ce MSL: Fix calculation of atomic image buffer address.
Fix reversed coordinates: `y` should be used to calculate the row
address. Align row address to the row stride.

I've made the row alignment a function constant; this makes it possible
to override it at pipeline compile time.

Honestly, I don't know how this worked at all for Epic. It definitely
didn't work in the CTS prior to this.
2020-10-13 20:51:56 -05:00
Chip Davis
7a5d0d6b29 MSL: Add missing interlock handling to atomic image buffers. 2020-10-13 11:44:17 -05:00
Hans-Kristian Arntzen
fab6ad234e
Merge pull request #1486 from cdavis5e/atomic-image-argument-buffer
MSL: Support atomic access to images from argument buffers.
2020-10-13 12:55:43 +02:00
Chip Davis
9cafea6cf8 MSL: Support atomic access to images from argument buffers.
This was not added when Epic contributed atomic image support.

Fixes #1484.
2020-10-13 02:37:18 -05:00
Chip Davis
2219c4a392 MSL: Support SPV_EXT_demote_to_helper_invocation for MSL 2.3.
MSL 2.3 has everything needed to support this extension on all
platforms. The existing `discard_fragment()` function was given demote
semantics, similar to Direct3D, and the `simd_is_helper_thread()`
function was finally added to iOS.

I've left the old test alone. Should I remove it in favor of these?
2020-10-13 00:25:32 -05:00
Hans-Kristian Arntzen
5619329665 Style nits for GL subgroup implementation. 2020-10-08 13:25:29 +02:00
Hans-Kristian Arntzen
a6f6547cf1 Add missing VK variant of the test file. 2020-10-08 12:22:45 +02:00
Hans-Kristian Arntzen
28994a3186 Update GL subgroup test file. 2020-10-08 12:22:24 +02:00
Hans-Kristian Arntzen
819c599ecd Merge branch 'issues1350-2' of git://github.com/devshgraphicsprogramming/SPIRV-Cross into master 2020-10-08 12:20:07 +02:00
criss
db52e277b9 Resolved issues 1350, 1351, 1352 2020-10-08 12:14:52 +02:00
Hans-Kristian Arntzen
e0c9aad934 GLSL: Add support for transform_feedback3 geometry streams. 2020-09-30 13:01:35 +02:00
dan sinclair
9880b05572 Roll dependencies.
This CL rolls the spirv-tools, spirv-headers and glslang dependencies.
2020-09-22 12:31:38 -04:00
Hans-Kristian Arntzen
54cc0b01f6 Deal with case where a selection construct conditionally merges/breaks. 2020-09-17 12:02:43 +02:00
Hans-Kristian Arntzen
66afe8c499 Implement a simple evaluator of specialization constants.
In some cases, we need to get a literal value from a spec constant op.
Mostly relevant when emitting buffers, so implement a 32-bit integer
scalar subset of the evaluator. Can be extended as needed to support
evaluating any specialization constant operation.
2020-09-14 11:45:59 +02:00
Chip Davis
4cf840ee7b MSL: Support layered input attachments.
These need to use arrayed texture types, or Metal will complain when
binding the resource. The target layer is addressed relative to the
Layer output by the vertex pipeline, or to the ViewIndex if in a
multiview pipeline. Unlike with the s/t coordinates, Vulkan does not
forbid non-zero layer coordinates here, though this cannot be expressed
in Vulkan GLSL.

Supporting 3D textures will require additional work. Part of the problem
is that Metal does not allow texture views to subset a 3D texture, so we
need some way to pass the base depth to the shader.
2020-09-02 09:18:25 -05:00
Hans-Kristian Arntzen
3360daa6f3 MSL: Fix OpCompositeInsert and OpVectorInsertDynamic.
Need to take care of unpacked RHS expressions.
2020-09-02 10:27:39 +02:00
Chip Davis
cab7335e64 MSL: Don't set the layer for multiview if the device doesn't support it.
Some older iOS devices don't support layered rendering. In that case,
don't set `[[render_target_array_index]]`, because the compiler will
reject the shader in that case. The client will then have to unroll the
render pass manually.
2020-09-01 19:30:28 -05:00
Chip Davis
53080ecca8 MSL: Fix multiview view index calculation with a non-zero base instance.
Account for a non-zero base instance when calculating the view index and
the "real" instance index. Before, it was likely broken with a non-zero
base instance, since the calculated instance index could be less than
the base instance.
2020-08-31 20:33:44 -05:00
Hans-Kristian Arntzen
a07441568e Overhaul how we deal with reserved identifiers.
- Do not silently drop reserved identifiers in the parser. This makes it
  possible to reflect identifiers which are reserved by the
  cross-compiler module.
- Instead of dropping the name, emit _RESERVED_IDENTIFIER_FIXUP in the
  source to make it clear that a name has been rewritten.
- Document what is reserved and not.
2020-08-21 16:33:27 +02:00
Hans-Kristian Arntzen
f0fe4442e3
Merge pull request #1448 from KhronosGroup/fix-1437
HLSL: Fix some subtle bugs in buffer packing handling.
2020-08-20 19:21:50 +02:00
Hans-Kristian Arntzen
fdbc80d131 HLSL: Fix FragCoord.w.
Need to invert it, SM 4.0+ uses W, not 1/W (like Vulkan/GL).
2020-08-20 16:22:48 +02:00
Hans-Kristian Arntzen
fad36a6b28 HLSL: Deal with partially filled 16-byte word in cbuffers.
The last element of an array or matrix in HLSL cbuffers are not filled
completely, but only have a size equal to the base vector.
2020-08-20 16:05:21 +02:00
Hans-Kristian Arntzen
dd1f53ff15 HLSL: Fix bug in is_packing_standard for cbuffer.
Was not keeping offset in sync with actual_offset and HLSL could trigger
spurious realignments due to the straddle check.
2020-08-20 15:26:55 +02:00
Le Hoang Quyen
ab8eb70af1 Fix #1445: MSL: Enclose args when convert distance(a,b) to abs(a-b) 2020-08-13 21:16:08 +08:00
Chip Davis
3347b1076d MSL: Fix handling of matrices and structs in the output control point array.
Prior to this point, we were treating them as flattened, as they are in
old-style tessellation control shaders, and still are for structs in
new-style shaders. This is not true for outputs; output composites are
not flattened at all. This semantic mismatch broke a Vulkan CTS test.
It should now pass.
2020-08-03 17:18:18 -05:00
Hans-Kristian Arntzen
8a1843ab20 Add some test cases for complex type aliasing scenario. 2020-07-29 13:02:52 +02:00
Hans-Kristian Arntzen
aac6885950 GLSL: Be more aggressive about using type_alias.
To facilitate an improved linking-by-name use case for older GL,
we will be more aggressive about merging struct definitions, even for
rather unrelated cases where we don't strictly need to use type aliases.
2020-07-29 12:48:41 +02:00
Hans-Kristian Arntzen
57c93d44ac GLSL: Add option to force flattening IO blocks.
It is not always desirable to use actual blocks.
A prime example in the case where EXT_shader_io_blocks is not supported
on the target implementation.
2020-07-28 15:16:06 +02:00
Hans-Kristian Arntzen
f5e9f4a172
Merge pull request #1432 from ponitka/hlsl-sample-mask
Adding BuiltInSampleMask in HLSL
2020-07-28 14:40:40 +02:00
Tomek Ponitka
ba58f78395 Adding BuiltInSampleMask in HLSL 2020-07-27 14:14:26 +02:00
Tomek Ponitka
18f23c47d9 Enabling setting a fixed sampleMask in Metal fragment shaders.
In Metal render pipelines don't have an option to set a sampleMask
parameter, the only way to get that functionality is to set the
sample_mask output of the fragment shader to this value directly.
We also need to take care to combine the fixed sample mask with the
one that the shader might possibly output.
2020-07-24 11:19:46 +02:00
Chip Davis
688c5fcbda MSL: Add support for processing more than one patch per workgroup.
This should hopefully reduce underutilization of the GPU, especially on
GPUs where the thread execution width is greater than the number of
control points.

This also simplifies initialization by reading the buffer directly
instead of using Metal's vertex-attribute-in-compute support. It turns
out the only way in which shader stages are allowed to differ in their
interfaces is in the number of components per vector; the base type must
be the same. Since we are using the raw buffer instead of attributes, we
can now also emit arrays and matrices directly into the buffer, instead
of flattening them and then unpacking them. Structs are still flattened,
however; this is due to the need to handle vectors with fewer components
than were output, and I think handling this while also directly emitting
structs could get ugly.

Another advantage of this scheme is that the extra invocations needed to
read the attributes when there were more input than output points are
now no more. The number of threads per workgroup is now lcm(SIMD-size,
output control points). This should ensure we always process a whole
number of patches per workgroup.

To avoid complexity handling indices in the tessellation control shader,
I've also changed the way vertex shaders for tessellation are handled.
They are now compute kernels using Metal's support for vertex-style
stage input. This lets us always emit vertices into the buffer in order
of vertex shader execution. Now we no longer have to deal with indexing
in the tessellation control shader. This also fixes a long-standing
issue where if an index were greater than the number of vertices to
draw, the vertex shader would wind up writing outside the buffer, and
the vertex would be lost.

This is a breaking change, and I know SPIRV-Cross has other clients, so
I've hidden this behind an option for now. In the future, I want to
remove this option and make it the default.
2020-07-23 17:59:54 -05:00