Commit Graph

618 Commits

Author SHA1 Message Date
Hans-Kristian Arntzen
ad5eae46ed
Merge pull request #1078 from cdavis5e/post-depth-coverage
Support the SPV_KHR_post_depth_coverage extension.
2019-07-12 09:56:26 +02:00
Hans-Kristian Arntzen
2e32d4c0db
Merge pull request #1079 from cdavis5e/msl-boolean-mix
MSL: Use the select() function for OpSelect.
2019-07-12 09:52:57 +02:00
Chip Davis
6628ea6e48 MSL: Use the select() function for OpSelect.
This significantly improves codegen for vector `OpSelect` in MSL.
2019-07-11 10:30:37 -05:00
Chip Davis
1df47db6ba Support the SPV_KHR_post_depth_coverage extension.
Using the `PostDepthCoverage` mode specifies that the `gl_SampleMaskIn`
variable is to contain the computed coverage mask following the early
fragment tests, which this mode requires and implicitly enables.

Note that unlike Vulkan and OpenGL, Metal places this on the sample mask
input itself, and furthermore does *not* implicitly enable early
fragment testing. If it isn't enabled explicitly with an
`[[early_fragment_tests]]` attribute, the compiler will error out. So we
have to enable that mode explicitly if `PostDepthCoverage` is enabled
but `EarlyFragmentTests` isn't.

For Metal, only iOS supports this; for some reason, Apple has yet to
implement it on macOS, even though many desktop cards support it.
2019-07-11 10:28:43 -05:00
Hans-Kristian Arntzen
63bcbd511e GLSL: Need extension to use bitcast on GLSL < 330. 2019-07-11 15:32:57 +02:00
Hans-Kristian Arntzen
25c74b324e Forget loop variable enables after emitting block chain. 2019-07-10 12:57:12 +02:00
Hans-Kristian Arntzen
f6f849397e MSL: Re-roll array expressions in initializers.
We cannot rely on copy path when using an array as part of a struct
initializer, so reroll such expressions to an initializer list again.
2019-07-10 11:19:33 +02:00
Hans-Kristian Arntzen
53ab2144b9
Merge pull request #1064 from KhronosGroup/fix-1062
Fall back to complex loop if non-trivial continue block is found.
2019-07-08 13:58:35 +02:00
Hans-Kristian Arntzen
50342966c0 Fall back to complex loop if non-trivial continue block is found.
There is a case where we can deduce a for/while loop, but the continue
block is actually very painful to deal with, so handle that case as
well. Removes an exceptional case.
2019-07-08 11:54:29 +02:00
Hans-Kristian Arntzen
d12b54bbb4 Propagate NonUniformEXT to dependent expressions.
This decoration might only be present for the very last ID which is
consumed by a sampling or Load/Store instruction. To make sure our
access chains are emitted correctly, we have to back-propagate this
decoration.
2019-07-08 11:19:38 +02:00
Lifeng Pan
5ca8779044 Parse SPIR-V debug information extended instructions, as well as OpNoLine.
No impact on result shader string.
2019-07-04 16:21:44 +08:00
Hans-Kristian Arntzen
581ed0fd59 HLSL: Does not support case-fallthrough.
Disable any fallthrough on HLSL. Risky business if fallthrough blocks
had a barrier(), but can't do anything about that ...
2019-06-27 15:10:17 +02:00
Hans-Kristian Arntzen
c76b99b711 Handle more cases with FP16 and texture sampling. 2019-06-27 15:04:22 +02:00
Hans-Kristian Arntzen
bcef66fbf3 Fix declaration of loop variables with a Phi helper copy.
Certain Phi variables need to maintain a temporary copy, but we forgot
to declare them when the master variable is a loop variable itself.
2019-06-25 10:45:15 +02:00
Hans-Kristian Arntzen
7557ff5567 Workaround GCC 9 bug. 2019-06-24 10:17:25 +02:00
Hans-Kristian Arntzen
b4e0163749 Run format_all.sh. 2019-06-21 16:02:22 +02:00
Hans-Kristian Arntzen
bcec5cb370 Old MSVC does not like +[] constructs. 2019-06-21 14:59:51 +02:00
Hans-Kristian Arntzen
c365cc1b43 Deal with OpPhi and case fallthrough.
This is quite complex since we cannot flush Phi inside the case labels,
we have to do it outside by emitting a lot of manual branches ourselves.

This should be extremely rare, but we need to handle this case.
2019-06-21 13:38:23 +02:00
Hans-Kristian Arntzen
22e3beaab9 Deal with switch block fallthrough more correctly ... 2019-06-20 12:14:19 +02:00
Hans-Kristian Arntzen
bc3bf47446 Rewrite how switch block case labels are emitted. 2019-06-20 11:57:05 +02:00
Hans-Kristian Arntzen
7fdb418f18
Merge pull request #1028 from KhronosGroup/fix-1010
MSL: Support barycentrics and PrimitiveID in fragment shaders
2019-06-19 15:29:14 +02:00
Hans-Kristian Arntzen
707312b83a GLSL: Support NV barycentrics. 2019-06-19 09:52:35 +02:00
Hans-Kristian Arntzen
f171d82590 MSL: Support MinLod operand. 2019-06-19 09:43:03 +02:00
Hans-Kristian Arntzen
d81bfc5b58 MSL: Fix regression with Private parameter declaration.
If we compile multiple times due to forced_recompile, we had
deferred_declaration = true while emitting function prototypes which
broke an assumption. Fix this by clearing out stale state before leaving
a function.
2019-06-13 10:36:21 +02:00
Hans-Kristian Arntzen
a9da59b0b8 GLSL: Support GL_ARB_shader_stencil_export. 2019-06-12 10:06:54 +02:00
Hans-Kristian Arntzen
bf56dc88b9 Rewrite how loop dominators are propagated.
Do this analysis in the CFG stage rather than last minute with the
ad-hoc algorithm we had in place before CFG was introduced.
2019-06-06 12:17:46 +02:00
Hans-Kristian Arntzen
720681da39
Merge pull request #1006 from KhronosGroup/fix-1003
Deal with case where a block is somehow emitted in a duplicated fashion.
2019-06-05 16:11:06 +02:00
Patrick Mours
8d64d5e776 Fix storage packing qualifiers missing on "shaderRecordNV" buffers 2019-06-05 13:31:24 +02:00
Patrick Mours
be3035db26 Fix callable data variables 2019-06-05 13:31:24 +02:00
Patrick Mours
789178666f Add support for "shaderRecordNV" qualifier 2019-06-05 13:31:24 +02:00
Hans-Kristian Arntzen
c09ca74c61 Deal with case where a block is somehow emitted in a duplicated fashion.
We seen to have to deal with a case where a block is used multiple times
without any "proper" structured control flow, so we risk losing deferred
declaration state.
2019-06-05 12:39:40 +02:00
Hans-Kristian Arntzen
65af09d2d1 Support emitting OpLine directive.
Facilitates easier mapping from source language to cross-compiled output
in tooling.
2019-05-28 13:44:24 +02:00
Hans-Kristian Arntzen
23889f7b87 GLSL: Support std430 in UBOs with scalar layout. 2019-05-28 12:22:44 +02:00
Hans-Kristian Arntzen
b3094cd02a Run format_all.sh. 2019-05-27 16:54:13 +02:00
Hans-Kristian Arntzen
7b9e0fb428 MSL: Implement OpArrayLength.
This gets rather complicated because MSL does not support OpArrayLength
natively. We need to pass down a buffer which contains buffer sizes, and
we compute the array length on-demand.

Support both discrete descriptors as well as argument buffers.
2019-05-27 16:13:09 +02:00
Hans-Kristian Arntzen
96492648d4 MSL: Fix struct declaration order with complex type aliases.
MSL generally emits the aliases, which means we cannot always place the
master type first, unlike GLSL and HLSL. The logic fix is just to
reorder after we have tagged types with packing information, rather than
doing it in the parser fixup.
2019-05-23 14:54:04 +02:00
Hans-Kristian Arntzen
45a36ad034 Run format_all.sh. 2019-05-14 09:54:35 +02:00
Hans-Kristian Arntzen
c52d6bcd0c
Merge pull request #975 from alpqr/master
GLSL: Add option to disable buffer blocks regardless of version
2019-05-14 09:51:39 +02:00
Laszlo Agocs
7bc31491be GLSL: Add option to disable buffer blocks regardless of version 2019-05-13 21:29:06 +02:00
Hans-Kristian Arntzen
647ddaee42 HLSL/MSL: Deal correctly with nonuniformEXT qualifier.
MSL does not seem to have a qualifier for this, but HLSL SM 5.1 does.
glslangValidator for HLSL does not support this, so skip any validation,
but it passes in FXC.
2019-05-13 14:58:27 +02:00
Hans-Kristian Arntzen
6fcf8c83d9 GLSL: Support OpBitcast for buffer references.
Update glslang/SPIRV-Tools/SPIRV-Headers references.
2019-05-09 10:29:31 +02:00
Hans-Kristian Arntzen
b6f8a20624 GLSL: Return correct sign for OpArrayLength.
.length() returns int, not uint ...
2019-05-07 19:02:32 +02:00
Hans-Kristian Arntzen
3186701739 GLSL: Support GL_EXT_nonuniform_qualifier. 2019-05-02 11:15:51 +02:00
Hans-Kristian Arntzen
6f091e7c8f GLSL: Support GL_EXT_scalar_block_layout. 2019-04-26 15:43:37 +02:00
Hans-Kristian Arntzen
758427e127 Fix GCC 4.x warning. 2019-04-26 13:09:54 +02:00
Hans-Kristian Arntzen
2cc374a0c8 GLSL: Implement GL_EXT_buffer_reference.
Buffer objects can contain arbitrary pointers to blocks.
We can also implement ConvertPtrToU and ConvertUToPtr.
The latter can cast a uint64_t to any type as it pleases,
so we will need to generate fake buffer reference blocks to be able to
cast the type.
2019-04-26 11:43:51 +02:00
Hans-Kristian Arntzen
8b236f24f1 Fix infinite loop when OpAtomic* temporaries are used in other blocks.
We made the mistake of registering a dependency on the atomic variable
even if the atomic result was forced to a temporary. There is no need to
register reads from atomic variables like this as we always force atomic
results to a temporary and argument read/writes do not need to be
tracked.
2019-04-24 09:33:39 +02:00
Hans-Kristian Arntzen
e23c9ea700 Force complex loop in certain rare access chain scenarios.
If we generate an access chain in a loop body, and it is consumed in the
loop continue block, we have a problem because we cannot emit a
temporary here holding the access chain reference. Force a complex loop
body to workaround this exceptionally rare case.
2019-04-10 16:02:03 +02:00
Hans-Kristian Arntzen
9ae91c2d1e Deal with mismatched signs in S/U/F conversion opcodes. 2019-04-10 14:03:58 +02:00
Hans-Kristian Arntzen
a489ba7fd1 Reduce pressure on global allocation.
- Replace ostringstream with custom implementation.
  ~30% performance uplift on vector-shuffle-oom test.
  Allocations are measurably reduced in Valgrind.

- Replace std::vector with SmallVector.
  Classic malloc optimization, small vectors are backed by inline data.
  ~ 7-8% gain on vector-shuffle-oom on GCC 8 on Linux.

- Use an object pool for IVariant type.
  We generally allocate a lot of SPIR* objects. We can amortize these
  allocations neatly by pooling them.

- ~15% overall uplift on ./test_shaders.py --iterations 10000 shaders/.
2019-04-09 15:09:44 +02:00