Commit Graph

307 Commits

Author SHA1 Message Date
Mark Satterthwaite
32557e9093 SPIRV doesn't distinguish depth textures from regular textures, but Metal does, so if we've ever seen a depth comparison operation we must ensure that the texture is specified as a depth-texture. 2019-09-06 16:58:27 -04:00
Chip Davis
39dce88d3b MSL: Add support for sampler Y'CbCr conversion.
This change introduces functions and in one case, a class, to support
the `VK_KHR_sampler_ycbcr_conversion` extension. Except in the case of
GBGR8 and BGRG8 formats, for which Metal natively supports implicit
chroma reconstruction, we're on our own here. We have to do everything
ourselves. Much of the complexity comes from the need to support
multiple planes, which must now be passed to functions that use the
corresponding combined image-samplers. The rest is from the actual
Y'CbCr conversion itself, which requires additional post-processing of
the sample retrieved from the image.

Passing sampled images to a function was a particular problem. To
support this, I've added a new class which is emitted to MSL shaders
that pass sampled images with Y'CbCr conversions attached around. It
can handle sampled images with or without Y'CbCr conversion. This is an
awful abomination that should not exist, but I'm worried that there's
some shader out there which does this. This support requires Metal 2.0
to work properly, because it uses default-constructed texture objects,
which were only added in MSL 2. I'm not even going to get into arrays of
combined image-samplers--that's a whole other can of worms.  They are
deliberately unsupported in this change.

I've taken the liberty of refactoring the support for texture swizzling
while I'm at it. It's now treated as a post-processing step similar to
Y'CbCr conversion. I'd like to think this is cleaner than having
everything in `to_function_name()`/`to_function_args()`. It still looks
really hairy, though. I did, however, get rid of the explicit type
arguments to `spvGatherSwizzle()`/`spvGatherCompareSwizzle()`.

Update the C API. In addition to supporting this new functionality, add
some compiler options that I added in previous changes, but for which I
neglected to update the C API.
2019-09-01 18:35:53 -05:00
Hans-Kristian Arntzen
3ccfbce264 Run format_all.sh. 2019-08-28 14:25:26 +02:00
Hans-Kristian Arntzen
d5a65b4190 GLSL: Assume image and sampler can be RelaxedPrecision.
When merging combined image samplers, we only looked at sampler, but DXC
emits RelaxedPrecision only for texture. Does not hurt to check for more
things.
2019-08-27 17:15:19 +02:00
Hans-Kristian Arntzen
9436cd3036 MSL: Deal with array copies from and to threadgroup. 2019-08-27 13:18:01 +02:00
Hans-Kristian Arntzen
5d97dae1eb Move branchless analysis to CFG.
Traverse backwards instead, far more robust. Should elide basically all
redundant continue; statements now.
2019-08-27 10:19:19 +02:00
Hans-Kristian Arntzen
d620f1dd26 Do not force temporary unless continue-only for loop dominates.
We would force temporaries in unexpected places, causing assertions to
throw if access chains were consumed in such loops.
2019-07-26 10:39:05 +02:00
Hans-Kristian Arntzen
e06efb7259 Missed case where DoWhile continue block deals with Phi. 2019-07-25 12:30:50 +02:00
Hans-Kristian Arntzen
a86308bce1 MSL: Begin rewrite of buffer packing logic. 2019-07-19 10:06:19 +02:00
Lifeng Pan
5ca8779044 Parse SPIR-V debug information extended instructions, as well as OpNoLine.
No impact on result shader string.
2019-07-04 16:21:44 +08:00
Hans-Kristian Arntzen
b4e0163749 Run format_all.sh. 2019-06-21 16:02:22 +02:00
Hans-Kristian Arntzen
2b11b331d6
Merge pull request #1036 from KhronosGroup/msl-auto-binding
MSL: Rewrite how resources are automatically assigned bindings.
2019-06-21 15:58:50 +02:00
Hans-Kristian Arntzen
c365cc1b43 Deal with OpPhi and case fallthrough.
This is quite complex since we cannot flush Phi inside the case labels,
we have to do it outside by emitting a lot of manual branches ourselves.

This should be extremely rare, but we need to handle this case.
2019-06-21 13:38:23 +02:00
Hans-Kristian Arntzen
e2c95bdcbc MSL: Rewrite how resource indices are fallback-assigned.
We used to use the Binding decoration for this, but this method is
hopelessly broken. If no explicit MSL resource remapping exists, we
remap automatically in a manner which should always "just work".
2019-06-21 12:54:08 +02:00
Hans-Kristian Arntzen
457eba355e Employ heuristics to figure out how to emit SSBO/UAV reflection names.
This is rather shaky, but we don't have many choices here except add a
lot of awkward and unintuitive options. Try to deduce this from OpSource
and fallback to heuristic.
2019-06-10 11:24:24 +02:00
Hans-Kristian Arntzen
6b52b0fe8b Deal with nested loops.
Actually need to hoist out variable to outermost loop.
2019-06-06 14:37:02 +02:00
Hans-Kristian Arntzen
02ae99f399 Use the existing loop dominator when doing loop variable preservation. 2019-06-06 12:22:28 +02:00
Hans-Kristian Arntzen
bf56dc88b9 Rewrite how loop dominators are propagated.
Do this analysis in the CFG stage rather than last minute with the
ad-hoc algorithm we had in place before CFG was introduced.
2019-06-06 12:17:46 +02:00
Hans-Kristian Arntzen
03d93abc1a Deal with case where a variable is dominated by inner part of a loop.
There is a risk that we try to preserve a loop variable through multiple
iterations, even though the dominating block is inside a loop.

Fix this by analyzing if a block starts off by writing to a variable. In
that case, there cannot be any preservation going on. If we don't, pretend the
loop header is reading the variable, which moves the variable to an
appropriate scope.
2019-06-06 11:11:44 +02:00
Hans-Kristian Arntzen
65af09d2d1 Support emitting OpLine directive.
Facilitates easier mapping from source language to cross-compiled output
in tooling.
2019-05-28 13:44:24 +02:00
Hans-Kristian Arntzen
42e64597a7 OpArrayLength must trigger active variables. 2019-05-27 16:44:02 +02:00
Hans-Kristian Arntzen
96492648d4 MSL: Fix struct declaration order with complex type aliases.
MSL generally emits the aliases, which means we cannot always place the
master type first, unlike GLSL and HLSL. The logic fix is just to
reorder after we have tagged types with packing information, rather than
doing it in the parser fixup.
2019-05-23 14:54:04 +02:00
Hans-Kristian Arntzen
6fcf8c83d9 GLSL: Support OpBitcast for buffer references.
Update glslang/SPIRV-Tools/SPIRV-Headers references.
2019-05-09 10:29:31 +02:00
Chip Davis
01c491648b Fix a copy-pasto. 2019-04-26 17:16:21 -05:00
Hans-Kristian Arntzen
2cc374a0c8 GLSL: Implement GL_EXT_buffer_reference.
Buffer objects can contain arbitrary pointers to blocks.
We can also implement ConvertPtrToU and ConvertUToPtr.
The latter can cast a uint64_t to any type as it pleases,
so we will need to generate fake buffer reference blocks to be able to
cast the type.
2019-04-26 11:43:51 +02:00
Hans-Kristian Arntzen
e23c9ea700 Force complex loop in certain rare access chain scenarios.
If we generate an access chain in a loop body, and it is consumed in the
loop continue block, we have a problem because we cannot emit a
temporary here holding the access chain reference. Force a complex loop
body to workaround this exceptionally rare case.
2019-04-10 16:02:03 +02:00
Hans-Kristian Arntzen
3fe57d3798 Do not use SmallVector as input type in public interfaces.
This is an API break, which we need to be careful with.
Handing out SmallVectors is easier since the interface is basically the
same.
2019-04-09 15:09:44 +02:00
Hans-Kristian Arntzen
a489ba7fd1 Reduce pressure on global allocation.
- Replace ostringstream with custom implementation.
  ~30% performance uplift on vector-shuffle-oom test.
  Allocations are measurably reduced in Valgrind.

- Replace std::vector with SmallVector.
  Classic malloc optimization, small vectors are backed by inline data.
  ~ 7-8% gain on vector-shuffle-oom on GCC 8 on Linux.

- Use an object pool for IVariant type.
  We generally allocate a lot of SPIR* objects. We can amortize these
  allocations neatly by pooling them.

- ~15% overall uplift on ./test_shaders.py --iterations 10000 shaders/.
2019-04-09 15:09:44 +02:00
Hans-Kristian Arntzen
317144a59c Detect invalid DoWhileLoop early.
We had a bug where error conditions in DoWhileLoop emit path would not
detect that statements were being emitted due to the masking behavior
which happens when force_recompile is true. Fix this.

Also, refactor force_recompile into member functions so we can properly
break on any situation where this is set, without having to rely on
watchpoints in debuggers.
2019-04-05 12:19:32 +02:00
Hans-Kristian Arntzen
fc37c52d26 Fix typo with array stride error message.
Trivial copy-paste bug.
2019-04-02 19:18:13 +02:00
Hans-Kristian Arntzen
9b92e68d71 Add an option to override the namespace used for spirv_cross.
This is a pragmatic trick to avoid symbol collision where a project
links against SPIRV-Cross statically, while linking to other projects
which also use SPIRV-Cross statically. We can end up with very awkward
symbol collisions which can resolve themselves silently because
SPIRV-Cross is pulled in as necessary. To fix this, we must use
different symbols and embed two copies of SPIRV-Cross in this scenario,
now with different namespaces, which in turn leads to different symbols.
2019-03-29 10:29:44 +01:00
Patrick Mours
b2a667520d Add reflection support for ray tracing acceleration structures 2019-03-26 15:09:42 +01:00
Patrick Mours
90c91e4f23 Fix missing check for purity on ray tracing builtins 2019-03-26 14:25:25 +01:00
Hans-Kristian Arntzen
d8e4d995e5 Remove strange include which got included for some reason. 2019-03-15 21:55:53 +01:00
Hans-Kristian Arntzen
e47a77d596 MSL: Implement Metal 2.0 indirect argument buffers. 2019-03-15 11:01:27 +01:00
Hans-Kristian Arntzen
8bfb04d29d Run format_all.sh
Disable clang format in C wrapper for now.
Some weird formatting bug with the try/catch macro.
2019-03-06 12:20:13 +01:00
Hans-Kristian Arntzen
ef24337849 Support do-while where test is negative. 2019-03-06 12:17:38 +01:00
Hans-Kristian Arntzen
70ff96b03f Deal with more for loop candidate cases.
We can trivially deal with cases where the loop tests are simply
inverted. We can also deal with cases where the condition block branches
to the merge block via other noop blocks.
This makes SPIR-V codegen easier when targeting SPIRV-Cross.
2019-03-06 11:24:43 +01:00
Hans-Kristian Arntzen
9bbdccddb7 Add a stable C API for SPIRV-Cross.
This adds a new C API for SPIRV-Cross which is intended to be stable,
both API and ABI wise.

The C++ API has been refactored a bit to make the C wrapper easier and
cleaner to write. Especially the vertex attribute / resource interfaces
for MSL has been rewritten to avoid taking mutable pointers into the
interface. This would be very annoying to wrap and it didn't fit well
with the rest of the C++ API to begin with. While doing this, I went
ahead and removed all the old deprecated interfaces.

The CMake build system has also seen an overhaul.
It is now possible to build static/shared/CLI separately with -D
options.
The shared library only exposes the C API, as it is the only ABI-stable
API. pkg-configs as well as CMake modules are exported and installed for
the shared library configuration.
2019-03-01 11:53:51 +01:00
Hans-Kristian Arntzen
825ff4af7e Replace locale handling.
We were using std::locale::global() to force a C locale which is not
safe when SPIRV-Cross is used in a multi-threaded environment.

To fix this, we could tap into various per-platform specific locale
handling to get safe thread-local locales, but since locales only affect
the decimal point in floats, we simply query the locale instead and do
the necessary radix replacement ourselves, without touching the locale.

This should be much safer and cleaner than the alternative.
2019-02-28 11:28:31 +01:00
Hans-Kristian Arntzen
d2cc43e667 Fix edge case where opaque types can be declared on stack.
In the bizarre case where the ID of a loaded opaque type aliased with a
literal which was used as part of another texturing instruction, we
could end up with a case where domination analysis assumed the loaded
opaque type needed to be moved to a different scope.

Fix the issue by never doing dominance analysis for opaque temporaries,
and be more robust when analyzing texturing instructions.

Also make sure reflection output is deterministic.
This patch slightly alterered output for some unknown reason, but it came from an
unordered_map, so it's fine.
2019-02-19 17:28:31 +01:00
Chip Davis
e75add42c9 MSL: Add support for tessellation evaluation shaders.
These are mapped to Metal's post-tessellation vertex functions. The
semantic difference is much less here, so this change should be simpler
than the previous one. There are still some hairy parts, though.

In MSL, the array of control point data is represented by a special
type, `patch_control_point<T>`, where `T` is a valid stage-input type.
This object must be embedded inside the patch-level stage input. For
this reason, I've added a new type to the type system to represent this.

On Mac, the number of input control points to the function must be
specified in the `patch()` attribute. This is optional on iOS.
SPIRV-Cross takes this from the `OutputVertices` execution mode; the
intent is that if it's not set in the shader itself, MoltenVK will set
it from the tessellation control shader. If you're translating these
offline, you'll have to update the control point count manually, since
this number must match the number that is passed to the
`drawPatches:...` family of methods.

Fixes #120.
2019-02-14 10:00:08 -06:00
Chip Davis
8860a97d4a Fix formatting of uint32_t casts. 2019-02-11 16:14:00 -06:00
Chip Davis
eb89c3a428 MSL: Add support for tessellation control shaders.
These are transpiled to kernel functions that write the output of the
shader to three buffers: one for per-vertex varyings, one for per-patch
varyings, and one for the tessellation levels. This structure is
mandated by the way Metal works, where the tessellation factors are
supplied to the draw method in their own buffer, while the per-patch and
per-vertex varyings are supplied as though they were vertex attributes;
since they have different step rates, they must be in separate buffers.

The kernel is expected to be run in a workgroup whose size is the
greater of the number of input or output control points. It uses Metal's
support for vertex-style stage input to a compute shader to get the
input values; therefore, at least one instance must run per input point.
Meanwhile, Vulkan mandates that it run at least once per output point.
Overrunning the output array is a concern, but any values written should
either be discarded or overwritten by subsequent patches. I'm probably
going to put some slop space in the buffer when I integrate this into
MoltenVK to be on the safe side.
2019-02-07 08:51:22 -06:00
Hans-Kristian Arntzen
3e584f2c3f Support LUTs in single-function CFGs on Private storage class.
Fairly common pattern in unoptimized SPIR-V. Support this case as well.
2019-02-06 10:38:59 +01:00
Hans-Kristian Arntzen
3e09879131 Support initializers on StorageClassOutput. 2019-01-30 10:29:08 +01:00
Hans-Kristian Arntzen
40e7723051 Run format_all.sh. 2019-01-17 11:29:50 +01:00
Hans-Kristian Arntzen
de7e5ccd8b Refactor out packed expressions to extended decorations.
Can't safely just cast to the original enum without lots of hacks.
2019-01-17 11:28:51 +01:00
Hans-Kristian Arntzen
72377366d3 Replace custom use of DecorationCPacked with an explicit one.
Will need to use more variants of this decoration, so might as well make
it clearer what is going on with CPacked.
2019-01-17 10:36:56 +01:00
Hans-Kristian Arntzen
64ca1ec677 MSL: Start considering float[] and float2[] in std140 layout. 2019-01-16 16:16:39 +01:00