Commit Graph

294 Commits

Author SHA1 Message Date
Hans-Kristian Arntzen
c365cc1b43 Deal with OpPhi and case fallthrough.
This is quite complex since we cannot flush Phi inside the case labels,
we have to do it outside by emitting a lot of manual branches ourselves.

This should be extremely rare, but we need to handle this case.
2019-06-21 13:38:23 +02:00
Hans-Kristian Arntzen
457eba355e Employ heuristics to figure out how to emit SSBO/UAV reflection names.
This is rather shaky, but we don't have many choices here except add a
lot of awkward and unintuitive options. Try to deduce this from OpSource
and fallback to heuristic.
2019-06-10 11:24:24 +02:00
Hans-Kristian Arntzen
6b52b0fe8b Deal with nested loops.
Actually need to hoist out variable to outermost loop.
2019-06-06 14:37:02 +02:00
Hans-Kristian Arntzen
02ae99f399 Use the existing loop dominator when doing loop variable preservation. 2019-06-06 12:22:28 +02:00
Hans-Kristian Arntzen
bf56dc88b9 Rewrite how loop dominators are propagated.
Do this analysis in the CFG stage rather than last minute with the
ad-hoc algorithm we had in place before CFG was introduced.
2019-06-06 12:17:46 +02:00
Hans-Kristian Arntzen
03d93abc1a Deal with case where a variable is dominated by inner part of a loop.
There is a risk that we try to preserve a loop variable through multiple
iterations, even though the dominating block is inside a loop.

Fix this by analyzing if a block starts off by writing to a variable. In
that case, there cannot be any preservation going on. If we don't, pretend the
loop header is reading the variable, which moves the variable to an
appropriate scope.
2019-06-06 11:11:44 +02:00
Hans-Kristian Arntzen
65af09d2d1 Support emitting OpLine directive.
Facilitates easier mapping from source language to cross-compiled output
in tooling.
2019-05-28 13:44:24 +02:00
Hans-Kristian Arntzen
42e64597a7 OpArrayLength must trigger active variables. 2019-05-27 16:44:02 +02:00
Hans-Kristian Arntzen
96492648d4 MSL: Fix struct declaration order with complex type aliases.
MSL generally emits the aliases, which means we cannot always place the
master type first, unlike GLSL and HLSL. The logic fix is just to
reorder after we have tagged types with packing information, rather than
doing it in the parser fixup.
2019-05-23 14:54:04 +02:00
Hans-Kristian Arntzen
6fcf8c83d9 GLSL: Support OpBitcast for buffer references.
Update glslang/SPIRV-Tools/SPIRV-Headers references.
2019-05-09 10:29:31 +02:00
Chip Davis
01c491648b Fix a copy-pasto. 2019-04-26 17:16:21 -05:00
Hans-Kristian Arntzen
2cc374a0c8 GLSL: Implement GL_EXT_buffer_reference.
Buffer objects can contain arbitrary pointers to blocks.
We can also implement ConvertPtrToU and ConvertUToPtr.
The latter can cast a uint64_t to any type as it pleases,
so we will need to generate fake buffer reference blocks to be able to
cast the type.
2019-04-26 11:43:51 +02:00
Hans-Kristian Arntzen
e23c9ea700 Force complex loop in certain rare access chain scenarios.
If we generate an access chain in a loop body, and it is consumed in the
loop continue block, we have a problem because we cannot emit a
temporary here holding the access chain reference. Force a complex loop
body to workaround this exceptionally rare case.
2019-04-10 16:02:03 +02:00
Hans-Kristian Arntzen
3fe57d3798 Do not use SmallVector as input type in public interfaces.
This is an API break, which we need to be careful with.
Handing out SmallVectors is easier since the interface is basically the
same.
2019-04-09 15:09:44 +02:00
Hans-Kristian Arntzen
a489ba7fd1 Reduce pressure on global allocation.
- Replace ostringstream with custom implementation.
  ~30% performance uplift on vector-shuffle-oom test.
  Allocations are measurably reduced in Valgrind.

- Replace std::vector with SmallVector.
  Classic malloc optimization, small vectors are backed by inline data.
  ~ 7-8% gain on vector-shuffle-oom on GCC 8 on Linux.

- Use an object pool for IVariant type.
  We generally allocate a lot of SPIR* objects. We can amortize these
  allocations neatly by pooling them.

- ~15% overall uplift on ./test_shaders.py --iterations 10000 shaders/.
2019-04-09 15:09:44 +02:00
Hans-Kristian Arntzen
317144a59c Detect invalid DoWhileLoop early.
We had a bug where error conditions in DoWhileLoop emit path would not
detect that statements were being emitted due to the masking behavior
which happens when force_recompile is true. Fix this.

Also, refactor force_recompile into member functions so we can properly
break on any situation where this is set, without having to rely on
watchpoints in debuggers.
2019-04-05 12:19:32 +02:00
Hans-Kristian Arntzen
fc37c52d26 Fix typo with array stride error message.
Trivial copy-paste bug.
2019-04-02 19:18:13 +02:00
Hans-Kristian Arntzen
9b92e68d71 Add an option to override the namespace used for spirv_cross.
This is a pragmatic trick to avoid symbol collision where a project
links against SPIRV-Cross statically, while linking to other projects
which also use SPIRV-Cross statically. We can end up with very awkward
symbol collisions which can resolve themselves silently because
SPIRV-Cross is pulled in as necessary. To fix this, we must use
different symbols and embed two copies of SPIRV-Cross in this scenario,
now with different namespaces, which in turn leads to different symbols.
2019-03-29 10:29:44 +01:00
Patrick Mours
b2a667520d Add reflection support for ray tracing acceleration structures 2019-03-26 15:09:42 +01:00
Patrick Mours
90c91e4f23 Fix missing check for purity on ray tracing builtins 2019-03-26 14:25:25 +01:00
Hans-Kristian Arntzen
d8e4d995e5 Remove strange include which got included for some reason. 2019-03-15 21:55:53 +01:00
Hans-Kristian Arntzen
e47a77d596 MSL: Implement Metal 2.0 indirect argument buffers. 2019-03-15 11:01:27 +01:00
Hans-Kristian Arntzen
8bfb04d29d Run format_all.sh
Disable clang format in C wrapper for now.
Some weird formatting bug with the try/catch macro.
2019-03-06 12:20:13 +01:00
Hans-Kristian Arntzen
ef24337849 Support do-while where test is negative. 2019-03-06 12:17:38 +01:00
Hans-Kristian Arntzen
70ff96b03f Deal with more for loop candidate cases.
We can trivially deal with cases where the loop tests are simply
inverted. We can also deal with cases where the condition block branches
to the merge block via other noop blocks.
This makes SPIR-V codegen easier when targeting SPIRV-Cross.
2019-03-06 11:24:43 +01:00
Hans-Kristian Arntzen
9bbdccddb7 Add a stable C API for SPIRV-Cross.
This adds a new C API for SPIRV-Cross which is intended to be stable,
both API and ABI wise.

The C++ API has been refactored a bit to make the C wrapper easier and
cleaner to write. Especially the vertex attribute / resource interfaces
for MSL has been rewritten to avoid taking mutable pointers into the
interface. This would be very annoying to wrap and it didn't fit well
with the rest of the C++ API to begin with. While doing this, I went
ahead and removed all the old deprecated interfaces.

The CMake build system has also seen an overhaul.
It is now possible to build static/shared/CLI separately with -D
options.
The shared library only exposes the C API, as it is the only ABI-stable
API. pkg-configs as well as CMake modules are exported and installed for
the shared library configuration.
2019-03-01 11:53:51 +01:00
Hans-Kristian Arntzen
825ff4af7e Replace locale handling.
We were using std::locale::global() to force a C locale which is not
safe when SPIRV-Cross is used in a multi-threaded environment.

To fix this, we could tap into various per-platform specific locale
handling to get safe thread-local locales, but since locales only affect
the decimal point in floats, we simply query the locale instead and do
the necessary radix replacement ourselves, without touching the locale.

This should be much safer and cleaner than the alternative.
2019-02-28 11:28:31 +01:00
Hans-Kristian Arntzen
d2cc43e667 Fix edge case where opaque types can be declared on stack.
In the bizarre case where the ID of a loaded opaque type aliased with a
literal which was used as part of another texturing instruction, we
could end up with a case where domination analysis assumed the loaded
opaque type needed to be moved to a different scope.

Fix the issue by never doing dominance analysis for opaque temporaries,
and be more robust when analyzing texturing instructions.

Also make sure reflection output is deterministic.
This patch slightly alterered output for some unknown reason, but it came from an
unordered_map, so it's fine.
2019-02-19 17:28:31 +01:00
Chip Davis
e75add42c9 MSL: Add support for tessellation evaluation shaders.
These are mapped to Metal's post-tessellation vertex functions. The
semantic difference is much less here, so this change should be simpler
than the previous one. There are still some hairy parts, though.

In MSL, the array of control point data is represented by a special
type, `patch_control_point<T>`, where `T` is a valid stage-input type.
This object must be embedded inside the patch-level stage input. For
this reason, I've added a new type to the type system to represent this.

On Mac, the number of input control points to the function must be
specified in the `patch()` attribute. This is optional on iOS.
SPIRV-Cross takes this from the `OutputVertices` execution mode; the
intent is that if it's not set in the shader itself, MoltenVK will set
it from the tessellation control shader. If you're translating these
offline, you'll have to update the control point count manually, since
this number must match the number that is passed to the
`drawPatches:...` family of methods.

Fixes #120.
2019-02-14 10:00:08 -06:00
Chip Davis
8860a97d4a Fix formatting of uint32_t casts. 2019-02-11 16:14:00 -06:00
Chip Davis
eb89c3a428 MSL: Add support for tessellation control shaders.
These are transpiled to kernel functions that write the output of the
shader to three buffers: one for per-vertex varyings, one for per-patch
varyings, and one for the tessellation levels. This structure is
mandated by the way Metal works, where the tessellation factors are
supplied to the draw method in their own buffer, while the per-patch and
per-vertex varyings are supplied as though they were vertex attributes;
since they have different step rates, they must be in separate buffers.

The kernel is expected to be run in a workgroup whose size is the
greater of the number of input or output control points. It uses Metal's
support for vertex-style stage input to a compute shader to get the
input values; therefore, at least one instance must run per input point.
Meanwhile, Vulkan mandates that it run at least once per output point.
Overrunning the output array is a concern, but any values written should
either be discarded or overwritten by subsequent patches. I'm probably
going to put some slop space in the buffer when I integrate this into
MoltenVK to be on the safe side.
2019-02-07 08:51:22 -06:00
Hans-Kristian Arntzen
3e584f2c3f Support LUTs in single-function CFGs on Private storage class.
Fairly common pattern in unoptimized SPIR-V. Support this case as well.
2019-02-06 10:38:59 +01:00
Hans-Kristian Arntzen
3e09879131 Support initializers on StorageClassOutput. 2019-01-30 10:29:08 +01:00
Hans-Kristian Arntzen
40e7723051 Run format_all.sh. 2019-01-17 11:29:50 +01:00
Hans-Kristian Arntzen
de7e5ccd8b Refactor out packed expressions to extended decorations.
Can't safely just cast to the original enum without lots of hacks.
2019-01-17 11:28:51 +01:00
Hans-Kristian Arntzen
72377366d3 Replace custom use of DecorationCPacked with an explicit one.
Will need to use more variants of this decoration, so might as well make
it clearer what is going on with CPacked.
2019-01-17 10:36:56 +01:00
Hans-Kristian Arntzen
64ca1ec677 MSL: Start considering float[] and float2[] in std140 layout. 2019-01-16 16:16:39 +01:00
Hans-Kristian Arntzen
6e1c3ccb72 Run format_all.sh. 2019-01-11 12:56:00 +01:00
Hans-Kristian Arntzen
2fb9aa251e Workaround bugs on MSVC.
Bug:
https://developercommunity.visualstudio.com/content/problem/303996/c-error-c2668-ambiguous-overloaded-in-lambda-with.html
2019-01-11 09:29:28 +01:00
Hans-Kristian Arntzen
b629878f45 Make meta a hashmap.
A flat array was consuming way too much memory and was far too slow to
initialize properly with a very large ID bound (8 million IDs, showed up as #1 hotspot in perf).

Meta struct does not have to be in-order as we never iterate over it in
a meaningful way, so using a hashmap here is reasonable. Very few IDs
should need decorations or meta-data, so this should also be a quite
decent memory save.

For the pathological case, a 6x uplift was observed.
2019-01-10 14:04:01 +01:00
Hans-Kristian Arntzen
d92de00cc1 Rewrite how IDs are iterated over.
This is a fairly fundamental change on how IDs are handled.
It serves many purposes:

- Improve performance. We only need to iterate over IDs which are
  relevant at any one time.
- Makes sure we iterate through IDs in SPIR-V module declaration order
  rather than ID space. IDs don't have to be monotonically increasing,
  which was an assumption SPIRV-Cross used to have. It has apparently
  never been a problem until now.
- Support LUTs of structs. We do this by interleaving declaration of
  constants and struct types in SPIR-V module order.

To support this, the ParsedIR interface needed to change slightly.
Before setting any ID with variant_set<T> we let ParsedIR know
that an ID with a specific type has been added. The surface for change
should be minimal.

ParsedIR will maintain a per-type list of IDs which the cross-compiler
will need to consider for later.

Instead of looping over ir.ids[] (which can be extremely large), we loop
over types now, using:

ir.for_each_typed_id<SPIRVariable>([&](uint32_t id, SPIRVariable &var) {
	handle_variable(var);
});

Now we make sure that we're never looking at irrelevant types.
2019-01-10 12:52:56 +01:00
Chip Davis
d6aa911156 Flush all variables after storing through a variable pointer.
Since we can't know which variable was modified, we therefore have to
conservatively assume that any variable might have been modified.
2019-01-08 15:16:33 -06:00
Chip Davis
fc02b3d656 Rename get_non_pointer_type() methods.
This better reflects their purpose now.
2019-01-08 12:55:22 -06:00
Chip Davis
3bfb2f94d4 MSL: Support SPV_KHR_variable_pointers.
This allows shaders to declare and use pointer-type variables. Pointers
may be loaded and stored, be the result of an `OpSelect`, be passed to
and returned from functions, and even be passed as inputs to the `OpPhi`
instruction. All types of pointers may be used as variable pointers.
Variable pointers to storage buffers and workgroup memory may even be
loaded from and stored to, as though they were ordinary variables. In
addition, this enables using an interior pointer to an array as though
it were an array pointer itself using the `OpPtrAccessChain`
instruction.

This is a rather large and involved change, mostly because this is
somewhat complicated with a lot of moving parts. It's a wonder
SPIRV-Cross's output is largely unchanged. Indeed, many of these changes
are to accomplish exactly that! Perhaps the largest source of changes
was the violation of the assumption that, when emitting types, the
pointer type didn't matter.

One of the test cases added by the change doesn't optimize very well;
the output of `spirv-opt` here is invalid SPIR-V. I need to file a bug
with SPIRV-Tools about this.

I wanted to test that variable pointers to images worked too, but I
couldn't figure out how to propagate the access qualifier properly--in
MSL, it's part of the type, so getting this right is important. I've
punted on that for now.
2019-01-07 11:19:10 -06:00
Hans-Kristian Arntzen
5b8762223d Run format_all.sh. 2019-01-07 10:01:28 +01:00
Hans-Kristian Arntzen
211abfb7ef
Merge pull request #799 from KhronosGroup/fix-780
Use correct block-name / other-name aliasing rules.
2019-01-04 16:08:10 +01:00
Hans-Kristian Arntzen
9728f9c1b7 Use correct block-name / other-name aliasing rules.
A block name cannot alias with any name in its own scope,
and it cannot alias with any other "global" name.

To solve this, we need to complicate the name cache updates a little bit
where we have a "primary" namespace and "secondary" namespace.
2019-01-04 15:02:54 +01:00
Hans-Kristian Arntzen
acae607703 Register implied expression reads in OpLoad/OpAccessChain.
This is required to avoid relying on complex sub-expression elimination
in compilers, and generates cleaner code.

The problem case is if a complex expression is used in an access chain,
like:

Composite comp = buffer[texture(...)];
vec4 a = comp.a + comp.b + comp.c;

Before, we did not have common subexpression tracking for
OpLoad/OpAccessChain, so we easily ended up with code like:

vec4 a = buffer[texture(...)].a + buffer[texture(...)].b + buffer[texture(...)].c;

A good compiler will optimize this, but we should not rely on it, and
forcing texture(...) to a temporary also looks better.

The solution is to add a vector "implied_expression_reads", which works
similarly to expression_dependencies. We also need an extra mechanism in
to_expression which lets us skip expression read checking and do it
later. E.g. for expr -> access chain -> load, we should only trigger
a read of expr when using the loaded expression.
2019-01-04 14:56:12 +01:00
Hans-Kristian Arntzen
318c17cbb2 Nonfunctional: Update copyright headers for 2019. 2019-01-04 12:38:35 +01:00
Hans-Kristian Arntzen
9aa623a553 Remove old hack for dealing with HLSL counter buffers.
No longer needed.
2018-11-22 10:23:58 +01:00