388 Commits

Author SHA1 Message Date
Hans-Kristian Arntzen
cf1e9e0643 Add MIT dual license for the SPIRV-Cross API. 2020-12-01 16:47:08 +01:00
Hans-Kristian Arntzen
6fc2a0581a Run format_all.sh. 2020-11-08 13:59:52 +01:00
Chip Davis
aca9b6879a MSL: Support pull-model interpolation on MSL 2.3+.
New in MSL 2.3 is a template that can be used in the place of a scalar
type in a stage-in struct. This template has methods which interpolate
the varying at the given points. Curiously, you can't set interpolation
attributes on such a varying; perspective-correctness is encoded in the
type, while interpolation must be done using one of the methods. This
makes using this somewhat awkward from SPIRV-Cross, requiring us to jump
through a bunch of hoops to make this all work.

Using varyings from functions in particular is a pain point, requiring
us to pass the stage-in struct itself around. An alternative is to pass
references to the interpolants; except this will fall over badly with
composite types, which naturally must be flattened.  As with
tessellation, dynamic indexing isn't supported with pull-model
interpolation. This is because of the need to reference the original
struct member in order to call one of the pull-model interpolation
methods on it. Also, this is done at the variable level; this means that
if one varying in a struct is used with the pull-model functions, then
the entire struct is emitted as pull-model interpolants.

For some reason, this was not documented in the MSL spec, though there
is a property on `MTLDevice`, `supportsPullModelInterpolation`,
indicating support for this, which *is* documented. This does not appear
to be implemented yet for AMD: it returns `NO` from
`supportsPullModelInterpolation`, and pipelines with shaders using the
templates fail to compile. It *is* implemented for Intel. It's probably
also implemented for Apple GPUs: on Apple Silicon, OpenGL calls down to
Metal, and it wouldn't be possible to use the interpolation functions
without this implemented in Metal.

Based on my testing, where SPIR-V and GLSL have the offset relative to
the pixel center, in Metal it appears to be relative to the pixel's
upper-left corner, as in HLSL. Therefore, I've added an offset 0.4375,
i.e. one half minus one sixteenth, to all arguments to
`interpolate_at_offset()`.
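
As a rough illustration of the adjustment described above, here is a minimal sketch. The `Offset2` type and the function name are hypothetical and are not taken from SPIRV-Cross; only the 0.4375 bias comes from this message.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 2D offset type, used only for this sketch.
struct Offset2
{
    float x, y;
};

// Converts an offset expressed relative to the pixel center (SPIR-V/GLSL
// convention) into the value that would be passed to Metal's
// interpolate_at_offset(), using the empirically chosen bias of
// one half minus one sixteenth.
inline Offset2 to_msl_interpolation_offset(Offset2 spirv_offset)
{
    assert(std::isfinite(spirv_offset.x) && std::isfinite(spirv_offset.y));
    constexpr float bias = 0.5f - 1.0f / 16.0f; // 0.4375
    return { spirv_offset.x + bias, spirv_offset.y + bias };
}
```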

This also fixes a long-standing bug: if a pull-model interpolation
function is used on a varying, make sure that varying is declared. We
were already doing this, but only for the AMD pull-model function,
`interpolateAtVertexAMD()`; for reasons which are completely beyond me,
we weren't doing this for the base interpolation functions. I also note
that there are no tests for the interpolation functions for GLSL or
HLSL.
2020-11-05 11:57:45 -06:00
Hans-Kristian Arntzen
e6f5ce6b89
Merge pull request #1471 from KhronosGroup/fix-1467
Work around MSVC warning.
2020-09-28 18:47:29 +02:00
Hans-Kristian Arntzen
34a6a45fba Work around MSVC warning. 2020-09-28 14:12:54 +02:00
Hans-Kristian Arntzen
5ea576ece2 Allow flip_vert_y in all relevant stages. 2020-09-28 14:10:08 +02:00
Hans-Kristian Arntzen
66afe8c499 Implement a simple evaluator of specialization constants.
In some cases, we need to get a literal value from a spec constant op.
Mostly relevant when emitting buffers, so implement a 32-bit integer
scalar subset of the evaluator. Can be extended as needed to support
evaluating any specialization constant operation.
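
A 32-bit integer scalar evaluator of this kind could look roughly like the sketch below. The `SpecOp` enum and the function name are assumptions made for illustration; the real evaluator works on SPIR-V opcodes and constant IDs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of operations; the real set would mirror SPIR-V opcodes.
enum class SpecOp
{
    IAdd,
    ISub,
    IMul,
    ShiftLeftLogical,
    BitwiseAnd
};

// Evaluates a binary operation on two already-resolved 32-bit scalars.
// Unsigned arithmetic wraps, which matches two's complement integer results.
inline uint32_t evaluate_spec_op_u32(SpecOp op, uint32_t a, uint32_t b)
{
    switch (op)
    {
    case SpecOp::IAdd:
        return a + b;
    case SpecOp::ISub:
        return a - b;
    case SpecOp::IMul:
        return a * b;
    case SpecOp::ShiftLeftLogical:
        assert(b < 32); // larger shifts are undefined in C++
        return a << b;
    case SpecOp::BitwiseAnd:
        return a & b;
    }
    assert(false && "operation not covered by this sketch");
    return 0;
}
```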
2020-09-14 11:45:59 +02:00
Hans-Kristian Arntzen
d573a95a9c Run format_all.sh. 2020-07-01 11:42:58 +02:00
Hans-Kristian Arntzen
3afbfdb090 Implement context-sensitive expression read tracking.
When inside a loop, treat any read of outer expressions as happening
multiple times, forcing a temporary of said outer expressions.
This avoids the problem where we can end up relying on loop-invariant
code motion to happen in the compiler when converting optimized shaders.
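
The idea can be sketched as follows. `ReadTracker` and its members are hypothetical names, and the "count a read twice" rule is a simplification assumed for this example rather than the actual bookkeeping.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Tracks how often each expression is read, taking loop nesting into account.
struct ReadTracker
{
    struct Info
    {
        uint32_t loop_depth_at_creation;
        uint32_t reads;
    };

    std::unordered_map<uint32_t, Info> expressions;
    uint32_t current_loop_depth = 0;

    void declare(uint32_t id) { expressions[id] = { current_loop_depth, 0 }; }
    void enter_loop() { ++current_loop_depth; }
    void leave_loop()
    {
        assert(current_loop_depth > 0);
        --current_loop_depth;
    }

    void read(uint32_t id)
    {
        Info &info = expressions.at(id);
        // A read from a deeper loop level than the one where the expression
        // was created may execute many times, so count it as multiple reads.
        info.reads += current_loop_depth > info.loop_depth_at_creation ? 2 : 1;
    }

    // More than one (effective) read means the expression should be
    // stored in a temporary rather than forwarded inline.
    bool needs_temporary(uint32_t id) const { return expressions.at(id).reads > 1; }
};
```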
2020-06-29 12:20:35 +02:00
Hans-Kristian Arntzen
7314f51a32 MSL: Deal with loading non-value-type arrays. 2020-06-18 12:46:39 +02:00
Hans-Kristian Arntzen
03d4bcea68 MSL: Improve handling of array types in buffer objects.
When loading and storing array types which belong to buffer objects, we
need to treat these values as not being value types. Also, need to
handle array load/store from/to more address space combinations.
2020-06-18 11:49:03 +02:00
Hans-Kristian Arntzen
58dad82fcb Handle physical pointers in reflection API. 2020-05-25 13:45:49 +02:00
Alexis Payen de la Garanderie
4edfe96739 Fixed recursion in combined_decoration_for_member
Members in nested structs were not properly iterated on,
and as a result, flags like row major for matrices could be
not propagated properly.
2020-04-27 15:54:16 +09:00
Hans-Kristian Arntzen
6b0e558169 Handle RayQueryKHR type.
Do not error out in parsing in shaders which use ray queries.
2020-04-21 14:25:18 +02:00
Hans-Kristian Arntzen
f9818f0804 Update license headers to 2020. 2020-01-16 15:24:37 +01:00
Hans-Kristian Arntzen
cf725b4c63 Go through access chain path for OpCopyLogical.
We will need to deal with packing/unpacking data when copying from/to
complex types in MSL.
2020-01-06 12:29:44 +01:00
Hans-Kristian Arntzen
e2155053c5 Fix broken access tracking for OpFunctionCall results.
We were looking at args[1] after incrementing args array, not before,
which means we tracked garbage.
This is also an out-of-bounds hazard.
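
The ordering issue can be shown with a small sketch. It assumes the operand layout of OpFunctionCall (result type, result ID, function ID, then the call arguments); `CallInfo` and `parse_function_call` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

struct CallInfo
{
    uint32_t result_id;
    const uint32_t *call_args;
    uint32_t call_arg_count;
};

inline CallInfo parse_function_call(const uint32_t *args, uint32_t length)
{
    assert(length >= 3);
    CallInfo info;
    // Read the result ID *before* advancing the operand pointer.
    info.result_id = args[1];
    // Skip result type, result ID and function ID.
    args += 3;
    length -= 3;
    // From here on, args[1] would be the second call argument (or past the
    // end of the array), which is the garbage the old code tracked.
    info.call_args = args;
    info.call_arg_count = length;
    return info;
}
```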
2019-10-29 11:13:39 +01:00
Hans-Kristian Arntzen
648dfa5070 MSL: Ensure stable output for access chain CFG workarounds.
We had output dependent on complex_continue being set, but setting that
flag was dependent on unordered_set declaration order. Make it invariant
to ordering and change the implementation so it knows about the new
temporary hoisting for access chains.
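
One common way to make output independent of `unordered_set` iteration order is to sort the IDs before acting on them. The sketch below shows that general technique under a hypothetical helper name; it is not necessarily how this change was implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Returns the IDs in ascending order so that anything emitted while
// iterating over them is the same on every run and every platform.
inline std::vector<uint32_t> sorted_ids(const std::unordered_set<uint32_t> &ids)
{
    std::vector<uint32_t> result(ids.begin(), ids.end());
    std::sort(result.begin(), result.end());
    assert(result.size() == ids.size());
    return result;
}
```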
2019-10-28 10:57:51 +01:00
Hans-Kristian Arntzen
8066d13599 MSL: Rewrite propagated depth comparison state handling.
Far cleaner, and more correct to run the traversal twice.
Fixes a case where we propagate depth state through multiple functions.
2019-10-26 16:10:11 +02:00
Hans-Kristian Arntzen
830e24c4ba MSL: Do read-only lookups of access_chain_children. 2019-10-26 16:10:11 +02:00
Lukas Hermanns
7ad0a84778 Updates for pull request #1162 2019-09-24 14:35:25 -04:00
Lukas Hermanns
37df74035b Merge branch 'ue4_dev' 2019-09-20 09:42:42 -04:00
Lukas Hermanns
50ac6862ac Rearranged all 'UE Change' comments to match to project's coding style. 2019-09-18 14:03:54 -04:00
Lukas Hermanns
a9f3c981d9 Adjustments after rebase of ue4_dev branch. 2019-09-13 14:03:02 -04:00
Hans-Kristian Arntzen
bfa76ee2ab Consider discard and demote as impure statements.
Fixes cases where discard and demote are called in pure functions and
the function result is not consumed.
2019-09-12 14:21:10 +02:00
Mark Satterthwaite
869d628521 The result of an AccessChain intrinsic in SPIRV can be referenced by multiple blocks but when they are loops that can result in compilation problems because the source variables might not be declared early enough. This forces us to hoist those variables high enough to make it work. 2019-09-11 14:01:40 -04:00
Mark Satterthwaite
32557e9093 SPIRV doesn't distinguish depth textures from regular textures, but Metal does, so if we've ever seen a depth comparison operation we must ensure that the texture is specified as a depth-texture. 2019-09-06 16:58:27 -04:00
Hans-Kristian Arntzen
333980ae91 Refactor into stronger types in public API.
Some fallout where internal functions are using stronger types.
Overkill to move everything over to strong types right now, but perhaps
move over to it slowly over time.
2019-09-06 12:29:47 +02:00
Hans-Kristian Arntzen
261b46982a Deal with complex interlock cases in GLSL. 2019-09-04 12:18:04 +02:00
Hans-Kristian Arntzen
36c433bd92 Deal with call stacks when analyzing access. 2019-09-04 11:42:29 +02:00
Hans-Kristian Arntzen
3f2ce375e1 Analyze complex cases for fragment interlocks.
If we are using interlocks in split functions or in control flow, we
have some serious workarounds we need to employ.
2019-09-04 11:20:25 +02:00
Chip Davis
2eff420d9a Support the SPV_EXT_fragment_shader_interlock extension.
This was straightforward to implement in GLSL. The
`ShadingRateInterlockOrderedEXT` and `ShadingRateInterlockUnorderedEXT`
modes aren't implemented yet, because we don't support
`SPV_NV_shading_rate` or `SPV_EXT_fragment_invocation_density` yet.

HLSL and MSL were more interesting. They don't support this directly,
but they do support marking resources as "rasterizer ordered," which
does roughly the same thing. So this implementation scans all accesses
inside the critical section and marks all storage resources found
therein as rasterizer ordered. They also don't support the fine-grained
controls on pixel- vs. sample-level interlock and disabling ordering
guarantees that GLSL and SPIR-V do, but that's OK. "Unordered" here
merely means the order is undefined; that it just so happens to be the
same as rasterizer order is immaterial. As for pixel- vs. sample-level
interlock, Vulkan explicitly states:

> With sample shading enabled, [the `PixelInterlockOrderedEXT` and
> `PixelInterlockUnorderedEXT`] execution modes are treated like
> `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT`
> respectively.

and:

> If [the `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT`]
> execution modes are used in single-sample mode they are treated like
> `PixelInterlockOrderedEXT` or `PixelInterlockUnorderedEXT`
> respectively.

So this will DTRT for MoltenVK and gfx-rs, at least.

MSL additionally supports multiple raster order groups; resources that
are not accessed together can be placed in different ROGs to allow them
to be synchronized separately. A more sophisticated analysis might be
able to place resources optimally, but that's outside the scope of this
change. For now, we assign all resources to group 0, which should do for
our purposes.
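
The scan over the critical section can be pictured with the simplified sketch below. The `Op` and `Instruction` types are hypothetical, and the sketch assumes a single straight-line block; it ignores control flow and function calls.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

enum class Op
{
    BeginInterlock,
    EndInterlock,
    Load,
    Store,
    Other
};

struct Instruction
{
    Op op;
    uint32_t resource_id; // only meaningful for Load/Store
};

// Collects every resource accessed between the begin and end markers.
// Each returned resource would then be declared as rasterizer ordered
// (in MSL, all of them in raster order group 0).
inline std::set<uint32_t> find_interlocked_resources(const std::vector<Instruction> &body)
{
    std::set<uint32_t> interlocked;
    bool in_critical_section = false;
    for (const Instruction &insn : body)
    {
        switch (insn.op)
        {
        case Op::BeginInterlock:
            assert(!in_critical_section);
            in_critical_section = true;
            break;
        case Op::EndInterlock:
            assert(in_critical_section);
            in_critical_section = false;
            break;
        case Op::Load:
        case Op::Store:
            if (in_critical_section)
                interlocked.insert(insn.resource_id);
            break;
        case Op::Other:
            break;
        }
    }
    return interlocked;
}
```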

`glslang` doesn't support the `RasterizerOrdered` UAVs this
implementation produces for HLSL, so the test case needs `fxc.exe`.

It also insists on GLSL 4.50 for `GL_ARB_fragment_shader_interlock`,
even though the spec says it needs either 4.20 or
`GL_ARB_shader_image_load_store`; and it doesn't support the
`GL_NV_fragment_shader_interlock` extension at all. So I haven't been
able to test those code paths.

Fixes #1002.
2019-09-02 12:31:10 -05:00
Chip Davis
39dce88d3b MSL: Add support for sampler Y'CbCr conversion.
This change introduces functions and, in one case, a class, to support
the `VK_KHR_sampler_ycbcr_conversion` extension. Except in the case of
GBGR8 and BGRG8 formats, for which Metal natively supports implicit
chroma reconstruction, we're on our own here. We have to do everything
ourselves. Much of the complexity comes from the need to support
multiple planes, which must now be passed to functions that use the
corresponding combined image-samplers. The rest is from the actual
Y'CbCr conversion itself, which requires additional post-processing of
the sample retrieved from the image.

Passing sampled images to a function was a particular problem. To
support this, I've added a new class which is emitted to MSL shaders
that pass sampled images with Y'CbCr conversions attached around. It
can handle sampled images with or without Y'CbCr conversion. This is an
awful abomination that should not exist, but I'm worried that there's
some shader out there which does this. This support requires Metal 2.0
to work properly, because it uses default-constructed texture objects,
which were only added in MSL 2. I'm not even going to get into arrays of
combined image-samplers--that's a whole other can of worms.  They are
deliberately unsupported in this change.

I've taken the liberty of refactoring the support for texture swizzling
while I'm at it. It's now treated as a post-processing step similar to
Y'CbCr conversion. I'd like to think this is cleaner than having
everything in `to_function_name()`/`to_function_args()`. It still looks
really hairy, though. I did, however, get rid of the explicit type
arguments to `spvGatherSwizzle()`/`spvGatherCompareSwizzle()`.

Update the C API. In addition to supporting this new functionality, add
some compiler options that I added in previous changes, but for which I
neglected to update the C API.
2019-09-01 18:35:53 -05:00
Hans-Kristian Arntzen
3ccfbce264 Run format_all.sh. 2019-08-28 14:25:26 +02:00
Hans-Kristian Arntzen
d5a65b4190 GLSL: Assume image and sampler can be RelaxedPrecision.
When merging combined image samplers, we only looked at sampler, but DXC
emits RelaxedPrecision only for texture. Does not hurt to check for more
things.
2019-08-27 17:15:19 +02:00
Hans-Kristian Arntzen
9436cd3036 MSL: Deal with array copies from and to threadgroup. 2019-08-27 13:18:01 +02:00
Hans-Kristian Arntzen
5d97dae1eb Move branchless analysis to CFG.
Traverse backwards instead, far more robust. Should elide basically all
redundant continue; statements now.
2019-08-27 10:19:19 +02:00
Hans-Kristian Arntzen
d620f1dd26 Do not force temporary unless continue-only for loop dominates.
We would force temporaries in unexpected places, causing assertions to
throw if access chains were consumed in such loops.
2019-07-26 10:39:05 +02:00
Hans-Kristian Arntzen
e06efb7259 Missed case where DoWhile continue block deals with Phi. 2019-07-25 12:30:50 +02:00
Hans-Kristian Arntzen
a86308bce1 MSL: Begin rewrite of buffer packing logic. 2019-07-19 10:06:19 +02:00
Lifeng Pan
5ca8779044 Parse SPIR-V debug information extended instructions, as well as OpNoLine.
No impact on result shader string.
2019-07-04 16:21:44 +08:00
Hans-Kristian Arntzen
b4e0163749 Run format_all.sh. 2019-06-21 16:02:22 +02:00
Hans-Kristian Arntzen
2b11b331d6
Merge pull request #1036 from KhronosGroup/msl-auto-binding
MSL: Rewrite how resources are automatically assigned bindings.
2019-06-21 15:58:50 +02:00
Hans-Kristian Arntzen
c365cc1b43 Deal with OpPhi and case fallthrough.
This is quite complex since we cannot flush Phi inside the case labels,
we have to do it outside by emitting a lot of manual branches ourselves.

This should be extremely rare, but we need to handle this case.
2019-06-21 13:38:23 +02:00
Hans-Kristian Arntzen
e2c95bdcbc MSL: Rewrite how resource indices are fallback-assigned.
We used to use the Binding decoration for this, but this method is
hopelessly broken. If no explicit MSL resource remapping exists, we
remap automatically in a manner which should always "just work".
2019-06-21 12:54:08 +02:00
Hans-Kristian Arntzen
457eba355e Employ heuristics to figure out how to emit SSBO/UAV reflection names.
This is rather shaky, but we don't have many choices here except add a
lot of awkward and unintuitive options. Try to deduce this from OpSource
and fallback to heuristic.
2019-06-10 11:24:24 +02:00
Hans-Kristian Arntzen
6b52b0fe8b Deal with nested loops.
Actually need to hoist out variable to outermost loop.
2019-06-06 14:37:02 +02:00
Hans-Kristian Arntzen
02ae99f399 Use the existing loop dominator when doing loop variable preservation. 2019-06-06 12:22:28 +02:00
Hans-Kristian Arntzen
bf56dc88b9 Rewrite how loop dominators are propagated.
Do this analysis in the CFG stage rather than last minute with the
ad-hoc algorithm we had in place before CFG was introduced.
2019-06-06 12:17:46 +02:00
Hans-Kristian Arntzen
03d93abc1a Deal with case where a variable is dominated by inner part of a loop.
There is a risk that we try to preserve a loop variable through multiple
iterations, even though the dominating block is inside a loop.

Fix this by analyzing if a block starts off by writing to a variable. In
that case, there cannot be any preservation going on. If we don't, pretend the
loop header is reading the variable, which moves the variable to an
appropriate scope.
2019-06-06 11:11:44 +02:00
Hans-Kristian Arntzen
65af09d2d1 Support emitting OpLine directive.
Facilitates easier mapping from source language to cross-compiled output
in tooling.
2019-05-28 13:44:24 +02:00
Hans-Kristian Arntzen
42e64597a7 OpArrayLength must trigger active variables. 2019-05-27 16:44:02 +02:00
Hans-Kristian Arntzen
96492648d4 MSL: Fix struct declaration order with complex type aliases.
MSL generally emits the aliases, which means we cannot always place the
master type first, unlike GLSL and HLSL. The logical fix is just to
reorder after we have tagged types with packing information, rather than
doing it in the parser fixup.
2019-05-23 14:54:04 +02:00
Hans-Kristian Arntzen
6fcf8c83d9 GLSL: Support OpBitcast for buffer references.
Update glslang/SPIRV-Tools/SPIRV-Headers references.
2019-05-09 10:29:31 +02:00
Chip Davis
01c491648b Fix a copy-pasto. 2019-04-26 17:16:21 -05:00
Hans-Kristian Arntzen
2cc374a0c8 GLSL: Implement GL_EXT_buffer_reference.
Buffer objects can contain arbitrary pointers to blocks.
We can also implement ConvertPtrToU and ConvertUToPtr.
The latter can cast a uint64_t to any type as it pleases,
so we will need to generate fake buffer reference blocks to be able to
cast the type.
2019-04-26 11:43:51 +02:00
Hans-Kristian Arntzen
e23c9ea700 Force complex loop in certain rare access chain scenarios.
If we generate an access chain in a loop body, and it is consumed in the
loop continue block, we have a problem because we cannot emit a
temporary here holding the access chain reference. Force a complex loop
body to workaround this exceptionally rare case.
2019-04-10 16:02:03 +02:00
Hans-Kristian Arntzen
3fe57d3798 Do not use SmallVector as input type in public interfaces.
This is an API break, which we need to be careful with.
Handing out SmallVectors is easier since the interface is basically the
same.
2019-04-09 15:09:44 +02:00
Hans-Kristian Arntzen
a489ba7fd1 Reduce pressure on global allocation.
- Replace ostringstream with custom implementation.
  ~30% performance uplift on vector-shuffle-oom test.
  Allocations are measurably reduced in Valgrind.

- Replace std::vector with SmallVector.
  Classic malloc optimization, small vectors are backed by inline data.
  ~ 7-8% gain on vector-shuffle-oom on GCC 8 on Linux.

- Use an object pool for IVariant type.
  We generally allocate a lot of SPIR* objects. We can amortize these
  allocations neatly by pooling them.

- ~15% overall uplift on ./test_shaders.py --iterations 10000 shaders/.
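
To illustrate the "small vectors are backed by inline data" point, here is a heavily reduced sketch. `TinyVector` is a hypothetical name, and it only handles trivially copyable element types, unlike a production container.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <type_traits>

template <typename T, std::size_t N>
class TinyVector
{
    static_assert(N > 0, "need at least one inline element");
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch only handles trivially copyable types");

public:
    TinyVector() = default;
    TinyVector(const TinyVector &) = delete;
    TinyVector &operator=(const TinyVector &) = delete;

    ~TinyVector()
    {
        if (ptr != inline_data)
            std::free(ptr);
    }

    void push_back(const T &value)
    {
        if (count == capacity)
            grow();
        ptr[count++] = value;
    }

    T &operator[](std::size_t i)
    {
        assert(i < count);
        return ptr[i];
    }

    std::size_t size() const { return count; }

    // True while no heap allocation has been needed.
    bool is_inline() const { return ptr == inline_data; }

private:
    void grow()
    {
        std::size_t new_capacity = capacity * 2;
        T *new_ptr = static_cast<T *>(std::malloc(new_capacity * sizeof(T)));
        assert(new_ptr);
        std::memcpy(new_ptr, ptr, count * sizeof(T));
        if (ptr != inline_data)
            std::free(ptr);
        ptr = new_ptr;
        capacity = new_capacity;
    }

    T inline_data[N];       // storage used for the first N elements
    T *ptr = inline_data;   // points at inline_data or a heap block
    std::size_t count = 0;
    std::size_t capacity = N;
};
```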
2019-04-09 15:09:44 +02:00
Hans-Kristian Arntzen
317144a59c Detect invalid DoWhileLoop early.
We had a bug where error conditions in DoWhileLoop emit path would not
detect that statements were being emitted due to the masking behavior
which happens when force_recompile is true. Fix this.

Also, refactor force_recompile into member functions so we can properly
break on any situation where this is set, without having to rely on
watchpoints in debuggers.
2019-04-05 12:19:32 +02:00
Hans-Kristian Arntzen
fc37c52d26 Fix typo with array stride error message.
Trivial copy-paste bug.
2019-04-02 19:18:13 +02:00
Hans-Kristian Arntzen
9b92e68d71 Add an option to override the namespace used for spirv_cross.
This is a pragmatic trick to avoid symbol collision where a project
links against SPIRV-Cross statically, while linking to other projects
which also use SPIRV-Cross statically. We can end up with very awkward
symbol collisions which can resolve themselves silently because
SPIRV-Cross is pulled in as necessary. To fix this, we must use
different symbols and embed two copies of SPIRV-Cross in this scenario,
now with different namespaces, which in turn leads to different symbols.
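
The general pattern is a namespace name that comes from a macro. The sketch below uses hypothetical `MYLIB_*` macro names and a made-up function; it is meant to show the mechanism, not the exact macros this change adds.

```cpp
#include <cassert>

// If the build defines MYLIB_NAMESPACE_OVERRIDE (for example with
// -DMYLIB_NAMESPACE_OVERRIDE=mylib_for_project_a), every symbol below ends
// up in that namespace and therefore gets a different mangled name.
#ifdef MYLIB_NAMESPACE_OVERRIDE
#define MYLIB_NAMESPACE MYLIB_NAMESPACE_OVERRIDE
#else
#define MYLIB_NAMESPACE mylib
#endif

namespace MYLIB_NAMESPACE
{
inline int api_version()
{
    return 1;
}
} // namespace MYLIB_NAMESPACE

// Client code refers to the library through the macro rather than a
// hard-coded namespace name, so it works with or without the override.
inline int client_query_version()
{
    int version = MYLIB_NAMESPACE::api_version();
    assert(version >= 1);
    return version;
}
```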
2019-03-29 10:29:44 +01:00
Patrick Mours
b2a667520d Add reflection support for ray tracing acceleration structures 2019-03-26 15:09:42 +01:00
Patrick Mours
90c91e4f23 Fix missing check for purity on ray tracing builtins 2019-03-26 14:25:25 +01:00
Hans-Kristian Arntzen
d8e4d995e5 Remove strange include which got included for some reason. 2019-03-15 21:55:53 +01:00
Hans-Kristian Arntzen
e47a77d596 MSL: Implement Metal 2.0 indirect argument buffers. 2019-03-15 11:01:27 +01:00
Hans-Kristian Arntzen
8bfb04d29d Run format_all.sh
Disable clang format in C wrapper for now.
Some weird formatting bug with the try/catch macro.
2019-03-06 12:20:13 +01:00
Hans-Kristian Arntzen
ef24337849 Support do-while where test is negative. 2019-03-06 12:17:38 +01:00
Hans-Kristian Arntzen
70ff96b03f Deal with more for loop candidate cases.
We can trivially deal with cases where the loop tests are simply
inverted. We can also deal with cases where the condition block branches
to the merge block via other noop blocks.
This makes SPIR-V codegen easier when targeting SPIRV-Cross.
2019-03-06 11:24:43 +01:00
Hans-Kristian Arntzen
9bbdccddb7 Add a stable C API for SPIRV-Cross.
This adds a new C API for SPIRV-Cross which is intended to be stable,
both API and ABI wise.

The C++ API has been refactored a bit to make the C wrapper easier and
cleaner to write. Especially the vertex attribute / resource interfaces
for MSL have been rewritten to avoid taking mutable pointers into the
interface. This would be very annoying to wrap and it didn't fit well
with the rest of the C++ API to begin with. While doing this, I went
ahead and removed all the old deprecated interfaces.

The CMake build system has also seen an overhaul.
It is now possible to build static/shared/CLI separately with -D
options.
The shared library only exposes the C API, as it is the only ABI-stable
API. pkg-configs as well as CMake modules are exported and installed for
the shared library configuration.
2019-03-01 11:53:51 +01:00
Hans-Kristian Arntzen
825ff4af7e Replace locale handling.
We were using std::locale::global() to force a C locale which is not
safe when SPIRV-Cross is used in a multi-threaded environment.

To fix this, we could tap into various per-platform specific locale
handling to get safe thread-local locales, but since locales only affect
the decimal point in floats, we simply query the locale instead and do
the necessary radix replacement ourselves, without touching the locale.
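
A minimal sketch of that approach is shown below, assuming a single-character radix. The function names and the `%.9g` format are choices made for this example, not the actual implementation.

```cpp
#include <cassert>
#include <clocale>
#include <cstdio>
#include <string>

// Queries the decimal point of the current locale without changing it.
inline char current_locale_radix()
{
    const std::lconv *conv = std::localeconv();
    if (conv && conv->decimal_point && conv->decimal_point[0] != '\0')
        return conv->decimal_point[0];
    return '.';
}

// Replaces the locale-specific radix character with '.'.
inline std::string fix_radix(std::string text, char locale_radix)
{
    if (locale_radix != '.')
        for (char &c : text)
            if (c == locale_radix)
                c = '.';
    return text;
}

// Formats a float with the C library, then normalizes the radix.
inline std::string float_to_string(float value)
{
    char buffer[64];
    int written = std::snprintf(buffer, sizeof(buffer), "%.9g", value);
    assert(written > 0 && written < int(sizeof(buffer)));
    return fix_radix(buffer, current_locale_radix());
}
```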

This should be much safer and cleaner than the alternative.
2019-02-28 11:28:31 +01:00
Hans-Kristian Arntzen
d2cc43e667 Fix edge case where opaque types can be declared on stack.
In the bizarre case where the ID of a loaded opaque type aliased with a
literal which was used as part of another texturing instruction, we
could end up with a case where domination analysis assumed the loaded
opaque type needed to be moved to a different scope.

Fix the issue by never doing dominance analysis for opaque temporaries,
and be more robust when analyzing texturing instructions.

Also make sure reflection output is deterministic.
This patch slightly altered output for some unknown reason, but it came
from an unordered_map, so it's fine.
2019-02-19 17:28:31 +01:00
Chip Davis
e75add42c9 MSL: Add support for tessellation evaluation shaders.
These are mapped to Metal's post-tessellation vertex functions. The
semantic difference is much less here, so this change should be simpler
than the previous one. There are still some hairy parts, though.

In MSL, the array of control point data is represented by a special
type, `patch_control_point<T>`, where `T` is a valid stage-input type.
This object must be embedded inside the patch-level stage input. For
this reason, I've added a new type to the type system to represent this.

On Mac, the number of input control points to the function must be
specified in the `patch()` attribute. This is optional on iOS.
SPIRV-Cross takes this from the `OutputVertices` execution mode; the
intent is that if it's not set in the shader itself, MoltenVK will set
it from the tessellation control shader. If you're translating these
offline, you'll have to update the control point count manually, since
this number must match the number that is passed to the
`drawPatches:...` family of methods.

Fixes #120.
2019-02-14 10:00:08 -06:00
Chip Davis
8860a97d4a Fix formatting of uint32_t casts. 2019-02-11 16:14:00 -06:00
Chip Davis
eb89c3a428 MSL: Add support for tessellation control shaders.
These are transpiled to kernel functions that write the output of the
shader to three buffers: one for per-vertex varyings, one for per-patch
varyings, and one for the tessellation levels. This structure is
mandated by the way Metal works, where the tessellation factors are
supplied to the draw method in their own buffer, while the per-patch and
per-vertex varyings are supplied as though they were vertex attributes;
since they have different step rates, they must be in separate buffers.

The kernel is expected to be run in a workgroup whose size is the
greater of the number of input or output control points. It uses Metal's
support for vertex-style stage input to a compute shader to get the
input values; therefore, at least one instance must run per input point.
Meanwhile, Vulkan mandates that it run at least once per output point.
Overrunning the output array is a concern, but any values written should
either be discarded or overwritten by subsequent patches. I'm probably
going to put some slop space in the buffer when I integrate this into
MoltenVK to be on the safe side.
2019-02-07 08:51:22 -06:00
Hans-Kristian Arntzen
3e584f2c3f Support LUTs in single-function CFGs on Private storage class.
Fairly common pattern in unoptimized SPIR-V. Support this case as well.
2019-02-06 10:38:59 +01:00
Hans-Kristian Arntzen
3e09879131 Support initializers on StorageClassOutput. 2019-01-30 10:29:08 +01:00
Hans-Kristian Arntzen
40e7723051 Run format_all.sh. 2019-01-17 11:29:50 +01:00
Hans-Kristian Arntzen
de7e5ccd8b Refactor out packed expressions to extended decorations.
Can't safely just cast to the original enum without lots of hacks.
2019-01-17 11:28:51 +01:00
Hans-Kristian Arntzen
72377366d3 Replace custom use of DecorationCPacked with an explicit one.
Will need to use more variants of this decoration, so might as well make
it clearer what is going on with CPacked.
2019-01-17 10:36:56 +01:00
Hans-Kristian Arntzen
64ca1ec677 MSL: Start considering float[] and float2[] in std140 layout. 2019-01-16 16:16:39 +01:00
Hans-Kristian Arntzen
6e1c3ccb72 Run format_all.sh. 2019-01-11 12:56:00 +01:00
Hans-Kristian Arntzen
2fb9aa251e Workaround bugs on MSVC.
Bug:
https://developercommunity.visualstudio.com/content/problem/303996/c-error-c2668-ambiguous-overloaded-in-lambda-with.html
2019-01-11 09:29:28 +01:00
Hans-Kristian Arntzen
b629878f45 Make meta a hashmap.
A flat array was consuming way too much memory and was far too slow to
initialize properly with a very large ID bound (8 million IDs, showed up as #1 hotspot in perf).

Meta struct does not have to be in-order as we never iterate over it in
a meaningful way, so using a hashmap here is reasonable. Very few IDs
should need decorations or meta-data, so this should also be a quite
decent memory save.
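
The trade-off can be sketched like this. `Meta` and `MetaStore` here are simplified, hypothetical stand-ins: with a hash map, memory and initialization cost scale with the number of decorated IDs rather than with the ID bound.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

struct Meta
{
    std::string name;
    uint32_t decoration_flags = 0;
};

class MetaStore
{
public:
    // Creates the entry on first use; untouched IDs cost nothing.
    void set_name(uint32_t id, const std::string &name)
    {
        assert(!name.empty());
        entries[id].name = name;
    }

    // Returns nullptr for IDs that never received any metadata.
    const Meta *find(uint32_t id) const
    {
        auto itr = entries.find(id);
        return itr == entries.end() ? nullptr : &itr->second;
    }

    std::size_t size() const { return entries.size(); }

private:
    std::unordered_map<uint32_t, Meta> entries;
};
```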

For the pathological case, a 6x uplift was observed.
2019-01-10 14:04:01 +01:00
Hans-Kristian Arntzen
d92de00cc1 Rewrite how IDs are iterated over.
This is a fairly fundamental change on how IDs are handled.
It serves many purposes:

- Improve performance. We only need to iterate over IDs which are
  relevant at any one time.
- Makes sure we iterate through IDs in SPIR-V module declaration order
  rather than ID space. IDs don't have to be monotonically increasing,
  which was an assumption SPIRV-Cross used to have. It has apparently
  never been a problem until now.
- Support LUTs of structs. We do this by interleaving declaration of
  constants and struct types in SPIR-V module order.

To support this, the ParsedIR interface needed to change slightly.
Before setting any ID with variant_set<T> we let ParsedIR know
that an ID with a specific type has been added. The surface for change
should be minimal.

ParsedIR will maintain a per-type list of IDs which the cross-compiler
will need to consider for later.

Instead of looping over ir.ids[] (which can be extremely large), we loop
over types now, using:

ir.for_each_typed_id<SPIRVariable>([&](uint32_t id, SPIRVariable &var) {
	handle_variable(var);
});

Now we make sure that we're never looking at irrelevant types.
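
The per-type bookkeeping can be pictured with the reduced sketch below. `MiniIR`, `IdType` and `add_typed_id` are hypothetical names chosen for this example; the point is that iteration follows declaration order and only touches IDs of the requested type.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class IdType
{
    None,
    Variable,
    Constant,
    Count
};

struct MiniIR
{
    std::vector<IdType> id_types; // indexed by ID, sized to the ID bound
    std::vector<uint32_t> ids_for_type[static_cast<int>(IdType::Count)];

    explicit MiniIR(uint32_t bound)
        : id_types(bound, IdType::None)
    {
    }

    // Must be called before the ID is populated, so the per-type list
    // records IDs in module declaration order.
    void add_typed_id(IdType type, uint32_t id)
    {
        assert(id < id_types.size());
        assert(type != IdType::None && type != IdType::Count);
        assert(id_types[id] == IdType::None);
        id_types[id] = type;
        ids_for_type[static_cast<int>(type)].push_back(id);
    }

    // Visits only the IDs of one type instead of scanning the whole ID space.
    template <typename Func>
    void for_each_typed_id(IdType type, Func &&func) const
    {
        for (uint32_t id : ids_for_type[static_cast<int>(type)])
            func(id);
    }
};
```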
2019-01-10 12:52:56 +01:00
Chip Davis
d6aa911156 Flush all variables after storing through a variable pointer.
Since we can't know which variable was modified, we therefore have to
conservatively assume that any variable might have been modified.
2019-01-08 15:16:33 -06:00
Chip Davis
fc02b3d656 Rename get_non_pointer_type() methods.
This better reflects their purpose now.
2019-01-08 12:55:22 -06:00
Chip Davis
3bfb2f94d4 MSL: Support SPV_KHR_variable_pointers.
This allows shaders to declare and use pointer-type variables. Pointers
may be loaded and stored, be the result of an `OpSelect`, be passed to
and returned from functions, and even be passed as inputs to the `OpPhi`
instruction. All types of pointers may be used as variable pointers.
Variable pointers to storage buffers and workgroup memory may even be
loaded from and stored to, as though they were ordinary variables. In
addition, this enables using an interior pointer to an array as though
it were an array pointer itself using the `OpPtrAccessChain`
instruction.

This is a rather large and involved change, mostly because this is
somewhat complicated with a lot of moving parts. It's a wonder
SPIRV-Cross's output is largely unchanged. Indeed, many of these changes
are to accomplish exactly that! Perhaps the largest source of changes
was the violation of the assumption that, when emitting types, the
pointer type didn't matter.

One of the test cases added by the change doesn't optimize very well;
the output of `spirv-opt` here is invalid SPIR-V. I need to file a bug
with SPIRV-Tools about this.

I wanted to test that variable pointers to images worked too, but I
couldn't figure out how to propagate the access qualifier properly--in
MSL, it's part of the type, so getting this right is important. I've
punted on that for now.
2019-01-07 11:19:10 -06:00
Hans-Kristian Arntzen
5b8762223d Run format_all.sh. 2019-01-07 10:01:28 +01:00
Hans-Kristian Arntzen
211abfb7ef
Merge pull request #799 from KhronosGroup/fix-780
Use correct block-name / other-name aliasing rules.
2019-01-04 16:08:10 +01:00
Hans-Kristian Arntzen
9728f9c1b7 Use correct block-name / other-name aliasing rules.
A block name cannot alias with any name in its own scope,
and it cannot alias with any other "global" name.

To solve this, we need to complicate the name cache updates a little bit
where we have a "primary" namespace and "secondary" namespace.
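
One way to picture the two-namespace check is the sketch below. The function name and the `_N` suffix scheme are assumptions made for illustration; the candidate name must be free in both sets before it is accepted.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Picks a name that collides with nothing in the primary namespace (where
// the name will live) or in the secondary namespace (names it must not
// alias), then records it in the primary namespace.
inline std::string make_unique_name(const std::string &name,
                                    std::unordered_set<std::string> &primary,
                                    const std::unordered_set<std::string> &secondary)
{
    assert(!name.empty());
    std::string candidate = name;
    for (unsigned counter = 0; primary.count(candidate) || secondary.count(candidate); ++counter)
        candidate = name + "_" + std::to_string(counter);
    primary.insert(candidate);
    return candidate;
}
```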
2019-01-04 15:02:54 +01:00
Hans-Kristian Arntzen
acae607703 Register implied expression reads in OpLoad/OpAccessChain.
This is required to avoid relying on complex sub-expression elimination
in compilers, and generates cleaner code.

The problem case is if a complex expression is used in an access chain,
like:

Composite comp = buffer[texture(...)];
vec4 a = comp.a + comp.b + comp.c;

Before, we did not have common subexpression tracking for
OpLoad/OpAccessChain, so we easily ended up with code like:

vec4 a = buffer[texture(...)].a + buffer[texture(...)].b + buffer[texture(...)].c;

A good compiler will optimize this, but we should not rely on it, and
forcing texture(...) to a temporary also looks better.

The solution is to add a vector "implied_expression_reads", which works
similarly to expression_dependencies. We also need an extra mechanism in
to_expression which lets us skip expression read checking and do it
later. E.g. for expr -> access chain -> load, we should only trigger
a read of expr when using the loaded expression.
2019-01-04 14:56:12 +01:00
Hans-Kristian Arntzen
318c17cbb2 Nonfunctional: Update copyright headers for 2019. 2019-01-04 12:38:35 +01:00
Hans-Kristian Arntzen
9aa623a553 Remove old hack for dealing with HLSL counter buffers.
No longer needed.
2018-11-22 10:23:58 +01:00
Hans-Kristian Arntzen
5bcf02f7c9 Hoist out parsing module from spirv_cross::Compiler.
This is a large refactor which splits out the SPIR-V parser from
Compiler and moves it into its more appropriately named Parser module.

The Parser is responsible for building a ParsedIR structure which is
then consumed by one or more compilers.

Compiler can take a ParsedIR by value or move reference. This should
allow for optimal case for both multiple compilations and single
compilation scenarios.
2018-10-19 12:01:31 +02:00
Hans-Kristian Arntzen
c07c303999 Use GL_EXT_samplerless_texture_functions in Vulkan GLSL. 2018-09-27 13:36:38 +02:00
Chip Davis
8855ea0a3e Move is_sampled_image_type() onto the Compiler class.
While I'm at it, don't use a bitwise op with a `bool` variable.
Apparently, MSVC doesn't like that.
2018-09-24 12:24:58 -05:00
Hans-Kristian Arntzen
d310060f92 MSL: Support global I/O block and struct Input/Output usage.
Implement this by flattening outputs and unflattening inputs explicitly.
This allows us to pass down a single struct instead of dealing with the
insanity that would be passing down each flattened member separately.

Remove stage_uniforms_var_id.
Seems to be dead code. Naked uniforms do not exist in SPIR-V for Vulkan,
which this seems to have been intended for. It was also unused elsewhere.
2018-09-13 16:04:24 +02:00
Hans-Kristian Arntzen
e86018f8a1 Add a helper function to improve reflection on runtime sized arrays. 2018-09-10 11:08:47 +02:00
Chip Davis
4b99fdd5d0 MSL: Account for components when assigning locations to varyings.
Two varyings (vertex outputs/fragment inputs) might have the same
location but be in different components--e.g. the compiler may have
packed what were two different varyings into a single varying vector.
Giving both varyings the same `[[user]]` attribute won't work--it may
yield unexpected results, or flat out fail to link. We could eventually
pack such varyings into a single vector, but that would require us to
handle the case where the varyings are different types--e.g. a `float`
and a `uint` packed into the same vector. For now, it seems most
prudent to give them unique `[[user]]` locations and let Apple's
compiler work out the best way to pack them.
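
A sketch of how a unique `[[user]]` attribute could be derived from a location/component pair follows. The `locnN_C` naming scheme and the function name are assumptions for this example and may differ from what the compiler actually emits.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Builds the contents of a [[...]] attribute for a varying. Component 0
// keeps the plain location name; other components get a distinct suffix so
// two varyings sharing a location never share an attribute.
inline std::string user_attribute(uint32_t location, uint32_t component)
{
    assert(component < 4); // a location holds at most four components
    std::string attr = "user(locn" + std::to_string(location);
    if (component != 0)
        attr += "_" + std::to_string(component);
    return attr + ")";
}
```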
2018-09-06 13:52:33 -05:00