In some cases, we need to get a literal value from a spec constant op.
This is mostly relevant when emitting buffers, so implement a 32-bit
integer scalar subset of the evaluator. It can be extended as needed to
support evaluating any specialization constant operation.
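Roughly, the evaluator amounts to a switch over the integer opcodes (a minimal sketch, assuming the `spv::Op` enum from `spirv.hpp`; the function name is illustrative, not the actual implementation):

```cpp
#include <cstdint>
#include <stdexcept>
#include "spirv.hpp" // spv::Op

// Evaluate a 32-bit integer scalar OpSpecConstantOp with two operands.
uint32_t evaluate_spec_constant_op(spv::Op op, uint32_t a, uint32_t b)
{
    switch (op)
    {
    case spv::OpIAdd: return a + b;
    case spv::OpISub: return a - b;
    case spv::OpIMul: return a * b;
    case spv::OpShiftLeftLogical: return a << (b & 31u);
    case spv::OpBitwiseAnd: return a & b;
    // Extend with more opcodes as needed.
    default: throw std::runtime_error("Unsupported spec constant op.");
    }
}
```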
When inside a loop, assume that any read of an outer expression happens
multiple times, forcing a temporary for said outer expression.
This avoids relying on the downstream compiler to perform loop-invariant
code motion when converting optimized shaders.
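In terms of the emitted code, the difference looks roughly like this (a C-like stand-in for the generated shader code; all names are made up):

```cpp
// Without the forced temporary, (scale * bias) would be re-emitted inside
// the loop, and hoisting it would be left to the downstream compiler.
float accumulate(const float *data, int count, float scale, float bias)
{
    float sum = 0.0f;
    float tmp = scale * bias; // forced temporary hoists the invariant read
    for (int i = 0; i < count; i++)
        sum += data[i] * tmp;
    return sum;
}
```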
MoltenVK tessellation needs to be able to identify when a shader declares
an output built-in but does not populate it, in order to keep expectations
about how intermediate buffers are populated aligned between tessellation stages.
There is some fallout where internal functions now use stronger types.
It would be overkill to move everything over to strong types right now,
but we can migrate gradually over time.
This was straightforward to implement in GLSL. The
`ShadingRateInterlockOrderedEXT` and `ShadingRateInterlockUnorderedEXT`
modes aren't implemented yet, because we don't support
`SPV_NV_shading_rate` or `SPV_EXT_fragment_invocation_density` yet.
HLSL and MSL were more interesting. They don't support this directly,
but they do support marking resources as "rasterizer ordered," which
does roughly the same thing. So this implementation scans all accesses
inside the critical section and marks all storage resources found
therein as rasterizer ordered. They also don't support the fine-grained
controls on pixel- vs. sample-level interlock and disabling ordering
guarantees that GLSL and SPIR-V do, but that's OK. "Unordered" here
merely means the order is undefined; that it just so happens to be the
same as rasterizer order is immaterial. As for pixel- vs. sample-level
interlock, Vulkan explicitly states:
> With sample shading enabled, [the `PixelInterlockOrderedEXT` and
> `PixelInterlockUnorderedEXT`] execution modes are treated like
> `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT`
> respectively.
and:
> If [the `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT`]
> execution modes are used in single-sample mode they are treated like
> `PixelInterlockOrderedEXT` or `PixelInterlockUnorderedEXT`
> respectively.
So this will DTRT for MoltenVK and gfx-rs, at least.
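The scan itself is conceptually simple (a minimal sketch, not the actual implementation; it only records which resources are touched between the begin/end interlock ops):

```cpp
#include <cstdint>
#include <unordered_set>

struct InterlockScan
{
    bool in_critical_section = false;
    std::unordered_set<uint32_t> interlocked_resources; // variable IDs

    void on_begin_interlock() { in_critical_section = true; }
    void on_end_interlock() { in_critical_section = false; }
    void on_resource_access(uint32_t variable_id)
    {
        if (in_critical_section)
            interlocked_resources.insert(variable_id);
    }
};
```

Resources collected this way are then declared as rasterizer-ordered when emitting HLSL and MSL.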
MSL additionally supports multiple raster order groups; resources that
are not accessed together can be placed in different ROGs to allow them
to be synchronized separately. A more sophisticated analysis might be
able to place resources optimally, but that's outside the scope of this
change. For now, we assign all resources to group 0, which should do for
our purposes.
`glslang` doesn't support the `RasterizerOrdered` UAVs this
implementation produces for HLSL, so the test case needs `fxc.exe`.
It also insists on GLSL 4.50 for `GL_ARB_fragment_shader_interlock`,
even though the spec says it needs either 4.20 or
`GL_ARB_shader_image_load_store`; and it doesn't support the
`GL_NV_fragment_shader_interlock` extension at all. So I haven't been
able to test those code paths.
Fixes #1002.
When merging combined image samplers, we only looked at the sampler, but DXC
emits RelaxedPrecision only for the texture. It does not hurt to check both
halves.
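A minimal sketch of the widened check (helper and parameter names are illustrative):

```cpp
#include <cstdint>
#include "spirv.hpp" // spv::Decoration

// Hypothetical helper; in practice this queries the decoration metadata.
bool has_decoration(uint32_t id, spv::Decoration decoration);

// Treat the combined sampler as relaxed if either half is decorated.
bool is_relaxed_precision(uint32_t texture_id, uint32_t sampler_id)
{
    return has_decoration(texture_id, spv::DecorationRelaxedPrecision) ||
           has_decoration(sampler_id, spv::DecorationRelaxedPrecision);
}
```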
We were walking the same tree of expressions multiple times, which caused
an exponential explosion in compile time that was not caught until recently.
Fix this by blocking any traversal from going through an ID more than once.
Overall, this fix improves performance on a particular test shader by almost
an order of magnitude, rather than slowing it down by ~75x.
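A minimal sketch of the guard, assuming an ID-indexed visited set (helper names are illustrative):

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical helper returning the IDs an expression depends on.
std::vector<uint32_t> dependencies_of(uint32_t id);

void traverse(uint32_t id, std::unordered_set<uint32_t> &visited)
{
    // Visiting an ID at most once turns an exponential walk into a linear one.
    if (!visited.insert(id).second)
        return;
    for (uint32_t dep : dependencies_of(id))
        traverse(dep, visited);
}
```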
This subtle bug removed any expression validation for trivially swizzled
variables. Make usage suppression a more explicit concept rather than
just hacking off forwarded_temporaries.
There is some fallout here with loop generation since our expression
invalidation is currently a bit too naive to handle loops properly.
The forwarding bug masked this problem until now.
If part of the loop condition is also used in the body, we end up
reading an invalid expression, which in turn forces a temporary to be
generated in the condition block. Not good. We'll need to be smarter
here ...
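The shape of the problem, in illustrative C-like code (this is not the exact output):

```cpp
void consume(int);

void loop_shape(int a, int b, int limit)
{
    // Ideal output keeps the expression inline:
    //     while (a + b < limit) consume(a + b);
    // Instead, the body's read invalidates the expression, so a temporary
    // is forced into the condition block:
    int tmp = a + b;
    while (tmp < limit)
    {
        consume(tmp);
        tmp = a + b; // re-evaluated through the temporary each iteration
    }
}
```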
This is quite complex, since we cannot flush Phi inside the case labels;
we have to do it outside by emitting a lot of manual branches ourselves.
This should be extremely rare, but we need to handle this case.
We used to use the Binding decoration for this, but this method is
hopelessly broken. If no explicit MSL resource remapping exists, we
remap automatically in a manner which should always "just work".
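A minimal sketch of what automatic remapping can look like, assuming per-resource-type counters handing out indices in declaration order (names are illustrative, not the actual scheme):

```cpp
#include <cstdint>

struct MSLResourceAllocator
{
    uint32_t next_buffer = 0;
    uint32_t next_texture = 0;
    uint32_t next_sampler = 0;

    uint32_t allocate_buffer_index() { return next_buffer++; }
    uint32_t allocate_texture_index() { return next_texture++; }
    uint32_t allocate_sampler_index() { return next_sampler++; }
};
```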
This is rather shaky, but we don't have many choices here short of adding a
lot of awkward and unintuitive options. Try to deduce this from OpSource
and fall back to a heuristic.
There is a risk that we try to preserve a loop variable through multiple
iterations, even though the dominating block is inside a loop.
Fix this by analyzing whether a block starts off by writing to the variable;
in that case, there cannot be any preservation going on. If it does not,
pretend the loop header reads the variable, which moves the variable to an
appropriate scope.
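A minimal sketch of the analysis (types and helpers are illustrative):

```cpp
#include <cstdint>
#include <vector>

struct Instruction {};
struct Block { std::vector<Instruction> ops; };

// Hypothetical helpers querying whether an instruction touches a variable.
bool writes(const Instruction &op, uint32_t variable_id);
bool reads(const Instruction &op, uint32_t variable_id);

// If the block writes the variable before any read, nothing needs to be
// preserved across iterations; otherwise treat the loop header as a reader.
bool needs_preservation(const Block &block, uint32_t variable_id)
{
    for (const Instruction &op : block.ops)
    {
        if (writes(op, variable_id))
            return false;
        if (reads(op, variable_id))
            return true;
    }
    return false;
}
```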
MSL generally emits the aliases, which means we cannot always place the
master type first, unlike GLSL and HLSL. The logic fix is just to
reorder after we have tagged types with packing information, rather than
doing it in the parser fixup.
Buffer objects can contain arbitrary pointers to blocks.
We can also implement ConvertPtrToU and ConvertUToPtr.
The latter can cast a uint64_t to any type as it pleases,
so we will need to generate fake buffer reference blocks to be able to
cast the type.
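In C++ terms, ConvertUToPtr amounts to the following (an analogy only; the emitted GLSL instead routes the cast through a synthesized buffer reference block, since GLSL cannot cast a uint64_t to an arbitrary type directly):

```cpp
#include <cstdint>

struct Data { float value; };

float load_via_address(uint64_t device_address)
{
    // Reinterpret the 64-bit device address as a typed pointer.
    const Data *ptr = reinterpret_cast<const Data *>(
        static_cast<uintptr_t>(device_address));
    return ptr->value;
}
```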
If we generate an access chain in a loop body, and it is consumed in the
loop continue block, we have a problem because we cannot emit a
temporary here holding the access chain reference. Force a complex loop
body to work around this exceptionally rare case.
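The difference between the two loop shapes, in illustrative C-like code:

```cpp
void complex_loop_shape()
{
    // A merged loop, for (init; cond; continue-op) { body; }, has nowhere
    // to hold a temporary that the continue block consumes. The complex
    // form turns the continue block into plain statements in the body:
    int i = 0; // init
    for (;;)
    {
        if (!(i < 16))
            break; // condition emitted explicitly
        // ... body; temporaries emitted here remain visible below ...
        i += 1; // continue block
    }
}
```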
- Replace ostringstream with a custom implementation.
~30% performance uplift on the vector-shuffle-oom test.
Allocations are measurably reduced in Valgrind.
- Replace std::vector with SmallVector.
A classic malloc optimization: small vectors are backed by inline data
(see the sketch after this list).
~7-8% gain on vector-shuffle-oom on GCC 8 on Linux.
- Use an object pool for the IVariant type.
We generally allocate a lot of SPIR* objects, and we can amortize these
allocations neatly by pooling them.
- ~15% overall uplift on ./test_shaders.py --iterations 10000 shaders/.
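For reference, the small-buffer idea behind SmallVector looks roughly like this (a simplified sketch, not the actual implementation, which uses raw storage and placement-new instead of default-constructed arrays):

```cpp
#include <cstddef>

template <typename T, size_t N = 8>
class SmallVectorSketch
{
public:
    ~SmallVectorSketch()
    {
        if (ptr != inline_storage)
            delete[] ptr;
    }

    void push_back(const T &value)
    {
        if (count == capacity)
            grow();
        ptr[count++] = value;
    }

    size_t size() const { return count; }
    T &operator[](size_t i) { return ptr[i]; }

private:
    T inline_storage[N]; // small vectors never touch malloc
    T *ptr = inline_storage;
    size_t count = 0;
    size_t capacity = N;

    void grow()
    {
        // Spill to the heap once the inline capacity is exhausted.
        size_t new_capacity = capacity * 2;
        T *heap = new T[new_capacity];
        for (size_t i = 0; i < count; i++)
            heap[i] = ptr[i];
        if (ptr != inline_storage)
            delete[] ptr;
        ptr = heap;
        capacity = new_capacity;
    }
};
```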
We had a bug where error conditions in the DoWhileLoop emit path would not
detect that statements were being emitted, due to the masking behavior
which happens when force_recompile is true. Fix this.
Also, refactor force_recompile into member functions so we can properly
break on any situation where it is set, without having to rely on
watchpoints in debuggers.
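A minimal sketch of the refactor (member names are illustrative; the point is that every write goes through one function we can break on):

```cpp
class CompilerSketch
{
public:
    void force_recompile() { is_forced_recompile = true; } // breakpoint here
    void clear_force_recompile() { is_forced_recompile = false; }
    bool is_forced_recompile_needed() const { return is_forced_recompile; }

private:
    bool is_forced_recompile = false;
};
```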
This is a pragmatic trick to avoid symbol collisions when a project
links against SPIRV-Cross statically while also linking against other
projects which use SPIRV-Cross statically. We can end up with very awkward
symbol collisions which can resolve silently, because SPIRV-Cross is
pulled in as necessary. To fix this, we must use different symbols and
embed two copies of SPIRV-Cross in this scenario, now with different
namespaces, which in turn leads to different symbols.
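A sketch of the mechanism, assuming an override macro along these lines (the macro names here are illustrative):

```cpp
// Each embedded copy is compiled with a different namespace override, so
// the two copies produce disjoint mangled symbol names.
#ifdef SPIRV_CROSS_NAMESPACE_OVERRIDE
#define SPIRV_CROSS_NAMESPACE SPIRV_CROSS_NAMESPACE_OVERRIDE
#else
#define SPIRV_CROSS_NAMESPACE spirv_cross
#endif

namespace SPIRV_CROSS_NAMESPACE
{
class Compiler; // all public types live in the per-copy namespace
} // namespace SPIRV_CROSS_NAMESPACE
```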