SPIRV-Cross

Author	SHA1	Message	Date
Lukas Hermanns	37df74035b	Merge branch 'ue4_dev'	2019-09-20 09:42:42 -04:00
Lukas Hermanns	50ac6862ac	Rearranged all 'UE Change' comments to match to project's coding style.	2019-09-18 14:03:54 -04:00
Lukas Hermanns	a9f3c981d9	Adjustments after rebase of ue4_dev branch.	2019-09-13 14:03:02 -04:00
Hans-Kristian Arntzen	bfa76ee2ab	Consider discard and demote as impure statements. Fixes cases where discard and demote are called in pure functions and the function result is not consumed.	2019-09-12 14:21:10 +02:00
Mark Satterthwaite	869d628521	The result of an AccessChain intrinsic in SPIRV can be referenced by multiple blocks but when they are loops that can result in compilation problems because the source variables might not be declared early enough. This forces us to hoist those variables high enough to make it work.	2019-09-11 14:01:40 -04:00
Mark Satterthwaite	32557e9093	SPIRV doesn't distinguish depth textures from regular textures, but Metal does, so if we've ever seen a depth comparison operation we must ensure that the texture is specified as a depth-texture.	2019-09-06 16:58:27 -04:00
Hans-Kristian Arntzen	333980ae91	Refactor into stronger types in public API. Some fallout where internal functions are using stronger types. Overkill to move everything over to strong types right now, but perhaps move over to it slowly over time.	2019-09-06 12:29:47 +02:00
Hans-Kristian Arntzen	261b46982a	Deal with complex interlock cases in GLSL.	2019-09-04 12:18:04 +02:00
Hans-Kristian Arntzen	36c433bd92	Deal with call stacks when analyzing access.	2019-09-04 11:42:29 +02:00
Hans-Kristian Arntzen	3f2ce375e1	Analyze complex cases for fragment interlocks. If we are using interlocks in split functions or in control flow, we have some serious workarounds we need to employ.	2019-09-04 11:20:25 +02:00
Chip Davis	2eff420d9a	Support the SPV_EXT_fragment_shader_interlock extension. This was straightforward to implement in GLSL. The `ShadingRateInterlockOrderedEXT` and `ShadingRateInterlockUnorderedEXT` modes aren't implemented yet, because we don't support `SPV_NV_shading_rate` or `SPV_EXT_fragment_invocation_density` yet. HLSL and MSL were more interesting. They don't support this directly, but they do support marking resources as "rasterizer ordered," which does roughly the same thing. So this implementation scans all accesses inside the critical section and marks all storage resources found therein as rasterizer ordered. They also don't support the fine-grained controls on pixel- vs. sample-level interlock and disabling ordering guarantees that GLSL and SPIR-V do, but that's OK. "Unordered" here merely means the order is undefined; that it just so happens to be the same as rasterizer order is immaterial. As for pixel- vs. sample-level interlock, Vulkan explicitly states: "With sample shading enabled, [the `PixelInterlockOrderedEXT` and `PixelInterlockUnorderedEXT`] execution modes are treated like `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT` respectively." and: "If [the `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT`] execution modes are used in single-sample mode they are treated like `PixelInterlockOrderedEXT` or `PixelInterlockUnorderedEXT` respectively." So this will DTRT for MoltenVK and gfx-rs, at least. MSL additionally supports multiple raster order groups; resources that are not accessed together can be placed in different ROGs to allow them to be synchronized separately. A more sophisticated analysis might be able to place resources optimally, but that's outside the scope of this change. For now, we assign all resources to group 0, which should do for our purposes. `glslang` doesn't support the `RasterizerOrdered` UAVs this implementation produces for HLSL, so the test case needs `fxc.exe`. It also insists on GLSL 4.50 for `GL_ARB_fragment_shader_interlock`, even though the spec says it needs either 4.20 or `GL_ARB_shader_image_load_store`; and it doesn't support the `GL_NV_fragment_shader_interlock` extension at all. So I haven't been able to test those code paths. Fixes #1002.	2019-09-02 12:31:10 -05:00
Chip Davis	39dce88d3b	MSL: Add support for sampler Y'CbCr conversion. This change introduces functions and in one case, a class, to support the `VK_KHR_sampler_ycbcr_conversion` extension. Except in the case of GBGR8 and BGRG8 formats, for which Metal natively supports implicit chroma reconstruction, we're on our own here. We have to do everything ourselves. Much of the complexity comes from the need to support multiple planes, which must now be passed to functions that use the corresponding combined image-samplers. The rest is from the actual Y'CbCr conversion itself, which requires additional post-processing of the sample retrieved from the image. Passing sampled images to a function was a particular problem. To support this, I've added a new class which is emitted to MSL shaders that pass sampled images with Y'CbCr conversions attached around. It can handle sampled images with or without Y'CbCr conversion. This is an awful abomination that should not exist, but I'm worried that there's some shader out there which does this. This support requires Metal 2.0 to work properly, because it uses default-constructed texture objects, which were only added in MSL 2. I'm not even going to get into arrays of combined image-samplers--that's a whole other can of worms. They are deliberately unsupported in this change. I've taken the liberty of refactoring the support for texture swizzling while I'm at it. It's now treated as a post-processing step similar to Y'CbCr conversion. I'd like to think this is cleaner than having everything in `to_function_name()`/`to_function_args()`. It still looks really hairy, though. I did, however, get rid of the explicit type arguments to `spvGatherSwizzle()`/`spvGatherCompareSwizzle()`. Update the C API. In addition to supporting this new functionality, add some compiler options that I added in previous changes, but for which I neglected to update the C API.	2019-09-01 18:35:53 -05:00
Hans-Kristian Arntzen	3ccfbce264	Run format_all.sh.	2019-08-28 14:25:26 +02:00
Hans-Kristian Arntzen	d5a65b4190	GLSL: Assume image and sampler can be RelaxedPrecision. When merging combined image samplers, we only looked at sampler, but DXC emits RelaxedPrecision only for texture. Does not hurt to check for more things.	2019-08-27 17:15:19 +02:00
Hans-Kristian Arntzen	9436cd3036	MSL: Deal with array copies from and to threadgroup.	2019-08-27 13:18:01 +02:00
Hans-Kristian Arntzen	5d97dae1eb	Move branchless analysis to CFG. Traverse backwards instead, far more robust. Should elide basically all redundant continue; statements now.	2019-08-27 10:19:19 +02:00
Hans-Kristian Arntzen	d620f1dd26	Do not force temporary unless continue-only for loop dominates. We would force temporaries in unexpected places, causing assertions to throw if access chains were consumed in such loops.	2019-07-26 10:39:05 +02:00
Hans-Kristian Arntzen	e06efb7259	Missed case where DoWhile continue block deals with Phi.	2019-07-25 12:30:50 +02:00
Hans-Kristian Arntzen	a86308bce1	MSL: Begin rewrite of buffer packing logic.	2019-07-19 10:06:19 +02:00
Lifeng Pan	5ca8779044	Parse SPIR-V debug information extended instructions, as well as OpNoLine. No impact on result shader string.	2019-07-04 16:21:44 +08:00
Hans-Kristian Arntzen	b4e0163749	Run format_all.sh.	2019-06-21 16:02:22 +02:00
Hans-Kristian Arntzen	2b11b331d6	Merge pull request #1036 from KhronosGroup/msl-auto-binding. MSL: Rewrite how resources are automatically assigned bindings.	2019-06-21 15:58:50 +02:00
Hans-Kristian Arntzen	c365cc1b43	Deal with OpPhi and case fallthrough. This is quite complex since we cannot flush Phi inside the case labels, we have to do it outside by emitting a lot of manual branches ourselves. This should be extremely rare, but we need to handle this case.	2019-06-21 13:38:23 +02:00
Hans-Kristian Arntzen	e2c95bdcbc	MSL: Rewrite how resource indices are fallback-assigned. We used to use the Binding decoration for this, but this method is hopelessly broken. If no explicit MSL resource remapping exists, we remap automatically in a manner which should always "just work".	2019-06-21 12:54:08 +02:00
Hans-Kristian Arntzen	457eba355e	Employ heuristics to figure out how to emit SSBO/UAV reflection names. This is rather shaky, but we don't have many choices here except to add a lot of awkward and unintuitive options. Try to deduce this from OpSource and fall back to a heuristic.	2019-06-10 11:24:24 +02:00
Hans-Kristian Arntzen	6b52b0fe8b	Deal with nested loops. Actually need to hoist out variable to outermost loop.	2019-06-06 14:37:02 +02:00
Hans-Kristian Arntzen	02ae99f399	Use the existing loop dominator when doing loop variable preservation.	2019-06-06 12:22:28 +02:00
Hans-Kristian Arntzen	bf56dc88b9	Rewrite how loop dominators are propagated. Do this analysis in the CFG stage rather than last minute with the ad-hoc algorithm we had in place before CFG was introduced.	2019-06-06 12:17:46 +02:00
Hans-Kristian Arntzen	03d93abc1a	Deal with case where a variable is dominated by inner part of a loop. There is a risk that we try to preserve a loop variable through multiple iterations, even though the dominating block is inside a loop. Fix this by analyzing if a block starts off by writing to a variable. In that case, there cannot be any preservation going on. If we don't, pretend the loop header is reading the variable, which moves the variable to an appropriate scope.	2019-06-06 11:11:44 +02:00
Hans-Kristian Arntzen	65af09d2d1	Support emitting OpLine directive. Facilitates easier mapping from source language to cross-compiled output in tooling.	2019-05-28 13:44:24 +02:00
Hans-Kristian Arntzen	42e64597a7	OpArrayLength must trigger active variables.	2019-05-27 16:44:02 +02:00
Hans-Kristian Arntzen	96492648d4	MSL: Fix struct declaration order with complex type aliases. MSL generally emits the aliases, which means we cannot always place the master type first, unlike GLSL and HLSL. The logic fix is just to reorder after we have tagged types with packing information, rather than doing it in the parser fixup.	2019-05-23 14:54:04 +02:00
Hans-Kristian Arntzen	6fcf8c83d9	GLSL: Support OpBitcast for buffer references. Update glslang/SPIRV-Tools/SPIRV-Headers references.	2019-05-09 10:29:31 +02:00
Chip Davis	01c491648b	Fix a copy-pasto.	2019-04-26 17:16:21 -05:00
Hans-Kristian Arntzen	2cc374a0c8	GLSL: Implement GL_EXT_buffer_reference. Buffer objects can contain arbitrary pointers to blocks. We can also implement ConvertPtrToU and ConvertUToPtr. The latter can cast a uint64_t to any type as it pleases, so we will need to generate fake buffer reference blocks to be able to cast the type.	2019-04-26 11:43:51 +02:00
Hans-Kristian Arntzen	e23c9ea700	Force complex loop in certain rare access chain scenarios. If we generate an access chain in a loop body, and it is consumed in the loop continue block, we have a problem because we cannot emit a temporary here holding the access chain reference. Force a complex loop body to work around this exceptionally rare case.	2019-04-10 16:02:03 +02:00
Hans-Kristian Arntzen	3fe57d3798	Do not use SmallVector as input type in public interfaces. This is an API break, which we need to be careful with. Handing out SmallVectors is easier since the interface is basically the same.	2019-04-09 15:09:44 +02:00
Hans-Kristian Arntzen	a489ba7fd1	Reduce pressure on global allocation. Replace ostringstream with a custom implementation: ~30% performance uplift on the vector-shuffle-oom test, and allocations are measurably reduced in Valgrind. Replace std::vector with SmallVector: a classic malloc optimization, since small vectors are backed by inline data; ~7-8% gain on vector-shuffle-oom on GCC 8 on Linux. Use an object pool for the IVariant type: we generally allocate a lot of SPIR* objects, and we can amortize these allocations neatly by pooling them. Overall: ~15% uplift on ./test_shaders.py --iterations 10000 shaders/.	2019-04-09 15:09:44 +02:00
Hans-Kristian Arntzen	317144a59c	Detect invalid DoWhileLoop early. We had a bug where error conditions in DoWhileLoop emit path would not detect that statements were being emitted due to the masking behavior which happens when force_recompile is true. Fix this. Also, refactor force_recompile into member functions so we can properly break on any situation where this is set, without having to rely on watchpoints in debuggers.	2019-04-05 12:19:32 +02:00
Hans-Kristian Arntzen	fc37c52d26	Fix typo with array stride error message. Trivial copy-paste bug.	2019-04-02 19:18:13 +02:00
Hans-Kristian Arntzen	9b92e68d71	Add an option to override the namespace used for spirv_cross. This is a pragmatic trick to avoid symbol collision where a project links against SPIRV-Cross statically, while linking to other projects which also use SPIRV-Cross statically. We can end up with very awkward symbol collisions which can resolve themselves silently because SPIRV-Cross is pulled in as necessary. To fix this, we must use different symbols and embed two copies of SPIRV-Cross in this scenario, now with different namespaces, which in turn leads to different symbols.	2019-03-29 10:29:44 +01:00
Patrick Mours	b2a667520d	Add reflection support for ray tracing acceleration structures	2019-03-26 15:09:42 +01:00
Patrick Mours	90c91e4f23	Fix missing check for purity on ray tracing builtins	2019-03-26 14:25:25 +01:00
Hans-Kristian Arntzen	d8e4d995e5	Remove strange include which got included for some reason.	2019-03-15 21:55:53 +01:00
Hans-Kristian Arntzen	e47a77d596	MSL: Implement Metal 2.0 indirect argument buffers.	2019-03-15 11:01:27 +01:00
Hans-Kristian Arntzen	8bfb04d29d	Run format_all.sh. Disable clang-format in C wrapper for now. Some weird formatting bug with the try/catch macro.	2019-03-06 12:20:13 +01:00
Hans-Kristian Arntzen	ef24337849	Support do-while where test is negative.	2019-03-06 12:17:38 +01:00
Hans-Kristian Arntzen	70ff96b03f	Deal with more for loop candidate cases. We can trivially deal with cases where the loop tests are simply inverted. We can also deal with cases where the condition block branches to the merge block via other noop blocks. This makes SPIR-V codegen easier when targeting SPIRV-Cross.	2019-03-06 11:24:43 +01:00
Hans-Kristian Arntzen	9bbdccddb7	Add a stable C API for SPIRV-Cross. This adds a new C API for SPIRV-Cross which is intended to be stable, both API- and ABI-wise. The C++ API has been refactored a bit to make the C wrapper easier and cleaner to write. In particular, the vertex attribute / resource interfaces for MSL have been rewritten to avoid taking mutable pointers into the interface. This would be very annoying to wrap and it didn't fit well with the rest of the C++ API to begin with. While doing this, I went ahead and removed all the old deprecated interfaces. The CMake build system has also seen an overhaul. It is now possible to build static/shared/CLI separately with -D options. The shared library only exposes the C API, as it is the only ABI-stable API. pkg-config files as well as CMake modules are exported and installed for the shared library configuration.	2019-03-01 11:53:51 +01:00
Hans-Kristian Arntzen	825ff4af7e	Replace locale handling. We were using std::locale::global() to force a C locale which is not safe when SPIRV-Cross is used in a multi-threaded environment. To fix this, we could tap into various per-platform specific locale handling to get safe thread-local locales, but since locales only affect the decimal point in floats, we simply query the locale instead and do the necessary radix replacement ourselves, without touching the locale. This should be much safer and cleaner than the alternative.	2019-02-28 11:28:31 +01:00
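
The locale change in 825ff4af7e can be illustrated with a minimal sketch. This is not the code from the commit: the helper names `current_locale_radix`, `replace_radix` and `float_to_string` are hypothetical, and the sketch assumes the locale's decimal point is a single character. It only shows the idea described in the message: query the radix character once, format the float as usual, then patch the radix back to `.` without ever calling `std::locale::global()` or `setlocale()`.

```cpp
#include <clocale>
#include <cstdio>
#include <string>

// Hypothetical helper: read the current locale's radix (decimal point)
// character without modifying the locale. Falls back to '.' if the
// locale does not report one. Assumes a single-character radix.
inline char current_locale_radix()
{
    const std::lconv *conv = std::localeconv();
    if (conv && conv->decimal_point && conv->decimal_point[0] != '\0')
        return conv->decimal_point[0];
    return '.';
}

// Hypothetical helper: replace every occurrence of the locale radix
// with '.', which is what shader source languages expect.
inline std::string replace_radix(std::string text, char locale_radix)
{
    if (locale_radix != '.')
    {
        for (char &c : text)
            if (c == locale_radix)
                c = '.';
    }
    return text;
}

// Hypothetical helper: format a float with the locale-dependent
// snprintf, then normalize the radix ourselves.
inline std::string float_to_string(float value, char locale_radix)
{
    char buffer[64];
    std::snprintf(buffer, sizeof(buffer), "%.9g", value);
    return replace_radix(buffer, locale_radix);
}
```

A caller would typically query `current_locale_radix()` once up front and pass the result to every `float_to_string()` call, so the global locale is only read, never written.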


317 Commits