SPIRV-Cross

Author	SHA1	Message	Date
Hans-Kristian Arntzen	a3d3c80dd7	GLSL/HLSL: Implement nonuniform qualifier for image atomics.	2020-03-19 11:35:29 +01:00
Hans-Kristian Arntzen	185551bfaf	HLSL: Do not emit globallycoherent for SRV ByteAddressBuffer.	2020-03-05 10:37:36 +01:00
Hans-Kristian Arntzen	c27e1efbf1	HLSL: Add option to always treat SSBO as UAV, even with readonly. This can make codegen more predictable since ByteAddressBuffer is SRV and not UAV.	2020-03-04 16:42:31 +01:00
Hans-Kristian Arntzen	e81c1b1d98	HLSL: Declare undef variables as static. Undef variables would somehow become cbuffer variables without any warning ...	2020-02-08 13:39:50 +01:00
Hans-Kristian Arntzen	f9818f0804	Update license headers to 2020.	2020-01-16 15:24:37 +01:00
Hans-Kristian Arntzen	f79c1e2fed	Deal with illegal names in types as well. - Fixes issue with clip_distance flattening in MSL where member to flatten from would come from to_member_name, where it should have used the builtin name directly. This member name was modified by this patch and broke clip distance test shaders. - Some cleanups with ir.meta, use ir.find_meta instead to not create unnecessary hashmap nodes.	2020-01-16 10:34:49 +01:00
Hans-Kristian Arntzen	172e39f039	Merge pull request #1257 from KhronosGroup/fix-1236 Deal with bitcasting for subgroup Min/Max operations	2020-01-09 15:35:43 +01:00
Hans-Kristian Arntzen	cc153f8d7f	HLSL: Add a resource remapping API similar to MSL. Allows more flexibility of how resources are assigned without having to remap decorations.	2020-01-09 12:41:06 +01:00
Hans-Kristian Arntzen	88ddeec49a	HLSL: Deal with casting for WaveActiveMin/Max.	2020-01-09 12:35:18 +01:00
Hans-Kristian Arntzen	c256525c7b	Run format_all.sh.	2020-01-08 14:27:34 +01:00
Hans-Kristian Arntzen	1cbd71b354	HLSL: Fix bug when reading and writing structs from SSBO.	2020-01-08 14:27:02 +01:00
Hans-Kristian Arntzen	151ff1e870	HLSL: Implement stores for complex composites in ByteAddressBuffers.	2020-01-08 14:17:28 +01:00
Hans-Kristian Arntzen	ca9398c122	HLSL: Support loading complex composites from ByteAddressBuffer.	2020-01-08 13:05:56 +01:00
Hans-Kristian Arntzen	b9e5fe01b0	HLSL: Add support to remove register() bindings. Sometimes it's useful to get automatic binding assignment from the D3D compiler instead.	2019-11-11 11:23:21 +01:00
Hans-Kristian Arntzen	0b417b586a	HLSL: Report more explicitly which member failed validation. This will be awkward to report in GLSL where we check multiple packing standards, but for HLSL it should be easy since there's only CBuffer packing standard to worry about.	2019-11-06 11:21:39 +01:00
Hans-Kristian Arntzen	e73d9bee38	HLSL: Report which cbuffer failed validation.	2019-11-06 11:05:31 +01:00
Hans-Kristian Arntzen	6edbf0c9e9	MSL: Minor cleanups for texture atomic emulation. Storing pointers to internal objects is generally not done, IDs are preferred.	2019-10-24 11:30:20 +02:00
Hans-Kristian Arntzen	a9be92569f	HLSL: Fix unrolled S/G LE/LT/GE/GT opcodes. Need to bitcast the unrolled expressions as well.	2019-10-14 16:08:39 +02:00
Hans-Kristian Arntzen	b960ae3b70	HLSL: Partially implement Unordered compare. We cannot correctly implement unordered equal/ordered not equal without a lot of extra instructions which slows normal code down.	2019-10-14 15:15:03 +02:00
Hans-Kristian Arntzen	333980ae91	Refactor into stronger types in public API. Some fallout where internal functions are using stronger types. Overkill to move everything over to strong types right now, but perhaps move over to it slowly over time.	2019-09-06 12:29:47 +02:00
Hans-Kristian Arntzen	261b46982a	Deal with complex interlock cases in GLSL.	2019-09-04 12:18:04 +02:00
Chip Davis	2eff420d9a	Support the SPV_EXT_fragment_shader_interlock extension. This was straightforward to implement in GLSL. The `ShadingRateInterlockOrderedEXT` and `ShadingRateInterlockUnorderedEXT` modes aren't implemented yet, because we don't support `SPV_NV_shading_rate` or `SPV_EXT_fragment_invocation_density` yet. HLSL and MSL were more interesting. They don't support this directly, but they do support marking resources as "rasterizer ordered," which does roughly the same thing. So this implementation scans all accesses inside the critical section and marks all storage resources found therein as rasterizer ordered. They also don't support the fine-grained controls on pixel- vs. sample-level interlock and disabling ordering guarantees that GLSL and SPIR-V do, but that's OK. "Unordered" here merely means the order is undefined; that it just so happens to be the same as rasterizer order is immaterial. As for pixel- vs. sample-level interlock, Vulkan explicitly states: > With sample shading enabled, [the `PixelInterlockOrderedEXT` and > `PixelInterlockUnorderedEXT`] execution modes are treated like > `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT` > respectively. and: > If [the `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT`] > execution modes are used in single-sample mode they are treated like > `PixelInterlockOrderedEXT` or `PixelInterlockUnorderedEXT` > respectively. So this will DTRT for MoltenVK and gfx-rs, at least. MSL additionally supports multiple raster order groups; resources that are not accessed together can be placed in different ROGs to allow them to be synchronized separately. A more sophisticated analysis might be able to place resources optimally, but that's outside the scope of this change. For now, we assign all resources to group 0, which should do for our purposes. `glslang` doesn't support the `RasterizerOrdered` UAVs this implementation produces for HLSL, so the test case needs `fxc.exe`. It also insists on GLSL 4.50 for `GL_ARB_fragment_shader_interlock`, even though the spec says it needs either 4.20 or `GL_ARB_shader_image_load_store`; and it doesn't support the `GL_NV_fragment_shader_interlock` extension at all. So I haven't been able to test those code paths. Fixes #1002.	2019-09-02 12:31:10 -05:00
Chip Davis	39dce88d3b	MSL: Add support for sampler Y'CbCr conversion. This change introduces functions and in one case, a class, to support the `VK_KHR_sampler_ycbcr_conversion` extension. Except in the case of GBGR8 and BGRG8 formats, for which Metal natively supports implicit chroma reconstruction, we're on our own here. We have to do everything ourselves. Much of the complexity comes from the need to support multiple planes, which must now be passed to functions that use the corresponding combined image-samplers. The rest is from the actual Y'CbCr conversion itself, which requires additional post-processing of the sample retrieved from the image. Passing sampled images to a function was a particular problem. To support this, I've added a new class which is emitted to MSL shaders that pass sampled images with Y'CbCr conversions attached around. It can handle sampled images with or without Y'CbCr conversion. This is an awful abomination that should not exist, but I'm worried that there's some shader out there which does this. This support requires Metal 2.0 to work properly, because it uses default-constructed texture objects, which were only added in MSL 2. I'm not even going to get into arrays of combined image-samplers--that's a whole other can of worms. They are deliberately unsupported in this change. I've taken the liberty of refactoring the support for texture swizzling while I'm at it. It's now treated as a post-processing step similar to Y'CbCr conversion. I'd like to think this is cleaner than having everything in `to_function_name()`/`to_function_args()`. It still looks really hairy, though. I did, however, get rid of the explicit type arguments to `spvGatherSwizzle()`/`spvGatherCompareSwizzle()`. Update the C API. In addition to supporting this new functionality, add some compiler options that I added in previous changes, but for which I neglected to update the C API.	2019-09-01 18:35:53 -05:00
Hans-Kristian Arntzen	b3305799a8	Deal correctly with sign on bitfield operations. Need a lot of special purpose implementation functions for these.	2019-08-26 11:36:36 +02:00
Hans-Kristian Arntzen	4bc8729c0e	HLSL query lod cleanups.	2019-07-24 11:34:28 +02:00
Hans-Kristian Arntzen	47a18b9f1b	Simplify row-major matrix/vector multiplies.	2019-07-23 10:56:57 +02:00
Hans-Kristian Arntzen	dd7ebaf9f7	Start considering how to emit physical type ID.	2019-07-19 10:06:19 +02:00
Hans-Kristian Arntzen	a86308bce1	MSL: Begin rewrite of buffer packing logic.	2019-07-19 10:06:19 +02:00
Chip Davis	50dce10c5d	Support the SPV_EXT_demote_to_helper_invocation extension. This extension provides a new operation which causes a fragment to be discarded without terminating the fragment shader invocation. The invocation for the discarded fragment becomes a helper invocation, so that derivatives will remain defined. The old `HelperInvocation` builtin becomes undefined when this occurs, so a second new instruction queries the current helper invocation status. This is only fully supported for GLSL. HLSL doesn't support the `IsHelperInvocation` operation and MSL doesn't support the `DemoteToHelperInvocation` op. Fixes #1052.	2019-07-17 09:12:22 -05:00
Hans-Kristian Arntzen	c7eda1bce9	Test glsl.std450 more exhaustively. Make sure to test everything with scalar as well to catch any weird edge cases. Not all opcodes are covered here, just the arithmetic ones. FP64 packing is also ignored.	2019-07-17 11:53:05 +02:00
Hans-Kristian Arntzen	932ee0e328	Deal correctly with return sign of bitscan operations.	2019-07-12 10:57:56 +02:00
Chip Davis	6628ea6e48	MSL: Use the select() function for OpSelect. This significantly improves codegen for vector `OpSelect` in MSL.	2019-07-11 10:30:37 -05:00
Hans-Kristian Arntzen	d12b54bbb4	Propagate NonUniformEXT to dependent expressions. This decoration might only be present for the very last ID which is consumed by a sampling or Load/Store instruction. To make sure our access chains are emitted correctly, we have to back-propagate this decoration.	2019-07-08 11:19:38 +02:00
Hans-Kristian Arntzen	4056d0b74e	Don't use scalar dot().	2019-07-03 14:32:06 +02:00
Hans-Kristian Arntzen	041f103d44	MSL/HLSL: Support scalar reflect and refract.	2019-07-03 12:31:52 +02:00
Hans-Kristian Arntzen	f8b084de61	MSL/HLSL: Support OpOuterProduct.	2019-07-01 10:57:27 +02:00
Hans-Kristian Arntzen	ff87419607	Deal with scalar input values for distance/length/normalize. HLSL and MSL don't support it, so fall back to simpler intrinsics.	2019-06-28 11:20:14 +02:00
Hans-Kristian Arntzen	581ed0fd59	HLSL: Does not support case-fallthrough. Disable any fallthrough on HLSL. Risky business if fallthrough blocks had a barrier(), but can't do anything about that ...	2019-06-27 15:10:17 +02:00
Hans-Kristian Arntzen	f171d82590	MSL: Support MinLod operand.	2019-06-19 09:43:03 +02:00
Hans-Kristian Arntzen	96492648d4	MSL: Fix struct declaration order with complex type aliases. MSL generally emits the aliases, which means we cannot always place the master type first, unlike GLSL and HLSL. The logic fix is just to reorder after we have tagged types with packing information, rather than doing it in the parser fixup.	2019-05-23 14:54:04 +02:00
Hans-Kristian Arntzen	45a36ad034	Run format_all.sh.	2019-05-14 09:54:35 +02:00
Hans-Kristian Arntzen	647ddaee42	HLSL/MSL: Deal correctly with nonuniformEXT qualifier. MSL does not seem to have a qualifier for this, but HLSL SM 5.1 does. glslangValidator for HLSL does not support this, so skip any validation, but it passes in FXC.	2019-05-13 14:58:27 +02:00
Hans-Kristian Arntzen	e9da5ed631	HLSL: Support OpArrayLength.	2019-05-07 15:53:41 +02:00
Hans-Kristian Arntzen	2cc374a0c8	GLSL: Implement GL_EXT_buffer_reference. Buffer objects can contain arbitrary pointers to blocks. We can also implement ConvertPtrToU and ConvertUToPtr. The latter can cast a uint64_t to any type as it pleases, so we will need to generate fake buffer reference blocks to be able to cast the type.	2019-04-26 11:43:51 +02:00
Hans-Kristian Arntzen	8b236f24f1	Fix infinite loop when OpAtomic* temporaries are used in other blocks. We made the mistake of registering a dependency on the atomic variable even if the atomic result was forced to a temporary. There is no need to register reads from atomic variables like this as we always force atomic results to a temporary and argument read/writes do not need to be tracked.	2019-04-24 09:33:39 +02:00
Hans-Kristian Arntzen	7a87701ebe	Merge pull request #945 from ashleyharris-maptek-com-au/fixHlslAttributeLeak Don't apply vertex attribute remapping to other interface blocks	2019-04-12 10:40:32 +02:00
Ashley Harris	cc2d290bfe	Don't apply vertex attribute remapping to other non-vertex or non-input interface blocks	2019-04-12 13:54:58 +09:30
Hans-Kristian Arntzen	3fe57d3798	Do not use SmallVector as input type in public interfaces. This is an API break, which we need to be careful with. Handing out SmallVectors is easier since the interface is basically the same.	2019-04-09 15:09:44 +02:00
Hans-Kristian Arntzen	a489ba7fd1	Reduce pressure on global allocation. - Replace ostringstream with custom implementation. ~30% performance uplift on vector-shuffle-oom test. Allocations are measurably reduced in Valgrind. - Replace std::vector with SmallVector. Classic malloc optimization, small vectors are backed by inline data. ~ 7-8% gain on vector-shuffle-oom on GCC 8 on Linux. - Use an object pool for IVariant type. We generally allocate a lot of SPIR* objects. We can amortize these allocations neatly by pooling them. - ~15% overall uplift on ./test_shaders.py --iterations 10000 shaders/.	2019-04-09 15:09:44 +02:00
Hans-Kristian Arntzen	317144a59c	Detect invalid DoWhileLoop early. We had a bug where error conditions in DoWhileLoop emit path would not detect that statements were being emitted due to the masking behavior which happens when force_recompile is true. Fix this. Also, refactor force_recompile into member functions so we can properly break on any situation where this is set, without having to rely on watchpoints in debuggers.	2019-04-05 12:19:32 +02:00

305 Commits