SPIRV-Cross

Author	SHA1	Message	Date
Hans-Kristian Arntzen	261b46982a	Deal with complex interlock cases in GLSL.	2019-09-04 12:18:04 +02:00
Hans-Kristian Arntzen	63a770ed5c	Add test shader for simple case of interlocked callstack.	2019-09-04 11:56:19 +02:00
Chip Davis	2eff420d9a	Support the SPV_EXT_fragment_shader_interlock extension. This was straightforward to implement in GLSL. The `ShadingRateInterlockOrderedEXT` and `ShadingRateInterlockUnorderedEXT` modes aren't implemented yet, because we don't support `SPV_NV_shading_rate` or `SPV_EXT_fragment_invocation_density` yet. HLSL and MSL were more interesting. They don't support this directly, but they do support marking resources as "rasterizer ordered," which does roughly the same thing. So this implementation scans all accesses inside the critical section and marks all storage resources found therein as rasterizer ordered. They also don't support the fine-grained controls on pixel- vs. sample-level interlock and disabling ordering guarantees that GLSL and SPIR-V do, but that's OK. "Unordered" here merely means the order is undefined; that it just so happens to be the same as rasterizer order is immaterial. As for pixel- vs. sample-level interlock, Vulkan explicitly states: > With sample shading enabled, [the `PixelInterlockOrderedEXT` and > `PixelInterlockUnorderedEXT`] execution modes are treated like > `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT` > respectively. and: > If [the `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT`] > execution modes are used in single-sample mode they are treated like > `PixelInterlockOrderedEXT` or `PixelInterlockUnorderedEXT` > respectively. So this will DTRT for MoltenVK and gfx-rs, at least. MSL additionally supports multiple raster order groups; resources that are not accessed together can be placed in different ROGs to allow them to be synchronized separately. A more sophisticated analysis might be able to place resources optimally, but that's outside the scope of this change. For now, we assign all resources to group 0, which should do for our purposes. `glslang` doesn't support the `RasterizerOrdered` UAVs this implementation produces for HLSL, so the test case needs `fxc.exe`. It also insists on GLSL 4.50 for `GL_ARB_fragment_shader_interlock`, even though the spec says it needs either 4.20 or `GL_ARB_shader_image_load_store`; and it doesn't support the `GL_NV_fragment_shader_interlock` extension at all. So I haven't been able to test those code paths. Fixes #1002.	2019-09-02 12:31:10 -05:00
Chip Davis	39dce88d3b	MSL: Add support for sampler Y'CbCr conversion. This change introduces functions and in one case, a class, to support the `VK_KHR_sampler_ycbcr_conversion` extension. Except in the case of GBGR8 and BGRG8 formats, for which Metal natively supports implicit chroma reconstruction, we're on our own here. We have to do everything ourselves. Much of the complexity comes from the need to support multiple planes, which must now be passed to functions that use the corresponding combined image-samplers. The rest is from the actual Y'CbCr conversion itself, which requires additional post-processing of the sample retrieved from the image. Passing sampled images to a function was a particular problem. To support this, I've added a new class which is emitted to MSL shaders that pass sampled images with Y'CbCr conversions attached around. It can handle sampled images with or without Y'CbCr conversion. This is an awful abomination that should not exist, but I'm worried that there's some shader out there which does this. This support requires Metal 2.0 to work properly, because it uses default-constructed texture objects, which were only added in MSL 2. I'm not even going to get into arrays of combined image-samplers--that's a whole other can of worms. They are deliberately unsupported in this change. I've taken the liberty of refactoring the support for texture swizzling while I'm at it. It's now treated as a post-processing step similar to Y'CbCr conversion. I'd like to think this is cleaner than having everything in `to_function_name()`/`to_function_args()`. It still looks really hairy, though. I did, however, get rid of the explicit type arguments to `spvGatherSwizzle()`/`spvGatherCompareSwizzle()`. Update the C API. In addition to supporting this new functionality, add some compiler options that I added in previous changes, but for which I neglected to update the C API.	2019-09-01 18:35:53 -05:00
Hans-Kristian Arntzen	9b845a4788	Merge pull request #1141 from troughton/inline-everything MSL: Inline all non-entry-point functions	2019-08-30 11:05:04 +02:00
Thomas Roughton	91b2f34a3d	Update tests to account for all non-entry-point functions being inlined	2019-08-30 09:39:06 +12:00
Hans-Kristian Arntzen	07c76f66b5	MSL: Add {Base,}{Vertex,Instance}Index to bitcast_from_builtin_load. Totally missed these, so float(index) would not work correctly for negative numbers.	2019-08-29 13:56:37 +02:00
Chip Davis	5fe1ecc324	GLSL: Fix post-depth coverage for ESSL. ESSL does not support `GL_ARB_post_depth_coverage`. There, we must use `GL_EXT_post_depth_coverage`. I've added this as a fallback for desktop as well. Note that `GL_EXT_post_depth_coverage` also requires the fragment shader to set `early_fragment_tests` explicitly, while `GL_ARB_post_depth_coverage` does not. It doesn't really matter either way, since `SPV_KHR_post_depth_coverage` also requires both execution modes to be explicitly set.	2019-08-28 13:40:13 -05:00
Hans-Kristian Arntzen	d5a65b4190	GLSL: Assume image and sampler can be RelaxedPrecision. When merging combined image samplers, we only looked at sampler, but DXC emits RelaxedPrecision only for texture. Does not hurt to check for more things.	2019-08-27 17:15:19 +02:00
Hans-Kristian Arntzen	563e994486	Merge pull request #1135 from KhronosGroup/fix-1119 MSL: Deal with array copies from and to threadgroup.	2019-08-27 15:48:08 +02:00
Hans-Kristian Arntzen	aec826222d	Merge pull request #1134 from KhronosGroup/fix-1117 Do not allow base expressions for non-native row-major matrices.	2019-08-27 15:47:33 +02:00
Hans-Kristian Arntzen	9436cd3036	MSL: Deal with array copies from and to threadgroup.	2019-08-27 13:18:01 +02:00
Hans-Kristian Arntzen	1017a02aad	Merge pull request #1133 from KhronosGroup/fix-1115 Deal with ldexp taking uint input.	2019-08-27 13:17:43 +02:00
Hans-Kristian Arntzen	7ff2db4570	Do not allow base expressions for non-native row-major matrices.	2019-08-27 11:41:54 +02:00
Hans-Kristian Arntzen	2f7848dcda	Deal with ldexp taking uint input. Need to value cast to int first.	2019-08-27 11:19:54 +02:00
Hans-Kristian Arntzen	5d97dae1eb	Move branchless analysis to CFG. Traverse backwards instead, far more robust. Should elide basically all redundant continue; statements now.	2019-08-27 10:19:19 +02:00
Hans-Kristian Arntzen	55c2ca90ae	Elide branches to continue block when continue block is also a merge.	2019-08-27 10:19:01 +02:00
Hans-Kristian Arntzen	903ef0e40a	Merge pull request #1130 from KhronosGroup/fix-1112 Deal correctly with sign on bitfield operations.	2019-08-26 16:23:00 +02:00
Hans-Kristian Arntzen	b3305799a8	Deal correctly with sign on bitfield operations. Need a lot of special purpose implementation functions for these.	2019-08-26 11:36:36 +02:00
Hans-Kristian Arntzen	e3d4dddfec	Fix variable scope when switch block exits multiple times. Inner scope can still dominate here, so we need to be conservative when we observe switch blocks specifically. Normal selection merges cannot merge from multiple paths.	2019-08-26 10:05:43 +02:00
Hans-Kristian Arntzen	87513f9ac0	Merge pull request #1102 from KhronosGroup/fix-1096 MSL: Deal with Modf/Frexp where output is access chain to scalar.	2019-07-26 14:28:16 +02:00
Hans-Kristian Arntzen	abb345d0b3	MSL: Deal with Modf/Frexp where output is access chain to scalar. This is not allowed as we cannot take mutable reference to a vec.{x,y,z,w}. We only care about scalar since entire vectors are fine.	2019-07-26 11:02:38 +02:00
Hans-Kristian Arntzen	d620f1dd26	Do not force temporary unless continue-only for loop dominates. We would force temporaries in unexpected places, causing assertions to throw if access chains were consumed in such loops.	2019-07-26 10:39:05 +02:00
Hans-Kristian Arntzen	301eab1b7a	Merge pull request #1099 from KhronosGroup/fix-1091 Missed case where DoWhile continue block deals with Phi.	2019-07-25 17:44:17 +02:00
Hans-Kristian Arntzen	e06efb7259	Missed case where DoWhile continue block deals with Phi.	2019-07-25 12:30:50 +02:00
Hans-Kristian Arntzen	12ca9d1982	Vulkan GLSL: Support disabling samplerless texture function EXT. Some platforms support Vulkan GLSL, but not this extension apparently ...	2019-07-25 11:07:14 +02:00
Chip Davis	fb5ee4cb5c	MSL: Adjust BuiltInWorkgroupId for vkCmdDispatchBase(). This command allows the caller to set the base value of `BuiltInWorkgroupId`, and thus of `BuiltInGlobalInvocationId`. Metal provides no direct support for this... but it does provide a builtin, `[[grid_origin]]`, normally used to pass the base values for the stage input region, which we will now abuse to pass the dispatch base and avoid burning a buffer binding. `[[grid_origin]]`, as part of Metal's support for compute stage input, requires MSL 1.2. For 1.0 and 1.1, we're forced to provide a buffer. (Curiously, this builtin was undocumented until the MSL 2.2 release. Go figure.)	2019-07-24 08:56:15 -05:00
Hans-Kristian Arntzen	c62503bca7	Do not attempt to pack types which are already scalar.	2019-07-24 11:52:28 +02:00
Hans-Kristian Arntzen	4bc8729c0e	HLSL query lod cleanups.	2019-07-24 11:34:28 +02:00
Hans-Kristian Arntzen	461f1506e7	Do not eagerly invalidate all active variables on a branch. This is not necessary, as we must emit an invalidating store before we potentially consume an invalid expression. In fact, we're a bit conservative here in this case for example: int tmp = variable; if (...) { variable = 10; } else { // Consuming tmp here is fine, but it was // invalidated while emitting other branch. // Technically, we need to study if there is an invalidating store // in the CFG between the loading block and this block, and the other // branch will not be a part of that analysis. int tmp2 = tmp * tmp; } Fixing this case means complex CFG traversal everywhere, and it feels like overkill. Fixing this exposed a bug with access chains, so fix a bug where expression dependencies were not inherited properly in access chains. Access chains are now considered forwarded if there is at least one dependency which is also forwarded.	2019-07-24 11:17:30 +02:00
Hans-Kristian Arntzen	18bcc9b790	Do not disable temporary forwarding when we suppress usage tracking. This subtle bug removed any expression validation for trivially swizzled variables. Make usage suppression a more explicit concept rather than just hacking off forwarded_temporaries. There is some fallout here with loop generation since our expression invalidation is currently a bit too naive to handle loops properly. The forwarding bug masked this problem until now. If part of the loop condition is also used in the body, we end up reading an invalid expression, which in turn forces a temporary to be generated in the condition block, not good. We'll need to be smarter here ...	2019-07-23 19:18:44 +02:00
Hans-Kristian Arntzen	8ba0507a6d	Add another test for unpacking without load forwarding.	2019-07-23 17:14:59 +02:00
Hans-Kristian Arntzen	1ece67a050	Look at pointee type when unpacking expressions. We might be unpacking in OpLoad, so don't want any pointer types from access chains creeping in.	2019-07-23 17:07:15 +02:00
Hans-Kristian Arntzen	ebe109d91d	Deal correctly with non-forwarded packed loads. Need to unpack the expression if we're not forwarding.	2019-07-23 16:25:19 +02:00
Hans-Kristian Arntzen	79f533b662	Test CompositeInsert/Extract/VectorShuffle on packed vectors.	2019-07-23 15:44:35 +02:00
Hans-Kristian Arntzen	5582145549	Add test for array of scalar struct.	2019-07-23 15:30:03 +02:00
Hans-Kristian Arntzen	5c1cb7accf	Recursively pack struct types when we find scalar packed structs.	2019-07-23 15:24:53 +02:00
Hans-Kristian Arntzen	ef1fa71bba	Unpack vector expression in Matrix-Vector multiplies.	2019-07-23 12:22:40 +02:00
Hans-Kristian Arntzen	0f10601f27	Test matrix multiplies in more complex scenarios.	2019-07-23 12:12:24 +02:00
Hans-Kristian Arntzen	978253c804	Test implicit packing of struct members.	2019-07-23 12:04:15 +02:00
Hans-Kristian Arntzen	fc741596d4	Add tests for struct padding and self-alignment.	2019-07-23 11:46:34 +02:00
Hans-Kristian Arntzen	7277c7ac46	Use to_unpacked_row_major_expression to unify row-major in MSL/GLSL.	2019-07-23 11:36:54 +02:00
Hans-Kristian Arntzen	47a18b9f1b	Simplify row-major matrix/vector multiplies.	2019-07-23 10:56:57 +02:00
Hans-Kristian Arntzen	d584d833fa	Test array of std140 vectors.	2019-07-23 10:38:32 +02:00
Hans-Kristian Arntzen	6224199c76	Add struct size padding tests.	2019-07-23 10:30:37 +02:00
Hans-Kristian Arntzen	82c819ee6c	Add test for CompositeExtract from row-major loaded vector.	2019-07-22 16:32:22 +02:00
Hans-Kristian Arntzen	d7a5303cf2	Add test for split access chain into row-major matrix.	2019-07-22 16:28:05 +02:00
Hans-Kristian Arntzen	2172b19be2	Remove obsolete matrix workaround code.	2019-07-22 16:27:47 +02:00
Hans-Kristian Arntzen	609d087f8f	Only transpose unpacked expressions.	2019-07-22 16:06:09 +02:00
Hans-Kristian Arntzen	6057ffcbb1	Deal correctly with complete stores to row_major matrices.	2019-07-22 15:49:17 +02:00
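
Commit b3305799a8 in the log above ("Deal correctly with sign on bitfield operations") concerns signedness in bitfield operations. As a rough illustration of why the sign matters, the sketch below contrasts an unsigned bitfield extract with a sign-extending one, following the semantics GLSL gives `bitfieldExtract`. The helper names are hypothetical and this is not the code from the commit; it assumes `bits` is at least 1 and `offset + bits` does not exceed 32.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; not part of the SPIRV-Cross API.
// Preconditions (assumed, not checked): 1 <= bits and offset + bits <= 32.

// Unsigned extract: the bits above the field are zero-filled.
inline uint32_t bitfield_extract_unsigned(uint32_t value, int offset, int bits)
{
    // Avoid the undefined shift by 32 when the field covers the whole word.
    const uint32_t mask = (bits == 32) ? 0xffffffffu : ((1u << bits) - 1u);
    return (value >> offset) & mask;
}

// Signed extract: the top bit of the field is replicated upwards.
inline int32_t bitfield_extract_signed(uint32_t value, int offset, int bits)
{
    const uint32_t field = bitfield_extract_unsigned(value, offset, bits);
    const uint32_t sign_bit = 1u << (bits - 1);
    // Classic xor-and-subtract sign extension, done in unsigned arithmetic
    // so that wraparound is well defined.
    return static_cast<int32_t>((field ^ sign_bit) - sign_bit);
}
```

The same four bits read as 15 through the unsigned helper and as -1 through the signed one, so a translator that picks the wrong variant for an operand's type produces a different value rather than just a different type.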


780 Commits