SPIRV-Cross

Author	SHA1	Message	Date
Chip Davis	aca9b6879a	MSL: Support pull-model interpolation on MSL 2.3+. New in MSL 2.3 is a template that can be used in the place of a scalar type in a stage-in struct. This template has methods which interpolate the varying at the given points. Curiously, you can't set interpolation attributes on such a varying; perspective-correctness is encoded in the type, while interpolation must be done using one of the methods. This makes using this somewhat awkward from SPIRV-Cross, requiring us to jump through a bunch of hoops to make this all work. Using varyings from functions in particular is a pain point, requiring us to pass the stage-in struct itself around. An alternative is to pass references to the interpolants; except this will fall over badly with composite types, which naturally must be flattened. As with tessellation, dynamic indexing isn't supported with pull-model interpolation. This is because of the need to reference the original struct member in order to call one of the pull-model interpolation methods on it. Also, this is done at the variable level; this means that if one varying in a struct is used with the pull-model functions, then the entire struct is emitted as pull-model interpolants. For some reason, this was not documented in the MSL spec, though there is a property on `MTLDevice`, `supportsPullModelInterpolation`, indicating support for this, which is documented. This does not appear to be implemented yet for AMD: it returns `NO` from `supportsPullModelInterpolation`, and pipelines with shaders using the templates fail to compile. It is implemented for Intel. It's probably also implemented for Apple GPUs: on Apple Silicon, OpenGL calls down to Metal, and it wouldn't be possible to use the interpolation functions without this implemented in Metal. Based on my testing, where SPIR-V and GLSL have the offset relative to the pixel center, in Metal it appears to be relative to the pixel's upper-left corner, as in HLSL.
Therefore, I've added an offset 0.4375, i.e. one half minus one sixteenth, to all arguments to `interpolate_at_offset()`. This also fixes a long-standing bug: if a pull-model interpolation function is used on a varying, make sure that varying is declared. We were already doing this only for the AMD pull-model function, `interpolateAtVertexAMD()`; for reasons which are completely beyond me, we weren't doing this for the base interpolation functions. I also note that there are no tests for the interpolation functions for GLSL or HLSL.	2020-11-05 11:57:45 -06:00
Hans-Kristian Arntzen	e6f5ce6b89	Merge pull request #1471 from KhronosGroup/fix-1467 Work around MSVC warning.	2020-09-28 18:47:29 +02:00
Hans-Kristian Arntzen	34a6a45fba	Work around MSVC warning.	2020-09-28 14:12:54 +02:00
Hans-Kristian Arntzen	5ea576ece2	Allow flip_vert_y in all relevant stages.	2020-09-28 14:10:08 +02:00
Hans-Kristian Arntzen	66afe8c499	Implement a simple evaluator of specialization constants. In some cases, we need to get a literal value from a spec constant op. Mostly relevant when emitting buffers, so implement a 32-bit integer scalar subset of the evaluator. Can be extended as needed to support evaluating any specialization constant operation.	2020-09-14 11:45:59 +02:00
Hans-Kristian Arntzen	d573a95a9c	Run format_all.sh.	2020-07-01 11:42:58 +02:00
Hans-Kristian Arntzen	3afbfdb090	Implement context-sensitive expression read tracking. When inside a loop, treat any read of outer expressions as happening multiple times, forcing a temporary of said outer expressions. This avoids the problem where we can end up relying on loop-invariant code motion to happen in the compiler when converting optimized shaders.	2020-06-29 12:20:35 +02:00
Hans-Kristian Arntzen	7314f51a32	MSL: Deal with loading non-value-type arrays.	2020-06-18 12:46:39 +02:00
Hans-Kristian Arntzen	03d4bcea68	MSL: Improve handling of array types in buffer objects. When loading and storing array types which belong to buffer objects, we need to treat these values as not being value types. Also, need to handle array load/store from/to more address space combinations.	2020-06-18 11:49:03 +02:00
Hans-Kristian Arntzen	58dad82fcb	Handle physical pointers in reflection API.	2020-05-25 13:45:49 +02:00
Alexis Payen de la Garanderie	4edfe96739	Fixed recursion in combined_decoration_for_member. Members in nested structs were not properly iterated on, and as a result, flags like row major for matrices might not be propagated properly.	2020-04-27 15:54:16 +09:00
Hans-Kristian Arntzen	6b0e558169	Handle RayQueryKHR type. Do not error out in parsing in shaders which use ray queries.	2020-04-21 14:25:18 +02:00
Hans-Kristian Arntzen	f9818f0804	Update license headers to 2020.	2020-01-16 15:24:37 +01:00
Hans-Kristian Arntzen	cf725b4c63	Go through access chain path for OpCopyLogical. We will need to deal with packing/unpacking data when copying from/to complex types in MSL.	2020-01-06 12:29:44 +01:00
Hans-Kristian Arntzen	e2155053c5	Fix broken access tracking for OpFunctionCall results. We were looking at args[1] after incrementing args array, not before, which means we tracked garbage. This is also an out-of-bounds hazard.	2019-10-29 11:13:39 +01:00
Hans-Kristian Arntzen	648dfa5070	MSL: Ensure stable output for access chain CFG workarounds. We had output dependent on complex_continue being set, but setting that flag was dependent on unordered_set declaration order. Make it invariant to ordering and change the implementation so it knows about the new temporary hoisting for access chains.	2019-10-28 10:57:51 +01:00
Hans-Kristian Arntzen	8066d13599	MSL: Rewrite propagated depth comparison state handling. Far cleaner, and more correct to run the traversal twice. Fixes a case where we propagate depth state through multiple functions.	2019-10-26 16:10:11 +02:00
Hans-Kristian Arntzen	830e24c4ba	MSL: Do read-only lookups of access_chain_children.	2019-10-26 16:10:11 +02:00
Lukas Hermanns	7ad0a84778	Updates for pull request #1162	2019-09-24 14:35:25 -04:00
Lukas Hermanns	37df74035b	Merge branch 'ue4_dev'	2019-09-20 09:42:42 -04:00
Lukas Hermanns	50ac6862ac	Rearranged all 'UE Change' comments to match the project's coding style.	2019-09-18 14:03:54 -04:00
Lukas Hermanns	a9f3c981d9	Adjustments after rebase of ue4_dev branch.	2019-09-13 14:03:02 -04:00
Hans-Kristian Arntzen	bfa76ee2ab	Consider discard and demote as impure statements. Fixes cases where discard and demote are called in pure functions and the function result is not consumed.	2019-09-12 14:21:10 +02:00
Mark Satterthwaite	869d628521	The result of an AccessChain intrinsic in SPIRV can be referenced by multiple blocks but when they are loops that can result in compilation problems because the source variables might not be declared early enough. This forces us to hoist those variables high enough to make it work.	2019-09-11 14:01:40 -04:00
Mark Satterthwaite	32557e9093	SPIRV doesn't distinguish depth textures from regular textures, but Metal does, so if we've ever seen a depth comparison operation we must ensure that the texture is specified as a depth-texture.	2019-09-06 16:58:27 -04:00
Hans-Kristian Arntzen	333980ae91	Refactor into stronger types in public API. Some fallout where internal functions are using stronger types. Overkill to move everything over to strong types right now, but perhaps move over to it slowly over time.	2019-09-06 12:29:47 +02:00
Hans-Kristian Arntzen	261b46982a	Deal with complex interlock cases in GLSL.	2019-09-04 12:18:04 +02:00
Hans-Kristian Arntzen	36c433bd92	Deal with call stacks when analyzing access.	2019-09-04 11:42:29 +02:00
Hans-Kristian Arntzen	3f2ce375e1	Analyze complex cases for fragment interlocks. If we are using interlocks in split functions or in control flow, we have some serious workarounds we need to employ.	2019-09-04 11:20:25 +02:00
Chip Davis	2eff420d9a	Support the SPV_EXT_fragment_shader_interlock extension. This was straightforward to implement in GLSL. The `ShadingRateInterlockOrderedEXT` and `ShadingRateInterlockUnorderedEXT` modes aren't implemented yet, because we don't support `SPV_NV_shading_rate` or `SPV_EXT_fragment_invocation_density` yet. HLSL and MSL were more interesting. They don't support this directly, but they do support marking resources as "rasterizer ordered," which does roughly the same thing. So this implementation scans all accesses inside the critical section and marks all storage resources found therein as rasterizer ordered. They also don't support the fine-grained controls on pixel- vs. sample-level interlock and disabling ordering guarantees that GLSL and SPIR-V do, but that's OK. "Unordered" here merely means the order is undefined; that it just so happens to be the same as rasterizer order is immaterial. As for pixel- vs. sample-level interlock, Vulkan explicitly states: "With sample shading enabled, [the `PixelInterlockOrderedEXT` and `PixelInterlockUnorderedEXT`] execution modes are treated like `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT` respectively." and: "If [the `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT`] execution modes are used in single-sample mode they are treated like `PixelInterlockOrderedEXT` or `PixelInterlockUnorderedEXT` respectively." So this will DTRT for MoltenVK and gfx-rs, at least. MSL additionally supports multiple raster order groups; resources that are not accessed together can be placed in different ROGs to allow them to be synchronized separately. A more sophisticated analysis might be able to place resources optimally, but that's outside the scope of this change. For now, we assign all resources to group 0, which should do for our purposes. `glslang` doesn't support the `RasterizerOrdered` UAVs this implementation produces for HLSL, so the test case needs `fxc.exe`.
It also insists on GLSL 4.50 for `GL_ARB_fragment_shader_interlock`, even though the spec says it needs either 4.20 or `GL_ARB_shader_image_load_store`; and it doesn't support the `GL_NV_fragment_shader_interlock` extension at all. So I haven't been able to test those code paths. Fixes #1002.	2019-09-02 12:31:10 -05:00
Chip Davis	39dce88d3b	MSL: Add support for sampler Y'CbCr conversion. This change introduces functions and in one case, a class, to support the `VK_KHR_sampler_ycbcr_conversion` extension. Except in the case of GBGR8 and BGRG8 formats, for which Metal natively supports implicit chroma reconstruction, we're on our own here. We have to do everything ourselves. Much of the complexity comes from the need to support multiple planes, which must now be passed to functions that use the corresponding combined image-samplers. The rest is from the actual Y'CbCr conversion itself, which requires additional post-processing of the sample retrieved from the image. Passing sampled images to a function was a particular problem. To support this, I've added a new class which is emitted to MSL shaders that pass sampled images with Y'CbCr conversions attached around. It can handle sampled images with or without Y'CbCr conversion. This is an awful abomination that should not exist, but I'm worried that there's some shader out there which does this. This support requires Metal 2.0 to work properly, because it uses default-constructed texture objects, which were only added in MSL 2. I'm not even going to get into arrays of combined image-samplers--that's a whole other can of worms. They are deliberately unsupported in this change. I've taken the liberty of refactoring the support for texture swizzling while I'm at it. It's now treated as a post-processing step similar to Y'CbCr conversion. I'd like to think this is cleaner than having everything in `to_function_name()`/`to_function_args()`. It still looks really hairy, though. I did, however, get rid of the explicit type arguments to `spvGatherSwizzle()`/`spvGatherCompareSwizzle()`. Update the C API. In addition to supporting this new functionality, add some compiler options that I added in previous changes, but for which I neglected to update the C API.	2019-09-01 18:35:53 -05:00
Hans-Kristian Arntzen	3ccfbce264	Run format_all.sh.	2019-08-28 14:25:26 +02:00
Hans-Kristian Arntzen	d5a65b4190	GLSL: Assume image and sampler can be RelaxedPrecision. When merging combined image samplers, we only looked at sampler, but DXC emits RelaxedPrecision only for texture. Does not hurt to check for more things.	2019-08-27 17:15:19 +02:00
Hans-Kristian Arntzen	9436cd3036	MSL: Deal with array copies from and to threadgroup.	2019-08-27 13:18:01 +02:00
Hans-Kristian Arntzen	5d97dae1eb	Move branchless analysis to CFG. Traverse backwards instead, far more robust. Should elide basically all redundant continue; statements now.	2019-08-27 10:19:19 +02:00
Hans-Kristian Arntzen	d620f1dd26	Do not force temporary unless continue-only for loop dominates. We would force temporaries in unexpected places, causing assertions to throw if access chains were consumed in such loops.	2019-07-26 10:39:05 +02:00
Hans-Kristian Arntzen	e06efb7259	Missed case where DoWhile continue block deals with Phi.	2019-07-25 12:30:50 +02:00
Hans-Kristian Arntzen	a86308bce1	MSL: Begin rewrite of buffer packing logic.	2019-07-19 10:06:19 +02:00
Lifeng Pan	5ca8779044	Parse SPIR-V debug information extended instructions, as well as OpNoLine. No impact on result shader string.	2019-07-04 16:21:44 +08:00
Hans-Kristian Arntzen	b4e0163749	Run format_all.sh.	2019-06-21 16:02:22 +02:00
Hans-Kristian Arntzen	2b11b331d6	Merge pull request #1036 from KhronosGroup/msl-auto-binding MSL: Rewrite how resources are automatically assigned bindings.	2019-06-21 15:58:50 +02:00
Hans-Kristian Arntzen	c365cc1b43	Deal with OpPhi and case fallthrough. This is quite complex since we cannot flush Phi inside the case labels, we have to do it outside by emitting a lot of manual branches ourselves. This should be extremely rare, but we need to handle this case.	2019-06-21 13:38:23 +02:00
Hans-Kristian Arntzen	e2c95bdcbc	MSL: Rewrite how resource indices are fallback-assigned. We used to use the Binding decoration for this, but this method is hopelessly broken. If no explicit MSL resource remapping exists, we remap automatically in a manner which should always "just work".	2019-06-21 12:54:08 +02:00
Hans-Kristian Arntzen	457eba355e	Employ heuristics to figure out how to emit SSBO/UAV reflection names. This is rather shaky, but we don't have many choices here except to add a lot of awkward and unintuitive options. Try to deduce this from OpSource and fall back to a heuristic.	2019-06-10 11:24:24 +02:00
Hans-Kristian Arntzen	6b52b0fe8b	Deal with nested loops. Actually need to hoist out variable to outermost loop.	2019-06-06 14:37:02 +02:00
Hans-Kristian Arntzen	02ae99f399	Use the existing loop dominator when doing loop variable preservation.	2019-06-06 12:22:28 +02:00
Hans-Kristian Arntzen	bf56dc88b9	Rewrite how loop dominators are propagated. Do this analysis in the CFG stage rather than last minute with the ad-hoc algorithm we had in place before CFG was introduced.	2019-06-06 12:17:46 +02:00
Hans-Kristian Arntzen	03d93abc1a	Deal with case where a variable is dominated by inner part of a loop. There is a risk that we try to preserve a loop variable through multiple iterations, even though the dominating block is inside a loop. Fix this by analyzing if a block starts off by writing to a variable. In that case, there cannot be any preservation going on. If we don't, pretend the loop header is reading the variable, which moves the variable to an appropriate scope.	2019-06-06 11:11:44 +02:00
Hans-Kristian Arntzen	65af09d2d1	Support emitting OpLine directive. Facilitates easier mapping from source language to cross-compiled output in tooling.	2019-05-28 13:44:24 +02:00
Hans-Kristian Arntzen	42e64597a7	OpArrayLength must trigger active variables.	2019-05-27 16:44:02 +02:00
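
The offset adjustment described in commit aca9b6879a (adding 0.4375, i.e. one half minus one sixteenth, to every argument of `interpolate_at_offset()`) can be sketched as below. This is a minimal illustration only, not code from SPIRV-Cross: the `Offset2` struct, the constant name, and the `to_msl_interpolation_offset()` helper are all hypothetical, and the bias value is taken solely from the commit message's reported testing.

```cpp
// Hypothetical 2D offset type; SPIRV-Cross itself emits shader source text
// rather than computing offsets like this on the host.
struct Offset2
{
    float x;
    float y;
};

// Bias quoted in the commit message: one half minus one sixteenth.
constexpr float kPullModelOffsetBias = 0.5f - 0.0625f; // 0.4375

// Convert an offset expressed relative to the pixel center (SPIR-V / GLSL
// convention) into the value the commit passes to interpolate_at_offset().
inline Offset2 to_msl_interpolation_offset(Offset2 center_relative)
{
    return Offset2{ center_relative.x + kPullModelOffsetBias,
                    center_relative.y + kPullModelOffsetBias };
}
```

Under these assumptions, a center-relative offset of (0, 0) maps to (0.4375, 0.4375), and every other offset is shifted by the same constant on both axes.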

336 Commits