SPIRV-Cross

Author	SHA1	Message	Date
Hans-Kristian Arntzen	5555f2784b	MSL: Refactor and fix use of quadgroup vs simdgroup.	2022-02-28 11:58:33 +01:00
Hans-Kristian Arntzen	5b952d2cbf	MSL: Rethink how opaque descriptors are passed to leaf functions. We were passing arrays by value which the compiler fails to optimize, causing abysmal performance. To fix this, we need to consider that descriptors can be in constant or const device address spaces. Also, lone descriptors are passed by value, so we explicitly remove address space qualifiers. One failure case is when a shader passes a texture/sampler array as an argument. It's all UniformConstant in SPIR-V, but in MSL it might be thread, const device or constant, so that won't work ... Global variable use works fine though, and that should cover 99.9999999% of use cases.	2022-01-18 14:40:52 +01:00
Hans-Kristian Arntzen	5a5be7f9b9	MSL: Handle signed atomic min/max. C++ deduces this based on the pointer type, so cast to atomic_uint/int if we have to.	2022-01-17 15:40:58 +01:00
Bill Hollings	248e9ae9ed	MSL: Don't output depth and stencil values with explicit early fragment tests. Fragment shaders that require explicit early fragment tests are incompatible with specifying depth and stencil values within the shader. If explicit early fragment tests are specified, remove the depth and stencil outputs from the output structure, and replace them with dummy local variables. Add CompilerMSL::uses_explicit_early_fragment_test() function to consolidate testing for whether early fragment tests are required. Add two unit tests for depth-out with, and without, early fragment tests.	2021-11-12 14:17:00 -05:00
Hans-Kristian Arntzen	edf247fb1c	MSL: Workaround compiler crashes when using threadgroup bool. Promote to short instead and do simple casts on load/store. Not 100% complete fix since structs can contain booleans, but this is getting into pretty ridiculously complicated territory.	2021-10-25 10:55:11 +02:00
丛越	d52ec1e196	Fix all requested changes. test_shaders.py supports compiling MSL 2.4 shaders, and the Intersection Query currently only supports MSL 2.4 on the iOS platform.	2021-10-21 17:46:45 +08:00
丛越	597f29d09d	Support Metal 2.4 Intersection Query, Implement GL_EXT_ray_query.	2021-10-19 18:45:10 +08:00
Hans-Kristian Arntzen	325f107c5b	Merge pull request #1745 from billhollings/location-component-vecsize MSL: Track location component to match vecsize between shader stages.	2021-09-30 14:02:25 +02:00
Bill Hollings	5742047b24	MSL: Honor infinities in OpQuantizeToF16 when compiling using fast-math. Add spvQuantizeToF16() family of synthetic functions to convert from float to half and back again, and add function attribute [[clang::optnone]] to honor infinities during conversions. Adjust SPIRV-Cross unit test reference shaders to accommodate these changes.	2021-09-24 11:22:05 -04:00
Bill Hollings	548a23da34	MSL: Track location component to match vecsize between shader stages. Matching output/input struct member types between shader stages could fail if a location is shared between members, each using different components of that location, because the member vecsize was only stored once for the location. Add MSLShaderInput::component member. Use LocationComponentPair to key inputs_by_location, instead of just location. ensure_correct_input_type() pass component value as well as location.	2021-09-23 09:56:04 -04:00
Bill Hollings	86dfac12c8	MSL: Fix location and component variable matching between shader stages. Consolidate derivation of Metal 'user(locnL_C)' output/input location and component attribute qualifier, to establish SVOT across stages.	2021-09-18 18:55:12 -04:00
Bill Hollings	5fb1ca4f0d	Add support for additional ops in OpSpecConstantOp. MSL: Support op OpQuantizeToF16 in OpSpecConstantOp. All: Support op OpSRem in OpSpecConstantOp.	2021-09-03 18:20:49 -04:00
Bill Hollings	ebb5098def	MSL: Adjust gl_SampleMaskIn for sample-shading and/or fixed sample mask. Vulkan specifies that the Sample Mask Test occurs before fragment shading. This means gl_SampleMaskIn should be influenced by both sample-shading and VkPipelineMultisampleStateCreateInfo::pSampleMask. CTS tests dEQP-VK.pipeline.multisample_shader_builtin.* bear this out. For sample-shading, gl_SampleMaskIn should only have a single bit set. Since Metal does not filter for this, apply a bitmask based on gl_SampleID. For a fixed sample mask, since Metal is unaware of VkPipelineMultisampleStateCreateInfo::pSampleMask, we need to ensure that we apply it to both gl_SampleMaskIn and gl_SampleMask. This has the side effect of a redundant application of pSampleMask if the shader already includes gl_SampleMaskIn when setting gl_SampleMask, but I don't see an easy way around this. Also, simplify the logic for including the fixed sample mask in gl_ShaderMask, and print the fixed sample mask as a hex value for readability of bits.	2021-07-13 21:22:13 -04:00
Jon Leech	f2a65545b8	Finish adding SPDX tags and set up a reuse check in GitHub Actions CI	2021-06-29 11:03:52 +02:00
Hans-Kristian Arntzen	d62b3c2b92	GLSL: Implement control flow hints.	2021-06-03 12:01:49 +02:00
Hans-Kristian Arntzen	99ae0d32e9	MSL: Handle array with component when we cannot rely on user() attrib. In these cases, we emit one variable per location, and so we must flatten stuff.	2021-05-21 13:46:33 +02:00
Hans-Kristian Arntzen	e47a30e807	Honor NoContraction qualifier. We'll need to force a temporary and mark it as precise. MSL is a little weird here, but we can piggyback on top of the invariant float math option here to force fma() operations everywhere.	2021-05-07 12:59:47 +02:00
Hans-Kristian Arntzen	96ba044f01	HLSL: Fix automatic location assignment in block IO.	2021-04-20 13:04:26 +02:00
Hans-Kristian Arntzen	ae9ca7d73c	MSL: Fix copy of arrays to/from stage IO variables. Need to take into account effective storage classes and whether or not we target stage IO blocks since native arrays are conditionally enabled.	2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen	7b9a591aa7	MSL: Hoist out to_tesc_invocation_id() in more places. When emitting fixup code, we might not have gl_InvocationID yet.	2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen	75ed73818c	MSL: Handle loading Clip/CullDistance in TESE. Need to allow the flattened space to go through in some edge cases where we cannot reasonably unflatten.	2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen	23da445bd4	MSL: Emit multiple threadgroup slices for multi-patch. Multiple patches can run in the same workgroup when using multi-patch mode, so we need to allocate enough storage to avoid false sharing.	2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen	faf80b08fc	MSL: Don't report fallback location allocations as being "used". It may shadow unused real inputs and confuse applications.	2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen	5e9c2d060e	MSL: Clean up fallback IO block emission. Need to emit in add_variable_to_iface(). Unifies the code paths a fair bit.	2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen	e32c474911	MSL: Handle masking of TESC IO block members.	2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen	40f628f49c	MSL: Add test for complex control point outputs.	2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen	46c48ee6b5	MSL: Rewrite how IO blocks are emitted in multi-patch mode. Firstly, never flatten inputs or outputs in multi-patch mode. The main scenario where we do need to care is Block IO. In this case, we should only flatten the top-level member, and after that we use access chains as normal. Using structs in Input storage class is now possible as well. We don't need to consider per-location fixups at all here. In Vulkan, IO structs must match exactly. Only plain vectors can have smaller vector sizes as a special case.	2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen	425e968720	MSL: Handle flattening of patch block outputs as well. Always propagate InterfaceMember decoration.	2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen	6ecdd64a91	MSL: Emit a masked builtin IO block if necessary.	2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen	ae7bb41ef4	MSL: Test that we can mask location writes in TESC.	2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen	ba93b6518d	MSL: Fix masking of vertex block outputs.	2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen	f2b5fb3f45	MSL: Emit threadgroup storage class for masked control point outputs. Shader can still rely on writes to threadgroup memory to be visible.	2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen	9a144bb2b9	Clean up member sorting.	2021-04-19 12:10:49 +02:00
Bill Hollings	b3bfe22eaa	MSL: Fixes to support padding Metal argument buffer entries based on argument index. For buffers, support all MSLResourceBinding::basetype pointers, not just void*. Rename MSLResourceBinding::base_type to basetype for consistent use in other structs.	2021-04-18 17:34:55 -04:00
Bill Hollings	daba0dfba6	MSL: Fixes to support padding Metal argument buffer entries based on argument index. For completeness, add [[id(N)]] qualifier to padding struct members. Run clang-format.	2021-04-17 15:20:53 -04:00
Bill Hollings	9060e5a13c	MSL: Fixes to support padding Metal argument buffer entries based on argument index. Use separate lookups for texture and sampler members when padding for SamplerImages. Remove unreachable code following SPIRV_CROSS_THROW.	2021-04-16 15:00:59 -04:00
Bill Hollings	9866cf4496	MSL: Fixes to support padding Metal argument buffer entries based on argument index. Add lookup from argument buffer argument index to resource binding for efficiency. Fix error in advancing padding counts with combined image samplers. Run clang-format.	2021-04-16 09:05:15 -04:00
Bill Hollings	17dab614dc	MSL: Support padding Metal argument buffer entries based on argument index. If CompilerMSL::Options::pad_argument_buffer_resources is enabled, Metal argument buffer struct members are positionally aligned to their argument indexes by adding synthetic padding members when needed. The types and sizes of these synthetic members are identified in the resource_bindings vector provided through the API. Add CompilerMSL::Options::pad_argument_buffer_resources to enable padding Metal argument buffer structs to positionally match members to argument indexes. Add MSLResourceBinding::base_type to identify resource type through API.	2021-04-13 19:01:20 -04:00
Hans-Kristian Arntzen	97796e0609	MSL: Deal with pointer-to-pointer qualifier ordering.	2021-02-26 13:37:14 +01:00
Hans-Kristian Arntzen	621884d709	Merge pull request #1622 from KhronosGroup/fix-1619 MSL: Handle load and store to TessLevel array in TESC.	2021-02-17 20:46:06 +01:00
Hans-Kristian Arntzen	85704f70bc	MSL: Handle load and store to TessLevel array in TESC. More edge cases ... :(	2021-02-17 13:26:08 +01:00
Hans-Kristian Arntzen	ce552f4f91	MSL: Gracefully assign automatic input locations to builtin attributes.	2021-02-17 12:29:19 +01:00
Hans-Kristian Arntzen	aa271c1460	MSL: Refactor out location consumption count computation.	2021-02-17 11:29:33 +01:00
Hans-Kristian Arntzen	6f1f6775f3	Add comment where aux image atomic buffers are reflected from. They also use secondary bindings, not just samplers.	2021-02-17 10:42:58 +01:00
Hans-Kristian Arntzen	4704482bbc	meta: Update copyright headers to 2021.	2021-01-14 16:07:49 +01:00
Hans-Kristian Arntzen	893a011299	MSL: Fix various bugs with framebuffer fetch on macOS and argument buffers. Introduce a helper to make it clearer if a resource can be considered for argument buffers or not.	2021-01-08 10:19:18 +01:00
Hans-Kristian Arntzen	c4ff129fe3	MSL: Handle reserved identifiers for entry point. We only considered invalid names, and overwrote the alias for the function. The correct fix is to replace illegal names early, do the reserved fixup, then copy back alias to entry point name.	2021-01-04 09:40:11 +01:00
Hans-Kristian Arntzen	cf1e9e0643	Add MIT dual license for the SPIRV-Cross API.	2020-12-01 16:47:08 +01:00
Chip Davis	fd738e3387	MSL: Adjust FragCoord for sample-rate shading. In Metal, the `[[position]]` input to a fragment shader remains at fragment center, even at sample rate, like OpenGL and Direct3D. In Vulkan, however, when the fragment shader runs at sample rate, the `FragCoord` builtin moves to the sample position in the framebuffer, instead of the fragment center. To account for this difference, adjust the `FragCoord`, if present, by the sample position. The -0.5 offset is because the fragment center is at (0.5, 0.5). Also, add an option to force sample-rate shading in a fragment shader. Since Metal has no explicit control for this, this is done by adding a dummy `[[sample_id]]` which is otherwise unused, if none is already present. This is intended to be used from e.g. MoltenVK when a pipeline's `minSampleShading` value is nonzero. Instead of checking if any `Input` variables have `Sample` interpolation, I've elected to check that the `SampleRateShading` capability is present. Since `SampleId`, `SamplePosition`, and the `Sample` interpolation decoration require this cap, this should be equivalent for any valid SPIR-V module. If this isn't acceptable, let me know.	2020-11-23 10:30:24 -06:00
Chip Davis	68908355a9	MSL: Expand subgroup support. Add support for declaring a fixed subgroup size. Metal, like Vulkan with `VK_EXT_subgroup_size_control`, allows the thread execution width to vary depending on factors such as register usage. Unfortunately, this breaks several tests that depend on the subgroup size being what the device says it is. So we'll fix the subgroup size at the size the device declares. The extra invocations in the subgroup will appear to be inactive. Because of this, the ballot mask builtins are now ANDed with the active subgroup mask. Add support for emulating a subgroup of size 1. This is intended to be used by Vulkan Portability implementations (e.g. MoltenVK) when the hardware/software combo provides insufficient support for subgroups. Luckily for us, Vulkan 1.1 only requires that the subgroup size be at least 1. Add support for quadgroup and SIMD-group functions which were added to iOS in Metal 2.2 and 2.3. This will allow clients to take advantage of expanded quadgroup and SIMD-group support in recent Metal versions and on recent Apple GPUs (families 6 and 7). Gut emulation of subgroup builtins in fragment shaders. It turns out codegen for the SIMD-group functions in fragment wasn't implemented for AMD on Mojave; it's a safe bet that it wasn't implemented for the other drivers either. Subgroup support in fragment shaders now requires Metal 2.2.	2020-11-20 15:55:49 -06:00


357 Commits