When using argument buffers, handle descriptor set entry points with
recursive content, similar to discrete entry points with recursive content.
- For descriptor set entry points with recursive content, add the
descriptor set to recursive_inputs, and create a local var for it.
- For recursive entry points that are contained in a descriptor set
argument buffer, don't add the entry point to recursive_inputs, or
create a local var for that content entry point.
- Add test shader.
Constexpr samplers are defined as local variables,
but were treated as held within an argument buffer.
- Support constexpr samplers in CompilerMSL::to_sampler_expression()
when using argument buffers, and refactor to minimize generating
expression text that may not be used.
- Handle padding around multi-plane images that require multiple textures.
Only check for padding on the first plane, but include plane count in
total argument buffer slots consumed.
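The padding rule for multi-plane images can be sketched as below. This is
only an illustration of the accounting described above, not the code in
CompilerMSL: ImageResource, Layout and assign_slots() are hypothetical
names, and resources are assumed to be sorted by argument index.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical description of one image resource in an argument buffer.
struct ImageResource
{
    uint32_t arg_index;   // argument index of the first plane
    uint32_t plane_count; // 1 for ordinary images, more for multi-plane images
};

// Hypothetical result of laying out the resources.
struct Layout
{
    std::vector<uint32_t> padding_before; // padding slots ahead of each resource
    uint32_t next_index = 0;              // first free slot after the last resource
};

// Assumes `resources` is sorted by arg_index.
Layout assign_slots(const std::vector<ImageResource> &resources)
{
    Layout layout;
    for (const ImageResource &res : resources)
    {
        // Padding is checked once, against the first plane only.
        uint32_t pad = res.arg_index > layout.next_index ? res.arg_index - layout.next_index : 0;
        layout.padding_before.push_back(pad);

        // Each plane needs its own texture, so every plane consumes a slot.
        layout.next_index = std::max(layout.next_index, res.arg_index) + res.plane_count;
    }
    return layout;
}
```

With resources at indices 0 (one plane), 2 (three planes) and 5 (one
plane), only the second resource needs a padding slot, and six slots are
consumed in total.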
Metal writes to the depth/stencil attachment before fragment
shader execution if the shader does not modify the depth
value. However, Vulkan expects the write to happen after
fragment shader execution. To work around the issue, add
a simple depth passthrough if the user opts in. This is only
required when the depth/stencil attachment is used as an
input attachment at the same time. It seems Metal does not
correctly detect the dependency.
Under certain circumstances, Metal will prematurely and incorrectly
discard fragments that have side effects. The conditions are the
following:
- The fragment is always discarded after the side effect operation.
- The pre-fragment depth test fails.
- The fragment shader sets the depth to a constant value, and that
constant value also fails the depth test.
In this case, Metal discards the fragment even though the fragment
shader contains operations with side effects before the discard
operation.
Vulkan specifies that the graphics pipeline executes in the following
order:
- Pre-fragment depth test (cannot discard here because the fragment
shader modifies the depth value)
- Fragment shader (where the depth is modified and the fragment is
discarded)
- Post-fragment depth test
Therefore, in such cases we need to force fragment shader execution
and not let Metal discard the fragment before it runs. This change
adds an option to do so.
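The ordering problem can be illustrated with a toy CPU-side model. This is
not Metal or Vulkan code: run_fragment(), FragmentResult and the LESS depth
comparison are assumptions made purely for illustration. The flag
allow_early_reject stands in for Metal's premature discard.

```cpp
// Toy model of what happens to one fragment; all names are illustrative.
struct FragmentResult
{
    bool side_effect_ran = false; // e.g. a storage buffer write in the shader
    bool depth_written = false;   // never set here: the shader always discards
};

// Assumes a LESS depth comparison: a fragment passes if its depth is
// smaller than the stored depth.
FragmentResult run_fragment(float incoming_depth, float stored_depth, bool allow_early_reject)
{
    FragmentResult result;

    // Pre-fragment depth test. With allow_early_reject set, a failing
    // fragment is dropped here and the shader never runs.
    if (allow_early_reject && !(incoming_depth < stored_depth))
        return result;

    // Fragment shader: the side effect happens before the discard.
    result.side_effect_ran = true;

    // The shader then always discards, so nothing reaches the
    // post-fragment depth test and no depth is written.
    return result;
}
```

In this model, a fragment at depth 0.9 against a stored depth of 0.5 loses
its side effect when early rejection is allowed, and keeps it when shader
execution is forced.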
This adds support for bindings which share the same DescriptorSet/Binding pair.
The motivating example is vkd3d, which uses overlapping arrays of resources to
emulate D3D12 descriptor tables. The generated MSL argument buffer only
includes the first resource (in this example 't0'):
    struct spvDescriptorSetBuffer2
    {
        array<texture2d<float>, 499968> t0 [[id(0)]];
        // Overlapping binding: array<texture3d<float>, 499968> t2 [[id(0)]];
    };
When t2 is referenced, we cast the instantiated member:
    float4 r1 = spvDescriptorSet2.t0[_79].sample(...);
    float4 r2 = (*(constant array<texture3d<float>, 499968>*)&spvDescriptorSet2.t0)[_97].sample(...);
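As a rough sketch of how such a cast expression could be assembled as text:
the helper aliased_member_expr() below is hypothetical and is not a
function in SPIRV-Cross. It only reproduces the shape of the expression
shown above from its parts.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: builds the MSL expression that reinterprets the
// instantiated argument buffer member as an array of another resource type.
std::string aliased_member_expr(const std::string &set_var, const std::string &base_member,
                                const std::string &alias_elem_type, uint32_t array_size)
{
    // Take the address of the instantiated member, cast it to a pointer to
    // the aliased array type in the constant address space, and dereference.
    return "(*(constant array<" + alias_elem_type + ", " + std::to_string(array_size) + ">*)&" +
           set_var + "." + base_member + ")";
}
```

Calling aliased_member_expr("spvDescriptorSet2", "t0", "texture3d<float>",
499968) yields the cast used for r2 above, to which the index and
.sample(...) call are then appended.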
The comment claims we can't, but I tested a number of older Metal compilers (Xcode 8 targeting macOS 10.11, the macOS 10.13 online compiler, Xcode 14 targeting iOS 8) and none of them had any issues with it.
CompilerMSL::msl_options.texture_1D_as_2D emulates a Metal 1D texture
as a 2D texture in order to expand the features available for 1D textures.
Support accessing such textures as 2D for atomic_compare_exchange_weak().
An entry-point array of buffers that is not part of a Metal argument
buffer requires a known length, so it can be emitted as discrete buffers.
For runtime arrays of resources, this can be retrieved from the resource
binding information added via add_msl_resource_binding().
- Redefine get_resource_array_size() to consolidate array sizing: use the
size from the var type, and fall back to the runtime array size from the
resource bindings if the type does not provide one.
- Use get_resource_array_size() to fix issue for runtime arrays of buffers.
- Update runtime arrays of images and samplers to use get_resource_array_size().
- Add .DS_Store to .gitignore (unrelated).
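The fallback order described above can be sketched as follows. This is a
simplified stand-in, not the real get_resource_array_size():
resolve_array_size(), BindingKey and the plain map of counts are
hypothetical, with the map taking the place of the information supplied
through add_msl_resource_binding(). A size of 0 is assumed to mean that the
type declares a runtime array.

```cpp
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical key: (descriptor set, binding).
using BindingKey = std::pair<uint32_t, uint32_t>;

// type_array_size is the length declared in the variable's type, or 0 for
// a runtime array. binding_counts stands in for the resource binding info.
uint32_t resolve_array_size(uint32_t type_array_size, BindingKey key,
                            const std::map<BindingKey, uint32_t> &binding_counts)
{
    // Prefer the size found in the type.
    if (type_array_size != 0)
        return type_array_size;

    // Otherwise fall back to the count registered for this set/binding.
    auto it = binding_counts.find(key);
    return it != binding_counts.end() ? it->second : 0;
}
```

A return value of 0 in this sketch means the length is still unknown, in
which case the array cannot be emitted as discrete buffers.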
I cannot find any evidence that this does not actually work.
The original case here was from Epic's PR series in 2019, but I cannot see why it doesn't work.
It might have been a bug in a very old compiler at some point.