SPIRV-Cross/reference/opt/shaders-hlsl/frag/pixel-interlock-ordered.sm51.fxconly.frag
Chip Davis 2eff420d9a Support the SPV_EXT_fragment_shader_interlock extension.
This was straightforward to implement in GLSL. The
`ShadingRateInterlockOrderedEXT` and `ShadingRateInterlockUnorderedEXT`
modes aren't implemented yet, because we don't support
`SPV_NV_shading_rate` or `SPV_EXT_fragment_invocation_density` yet.

HLSL and MSL were more interesting. They don't support this directly,
but they do support marking resources as "rasterizer ordered," which
does roughly the same thing. So this implementation scans all accesses
inside the critical section and marks all storage resources found
therein as rasterizer ordered. They also don't support the fine-grained
controls on pixel- vs. sample-level interlock and disabling ordering
guarantees that GLSL and SPIR-V do, but that's OK. "Unordered" here
merely means the order is undefined; that it just so happens to be the
same as rasterizer order is immaterial. As for pixel- vs. sample-level
interlock, Vulkan explicitly states:

> With sample shading enabled, [the `PixelInterlockOrderedEXT` and
> `PixelInterlockUnorderedEXT`] execution modes are treated like
> `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT`
> respectively.

and:

> If [the `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT`]
> execution modes are used in single-sample mode they are treated like
> `PixelInterlockOrderedEXT` or `PixelInterlockUnorderedEXT`
> respectively.

So this will DTRT for MoltenVK and gfx-rs, at least.

MSL additionally supports multiple raster order groups; resources that
are not accessed together can be placed in different ROGs to allow them
to be synchronized separately. A more sophisticated analysis might be
able to place resources optimally, but that's outside the scope of this
change. For now, we assign all resources to group 0, which should do for
our purposes.

`glslang` doesn't support the `RasterizerOrdered` UAVs this
implementation produces for HLSL, so the test case needs `fxc.exe`.

It also insists on GLSL 4.50 for `GL_ARB_fragment_shader_interlock`,
even though the spec says it needs either 4.20 or
`GL_ARB_shader_image_load_store`; and it doesn't support the
`GL_NV_fragment_shader_interlock` extension at all. So I haven't been
able to test those code paths.

Fixes #1002.
2019-09-02 12:31:10 -05:00

25 lines
790 B
JavaScript

RWByteAddressBuffer _9 : register(u6, space0);
globallycoherent RasterizerOrderedByteAddressBuffer _42 : register(u3, space0);
RasterizerOrderedByteAddressBuffer _52 : register(u4, space0);
RWTexture2D<unorm float4> img4 : register(u5, space0);
RasterizerOrderedTexture2D<unorm float4> img : register(u0, space0);
RasterizerOrderedTexture2D<unorm float4> img3 : register(u2, space0);
RasterizerOrderedTexture2D<uint> img2 : register(u1, space0);
void frag_main()
{
_9.Store(0, uint(0));
img4[int2(1, 1)] = float4(1.0f, 0.0f, 0.0f, 1.0f);
img[int2(0, 0)] = img3[int2(0, 0)];
uint _39;
InterlockedAdd(img2[int2(0, 0)], 1u, _39);
_42.Store(0, uint(int(_42.Load(0)) + 42));
uint _55;
_42.InterlockedAnd(4, _52.Load(0), _55);
}
void main()
{
frag_main();
}