206 Commits

Author SHA1 Message Date
Aitor Camacho
a43fabbe2a MSL: Add option to force depth write in fragment shaders
Metal writes to the depth/stencil attachment before fragment
shader execution if the execution does not modify the depth
value. However, Vulkan expects the write to happen after
fragment shader execution. To circumvent the issue we add
a simple depth passthrough if the user opts in. Only
required when the depth/stencil attachment is used as
input attachment at the same time. It seems Metal does not
correctly detect the dependency.
2024-05-24 12:13:17 +02:00
Aitor Camacho
cd8865deab Add option to enforce fragment execution with side effects in MSL
Under certain circumstances, Metal will prematurely and incorrectly
discard fragments with side effects. The conditions are the following:
 - The fragment will always be discarded after the side effect operation
 - The pre-fragment depth test fails
 - The fragment shader replaces the depth value with a constant value
   that also fails the depth test.

However, Metal will also discard the fragment even if it has
operations with side effects inside the fragment shader before the
discard operation.

Vulkan specifies that the graphics pipeline executes in the following
order:
 - Pre fragment depth test (cannot discard here due to modifying
   depth value in fragment shader)
 - Fragment shader (where the depth is modified and fragment
   discarded)
 - Post fragment depth test

Therefore, we need to enforce fragment shader execution and not
let Metal discard the fragment before that for such cases. This
change adds an option to provide such utility.
2024-05-21 11:24:56 +02:00
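The ordering problem described in commit cd8865deab can be modeled on the CPU. The following is a minimal C++ sketch under stated assumptions, not Metal or SPIRV-Cross code: `FragmentResult`, `depth_passes`, `run_fragment` and the `force_shader_execution` flag are hypothetical names, and a "less than" depth compare stands in for whatever compare op the pipeline uses.

```cpp
#include <cassert>

// Hypothetical CPU model of what happens to one fragment.
struct FragmentResult {
    bool shader_ran;    // did the fragment shader body execute?
    int  side_effects;  // number of storage writes performed by the shader
    bool written;       // did the fragment reach the framebuffer?
};

// Assumed depth compare: "less than".
inline bool depth_passes(float fragment_depth, float stored_depth) {
    return fragment_depth < stored_depth;
}

// The modeled shader performs one side effect, writes the constant
// `shader_depth`, then always discards.
inline FragmentResult run_fragment(float incoming_depth, float stored_depth,
                                   float shader_depth,
                                   bool force_shader_execution) {
    FragmentResult result{false, 0, false};

    // The premature rejection described in the commit: the pre-fragment
    // depth fails and the constant shader depth would fail as well.
    const bool early_reject = !depth_passes(incoming_depth, stored_depth) &&
                              !depth_passes(shader_depth, stored_depth);
    if (early_reject && !force_shader_execution)
        return result;  // shader skipped, side effect lost

    // Order expected by Vulkan: the shader runs, so its side effect happens.
    result.shader_ran = true;
    result.side_effects = 1;
    // The shader discards, so `written` stays false either way.
    return result;
}
```

With the flag off, the side effect is lost; with it on, the side effect happens even though the fragment is never written.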
Chip Davis
18976c4307 Add missing new MSL options to the C API and the CLI. 2023-11-27 15:03:58 -08:00
Chip Davis
7ef52b04c3 MSL: Work around broken cube texture gradients on Apple Silicon.
To date, all released Apple Silicon GPUs incorrectly interpret the
gradient vectors when sampling a cube texture. Specifically, they ignore
one of the three partial derivatives in each gradient depending on the
selected major axis, and they expect the remaining derivatives to be
partially transformed.

h/t @lexaknyazev for the code used in the `spvGradientCube()` function.

Fixes 8 tests under `dEQP-VK.glsl.texture_functions.texturegrad.*`.
2023-11-27 15:03:26 -08:00
Hans-Kristian Arntzen
a4b8553982 Style fixups. 2023-10-16 11:55:41 +02:00
Bill Hollings
16fbf8872a MSL: Workaround Metal 3.1 regression bug on recursive input structs.
Metal 3.1 introduced a regression that causes an infinite recursion
crash during Metal's analysis of an entry point input structure that itself
contains internal recursion. This patch works around this by replacing the
recursive input declaration with an alternate variable of type void*, and
then casting to the correct type at the top of the entry point function.

- Add CompilerMSL::Options::replace_recursive_inputs to enable
  replacing recursive input.
- Add Compiler::type_contains_recursion() to determine if a struct
  contains internal recursion, and add custom Decorations to mark
  such structs, to short-cut future similar checks.
- Replace recursive input struct declarations with void*,
  and emit a recast to correct type at top of entry function.
- Add unit test.
- Remove the hardcoded reference to the spirv_cross namespace in
  Compiler::type_is_top_level_block(), as it interferes with configurable
  namespaces (unrelated).
2023-10-14 14:46:47 -04:00
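The recursion check with a short-cut for repeated queries, as described in commit 16fbf8872a, might look roughly like the sketch below. This is not the SPIRV-Cross implementation: `TypeTable`, `has_cycle` and `type_contains_recursion_sketch` are hypothetical, the type graph is reduced to integer ids, and a plain map plays the role of the custom decoration used to cache the answer.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified type table: members[id] lists the ids of the
// types that type `id` refers to (directly or through a pointer).
struct TypeTable {
    std::vector<std::vector<int>> members;
    std::unordered_map<int, bool> cache;  // memoized answers per type id
};

// Depth-first search. state: 0 = unvisited, 1 = on the current path, 2 = done.
// Meeting a type that is already on the current path means a cycle.
static bool has_cycle(const TypeTable &table, int id, std::vector<int> &state) {
    state[id] = 1;
    for (int member : table.members[id]) {
        if (state[member] == 1)
            return true;
        if (state[member] == 0 && has_cycle(table, member, state))
            return true;
    }
    state[id] = 2;
    return false;
}

// True if type `id`, or any type reachable from it, refers back to itself.
bool type_contains_recursion_sketch(TypeTable &table, int id) {
    auto cached = table.cache.find(id);
    if (cached != table.cache.end())
        return cached->second;  // short-cut for repeated checks

    std::vector<int> state(table.members.size(), 0);
    const bool result = has_cycle(table, id, state);
    table.cache[id] = result;
    return result;
}
```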
Hans-Kristian Arntzen
44966e5000 MSL: Fixup nits from review. 2023-08-17 12:01:26 +02:00
Try
844cb59cd6 MSL: runtime array over argument buffers 2023-08-17 11:37:29 +02:00
Hans-Kristian Arntzen
63ea1a521b HLSL: Add CLI option for --hlsl-preserve-structured-buffers. 2023-05-19 11:38:09 +02:00
John Wells
4c622ce030 Revert "Make argument buffer padding testable"
This reverts commit 275e4d7e88.
2023-03-30 11:13:39 -04:00
John Wells
275e4d7e88 Make argument buffer padding testable 2023-03-29 21:55:32 -04:00
John Wells
e31deb8340 Fix for typo in help 2023-03-23 19:05:12 -04:00
Chip Davis
e8d419854f MSL: Add a workaround for broken level() arguments.
Some Metal devices have a bug with depth array textures using comparison
with explicit LoD, where the LoD given will be biased by some amount.
For these devices, we can use a gradient instead, which does not exhibit
this problem. As with the fragment demote workaround, this is only
expected to be needed until the bug is fixed in Metal.
2023-02-02 22:01:46 -08:00
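To illustrate how a gradient can stand in for an explicit LoD, as commit e8d419854f describes, here is a minimal C++ sketch. It assumes the common rule that the sampler derives the LoD as log2 of the length of the gradient scaled to texel space; the exact expression SPIRV-Cross emits is not shown in this log, and `Gradient2`, `gradient_for_lod` and `lod_from_gradient` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

struct Gradient2 { float x, y; };

// Build a gradient (used for both dPdx and dPdy) that should select `lod`
// on a texture of `width` x `height` texels.
inline Gradient2 gradient_for_lod(float lod, float width, float height) {
    // After scaling by the texture size both components have the same
    // magnitude, so the vector is sqrt(2) longer than each component.
    // Subtracting 0.5 from the exponent cancels that factor.
    const float scale = std::exp2(lod - 0.5f);
    return Gradient2{scale / width, scale / height};
}

// LoD derived from a gradient under the assumed rule:
// lod = log2(length(gradient * texture_size)).
inline float lod_from_gradient(Gradient2 g, float width, float height) {
    const float gx = g.x * width;
    const float gy = g.y * height;
    return std::log2(std::sqrt(gx * gx + gy * gy));
}
```

Round-tripping a LoD through `gradient_for_lod` and `lod_from_gradient` should give back the requested value, up to floating-point error.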
Bill Hollings
284ccf5d2d Fixes from code review of adding writable images to iOS Tier2 argument buffers. 2023-01-08 21:22:23 -05:00
Bill Hollings
643b7be196 MSL: Add support for writable images in iOS Tier2 argument buffers.
- Add CompilerMSL::Options::argument_buffers_tier as an enumeration to
  allow calling app to specify platform argument buffer tier capabilities.
- Support iOS writable images in Tier2 argument buffers when specified.

Tier capabilities based on recommendations from Apple engineering.
2022-12-28 12:40:37 -05:00
Chip Davis
aa5a8c482e MSL: Prevent stores to storage resources in discarded fragments.
Some Metal devices have a bug where storage resources can still be
written to even if the fragment is discarded. This is obviously a bug in
Metal, but bothering Apple to fix it will only fix it for newer
versions; therefore, a workaround is needed for older versions. I have
made this an option so that, in case the bug is ever fixed, the
workaround can be disabled.

This workaround is simple: if a fragment shader may discard its fragment
and writes to a storage resource, a variable representing the
`HelperInvocation` built-in is created and passed to all functions. The
flag is checked on all resource writes; writes do not occur when
`HelperInvocation` is `true`. This relies on the earlier workaround to
update `HelperInvocation` when the fragment is discarded.

Fixes at least 3 failures in the CTS.
2022-11-20 01:29:41 -08:00
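The guard pattern from commit aa5a8c482e can be sketched on the CPU as follows. This is an illustration only, not generated MSL: `FragmentState`, `demote` and `guarded_store` are hypothetical names, and the boolean member stands in for the `HelperInvocation` variable that is passed to all functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-fragment state.
struct FragmentState {
    bool helper_invocation = false;  // stands in for HelperInvocation
};

// Models discarding the fragment: the flag is updated manually rather than
// trusting the hardware to report it afterwards.
inline void demote(FragmentState &state) {
    state.helper_invocation = true;
}

// Every write to a storage resource checks the flag first.
inline void guarded_store(const FragmentState &state, std::vector<int> &buffer,
                          std::size_t index, int value) {
    if (!state.helper_invocation)
        buffer[index] = value;
}
```

A store issued before the discard lands in the buffer, while a store issued afterwards is skipped.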
Chip Davis
c7ce92a95b MSL: Manually update BuiltInHelperInvocation when a fragment is discarded.
Some Metal devices have a bug where `simd_is_helper_thread()` won't
return true after a fragment has been discarded. We can work around this
by manually setting `gl_HelperInvocation` upon discarding a fragment.
This is fairly unintrusive, so it is enabled by default. I've made it an
option so that, when the bug is fixed, we can disable it.
2022-11-19 23:48:26 -08:00
Hans-Kristian Arntzen
47c7fc16eb HLSL: Add option to bind vertex input semantics by name. 2022-10-26 12:41:23 +02:00
Chip Davis
0b679334e4 MSL: Don't flatten arrayed per-patch output blocks in tessellation shaders.
Flattening doesn't play well with dynamic indices. In this case, it's
better to leave it as an array of structs.

(I wanted to do this for named blocks generally. Trouble is, the builtin
`gl_out` block is *also* a named block...)

Fixes six more CTS tests, under
`dEQP-VK.tessellation.user_defined_io.per_patch_block_array.*`.
2022-10-18 15:04:42 -07:00
Chip Davis
a171087180 MSL: Support "raw" buffer input in tessellation evaluation shaders.
Using vertex-style stage input is complex, and it doesn't support
nesting of structures or arrays. By using raw buffer input instead, we
get this support "for free," and everything becomes much simpler.
Arguably, this is the way I should've done this in the first place.

Eventually, I'd like to make this the default, and then remove the
option altogether. (And I still need to do that with
`multi_patch_workgroup`...)

Should help fix 66 tests in the Vulkan CTS, under the following trees:

 - `dEQP-VK.pipeline.*.interface_matching.*`
 - `dEQP-VK.tessellation.user_defined_io.*`
 - `dEQP-VK.clipping.user_defined.*`
2022-10-18 14:58:59 -07:00
Hans-Kristian Arntzen
f09ba27777 Merge pull request #2035 from KhronosGroup/fix-2032
HLSL: Improve support for VertexInfo aux struct.
2022-10-03 14:54:07 +02:00
Hans-Kristian Arntzen
b5386e3ea9 HLSL: Improve support for VertexInfo aux struct.
Add the concept of explicit bindings for aux structs and allow querying
whether these aux structs are required.
2022-10-03 13:31:27 +02:00
Hans-Kristian Arntzen
f3b1375b13 Add reflection support for shader record buffers.
Reflect naming scheme in a context sensitive way that matches the
frontend.

GLSL -> use block name
HLSL (DXC) -> use instance name.
2022-10-03 12:20:08 +02:00
Chip Davis
064eaebe72 MSL: Add a mechanism to fix up shader outputs.
This is analogous to the existing support for fixing up shader inputs.
It is intended to be used with tessellation to add implicit builtins
that are read from a later stage, despite not being written in an
earlier stage. (Believe it or not, this is in fact legal in Vulkan.)

Helps fix 8 CTS tests under `dEQP-VK.pipeline.*.no_position`. (Eight
other tests work solely by accident without this change.)
2022-09-09 17:06:34 -07:00
Hans-Kristian Arntzen
4c345166dc GLSL: Implement task shaders.
Due to bugs in glslang / spirv-tools w.r.t. terminator instructions,
add a hack to ignore invalid SPIR-V for the time being.
2022-09-05 12:31:22 +02:00
Hans-Kristian Arntzen
06ca9accd7 HLSL: Add option to emit entry point name 1:1 instead of main().
MSL backend supports emitting custom name, and there's no reason for
HLSL to not support that as well, but we have to make it an option to
not break existing users.
2022-07-22 12:04:33 +02:00
Sergii Penner
1bba4d5137 Fix typo
Add a missing comma
2022-06-20 09:26:34 -06:00
Hans-Kristian Arntzen
0b303aab16 Add --stage handling for ray tracing. 2022-05-10 17:14:54 +02:00
Stefan Lienhard
05c9a14422 cli: display missing memory qualifiers for reflect and dump-resources 2022-04-25 22:05:34 +02:00
Hans-Kristian Arntzen
31be74a853 Add relax_nan_checks options.
Makes codegen from typical D3D emulation SPIR-V more readable.
Also makes cross compilation with NotEqual more sensible.
It's very rare to actually need the strict NaN-checks in practice.

Also, glslang now emits UnordNotEqual by default it seems, so give up
trying to assume OrdNotEqual. Harmonize for UnordNotEqual as the sane
default.
2022-03-03 14:50:56 +01:00
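The difference between the ordered and unordered comparisons mentioned in commit 31be74a853 only shows up when an operand is NaN. A minimal C++ sketch of the two semantics follows; the function names are hypothetical, and the code assumes IEEE 754 floats without fast-math style compiler flags.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Ordered not-equal: true only if neither operand is NaN and they differ.
inline bool ord_not_equal(float a, float b) {
    return !std::isnan(a) && !std::isnan(b) && a != b;
}

// Unordered not-equal: true if either operand is NaN or they differ.
inline bool unord_not_equal(float a, float b) {
    return std::isnan(a) || std::isnan(b) || a != b;
}
```

The plain `!=` operator in C-like languages already behaves like the unordered form, which is why emitting a bare `a != b` is only exact for `UnordNotEqual`; reproducing `OrdNotEqual` strictly needs the extra NaN checks.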
Daniel Thornburgh
44c3333a1c Qualify std::move.
Clang added -Wunqualified-std-cast-call in
https://reviews.llvm.org/D119670, which warns on unqualified std::move
and std::forward calls. This change qualifies these calls to allow the
project to build on HEAD Clang -Werror.
2022-03-02 23:17:58 +00:00
Hans-Kristian Arntzen
188dc8b13c Merge pull request #1862 from flokart-world/feature/flatten-ubo-for-hlsl
HLSL: Make --flatten-ubo work correctly
2022-02-16 16:39:45 +01:00
Shintaro Sakahara
ed4ded040e HLSL: Make --flatten-ubo work correctly 2022-02-16 21:53:24 +09:00
Hans-Kristian Arntzen
c716a9a5dd Add debug option to modify maximum number of compile iterations.
Should be seen as a hack, but it's pragmatic in some scenarios.
2022-02-16 12:12:27 +01:00
Hans-Kristian Arntzen
bb04156d3c CLI/HLSL: Don't set explicit binding for synthesized NumWorkgroups CBV. 2021-09-30 14:30:49 +02:00
Jon Leech
f2a65545b8 Finish adding SPDX tags and set up a REUSE check in GitHub Actions CI 2021-06-29 11:03:52 +02:00
Hans-Kristian Arntzen
d75666b170 GLSL: Emit num_views for OVR_multiview2. 2021-06-28 12:56:27 +02:00
Hans-Kristian Arntzen
585fc6f3cb MSL: Always enable support for base vertex/index on iOS.
No good reason to not just enable it in CLI.
2021-06-03 11:27:49 +02:00
Hans-Kristian Arntzen
c87cb54499 MSL: Add CLI option for sampler suffix. 2021-05-21 16:47:41 +02:00
Hans-Kristian Arntzen
26a4986009 GLSL: Implement noncoherent framebuffer fetch. 2021-05-21 14:22:57 +02:00
Hans-Kristian Arntzen
b4a380a04c Support reflecting builtins.
They were ignored in input/output variables.
2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
ee85bb345e Fix print_help comment. 2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
9c1cadd440 Add --mask-stage-output-* CLI options. 2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
4704482bbc meta: Update copyright headers to 2021. 2021-01-14 16:07:49 +01:00
Hans-Kristian Arntzen
ce18d1b8a5 CLI: Fix silly regression with handling of -V. 2021-01-08 10:51:49 +01:00
Hans-Kristian Arntzen
02b7f9cbe9 CLI: Add stdin support. 2021-01-06 11:06:41 +01:00
Hans-Kristian Arntzen
cf1e9e0643 Add MIT dual license for the SPIRV-Cross API. 2020-12-01 16:47:08 +01:00
Chip Davis
fd738e3387 MSL: Adjust FragCoord for sample-rate shading.
In Metal, the `[[position]]` input to a fragment shader remains at
fragment center, even at sample rate, like OpenGL and Direct3D. In
Vulkan, however, when the fragment shader runs at sample rate, the
`FragCoord` builtin moves to the sample position in the framebuffer,
instead of the fragment center. To account for this difference, adjust
the `FragCoord`, if present, by the sample position. The -0.5 offset is
because the fragment center is at (0.5, 0.5).

Also, add an option to force sample-rate shading in a fragment shader.
Since Metal has no explicit control for this, this is done by adding a
dummy `[[sample_id]]` which is otherwise unused, if none is already
present. This is intended to be used from e.g. MoltenVK when a
pipeline's `minSampleShading` value is nonzero.

Instead of checking if any `Input` variables have `Sample`
interpolation, I've elected to check that the `SampleRateShading`
capability is present. Since `SampleId`, `SamplePosition`, and the
`Sample` interpolation decoration require this cap, this should be
equivalent for any valid SPIR-V module. If this isn't acceptable, let me
know.
2020-11-23 10:30:24 -06:00
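The `FragCoord` adjustment from commit fd738e3387 is a small piece of arithmetic. The sketch below restates it in C++ under the assumption that the sample position is given within the pixel in the range [0, 1); `Float2` and `adjust_frag_coord` are hypothetical names, not the emitted MSL.

```cpp
#include <cassert>

struct Float2 { float x, y; };

// frag_center: the position reported at fragment center, e.g. (10.5, 20.5).
// sample_pos:  the sample position inside the pixel, each component in [0, 1).
// The -0.5 removes the fragment-center offset before the sample offset is added.
inline Float2 adjust_frag_coord(Float2 frag_center, Float2 sample_pos) {
    return Float2{frag_center.x + sample_pos.x - 0.5f,
                  frag_center.y + sample_pos.y - 0.5f};
}
```

For a fragment centered at (10.5, 20.5) with a sample at (0.25, 0.75), the adjusted coordinate is (10.25, 20.75).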
Chip Davis
68908355a9 MSL: Expand subgroup support.
Add support for declaring a fixed subgroup size. Metal, like Vulkan with
`VK_EXT_subgroup_size_control`, allows the thread execution width to
vary depending on factors such as register usage. Unfortunately, this
breaks several tests that depend on the subgroup size being what the
device says it is. So we'll fix the subgroup size at the size the device
declares. The extra invocations in the subgroup will appear to be
inactive. Because of this, the ballot mask builtins are now ANDed with
the active subgroup mask.

Add support for emulating a subgroup of size 1. This is intended to be
used by Vulkan Portability implementations (e.g. MoltenVK) when the
hardware/software combo provides insufficient support for subgroups.
Luckily for us, Vulkan 1.1 only requires that the subgroup size be at
least 1.

Add support for quadgroup and SIMD-group functions which were added to
iOS in Metal 2.2 and 2.3. This will allow clients to take advantage of
expanded quadgroup and SIMD-group support in recent Metal versions and
on recent Apple GPUs (families 6 and 7).

Gut emulation of subgroup builtins in fragment shaders. It turns out
codegen for the SIMD-group functions in fragment wasn't implemented for
AMD on Mojave; it's a safe bet that it wasn't implemented for the other
drivers either. Subgroup support in fragment shaders now requires Metal
2.2.
2020-11-20 15:55:49 -06:00
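The masking step from commit 68908355a9, where ballot mask builtins are ANDed with the active subgroup mask, can be sketched as below. This is a simplified model: it assumes the really active invocations occupy the low bits of a 64-bit mask, and `active_subgroup_mask` and `masked_ballot` are hypothetical helpers rather than SPIRV-Cross functions.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low `active_count` bits set. Counts of 64 or more yield a
// full mask, which also avoids an out-of-range shift.
inline std::uint64_t active_subgroup_mask(unsigned active_count) {
    return active_count >= 64 ? ~std::uint64_t{0}
                              : ((std::uint64_t{1} << active_count) - 1);
}

// A ballot-style mask computed for the declared (fixed) subgroup size is
// trimmed to the invocations that are actually active.
inline std::uint64_t masked_ballot(std::uint64_t raw_ballot,
                                   unsigned active_count) {
    return raw_ballot & active_subgroup_mask(active_count);
}
```

For a declared size of 32 with only 16 invocations running, a raw mask of `0xFFFFFFFF` is reduced to `0xFFFF`, so the extra invocations appear inactive.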
Hans-Kristian Arntzen
6fc2a0581a Run format_all.sh. 2020-11-08 13:59:52 +01:00