Commit Graph

218 Commits

Author SHA1 Message Date
Loic Sharma
d69a2cafe5 Accept no ops 2023-01-09 18:14:37 -08:00
Hans-Kristian Arntzen
68a012a4f2 CFG: Handle implied access to opaque loaded values.
Similar concern as access chains. Objects that we cannot lower to
temporaries must implicitly access all expression dependencies when they
are themselves accessed.
2022-12-13 16:34:00 +01:00
Chip Davis
a171087180 MSL: Support "raw" buffer input in tessellation evaluation shaders.
Using vertex-style stage input is complex, and it doesn't support
nesting of structures or arrays. By using raw buffer input instead, we
get this support "for free," and everything becomes much simpler.
Arguably, this is the way I should've done this in the first place.

Eventually, I'd like to make this the default, and then remove the
option altogether. (And I still need to do that with
`multi_patch_workgroup`...)

Should help fix 66 tests in the Vulkan CTS, under the following trees:

 - `dEQP-VK.pipeline.*.interface_matching.*`
 - `dEQP-VK.tessellation.user_defined_io.*`
 - `dEQP-VK.clipping.user_defined.*`
2022-10-18 14:58:59 -07:00
Hans-Kristian Arntzen
f3b1375b13 Add reflection support for shader record buffers.
Reflect naming scheme in a context sensitive way that matches the
frontend.

GLSL -> use block name
HLSL (DXC) -> use instance name.
2022-10-03 12:20:08 +02:00
Hans-Kristian Arntzen
6d3518e238
Merge pull request #2018 from atyuwen/master
MSL: only fix up gl_FragCoord if really necessary.
2022-09-15 11:44:38 +02:00
Bill Hollings
5493b3030e MSL: Support OpPtrEqual, OpPtrNotEqual, and OpPtrDiff.
- Add CompilerMSL::emit_binary_ptr_op() and to_ptr_expression()
  to emit binary pointer op. Compare matrix addresses without automatic
  transpose() conversion, to avoid error taking address of temporary copy.
- Add Compiler::add_active_interface_variable() to also track active
  interface vars in the entry point for SPIR-V 1.4 and above.
- For OpPtrAccessChain that ends in array element, use Element
  as offset to existing index, otherwise it will access into
  array dimension that doesn't exist.
- Dereference pointer function call arguments. Ultimately, this
  dereferencing is actually backwards, and in future, we should aim
  to properly support passing pointer variables between functions,
  but such a refactoring was beyond the scope here.
- Use [] to declare array of pointers, as array<T*> is not supported in MSL.
- Add unit test shaders.
2022-09-14 15:19:15 -04:00
Yuwen Wu
1b9296e1a5 MSL: only fix up gl_FragCoord if really necessary. 2022-09-13 18:50:57 +08:00
Hans-Kristian Arntzen
7eb5ced2a0 Refactor out query for operation type/result IDs. 2022-05-02 15:27:09 +02:00
Hans-Kristian Arntzen
7a6c2da9aa GLSL: Handle more proper semantics for RelaxedPrecision.
GLSL and RelaxedPrecision are quite different in what they affect.
RelaxedPrecision affects operations, while this is merely implied in
GLSL based on inputs.

This leads to situations where we have to promote mediump inputs to
highp, and the simplest approach is to force highp temporaries for
inputs which are consumed in a highp context. For completeness, we also
demote RelaxedPrecision inputs to mediump variables.

PHI is handled by copying the PHI into a temporary.

We have to be very careful with hoisted temporaries, since the child
temporary will not be analyzed up-front. We inherit the hoisted-ness
state and emit the hoisted child temporary as necessary. When faking the
temporaries with OpCopyObject, we make sure to block any variable
hoisting.

Hoisting children of PHI variables is fine, since PHIs are not hoisted with
the same framework as other temporaries.
2022-05-02 15:11:24 +02:00
Hans-Kristian Arntzen
1d13a3e36a Rework how loop iteration counts are validated.
Introduces an idea of a recompilation making forward progress.

There are some extreme edge cases where we need more than 3 loops, but
only allow this in specific circumstances where we can reason about
forward progress being made.
2022-01-17 14:12:01 +01:00
Hans-Kristian Arntzen
7c83fc22fa Add support for LocalSizeId.
WorkgroupSize builtin is deprecated in 1.6 and LocalSizeId is supported
in Vulkan starting with maintenance4.
2022-01-06 13:57:10 +01:00
Hans-Kristian Arntzen
37dfb3f45f
Merge pull request #1794 from etra0/master
Add 64 bit support for OpSwitch
2021-11-15 15:05:10 +01:00
Sebastián Aedo
75e3752273 Added block.cases_32bit and reworked the cases fix
Now we added block.cases_32bit as requested and we only parse if the
remaining ops are a multiple of 2. None of them are mutable because we
return a reference of them depending of the op.condition width.

Signed-off-by: Sebastián Aedo <saedo@codeweavers.com>
2021-11-12 12:50:39 -03:00
Bill Hollings
fd252b21ff Separate (partially) the tracking of depth images from depth compare ops.
SPIR-V allows an image to be marked as a depth image, but with a non-depth
format. Such images should be read or sampled as vectors instead of scalars,
except when they are subject to compare operations.

Don't mark an OpSampledImage as using a compare operation just because the
image contains a depth marker. Instead, require that a compare operation
is actually used on that image.

Compiler::image_is_comparison() was really testing whether an image is a
depth image, since it incorporates the depth marker. Rename that function
to is_depth_image(), to clarify what it is really testing.

In Compiler::is_depth_image(), do not treat an image  as a depth image
if it has been explicitly marked with a color format, unless the image
is subject to compare operations.

In CompilerMSL::to_function_name(), test for compare operations
specifically, rather than assuming them from the depth-image marker.

CompilerGLSL and CompilerMSL still contain a number of internal tests that
use is_depth_image() both for testing for a depth image, and for testing
whether compare operations are being used. I've left these as they are
for now, but these should be cleaned up at some point.

Add unit tests for fetch/sample depth images with color formats and no compare ops.
2021-11-08 15:59:45 -05:00
Hans-Kristian Arntzen
f1b411c9e8 GLSL: Deal with buffer_reference_align.
This is somewhat awkward to support, but the best effort we can do here
is to analyze various Load/Store opcodes and deduce the ideal overall
alignment based on this. This is not a 100% perfect solution, but should
be correct for any reasonable use case.

Also fix various nitpicks with BDA support while I'm at it.
2021-11-07 17:11:46 +01:00
Sebastián Aedo
f099d714f3 Removing logic in the parser
Moving out the logic from the parser as requested because it's sensitive
to try to keep the parsing the most simple process as said.

For that, the load_types is now tracked in the ParsedIR, which can be
accessed in the Compiler struct. The switch cases are fixed in the CFG
stage since that's the point where the nullptr is deref.

Signed-off-by: Sebastián Aedo <saedo@codeweavers.com>
2021-11-02 17:17:13 -03:00
Hans-Kristian Arntzen
cb613eb675 Handle value access in terminators.
Fixes case where value is created inside loop body and consumed by a
return outside it.
2021-07-29 15:27:52 +02:00
Jon Leech
f2a65545b8 Finish adding SPDX tags and setup a reuse checked in Github Actions CI 2021-06-29 11:03:52 +02:00
Hans-Kristian Arntzen
ae9ca7d73c MSL: Fix copy of arrays to/from stage IO variables.
Need to take into account effective storage classes and whether or not
we target stage IO blocks since native arrays are conditionally enabled.
2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
406af8ff4d c: Add C API for builtin stage IO reflection. 2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
b4a380a04c Support reflecting builtins.
They were ignored in input/output variables.
2021-04-19 12:10:49 +02:00
Hans-Kristian Arntzen
4ca06c7278 Handle edge cases in OpCopyMemory.
Implement this by synthesizing an OpLoad/OpStore pair instead.
2021-03-08 14:15:27 +01:00
Bill Hollings
8e03cb60a5 Expose position invariance.
Used with MSL to determine whether to compile with invariance preserved.
2021-01-28 16:13:20 -05:00
Hans-Kristian Arntzen
4704482bbc meta: Update copyright headers to 2021. 2021-01-14 16:07:49 +01:00
Hans-Kristian Arntzen
014b3bc5ea MSL: Make sure initialized output builtins are considered active. 2021-01-07 15:32:13 +01:00
Hans-Kristian Arntzen
cf1e9e0643 Add MIT dual license for the SPIRV-Cross API. 2020-12-01 16:47:08 +01:00
Hans-Kristian Arntzen
5ea576ece2 Allow flip_vert_y in all relevant stages. 2020-09-28 14:10:08 +02:00
Hans-Kristian Arntzen
66afe8c499 Implement a simple evaluator of specialization constants.
In some cases, we need to get a literal value from a spec constant op.
Mostly relevant when emitting buffers, so implement a 32-bit integer
scalar subset of the evaluator. Can be extended as needed to support
evaluating any specialization constant operation.
2020-09-14 11:45:59 +02:00
Hans-Kristian Arntzen
3afbfdb090 Implement context-sensitive expression read tracking.
When inside a loop, treat any read of outer expressions to happen
multiple times, forcing a temporary of said outer expressions.
This avoids the problem where we can end up relying on loop-invariant code motion to happen in the
compiler when converting optimized shaders.
2020-06-29 12:20:35 +02:00
Hans-Kristian Arntzen
7314f51a32 MSL: Deal with loading non-value-type arrays. 2020-06-18 12:46:39 +02:00
Hans-Kristian Arntzen
58dad82fcb Handle physical pointers in reflection API. 2020-05-25 13:45:49 +02:00
Hans-Kristian Arntzen
f9818f0804 Update license headers to 2020. 2020-01-16 15:24:37 +01:00
Bill Hollings
ef8260dea6 Expose as public Compiler::update_active_builtins() and has_active_builtin().
MoltenVK tessellation needs to be able to identify when a shader has declared
an output built-in, but does not populate it, in order to keep the expectations
about how intermediary buffers are populated aligned between tessellation stages.
2019-11-25 16:53:54 -05:00
Hans-Kristian Arntzen
8066d13599 MSL: Rewrite propagated depth comparison state handling.
Far cleaner, and more correct to run the traversal twice.
Fixes a case where we propagate depth state through multiple functions.
2019-10-26 16:10:11 +02:00
Lukas Hermanns
f3a6d28a1d Further updates for pull request #1162; also added two test cases for spvCubemapTo2DArrayFace function and added '--msl-framebuffer-fetch'/ '--msl-emulate-cube-array' compiler options. 2019-09-27 15:49:54 -04:00
Lukas Hermanns
7ad0a84778 Updates for pull request #1162 2019-09-24 14:35:25 -04:00
Lukas Hermanns
37df74035b Merge branch 'ue4_dev' 2019-09-20 09:42:42 -04:00
Lukas Hermanns
50ac6862ac Rearranged all 'UE Change' comments to match to project's coding style. 2019-09-18 14:03:54 -04:00
Lukas Hermanns
a9f3c981d9 Adjustments after rebase of ue4_dev branch. 2019-09-13 14:03:02 -04:00
Mark Satterthwaite
869d628521 The result of an AccessChain intrinsic in SPIRV can be referenced by multiple blocks but when they are loops that can result in compilation problems because the source variables might not be declared early enough. This forces us to hoist those variables high enough to make it work. 2019-09-11 14:01:40 -04:00
Mark Satterthwaite
32557e9093 SPIRV doesn't distinguish depth textures from regular textures, but Metal does, so if we've ever seen a depth comparison operation we must ensure that the texture is specified as a depth-texture. 2019-09-06 16:58:27 -04:00
Hans-Kristian Arntzen
333980ae91 Refactor into stronger types in public API.
Some fallout where internal functions are using stronger types.
Overkill to move everything over to strong types right now, but perhaps
move over to it slowly over time.
2019-09-06 12:29:47 +02:00
Hans-Kristian Arntzen
36c433bd92 Deal with call stacks when analyzing access. 2019-09-04 11:42:29 +02:00
Hans-Kristian Arntzen
3f2ce375e1 Analyze complex cases for fragment interlocks.
If we are using interlocks in split functions or in control flow, we
have some serious workarounds we need to employ.
2019-09-04 11:20:25 +02:00
Chip Davis
2eff420d9a Support the SPV_EXT_fragment_shader_interlock extension.
This was straightforward to implement in GLSL. The
`ShadingRateInterlockOrderedEXT` and `ShadingRateInterlockUnorderedEXT`
modes aren't implemented yet, because we don't support
`SPV_NV_shading_rate` or `SPV_EXT_fragment_invocation_density` yet.

HLSL and MSL were more interesting. They don't support this directly,
but they do support marking resources as "rasterizer ordered," which
does roughly the same thing. So this implementation scans all accesses
inside the critical section and marks all storage resources found
therein as rasterizer ordered. They also don't support the fine-grained
controls on pixel- vs. sample-level interlock and disabling ordering
guarantees that GLSL and SPIR-V do, but that's OK. "Unordered" here
merely means the order is undefined; that it just so happens to be the
same as rasterizer order is immaterial. As for pixel- vs. sample-level
interlock, Vulkan explicitly states:

> With sample shading enabled, [the `PixelInterlockOrderedEXT` and
> `PixelInterlockUnorderedEXT`] execution modes are treated like
> `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT`
> respectively.

and:

> If [the `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT`]
> execution modes are used in single-sample mode they are treated like
> `PixelInterlockOrderedEXT` or `PixelInterlockUnorderedEXT`
> respectively.

So this will DTRT for MoltenVK and gfx-rs, at least.

MSL additionally supports multiple raster order groups; resources that
are not accessed together can be placed in different ROGs to allow them
to be synchronized separately. A more sophisticated analysis might be
able to place resources optimally, but that's outside the scope of this
change. For now, we assign all resources to group 0, which should do for
our purposes.

`glslang` doesn't support the `RasterizerOrdered` UAVs this
implementation produces for HLSL, so the test case needs `fxc.exe`.

It also insists on GLSL 4.50 for `GL_ARB_fragment_shader_interlock`,
even though the spec says it needs either 4.20 or
`GL_ARB_shader_image_load_store`; and it doesn't support the
`GL_NV_fragment_shader_interlock` extension at all. So I haven't been
able to test those code paths.

Fixes #1002.
2019-09-02 12:31:10 -05:00
Hans-Kristian Arntzen
3ccfbce264 Run format_all.sh. 2019-08-28 14:25:26 +02:00
Hans-Kristian Arntzen
d5a65b4190 GLSL: Assume image and sampler can be RelaxedPrecision.
When merging combined image samplers, we only looked at sampler, but DXC
emits RelaxedPrecision only for texture. Does not hurt to check for more
things.
2019-08-27 17:15:19 +02:00
Hans-Kristian Arntzen
9436cd3036 MSL: Deal with array copies from and to threadgroup. 2019-08-27 13:18:01 +02:00
Hans-Kristian Arntzen
5d97dae1eb Move branchless analysis to CFG.
Traverse backwards instead, far more robust. Should elide basically all
redundant continue; statements now.
2019-08-27 10:19:19 +02:00
Hans-Kristian Arntzen
b97e9b0499 Fix severe performance issue with invariant expression invalidation.
We were going down a tree of expressions multiple times and this caused
an exponential explosion in time, which was not caught until recently.

Fix this by blocking any traversal going through an ID more than one
time.

This fix overall improves performance by almost an order of magnitude on a
particular test shader rather than slowing it down by ~75x.
2019-08-01 09:55:21 +02:00