HLSL is very fussy about fallthrough in switch blocks, so promote
Unreachable blocks to breaks if they are inside a switch construct.
Some false positives are possible in weird multi-break scenarios, but
this is benign.
Speculate that we can modify the SSA value in-place. As long as it is
not used after the modify, this is fine.
Also need to make sure we don't attempt to RMW something that is
impossible to modify.
GLSL and RelaxedPrecision are quite different in what they affect.
RelaxedPrecision affects operations, while this is merely implied in
GLSL based on inputs.
This leads to situations where we have to promote mediump inputs to
highp, and the simplest approach is to force highp temporaries for
inputs which are consumed in a highp context. For completeness, we also
demote RelaxedPrecision inputs to mediump variables.
PHI is handled by copying the PHI into a temporary.
We have to be very careful with hoisted temporaries, since the child
temporary will not be analyzed up-front. We inherit the hoisted-ness
state and emit the hoisted child temporary as necessary. When faking the
temporaries with OpCopyObject, we make sure to block any variable
hoisting.
Hoisting children of PHI variables is fine, since PHIs are not hoisted with
the same framework as other temporaries.
Test for array presence using is_array() instead of element count.
Add shaders-msl/vert/interface-block-single-element-array.vert regression test.
Fixes a regression error introduced in 3d4daab.
This allows two variables of the same struct type to be flattened
into the same interface struct without a member name conflict.
Add shaders-msl/frag/in_block_with_multiple_structs_of_same_type.frag
unit test shader to demonstrate this.
Makes codegen from typical D3D emulation SPIR-V more readable.
Also makes cross compilation with NotEqual more sensible.
It's very rare to actually need the strict NaN-checks in practice.
Also, glslang now emits UnordNotEqual by default it seems, so give up
trying to assume OrdNotEqual. Harmonize for UnordNotEqual as the sane
default.
Fixes numerous CTS tests of types
dEQP-VK.pipeline.interface_matching.vector_length.member_of_*,
passing complex nested structs between stages as stage I/O.
- Make add_composite_member_variable_to_interface_block() recursive to allow
struct members to contain nested structs, building up member names and access
chains recursively, and only add the resulting flattened leaf members to the
synthetic input and output interface blocks.
- Recursively generate individual location numbers for the flattened members
of the input/output block.
- Replace to_qualified_member_name() with append_member_name().
- Update add_variable_to_interface_block() to support arrays as struct members,
adding a member to input and output interface blocks for each element of the array.
- Pass name qualifiers to add_plain_member_variable_to_interface_block() to allow
struct members to be arrays of structs, building up member names and access chains,
and adding multiple distinct flattened leaf members to the synthetic input and
output interface blocks.
- Generate individual location numbers for the individual array members
of the input/output block.
- SPIRVCrossDecorationInterfaceMemberIndex references the index of a member
of a variable that is a struct type. The value is relative to the variable,
and for structs nested within that top-level struct, the index value needs
to take into consideration the members within those nested structs.
- Pass var_mbr_idx to add_plain_member_variable_to_interface_block() and
add_composite_member_variable_to_interface_block(), start at zero for each
variable, and increment for each member or nested member within that variable.
- Add unit test shaders-msl/vert/out-block-with-nested-struct-array.vert
- Add unit test shaders-msl/vert/out-block-with-struct-array.vert
- Add unit test shaders-msl/tese/in-block-with-nested-struct.tese
We were passing arrays by value which the compiler fails to optimize,
causing abyssal performance. To fix this, we need to consider that
descriptors can be in constant or const device address spaces.
Also, lone descriptors are passed by value, so we explicitly remove address
space qualifiers.
One failure case is when shader passes a texture/sampler array as an
argument. It's all UniformConstant in SPIR-V, but in MSL it might be
thread, const device or constant, so that won't work ...
Global variable use works fine though, and that should cover 99.9999999%
of use cases.
In cases where we know the size of the vector and the index at compile
time, we can check if it's accessing in bounds and rely in undefined
behavior otherwise.
Signed-off-by: Sebastián Aedo <saedo@codeweavers.com>
Fragment shaders that require explicit early fragment tests are incompatible
with specifying depth and stencil values within the shader. If explicit early
fragment tests is specified, remove the depth and stencil outputs from the
output structure, and replace them with dummy local variables.
Add CompilerMSL:uses_explicit_early_fragment_test() function to consolidate
testing for whether early fragment tests are required.
Add two unit tests for depth-out with, and without, early fragment tests.
SPIR-V allows an image to be marked as a depth image, but with a non-depth
format. Such images should be read or sampled as vectors instead of scalars,
except when they are subject to compare operations.
Don't mark an OpSampledImage as using a compare operation just because the
image contains a depth marker. Instead, require that a compare operation
is actually used on that image.
Compiler::image_is_comparison() was really testing whether an image is a
depth image, since it incorporates the depth marker. Rename that function
to is_depth_image(), to clarify what it is really testing.
In Compiler::is_depth_image(), do not treat an image as a depth image
if it has been explicitly marked with a color format, unless the image
is subject to compare operations.
In CompilerMSL::to_function_name(), test for compare operations
specifically, rather than assuming them from the depth-image marker.
CompilerGLSL and CompilerMSL still contain a number of internal tests that
use is_depth_image() both for testing for a depth image, and for testing
whether compare operations are being used. I've left these as they are
for now, but these should be cleaned up at some point.
Add unit tests for fetch/sample depth images with color formats and no compare ops.
This is somewhat awkward to support, but the best effort we can do here
is to analyze various Load/Store opcodes and deduce the ideal overall
alignment based on this. This is not a 100% perfect solution, but should
be correct for any reasonable use case.
Also fix various nitpicks with BDA support while I'm at it.
Previous test for SPIRVCrossDecorationPhysicalTypePacked on parent struct
when unpacking member struct was too restrictive, and not needed as long
as padding compensates.
Populate member_type_index_redirection as reverse lookup, not forward lookup.
Move use of member_type_index_redirection from CompilerMSL::to_member_reference()
to CompilerGLSL::access_chain_internal() to access all redirected type info,
not just name.
Promote to short instead and do simple casts on load/store instead.
Not 100% complete fix since structs can contain booleans, but this is
getting into pretty ridiculously complicated territory.
Emit synthetic functions before function constants.
Support use of spvQuantizeToF16() in function constants for numerical
behavior consistency with the op code.
Ensure subnormal results from OpQuantizeToF16 are flushed to zero per SPIR-V spec.
Adjust SPIRV-Cross unit test reference shaders to accommodate these changes.
Any MSL reference shader that inclues a synthetic function is affected,
since the location it is emitted has changed.
Add spvQuantizeToF16() family of synthetic functions to convert
from float to half and back again, and add function attribute
[[clang::optnone]] to honor infinities during conversions.
Adjust SPIRV-Cross unit test reference shaders to accommodate these changes.
Add [[clang::optnone]] attribute to spvF*() functions used for handling
floating point operations decorated with DecorationNoContraction.
Just using precise::fma() did not work.
Adjust SPIRV-Cross unit test reference shaders to accommodate these changes.
Based on CTS testing, math optimizations between MSL and Vulkan are inconsistent.
In some cases, enabling MSL's fast-math compilation option matches Vulkan's math
results. In other cases, disabling it does. Broadly enabling or disabling fast-math
across all shaders results in some CTS test failures either way.
To fix this, selectively enable/disable fast-math optimizations in the MSL code,
using metal::fast and metal::precise function namespaces, where supported, and
the [[clang::optnone]] function attribute otherwise.
Adjust SPIRV-Cross unit test reference shaders to accommodate these changes.
Add test shader for new functionality.
Add legacy test reference shader for unrelated buffer-bitcast
test, that doesn't seem to have been added previously.