1116 Commits

Author SHA1 Message Date
Nathan Gauër
07f49ce65d
spirv-opt: make traversal deterministic (#5790)
Related to https://github.com/microsoft/DirectXShaderCompiler/issues/6804
2024-09-09 08:59:51 -04:00
Laura Hermanns
b31baff4ee
[opt] Add struct-packing pass and unit test. (#5778)
This pass allows re-assigning offset layout decorations
to tightly pack a struct according to its packing rules.
2024-09-05 15:24:29 -04:00
Steven Perron
0c40b591a3
[OPT] Add SPV_KHR_ray_tracing_position_fetch to allow lists (#5757)
Fixes https://github.com/microsoft/DirectXShaderCompiler/issues/6844
2024-08-21 11:05:43 -04:00
Steven Perron
246daf246b
[OPT] Avoid assert in generatecopy (#5756)
We want to be able to recover when fix storage class is not able to fix
everything, and just leave the spir-v in an invalid state. The pass
should not fail because of that.
2024-07-31 14:11:45 -07:00
Steven Perron
81a116002b
[opt] Fix uses of type manager in fix storage class (#5740)
This removes some uses of the type manager. One use could not be
removed. Instead I had to update GenCopy to not use the type manager,
and be able to copy pointers.

Part of #5691
2024-07-24 14:42:00 +02:00
Steven Perron
ca373497f1
[opt] Fix pointer stores in DCE (#5739)
When trying to mark stores that use the same address as a load as live, we
consider any use of the pointer in the store instruction enough to make
the store live. This is not correct. We should only mark the store as
live if it stores to the pointer, not if it stores the pointer to another
memory location.

This causes DCE to miss some dead code.
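
A minimal sketch of the distinction, assuming a simplified store with just a pointer operand and a value operand. The `Store` class and both helper functions are hypothetical illustrations, not SPIRV-Tools code:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Store:
    """Hypothetical stand-in for OpStore: writes value_id to the address pointer_id."""
    pointer_id: int
    value_id: int


def store_is_live_old(store: Store, loaded_ptr: int) -> bool:
    # Behavior described as incorrect: any use of the pointer keeps the store.
    return loaded_ptr in (store.pointer_id, store.value_id)


def store_is_live_fixed(store: Store, loaded_ptr: int) -> bool:
    # Only a store *to* the loaded pointer can affect the load.
    return store.pointer_id == loaded_ptr


writes_to_p = Store(pointer_id=10, value_id=20)         # store through %10
stores_p_elsewhere = Store(pointer_id=30, value_id=10)  # %10 is only the stored value
```

With a load from `%10`, the old check keeps `stores_p_elsewhere` alive even though it never writes through `%10`; the fixed check keeps only `writes_to_p`.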
2024-07-24 14:36:26 +02:00
Nathan Gauër
2ea4003633
opt: split composite from array flattening (#5733)
* opt: split composite from array flattening

DXC has an option to flatten resource arrays. But when this option
is not used, the resource arrays should be kept as-is.
On the other hand, when a struct contains resources, we MUST flatten it
to be compliant with the Vulkan spec.

Because this pass flattens both types of resources, using a struct of
resources automatically implied flattening arrays.
By adding those 2 new settings, we decide if the pass flattens only one type
of resources, or both.
Note: the flatten_arrays flag only impacts resource arrays.
Arrays of composites containing resources are still flattened.

Since the API is considered stable, I added 2 new functions to create
passes with one flag or the other, and kept the original behavior as-is.

Related to https://github.com/microsoft/DirectXShaderCompiler/issues/6745

Signed-off-by: Nathan Gauër <brioche@google.com>

* add commandline options

Signed-off-by: Nathan Gauër <brioche@google.com>

* clang-format

Signed-off-by: Nathan Gauër <brioche@google.com>

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-07-19 11:48:21 -04:00
Steven Perron
6248fda376
Handle coop matrix in fix storage class (#5729)
2024-07-17 15:22:32 +02:00
Victor Lomuller
3bc9744d0a
Add FPEncoding operand type. (#5726)
This patch adds the optional FPEncoding operand that can be added to OpTypeFloat.
At the moment there is no usable operand, so support is limited to adding the entry.

Co-authored-by: Kévin Petit <kevin.petit@arm.com>
Co-authored-by: David Neto <dneto@google.com>
2024-07-03 13:18:40 -04:00
Steven Perron
ca004da9f9
Add knowledge of cooperative matrices (#5720)
* Add knowledge of cooperative matrices

Some optimizations are not aware of cooperative matrices, and either do
nothing or assert. This commit fixes that up.

* Add int tests, and handle a couple more cases.

* Add float tests, and handle a couple more cases.

* Add NV coop matrix as well.
2024-06-26 08:00:29 -04:00
Steven Perron
581279dedd
[OPT] Zero-extend unsigned 16-bit integers when bitcasting (#5714)
The folding rule `BitCastScalarOrVector` was incorrectly handling
bitcasting to unsigned integers smaller than 32-bits. It was simply
copying the entire 32-bit word containing the integer. This conflicts with the
requirement in section 2.2.1 of the SPIR-V spec which states that
unsigned numeric types with a bit width less than 32-bits must have the
high-order bits set to 0.
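
A rough sketch of the rule, assuming the narrow integer is stored in a single 32-bit word. The helper names are illustrative and not the refactored SPIRV-Tools functions; `sign_extend` is shown only for contrast with the unsigned case:

```python
def zero_extend(value: int, bit_width: int) -> int:
    """Keep the low bit_width bits and clear the rest of the 32-bit word."""
    if not 0 < bit_width <= 32:
        raise ValueError("bit_width must be in 1..32")
    return value & ((1 << bit_width) - 1)


def sign_extend(value: int, bit_width: int) -> int:
    """Copy the sign bit of a bit_width-bit value into the high bits of the word."""
    if not 0 < bit_width <= 32:
        raise ValueError("bit_width must be in 1..32")
    low = value & ((1 << bit_width) - 1)
    if low & (1 << (bit_width - 1)):
        low |= 0xFFFFFFFF & ~((1 << bit_width) - 1)
    return low


word = 0xABCD8001  # a 32-bit word whose low half is the 16-bit value being bitcast
```

Copying `word` unchanged would leave `0xABCD` in the high-order bits; `zero_extend(word, 16)` clears them.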

This change includes a refactor of the bit extension code to be able to
test it better, and to use it in multiple files.

Fixes https://github.com/microsoft/DirectXShaderCompiler/issues/6319.
2024-06-19 19:17:05 +02:00
Shahbaz Youssefi
7564e142d6
spirv-dis: Add --nested-indent and --reorder-blocks (#5671)
With --nested-indent, the SPIR-V blocks are nested according to the
structured control flow.  Each OpLabel is nested that much with the
contents of the block nested a little more.  The blocks are separated by
a blank line for better visualization.

With --reorder-blocks, the SPIR-V blocks are reordered according to the
structured control flow.  This is particularly useful with
--nested-indent.

Note that with --nested-indent, the disassembly does not exactly show
the binary as-is, and the instructions may be reordered.
2024-06-17 09:54:18 -04:00
Nathan Gauër
bc28ac7c19
opt: add OpExtInst forward ref fixup pass (#5708)
This pass fixes up the opcode used for OpExtInst instructions
to use OpExtInstWithForwardRefsKHR when it contains a forward
reference.
This pass is agnostic to the extension used, hence the validity
of the code depends on the validity of the usage:

If a forward reference is used on a non-semantic extended instruction,
the generated code will remain invalid, but the opcode will change.
What this pass guarantees is valid code won't become invalid.

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
Co-authored-by: Steven Perron <stevenperron@google.com>
2024-06-13 02:09:58 -07:00
Nathan Gauër
65d30c3150
opt: fix Subgroup* trimming (#5706)
PR #5648 added support for the GroupNonUniformPartitionedNV. But there
was an issue: the opcodes are enabled by multiple capabilities, and the
actual operand is what matters.

Added testing coverage and the implementation to correctly trim a few
NonUniform capabilities.

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-06-11 17:13:46 +02:00
Nathan Gauër
ce46482db7
Add KHR suffix to OpExtInstWithForwardRef opcode. (#5704)
The KHR suffix was missing from the published SPIR-V extension.
This is now fixed, but requires some patches in SPIRV-Tools.

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-06-06 03:17:51 -07:00
Nathan Gauër
6a2bdeee75
spirv-val, core: add support for OpExtInstWithForwardRefs (#5698)
* val, core: add support for OpExtInstWithForwardRefs

This commit adds validation and support for
OpExtInstWithForwardRefs. This new instruction will be used
for non-semantic debug info, when forward references are
required.

For now, this commit only fixes the code to handle this new instruction,
and adds validation rules. But it does not add the pass to generate/fix
the OpExtInst instruction when forward references are in use.
Such pass would be useful for DXC or other tools, but I wanted to land
validation rules first.

This commit also bumps SPIRV-Headers to get this new opcode.

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-06-04 16:18:06 +02:00
Steven Perron
4a2e0c9b36
Fix comments in liveness.h (#5699)
Addressed comments from #5693 that were not fixed before merging.
2024-06-03 12:05:04 -04:00
Steven Perron
fd96922e9a
Remove calls to GetId in liveness analysis (#5693)
Part of #5691
2024-06-03 15:21:14 +02:00
Steven Perron
95681dc42f
Remove implicit call to GetId in ConvertToSampledImagePass. (#5692)
We replace getting the id of a pointer type with a specific function
call to FindPointerToType. Also, FindPointerToType is updated to not
indirectly call GetId. This leads to a linear search for an existing
type in all cases, but it is necessary.

Note that this function could have a similar problem. There could be two
pointer types with the same pointee and storage class, and the first one
will be returned. I have checked the ~20 uses, and they are all used in
situations where the id is used to create something new, and it does not
have to match an existing type. These will not cause problems.

Part of #5691
2024-06-03 15:07:52 +02:00
Steven Perron
148c97f687
Avoid use of type manager in extact->construct folding (#5684)
* Avoid use of type manager in extact->construct folding

When dealing with structs the type manager merges two different structs
into a single entry if they have all of the same decorations and
element types. This is because they hash to the same value in the hash
table. This can cause problems if you need to get the id of a type from
the type manager because you could get either one. In this case, it
returns the wrong one.
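
The failure mode can be sketched with a toy table keyed on structure alone. `ToyTypeManager` and its methods are hypothetical and much simpler than the real type manager:

```python
from typing import Dict, Tuple

# Structural key: (element type names, decorations). Purely illustrative.
StructKey = Tuple[Tuple[str, ...], Tuple[str, ...]]


class ToyTypeManager:
    def __init__(self) -> None:
        self._id_by_key: Dict[StructKey, int] = {}

    def register(self, type_id: int, key: StructKey) -> None:
        # The first registration wins; later identical structs share the entry.
        self._id_by_key.setdefault(key, type_id)

    def get_id(self, key: StructKey) -> int:
        return self._id_by_key[key]


manager = ToyTypeManager()
key: StructKey = (("float", "int"), ())
manager.register(7, key)  # %7 = OpTypeStruct %float %int
manager.register(9, key)  # %9 = OpTypeStruct %float %int (distinct id, same shape)
```

Asking this table for the id of the struct originally declared as `%9` yields `7`, which is why keeping the original id around is safer than looking it up again.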

The fix avoids using the type manager in one place. I have not
looked closely at other places the type manager is used to make
sure it is used safely everywhere.

Fixes #5624

* Remove use of TypeManager::GetId

This removes a use of TypeManager::GetId by keeping the id around. This
avoids a potential problem if the type manager gets confused. These types
of bugs are hard to generate test cases for, so I do not have a test.
However, existing tests make sure that we do not regress.
2024-05-31 14:13:20 +02:00
Steven Perron
336b5710a5
Do not fold mul and adds to generate fmas (#5682)
This removes the folding rules added in #4783 and #4808. They lead to
poor code generation on Adreno devices when 16-bit floating point values
were used. Since this transformation is supposed to be neutral,
there is no general reason to continue doing it.

I have talked to the owners of SwiftShader, and they do not mind if the
transform is removed. They were the ones who requested the change in the
first place.

Fixes #5658
2024-05-22 13:01:26 -04:00
Jeremy Gebben
9241a58a80
opt: Remove bindless and buff addr instrumentation passes (#5657)
These were only used by Vulkan-Validation layers, but they
have been replaced by other code for several months.
2024-05-02 18:52:17 -04:00
Natalie Chouinard
67a3ed6705
opt: add GroupNonUniformPartitionedNV capability to trim pass (#5648)
2024-04-18 16:04:58 -04:00
Diego Novillo
3983d15a1d
Fix rebuilding types with circular references (#5623). (#5637)
This fixes the problem reported in #5623 using the observation that if
we are re-building a type that already exists in the type pool, we
should just return that type.

This makes type rebuilding more efficient, and it also prevents the
type builder from getting itself into infinite recursion (as reported in
this issue).

In fixing this, I found a couple of other bugs in the type builder:

- When rebuilding an Array type, we were not re-building the element
  type. This caused stale type references in the rebuilt type.

- This bug had not been caught by the test, because the test itself had
  a bug in it: the test was rebuilding types on top of the same ID (the
  ID counter was never incremented).

Initially, the bug in the test caused a failure with the new logic in
the builder because we now return types from the pool directly, which
causes a failure when two incompatible types are registered under the
same ID.

Fixing that issue in the test exposed another bug in the rebuilder: we
were not re-building the element type for Array types. This was causing
a stale type reference inside Array types which was later caught by the
type removal logic in the test.
2024-04-09 10:36:21 -04:00
Jeremy Hayes
ade1f7cfd7
Add AliasedPointer decoration (#5635)
Fix #5607

When inlining, decorate return variable with AliasedPointer if the
storage class of the pointee type is PhysicalStorageBuffer.
2024-04-05 11:45:55 -06:00
Romaric Jodin
f20663ca7f
add support for vulkan-shader-profiler external passes (#5512)
2024-03-15 13:46:42 -04:00
Kévin Petit
f869d391a5
[OPT] Fix handling of analyses rebuild (#5608)
All tests treat kAnalysisEnd like STL end iterators, which
means its value must be greater than that of the last valid
Analysis.
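
A small sketch of the end-sentinel convention, assuming the analyses are single-bit flags. The flag names below are made up for illustration; only the role of the `END` value mirrors kAnalysisEnd:

```python
from enum import IntFlag
from typing import List


class Analysis(IntFlag):
    NONE = 0
    DEF_USE = 1 << 0
    CFG = 1 << 1
    DOMINATORS = 1 << 2
    END = 1 << 3  # one past the last valid flag, like an STL end iterator


def valid_analyses() -> List[Analysis]:
    """Walk every single-bit flag strictly below END."""
    found = []
    bit = 1
    while bit < Analysis.END:
        found.append(Analysis(bit))
        bit <<= 1
    return found
```

If `END` were equal to the last valid flag instead of greater than it, a loop like this would silently skip that analysis.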


Change-Id: Ibfaaf60bb450c508af0528dbe9c0729e6aa07b3b

Signed-off-by: Kevin Petit <kevin.petit@arm.com>
2024-03-12 09:09:46 +00:00
Steven Perron
7604147c25
[OPT] Add removed unused interface var pass to legalization passes (#5579)
DXC does not do a good job of recognizing which variables need to be
on the entry point for which functions. This is because it does not
want to have to walk the call tree to determine which instructions
are reachable from which entry points.

This is also useful if the same input variable gets used from two
different shaders, but the uses in one get optimized away.

Will partially fix
https://github.com/microsoft/DirectXShaderCompiler/issues/4621. Will not
fix code compiled with -fcgl.
2024-02-14 13:08:25 -05:00
Steven Perron
e08c012b19
[OPT] Identify arrays with unknown length in copy prop arrays (#5570)
* [OPT] Identify arrays with unknown length in copy prop arrays

The code in copy propagate arrays assumes that the length of an
OpTypeArray is known at compile time, but that is not true when the size
is an OpSpecConstant. We try to fix that assumption.
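
As an illustration only, the check amounts to treating a length as known just when it comes from an ordinary constant. `LengthDef` and `known_array_length` are hypothetical names, not the pass's API:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LengthDef:
    """Hypothetical view of the instruction that defines an OpTypeArray length."""
    opcode: str  # "OpConstant" or "OpSpecConstant"
    value: int   # the literal; for a spec constant this is only the default


def known_array_length(length: LengthDef) -> Optional[int]:
    # A spec constant can be overridden at pipeline creation time, so its
    # default value must not be treated as the array length.
    if length.opcode == "OpConstant":
        return length.value
    return None
```

A caller would then skip the transformation whenever `known_array_length` returns `None`.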

Fixes https://crbug.com/oss-fuzz/66634
2024-02-13 14:41:38 -05:00
Steven Perron
b7413609cf
[OPT] Use new instruction folder for for all opcodes in spec consti folding (#5569)
* [OPT] Use new instruction folder for for all opcodes in spec consti folding

When folding an OpSpecConstantOp, we use the new instruction folder for
a small number of opcodes. This enables the new instruction folder for
all opcodes and uses the old one as a fall back. This allows us to
remove some code from the older folder that is now covered by the new
one.

Fixes #5499
2024-02-12 19:52:55 +00:00
Steven Perron
a8959dc653
Fold 64-bit int operations (#5561)
Adds folding rules that will fold basic arithmetic for signed and
unsigned integers of all sizes, including 64-bit.

Also folds OpSConvert and OpUConvert.
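
A sketch of what width-aware folding looks like, assuming results are kept as unsigned bit patterns of the result width. These helpers are illustrative, not the folding rules added by the commit:

```python
def _mask(bit_width: int) -> int:
    return (1 << bit_width) - 1


def fold_iadd(a: int, b: int, bit_width: int) -> int:
    """Wrapping add, as for OpIAdd on bit_width-bit operands."""
    return (a + b) & _mask(bit_width)


def fold_imul(a: int, b: int, bit_width: int) -> int:
    """Wrapping multiply, as for OpIMul on bit_width-bit operands."""
    return (a * b) & _mask(bit_width)


def fold_uconvert(value: int, from_width: int, to_width: int) -> int:
    """Zero-extend or truncate, as for OpUConvert."""
    return (value & _mask(from_width)) & _mask(to_width)


def fold_sconvert(value: int, from_width: int, to_width: int) -> int:
    """Sign-extend or truncate, as for OpSConvert."""
    low = value & _mask(from_width)
    if low & (1 << (from_width - 1)):
        low -= 1 << from_width  # reinterpret as a negative number
    return low & _mask(to_width)
```

For example, adding 1 to the all-ones 64-bit pattern wraps to 0, and converting the 32-bit pattern `0xFFFFFFFF` to 64 bits gives different results for the signed and unsigned conversions.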
2024-02-09 14:02:48 -05:00
Nathan Gauër
ab59dc6087
opt: prevent meld to merge block with MaximalReconvergence (#5557)
The extension SPV_KHR_maximal_reconvergence adds more constraints
around the merge blocks, and how the control flow can be altered.

The one we address here is explained in the following part of the spec:

  Note: This means that the instructions in a break block will execute as if
  they were still diverged according to the loop iteration. This restricts
  potential transformations an implementation may perform on the IR to match
  shader author expectations. Similarly, instructions in the loop construct
  cannot be moved into the continue construct unless it can be proven that
  invocations are always converged.

Until the optimizer is clever enough to determine if the invocations
have already converged, we shall not meld a block which branches to a
merge block into it, as it might move some instructions outside of the
convergence region.

Because this behavior is only required with the extension, the behavior
change in this commit is gated by the extension.
This means using wave operations without the maximal reconvergence
extension might lead to undefined behavior.

Co-authored-by: Natalie Chouinard <chouinard.nm@gmail.com>
2024-02-06 06:12:00 -05:00
Ben Doherty
8d3ee2e8f0
spirv-opt: Fix OpCompositeExtract relaxation with struct operands (#5536)
2024-02-01 15:19:02 -07:00
Natalie Chouinard
5d3c8b73f7
opt: Add OpEntryPoint to DescriptorScalarReplacement pass (#5553)
Add OpEntryPoint to the list of instructions processed by the
DescriptorScalarReplacement pass. This is necessary for SPIR-V 1.4 and
above where global variables must be included in the interface.

Fixes microsoft/DirectXShaderCompiler#5962
2024-02-01 09:50:36 -05:00
Natalie Chouinard
de65e81740
[NFC] Remove unused code (#5554)
2024-02-01 09:47:42 -05:00
Nathan Gauër
ad11927e6c
opt: add SPV_EXT_mesh_shader to opt allowlist (#5551)
Add this extension to the allowlist, allowing DCE and other
optimizations on modules exposing this.
Note: NV equivalent is already allowed.
2024-01-30 12:13:46 -05:00
Natalie Chouinard
0a6f0d1893
opt: Add TrimCapabilities pass to spirv-opt tool (#5545)
Add an option to the spirv-opt tool to run the TrimCapabilitiesPass.
2024-01-26 16:15:29 -05:00
Natalie Chouinard
0045b01ff9
opt: Add VulkanMemoryModelDeviceScope to trim (#5544)
Add the VulkanMemoryModelDeviceScope capability to the capability
trimming pass. According to the spec, "If the Vulkan memory model is
declared and any instruction uses Device scope, the
VulkanMemoryModelDeviceScope capability must be declared." Since this
case, based on the type of an operand, is not covered by the JSON
grammar, it is added explicitly.
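
The quoted rule can be sketched as a simple predicate, assuming the scope operands have already been resolved from constant ids to their numeric values. The function name and inputs are hypothetical; the value 1 for Device comes from the SPIR-V Scope enumeration:

```python
from typing import Iterable

SCOPE_DEVICE = 1  # SPIR-V Scope enumerant: Device


def needs_device_scope_capability(uses_vulkan_memory_model: bool,
                                  scope_operands: Iterable[int]) -> bool:
    """True when VulkanMemoryModelDeviceScope must stay declared."""
    return uses_vulkan_memory_model and any(
        scope == SCOPE_DEVICE for scope in scope_operands
    )
```

A trimming pass built on this idea would drop the capability only when the predicate is false for the whole module.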
2024-01-25 14:05:04 -05:00
alan-baker
de3d5acc04
Add tooling support for SPV_KHR_maximal_reconvergence (#5542)
* Validation for SPV_KHR_maximal_reconvergence
* Add pass to add/remove maximal reconvergence execution mode
---------

Co-authored-by: David Neto <dneto@google.com>
2024-01-25 09:39:49 -05:00
Steven Perron
155728b2e9
Add preserver-interface option to spirv-opt (#5524)
The optimizer is able to preserve the interface variables of the
shaders, but that feature has not been exposed to the command line
tool.

This commit adds an option `--preserve-interface` to spirv-opt that will
cause all calls to ADCE to leave the input and output variables, even if
the variable is unused. It will apply regardless of where the option
appears on the command line.

Fixes #5522
2024-01-12 14:45:17 -05:00
Nathan Gauër
c7affa1707
opt: add Int16 and Float16 to capability trim pass (#5519)
Add support for Int16 and Float16 trim.

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-01-04 20:01:03 +01:00
Ben Clayton
d75b3cfbb7
Zero initialize local variables (#5501)
Certain versions of GCC warn about these variables being potentially uninitialized when used.
I believe this is a false-positive, but zero-init'ing them is a safe way to fix this.
2023-12-11 10:32:45 -05:00
Jeremy Gebben
6b4f0c9d0b
instrument: Fix handling of gl_InvocationID (#5493)
This is an int and needs to be cast to a uint for inclusion in the
stage specific data passed to the instrumentation check function.
2023-12-05 09:59:51 -07:00
Jeremy Gebben
b5d60826e9
printf: Remove stage specific info (#5495)
Remove stage specific debug info that is only needed by GPU-AV.
This allows debug printfs to be used in multi-stage shader modules.

Fixes #4892
2023-12-04 15:43:36 -07:00
ncesario-lunarg
2da75e152e
Do not crash when tryingto fold unsupported spec constant (#5496)
Remove assertion in FoldWithInstructionFolder; there are cases where
folding spec constants is unsupported.

Closes #5492.
2023-12-04 08:48:16 -05:00
ChristianReinbold
0df791f97a
Fix nullptr argument in MarkInsertChain (#5465)
Fixes an access violation issue that sporadically occurred for me when DXC uses spirv-opt to legalize generated SPIR-V code.
2023-11-16 19:36:32 +00:00
Spencer Fricke
8ee3ae5244
Add comment to --inst-debug-printf option (#5466)
2023-11-14 13:00:54 -05:00
Nathan Gauër
f43c464d53
opt: add PhysicalStorageBufferAddresses to trim (#5476)
The PhysicalStorageBufferAddresses capability can now be
trimmed. From the spec, it seems any instruction enabled by this
requires some operand to have the PhysicalStorageBuffer storage class.
This means checking the storage class is enough.

Now, because the pass uses the grammar, we don't need to add any
new logic.

Signed-off-by: Nathan Gauër <brioche@google.com>
2023-11-14 12:49:04 -05:00
Nathan Gauër
c91e9d09b5
opt: add StorageImageReadWithoutFormat to cap trim (#5475)
The StorageImageReadWithoutFormat capability is only required when
an image type with the format set to Unknown is used with some specific
OpImageRead or OpImageSparseRead instructions.

This patch adds the required code to the capability trimming pass to
remove the StorageImageReadWithoutFormat capability when not required.

Signed-off-by: Nathan Gauër <brioche@google.com>
2023-11-14 09:29:31 -05:00
Steven Perron
9e7a1f2ddd
Fix array size calculation (#5463)
The function that gets the number of elements in a composite variable
returns an incorrect value for arrays. This is fixed, so that it
returns the correct number of elements for arrays where the number of
elements is represented as a 32-bit integer and is known at compile
time.

Fixes #4953
2023-11-02 13:29:57 -04:00