Commit Graph

2303 Commits

Author SHA1 Message Date
alan-baker
e03c8f5c8e
Fix broken build (#5505)
Fixes #5503

* SPIRV-Headers name change broke the build
  * Update SPIRV-Headers deps and fix
2023-12-11 11:45:10 -05:00
Jeremy Gebben
6b4f0c9d0b
instrument: Fix handling of gl_InvocationID (#5493)
This is an int and needs to be cast to a uint for inclusion in the
stage specific data passed to the instrumentation check function.
2023-12-05 09:59:51 -07:00
Jeremy Gebben
b5d60826e9
printf: Remove stage specific info (#5495)
Remove stage specific debug info that is only needed by GPU-AV.
This allows debug printfs to be used in multi-stage shader modules.

Fixes #4892
2023-12-04 15:43:36 -07:00
ncesario-lunarg
2da75e152e
Do not crash when trying to fold unsupported spec constant (#5496)
Remove assertion in FoldWithInstructionFolder; there are cases where
folding spec constants is unsupported.

Closes #5492.
2023-12-04 08:48:16 -05:00
Sajjad Mirza
246e6d4c68
spirv-val: Loosen restriction on base type of DebugTypePointer and DebugTypeQualifier (#5479)
* Allow base type for DebugTypePointer and DebugTypeQualifier to be any DebugType
2023-11-17 10:22:46 -05:00
ChristianReinbold
0df791f97a
Fix nullptr argument in MarkInsertChain (#5465)
Fixes an access violation issue that sporadically occurred for me when DXC uses spirv-opt to legalize generated spirv code.
2023-11-16 19:36:32 +00:00
Nathan Gauër
f43c464d53
opt: add PhysicalStorageBufferAddresses to trim (#5476)
The PhysicalStorageBufferAddresses capability can now be
trimmed. From the spec, it seems any instruction enabled by this
capability requires some operand to have the PhysicalStorageBuffer
storage class.
This means checking the storage class is enough.

Now, because the pass uses the grammar, we don't need to add any
new logic.

Signed-off-by: Nathan Gauër <brioche@google.com>
2023-11-14 12:49:04 -05:00
Nathan Gauër
c91e9d09b5
opt: add StorageImageReadWithoutFormat to cap trim (#5475)
The StorageImageReadWithoutFormat capability is only required when
an image type with the format set to Unknown is used with some specific
OpImageRead or OpImageSparseRead instructions.

This patch adds the required code to the capability trimming pass to
remove the StorageImageReadWithoutFormat capability when not required.

Signed-off-by: Nathan Gauër <brioche@google.com>
2023-11-14 09:29:31 -05:00
Steven Perron
9e7a1f2ddd
Fix array size calculation (#5463)
The function that gets the number of elements in a composite variable
returns incorrect values for arrays. This is fixed, so that it
returns the correct number of elements for arrays where the number of
elements is represented as a 32-bit integer and is known at compile
time.

Fixes #4953
2023-11-02 13:29:57 -04:00
Steven Perron
a08f648c86
Remove references to __FILE__ (#5462)
* Remove references to __FILE__

Uses of `__FILE__` leak the directory structure of the machine used to
build because it adds a string to the string table with the full path
name. I've removed the uses that show up in the release builds.

Fixes #5416
2023-11-01 15:19:48 -07:00
Spencer Fricke
c87755bb9f
spirv-val: Add WorkgroupMemoryExplicitLayoutKHR check for Block (#5461)
2023-11-01 10:48:40 -04:00
Cassandra Beckley
73876defc8
opt: support 64-bit OpAccessChain index in FixStorageClass (#5446)
The SPIR-V specification allows any scalar integer type as an index. DXC
usually emits indexes as 32-bit integer types, however, in some cases it
is possible to make it emit 64-bit indexes instead (as in
https://github.com/microsoft/DirectXShaderCompiler/issues/5638).
2023-10-19 20:02:46 +00:00
Steven Perron
5bb595091b
Add ComputeDerivativeGroup*NV capabilities to trim capabilities pass. (#5430)
* Add ComputeDerivativeGroup*NV capabilities to trim capabilities pass.

* Add SPV_NV_compute_shader_derivatives to allow lists

No tests needed for this. The code path is well tested. Just adding new
data.
2023-10-16 19:03:33 +00:00
Cassandra Beckley
023a8c79e9
opt: add Float64 capability to trim pass (#5428)
2023-10-05 11:12:09 +02:00
Cassandra Beckley
1bc0e6f59a
Add a new legalization pass to dedupe invocation interlock instructions (#5409)
Add a new legalization pass to dedupe invocation interlock instructions

DXC will be adding support for HLSL's rasterizer ordered views by using
the SPV_EXT_fragment_shader_interlock extension. That extension
stipulates that if an entry point has an interlock ordering execution
mode, it must dynamically execute OpBeginInvocationInterlockEXT and
OpEndInvocationInterlockEXT, in that order, exactly once. This would be
difficult to determine in DXC's SPIR-V backend, so instead we will emit
these instructions potentially multiple times, and use this legalization
pass to ensure that the final SPIR-V follows the specification.

This PR uses data-flow analysis to determine where to place begin and
end instructions; in essence, determining whether a block contains or is
preceded by a begin instruction is similar to a specialized case of a
reaching definitions analysis, where we have only a single definition,
such as `bool has_begun = false`. For this simpler case, we can compute
the set of blocks using BFS to determine the reachability of the begin
instruction.

We need to do this for both begin and end instructions, so I have
generalized portions of the code to run both forward and backward over
the CFG for each respective case.
2023-09-27 19:54:10 -04:00
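The reachability idea described in the interlock dedupe commit above (#5409) can be sketched roughly as follows. This is a minimal illustration on a toy CFG, not the pass's actual code: `Cfg`, `ReachableFrom`, and `Reverse` are hypothetical names, and blocks are plain indices rather than SPIRV-Tools basic blocks.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Toy CFG (assumption): successors[b] lists the blocks that block b can
// branch to. Blocks are identified by their index.
using Cfg = std::vector<std::vector<std::size_t>>;

// Returns, for each block, whether it is one of `seeds` or reachable from
// one. With `seeds` set to the blocks containing a "begin" instruction, the
// result marks blocks that contain or may be preceded by a begin.
std::vector<bool> ReachableFrom(const Cfg& successors,
                                const std::vector<std::size_t>& seeds) {
  std::vector<bool> reached(successors.size(), false);
  std::queue<std::size_t> worklist;
  for (std::size_t seed : seeds) {
    if (!reached[seed]) {
      reached[seed] = true;
      worklist.push(seed);
    }
  }
  // Plain BFS: each block is enqueued at most once.
  while (!worklist.empty()) {
    const std::size_t block = worklist.front();
    worklist.pop();
    for (std::size_t next : successors[block]) {
      if (!reached[next]) {
        reached[next] = true;
        worklist.push(next);
      }
    }
  }
  return reached;
}

// Reverses every edge so the same BFS can run backward over the CFG,
// which is the direction needed for "end" instructions.
Cfg Reverse(const Cfg& successors) {
  Cfg predecessors(successors.size());
  for (std::size_t block = 0; block < successors.size(); ++block) {
    for (std::size_t next : successors[block]) {
      predecessors[next].push_back(block);
    }
  }
  return predecessors;
}
```

Running the same routine on the reversed graph mirrors the commit's remark that the code was generalized to run both forward and backward over the CFG.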
Jeremy Gebben
ee7598d497
instrument: Use Import linkage for instrumentation functions (#5355)
These functions are getting far too complicated to code in SPIRV-Tools
C++. Replace them with import stubs so that the real implementations
can live in Vulkan-ValidationLayers where they belong.

VVL will need to define these functions in spirv and link them to the
instrumented version of the user's shader.

From here on out, VVL can redefine the functions and any data they use
without updating SPIRV-Tools. Changing the function declarations will
still require both VVL and SPIRV-Tools to be updated in lock step.
2023-09-20 10:50:30 -06:00
David Neto
a996591b1c
Update SPIRV-Headers, add cache control operand kinds (#5406)
* Update SPIRV-Headers, add cache control operand kinds

Adds SPV_OPERAND_TYPE_LOAD_CACHE_CONTROL
and  SPV_OPERAND_TYPE_STORE_CACHE_CONTROL,
from SPV_INTEL_cache_controls

Fixes: #5404

* Update tests: remove Kernel from constant sampler enum dependencies

This corresponds to header change
https://github.com/KhronosGroup/SPIRV-Headers/pull/378
2023-09-13 17:43:12 -04:00
Cassandra Beckley
361638cfd0
Make sure that fragment shader interlock instructions are not removed by DCE (#5400)
2023-09-11 15:26:10 -04:00
Nathan Gauër
47b63a4d7d
val: re-add ImageMSArray validation (#5394)
This was removed in #4752 and had not been re-added since.

* fixup! val: re-add ImageMSArray validation

clang-format
2023-09-07 09:39:28 -04:00
Nathan Gauër
4e0b94ed7a
opt: add ImageMSArray capability to trim pass. (#5395)
From the Capability's text in the SPIRV spec:

```
An MS operand in OpTypeImage indicates multisampled, used with an
OpTypeImage having Sampled == 2 and Arrayed == 1.
```

Adding this logic to the capability trimming pass.
2023-09-05 18:36:03 +00:00
Nathan Gauër
1f07f483ef
opt: add raytracing/rayquery to trim pass (#5397)
Adds the RayTracingKHR and RayQueryKHR capabilities to
the supported capabilities list (this includes the linked extension).
(NV and KHR capabilities/extensions share the same IDs, so it also
works for NV flavors of those).
2023-09-05 14:36:14 +00:00
Nathan Gauër
1121c23198
opt: add Int64 capability to trim pass (#5398)
Adds support for Int64 capability trimming.
2023-09-05 09:47:46 -04:00
Nathan Gauër
3cc7e1c4c3
NFC: rename tests using capability as prefix (#5396)
2023-09-04 14:32:28 -07:00
Cassandra Beckley
4c16c35b16
opt: add FragmentShader*InterlockEXT to capability trim pass (#5390)
* opt: add FragmentShader*InterlockEXT to capability trim pass

* move to addInstructionRequirementsForOpcode
2023-09-04 11:27:56 +02:00
Jeremy Gebben
714966003d
opt: Add SwitchDescriptorSetPass (#5375)
This is a simple pass to change DescriptorSet decoration values.
2023-08-22 00:16:35 +00:00
Jeremy Gebben
6520d83eff
linker: Add --use-highest-version option (#5376)
Currently spirv-link fails if all input files don't use the same
SPIR-V version. Add an option to instead use the highest input
version as the output version. Note that if one of the 'old'
input files uses an opcode that is deprecated in the 'new'
version, the output spirv will be invalid.
2023-08-21 17:05:33 -06:00
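The version selection described for `--use-highest-version` above (#5376) could look something like this. It is only a sketch under stated assumptions: `PickOutputVersion` is a hypothetical helper, not spirv-link's real code, and versions are compared as SPIR-V version words (for example `0x00010300` for SPIR-V 1.3).

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper (assumption, not the linker's actual API).
// Returns true and writes *out when an output version can be chosen.
// Without the option, mismatched input versions are treated as an error.
bool PickOutputVersion(const std::vector<uint32_t>& input_versions,
                       bool use_highest_version, uint32_t* out) {
  if (input_versions.empty()) return false;
  const uint32_t lowest =
      *std::min_element(input_versions.begin(), input_versions.end());
  const uint32_t highest =
      *std::max_element(input_versions.begin(), input_versions.end());
  if (lowest != highest && !use_highest_version) return false;
  *out = highest;
  return true;
}
```

As the commit notes, picking the highest version does not make the result valid on its own: an opcode deprecated in the newer version still yields invalid output.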
Wooyoung Kim
89ca3aa571
SPV_QCOM_image_processing support (#5223)
2023-08-15 15:15:21 -04:00
Nathan Gauër
0f17d05c48
opt: add bitmask support for capability trimming (#5372)
Some operands are not simple values, but bitmasks.
Looking them up in the table requires decomposing the mask into
single values.
This commit adds support for such operands, like MinLod|Offset.
2023-08-15 09:50:57 -04:00
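Decomposing a bitmask operand into single values, as the bitmask commit above (#5372) describes, can be sketched as below. `SplitMask` is a hypothetical name; the real pass then looks each single-bit value up in the grammar tables, which is not shown here.

```cpp
#include <cstdint>
#include <vector>

// Splits a bitmask into its single-bit values, lowest bit first.
// Each returned value could then be looked up individually in a table.
std::vector<uint32_t> SplitMask(uint32_t mask) {
  std::vector<uint32_t> bits;
  while (mask != 0) {
    const uint32_t lowest = mask & (~mask + 1u);  // isolate lowest set bit
    bits.push_back(lowest);
    mask &= mask - 1u;  // clear that bit
  }
  return bits;
}
```

For the `MinLod|Offset` example from the commit, the SPIR-V image operand values are `Offset = 0x10` and `MinLod = 0x80`, so a mask of `0x90` splits into those two entries.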
Nathan Gauër
8714d7fad2
enable StorageUniform16 (#5371)
Adds support for the StorageUniform16 capability.
2023-08-10 13:54:31 -04:00
David Neto
8e3da01b45
Move token version/cap/ext checks from parsing to validation (#5370)
A token is allowed to parse even when it's from the wrong
version, or is not enabled by a capability or extension.
This allows more modules to parse.

Version/capability/extension checking is fully moved to
validation instead.

Fixes: #5364
2023-08-10 12:19:12 -04:00
Nathan Gauër
4788ff1578
opt: add StorageUniformBufferBlock16 to trim pass (#5367)
Add StorageUniformBufferBlock16 to the list of enabled capabilities.
2023-08-10 14:21:35 +00:00
Nathan Gauër
ebda56e352
opt: add StoragePushConstant16 to trim pass (#5366)
* opt: add StoragePushConstant16 to trim pass

* fix comment
2023-08-10 12:34:46 +00:00
Nathan Gauër
60e684fe71
opt: fix StorageInputOutput16 trimming. (#5359)
* opt: fix StorageInputOutput16 trimming.

While integrating this pass into DXC, I found a lot of missing
cases. This PR fixes a few issues centered around this capability
while laying out foundations for more fixes.

1. The grammar can define extensions in operand & opcode tables.
   - opcode can rely on common capabilities, but require a new
     extension.
   - opcode can also rely on a capability which requires an extension.
   Sometimes, the extension is listed twice, in the opcode, and
   capability. But this redundancy is not guaranteed.

2. minVersion check. The condition was flipped: we added the extension
   when the minVersion was less than current.
   I didn't notice the issue as I only tested on the default env.

3. Capability/Extension instructions were not ignored.
   - `OpCapability Foo` will require the `Foo` capability.
   - it doesn't mean the module requires the `Foo` capability.
   Same for extensions.

This commit adds disabled tests, for fixes which are too large to
be brought into this already large PR.
2023-08-09 06:30:23 -04:00
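The minVersion condition from point 2 of the commit above (#5359) can be illustrated with a small sketch. `ToyGrammarEntry`, `ExtensionNeeded`, and the extension name are all hypothetical; the assumption is that an entry records the first version word in which a feature is core, plus the extension that enables it before that.

```cpp
#include <cstdint>

// Hypothetical grammar entry (assumption, not the SPIRV-Tools grammar type).
struct ToyGrammarEntry {
  uint32_t min_version;   // first SPIR-V version word where the feature is core
  const char* extension;  // extension enabling it on older versions
};

// Returns the extension to declare, or nullptr when the feature is already
// core in the target version. The flipped (buggy) condition would have been
// `entry.min_version < target_version`.
const char* ExtensionNeeded(const ToyGrammarEntry& entry,
                            uint32_t target_version) {
  return target_version < entry.min_version ? entry.extension : nullptr;
}
```

A test that only runs against one default environment exercises a single side of this comparison, which matches the commit's explanation of why the flip went unnoticed.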
David Neto
09b76c23ea
Update SPIRV-Headers; test some coop matrix enums (#5361)
Test:
  MatrixASignedComponentsKHR
  MatrixBSignedComponentsKHR
  MatrixCSignedComponentsKHR
  ResultSignedComponentsKHR
2023-08-04 14:50:54 -04:00
Jeremy Gebben
47fff21d52
instrument: Reduce number of inst_bindless_stream_write_6 calls (#5327)
Multiple calls to this function were causing vkCreateGraphicsPipelines
to be 3x slower on some driver. I suspect this was because each
call had to be inlined separately which bloated the code and caused
more work in the driver's SPIRV -> native instruction compilation.
2023-08-01 13:49:12 -06:00
ncesario-lunarg
a0f1c87272
opt: Fix incorrect half float conversion (#5349)
Fixes image operands not decorated as relaxed from
getting marked relaxed and converted to half precision.

Fixes #5044.
2023-07-26 10:03:24 -04:00
Nathan Gauër
35d8b05de4
opt: add capability trimming pass (not default). (#5278)
This commit adds a new optimization which tries to remove unnecessary
capabilities from a SPIR-V module.

When compiling a SPIR-V module, you may have some dead-code using
features gated by a capability.
DCE will remove this code, but the capability will remain. This means
your module would still declare the capability, even though it no longer
needs it. Calling this pass on your module would remove such obsolete
capabilities.

This pass wouldn't be enabled by default, and would only be usable
from the API (at least for now).

NOTE: this commit only adds the basic skeleton/structure, and
doesn't mark as supported many capabilities it could support.
I'll add them as supported as I write tests.

Signed-off-by: Nathan Gauër <brioche@google.com>
2023-07-25 16:52:41 +02:00
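The idea behind the capability trimming commit above (#5278) can be sketched on a toy model. Everything here is an assumption for illustration: `ToyInstruction` and `TrimCapabilities` are hypothetical, capabilities are plain strings, and the per-instruction requirements are given directly instead of being derived from the grammar.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy instruction (assumption): an opcode name plus the capabilities it needs.
struct ToyInstruction {
  std::string opcode;
  std::set<std::string> required_caps;
};

// Keeps a declared capability if some remaining instruction needs it, or if
// the pass does not know how to reason about it (not in `supported`).
std::set<std::string> TrimCapabilities(
    const std::set<std::string>& declared,
    const std::set<std::string>& supported,
    const std::vector<ToyInstruction>& live_instructions) {
  std::set<std::string> needed;
  for (const ToyInstruction& inst : live_instructions) {
    needed.insert(inst.required_caps.begin(), inst.required_caps.end());
  }
  std::set<std::string> kept;
  for (const std::string& cap : declared) {
    const bool can_trim = supported.count(cap) != 0;
    if (!can_trim || needed.count(cap) != 0) kept.insert(cap);
  }
  return kept;
}
```

Leaving unsupported capabilities untouched reflects the commit's note that the initial skeleton marks only a few capabilities as supported.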
Steven Perron
d52c39c37d
Do not crash when folding 16-bit OpFDiv (#5338)
The code currently tries to get the value of the floating point constant
to see if it is -0.0. However, we are not able to get the value for
16-bit floating point value, and we hit an assert.

To avoid this, we add an early check for the width to make sure it is
either 32 or 64.

Fixes https://github.com/microsoft/DirectXShaderCompiler/issues/5413.
2023-07-21 10:17:12 -04:00
Nathan Gauër
17d9669d51
enumset: add iterator based constructor/insert (#5344)
Expands the EnumSet API a bit to have iterator-based
insert and constructors (like the STL).
This is also a prerequisite for the capability-trimming pass, as
it allows building a const set from a constexpr std::array easily.

Signed-off-by: Nathan Gauër <brioche@google.com>
2023-07-20 17:54:50 +00:00
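The pattern the iterator-based constructor commit above (#5344) enables, building a const set from a `constexpr std::array`, looks roughly like this. The sketch uses `std::set` as a stand-in for EnumSet, and `Cap`, `kSupported`, and `MakeSupportedSet` are hypothetical names; only the four enum values shown are taken from the SPIR-V Capability enum.

```cpp
#include <array>
#include <cstdint>
#include <set>

// A few SPIR-V capability values, for illustration only.
enum class Cap : uint32_t { Matrix = 0, Shader = 1, Float64 = 10, Int64 = 11 };

// Compile-time list of capabilities (hypothetical content).
constexpr std::array<Cap, 2> kSupported = {Cap::Float64, Cap::Int64};

// std::set stands in for EnumSet here: both can be built from an
// iterator range, which is what makes this one-liner possible.
std::set<Cap> MakeSupportedSet() {
  return std::set<Cap>(kSupported.begin(), kSupported.end());
}
```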
Nathan Gauër
bf03d40922
opt: change Get* functions to return const& (#5331)
GetCapabilities returned a const*, and GetExtensions did not exist.
This commit adds GetExtensions, and changes the return value to
be a const&.

This commit also removes the overload to GetCapabilities which returns
a mutable set, as it is unused.

Signed-off-by: Nathan Gauër <brioche@google.com>
2023-07-20 10:18:19 -04:00
David Neto
876ccc6cd5
Add /bigobj to test_opt for VS 2017 (#5336)
This apparently is required for debug builds.

Fixes: #5335
2023-07-20 10:14:35 -04:00
ncesario-lunarg
7dd5f95d25
[spirv-opt] Handle OpFunction in GetPtr (#5316)
When using PhysicalStorageBuffer it is possible for a function to
return a pointer type. This was not being handled correctly in
`GetLoadedVariablesFromFunctionCall` in the DCE pass because
`IsPtr` returns the wrong result.

Fixes #5270.
2023-07-17 19:16:25 +00:00
Nathan Gauër
85a4482131
NFC: makes the FeatureManager immutable for users (#5329)
* NFC: makes the FeatureManager immutable for users

The FeatureManager contains some internal state, like
a set of capabilities and extensions. Those are derived
from the module.

Before this commit, the FeatureManager exposed Remove* functions
which could unsync the reported extensions/capabilities from
the truth: the module.

The only valid usecase to remove items directly from the FeatureManager
is by the context itself, when an instruction is killed:
instead of rerunning the whole analysis, we remove the single outdated
item.

There were 2 users that mutated its state:
 - one to invalidate the manager. Moved to call a reset function.
 - one who removed an extension from the feature manager after removing
   it from the module. This logic has been moved to the context, who
   now handles the extension removal itself.

Signed-off-by: Nathan Gauër <brioche@google.com>

* clang-format

* add RemoveCapability since the fuzztests are using it

* add tests

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2023-07-17 11:15:08 -04:00
Nathan Gauër
29431859f5
NFC: replace EnumSet::ForEach with range-based-for (#5322)
EnumSet now supports iterators, meaning we can remove the custom
ForEach.

Signed-off-by: Nathan Gauër <brioche@google.com>
2023-07-13 14:40:47 -04:00
Nathan Gauër
5b4fb072eb
enumset: fix bug in the new iterator class (#5321)
The iterator class was initialized by setting the offset
and bucket to 0. Big oversight: what if the first enum is
not valid? Then `*iterator->begin()` would return the wrong
value.

Because the first capability is Matrix, this bug was not visible in
any SPIRV test.
And this specific case wasn't tested correctly in the new enumset tests.

Signed-off-by: Nathan Gauër <brioche@google.com>

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2023-07-13 09:55:24 -04:00
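The iterator bug described in the commit above (#5321) can be reproduced in miniature. This is a sketch over a single 64-bit mask, not the EnumSet iterator itself; `BitRange` is a hypothetical type. The key point is that construction must skip forward to the first set bit instead of assuming offset 0 is valid.

```cpp
#include <cstdint>

// Iterates over the indices of the set bits in a 64-bit mask.
class BitRange {
 public:
  explicit BitRange(uint64_t bits) : bits_(bits) {}

  class Iterator {
   public:
    // Skipping on construction is the fix: offset 0 may not be a set bit.
    Iterator(uint64_t bits, uint32_t offset) : bits_(bits), offset_(offset) {
      SkipUnset();
    }
    uint32_t operator*() const { return offset_; }
    Iterator& operator++() {
      ++offset_;
      SkipUnset();
      return *this;
    }
    bool operator!=(const Iterator& other) const {
      return offset_ != other.offset_;
    }

   private:
    // Advances to the next set bit at or after offset_, or to 64 (end).
    void SkipUnset() {
      while (offset_ < 64 && !((bits_ >> offset_) & 1u)) ++offset_;
    }
    uint64_t bits_;
    uint32_t offset_;
  };

  Iterator begin() const { return Iterator(bits_, 0); }
  Iterator end() const { return Iterator(bits_, 64); }

 private:
  uint64_t bits_;
};
```

With bit 0 set, as with a set containing the Matrix capability, the skip is a no-op, which is consistent with the commit's remark that existing tests did not expose the problem. The same `begin`/`end` pair is what allows a range-based for loop in place of a custom ForEach.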
Jeremy Gebben
9266197c37
instrument: Cast gl_VertexIndex and InstanceIndex to uint (#5319)
This avoids errors like this from instrumenting vertex shaders:

error: 165: Expected Constituents to be scalars or vectors of the
  same type as Result Type components
  %195 = OpCompositeConstruct %v4uint %uint_0 %191 %194 %uint_0
2023-07-12 15:12:26 -06:00
Nathan Gauër
3424b16c10
enumset: STL-ize container (#5311)
This commit adds a forward iterator, and renames functions so that
it matches std::unordered_set/std::set better.
This goes against the SPIR-V coding style, but might be better in
the long run, especially when this set is used along real STL
sets.
(Right now, they are not compatible, and require 2 syntaxes).

This container could in theory handle bidirectional
iterators, but for now, only forward iteration seemed required for
our use-cases.

Signed-off-by: Nathan Gauër <brioche@google.com>
2023-07-12 11:34:44 -04:00
Spencer Fricke
7ff331af66
source: Give better message if using new Source Language (#5314)
2023-07-11 11:50:41 -04:00
alan-baker
0530a532fc
Validate GroupNonUniform instructions (#5296)
Fixes #5283

* Validate group non-uniform instructions
2023-07-11 08:40:40 -04:00
Nathan Gauër
0f3bea06ef
NFC: rewrite EnumSet to handle larger enums. (#5289)
The current EnumSet implementation is only efficient for enums with
values less than 64. The reason is the first 63 values are stored as a
bitmask in a 64 bit unsigned integer, and the other values are stored
in a std::set.
For small enums, this is fine (most SPIR-V enums have IDs less than 64),
but performance starts to drop with larger enums (Capabilities,
opcodes).

Design considerations:
----------------------

This PR changes the internal behavior of the EnumSet to handle enums
with arbitrary values while staying performant.
The idea is to extend the 64-bits buckets sparsely:
 - each bucket can store 64 values, starting from a multiple of 64.
This could be considered as a hashset with linear probing.

- For small enums, there is a slight memory overhead due to the bucket
storage, but lookup is still constant.
- For linearly distributed values, lookup is constant.
- The worst case for storage is enums with values which are multiples of 64.
But lookup is constant.
- The worst case for lookup is enums with a lot of small ranges scattered in
the space (requires linear probing).

For enums like capabilities/opcodes, this bucketing is useful as values
are usually scattered in distinct, but almost contiguous blocks.
(vendors usually have allocated ranges, like [5000;5500], while [1000;5000]
is mostly unused).

Benchmarking:
-------------

Benchmarking was done in 2 ways:
 - a benchmark built for the occasion, which only measures the EnumSet
   performance.
 - SPIRV-Tools tests, to measure a more realistic scenario.

Running SPIR-V tests with both implementations shows the same
performance (delta < noise). So seems like we have no regressions.
This method is noisy by nature (I/O, etc), but the most representative
of a real-life scenario.

Protocol:
 - run spirv-tests with no stdout using perf, multiple times.
Result:
 - measured noise is larger than the observed difference.

The custom benchmark was testing EnumSet interfaces using SPIRV enums.
Doing thousands of insertions/deletions/lookups, with 2 kinds of scenarios:
 - add once, lookup many times.
 - add/delete/lookup many times.

For small enums, results are similar (delta < noise). This is consistent
with the previously observed results as most SPIRV enums are small, and
SPIRV-Tools is not doing that many intensive operations on EnumSets.

Performance on large enums (opcode/capabilities) shows an improvement:

+-----------------------------+---------+---------+---------+
| Metric                      |  Old    |   New   | Delta % |
+-----------------------------+---------+---------+---------+
| Execution time              |   27s   |   7s    |  -72%   |
| Instruction count           |  174b   |  129b   |  -25%   |
| Branch count                |   28b   |   33b   |  +17%   |
| Branch miss                 |  490m   |   26m   |  -94%   |
| Cache-misses                |  149k   |   26k   |  -82%   |
+-----------------------------+---------+---------+---------+

Future work
-----------

This was by-design an NFC change to compare apples-to-apples.
The next PR aims to add STL-like iterators to the EnumSet to allow
using it with STL algorithms, and range-based for loops.

Signed-off-by: Nathan Gauër <brioche@google.com>
2023-07-07 10:41:52 -04:00
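The bucketing scheme from the design considerations in the commit above (#5289) can be sketched as follows. `ToyEnumSet` is a hypothetical, simplified model: each bucket covers 64 consecutive values starting at a multiple of 64, but buckets are found with a plain linear scan here, which is an assumption and not necessarily how the real EnumSet locates them.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

class ToyEnumSet {
 public:
  void insert(uint32_t value) {
    const uint32_t start = value / 64 * 64;  // bucket start, multiple of 64
    const std::size_t index = IndexOf(start);
    // IndexOf returns size() when missing, which is exactly the index the
    // new bucket gets after push_back.
    if (index == buckets_.size()) buckets_.push_back(Bucket{start, 0});
    buckets_[index].bits |= uint64_t{1} << (value % 64);
  }

  bool contains(uint32_t value) const {
    const std::size_t index = IndexOf(value / 64 * 64);
    if (index == buckets_.size()) return false;
    return ((buckets_[index].bits >> (value % 64)) & 1u) != 0;
  }

  // Number of allocated buckets: small for dense or clustered values.
  std::size_t bucket_count() const { return buckets_.size(); }

 private:
  struct Bucket {
    uint32_t start;  // first value covered by this bucket
    uint64_t bits;   // bit i set means (start + i) is in the set
  };

  // Linear scan over buckets (simplification for this sketch).
  std::size_t IndexOf(uint32_t start) const {
    for (std::size_t i = 0; i < buckets_.size(); ++i) {
      if (buckets_[i].start == start) return i;
    }
    return buckets_.size();
  }

  std::vector<Bucket> buckets_;
};
```

Inserting 0, 63, 64, and 5000 allocates three buckets (starting at 0, 64, and 4992), which illustrates why sparse vendor ranges stay cheap while values spaced exactly 64 apart are the worst case for storage.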