Commit Graph

736 Commits

Author SHA1 Message Date
alan-baker
4773879b68
Update structure layout validation (#4876)
* Uniform block layout rules for matrices should use extended layouts by
  default
2022-07-29 10:16:54 -04:00
Greg Fischer
faa8d6a653
Revert "Optimize DefUseManager allocations (#4709)" (#4846)
This reverts commit d18d0d92e5.

This is reverted because it causes a 7X slowdown when legalizing
SPIR-V with NonSemantic.Shader.DebugInfo.100 instructions.
This is due to the creation of very large UseLists for several
heavily used operands for this extension combined with the fact
that the original commit changed the performance of Uselists to O(n).
2022-07-12 13:14:47 -06:00
Steven Perron
92fe420c8a
Reduce load size does not work for array with spec const size (#4845)
Arrays do not have to have a size that is known at compile time.  It
could be a spec constant.  In these cases, treat the array
as if it is arbitrarily long.  This commit will treat it like it is an
array of size UINT32_MAX.

Fixes https://crbug.com/oss-fuzz/47397.
2022-07-05 16:16:50 -04:00
Steven Perron
d5a3bfcf2f
Avoid undefined behaviour when getting debug opcode (#4842)
If the `instruction` operand in an extended instruction instruction is
too large, it causes undefined behaviour when that value is cast to the
enum for the corresponding set.  This is done with the
NonSemanticDebug100 instruction set.  We need to avoid the undefined
behaviour.

Fixes #4727
2022-07-05 14:14:29 -04:00
Steven Perron
32622ba7c6
DCE: clean up the cfg for all functions that were processed. (#4840)
Which functions are processed is determined by which ones are on the
call tree from the entry points before dead code is removed.  So it is
possible that a function is process because it is called from an entry
point, but the CFG is not cleaned up because the call to the function
was removed.

The fix is to process and cleanup every function in the module.  Since
all of the dead functions would have already been removed in an earlier
step of DCE, it should not make a different in compile time.

Fixes #4731
2022-07-05 12:23:32 -04:00
alan-baker
286e9c1187
Use structural dominance to validate cfg (#4832)
* Structural dominance introduced in SPIR-V 1.6 rev2
* Changes the structured cfg validation to use structural dominance
  * structural dominance is based on a cfg where merge and continue
    declarations are counted as graph edges
* Basic blocks now track structural predecessors and structural
  successors
* Add validation for entry into a loop
* Fixed an issue with inlining a single block loop
  * The continue target needs to be moved to the latch block
* Simplify the calculation of structured exits
  * no longer requires block depth
* Update many invalid tests
2022-06-29 23:32:20 -04:00
Steven Perron
66d88508dd
Build struct order only for the section needed when unrolling. (#4830)
We currently build the structured order for all nodes reachable from the
loop header when unrolling a loop.  However, unrolling only needs the
nodes in the loop and possibly the merge node.

To avoid needless computation, I have implemented a search that will
stop at the merge node.

Fixes #4827
2022-06-29 09:53:26 -04:00
Steven Perron
37d2396cab
Fix SplitLoopHeader to handle single block loop (#4829)
The code in `CFG::SplitLoopHeader` assumes the loop header is not the
latch.  This leads to it not being able to find the latch block.  This
has been fixed, and a test added.

Fixes #4527
2022-06-24 12:33:45 -04:00
Steven Perron
3c9fd7577f
Avoid if-conversion if both predecessors are the same (#4826)
If the predecessor blocks are the same, then there is only 1 value for the
OpPhi.  The simplition pass will simplify it, and it causes problems for
if-conversion.  In these cases, if-conversion can just punt.

Fixes #3554.
2022-06-24 15:28:06 +00:00
David Neto
2eff41e707
Remove stray output to stdout from tests (#4816) 2022-06-20 10:57:44 -04:00
manas-kulkarni
fbcb6cf4c8
Ability to fold Constant Vector times Matrix and Matrix times vector instructions (#4818) 2022-06-16 13:54:12 -04:00
Steven Perron
76ebfb989f
Avoid replacing access chain with OOB access (#4819)
An access chain could have a constant index that is an out of bounds
access.  This is valid spir-v, even if it can cause problems at runtime.
However, it is not valid to have an OpCompositeExtract with an out of
bounds access.  This means we have to stop local-access-chain-convert
from making that change.

Fixes #4605
2022-06-14 13:06:38 -04:00
David Neto
8f7f5024f8
Simplify invocation of snprintf (#4815) 2022-06-10 17:55:45 -04:00
Nicolas Capens
130a05d2e3
Fold multiply and subtraction into FMA with negation (#4808)
This change adds a folding rule which transforms x * y - a and a - x * y
into FMA(x, y, -a) and FMA(-x, y, a), respectively.

While the SPIR-V instruction count remains the same, target instruction
sets typically feature FMA instruction variants that can negate an
operand. Also this transformation may unlock further optimizations which
eliminate the negation.

(Google bug: b/226145988)
2022-05-31 12:03:56 -04:00
Steven Perron
088cb1a5c8
Add more folding for composite instructions (#4802)
* Add move folding for composite instructions

Fold chains of insert into construct

If a chain of OpCompositeInsert instruction write to every element of a
composite object, then we can replace it with an OpCompositeConstruct.

Fold a construct fed by extracts to a single extract

We already fold an OpCompositeConstruct when it is simlpy reconstructing
an object that was decomposed by a series of OpCompositeExtract
instructions.  However, we do not do that if that object is an element
of a larger object.

I have updated the rule, so that if the original object is a an element
of a larger object, then the OpCompositeConstruct is replaced with a
single OpCompositeExtract from the larger object.

Fixes #4371.
2022-05-26 10:29:02 -04:00
Steven Perron
f74b85853c
Handle 64-bit integers in local access chain convert (#4798)
* Handle 64-bit integers in local access chain convert

The local access chain convert pass does on run on module that have
64-bit integers, even if they have nothing to to with access chains.
This is very limiting because other passes rely on the access chains
being removed. So this commit will add this functionality to the pass.
2022-05-10 17:02:14 +00:00
Daniele Vettorel
f7a6e3b9d5
Handle chains of OpAccessChain in replacing variable index access for flattened resources. (#4797) 2022-05-10 11:41:43 -04:00
Jaebaek Seo
ad3514b732
spirv-opt: add pass for interface variable scalar replacement (#4779)
Replace shader's stage variables whose types are array or matrix
with scalars/vectors.
For example,
```
Before:
  %foo = OpVariable %_ptr_Output__arr_v2float_uint_4 Output
After:
  %foo = OpVariable %_ptr_Output_v2float Output
  %foo_0 = OpVariable %_ptr_Output_v2float Output
  %foo_1 = OpVariable %_ptr_Output_v2float Output
  %foo_2 = OpVariable %_ptr_Output_v2float Output
```
2022-05-09 14:04:52 -04:00
JiaoluAMD
c11ea09652
spirv-opt : Add FixFuncCallArgumentsPass (#4775)
spirv validation require OpFunctionCall with memory object, usually this
is non issue as all the functions are inlined.
This pass deal with some case for
DontInline function. accesschain input operand would be replaced new
created variable
2022-05-06 10:39:26 -04:00
JiaoluAMD
2c7fb9707b
Handle dontinline function in spread-volatile-semantics (#4776)
Handle function calls in spread-volatile-semantics
2022-05-04 10:52:58 -04:00
Steven Perron
1295dca8e2
Reapply "Add folding rule to generate Fma instructions (#4783)" (#4789)
This reverts commit 671f6e633f.

PR #4783 was reverted because it caused OpenCL CTS failures for clvk.
The was in clspv, which was not adding the no contract decoration when
it was required.  This has been fixed in
https://github.com/google/clspv/pull/845.  We can now reapply #4783.
2022-05-03 10:20:23 -04:00
sindney
46492aa45a
spirv-opt: skips if_conversion when dontflatten is set (#4770) 2022-04-28 19:26:02 +00:00
Daniele Vettorel
671f6e633f
Revert "Add folding rule to generate Fma instructions (#4783)" (#4785)
This reverts commit 2b2b0282af.
2022-04-20 10:55:20 -04:00
Steven Perron
2b2b0282af
Add folding rule to generate Fma instructions (#4783)
Adding Fma instruction can speed up the code.  This was requested by
swiftshader, so they do not have to do this analysis themselves.  It can
also help reduce the code size, and the work the ICD compilers have to
do.
2022-04-19 11:25:07 -04:00
Steven Perron
92c17edde7
Don't try to unroll loop with step count 0. (#4769)
These loop are infinite loop, so there is no reason to unroll the loop.
Fixes #4711.
2022-04-11 10:21:15 -04:00
Jaebaek Seo
05745cc9d4
Handle shaders without execution model in spread-volatile-semantics (#4766)
spread-volatile-semantics pass spreads Volatile semantics for builtin
variables used by certain execution models based on
VUID-StandaloneSpirv-VulkanMemoryModel-04678 and
VUID-StandaloneSpirv-VulkanMemoryModel-04679 (See "Standalone SPIR-V
Validation" section of Vulkan spec "Appendix A: Vulkan Environment for
SPIR-V"). Therefore, shaders without execution model (e.g., used only
for linkage) are not the target of the pass. This commit lets the pass
just return SuccessWithoutChange in that case.
2022-03-25 17:54:46 +00:00
Greg Fischer
9d1b572884
spirv-opt: (WIP) Eliminate Dead Input Component Pass (#4720)
This adds the --eliminate-dead-input-components pass which currently
removes trailing unused components from input arrays.

Fixes #4532
2022-03-22 20:50:52 -06:00
Steven Perron
0741f42738
Reset the id bound on the module in compact ids (#4744)
If the body of the module does not have any ids change, compact ids will
not change the id bound.  This can cause problems because the id bound
could be much higher than the largest id in that is used.  It should be
reset any time it is not the larger id used + 1.

Fixes #4604
2022-03-07 20:33:01 +00:00
Steven Perron
48a36c72e4
Better handling of 0xFFFFFFFF when folding vector shuffle (#4743)
When folding a vector shuffle feeding a vector shuffle, we do not
propagate an 0xFFFFFFFF, which has a special meaning, correctly.  We
adjust the value making it lose it meaning as an undefined value.

Fixes #4581
2022-03-07 19:35:57 +00:00
Steven Perron
4fa1a6f9b4
Generalize assert in ccp (#4735)
CCP does not want to fold an instruction unless it folds to a constant.
There is an asser to check for this.  The question if a spec constant
counts as a constant.  The constant folder considers a spec constant a
constand, but CCP does not.  I've fixed the assert in CCP to match what
the folder does.  It should not require any new changes to CCP.
2022-03-07 19:33:10 +00:00
Steven Perron
920156cf18
Add pass to remove DontInline function control (#4747)
Swift shader needs a way to inline all functions, even those marked as
DontInline.  See https://github.com/KhronosGroup/SPIRV-Tools/pull/4471.
This implements the suggestion I made in the PR.  We add a pass that
will remove the DontInline function control, so that the inlining passes
will inline them.

SwiftShader will still have to modify their code to add this pass before
the other passes are run.
2022-03-07 12:45:17 -05:00
Steven Perron
0b8426346d
Don't rebuilt valid analyses. (#4733)
The function `BuildInvalideAnalyses` will be rebuilt for every analysis that
has been requested, but it is not necessary.  It also can cause problems
because if the CFG needs to be rebuilt, so do the dominator trees.

This change will make the functionality match the description of the
function.
2022-03-04 20:16:42 +00:00
pd-valve
d18d0d92e5
Optimize DefUseManager allocations (#4709)
* Optimize DefUseManager allocations

Saves around 30-35% of compilation time.

For inst->use_ids, use a pool linked list instead of allocating vectors for every instruction.  For inst->uses, use a "PooledLinkedList"' -- a linked list that has shared storage for all nodes.  Neither re-use nodes, instead we do a bulk compaction operation when too much memory is being wasted (tuneable).

Includes separate PooledLinkedList templated datastructure, a very special case construct, but split out to make the code a little easier to understand.
2022-02-15 19:17:30 -05:00
sfricke-samsung
471428a04f
spirv-opt: Add OpExecutionModeId support (#4719)
Needed for Vulkan1.3 adding LocalSizeId support
2022-02-14 14:33:29 +00:00
Natalie Chouinard
72e4475b41
Handle propagation of arrays with decorations (#4717)
When copy propagating, OpDecorate instructions can be copied as is. For
array flattening, they should be ignored.
2022-02-11 16:13:14 -05:00
Steven Perron
5b371918b9
Have scalar replacement use undef instead of null (#4691)
Scalar replacement generates a null when there value for a member will
not be used.  The null is used to make sure things are
deterministic in case there is an error.

However, some type cannot be null, so we will change that to use undef.
To keep the code simpler we will always use the undef.

Fixes #3996
2022-02-03 15:51:15 +00:00
Steven Perron
b846f8f1dc
Complete handling of RayQueryKHR type (#4690)
The handling of the RayQueryKHR type is not complete in the type
manager.  The tests were not picking this up.  I've added a test to make
sure that the `GenerateAllTypes` function actually does generate all of
the types.  Once it is added there other tests should pick up on the
other parts that were missing.
2022-01-31 15:44:32 +00:00
luzpaz
65ecfd1093
Fix various source comment (doxygen) typos (#4680)
Found via `codespell -q 3 -L fo,lod,parm
2022-01-26 15:13:08 -05:00
Jaebaek Seo
fb9a10cd48
spirv-opt: add pass to Spread Volatile semantics (#4667)
Add a pass to spread Volatile semantics to variables with SMIDNV,
WarpIDNV, SubgroupSize, SubgroupLocalInvocationId, SubgroupEqMask,
SubgroupGeMask, SubgroupGtMask, SubgroupLeMask, or SubgroupLtMask BuiltIn
decorations or OpLoad for them when the shader model is the ray
generation, closest hit, miss, intersection, or callable shaders. This
pass can be used for VUID-StandaloneSpirv-VulkanMemoryModel-04678 and
VUID-StandaloneSpirv-VulkanMemoryModel-04679 (See "Standalone SPIR-V
Validation" section of Vulkan spec "Appendix A: Vulkan Environment for
SPIR-V").

Handle variables used by multiple entry points:

1. Update error check to make it working regardless of the order of
   entry points.
2. For a variable, if it is used by two entry points E1 and E2 and
   it needs the Volatile semantics for E1 while it does not for E2
  - If VulkanMemoryModel capability is enabled, which means we have to
    set memory operation of load instructions for the variable, we
    update load instructions in E1, but do not update the ones in E2.
  - If VulkanMemoryModel capability is disabled, which means we have
    to add Volatile decoration for the variable, we report an error
    because E1 needs to add Volatile decoration for the variable while
    E2 does not.

For the simplicity of the implementation, we assume that all functions
other than entry point functions are inlined.
2022-01-25 13:14:36 -05:00
Steven Perron
b7251d4fb7
reflect debug (#4662)
The pass to remove the nonsemantic information and instructions
is used for drivers or tools that may not support them.  Debug
information was only partially handle, which is causing a
problem.  We need to either fully remove debug information or
not remove it all.  Since I can see it being useful to keep the
debug information even when the nonsemantic instructions are
removed, I propose we do not remove debug info.

Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/4269
2021-12-15 11:06:51 -05:00
Steven Perron
354a46a2a2
Rename strip reflect to strip nonsemantic (#4661)
In https://github.com/KhronosGroup/SPIRV-Tools/pull/3110, the strip reflect
pass was changed to also remove all explicitly nonsemantic instructions.  This
makes it so that the name of the pass no longer reflects what the pass actually
does.  This change renames the pass so that it reflects what the pass actaully does.
2021-12-15 09:55:30 -05:00
Sebastien Alaiwan
a2260d3b1f
Fix compilation (#4656)
* Delete public accessor, whose only user is one unit-test

The underlying container type becomes invisible from the outside.

* Fix compilation
2021-12-09 12:42:53 -05:00
Marius Hillenbrand
1ed847f438
Fix endianness of string literals (#4622)
* Fix endianness of string literals

To get correct and consistent encoding and decoding of string literals
on big-endian platforms, use spvtools::utils::MakeString and MakeVector
(or wrapper functions) consistently for handling string literals.

- add variant of MakeVector that encodes a string literal into an
  existing vector of words
- add variants of MakeString
- add a wrapper spvDecodeLiteralStringOperand in source/
- fix wrapper Operand::AsString to use MakeString (source/opt)
- remove Operand::AsCString as broken and unused
- add a variant of GetOperandAs for string literals (source/val)
... and apply those wrappers throughout the code.

Fixes  #149

* Extend round trip test for StringLiterals to flip word order

In the encoding/decoding roundtrip tests for string literals, include
a case that flips byte order in words after encoding and then checks for
successful decoding. That is, on a little-endian host flip to big-endian
byte order and then decode, and vice versa.

* BinaryParseTest.InstructionWithStringOperand: also flip byte order

Test binary parsing of string operands both with the host's and with the
reversed byte order.
2021-12-08 12:01:26 -05:00
Diego Novillo
c75a1a46f3
Fix https://github.com/KhronosGroup/SPIRV-Tools/issues/4462 (#4651)
This prevents CCP from making constant -> constant transitions when
evaluating instruction values.  In this case, FClamp is evaluated twice.
On the first evaluation, if computes FClamp(0.5, 0.5, -1) which returns
-1.  On the second evaluation, it computes FClamp(0.5, 0.5, VARYING)
which returns 0.5.

Both fold() computations are correct given the semantics of FClamp() but
this causes a lateral transition in the constant lattice which was not
being considered VARYING by CCP.
2021-12-02 10:40:28 -05:00
Natalie Chouinard
d0a827a9f3
Copy OpDecorateStrings in DescriptorScalarReplacementPass (#4649)
Along with OpDecorate, also clone the OpDecorateString instructions for
variables created in the descriptor scalar replacement pass.

Fixes microsoft/DirectXShaderCompiler#3705
2021-11-29 02:11:22 -05:00
Steven Perron
8c155b364c
Manually fold floating point division by zero (#4637)
See https://github.com/KhronosGroup/SPIRV-Tools/issues/4636 for details.

Fixes #4636.
2021-11-24 14:13:58 -05:00
alan-baker
4b092d2ab8
Allow ADCE to remove dead inputs (#4629)
* https://github.com/KhronosGroup/Vulkan-Docs/issues/666 clearly
  specified that interfaces do not require an input if there is an
  associated output
* ADCE can now remove unused input variables (though they are kept if
  the preserve interfaces option is used)
2021-11-16 15:54:17 -05:00
alan-baker
21e3f681e2
Update SPIRV-Headers (#4628)
* Fix compile
* Fix test
2021-11-10 16:32:09 -05:00
Greg Fischer
352a411278
Fix handling of OpPhi in convert-relaxed-to-half (#4618)
Fixes #4452
2021-11-09 10:36:50 -07:00
alan-baker
339d4475c1
Improve decoration validation (#4490)
Fixes #4469

* Checks that decorations only usable with structure members are not
  used by OpDecorate or OpDecorateId
* Checks that decorations not allowed on structure members are not used
  with OpMemberDecorate
* Checks decoration targets for most core decorations
* Performs some Vulkan specific validation on deorations
2021-11-05 13:18:19 -04:00