Commit Graph

88 Commits

Author SHA1 Message Date
Greg Fischer
9d1b572884
spirv-opt: (WIP) Eliminate Dead Input Component Pass (#4720)
This adds the --eliminate-dead-input-components pass which currently
removes trailing unused components from input arrays.

Fixes #4532
2022-03-22 20:50:52 -06:00
Steven Perron
920156cf18
Add pass to remove DontInline function control (#4747)
Swift shader needs a way to inline all functions, even those marked as
DontInline.  See https://github.com/KhronosGroup/SPIRV-Tools/pull/4471.
This implements the suggestion I made in the PR.  We add a pass that
will remove the DontInline function control, so that the inlining passes
will inline them.

SwiftShader will still have to modify their code to add this pass before
the other passes are run.
2022-03-07 12:45:17 -05:00
Jaebaek Seo
fb9a10cd48
spirv-opt: add pass to Spread Volatile semantics (#4667)
Add a pass to spread Volatile semantics to variables with SMIDNV,
WarpIDNV, SubgroupSize, SubgroupLocalInvocationId, SubgroupEqMask,
SubgroupGeMask, SubgroupGtMask, SubgroupLeMask, or SubgroupLtMask BuiltIn
decorations or OpLoad for them when the shader model is the ray
generation, closest hit, miss, intersection, or callable shaders. This
pass can be used for VUID-StandaloneSpirv-VulkanMemoryModel-04678 and
VUID-StandaloneSpirv-VulkanMemoryModel-04679 (See "Standalone SPIR-V
Validation" section of Vulkan spec "Appendix A: Vulkan Environment for
SPIR-V").

Handle variables used by multiple entry points:

1. Update error check to make it working regardless of the order of
   entry points.
2. For a variable, if it is used by two entry points E1 and E2 and
   it needs the Volatile semantics for E1 while it does not for E2
  - If VulkanMemoryModel capability is enabled, which means we have to
    set memory operation of load instructions for the variable, we
    update load instructions in E1, but do not update the ones in E2.
  - If VulkanMemoryModel capability is disabled, which means we have
    to add Volatile decoration for the variable, we report an error
    because E1 needs to add Volatile decoration for the variable while
    E2 does not.

For the simplicity of the implementation, we assume that all functions
other than entry point functions are inlined.
2022-01-25 13:14:36 -05:00
Steven Perron
354a46a2a2
Rename strip reflect to strip nonsemantic (#4661)
In https://github.com/KhronosGroup/SPIRV-Tools/pull/3110, the strip reflect
pass was changed to also remove all explicitly nonsemantic instructions.  This
makes it so that the name of the pass no longer reflects what the pass actually
does.  This change renames the pass so that it reflects what the pass actaully does.
2021-12-15 09:55:30 -05:00
Jaebaek Seo
d997c83b10
Add spirv-opt pass to replace descriptor accesses based on variable indices (#4574)
This commit adds a spirv-opt pass to replace accesses to
descriptor array based on variable indices with constant
elements.

Before:
```
%descriptor = OpVariable %_ptr_array_Image Uniform
...
%ac = OpAccessChain %_ptr_Image %descriptor %variable_index
(some image instructions using %ac)
```
After:
```
%descriptor = OpVariable %_ptr_array_Image Uniform
...
OpSwitch %variable_index 0 %case0 1 %case1 ...
...
%case0 = OpLabel
%ac = OpAccessChain %_ptr_Image %descriptor %uint_0
...
%case1 = OpLabel
%ac = OpAccessChain %_ptr_Image %descriptor %uint_1
...
(use OpPhi for value with concrete type)
```
2021-10-26 17:20:58 -04:00
Jaebaek Seo
57e1d8ebe3
Add spirv-opt convert-to-sampled-image pass (#4340)
convert-to-sampled-image pass converts images and/or samplers with
given pairs of descriptor set and binding to sampled image.

If a pair of an image and a sampler have the same pair of descriptor
set and binding that is one of the given pairs, they will be
converted to a sampled image. In addition, if only an image has the
descriptor set and binding that is one of the given pairs, it will
be converted to a sampled image as well.

For example, when we have

  %a = OpLoad %type_2d_image %texture
  %b = OpLoad %type_sampler %sampler
  %combined = OpSampledImage %type_sampled_image %a %b
  %value = OpImageSampleExplicitLod %v4float %combined ...

1. If %texture and %sampler have the same descriptor set and binding

  %combine_texture_and_sampler = OpVaraible %ptr_type_sampled_image_Uniform
  ...
  %combined = OpLoad %type_sampled_image %combine_texture_and_sampler
  %value = OpImageSampleExplicitLod %v4float %combined ...

2. If %texture and %sampler have different pairs of descriptor set and binding

  %a = OpLoad %type_sampled_image %texture
  %extracted_image = OpImage %type_2d_image %a
  %b = OpLoad %type_sampler %sampler
  %combined = OpSampledImage %type_sampled_image %extracted_image %b
  %value = OpImageSampleExplicitLod %v4float %combined ...
2021-08-18 08:30:48 -04:00
ZHOU He
f9893c4549
spirv-opt: A pass to removed unused input on OpEntryPoint instructions. (#4275)
The new pass will removed interface variable on the OpEntryPoint instruction when they are not statically referenced in the call tree of the entry point.

It can be enabled on the command line using the options `remove-unused-interface-variables`.
2021-06-29 11:33:58 -04:00
Greg Fischer
48007a5c7f
Add interpolate legalization pass (#4220)
This pass converts an internal form of GLSLstd450 Interpolate ops
to the externally valid form. The external form takes the lvalue
of the interpolant. The internal form can do a load of the interpolant.
The pass replaces the load with its pointer. The internal form is
generated by glslang and possibly other frontends for HLSL shaders.
The new pass is called as part of HLSL legalization after all
propagation is complete.

Also adds internal interpolate form to pre-legalization validation
2021-03-31 14:26:36 -04:00
Ryan Harrison
9150cd441f
Remove WebGPU support (#4108)
Leaves SPV_ENV_WEBGPU_0 enum in place, but marked deprecated, so users
of the library are not broken by an API enum being removed.

Fixes #4101
2021-01-14 16:45:18 -05:00
Jaebaek Seo
f7da527757
Temporarily add EmptyPass to prevent glslang from failing (#4004)
Removing PropagateLineInfoPass and RedundantLineInfoElimPass from
56d0f5035 makes unit tests of many open source projects fail.
It will happen before submitting this glslang PR
https://github.com/KhronosGroup/glslang/pull/2440. This commit will be
git-reverted after merging the glslang PR.
2020-10-30 18:03:56 -04:00
Jaebaek Seo
56d0f50357
Propagate OpLine to all applied instructions in spirv-opt (#3951)
Based on the OpLine spec, an OpLine instruction must be applied to
the instructions physically following it up to the first occurrence
of the next end of block, the next OpLine instruction, or the next
OpNoLine instruction.

```
OpLine %file 0 0
OpNoLine
OpLine %file 1 1
OpStore %foo %int_1
%value = OpLoad %int %foo
OpLine %file 2 2
```

For the above code, the current spirv-opt keeps three line
instructions `OpLine %file 0 0`, `OpNoLine`, and `OpLine %file 1 1`
in `std::vector<Instruction> dbg_line_insts_` of Instruction class
for `OpStore %foo %int_1`. It does not put any line instruction to
`std::vector<Instruction> dbg_line_insts_` of
`%value = OpLoad %int %foo` even though `OpLine %file 1 1` must be
applied to `%value = OpLoad %int %foo` based on the spec.

This results in the missing line information for
`%value = OpLoad %int %foo` while each spirv-opt pass optimizes the
code. We have to put `OpLine %file 1 1` to
`std::vector<Instruction> dbg_line_insts_` of
both `%value = OpLoad %int %foo` and `OpStore %foo %int_1`.

This commit conducts the line instruction propagation and skips
emitting the eliminated line instructions at the end, which are the same
with PropagateLineInfoPass and RedundantLineInfoElimPass. This
commit removes PropagateLineInfoPass and RedundantLineInfoElimPass.

KhronosGroup/glslang#2440 is a related PR that stop using
PropagateLineInfoPass and RedundantLineInfoElimPass from glslang.
When the code in this PR applied, the glslang tests will pass.
2020-10-29 13:06:30 -04:00
greg-lunarg
1fe9bcc108
Instrument: Debug Printf support (#3215)
Create a pass to instrument OpDebugPrintf instructions.  This pass replaces all OpDebugPrintf instructions with instructions to write a record containing the string id and the all specified values into a special printf output buffer (if space allows). This pass is designed to support the printf validation in the Vulkan validation layers.

Fixes #3210
2020-03-12 09:19:52 -04:00
Steven Perron
35c9518c4e
Handle id overflow in the ssa rewriter. (#2845)
* Handle id overflow in the ssa rewriter.

Remove LocalSSAElim pass at the same time.  It does the same thing as the SSARewrite pass. Then even share almost all of the same code.

Fixes crbug.com/997246
2019-09-10 09:38:23 -04:00
greg-lunarg
d11725b1d4 Add --relax-float-ops and --convert-relaxed-to-half (#2808)
The first pass applies the RelaxedPrecision decoration to all executable
instructions with float32 based type results. The second pass converts
all executable instructions with RelaxedPrecision result to the equivalent
float16 type, inserting converts where necessary.
2019-09-03 13:22:13 -04:00
Steven Perron
35d98be3bc
Amd ext to khr (#2811)
Add the first steps to removing the AMD extension VK_AMD_shader_ballot.
Splitting up to make the PRs smaller.

Adding utilities to add capabilities and change the version of the
module.

Replaces the instructions:

OpGroupIAddNonUniformAMD = 5000
OpGroupFAddNonUniformAMD = 5001
OpGroupFMinNonUniformAMD = 5002
OpGroupUMinNonUniformAMD = 5003
OpGroupSMinNonUniformAMD = 5004
OpGroupFMaxNonUniformAMD = 5005
OpGroupUMaxNonUniformAMD = 5006
OpGroupSMaxNonUniformAMD = 5007

and extentend instructions

WriteInvocationAMD = 3
MbcntAMD = 4

Part of #2814
2019-08-29 12:48:17 -04:00
greg-lunarg
06407250a1 Instrument: Add support for Buffer Device Address extension (#2792) 2019-08-16 09:18:34 -04:00
Steven Perron
60043edfa1
Replace OpKill With function call. (#2790)
We are no able to inline OpKill instructions into a continue construct.
See #2433.  However, we have to be able to inline to correctly do
legalization.  This commit creates a pass that will wrap OpKill
instructions into a function of its own.  That way we are able to inline
the rest of the code.

The follow up to this will be to not inline any function that contains
an OpKill.

Fixes #2726
2019-08-14 09:27:12 -04:00
Steven Perron
4b64beb1ae
Add descriptor array scalar replacement (#2742)
Creates a pass that will replace a descriptor array with individual variables.  See #2740 for details.

Fixes #2740.
2019-08-08 10:53:19 -04:00
David Neto
31590104ec
Add pass to inject code for robust-buffer-access semantics (#2771)
spirv-opt: Add --graphics-robust-access

Clamps access chain indices so they are always
in bounds.

Assumes:
- Logical addressing mode
- No runtime-array-descriptor-indexing
- No variable pointers

Adds stub code for clamping coordinate and samples
for OpImageTexelPointer.

Adds SinglePassRunAndFail optimizer test fixture.

Android.mk: add source/opt/graphics_robust_access_pass.cpp

Adds Constant::GetSignExtendedValue, Constant::GetZeroExtendedValue
2019-07-30 19:52:46 -04:00
greg-lunarg
92c41ff1e7 Remove Common Uniform Elimination Pass (#2731)
Remove Common Uniform Elimination Pass

Fixes #2520.
2019-07-12 11:02:10 -04:00
Ryan Harrison
f6d9a17843
Add pass to fix some invalid unreachable blocks for WebGPU (#2563)
Attempts to split up unreachable blocks that are used both as a
merge-block and a continue-target.

Fixes #2429
2019-05-09 12:56:10 -04:00
Ryan Harrison
048dcd38ce
Implement WebGPU->Vulkan initializer conversion for 'Function' variables (#2513)
WebGPU requires certain variables to be initialized, whereas there are
known issues with using initializers in Vulkan. This PR is the first
of three implementing a pass to decompose initialized variables into
a variable declaration followed by a store. This has been broken up
into multiple PRs, because there 3 distinct cases that need to be
handled, which require separate implementations.

This first PR implements the basic infrastructure that is needed, and
handling of Function storage class variables. Private and Output will
be handled in future PRs.

This is part of resolving #2388
2019-04-16 14:31:36 -04:00
Ryan Harrison
102e430a88
Add pass to legalize OpVectorShuffle for WebGPU (#2509)
In WebGPU, the component operand 0xFFFFFFFF is forbidden, but in
Vulkan it is used to indicate a value is undefined. When converting to
WebGPU, 0xFFFFFFFF needs to converted to a legal value, though the
specific one does not matter, since it was used to indicate an
undefined entry in the original code. Choosing to use 0, since the
operands are required to be on [0, N-1], so 0 is guaranteed to always
be valid.

Fixes #2349
2019-04-12 12:14:23 -04:00
Steven Perron
3a0bc9e724
Add fix storage class code. (#2434)
This pass tries to fix validation error due to a mismatch of storage classes
in instructions.  There is no guarantee that all such error will be fixed,
and it is possible that in fixing these errors, it could lead to other
errors.

Fixes #2430.
2019-04-05 13:12:08 -04:00
Ryan Harrison
01964e325f
Add pass to generate needed initializers for WebGPU (#2481)
Fixes #2387
2019-04-03 11:44:09 -04:00
Ryan Harrison
e545522146
Add --strip-atomic-counter-memory (#2413)
Adds an optimization pass to remove usages of AtomicCounterMemory
bit. This bit is ignored in Vulkan environments and outright forbidden
in WebGPU ones.

Fixes #2242
2019-03-14 13:34:33 -04:00
Steven Perron
1b0047f210
Add pass to remove dead members. (#2379)
Add a pass that looks for members of structs whose values do not affects
the output of the shader. Those members are then removed and just
treated like padding in the struct.
2019-02-14 13:42:35 -05:00
Steven Perron
dd4157dcee
Sink (#2284)
Add code sinking pass. It will move OpLoad and OpAccessChain instructions as close as possible to their uses.

Part of #1611.
2019-01-17 15:56:36 -05:00
alan-baker
e510b1bac5
Update memory model (#1904)
Upgrade to VulkanKHR memory model

* Converts Logical GLSL450 memory model to Logical VulkanKHR
* Adds extension and capability
* Removes deprecated decorations and replaces them with appropriate
flags on downstream instructions
* Support for Workgroup upgrades
* Support for copy memory
* Adding support for image functions
* Adding barrier upgrades and tests
* Use QueueFamilyKHR scope instead of device
2018-11-30 14:15:51 -05:00
greg-lunarg
c37388f1ad Add passes to propagate and eliminate redundant line instructions (#2027). (#2039)
These are bookend passes designed to help preserve line information
across passes which delete, move and clone instructions. The propagation
pass attaches a debug line instruction to every instruction based on
SPIR-V line propagation rules. It should be performed before optimization.
The redundant line elimination pass eliminates all line instructions
which match the previous line instruction. This pass should be performed
at the end of optimization to reduce physical SPIR-V file size.

Fixes #2027.
2018-11-15 14:06:17 -05:00
greg-lunarg
1e9fc1aac1 Add base and core bindless validation instrumentation classes (#2014)
* Add base and core bindless validation instrumentation classes

* Fix formatting.

* Few more formatting fixes

* Fix build failure

* More build fixes

* Need to call non-const functions in order.

Specifically, these are functions which call TakeNextId(). These need to
be called in a specific order to guarantee that tests which do exact
compares will work across all platforms. c++ pretty much does not
guarantee order of evaluation of operands, so any such functions need to
be called separately in individual statements to guarantee order.

* More ordering.

* And more ordering.

* And more formatting.

* Attempt to fix NDK build

* Another attempt to address NDK build problem.

* One more attempt at NDK build failure

* Add instrument.hpp to BUILD.gn

* Some name improvement in instrument.hpp

* Change all types in instrument.hpp to int.

* Improve documentation in instrument.hpp

* Format fixes

* Comment clean up in instrument.hpp

* imageInst -> image_inst

* Fix GetLabel() issue.
2018-11-08 13:54:54 -05:00
Steven Perron
75c1bf2843
Add option for the max id bound. (#1870)
* Create a new entry point for the optimizer

Creates a new struct to hold the options for the optimizer, and creates
an entry point that take the optimizer options as a parameter.

The old entry point that takes validator options are now deprecated.
The validator options will be one of the optimizer options.

Part of the optimizer options will also be the upper bound on the id bound.

* Add a command line option to set the max value for the id bound.  The default is 0x3FFFFF.

* Modify `TakeNextIdBound` to return 0 when the limit is reached.
2018-09-10 11:49:41 -04:00
dan sinclair
eda2cfbe12
Cleanup includes. (#1795)
This Cl cleans up the include paths to be relative to the top level
directory. Various include-what-you-use fixes have been added.
2018-08-03 15:06:09 -04:00
dan sinclair
58a6876cee
Rewrite include guards (#1793)
This CL rewrites the include guards to make PRESUBMIT.py include guard
check happy.
2018-08-03 08:05:33 -04:00
Alan Baker
755e5c9420 Transform to combine consecutive access chains
* Combines OpAccessChain, OpInBoundsAccessChain, OpPtrAccessChain and
OpInBoundsPtrAccessChain
* New folding rule to fold add with 0 for integers
 * Converts to a bitcast if the result type does not match the operand
 type
V
2018-07-31 13:42:47 -04:00
Steven Perron
fe2fbee294 Delete the insert-extract-elim pass.
Replaces anything that creates an insert-extract-elim pass and create
a simplifiation pass instead.  Then delete the implementation of the
pass.

Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1570.
2018-06-01 10:13:39 -04:00
Steven Perron
af430ec822 Add pass to fold a load feeding an extract.
We have already disabled common uniform elimination because it created
sequences of loads an entire uniform object, then we extract just a
single element.  This caused problems in some drivers, and is just
generally slow because it loads more memory than needed.

However, there are other way to get into this situation, so I've added
a pass that looks specifically for this pattern and removes it when only
a portion of the load is used.

Fixes #1547.
2018-05-14 15:40:34 -04:00
Toomas Remmelg
1dc2458060 Add a loop fusion pass.
This pass will look for adjacent loops that are compatible and legal to
be fused.

Loops are compatible if:

- they both have one induction variable
- they have the same upper and lower bounds
    - same initial value
    - same condition
- they have the same update step
- they are adjacent
- there are no break/continue in either of them

Fusion is legal if:

- fused loops do not have any dependencies with dependence distance
  greater than 0 that did not exist in the original loops.
- there are no function calls in the loops (could have side-effects)
- there are no barriers in the loops

It will fuse all such loops as long as the number of registers used for
the fused loop stays under the threshold defined by
max_registers_per_loop.
2018-05-01 15:40:37 -04:00
Stephen McGroarty
9a5dd6fe88 Support loop fission.
Adds support for spliting loops whose register pressure exceeds a user
provided level. This pass will split a loop into two or more loops given
that the loop is a top level loop and that spliting the loop is legal.
Control flow is left intact for dead code elimination to remove.

This pass is enabled with the --loop-fission flag to spirv-opt.
2018-05-01 15:15:10 -04:00
Steven Perron
2c0ce87210
Vector DCE (#1512)
Introduce a pass that does a DCE type analysis for vector elements
instead of the whole vector as a single element.

It will then rewrite instructions that are not used with something else.
For example, an instruction whose value are not used, even though it is
referenced, is replaced with an OpUndef.
2018-04-23 11:13:07 -04:00
Victor Lomuller
10e5d7cf13 Add a loop peeling pass.
For each loop in a function, the pass walks the loops from inner to outer most loop
and tries to peel loop for which a certain amount of iteration can be done before or after the loop.

To limit code growth, peeling will not happen if the growth in code size goes above a configurable threshold.
2018-04-11 15:41:29 +01:00
Steven Perron
c4dc046399 Copy propagate arrays
The sprir-v generated from HLSL code contain many copyies of very large
arrays.  Not only are these time consumming, but they also cause
problems for drivers because they require too much space.

To work around this, we will implement an array copy propagation.  Note
that we will not implement a complete array data flow analysis in order
to implement this.  We will be looking for very simple cases:

1) The source must never be stored to.
2) The target must be stored to exactly once.
3) The store to the target must be a store to the entire array, and be a
copy of the entire source.
4) All loads of the target must be dominated by the store.

The hard part is keeping all of the types correct.  We do not want to
have to do too large a search to update everything, which may not be
possible, do we give up if we see any instruction that might be hard to
update.

Also in types.h, the element decorations are not stored in an std::map.
This change was done so the hashing algorithm for a Struct is
consistent.  With the std::unordered_map, the traversal order was
non-deterministic leading to the same type getting hashed to different
values.  See |Struct::GetExtraHashWords|.

Contributes to #1416.
2018-03-26 14:44:41 -04:00
Diego Novillo
735d8a579e SSA rewrite pass.
This pass replaces the load/store elimination passes.  It implements the
SSA re-writing algorithm proposed in

     Simple and Efficient Construction of Static Single Assignment Form.
     Braun M., Buchwald S., Hack S., Leißa R., Mallon C., Zwinkau A. (2013)
     In: Jhala R., De Bosschere K. (eds)
     Compiler Construction. CC 2013.
     Lecture Notes in Computer Science, vol 7791.
     Springer, Berlin, Heidelberg

     https://link.springer.com/chapter/10.1007/978-3-642-37051-9_6

In contrast to common eager algorithms based on dominance and dominance
frontier information, this algorithm works backwards from load operations.

When a target variable is loaded, it queries the variable's reaching
definition.  If the reaching definition is unknown at the current location,
it searches backwards in the CFG, inserting Phi instructions at join points
in the CFG along the way until it finds the desired store instruction.

The algorithm avoids repeated lookups using memoization.

For reducible CFGs, which are a superset of the structured CFGs in SPIRV,
this algorithm is proven to produce minimal SSA.  That is, it inserts the
minimal number of Phi instructions required to ensure the SSA property, but
some Phi instructions may be dead
(https://en.wikipedia.org/wiki/Static_single_assignment_form).
2018-03-20 20:56:55 -04:00
David Neto
844e186cf7 Add --strip-reflect pass
Strips reflection info. This is limited to decorations and
decoration instructions related to the SPV_GOOGLE_hlsl_functionality1
extension.
It will remove the OpExtension for SPV_GOOGLE_hlsl_functionality1.
It will also remove the OpExtension for SPV_GOOGLE_decorate_string
if there are no further remaining uses of OpDecorateStringGOOGLE.

Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1398
2018-03-15 21:20:42 -04:00
Victor Lomuller
3497a94460 Add loop unswitch pass.
It moves all conditional branching and switch whose conditions are loop
invariant and uniform. Before performing the loop unswitch we check that
the loop does not contain any instruction that would prevent it
(barriers, group instructions etc.).
2018-02-27 08:52:46 -05:00
Stephen McGroarty
dd8400e150 Initial support for loop unrolling.
This patch adds initial support for loop unrolling in the form of a
series of utility classes which perform the unrolling. The pass can
be run with the command spirv-opt --loop-unroll. This will unroll
loops within the module which have the unroll hint set. The unroller
imposes a number of requirements on the loops it can unroll. These are
documented in the comments for the LoopUtils::CanPerformUnroll method in
loop_utils.h. Some of the restrictions will be lifted in future patches.
2018-02-14 15:44:38 -05:00
Alexander Johnston
84ccd0b9ae Loop invariant code motion initial implementation 2018-02-08 22:55:47 -05:00
Steven Perron
61d8c0384b Add pass to reaplce invalid opcodes
Creates a pass that will remove instructions that are invalid for the
current shader stage.  For the instruction to be considered for replacement

1) The opcode must be valid for a shader modules.
2) The opcode must be invalid for the current shader stage.
3) All entry points to the module must be for the same shader stage.
4) The function containing the instruction must be reachable from an entry point.

Fixes #1247.
2018-02-01 15:25:09 -05:00
GregF
f28b106173 InsertExtractElim: Split out DeadInsertElim as separate pass 2018-01-30 08:52:14 -05:00
Alan Baker
2e93e806e4 Initial implementation of if conversion
* Handles simple cases only
* Identifies phis in blocks with two predecessors and attempts to
convert the phi to an select
 * does not perform code motion currently so the converted values must
 dominate the join point (e.g. can't be defined in the branches)
 * limited for now to two predecessors, but can be extended to handle
 more cases
* Adding if conversion to -O and -Os
2018-01-25 09:42:00 -08:00