Commit Graph

97 Commits

Author SHA1 Message Date
alan-baker
f3cec93665
Support SPV_KHR_terminate_invocation (#3568)
Covers:
- assembler
- disassembler
- validator
- optimizer

Co-authored-by: David Neto <dneto@google.com>
2020-07-22 11:45:02 -04:00
greg-lunarg
4410272bdd
Remove deprecated interfaces from instrument passes (#3361) 2020-05-21 13:10:42 -04:00
greg-lunarg
1fe9bcc108
Instrument: Debug Printf support (#3215)
Create a pass to instrument OpDebugPrintf instructions.  This pass replaces all OpDebugPrintf instructions with instructions that write a record containing the string id and all of the specified values into a special printf output buffer (if space allows). This pass is designed to support the printf validation in the Vulkan validation layers.

Fixes #3210
2020-03-12 09:19:52 -04:00
David Neto
5d786f6cc7
Clarify mapping of target env to SPIR-V version (#3150)
* Clarify mapping of target env to SPIR-V version

It depends on the API method.

* Update SPIR-V version comments on validator
2020-01-24 16:26:07 -05:00
greg-lunarg
578c5ac133 Change default version for CreateInstBindlessCheckPass to 2 (#3119) 2019-12-27 10:46:43 -05:00
greg-lunarg
9215c1b7df Fix convert-relax-to-half invalid code (#3099) (#3106) 2019-12-20 21:08:12 -05:00
David Neto
af7410597e graphics robust access: use signed clamp (#3073)
Access chain indices are always interpreted as signed integers.
So use signed clamp instead of unsigned clamp.  We must also
clamp to the max signed int for the index type.
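
The difference between the two clamps can be sketched as follows. This is only an illustration under assumptions, not the pass's code: `ClampIndexSigned` and `ClampIndexUnsigned` are hypothetical helpers, and the index type is assumed to be a 32-bit integer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper, not part of SPIRV-Tools: clamps a signed 32-bit index
// into [0, count - 1], where count is the number of elements.
int32_t ClampIndexSigned(int32_t index, uint32_t count) {
  assert(count >= 1);
  const uint32_t max_signed =
      static_cast<uint32_t>(std::numeric_limits<int32_t>::max());
  // The upper bound may not exceed the largest signed value of the index type.
  const int32_t upper = static_cast<int32_t>(std::min(count - 1u, max_signed));
  return std::max<int32_t>(0, std::min(index, upper));
}

// For contrast: an unsigned clamp reinterprets -1 as 0xFFFFFFFF and maps it
// to the last element instead of the first.
uint32_t ClampIndexUnsigned(int32_t index, uint32_t count) {
  assert(count >= 1);
  return std::min(static_cast<uint32_t>(index), count - 1u);
}
```

With four elements, the signed version sends an index of -1 to element 0, while the unsigned version sends it to element 3.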

Fixes #3072
2019-12-03 11:18:56 -05:00
Ryan Harrison
19b256616d
For WebGPU<->Vulkan optimization, set correct execution environment (#2834)
Fixes #2833
2019-09-04 13:08:58 -04:00
greg-lunarg
d11725b1d4 Add --relax-float-ops and --convert-relaxed-to-half (#2808)
The first pass applies the RelaxedPrecision decoration to all executable
instructions with float32 based type results. The second pass converts
all executable instructions with RelaxedPrecision result to the equivalent
float16 type, inserting converts where necessary.
2019-09-03 13:22:13 -04:00
Steven Perron
35d98be3bc
Amd ext to khr (#2811)
Add the first steps to removing the AMD extension VK_AMD_shader_ballot.
Splitting up to make the PRs smaller.

Adding utilities to add capabilities and change the version of the
module.

Replaces the instructions:

OpGroupIAddNonUniformAMD = 5000
OpGroupFAddNonUniformAMD = 5001
OpGroupFMinNonUniformAMD = 5002
OpGroupUMinNonUniformAMD = 5003
OpGroupSMinNonUniformAMD = 5004
OpGroupFMaxNonUniformAMD = 5005
OpGroupUMaxNonUniformAMD = 5006
OpGroupSMaxNonUniformAMD = 5007

and extended instructions

WriteInvocationAMD = 3
MbcntAMD = 4

Part of #2814
2019-08-29 12:48:17 -04:00
greg-lunarg
06407250a1 Instrument: Add support for Buffer Device Address extension (#2792) 2019-08-16 09:18:34 -04:00
Steven Perron
60043edfa1
Replace OpKill With function call. (#2790)
We are not able to inline OpKill instructions into a continue construct.
See #2433.  However, we have to be able to inline to correctly do
legalization.  This commit creates a pass that will wrap OpKill
instructions into a function of its own.  That way we are able to inline
the rest of the code.

The follow up to this will be to not inline any function that contains
an OpKill.

Fixes #2726
2019-08-14 09:27:12 -04:00
Steven Perron
4b64beb1ae
Add descriptor array scalar replacement (#2742)
Creates a pass that will replace a descriptor array with individual variables.  See #2740 for details.

Fixes #2740.
2019-08-08 10:53:19 -04:00
David Neto
31590104ec
Add pass to inject code for robust-buffer-access semantics (#2771)
spirv-opt: Add --graphics-robust-access

Clamps access chain indices so they are always
in bounds.

Assumes:
- Logical addressing mode
- No runtime-array-descriptor-indexing
- No variable pointers

Adds stub code for clamping coordinate and samples
for OpImageTexelPointer.

Adds SinglePassRunAndFail optimizer test fixture.

Android.mk: add source/opt/graphics_robust_access_pass.cpp

Adds Constant::GetSignExtendedValue, Constant::GetZeroExtendedValue
2019-07-30 19:52:46 -04:00
greg-lunarg
92c41ff1e7 Remove Common Uniform Elimination Pass (#2731)
Remove Common Uniform Elimination Pass

Fixes #2520.
2019-07-12 11:02:10 -04:00
greg-lunarg
3d62cb8148 Instrument: Add version 2 of record formats (#2630)
New version has additional word in stage-specific section. Also
some changes in content for tessellation and compute shaders. Either
version can be invoked at pass creation. This is done to ease integration
and updating of validation layers. Version 1 is deprecated and eventually
will go away.

Also sneaking in fix to version 1 compute shaders.
2019-05-29 15:08:21 -04:00
Ryan Harrison
f6d9a17843
Add pass to fix some invalid unreachable blocks for WebGPU (#2563)
Attempts to split up unreachable blocks that are used both as a
merge-block and a continue-target.

Fixes #2429
2019-05-09 12:56:10 -04:00
Ryan Harrison
048dcd38ce
Implement WebGPU->Vulkan initializer conversion for 'Function' variables (#2513)
WebGPU requires certain variables to be initialized, whereas there are
known issues with using initializers in Vulkan. This PR is the first
of three implementing a pass to decompose initialized variables into
a variable declaration followed by a store. This has been broken up
into multiple PRs, because there are 3 distinct cases that need to be
handled, which require separate implementations.

This first PR implements the basic infrastructure that is needed, and
handling of Function storage class variables. Private and Output will
be handled in future PRs.

This is part of resolving #2388
2019-04-16 14:31:36 -04:00
Ryan Harrison
102e430a88
Add pass to legalize OpVectorShuffle for WebGPU (#2509)
In WebGPU, the component operand 0xFFFFFFFF is forbidden, but in
Vulkan it is used to indicate a value is undefined. When converting to
WebGPU, 0xFFFFFFFF needs to be converted to a legal value, though the
specific one does not matter, since it was used to indicate an
undefined entry in the original code. Choosing to use 0, since the
operands are required to be in [0, N-1], so 0 is guaranteed to always
be valid.
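
A minimal sketch of that replacement, assuming the shuffle's component operands are available as a plain vector. `LegalizeShuffleComponents` and `kUndefinedComponent` are hypothetical names chosen for illustration, not identifiers from the pass.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical name for the component literal that marks an undefined value.
constexpr uint32_t kUndefinedComponent = 0xFFFFFFFFu;

// Hypothetical helper: returns the component list with every undefined
// component replaced by 0. `n` is the number of selectable input components,
// so every other operand is expected to be in [0, n - 1].
std::vector<uint32_t> LegalizeShuffleComponents(
    std::vector<uint32_t> components, uint32_t n) {
  for (uint32_t& component : components) {
    if (component == kUndefinedComponent) {
      component = 0;  // Any legal value works; 0 is always in range.
    } else {
      assert(component < n);
    }
  }
  return components;
}
```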

Fixes #2349
2019-04-12 12:14:23 -04:00
Ryan Harrison
0cb2d4079e
Add WebGPU->Vulkan and Vulkan->WebGPU flags in spirv-opt (#2496)
Renames the existing flag '--webgpu-mode' to '--vulkan-to-webgpu' for
the Vulkan->WebGPU operation, and adds a new flag '--webgpu-to-vulkan'
for the WebGPU->Vulkan operation.

Currently '--webgpu-to-vulkan' doesn't have any passes associated with
it yet, but further patches will implement them.

Fixes #2495
2019-04-05 15:12:26 -04:00
Steven Perron
3a0bc9e724
Add fix storage class code. (#2434)
This pass tries to fix validation errors due to a mismatch of storage classes
in instructions.  There is no guarantee that all such errors will be fixed,
and it is possible that in fixing these errors, it could lead to other
errors.

Fixes #2430.
2019-04-05 13:12:08 -04:00
Ryan Harrison
01964e325f
Add pass to generate needed initializers for WebGPU (#2481)
Fixes #2387
2019-04-03 11:44:09 -04:00
alan-baker
42e6f1aa62
Add option to validate after each pass (#2462)
* New command-line option to opt: --validate-after-all
 * Pass manager will validate after each pass it runs
2019-03-26 14:38:59 -04:00
greg-lunarg
e1a76269b6 Bindless Validation: Descriptor Initialization Check (#2419)
If SPV_EXT_descriptor_indexing is enabled, add check that for a
descriptor-based reference, the descriptor is initialized. Initialization
data is stored in the debug input buffer, added to the length information
already there. This feature must be separately enabled on the pass
creation routine. NOTE: Currently just supports image references; buffer
references are still TODO.
2019-03-19 09:53:43 -04:00
Ryan Harrison
e545522146
Add --strip-atomic-counter-memory (#2413)
Adds an optimization pass to remove usages of AtomicCounterMemory
bit. This bit is ignored in Vulkan environments and outright forbidden
in WebGPU ones.

Fixes #2242
2019-03-14 13:34:33 -04:00
Steven Perron
1b0047f210
Add pass to remove dead members. (#2379)
Add a pass that looks for members of structs whose values do not affect
the output of the shader. Those members are then removed and just
treated like padding in the struct.
2019-02-14 13:42:35 -05:00
greg-lunarg
cf21146137 Expand bindless bounds checking to runtime-sized descriptor arrays (#2316) 2019-02-07 14:00:36 -05:00
Steven Perron
dd4157dcee
Sink (#2284)
Add code sinking pass. It will move OpLoad and OpAccessChain instructions as close as possible to their uses.

Part of #1611.
2019-01-17 15:56:36 -05:00
Ryan Harrison
47c08a79c4
Implement initial --webgpu-mode flag (#2217)
Fixes #2166
2018-12-18 15:10:34 -05:00
Ryan Harrison
e0292c269d
Add --target-env flag to spirv-opt (#2216)
Fixes #2199
2018-12-17 16:54:23 -05:00
alan-baker
e510b1bac5
Update memory model (#1904)
Upgrade to VulkanKHR memory model

* Converts Logical GLSL450 memory model to Logical VulkanKHR
* Adds extension and capability
* Removes deprecated decorations and replaces them with appropriate
flags on downstream instructions
* Support for Workgroup upgrades
* Support for copy memory
* Adding support for image functions
* Adding barrier upgrades and tests
* Use QueueFamilyKHR scope instead of device
2018-11-30 14:15:51 -05:00
greg-lunarg
c37388f1ad Add passes to propagate and eliminate redundant line instructions (#2027). (#2039)
These are bookend passes designed to help preserve line information
across passes which delete, move and clone instructions. The propagation
pass attaches a debug line instruction to every instruction based on
SPIR-V line propagation rules. It should be performed before optimization.
The redundant line elimination pass eliminates all line instructions
which match the previous line instruction. This pass should be performed
at the end of optimization to reduce physical SPIR-V file size.
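
The redundant line elimination step can be pictured with the sketch below. It is a simplification under assumptions: line information is modelled as a flat sequence of hypothetical `LineInfo` records rather than as debug instructions attached to a module, and `EliminateRedundantLines` is an illustrative name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record standing in for one debug line instruction.
struct LineInfo {
  uint32_t file_id;
  uint32_t line;
  uint32_t column;
};

bool operator==(const LineInfo& a, const LineInfo& b) {
  return a.file_id == b.file_id && a.line == b.line && a.column == b.column;
}

// Keeps a record only if it differs from the most recently kept record.
std::vector<LineInfo> EliminateRedundantLines(
    const std::vector<LineInfo>& lines) {
  std::vector<LineInfo> kept;
  for (const LineInfo& info : lines) {
    if (kept.empty() || !(kept.back() == info)) kept.push_back(info);
  }
  return kept;
}
```

Only consecutive repeats are dropped; a record that reappears after a different one is kept, because it no longer matches the previous line instruction.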

Fixes #2027.
2018-11-15 14:06:17 -05:00
greg-lunarg
1e9fc1aac1 Add base and core bindless validation instrumentation classes (#2014)
* Add base and core bindless validation instrumentation classes

* Fix formatting.

* Few more formatting fixes

* Fix build failure

* More build fixes

* Need to call non-const functions in order.

Specifically, these are functions which call TakeNextId(). These need to
be called in a specific order to guarantee that tests which do exact
compares will work across all platforms. C++ pretty much does not
guarantee order of evaluation of operands, so any such functions need to
be called separately in individual statements to guarantee order.

* More ordering.

* And more ordering.

* And more formatting.

* Attempt to fix NDK build

* Another attempt to address NDK build problem.

* One more attempt at NDK build failure

* Add instrument.hpp to BUILD.gn

* Some name improvement in instrument.hpp

* Change all types in instrument.hpp to int.

* Improve documentation in instrument.hpp

* Format fixes

* Comment clean up in instrument.hpp

* imageInst -> image_inst

* Fix GetLabel() issue.
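
The note above about calling functions that use TakeNextId() in separate statements can be illustrated with a small sketch. `IdSource` is a hypothetical stand-in for whatever object hands out ids; only the ordering idea is taken from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical stand-in for an id source with a non-const TakeNextId().
struct IdSource {
  uint32_t next_id = 1;
  uint32_t TakeNextId() { return next_id++; }
};

// Fragile: the order in which the two arguments are evaluated is unspecified,
// so different compilers may hand out the ids in a different order.
//   return std::make_pair(ids.TakeNextId(), ids.TakeNextId());

// Deterministic: each call is its own statement, so the order is fixed.
std::pair<uint32_t, uint32_t> TakeTwoIdsInOrder(IdSource& ids) {
  const uint32_t first = ids.TakeNextId();
  const uint32_t second = ids.TakeNextId();
  return std::make_pair(first, second);
}
```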
2018-11-08 13:54:54 -05:00
Steven Perron
75c1bf2843
Add option for the max id bound. (#1870)
* Create a new entry point for the optimizer

Creates a new struct to hold the options for the optimizer, and creates
an entry point that takes the optimizer options as a parameter.

The old entry point that takes validator options is now deprecated.
The validator options will be one of the optimizer options.

Part of the optimizer options will also be the upper bound on the id bound.

* Add a command line option to set the max value for the id bound.  The default is 0x3FFFFF.

* Modify `TakeNextIdBound` to return 0 when the limit is reached.
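
A rough sketch of the behaviour described in the last item, assuming the id bound is an exclusive upper limit on ids. `IdAllocator` is a hypothetical class written for illustration; only the default limit of 0x3FFFFF and the return value of 0 come from the text above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical allocator that mimics the described behaviour.
class IdAllocator {
 public:
  explicit IdAllocator(uint32_t initial_bound, uint32_t max_bound = 0x3FFFFF)
      : bound_(initial_bound), max_bound_(max_bound) {}

  // Returns a fresh id and raises the bound, or 0 once the limit is reached.
  // Callers can treat 0 as failure because 0 is never a valid SPIR-V id.
  uint32_t TakeNextIdBound() {
    if (bound_ >= max_bound_) return 0;
    return bound_++;
  }

  uint32_t bound() const { return bound_; }

 private:
  uint32_t bound_;
  uint32_t max_bound_;
};
```

A caller would check the result for 0 and stop transforming the module instead of emitting an instruction with an invalid id.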
2018-09-10 11:49:41 -04:00
Steven Perron
bcb0b6935c
Reenable --skip-validation. (#1820)
In previous changes, the option `--skip-validation` was disabled.  This
change is to reenable it.
2018-08-13 13:18:46 -04:00
Steven Perron
5c8b4f5a1c
Validate the input to Optimizer::Run (#1799)
* Run the validator in the optimization fuzzers.

The optimizer assumes that the input to the optimizer is valid.  Since
the fuzzers do not check that the input is valid before passing the
spir-v to the optimizer, we are getting a few errors.

The solution is to run the validator in the optimizer to validate the
input.

For the legalization passes, we need to add an extra option to the
validator to accept certain types of variable pointers, even if the
capability is not given.  At the same time, we changed the option
"--legalize-hlsl" to relax the validator in the same way instead of
turning it off.
2018-08-08 11:16:19 -04:00
dan sinclair
58a6876cee
Rewrite include guards (#1793)
This CL rewrites the include guards to make PRESUBMIT.py include guard
check happy.
2018-08-03 08:05:33 -04:00
Alan Baker
755e5c9420 Transform to combine consecutive access chains
* Combines OpAccessChain, OpInBoundsAccessChain, OpPtrAccessChain and
OpInBoundsPtrAccessChain
* New folding rule to fold add with 0 for integers
 * Converts to a bitcast if the result type does not match the operand
 type
2018-07-31 13:42:47 -04:00
Diego Novillo
99fe61e724 Add API to create passes out of a list of command-line flags.
This re-implements the -Oconfig=<file> flag to use a new API that takes
a list of command-line flags representing optimization passes.

This moves the processing of flags that create new optimization passes
out of spirv-opt and into the library API.  Useful for other tools that
want to incorporate a facility similar to -Oconfig.

The main changes are:

1- Add a new public function Optimizer::RegisterPassesFromFlags. This
   takes a vector of strings.  Each string is assumed to have the form
   '--pass_name[=pass_args]'.  It creates and registers into the pass
   manager all the passes specified in the vector.  Each pass is
   validated internally.  Failure to create a pass instance causes the
   function to return false and a diagnostic is emitted to the
   registered message consumer.

2- Re-implements -Oconfig in spirv-opt to use the new API.
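
The flag form '--pass_name[=pass_args]' from item 1 can be split as in the sketch below. This is not the library's parser: `ParsedFlag` and `ParsePassFlag` are hypothetical names, and the sketch handles only that one form.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type: the pass name, its optional arguments, and
// whether the flag had the expected shape.
struct ParsedFlag {
  std::string name;
  std::string args;
  bool ok;
};

// Splits a flag of the form '--pass_name[=pass_args]'.
ParsedFlag ParsePassFlag(const std::string& flag) {
  ParsedFlag parsed{"", "", false};
  if (flag.compare(0, 2, "--") != 0) return parsed;  // Missing "--" prefix.
  const std::string::size_type eq = flag.find('=');
  if (eq == std::string::npos) {
    parsed.name = flag.substr(2);
  } else {
    parsed.name = flag.substr(2, eq - 2);
    parsed.args = flag.substr(eq + 1);
  }
  parsed.ok = !parsed.name.empty();
  return parsed;
}
```

For example, "--pass_name=pass_args" yields the name "pass_name" and the arguments "pass_args", while a string without the leading "--" is rejected.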
2018-07-27 15:10:08 -04:00
Arseny Kapoulkine
f765d16bd9 Add external interface for creating a pass token
Currently it's impossible for external code to register a pass because
the only source file that can create pass tokens is optimizer.cpp. This
makes it hard to add passes that can't be upstreamed since you can't run
them from the usual pass sequence without reimplementing Optimizer.

This change adds a PassToken constructor that takes unique_ptr to
opt::Pass; if out-of-tree code implements opt::Pass it can register a
custom pass without having to add it to SPIRV-Tools source code.
2018-05-25 09:19:43 -04:00
Steven Perron
a579e720a8 Remove the limit on struct size in SROA.
Removes the limit on scalar replacement for the legalization passes.
This is done by adding an option to the pass (and command line option)
to set the limit on maximum size of the composite that scalar
replacement is willing to divide.

Fixes #1494.
2018-05-18 10:03:46 -04:00
Steven Perron
af430ec822 Add pass to fold a load feeding an extract.
We have already disabled common uniform elimination because it created
sequences that load an entire uniform object and then extract just a
single element.  This caused problems in some drivers, and is just
generally slow because it loads more memory than needed.

However, there are other ways to get into this situation, so I've added
a pass that looks specifically for this pattern and removes it when only
a portion of the load is used.

Fixes #1547.
2018-05-14 15:40:34 -04:00
Toomas Remmelg
1dc2458060 Add a loop fusion pass.
This pass will look for adjacent loops that are compatible and legal to
be fused.

Loops are compatible if:

- they both have one induction variable
- they have the same upper and lower bounds
    - same initial value
    - same condition
- they have the same update step
- they are adjacent
- there are no break/continue in either of them

Fusion is legal if:

- fused loops do not have any dependencies with dependence distance
  greater than 0 that did not exist in the original loops.
- there are no function calls in the loops (could have side-effects)
- there are no barriers in the loops

It will fuse all such loops as long as the number of registers used for
the fused loop stays under the threshold defined by
max_registers_per_loop.
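
As a source-level picture of what fusion does, the sketch below shows two adjacent loops with the same bounds and step, and a fused equivalent. The function names are hypothetical, and the pass itself works on SPIR-V rather than on C++ source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Before fusion: two adjacent loops with one induction variable each, the
// same bounds, the same step, and no break or continue.
void ScaleThenOffsetSeparate(std::vector<int>& a, std::vector<int>& b) {
  assert(b.size() >= a.size());
  for (std::size_t i = 0; i < a.size(); ++i) a[i] *= 2;
  for (std::size_t i = 0; i < a.size(); ++i) b[i] = a[i] + 1;
}

// After fusion: b[i] reads the a[i] written in the same iteration, so the
// only dependence has distance 0 and the results are unchanged.
void ScaleThenOffsetFused(std::vector<int>& a, std::vector<int>& b) {
  assert(b.size() >= a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    a[i] *= 2;
    b[i] = a[i] + 1;
  }
}
```

If the second loop read a[i + 1] instead, fusing would make it see a value the first loop had not yet updated, which is the kind of new dependence the legality rules exclude.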
2018-05-01 15:40:37 -04:00
Stephen McGroarty
9a5dd6fe88 Support loop fission.
Adds support for splitting loops whose register pressure exceeds a
user-provided level. This pass will split a loop into two or more loops given
that the loop is a top-level loop and that splitting the loop is legal.
Control flow is left intact for dead code elimination to remove.

This pass is enabled with the --loop-fission flag to spirv-opt.
2018-05-01 15:15:10 -04:00
Steven Perron
2c0ce87210
Vector DCE (#1512)
Introduce a pass that does a DCE type analysis for vector elements
instead of the whole vector as a single element.

It will then rewrite instructions that are not used with something else.
For example, an instruction whose value is not used, even though it is
referenced, is replaced with an OpUndef.
2018-04-23 11:13:07 -04:00
Victor Lomuller
10e5d7cf13 Add a loop peeling pass.
For each function, the pass walks the loops from innermost to outermost
and tries to peel loops for which a certain number of iterations can be done before or after the loop.

To limit code growth, peeling will not happen if the growth in code size goes above a configurable threshold.
2018-04-11 15:41:29 +01:00
Steven Perron
c4dc046399 Copy propagate arrays
The SPIR-V generated from HLSL code contains many copies of very large
arrays.  Not only are these time-consuming, but they also cause
problems for drivers because they require too much space.

To work around this, we will implement an array copy propagation.  Note
that we will not implement a complete array data flow analysis in order
to implement this.  We will be looking for very simple cases:

1) The source must never be stored to.
2) The target must be stored to exactly once.
3) The store to the target must be a store to the entire array, and be a
copy of the entire source.
4) All loads of the target must be dominated by the store.

The hard part is keeping all of the types correct.  We do not want to
have to do too large a search to update everything, which may not be
possible, so we give up if we see any instruction that might be hard to
update.

Also in types.h, the element decorations are now stored in a std::map.
This change was done so the hashing algorithm for a Struct is
consistent.  With the std::unordered_map, the traversal order was
non-deterministic leading to the same type getting hashed to different
values.  See |Struct::GetExtraHashWords|.

Contributes to #1416.
2018-03-26 14:44:41 -04:00
Jaebaek Seo
3b594e1630 Add --time-report to spirv-opt
This patch adds a new option --time-report to spirv-opt.  For each pass
executed by spirv-opt, the flag prints resource utilization for the pass
(CPU time, wall time, RSS and page faults)

This fixes issue #1378
2018-03-20 21:30:06 -04:00
Diego Novillo
735d8a579e SSA rewrite pass.
This pass replaces the load/store elimination passes.  It implements the
SSA re-writing algorithm proposed in

     Simple and Efficient Construction of Static Single Assignment Form.
     Braun M., Buchwald S., Hack S., Leißa R., Mallon C., Zwinkau A. (2013)
     In: Jhala R., De Bosschere K. (eds)
     Compiler Construction. CC 2013.
     Lecture Notes in Computer Science, vol 7791.
     Springer, Berlin, Heidelberg

     https://link.springer.com/chapter/10.1007/978-3-642-37051-9_6

In contrast to common eager algorithms based on dominance and dominance
frontier information, this algorithm works backwards from load operations.

When a target variable is loaded, it queries the variable's reaching
definition.  If the reaching definition is unknown at the current location,
it searches backwards in the CFG, inserting Phi instructions at join points
in the CFG along the way until it finds the desired store instruction.

The algorithm avoids repeated lookups using memoization.
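
The backward search with memoization can be sketched as below. This is a heavily simplified model under stated assumptions, not the pass's implementation: blocks and variables are plain integers, the whole CFG is known up front, user values are positive, 0 stands for an undefined value, Phi nodes get negative ids, and trivial Phi removal is omitted. `SsaBuilder` and its members are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

constexpr int kUndef = 0;  // Hypothetical marker for "no definition".

class SsaBuilder {
 public:
  // preds[b] lists the predecessor blocks of block b.
  explicit SsaBuilder(std::vector<std::vector<int>> preds)
      : preds_(std::move(preds)) {}

  // Records that `var` holds `value` in `block` (models a store).
  void WriteVariable(int var, int block, int value) {
    current_def_[std::make_pair(var, block)] = value;
  }

  // Returns the reaching definition of `var` in `block` (models a load).
  int ReadVariable(int var, int block) {
    const auto it = current_def_.find(std::make_pair(var, block));
    if (it != current_def_.end()) return it->second;  // Local or memoized.

    const std::vector<int>& preds = preds_[static_cast<std::size_t>(block)];
    int value = kUndef;
    if (preds.size() == 1) {
      value = ReadVariable(var, preds[0]);  // No join point: keep searching.
    } else if (preds.size() > 1) {
      value = -static_cast<int>(phi_operands_.size()) - 1;
      phi_operands_[value];              // Create an operand-less Phi...
      WriteVariable(var, block, value);  // ...and record it to break cycles.
      for (int pred : preds) {
        const int operand = ReadVariable(var, pred);
        phi_operands_[value].push_back(operand);
      }
    }
    WriteVariable(var, block, value);  // Memoize for later lookups.
    return value;
  }

  const std::vector<int>& PhiOperands(int phi) const {
    return phi_operands_.at(phi);
  }

 private:
  std::vector<std::vector<int>> preds_;
  std::map<std::pair<int, int>, int> current_def_;
  std::map<int, std::vector<int>> phi_operands_;
};
```

On a diamond-shaped CFG where the two branches store different values, reading the variable in the merge block creates a single Phi whose operands are the two stored values, and a second read returns the same Phi without searching again.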

For reducible CFGs, which are a superset of the structured CFGs in SPIRV,
this algorithm is proven to produce minimal SSA.  That is, it inserts the
minimal number of Phi instructions required to ensure the SSA property, but
some Phi instructions may be dead
(https://en.wikipedia.org/wiki/Static_single_assignment_form).
2018-03-20 20:56:55 -04:00
Steven Perron
b3daa93b46 Change merge return pass to handle structured cfg.
We are seeing shaders that have multiple returns in a function.  These
functions must get inlined for legalization purposes; however, the
inliner does not know how to inline functions that have multiple
returns.

The solution we will go with is to improve the merge return pass to
handle structured control flow.

Note that the merge return pass will assume the cfg has been cleaned up
by dead branch elimination.

Fixes #857.
2018-03-19 13:49:04 -04:00