We need an analysis for OpenCL.DebugInfo.100 extension instructions such
as a map between function id and its DebugFunction. This commit add an
analysis for it.
* Preserve debug info in eliminate-dead-functions
The elimination of dead functions makes OpFunction operand of
DebugFunction invalid. This commit replaces the operand with
DebugInfoNone.
* Allow OpExtInst for DebugInfo between secion 9 and 10
Fixes#3086
* Handle spirv-opt errors on DebugInfo Ext
* Add IR Loader test
* Fix ir loader bug
* Handle DebugFunction/DebugTypeMember forward reference
* Add test cases (forward reference to function)
* Support old DebugInfo extension
* Validate local debug info out of function
* Add continue construct analysis to struct cfg analysis
Add the ability to identify which blocks are in the continue construct for a
loop, and to get functions that are called from those blocks, directly or
indirectly.
Part of https://github.com/KhronosGroup/SPIRV-Tools/issues/2912.
Add the first steps to removing the AMD extension VK_AMD_shader_ballot.
Splitting up to make the PRs smaller.
Adding utilities to add capabilities and change the version of the
module.
Replaces the instructions:
OpGroupIAddNonUniformAMD = 5000
OpGroupFAddNonUniformAMD = 5001
OpGroupFMinNonUniformAMD = 5002
OpGroupUMinNonUniformAMD = 5003
OpGroupSMinNonUniformAMD = 5004
OpGroupFMaxNonUniformAMD = 5005
OpGroupUMaxNonUniformAMD = 5006
OpGroupSMaxNonUniformAMD = 5007
and extentend instructions
WriteInvocationAMD = 3
MbcntAMD = 4
Part of #2814
Fixes#2764
* Don't replace all uses when simplifying instructions, instead only
update non-debug, non-decoration uses
* added a test
* Add a new version of RAUW that takes a predicate to decide whether to
replace the use or not
* used in simplification pass
New version has additional word in stage-specific section. Also
some changes in content for tesselation and compute shaders. Either
version can be invoked at pass creation. This is done to ease integration
and updating of validation layers. Version 1 is deprecated and eventually
will go away.
Also sneaking in fix to version 1 compute shaders.
There is a case where sroa is not handling id overflow gracefully. It
is handled and an error message is output when the ids overflow.
Fixes https://crbug.com/961030.
Currently it is impossible to invalidate the constnat and type manager.
However, the compact ids pass changes the ids for the types and
constants, which makes them invalid. This change will make them
analyses that have to been explicitly marked as preserved by passes.
This will allow compact ids to invalidate them.
Fixes#2220.
Added documentation to the ir context to indicates that TakeNextId()
returns 0 when the max id is reached. TODOs were added to each call
sight so that we know where we have to start to handle this case.
Handle id overflow in |SplitLoopHeader|.
Handle id overflow in |GetOrCreatePreHeaderBlock|.
Handle failure to create preheader in LICM.
Part of https://github.com/KhronosGroup/SPIRV-Tools/issues/1841.
* Move ProcessFunction* function from pass to the context.
There are a few functions that are used to traverse the call tree.
They currently live in the Pass class, but they have nothing to do with
a pass, and may be needed outside of a pass. They would be better in
the ir context, or in a specific call tree class if we ever have a need
for it.
* Don't inline recursive functions.
Inlining does not check if a function is recursive or not. This has
been fine as long as the shader was a Vulkan shader, which forbid
recursive functions. However, not all shaders are vulkan, so either
we limit inlining to Vulkan shaders or we teach it to look for recursive
functions.
I prefer to keep the passes as general as is reasonable. The change
does not require much new code in inlining and gives a reason to refactor
some other code.
The changes are to add a member function to the Function class that
checks if that function is recursive or not.
Then this is used in inlining to not inlining a function call if it calls
a recursive function.
* Add id to function analysis
There are a few places that build a map from ids to Function whose
result is that id. I decided to add an analysis to the context for this
to reduce that code, and simplify some of the functions.
* Add missing file.
* Add base and core bindless validation instrumentation classes
* Fix formatting.
* Few more formatting fixes
* Fix build failure
* More build fixes
* Need to call non-const functions in order.
Specifically, these are functions which call TakeNextId(). These need to
be called in a specific order to guarantee that tests which do exact
compares will work across all platforms. c++ pretty much does not
guarantee order of evaluation of operands, so any such functions need to
be called separately in individual statements to guarantee order.
* More ordering.
* And more ordering.
* And more formatting.
* Attempt to fix NDK build
* Another attempt to address NDK build problem.
* One more attempt at NDK build failure
* Add instrument.hpp to BUILD.gn
* Some name improvement in instrument.hpp
* Change all types in instrument.hpp to int.
* Improve documentation in instrument.hpp
* Format fixes
* Comment clean up in instrument.hpp
* imageInst -> image_inst
* Fix GetLabel() issue.
If there is only 1 return and it is in a loop, then the function cannot be inlined.
Fix condition when inlined code needs one-trip loop wrapper. The dummy loop is needed when there is a return inside a selection construct. Even if there is only 1 return.
Merge return assumes that the only unreachable blocks are those needed
to keep the structured cfg valid. Even those must be essentially empty
blocks.
If this is not the case, we get unpredictable behaviour. This commit
add a check in merge return, and emits an error if it is not the case.
Added a pass of dead branch elimination before merge return in both the
performance and size passes. It is a precondition of merge return.
Fixes#1962.
* Create a new entry point for the optimizer
Creates a new struct to hold the options for the optimizer, and creates
an entry point that take the optimizer options as a parameter.
The old entry point that takes validator options are now deprecated.
The validator options will be one of the optimizer options.
Part of the optimizer options will also be the upper bound on the id bound.
* Add a command line option to set the max value for the id bound. The default is 0x3FFFFF.
* Modify `TakeNextIdBound` to return 0 when the limit is reached.
This CL moves the files in opt/ to consistenly be under the opt::
namespace. This frees up the ir:: namespace so it can be used to make a
shared ir represenation.
The folding routines are currently global functions. They also rely on
data in an std::map that holds the folding rules for each opcode. This
causes that map to not have a clear owner, and therefore never gets
deleted.
There has been a request to delete this map. To implement this, we will
create a InstructionFolder class that owns the maps. The IRContext will
own the InstructionFolder instance. Then the global functions will
become public memeber functions of the InstructionFolder.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1659.
There are a few locations where we need to handle duplicate types. We
cannot merge them because they may be needed for reflection. When this
happens we need do some extra lookups in the type manager.
The specific fixes are:
1) When generating a constant through `GetDefiningInstruction` accept
and use an id for the desired type of the constant. This will make sure
you get the type that is needed.
2) In Private-to-local, make sure we to update the def-use chains when a
new pointer type is created.
3) In the type manager, make sure that `FindPointerToType` returns a
pointer that points to the given type and not a duplicate type.
4) In scalar replacment, make sure the null constants that are created
are the correct type.
When doing if-conversion, we do not currently move code out of the side
nodes. The reason for this is that it can increase the number of
instructions that get executed because both side nods will have to be
executed now.
In this commit, we add code to move an instruction, and all of the
instructions it depends on, out of a side node and into the header of
the selection construct. However to keep the cost down, we only do it
when the two values in the OpPhi node compute the same value. This way
we have to move only one of the instructions and the other becomes
unused most of the time. So no real extra cost.
Makes the value number table an alalysis in the ir context.
Added more opcodes to list of code motion safe opcodes.
Fixes#1526.
For each function, the analysis determine which SSA registers are live
at the beginning of each basic block and which one are killed at
the end of the basic block.
It also includes utilities to simulate the register pressure for loop
fusion and fission.
The implementation is based on the paper "A non-iterative data-flow
algorithm for computing liveness sets in strict ssa programs" from
Boissinot et al.
This patch adds support for the analysis of scalars in loops. It works
by traversing the defuse chain to build a DAG of scalar operations and
then simplifies the DAG by folding constants and grouping like terms.
It represents induction variables as recurrent expressions with respect
to a given loop and can simplify DAGs containing recurrent expression by
rewritting the entire DAG to be a recurrent expression with respect to
the same loop.
We are seeing shaders that have multiple returns in a functions. These
functions must get inlined for legalization purposes; however, the
inliner does not know how to inline functions that have multiple
returns.
The solution we will go with it to improve the merge return pass to
handle structured control flow.
Note that the merge return pass will assume the cfg has been cleanedup
by dead branch elimination.
Fixes#857.
Adding a map from an id to it set of OpName and OpMemberName
instructions. This will be used in KillNameAndDecorates to kill the
names for the ids that are being removed.
In my test, the compile time for 50 shaders went from 1m57s to 55s.
This was on linux using the release build.
Fixes#1290.
* Had to remove templating from InstructionBuilder as a result
* now preserved analyses are specified as a constructor argument
* updated tests and uses
* changed static_assert to a runtime assert
* this should probably get further changes in the future
The class factorize the instruction building process.
Def-use manager analysis can be updated on the fly to maintain coherency.
To be updated to take into account more analysis.
* Added for Instruction, BasicBlock, Function and Module
* Uses new disassembly functionality that can disassemble individual
instructions
* For debug use only (no caching is done)
* Each output converts module to binary, parses and outputs an
individual instruction
* Added a test for whole module output
* Disabling Microsoft checked iterator warnings
* Updated check_copyright.py to accept 2018
* Changed MemPass::InsertPhiInstructions to set basic blocks for new
phis
* Local SSA elim now maintains instr to block mapping
* Added a test and confirmed it fails without the updated phis
* IRContext::set_instr_block no longer builds the map if the analysis is
invalid
* Added instruction to block mapping verification to
IRContext::IsConsistent()
Add post-order tree iterator.
Add DominatorTreeNode extensions:
- Add begin/end methods to do pre-order and post-order tree traversal from a given DominatorTreeNode
Add DominatorTree extensions:
- Add begin/end methods to do pre-order and post-order tree traversal
- Tree traversal ignore by default the pseudo entry block
- Retrieve a DominatorTreeNode from a basic block
Add loop descriptor:
- Add a LoopDescriptor class to register all loops in a given function.
- Add a Loop class to describe a loop:
- Loop parent
- Nested loops
- Loop depth
- Loop header, merge, continue and preheader
- Basic blocks that belong to the loop
Correct a bug that forced dominator tree to be constantly rebuilt.
In order to keep track of all of the implicit capabilities as well as
the explicit ones, we will add them all to the feature manager. That is
the object that needs to be queried when checking if a capability is
enabled.
The name of the "HasCapability" function in the module was changed to
make it more obvious that it does not check for implied capabilities.
Keep an spv_context and AssemblyGrammar in IRContext
When a private variable is used in a single function, it can be
converted to a function scope variable in that function. This adds a
pass that does that. The pass can be enabled using the option
`--private-to-local`.
This transformation allows other transformations to act on these
variables.
Also moved `FindPointerToType` from the inline class to the type manager.
types. This allows the lookup of type declaration ids from arbitrarily
constructed types. Users should be cautious when dealing with non-unique
types (structs and potentially pointers) to get the exact id if
necessary.
* Changed the spec composite constant folder to handle ambiguous composites
* Added functionality to create necessary instructions for a type
* Added ability to remove ids from the type manager
Adds a scalar replacement pass. The pass considers all function scope
variables of composite type. If there are accesses to individual
elements (and it is legal) the pass replaces the variable with a
variable for each composite element and updates all the uses.
Added the pass to -O
Added NumUses and NumUsers to DefUseManager
Added some helper methods for the inst to block mapping in context
Added some helper methods for specific constant types
No longer generate duplicate pointer types.
* Now searches for an existing pointer of the appropriate type instead
of failing validation
* Fixed spec constant extracts
* Addressed changes for review
* Changed RunSinglePassAndMatch to be able to run validation
* current users do not enable it
Added handling of acceptable decorations.
* Decorations are also transfered where appropriate
Refactored extension checking into FeatureManager
* Context now owns a feature manager
* consciously NOT an analysis
* added some test
* fixed some minor issues related to decorates
* added some decorate related tests for scalar replacement
This patch adds a new constant manager class to interface with
analysis::Constant. The new constant manager lives in ir::IRContext
together with the type manager (analysis::TypeManager).
The new analysis::ConstantManager is used by the spec constant folder
and the constant propagator (in progress).
Another cleanup introduced by this patch removes the ID management from
the fold spec constant pass, and ir::IRContext and moves it to
ir::Module. SSA IDs were maintained by IRContext and Module. That's
pointless and leads to mismatch IDs. Fixed by moving all the bookkeeping
to ir::Module.
Support for dominator and post dominator analysis on ir::Functions. This patch contains a DominatorTree class for building the tree and DominatorAnalysis and DominatorAnalysisPass classes for interfacing and caching the built trees.
The current method of removing an instruction is to call ToNop. The
problem with this is that it leaves around an instruction that later
passes will look at. We should just delete the instruction.
In MemPass there is a utility routine called DCEInst. It can delete
essentially any instruction, which can invalidate pointers now that they
are actually deleted. The interface was changed to add a call back that
can be used to update any local data structures that contain
ir::Intruction*.