It seems like the current implementation of KillNameAndDecorates does
not handle group decorations correctly. The id being removed is not
removed from the OpGroupDecorate instructions. Even worst, any
decorations that apply to that group are removed.
The solution is to use the function in the decoration manager that will
remove the decorations and update the instructions instead of doing the
work itself.
Adds unrolling to the legalization passes.
After enabling unrolling I found a bug when there is a self-referencing
phi node. That has been fixed.
The test that checks for that the order of optimizations is correct also
needed to be updated.
The current implementation of merge return can create bad, but correct,
code. When it is not in a loop construct, it will insert a lot of
extra branch around code. The potentially large number of branches are
bad. At the same time, it can separate code store to variables from
its uses hiding the fact that the store dominates the load.
This hurts the later analysis because the compiler thinks that multiple
values can reach a load, when there is really only 1. This poorer
analysis leads to missed optimizations.
The solution is to create a dummy loop around the entire body of the
function, then we can break from that loop with a single branch. Also
only new merge nodes would be those at the end of loops meaning that
most analysies will not be hurt.
Remove dead code for cases that are no longer possible.
It seems like some drivers expect there the be an OpSelectionMerge
before conditional branches, even if they are not strictly needed.
So we add them.
* Create structed cfg analysis.
There are lots of optimization that have to traverse the CFG in a
structured order just because it wants to know which constructs a
basic block in contained in. This adds extra complexity to these
optimizations, for causes too much refactoring of older optimizations.
To help with this problem, I have written an analysis that can give this
information.
* Identify branches breaking from loops.
Dead branch elimination does a search for a conditional branch to the
end of the current selection construct. This search assumes that the
only way to leave the construct is through the merge node. But that is
not true. The code can jump to the merge node of a loop that contains
the construct.
The search needs to take this into consideration.
When using lldb and/or gdb I frequently get odd std::string failures
when using the IR printing instructions we have now. This adds the
methods Instruction::Dump(), BasicBlock::Dump() and Function::Dump() to
emit the output of the pretty print to stderr.
With this I can now reliably print IR from gdb and lldb sessions.
In merge blocks, we do not allow the merging of two blocks with merge
instructions. This is because if the two block are merged only 1 of
those instructions can exists. However, if the successor block is the
merge block of the predecessor, then we can delete the merge instruction
in the predecessor. In this case, we are able to merge the blocks.
* Create a new entry point for the optimizer
Creates a new struct to hold the options for the optimizer, and creates
an entry point that take the optimizer options as a parameter.
The old entry point that takes validator options are now deprecated.
The validator options will be one of the optimizer options.
Part of the optimizer options will also be the upper bound on the id bound.
* Add a command line option to set the max value for the id bound. The default is 0x3FFFFF.
* Modify `TakeNextIdBound` to return 0 when the limit is reached.
* Have the constant manager take ownership of constants.
Right now the owner of an object of type contant that is in the
|const_pool_| of the constant manager is unclear. The constant
manager does not delete them, there is no other reasonable owner. This
causes memory leaks.
This change fixes the memory leaks by having the constant manager
take ownership of the constant that is stores in |const_pool_|. Other
changes include interface changes to make it explicit that the constant
manager takes ownership of the object when a constant is registered
with the constant manager.
Fixes#1865.
Right now the owner of an object of type contant that is in the
|const_pool_| of the constant manager is unclear. The constant
manager does not delete them, there is no other reasonable owner. This
causes memory leaks.
This change fixes the memory leaks by having the constant manager
take ownership of the constant that is stores in |const_pool_|. Other
changes include interface changes to make it explicit that the constant
manager takes ownership of the object when a constant is registered
with the constant manager.
* Copy decorations when creating new ids.
When creating a new value based on an old value, we need to copy the
decorations to the new id. This change does this in 3 places:
1) The variable holding the return value of the function generated by
merge return should get decorations from the function.
2) The results of the OpPhi instructions should get decorations from the
variable they are replacing in the ssa writer.
3) In local access chain convert the intermediate struct (result of
OpCompositeInsert) generated for the store replacement should get its
decorations from the variable being stored to.
Fixes#1787.
If seems like at least 1 driver does not like a condition jump to the end
of a selection construct. We are generating these in the merge return
pass. This change stops merge return from generating this sequence.
Part of #1861.
When doing predicate blocks, we need to traverse every block in
structured order in order to keep track of which construct a block is
contained in. The standard way of traversing code in structured order
is to create a list with all of the nodes in order. However, when
predicating blocks, new blocks are created, and those blocks are missed.
This causes branches that go too far.
The solution is to update the order as new blocks are created. Since
we are using an std::list, we do not have to worry about invalidation of
iterators when changing the list.
* Refactor PredicateBlocks
Refactor PredicateBlocks so that we know which constructs a return
is contained in. Will be used later.
* Have PredicateBlocks jump the existing merge blocks.
In PredicateBlocks, we currently skip instructions with side effects,
but it still follows the same control flow (sort-of). This causes a
problem, when we are trying to predicate code in a loop. We skip all
of the code with side effects (IV increment), but still follow the
same control flow (jump back the start of the loop). This creates an
infinite loop because the code will keep jumping back to the start of
the loop without changing the values that effect the exit condition.
This is a large change to merge-return. When predicating a block that
is in a loop or merge construct, it will jump to the merge block of the
construct. Once out of all constructs we will generate code as we did
before.
* Handle breaks from structured-ifs in DCE.
dead code elimination assumes that are conditional branches except for
breaks and continues in loops will have an OpSelectionMerge before them.
That is not true when breaking out of a selection construct.
The fix is to look for breaks in selection constructs in the same place
we look for breaks and continues for loops.
When dead-branch-elim folds a conditional branch, it also deletes the
OpSelectionMerge instruction. If that construct contains a
conditional branch to the merge node, it will not have its own
OpSelectionMerge. When the headers merge instruction is deleted, the
the inner conditional branch will no longer be legal. It will be a
selection to a node that is not a merge node.
We fix this up by moving the OpSelectionMerge to a new location if it is
still needed.
This forks the testing harness from https://github.com/google/shaderc
to allow testing CLI tools.
New features needed for SPIRV-Tools include:
1- A new PlaceHolder subclass for spirv shaders. This place holder
calls spirv-as to convert assembly input into SPIRV bytecode. This is
required for most tools in SPIRV-Tools.
2- A minimal testing file for testing basic functionality of spirv-opt.
Add tests for all flags in spirv-opt.
1. Adds tests to check that known flags match the names that each pass
advertises.
2. Adds tests to check that -O, -Os and --legalize-hlsl schedule the
expected passes.
3. Adds more functionality to Expect classes to support regular
expression matching on stderr.
4. Add checks for integer arguments to optimization flags.
5. Fixes#1817 by modifying the parsing of integer arguments in
flags that take them.
6. Fixes -Oconfig file parsing (#1778). It reads every line of the file
into a string and then parses that string by tokenizing every group of
characters between whitespaces (using the standard cin reading
operator). This mimics shell command-line parsing, but it does not
support quoting (and I'm not planning to).
In local-access-chain-convert, we replace loads by load the entire
variable, then doing the extract. The extract will have the same value
as the load. However, if the load has a decoration on it, the
decoration is lost because we do not copy any them to the new id.
This is fixed by rewritting the load into the extract and keeping the
same result id.
This change has the effect that we do not call DCEInst on the loads
because the load is not being deleted, but replaced. This could leave
OpAccessChain instructions around that are not used. This is not a
problem for -O and -Os. They run local_single_*_elim passes and then
dead code elimination. The dce will remove the unused access chains,
and the load elimination passes work even if there are unused access
chains. I have added test to them to ensure they will not loss
opportunities.
Fixes#1787.
In `TypeManager::RebuildType`, the base cases call `Clone`, which will
copy the decorations for the type. After that it breaks out of the
switch statement and copies the decorations again.
This has not causes any real problems yet because none of those types
are allowed to have decorations. However to make the code more robust
it is best to not copy twice because it should be empty.
This way if a new base type or decoration is added that changes this
rule the code will be correct.
* Run the validator in the optimization fuzzers.
The optimizers assumes that the input to the optimizer is valid. Since
the fuzzers do not check that the input is valid before passing the
spir-v to the optimizer, we are getting a few errors.
The solution is to run the validator in the optimizer to validate the
input.
For the legalization passes, we need to add an extra option to the
validator to accept certain types of variable pointers, even if the
capability is not given. At the same time, we changed the option
"--legalize-hlsl" to relax the validator in the same way instead of
turning it off.
In the merge return pass, we will split a block, but not update the phi
instructions that reference the block. Since the branch in the original
block is now part of the block with the new id, the phi nodes must be
updated.
This commit will change this.
I have also considered other places where an id of a basic block could
be referenced, and I don't think any of them need to change.
1) Branch and merge instructions: These jump to the start of the
original block, and so we want them to jump to the block that uses the
original id. Nothing needs to change.
2) Names and decorations: I don't think it matters with block keeps the
name, and there are no decorations that apply to basic blocks.
Fixes#1736.
Many of the files have using std::<foo> statements in them, but then the
use of <foo> will be inconsistently std::<foo> or <foo> scattered
through the file. This CL removes all of the using statements and
updates the code to have the required std:: prefix.
When creating a new phi for a value in the function, merge return will
rewrite all uses of an id that are no longer dominated by its
definition. Uses that are not in a basic block, like OpName or
decorations, are not dominated, but they should not be replaced.
Fixes#1736.
* Combines OpAccessChain, OpInBoundsAccessChain, OpPtrAccessChain and
OpInBoundsPtrAccessChain
* New folding rule to fold add with 0 for integers
* Converts to a bitcast if the result type does not match the operand
type
V
This re-implements the -Oconfig=<file> flag to use a new API that takes
a list of command-line flags representing optimization passes.
This moves the processing of flags that create new optimization passes
out of spirv-opt and into the library API. Useful for other tools that
want to incorporate a facility similar to -Oconfig.
The main changes are:
1- Add a new public function Optimizer::RegisterPassesFromFlags. This
takes a vector of strings. Each string is assumed to have the form
'--pass_name[=pass_args]'. It creates and registers into the pass
manager all the passes specified in the vector. Each pass is
validated internally. Failure to create a pass instance causes the
function to return false and a diagnostic is emitted to the
registered message consumer.
2- Re-implements -Oconfig in spirv-opt to use the new API.
Fixes#1731
* Updated folding rules related to vector shuffle to account for the
undef literal value:
* FoldVectorShuffleFeedingShuffle
* FoldVectorShuffleFeedingExtract
* FoldVectorShuffleWithConstants
* These rules would commit memory violations due to treating the undef
literal value as an accessible composite component
Currentlty opt::Instruction class holds a cache of the result_id and
type_id for the instruction. That cache needs to be updated if the
underlying operand values are changes.
This CL changes the cache to being a flag if there is a type or result
id for the instruction. We then retrieve the value if needed from the
operands.
Fixes#1727
* If the pass finds any dead branches it can optimize then at the end of
the pass it reorders basic blocks to ensure they satisfy block ordering
requirements
* Added some new tests
* While investigating this issue, found and fixed a non-deterministic
ordering of dominators
* Now the edges used to construct the dominator tree are sorted
according to posorder traversal indices
With current implementation, the constant manager does not keep around
two constant with the same value but different types when the types
hash to the same value. So when you start looking for that constant you
will get a constant with the wrong type back.
I've made a few changes to the constant manager to fix this. First off,
I have changed the map from constant to ids to be an std::multimap.
This way a single constant can be mapped to mutiple ids each
representing a different type.
Then when asking for an id of a constant, we can search all of the ids
associated with that constant in order to find the one with the correct
type.
When folding an OpVectorShuffle where the first operand is defined by
an OpVectorShuffle, is unused, and is equal to the second, we end up
with an infinite loop. This is because we think we change the
instruction, but it does not actually change. So we keep trying to
folding the same instruction.
This commit fixes up that specific issue. When the operand is unused,
we replace it with Null.
When folding a vector shuffle that feeds another vector shuffle causes
the size of the first operand to change, when other indices have to be
adjusted reletive to the new size.
The function class provides a {Set|Get}Parent call in order to provide
the context to the LoopDescriptor methods. This CL removes the module
from Function and provides the needed context directly to LoopDescriptor
on creation.
This CL removes the context() method from opt::Function. In the places
where the context() was used we can retrieve, or provide, the context in
another fashion.
Currently the IRContext is passed into the Pass::Process method. It is
then up to the individual pass to store the context into the context_
variable. This CL changes the Run method to store the context before
calling Process which no-longer receives the context as a parameter.