In DecorationManager::RemoveDecorationsFrom, we do not remove the id
from a decoration group if the group has no decorations. This causes
problems because KillNamesAndDecorates is suppose to remove all
references to the id, but in this case, there is still a reference.
This is fixed by adding a special case.
Also, there is the possibility of a double free because
RemoveDecorationsFrom will delete the instructions defining |id| when
|id| is a decoration group. Later, KillInst would later write to memory
that has been deleted when trying to turn it into a Nop. To fix this,
we will only remove the decorations that use |id| and not its definition
in RemoveDecorationsFrom.
It seems like the current implementation of KillNameAndDecorates does
not handle group decorations correctly. The id being removed is not
removed from the OpGroupDecorate instructions. Even worst, any
decorations that apply to that group are removed.
The solution is to use the function in the decoration manager that will
remove the decorations and update the instructions instead of doing the
work itself.
Adds unrolling to the legalization passes.
After enabling unrolling I found a bug when there is a self-referencing
phi node. That has been fixed.
The test that checks for that the order of optimizations is correct also
needed to be updated.
The current implementation of merge return can create bad, but correct,
code. When it is not in a loop construct, it will insert a lot of
extra branch around code. The potentially large number of branches are
bad. At the same time, it can separate code store to variables from
its uses hiding the fact that the store dominates the load.
This hurts the later analysis because the compiler thinks that multiple
values can reach a load, when there is really only 1. This poorer
analysis leads to missed optimizations.
The solution is to create a dummy loop around the entire body of the
function, then we can break from that loop with a single branch. Also
only new merge nodes would be those at the end of loops meaning that
most analysies will not be hurt.
Remove dead code for cases that are no longer possible.
It seems like some drivers expect there the be an OpSelectionMerge
before conditional branches, even if they are not strictly needed.
So we add them.
* Create structed cfg analysis.
There are lots of optimization that have to traverse the CFG in a
structured order just because it wants to know which constructs a
basic block in contained in. This adds extra complexity to these
optimizations, for causes too much refactoring of older optimizations.
To help with this problem, I have written an analysis that can give this
information.
* Identify branches breaking from loops.
Dead branch elimination does a search for a conditional branch to the
end of the current selection construct. This search assumes that the
only way to leave the construct is through the merge node. But that is
not true. The code can jump to the merge node of a loop that contains
the construct.
The search needs to take this into consideration.
In merge blocks, we do not allow the merging of two blocks with merge
instructions. This is because if the two block are merged only 1 of
those instructions can exists. However, if the successor block is the
merge block of the predecessor, then we can delete the merge instruction
in the predecessor. In this case, we are able to merge the blocks.
* Create a new entry point for the optimizer
Creates a new struct to hold the options for the optimizer, and creates
an entry point that take the optimizer options as a parameter.
The old entry point that takes validator options are now deprecated.
The validator options will be one of the optimizer options.
Part of the optimizer options will also be the upper bound on the id bound.
* Add a command line option to set the max value for the id bound. The default is 0x3FFFFF.
* Modify `TakeNextIdBound` to return 0 when the limit is reached.
* Copy decorations when creating new ids.
When creating a new value based on an old value, we need to copy the
decorations to the new id. This change does this in 3 places:
1) The variable holding the return value of the function generated by
merge return should get decorations from the function.
2) The results of the OpPhi instructions should get decorations from the
variable they are replacing in the ssa writer.
3) In local access chain convert the intermediate struct (result of
OpCompositeInsert) generated for the store replacement should get its
decorations from the variable being stored to.
Fixes#1787.
If seems like at least 1 driver does not like a condition jump to the end
of a selection construct. We are generating these in the merge return
pass. This change stops merge return from generating this sequence.
Part of #1861.
When doing predicate blocks, we need to traverse every block in
structured order in order to keep track of which construct a block is
contained in. The standard way of traversing code in structured order
is to create a list with all of the nodes in order. However, when
predicating blocks, new blocks are created, and those blocks are missed.
This causes branches that go too far.
The solution is to update the order as new blocks are created. Since
we are using an std::list, we do not have to worry about invalidation of
iterators when changing the list.
* Refactor PredicateBlocks
Refactor PredicateBlocks so that we know which constructs a return
is contained in. Will be used later.
* Have PredicateBlocks jump the existing merge blocks.
In PredicateBlocks, we currently skip instructions with side effects,
but it still follows the same control flow (sort-of). This causes a
problem, when we are trying to predicate code in a loop. We skip all
of the code with side effects (IV increment), but still follow the
same control flow (jump back the start of the loop). This creates an
infinite loop because the code will keep jumping back to the start of
the loop without changing the values that effect the exit condition.
This is a large change to merge-return. When predicating a block that
is in a loop or merge construct, it will jump to the merge block of the
construct. Once out of all constructs we will generate code as we did
before.
* Handle breaks from structured-ifs in DCE.
dead code elimination assumes that are conditional branches except for
breaks and continues in loops will have an OpSelectionMerge before them.
That is not true when breaking out of a selection construct.
The fix is to look for breaks in selection constructs in the same place
we look for breaks and continues for loops.
When dead-branch-elim folds a conditional branch, it also deletes the
OpSelectionMerge instruction. If that construct contains a
conditional branch to the merge node, it will not have its own
OpSelectionMerge. When the headers merge instruction is deleted, the
the inner conditional branch will no longer be legal. It will be a
selection to a node that is not a merge node.
We fix this up by moving the OpSelectionMerge to a new location if it is
still needed.
In local-access-chain-convert, we replace loads by load the entire
variable, then doing the extract. The extract will have the same value
as the load. However, if the load has a decoration on it, the
decoration is lost because we do not copy any them to the new id.
This is fixed by rewritting the load into the extract and keeping the
same result id.
This change has the effect that we do not call DCEInst on the loads
because the load is not being deleted, but replaced. This could leave
OpAccessChain instructions around that are not used. This is not a
problem for -O and -Os. They run local_single_*_elim passes and then
dead code elimination. The dce will remove the unused access chains,
and the load elimination passes work even if there are unused access
chains. I have added test to them to ensure they will not loss
opportunities.
Fixes#1787.
The code in source/message was only used in a single set of tests to
format the output results. This CL changes the test to verify the
message instead of all the error values and removes the source/message
code.
* Run the validator in the optimization fuzzers.
The optimizers assumes that the input to the optimizer is valid. Since
the fuzzers do not check that the input is valid before passing the
spir-v to the optimizer, we are getting a few errors.
The solution is to run the validator in the optimizer to validate the
input.
For the legalization passes, we need to add an extra option to the
validator to accept certain types of variable pointers, even if the
capability is not given. At the same time, we changed the option
"--legalize-hlsl" to relax the validator in the same way instead of
turning it off.
In the merge return pass, we will split a block, but not update the phi
instructions that reference the block. Since the branch in the original
block is now part of the block with the new id, the phi nodes must be
updated.
This commit will change this.
I have also considered other places where an id of a basic block could
be referenced, and I don't think any of them need to change.
1) Branch and merge instructions: These jump to the start of the
original block, and so we want them to jump to the block that uses the
original id. Nothing needs to change.
2) Names and decorations: I don't think it matters with block keeps the
name, and there are no decorations that apply to basic blocks.
Fixes#1736.
When creating a new phi for a value in the function, merge return will
rewrite all uses of an id that are no longer dominated by its
definition. Uses that are not in a basic block, like OpName or
decorations, are not dominated, but they should not be replaced.
Fixes#1736.
* Combines OpAccessChain, OpInBoundsAccessChain, OpPtrAccessChain and
OpInBoundsPtrAccessChain
* New folding rule to fold add with 0 for integers
* Converts to a bitcast if the result type does not match the operand
type
V
This re-implements the -Oconfig=<file> flag to use a new API that takes
a list of command-line flags representing optimization passes.
This moves the processing of flags that create new optimization passes
out of spirv-opt and into the library API. Useful for other tools that
want to incorporate a facility similar to -Oconfig.
The main changes are:
1- Add a new public function Optimizer::RegisterPassesFromFlags. This
takes a vector of strings. Each string is assumed to have the form
'--pass_name[=pass_args]'. It creates and registers into the pass
manager all the passes specified in the vector. Each pass is
validated internally. Failure to create a pass instance causes the
function to return false and a diagnostic is emitted to the
registered message consumer.
2- Re-implements -Oconfig in spirv-opt to use the new API.
Fixes#1731
* Updated folding rules related to vector shuffle to account for the
undef literal value:
* FoldVectorShuffleFeedingShuffle
* FoldVectorShuffleFeedingExtract
* FoldVectorShuffleWithConstants
* These rules would commit memory violations due to treating the undef
literal value as an accessible composite component
Fixes#1727
* If the pass finds any dead branches it can optimize then at the end of
the pass it reorders basic blocks to ensure they satisfy block ordering
requirements
* Added some new tests
* While investigating this issue, found and fixed a non-deterministic
ordering of dominators
* Now the edges used to construct the dominator tree are sorted
according to posorder traversal indices
With current implementation, the constant manager does not keep around
two constant with the same value but different types when the types
hash to the same value. So when you start looking for that constant you
will get a constant with the wrong type back.
I've made a few changes to the constant manager to fix this. First off,
I have changed the map from constant to ids to be an std::multimap.
This way a single constant can be mapped to mutiple ids each
representing a different type.
Then when asking for an id of a constant, we can search all of the ids
associated with that constant in order to find the one with the correct
type.
When folding an OpVectorShuffle where the first operand is defined by
an OpVectorShuffle, is unused, and is equal to the second, we end up
with an infinite loop. This is because we think we change the
instruction, but it does not actually change. So we keep trying to
folding the same instruction.
This commit fixes up that specific issue. When the operand is unused,
we replace it with Null.
When folding a vector shuffle that feeds another vector shuffle causes
the size of the first operand to change, when other indices have to be
adjusted reletive to the new size.
The function class provides a {Set|Get}Parent call in order to provide
the context to the LoopDescriptor methods. This CL removes the module
from Function and provides the needed context directly to LoopDescriptor
on creation.
This CL removes the context() method from opt::Function. In the places
where the context() was used we can retrieve, or provide, the context in
another fashion.
Currently the IRContext is passed into the Pass::Process method. It is
then up to the individual pass to store the context into the context_
variable. This CL changes the Run method to store the context before
calling Process which no-longer receives the context as a parameter.
For the instructions which execute after the IdPass check we can provide
the Instruction instead of the spv_parsed_instruction_t. This
Instruction class provides a bit more context (like the source line)
that is not available from spv_parsed_instruction_t.
This CL moves the files in opt/ to consistenly be under the opt::
namespace. This frees up the ir:: namespace so it can be used to make a
shared ir represenation.
The folding routines are currently global functions. They also rely on
data in an std::map that holds the folding rules for each opcode. This
causes that map to not have a clear owner, and therefore never gets
deleted.
There has been a request to delete this map. To implement this, we will
create a InstructionFolder class that owns the maps. The IRContext will
own the InstructionFolder instance. Then the global functions will
become public memeber functions of the InstructionFolder.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1659.
There are a few locations where we need to handle duplicate types. We
cannot merge them because they may be needed for reflection. When this
happens we need do some extra lookups in the type manager.
The specific fixes are:
1) When generating a constant through `GetDefiningInstruction` accept
and use an id for the desired type of the constant. This will make sure
you get the type that is needed.
2) In Private-to-local, make sure we to update the def-use chains when a
new pointer type is created.
3) In the type manager, make sure that `FindPointerToType` returns a
pointer that points to the given type and not a duplicate type.
4) In scalar replacment, make sure the null constants that are created
are the correct type.
Revert "Don't merge types of resources"
This reverts commit f393b0e480, but leaves
the tests that were added. Added new test. These test are the so that,
if someone tries the same change I made, they will see the test that
they need to handle.
Don't run remove duplicates in -O and -Os
Romve duplicates was run to help reduce compile time when looking for
types in the type manager. I've run compile time test on three sets
of shaders, and the compile time does not seem to change.
It should be safe to remove it.
The Compact IDs pass can corrupt the CFG, but first the CFG has to
be setup. To do this, a test that builds the CFG, then performs the
compact IDs pass, then checks context consistency.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1648
When doing reflection users care about the names of the variable, the
name of the type, and the name of the members. Remove duplicates breaks
this because it removes the names one of the types when merging.
To fix this we have to keep the different types around for each
resource. This commit adds code to remove duplicates to look for the
types uses to describe resources, and make sure they do not get merged.
However, allow merging of a type used in a resource with something not
used in a resource. Was done when the non resource type came second.
This could have a negative effect on compile time, but it was not
expected to be much.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1372.