* Added a reduction pass to replace ids with ids of the same type that dominate them.
* Introduce helper method for querying whether an operand type is an input id.
These are bookend passes designed to help preserve line information
across passes which delete, move and clone instructions. The propagation
pass attaches a debug line instruction to every instruction based on
SPIR-V line propagation rules. It should be performed before optimization.
The redundant line elimination pass eliminates all line instructions
which match the previous line instruction. This pass should be performed
at the end of optimization to reduce physical SPIR-V file size.
Fixes#2027.
The type manager in spirv-opt currently asserts if a function parameter
has type void. It is not exactly clear from the spec that this is
disallowed, even if it probably will be disallowed. In either case,
asserts should be used to verify assumptions that will actually make a
difference to the code. As far as the optimizer is concerned, a void
parameter does not matter. I don't see the point of the assert. I'll
just remove it and let the validator decide whether to accept it or not.
No test was added because it is not clear that it is legal, and should
not force us to accept it in the future unless the spec make it clear
that it is legal.
Fixes crbug.com/903088.
That function currently only handled OpPtrAccessChain if it was in the
middle of the chain, but not at the start. Fixing that up.
Fixes crbug.com/905271.
* Add base and core bindless validation instrumentation classes
* Fix formatting.
* Few more formatting fixes
* Fix build failure
* More build fixes
* Need to call non-const functions in order.
Specifically, these are functions which call TakeNextId(). These need to
be called in a specific order to guarantee that tests which do exact
compares will work across all platforms. c++ pretty much does not
guarantee order of evaluation of operands, so any such functions need to
be called separately in individual statements to guarantee order.
* More ordering.
* And more ordering.
* And more formatting.
* Attempt to fix NDK build
* Another attempt to address NDK build problem.
* One more attempt at NDK build failure
* Add instrument.hpp to BUILD.gn
* Some name improvement in instrument.hpp
* Change all types in instrument.hpp to int.
* Improve documentation in instrument.hpp
* Format fixes
* Comment clean up in instrument.hpp
* imageInst -> image_inst
* Fix GetLabel() issue.
If there is only 1 return and it is in a loop, then the function cannot be inlined.
Fix condition when inlined code needs one-trip loop wrapper. The dummy loop is needed when there is a return inside a selection construct. Even if there is only 1 return.
When looking for a break from a selection construct, we do not realize
that a jump to the continue target of a loop containing the selection
is a break. This causes and infinit loop, or possibly other failures.
Fixes#2004.
When looking for a break from a selection construct, we do not need to
look inside nested constructs. However, if a loop header has an
unconditional branch, then we enter the loop. Entering the loop causes
an infinite loop because we keep going through the loop.
The solution is to look for a merge block, if one exsits, even for block
terminated by an OpBranch.
Fixes#1979.
ADCE liveness algorithm should treat OpUnreachable at least like other
branch instructions. It was being treated as always live which was
preventing useless structured constructs from being eliminated.
OpUnreachable is generated by dead branch elimination which is now
being required by merge return, so this fix should accompany that
change.
We currently run merge-return on all functions, but
dead-branch-elimination only runs on function reachable from an entry
point or exported function. Since dead-branch-elimination is needed for
merge-return, they have to match.
Fixes#1976.
Was removing control structures which didn't have data dependency
with enclosed live loop and otherwise did not contain live code.
An example is a counting loop around a live loop.
Fixes#1967.
Consider atomics that load when analyzing live stores in ADCE.
Previously it asserted that the base of an OpImageTexelPointer should
be an image. It is actually a pointer to an image, so IsValidBasePointer
should suffice.
Merge return assumes that the only unreachable blocks are those needed
to keep the structured cfg valid. Even those must be essentially empty
blocks.
If this is not the case, we get unpredictable behaviour. This commit
add a check in merge return, and emits an error if it is not the case.
Added a pass of dead branch elimination before merge return in both the
performance and size passes. It is a precondition of merge return.
Fixes#1962.
The current implementation in the folder when seeing a division by zero
is to assert. In the release build, the compiler will attempt to
compute the value, which causes its own problems.
The solution I will go with is to fold the division, and just give it
the value of 0. The same goes for remainder and mod operations.
Fixes#1961.
The HlslCounterBufferGOOGLE that was introduced changed the OpDecorateId
so that is can now reference an id other than the target. If that other
id is used only in the decoration, then the definition of the id will be
removed because decoration do not count as real uses.
However, if the target of the decoration is still live the decoration
will not be removed. This leaves a reference to an id that is not
defined.
There are two solutions to consider. The first is that is the decoration
is kept, then the definition of the id should be kept live. Implementing
this change would be involved because the way ADCE handles decorations
will have to be reimplemented.
The other solution is to remove the decoration the id is otherwise dead.
This works for this specific case. Also this is the more desirable
behaviour in this case. The id will always be the id of a variable that
belongs to a descriptor set. If that variable is not bound and we do
not remove it, the driver will complain.
I chose to implement the second solution. The first will be left to when
a case for it comes up.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1885.
There are a few spots where copy propagate arrays is trying
to go from a Type to an id, but the type is not unique. When generating
code this pass needs specific ids, otherwise we get type mismatches.
However, the ambigous types means we can sometimes get the wrong type
and generate invalid code.
That code has been rewritten to not rely on the type manager, and just
look at the instructions instead.
I have opened https://github.com/KhronosGroup/SPIRV-Tools/issues/1939 to
try to get a way to make this more robust.
* Analyze uses for all instructions.
The def-use manager needs to fill in the `inst_to_used_ids_` field for
every instruction. This means we have to analyze the uses for every
instruction, even if they do not have any uses.
This mistake was not found earlier because there was a typo in the
equality check for def-use managers. No new tests are needed.
While looking into this I found redundant work in block merge. Cleaning
that up at the same time.
* Fix other transformations
Aggressive dead code elimination did not update the OpGroupDecorate
and the OpGroupMemberDecorate instructions properly when they are
updated. That is fixed.
Dead branch elimination did not analyze the OpUnreachable instructions
that is would add. That is taken care of.
In DecorationManager::RemoveDecorationsFrom, we do not remove the id
from a decoration group if the group has no decorations. This causes
problems because KillNamesAndDecorates is suppose to remove all
references to the id, but in this case, there is still a reference.
This is fixed by adding a special case.
Also, there is the possibility of a double free because
RemoveDecorationsFrom will delete the instructions defining |id| when
|id| is a decoration group. Later, KillInst would later write to memory
that has been deleted when trying to turn it into a Nop. To fix this,
we will only remove the decorations that use |id| and not its definition
in RemoveDecorationsFrom.
Add code to keep the def-use manger and the inst-to-block mapping up-to-date. This means we do not have to rebuild them later.
To make this work, we will have to have to find places to update the
def-use manager. Updating the def-use manager is not straight forward
because we are unrolling loops, and we have circular references.
This forces one pass to register all of the definitions. A second one
to analyze the uses. Also because there will be references to the new
instructions in the old code, we want to register the definitions of the
new instructions early, so we can update the uses of the older code as
we go along.
The inst-to-block mapping is not too difficult. It can be done as instructions are created.
Fixes#1928.
A limit of 0 for the scalar replacement options it used to indicate that
there is no limit. The current implementation does not allow 0. This
should be fixed.
It seems like the current implementation of KillNameAndDecorates does
not handle group decorations correctly. The id being removed is not
removed from the OpGroupDecorate instructions. Even worst, any
decorations that apply to that group are removed.
The solution is to use the function in the decoration manager that will
remove the decorations and update the instructions instead of doing the
work itself.
Adds unrolling to the legalization passes.
After enabling unrolling I found a bug when there is a self-referencing
phi node. That has been fixed.
The test that checks for that the order of optimizations is correct also
needed to be updated.
The current implementation of merge return can create bad, but correct,
code. When it is not in a loop construct, it will insert a lot of
extra branch around code. The potentially large number of branches are
bad. At the same time, it can separate code store to variables from
its uses hiding the fact that the store dominates the load.
This hurts the later analysis because the compiler thinks that multiple
values can reach a load, when there is really only 1. This poorer
analysis leads to missed optimizations.
The solution is to create a dummy loop around the entire body of the
function, then we can break from that loop with a single branch. Also
only new merge nodes would be those at the end of loops meaning that
most analysies will not be hurt.
Remove dead code for cases that are no longer possible.
It seems like some drivers expect there the be an OpSelectionMerge
before conditional branches, even if they are not strictly needed.
So we add them.
* Create structed cfg analysis.
There are lots of optimization that have to traverse the CFG in a
structured order just because it wants to know which constructs a
basic block in contained in. This adds extra complexity to these
optimizations, for causes too much refactoring of older optimizations.
To help with this problem, I have written an analysis that can give this
information.
* Identify branches breaking from loops.
Dead branch elimination does a search for a conditional branch to the
end of the current selection construct. This search assumes that the
only way to leave the construct is through the merge node. But that is
not true. The code can jump to the merge node of a loop that contains
the construct.
The search needs to take this into consideration.
When using lldb and/or gdb I frequently get odd std::string failures
when using the IR printing instructions we have now. This adds the
methods Instruction::Dump(), BasicBlock::Dump() and Function::Dump() to
emit the output of the pretty print to stderr.
With this I can now reliably print IR from gdb and lldb sessions.
In merge blocks, we do not allow the merging of two blocks with merge
instructions. This is because if the two block are merged only 1 of
those instructions can exists. However, if the successor block is the
merge block of the predecessor, then we can delete the merge instruction
in the predecessor. In this case, we are able to merge the blocks.
* Create a new entry point for the optimizer
Creates a new struct to hold the options for the optimizer, and creates
an entry point that take the optimizer options as a parameter.
The old entry point that takes validator options are now deprecated.
The validator options will be one of the optimizer options.
Part of the optimizer options will also be the upper bound on the id bound.
* Add a command line option to set the max value for the id bound. The default is 0x3FFFFF.
* Modify `TakeNextIdBound` to return 0 when the limit is reached.
* Have the constant manager take ownership of constants.
Right now the owner of an object of type contant that is in the
|const_pool_| of the constant manager is unclear. The constant
manager does not delete them, there is no other reasonable owner. This
causes memory leaks.
This change fixes the memory leaks by having the constant manager
take ownership of the constant that is stores in |const_pool_|. Other
changes include interface changes to make it explicit that the constant
manager takes ownership of the object when a constant is registered
with the constant manager.
Fixes#1865.
Right now the owner of an object of type contant that is in the
|const_pool_| of the constant manager is unclear. The constant
manager does not delete them, there is no other reasonable owner. This
causes memory leaks.
This change fixes the memory leaks by having the constant manager
take ownership of the constant that is stores in |const_pool_|. Other
changes include interface changes to make it explicit that the constant
manager takes ownership of the object when a constant is registered
with the constant manager.
* Copy decorations when creating new ids.
When creating a new value based on an old value, we need to copy the
decorations to the new id. This change does this in 3 places:
1) The variable holding the return value of the function generated by
merge return should get decorations from the function.
2) The results of the OpPhi instructions should get decorations from the
variable they are replacing in the ssa writer.
3) In local access chain convert the intermediate struct (result of
OpCompositeInsert) generated for the store replacement should get its
decorations from the variable being stored to.
Fixes#1787.
If seems like at least 1 driver does not like a condition jump to the end
of a selection construct. We are generating these in the merge return
pass. This change stops merge return from generating this sequence.
Part of #1861.