In the existing code, ADCE pass does not check DebugScope of an
instruction when it checks the users of each instruction, which results
in removing OpenCL.Debug.100 instructions that are only used by
DebugScope. This commit lets ADCE pass add DebugScope of an instruction
to the live instruction set when the instruction is added to the live
instruction set.
* No longer blindly add global non-semantic info instructions to global
types and values
* functions now have a list of non-semantic instructions that succeed
them in the global scope
* global non-semantic instructions go in global types and values if
they appear before any function, otherwise they are attached to the
immediate function predecessor in the module
* changed ADCE to use the function removal utility
* Modified EliminateFunction to have special handling for non-semantic
instructions in the global scope
* non-semantic instructions are moved to an earlier function (or full
global set) if the function they are attached to is eliminated
* Added IRContext::KillNonSemanticInfo to remove the tree of
non-semantic instructions that use an instruction
* this is used in function elimination
* There is still significant work in the optimizer to handle
non-semantic instructions fully in the optimizer
Essentially, it marks all DebugInfo instructions in functions (and their operands) as live. It treats DebugDeclare and DebugValue with Deref as loads and so marks Stores of their variables as live.
It marks each DebugGlobalVariables as live except for its variable. After closure, it rechecks if the variable is live. If not, the DebugGlobalVariable instruction's variable operand is set to DebugInfoNone, per the DebugInfo spec.
When there are multiple entries and the shader has a variable with
WorkGroup storage class, those multiple entry functions store values to
the variable. Since ADCE pass uses def-use chains to propagate the work
list, some of instructions in the work list are not actually a part of
the currently processed function. As a result, it adds instructions in
other functions and put them in |live_insts_|. However, it does not
have the control flow information for those instructions in other
functions i.e., |block2headerBranch_| and |header2nextHeaderBranch_|.
When it processes those instructions (they are added when it processes a
different function), it skips handling them because they are already in
|live_insts_| and does not check |block2headerBranch_| and
|header2nextHeaderBranch_|, which results in skipping some branches.
Even though those branches are live branches, it considers they are dead
branches.
This also fixes ADCE to not remove possibly needed OpTypeForwardPointer.
The bug, its fix and the corresponding test have a circular dependency
with the extension, so they are packaged together.
* Process OpDecorateId in ADCE
When there is an OpDecorateId instruction that is live,
the ids that is references must be kept live. This change
adds them to the worklist.
I've also updated a validator check to allow OpDecorateId
to be able to apply to decoration groups.
Fixes#1759.
* Remove dead code.
Fixes#2551
* Add support for 1.4 entry point interface lists
* only input and output variables are automatically live
* can clean up interfaces after DCE
* added tests
* allow opt tests to specify a target environment
* Check var pointer capability in ADCE.
* Check var ptr capability for common uniform.
* Check var ptr capability in access chain convert.
Since we want this pass to run even if there are variable pointer on
storage buffers, we had to remove asserts that assumed there were no
variable pointers. The functions with the asserts will now work, it
becomes the responsibility of the callers to deal with the output as
appropriate.
* Single block elimination and variable pointers.
It seems like the code in local single block elimination is able to
handle cases with variable pointers already. This is because the
function `HasOnlySupportedRefs` ensures that variables that feed a
variable pointer are not candidates.
* Single store elimination and variable pointers.
It seems like the code in local single stroe elimination is able to
handle cases with variable pointers already. This is because the
function `FindSingleStoreAndCheckUses` ensures that variables that feed
a variable pointer are not candidates.
* SSA rewriter and variable pointers.
It seems like the code in the two passes that call the SSA rewriter are
able to handle cases with variable pointers already. This is because the
function `HasOnlySupportedRefs` ensures that variables that feed
a variable pointer are not candidates.
Fixes#2458.
Fixes#2456
* When eliminating a structured construct that has an unreachable merge,
replace that unreachable terminator with an appropriate return
* New tests
* Invalidate the decoration manager at the start of ADCE.
If the decoration manager is kept live the the contex will try to keep
it up to date. ADCE deals with group decorations by changing the
operands in |OpGroupDecorate| instructions directly without informing
the decoration manager. This puts it in an invalid state, which will
cause an error when the context tries to update it. To Avoid this
problem, we will invalidate the decoration manager upfront.
At the same time, the decoration manager is now considered when checking
the consistency of the decoration manager.
* Move ProcessFunction* function from pass to the context.
There are a few functions that are used to traverse the call tree.
They currently live in the Pass class, but they have nothing to do with
a pass, and may be needed outside of a pass. They would be better in
the ir context, or in a specific call tree class if we ever have a need
for it.
* Don't inline recursive functions.
Inlining does not check if a function is recursive or not. This has
been fine as long as the shader was a Vulkan shader, which forbid
recursive functions. However, not all shaders are vulkan, so either
we limit inlining to Vulkan shaders or we teach it to look for recursive
functions.
I prefer to keep the passes as general as is reasonable. The change
does not require much new code in inlining and gives a reason to refactor
some other code.
The changes are to add a member function to the Function class that
checks if that function is recursive or not.
Then this is used in inlining to not inlining a function call if it calls
a recursive function.
* Add id to function analysis
There are a few places that build a map from ids to Function whose
result is that id. I decided to add an analysis to the context for this
to reduce that code, and simplify some of the functions.
* Add missing file.
ADCE liveness algorithm should treat OpUnreachable at least like other
branch instructions. It was being treated as always live which was
preventing useless structured constructs from being eliminated.
OpUnreachable is generated by dead branch elimination which is now
being required by merge return, so this fix should accompany that
change.
Was removing control structures which didn't have data dependency
with enclosed live loop and otherwise did not contain live code.
An example is a counting loop around a live loop.
Fixes#1967.
Consider atomics that load when analyzing live stores in ADCE.
Previously it asserted that the base of an OpImageTexelPointer should
be an image. It is actually a pointer to an image, so IsValidBasePointer
should suffice.
The HlslCounterBufferGOOGLE that was introduced changed the OpDecorateId
so that is can now reference an id other than the target. If that other
id is used only in the decoration, then the definition of the id will be
removed because decoration do not count as real uses.
However, if the target of the decoration is still live the decoration
will not be removed. This leaves a reference to an id that is not
defined.
There are two solutions to consider. The first is that is the decoration
is kept, then the definition of the id should be kept live. Implementing
this change would be involved because the way ADCE handles decorations
will have to be reimplemented.
The other solution is to remove the decoration the id is otherwise dead.
This works for this specific case. Also this is the more desirable
behaviour in this case. The id will always be the id of a variable that
belongs to a descriptor set. If that variable is not bound and we do
not remove it, the driver will complain.
I chose to implement the second solution. The first will be left to when
a case for it comes up.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1885.
* Analyze uses for all instructions.
The def-use manager needs to fill in the `inst_to_used_ids_` field for
every instruction. This means we have to analyze the uses for every
instruction, even if they do not have any uses.
This mistake was not found earlier because there was a typo in the
equality check for def-use managers. No new tests are needed.
While looking into this I found redundant work in block merge. Cleaning
that up at the same time.
* Fix other transformations
Aggressive dead code elimination did not update the OpGroupDecorate
and the OpGroupMemberDecorate instructions properly when they are
updated. That is fixed.
Dead branch elimination did not analyze the OpUnreachable instructions
that is would add. That is taken care of.
* Handle breaks from structured-ifs in DCE.
dead code elimination assumes that are conditional branches except for
breaks and continues in loops will have an OpSelectionMerge before them.
That is not true when breaking out of a selection construct.
The fix is to look for breaks in selection constructs in the same place
we look for breaks and continues for loops.
Currently the IRContext is passed into the Pass::Process method. It is
then up to the individual pass to store the context into the context_
variable. This CL changes the Run method to store the context before
calling Process which no-longer receives the context as a parameter.
This CL moves the files in opt/ to consistenly be under the opt::
namespace. This frees up the ir:: namespace so it can be used to make a
shared ir represenation.
The following passes are updated to preserve the inst-to-block and
def-use analysies:
private-to-local
aggressive dead-code elimination
dead branch elimination
local-single-block elimination
local-single-store elimination
reduce load size
compact ids (inst-to-block only)
merge block
dead-insert elimination
ccp
The one execption is that compact ids still kills the def-use manager.
This is because it changes so many ids it is faster to kill and rebuild.
Does everything in
https://github.com/KhronosGroup/SPIRV-Tools/issues/1593 except for the
changes to merge return.
ADCE does not treat OpCopyMemory as an instruction that references
memory. Because of that stores are removed that should not be.
This change teaches ADCE that OpCopyMemory and OpCopyMemorySize both
loads from and stores to memory. This will keep other stores live when
needed, and will allows ADCE to remove OpCopyMemory instructions as
well.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1556.
If there is a shader with a variable in the workgroup storage class that
is stored to, but not loadeds, then we know nothing will read those
loads. It should be safe to remove them.
This is implemented in ADCE by treating workgroup variables the same
way that private variables are treated.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1550.
At this time, DCE will only remove an instruction if it is a combinator.
However, there are certain non-combinator instructions that can be
safely removed if their results are not used. The derivative
instructions are on example.
We are also missing some instructions from the list of combinators
those are added as the same time.
The unordered_set in ADCE that holds all of the live instructions takes
a very long time to be destroyed. In some shaders, it takes over 40% of
the time.
If we look at the unique ids of the live instructions, I believe they
are dense enough make a simple bit vector a good choice for to hold that
data. When I check the density of the bit vector for larger shaders, we
are usually using less than 4 bytes per element in the vector, and
almost always less than 16.
So, in this commit, I introduce a simple bit vector class, and
use it in ADCE.
This help improve the compile time for some shaders on windows by the
40% mentioned above.
Contributes to https://github.com/KhronosGroup/SPIRV-Tools/issues/1328.
Currently OpImageTexelPointer operations are treat like a use of the
pointer, but it does
not look for the memory being referenced to make sure stores are not
removed.
This change teaches it so identify the memory being accessed, and
treats it as if that memory is loaded.
Fixes to #1445.
Optimizations should work in the presence of recent
SPV_GOOGLE_decorate_string and SPV_GOOGLE_hlsl_functionality1
SPV_GOOGLE_decorate_string:
- Adds operation OpDecorateStringGOOGLE to decorate an object with decorations
having string operands.
SPV_GOOGLE_hlsl_functionality1:
- Adds HlslSemanticGOOGLE, used to decorate an interface variable with
an HLSL semantic string. Optimizations already preserve those variables
as required because they are interface variables (with uses), independent
of whether they have HLSL decorations.
- Adds HlslCounterBufferGOOGLE, used to associate a buffer with a
counter variable.
Fixes#1391
* AddToWorklist can now be called unconditionally
* It will only add instructions that have not already been marked as
live
* Fixes a case where a merge was not added to the worklist because the
branch was already marked as live
* Added two similar tests that fail without the fix
Modified ADCE to remove dead globals.
* Entry point and execution mode instructions are marked as alive
* Reachable functions and their parameters are marked as alive
* Instruction deletion now deferred until the end of the pass
* Eliminated dead insts set, added IsDead to calculate that value
instead
* Ported applicable dead variable elimination tests
* Ported dead constant elim tests
Added dead function elimination to ADCE
* ported dead function elim tests
Added handling of decoration groups in ADCE
* Uses a custom sorter to traverse decorations in a specific order
* Simplifies necessary checks
Updated -O and -Os pass lists.