Constexpr guaranteed no runtime init in addition to const semantics.
Moving all opt/ to constexpr.
Moving all compile-unit statics to anonymous namespaces to uniformize
the method used (anonymous namespace vs static has the same behavior
here AFAIK).
Signed-off-by: Nathan Gauër <brioche@google.com>
If the `instruction` operand in an extended instruction instruction is
too large, it causes undefined behaviour when that value is cast to the
enum for the corresponding set. This is done with the
NonSemanticDebug100 instruction set. We need to avoid the undefined
behaviour.
Fixes#4727
* Don't eliminate dead members from StructuredBuffer as layout(offset) qualifiers cannot be applied to structure fields.
* Traverse arrays when marking structs as fully used.
Co-authored-by: Steven Perron <stevenperron@google.com>
Debug[No]Line are tracked and optimized using the same mechanism that tracks
and optimizes Op[No]Line.
Also:
- Fix missing DebugScope at top of block.
- Allow scalar replacement of access chain in DebugDeclare
Includes:
- Shift to use of spirv-header extinst.nonsemantic.shader grammar.json
- Remove extinst.nonsemantic.vulkan.debuginfo.100.grammar.json
- Enable all optimizations for Shader.DebugInfo
Also fixes scalar replacement to only insert DebugValue after all
OpVariables. This is not necessary for OpenCL.DebugInfo, but it is
for Shader.DebugInfo.
Likewise, fixes Private-to-Local to insert DebugDeclare after all
OpVariables.
Also fixes inlining to handle FunctionDefinition which can show up
after first block if early return processing happens.
Co-authored-by: baldurk <baldurk@baldurk.org>
For some cases, we have DebugDecl invisible to a value assignment, but
the value assignment information is important i.e., debugger cannot inspect
the variable without the information. For example, a parameter of an inlined
function must have its value assignment i.e., argument passing out of its
function scope. If we simply remove DebugDecl because it is invisible to the
argument passing, we cannot inspec the variable.
This PR
- Adds DebugValue for DebugDecl invisible to a value assignment. We use
the value of the variable in the basic block that contains DebugDecl, which is
found by ssa-rewrite. If the value instruction does not dominate DebugDecl,
we use the value of the variable in the immediate dominator of the basic block.
- Checks the visibility of DebugDecl for Phi value assignment based on the
all value operands of the Phi. Since Phi just references multiple values from
multiple basic blocks, scopes of value operands must be regarded as the scope
of the Phi.
When we update OpenCL.DebugInfo.100 lexical scopes e.g., DebugFunction,
we have to replace DebugScope of each instruction that uses the lexical
scope correctly.
* No longer blindly add global non-semantic info instructions to global
types and values
* functions now have a list of non-semantic instructions that succeed
them in the global scope
* global non-semantic instructions go in global types and values if
they appear before any function, otherwise they are attached to the
immediate function predecessor in the module
* changed ADCE to use the function removal utility
* Modified EliminateFunction to have special handling for non-semantic
instructions in the global scope
* non-semantic instructions are moved to an earlier function (or full
global set) if the function they are attached to is eliminated
* Added IRContext::KillNonSemanticInfo to remove the tree of
non-semantic instructions that use an instruction
* this is used in function elimination
* There is still significant work in the optimizer to handle
non-semantic instructions fully in the optimizer
Reject folding comparisons with unfoldable types.
Fixes#3343
When CCP is evaluating an instruction, it was trying to fold a
comparison with 64 bit integers. This was causing a fold failure later
since the folder still cannot deal with 64 bit integers.
In this PR, the classes that represent the adjust branch weights
transformation and fuzzer pass were implemented. This transformation
adjusts the branch weights of a OpBranchConditional instruction.
Generalizes the IsReadOnlyVariable() method, and related methods, so
that they can be used to ask whether pointer result ids are read-only.
Fixes#3324.
* Preserve debug info in eliminate-dead-functions
The elimination of dead functions makes OpFunction operand of
DebugFunction invalid. This commit replaces the operand with
DebugInfoNone.
When DebugScope is given in SPIR-V, each instruction following the
DebugScope is from the lexical scope pointed by the DebugScope in
the high level language. We add DebugScope struction to keep the
scope information in Instruction class. When ir_loader loads
DebugScope/DebugNoScope, it keeps the scope information in
|last_dbg_scope_| and lets following instructions have that scope
information.
In terms of DebugDeclare/DebugValue, if it is in a function body
but outside of a basic block, we keep it in |debug_insts_in_header_|
of Function class. If it is in a basic block, we keep it as a normal
instruction i.e., in a instruction list of BasicBlock.
* Refactor instruction folders
We want to refactor the instruction folder to allow different sets of
rules to be added to the instruction folder. We might want different
sets of rules in different circumstances.
We also need a way to add rules for extended instructions. Changes are
made to the FoldingRules class and ConstFoldingRules class to enable
that.
We added tests to check that we can fold extended instructions using the
new framework.
At the same time, I noticed that there were two tests that did not tests
what they were suppose to. They could not be easily salvaged. #2813 was
opened to track adding the new tests.
* Check var pointer capability in ADCE.
* Check var ptr capability for common uniform.
* Check var ptr capability in access chain convert.
Since we want this pass to run even if there are variable pointer on
storage buffers, we had to remove asserts that assumed there were no
variable pointers. The functions with the asserts will now work, it
becomes the responsibility of the callers to deal with the output as
appropriate.
* Single block elimination and variable pointers.
It seems like the code in local single block elimination is able to
handle cases with variable pointers already. This is because the
function `HasOnlySupportedRefs` ensures that variables that feed a
variable pointer are not candidates.
* Single store elimination and variable pointers.
It seems like the code in local single stroe elimination is able to
handle cases with variable pointers already. This is because the
function `FindSingleStoreAndCheckUses` ensures that variables that feed
a variable pointer are not candidates.
* SSA rewriter and variable pointers.
It seems like the code in the two passes that call the SSA rewriter are
able to handle cases with variable pointers already. This is because the
function `HasOnlySupportedRefs` ensures that variables that feed
a variable pointer are not candidates.
Fixes#2458.
That function currently only handled OpPtrAccessChain if it was in the
middle of the chain, but not at the start. Fixing that up.
Fixes crbug.com/905271.
Consider atomics that load when analyzing live stores in ADCE.
Previously it asserted that the base of an OpImageTexelPointer should
be an image. It is actually a pointer to an image, so IsValidBasePointer
should suffice.
When using lldb and/or gdb I frequently get odd std::string failures
when using the IR printing instructions we have now. This adds the
methods Instruction::Dump(), BasicBlock::Dump() and Function::Dump() to
emit the output of the pretty print to stderr.
With this I can now reliably print IR from gdb and lldb sessions.
Currentlty opt::Instruction class holds a cache of the result_id and
type_id for the instruction. That cache needs to be updated if the
underlying operand values are changes.
This CL changes the cache to being a flag if there is a type or result
id for the instruction. We then retrieve the value if needed from the
operands.
This CL moves the files in opt/ to consistenly be under the opt::
namespace. This frees up the ir:: namespace so it can be used to make a
shared ir represenation.
The folding routines are currently global functions. They also rely on
data in an std::map that holds the folding rules for each opcode. This
causes that map to not have a clear owner, and therefore never gets
deleted.
There has been a request to delete this map. To implement this, we will
create a InstructionFolder class that owns the maps. The IRContext will
own the InstructionFolder instance. Then the global functions will
become public memeber functions of the InstructionFolder.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1659.
We replace the std::vector in the Operand class by a new class that does
a small size optimization. This helps improve compile time on Windows.
Tested on three sets of shaders. Trying various values for the small
vector. The optimal value for the operand class was 2. However, for
the Instruction class, using an std::vector was optimal. Size of "0"
means that an std::vector was used.
Instruction size
0 4 8
Operand Size
0 489 544 684
1 593 487
2 469 570
4 473
8 505
This is a single thread run of ~120 shaders. For the multithreaded run
the results were the similar. The basline time was ~62sec. The
optimal configuration was an 2 for the OperandData and an
std::vector for the OperandList with a compile time of ~38sec. Similar
expiriments were done with other sets of shaders. The compile time still
improved, but not as much.
Contributes to https://github.com/KhronosGroup/SPIRV-Tools/issues/1609.
At this time, DCE will only remove an instruction if it is a combinator.
However, there are certain non-combinator instructions that can be
safely removed if their results are not used. The derivative
instructions are on example.
We are also missing some instructions from the list of combinators
those are added as the same time.
When doing if-conversion, we do not currently move code out of the side
nodes. The reason for this is that it can increase the number of
instructions that get executed because both side nods will have to be
executed now.
In this commit, we add code to move an instruction, and all of the
instructions it depends on, out of a side node and into the header of
the selection construct. However to keep the cost down, we only do it
when the two values in the OpPhi node compute the same value. This way
we have to move only one of the instructions and the other becomes
unused most of the time. So no real extra cost.
Makes the value number table an alalysis in the ir context.
Added more opcodes to list of code motion safe opcodes.
Fixes#1526.
Adds support for spliting loops whose register pressure exceeds a user
provided level. This pass will split a loop into two or more loops given
that the loop is a top level loop and that spliting the loop is legal.
Control flow is left intact for dead code elimination to remove.
This pass is enabled with the --loop-fission flag to spirv-opt.
Track live scalars in VDCE as if they were single element vectors.
Handle the extended instructions for GLSL in VDCE.
Handle composite construct instructions in VDCE.
Track live scalars in VDCE as if they were single element vectors.
Handle the extended instructions for GLSL in VDCE.
Handle composite construct instructions in VDCE.
Fixes#1511.
Currently OpImageTexelPointer operations are treat like a use of the
pointer, but it does
not look for the memory being referenced to make sure stores are not
removed.
This change teaches it so identify the memory being accessed, and
treats it as if that memory is loaded.
Fixes to #1445.
This change implements instruction folding for arithmetic operations
that are redundant, specifically:
x + 0 = 0 + x = x
x - 0 = x
0 - x = -x
x * 0 = 0 * x = 0
x * 1 = 1 * x = x
0 / x = 0
x / 1 = x
mix(a, b, 0) = a
mix(a, b, 1) = b
Cache ExtInst import id in feature manager
This allows us to avoid string lookups during optimization; for now we
just cache GLSL std450 import id but I can imagine caching more sets as
they become utilized by the optimizer.
Add tests for add/sub/mul/div/mix folding
The tests cover scalar float/double cases, and some vector cases.
Since most of the code for floating point folding is shared, the tests
for vector folding are not as exhaustive as scalar.
To test sub->negate folding I had to implement a custom fixture.
* Added for Instruction, BasicBlock, Function and Module
* Uses new disassembly functionality that can disassemble individual
instructions
* For debug use only (no caching is done)
* Each output converts module to binary, parses and outputs an
individual instruction
* Added a test for whole module output
* Disabling Microsoft checked iterator warnings
* Updated check_copyright.py to accept 2018