- No longer inline functions with early exits. Merge return can modify them so they can be inlined.
- Otherwise no functional change, should be just refactoring.
* Do merge return if the return is not at the end of the function.
We will remove the code in inlining to handle a return in the middle of
a function. To inline those functions, we need to run merge return to
move the return to the end of the function.
Generalizes the IsReadOnlyVariable() method, and related methods, so
that they can be used to ask whether pointer result ids are read-only.
Fixes#3324.
When moving blocks around, we ended up with a nullptr for a basic block,
and it was left in the list for a little bit. However, in that time, it
would end up being dereferenced while traversing the function.
To fix this, we delete it right away. This was found in an asan build
that runs our current tests. No new tests are needed, but I did add
extra check asan checks for our asan bot.
Many high-level languages like HLSL and GLSL generate termination
instructions such as return and branch from the actual part of the
high-level language code like return and if statements. This commit lets
IrLoader set `DebugScope` for termination instructions.
We need an analysis for OpenCL.DebugInfo.100 extension instructions such
as a map between function id and its DebugFunction. This commit add an
analysis for it.
* Preserve debug info in eliminate-dead-functions
The elimination of dead functions makes OpFunction operand of
DebugFunction invalid. This commit replaces the operand with
DebugInfoNone.
* Handle more cases in dead member elim
- Rewrite composite insert and extract operations on SpecConstnatOp.
- Leaves assert for Access chain instructions, which are only allowed
for kernels.
- Other operations do not require any extra code will no longer cause an
assert.
Fixes#3284.
Fixes#3282.
When DebugScope is given in SPIR-V, each instruction following the
DebugScope is from the lexical scope pointed by the DebugScope in
the high level language. We add DebugScope struction to keep the
scope information in Instruction class. When ir_loader loads
DebugScope/DebugNoScope, it keeps the scope information in
|last_dbg_scope_| and lets following instructions have that scope
information.
In terms of DebugDeclare/DebugValue, if it is in a function body
but outside of a basic block, we keep it in |debug_insts_in_header_|
of Function class. If it is in a basic block, we keep it as a normal
instruction i.e., in a instruction list of BasicBlock.
Create a pass to instrument OpDebugPrintf instructions. This pass replaces all OpDebugPrintf instructions with instructions to write a record containing the string id and the all specified values into a special printf output buffer (if space allows). This pass is designed to support the printf validation in the Vulkan validation layers.
Fixes#3210
This adds support for replacing TimeAMD with OpReadClockKHR. The scope
for OpReadClockKHR is fixed to be a subgroup because TimeAMD operates
only on subgroup.
* Implement constant folding for many transcendentals
This change adds support for folding of sin/cos/tan/asin/acos/atan,
exp/log/exp2/log2, sqrt, atan2 and pow.
The mechanism allows to use any C function to implement folding in the
future; for now I limited the actual additions to the most commonly used
intrinsics in the shaders.
Unary folder had to be tweaked to work with extended instructions - for
extended instructions, constants.size() == 2 and constants[0] ==
nullptr. This adjustment is similar to the one binary folder already
performs.
Fixes#1390.
* Fix Android build
On old versions of Android NDK, we don't get std::exp2/std::log2
because of partial C++11 support.
We do get ::exp2, but not ::log2 so we need to emulate that.
We must treat a branch to the merge node of a switch that is in the
header of a construct as a nested construced. The original merge
instruction is still needed in that case.
* Allow OpExtInst for DebugInfo between secion 9 and 10
Fixes#3086
* Handle spirv-opt errors on DebugInfo Ext
* Add IR Loader test
* Fix ir loader bug
* Handle DebugFunction/DebugTypeMember forward reference
* Add test cases (forward reference to function)
* Support old DebugInfo extension
* Validate local debug info out of function
As explained in #3118, spirv-opt merge-blocks pass causes a
spirv-val error when an OpBranch has an OpLine in front of it.
OpLoopMerge
OpBranch ; Will be killed by merge-blocks pass
OpLabel ; Will be killed by merge-blocks pass
OpLine ; will be placed between OpLoopMerge and OpBranch - error!
OpBranch
To fix this issue, this commit moves line info of OpBranch to
OpLoopMerge.
Fixes#3118
Add support for SPV_KHR_non_semantic_info
This entails a couple of changes:
- Allowing unknown OpExtInstImport that begin with the prefix `NonSemantic.`
- Allowing OpExtInst that reference any of those sets to contain unknown
ext inst instruction numbers, and assume the format is always a series of IDs
as guaranteed by the extension.
- Allowing those OpExtInst to appear in the types/variables/constants section.
- Not stripping OpString in the --strip-debug pass, since it may be referenced
by these non-semantic OpExtInsts.
- Stripping them instead in the --strip-reflect pass.
* Add adjacency validation of non-semantic OpExtInst
- We validate and test that OpExtInst cannot appear before or between
OpPhi instructions, or before/between OpFunctionParameter
instructions.
* Change non-semantic extinst type to single value
* Add helper function spvExtInstIsNonSemantic() which will check if the extinst
set is non-semantic or not, either the unknown generic value or any future
recognised non-semantic set.
* Add test of a complex non-semantic extinst
* Use DefUseManager in StripDebugInfoPass to strip some OpStrings
* Any OpString used by a non-semantic instruction cannot be stripped, all others
can so we search for uses to see if each string can be removed.
* We only do this if the non-semantic debug info extension is enabled, otherwise
all strings can be trivially removed.
* Silence -Winconsistent-missing-override in protobufs
* Make Instrumentation format version 2 the default (Step 1)
Add new interfaces without version number argument. Remove version 1
logic and tests. Version interfaces will be removed in step 2 after
layers have transitioned to new interface.
* Add error messages to InstrumentPass().
* Don't crash when folding construct of empty struct
An OpCompositeConstruct of an empty struct will be folded to a constant
under normal circumstances. However, if the id limit has been reached
and the constant cannot be generated, then other folding rules will be
tried.
These rules do not handle the case of an empty struct. We add allow it
to be handled.
Fixes http://crbug/1030194
* Changes based on the review.
Access chain indices are always interpreted as signed integers.
So use signed clamp instead of unsigned clamp. We must also
clamp to the max signed int for the index type.
Fixes#3072
* Validate that if a construct contains a header and it's merge is
reachable, the construct also contains the merge
* updated block merging to not merge into the continue
* update inlining to mark the original block of a single block loop as
the continue
* updated some tests
* remove dead code
* rename kBlockTypeHeader to kBlockTypeSelection for clarity
Wrap-opkill will create a new function, invalidating the id-to-func map.
The preserved analyses for the pass have been updated to reflect that.
Also adding consistency check for the id-to-func map. With this new
check, old tests identify this problem. No new tests are needed.
Fixes#3038
Implements the following simplifications:
(a - b) + b => a
(a * b) + (a * c) => a * (b + c)
Also adds logic to simplification to handle rules that create new operations
that might need simplification, such as the second rule above.
Only perform the second simplification if the multiplies have the add as their
only use. Otherwise this is a deoptimization of size and performance.
We have a check that ensures that the optimizer did not change the
binary when it says that it did not. However, when the binary is
converted back to a binary, we made a decision to remove OpNop
instructions. This means that any spv file that contains a NOP
originally will fail this check.
To get around this, we convert the module to a second binary that keeps
the OpNop instructions. That binary is compared against the original.
Fixes https://crbug.com/1010191
Added exports for libraries. External libraries that themselves use
libraries require all dependencies have exports, so not having exports can
cause major problems when used within other projects.
Install paths for exports are now placed in the proper directories expected
by Windows and *nix systems. Config files are generated as well, which
should work with CMake's find_package() function once installed.
We want to handle OpKill better. The wrap opkill causes lots of extra
code to be generated, even when they are not needed to avoid the main
problem: OpKill cannot be found directly in a continue construct.
This change will be more selective on which functions the OpKill will be
wrapped and inlining will avoid inlining.
Fixes#2912
* Add continue construct analysis to struct cfg analysis
Add the ability to identify which blocks are in the continue construct for a
loop, and to get functions that are called from those blocks, directly or
indirectly.
Part of https://github.com/KhronosGroup/SPIRV-Tools/issues/2912.
There is nothing in the spir-v spec that says the last
instructions in a module cannot be OpLine or OpNoLine.
However, the code that parses the module will simply drop
these instructions.
We add code that will preserve these instructions.
Strip-debug-info is updated to remove these instructions.
Fixes https://crbug.com/1000689.
* Handle extract with no indexes
It is possible that OpCompositeExtract instructions will not have any
indexes. This is not handled well by scalar replacement and instruction
folding.
Fixes https://crbug.com/1006435
* Fix typo.
* Use OpReturn* in wrap-opkill
The warp-opkill pass is generating incorrect code. It is placing an
OpUnreachable at the end of a basic block, when the block can be
reached. We can't reach the end of the block, but we can reach the end.
Instead we will add a return instruction.
Fixes#2875.
The warp-opkill pass is generating incorrect code. It is placing an
OpUnreachable at the end of a basic block, when the block can be
reached. We can't reach the end of the block, but we can reach the end.
Instead we will add a return instruction.
Fixes#2875.
Many of the places in copy propagate arrays assumes that integer constant will be defined by an OpConstant instruction. That is not always true. We fix these spots by allowing for an OpConstantNull.
If an OpKill instruction is inlined into a continue construct, then the
spir-v is no longer valid. To avoid this issue, we do inline into an
OpKill at all. This method was chosen because it is difficult to keep
track of whether or not you are in a continue construct while changing
the function that is being inlined into. This will work well with wrap
OpKill because every will still be inlined except for the OpKill
instruction itself.
Fixes#2554Fixes#2433
This reverts commit aa9e8f5380.
* Handle id overflow in the ssa rewriter.
Remove LocalSSAElim pass at the same time. It does the same thing as the SSARewrite pass. Then even share almost all of the same code.
Fixes crbug.com/997246
The first pass applies the RelaxedPrecision decoration to all executable
instructions with float32 based type results. The second pass converts
all executable instructions with RelaxedPrecision result to the equivalent
float16 type, inserting converts where necessary.
Add the first steps to removing the AMD extension VK_AMD_shader_ballot.
Splitting up to make the PRs smaller.
Adding utilities to add capabilities and change the version of the
module.
Replaces the instructions:
OpGroupIAddNonUniformAMD = 5000
OpGroupFAddNonUniformAMD = 5001
OpGroupFMinNonUniformAMD = 5002
OpGroupUMinNonUniformAMD = 5003
OpGroupSMinNonUniformAMD = 5004
OpGroupFMaxNonUniformAMD = 5005
OpGroupUMaxNonUniformAMD = 5006
OpGroupSMaxNonUniformAMD = 5007
and extentend instructions
WriteInvocationAMD = 3
MbcntAMD = 4
Part of #2814
If they are not aliased, the function will always print the message:
"Binary unexpectedly changed despite optimizer saying there was no change"
Which is (usually) totally bogus.
Fixes#2798
* Refactor instruction folders
We want to refactor the instruction folder to allow different sets of
rules to be added to the instruction folder. We might want different
sets of rules in different circumstances.
We also need a way to add rules for extended instructions. Changes are
made to the FoldingRules class and ConstFoldingRules class to enable
that.
We added tests to check that we can fold extended instructions using the
new framework.
At the same time, I noticed that there were two tests that did not tests
what they were suppose to. They could not be easily salvaged. #2813 was
opened to track adding the new tests.
Now we need to handle id overflow when we overflow while replacing uses of the variable. While looking at this code, I noticed an error in the way we handle access chains that cannot be replaced because of overflow. Name it will make some change, and then give up by returning SuccessWithoutChange. But it was changed.
This is fixed up by returning Failure if we notice the error at the time of rewriting the users. This is for both id overflow or out-of-bounds accesses.
Code is added to "CheckUses" to remove variables that have out-of-bounds accesses from the candidate list, so we don't even try to rewrite its uses.
Fixes https://crbug.com/995032
If we run out of ids when creating a new variable, sroa does not recognize
the error, and continues doing work. This leads to segmentation faults.
Fixes https://crbug/969655
`#include <source/util/string_utils.h>` works only when we specify
`include_directories(${CMAKE_CURRENT_SOURCE_DIR}/)` in
cmake. It is hard to set the source directory as a include path
in some build systems e.g., bazel. Using the relative path easily
solves this issue. This commit uses
`#include "source/util/string_utils.h"` instead of
`#include <source/util/string_utils.h>`.
We are no able to inline OpKill instructions into a continue construct.
See #2433. However, we have to be able to inline to correctly do
legalization. This commit creates a pass that will wrap OpKill
instructions into a function of its own. That way we are able to inline
the rest of the code.
The follow up to this will be to not inline any function that contains
an OpKill.
Fixes#2726