This fixes a problem where TransformationInlineFunction could lead to
distinct instructions having identical unique ids. It adds a validity
check to detect this problem in general.
Fixes#3911.
`DebugInfoManager::AddDebugValueIfVarDeclIsVisible()` adds
OpenCL.DebugInfo.100 DebugValue from DebugDeclare only when the
DebugDeclare is visible to the give scope. It helps us correctly
handle a reference variable e.g.,
{ // scope #1.
int foo = /* init */;
{ // scope #2.
int& bar = foo;
...
in the above code, we must not propagate DebugValue of `int& bar` for
store instructions in the scope #1 because it is alive only in
the scope #2.
We have an exception: If the given DebugDeclare is used for a function
parameter, `DebugInfoManager::AddDebugValueIfVarDeclIsVisible()` has
to always add DebugValue instruction regardless
of the scope. It is because the initializer (store instruction) for
the function parameter can be out of the function parameter's scope
(the function) in particular when the function was inlined.
Without this change, the function parameter value information always
disappears whenever we run the function inlining pass and
`DebugInfoManager::AddDebugValueIfVarDeclIsVisible()`.
The validity check during fuzzing and in unit tests is strengthened to
require that every block has its enclosing function as its parent.
TransformationMergeFunctionReturns is fixed so that it ensures parents
are set appropriately.
Fixes#3907.
Currently the validator, when checking an instruction is in the correct
section, always advances the current section. This means if we have an
instruction from a previous section we'll end up reporting it as invalid
in a function definition. This error is confusing.
This CL updates the validator to check if the given opcode is from a
previous layout section before advancing the current section. If it is
from a previous layout section an error is emitted.
1. DebugValue/DebugDeclare references of load/store must not change
the behaviors of the convert-local-access-chains pass
2. We have to properly set the scope and line information of new
instructions made by the convert-local-access-chains pass
Adds a virtual method, GetFreshIds(), to Transformation. Every
transformation uses this to indicate which ids in its protobuf message
are fresh ids. This means that when replaying a sequence of
transformations the replayer can obtain a smallest id that is not in
use by the module already and that will not be used by any
transformation by necessity. Ids greater than or equal to this id
can be used as overflow ids.
Fixes#3851.
The following changes are introduced:
1. Entry block might have more than one predecessor, even if it is not
a selection/loop merge block. However Apply method asserts that
there is only one predecessor. Now, IsApplicable method ensures
that there is only one predecessor.
2. In fuzzer pass we exclude both loop headers and selection headers
as potential exit blocks.
Fixes#3827.
Before this change, the replayer would return a SPIR-V binary. This
did not allow further transforming the resulting module: it would need
to be re-parsed, and the transformation context arising from the
replayed transformations was not available. This change makes it so
that after replay an IR context and transformation context are
returned instead; the IR context can subsequently be turned into a
binary if desired.
This change paves the way for an upcoming PR to integrate spirv-reduce
with the spirv-fuzz shrinker.
TransformationContext now holds a std::unique_ptr to a FactManager,
rather than a plain pointer. This makes it easier for clients of
TransformationContext to work with heap-allocated instances of
TransformationContext, which is needed in some upcoming work.
This transformation, given a constant integer (scalar or vector) C,
constants I and S of compatible type and scalar 32-bit integer constant
N, such that C = I - S*N, adds a loop which subtracts S from I, N
times, creating a synonym for C.
The related fuzzer pass randomly chooses constants to which to add
synonyms using this transformation, and the location where they should
be added.
Fixes#3616.
In preparation for some upcoming work on the shrinker, this PR changes
the interfaces of Fuzzer, Replayer and Shrinker so that all data
relevant to each class is provided on construction, meaning that the
"Run" method can become a zero-argument method that returns a status,
transformed binary and sequence of applied transformations via a
struct.
This makes greater use of fields, so that -- especially in Fuzzer --
there is a lot less parameter passing.
This change introduces various strategies for controlling the manner
in which fuzzer passes are applied repeatedly, including infrastructure
to allow fuzzer passes to be recommended based on which passes ran
previously.
This PR modifies the FactManager methods IdIsIrrelevant and GetIrrelevantIds so
that an id is always considered irrelevant if it comes from a dead block.
Fixes#3733.
Introduces two changes:
- duplicated_exit_region refers to a correct block, regardless of the order
of the blocks in the enclosing function.
- Exclude the case where the continue target is the exit block.
This PR implements part of the add bit instruction synonym transformation.
For now, the implementation covers the OpBitwiseOr, OpBitwiseXor and
OpBitwiseAnd cases.
This PR changes the fact manager so that, when calling some of the
functions in submanagers, passes references to other submanagers if
necessary (e.g. to make consistency checks).
In particular:
- DataSynonymAndIdEquationFacts is passed to the AddFactIdIsIrrelevant
function of IrrelevantValueFacts
- IrrelevantValueFacts is passed to the AddFact functions of
DataSynonymAndIdEquationFacts
The IRContext is also passed when necessary and the calls to the
corresponding functions in FactManager were updated to be valid and
always use an updated context.
Fixes#3550.
This transformation, given the header of a selection construct with
branching instruction OpBranchConditional, flattens it.
Side-effecting operations such as OpLoad, OpStore and OpFunctionCall
are enclosed within smaller conditionals.
It is applicable if the construct does not contain inner selection
constructs or loops, or atomic or barrier instructions.
The corresponding fuzzer pass looks for selection headers and
tries to flatten them.
Needed for the issue #3544, but it does not fix it completely.
In #3636, I missed that the instruction folder may create more than a
single constant per call. Since CCP was only checking whether one
constant had been created after folding, it was wrongly thinking that
the IR had not changed.
Fixes#3738.
Adds a transformation that inserts a conditional statement with a
boolean expression of arbitrary value and duplicates a given
single-entry, single-exit region, so that it is present in each
conditional branch and will be executed regardless of which branch will
be taken.
Fixes#3614.
Motivated by integrating spirv-reduce into spirv-fuzz, so that an
added function can be made smaller during shrinking, this adds support
in spirv-reduce for asking reduction to be restricted to the
instructions of a single specified function.
This PR adds the NestingDepth function to StructuredCFGAnalysis.
This function, given a block id, returns the number of merge
constructs containing it.
This is needed by spirv-fuzz, but it makes sense to add it to
StructuredCFGAnalaysis, which contains related functionalities.
This transformation takes an OpSelect instruction and replaces it with
a conditional branch, selecting the correct value using an OpPhi
instruction.
Fixes part of the issue #3544.
The Vulkan 1.2.152 headers/spec now include Valid Usage IDs (VUID)
for every BuiltIn. This change adds labeling to help aid tracking
the coverage gap in Vulkan targeted validation. This is done with
a script in the Validation Layers that parses the source for VUID
strings, both that there is an implementation and a test for it.
There are 2 main changes:
1. A VUID string is added where applicable. It is wrapped in a function
that at runtimes checks if Vulkan is the target so that other targets
will not be effected as the output error only changes for Vulkan and
the overhead only occurs if there is an error at all.
2. For unit test, a new parameter value was added to allow make sure
that for each VUID there is a matching test. There are cases where the
same unit test checks multiple VUs via the parameter feature of gtest.
For this, I added a custom gMock Matcher to simply loop over a list
of VUID strings
Pointer (if VariablePointers is enabled) to find sets of potential
synonyms.
However, some instructions with these types cannot be used in an OpPhi:
- OpFunction cannot be used as a value
- OpUndef should not be used, because it yields an undefined value for
each use
Fixes#3761.
This transformation takes the id of an OpPhi instruction, of a dead
predecessor of the block containing it and a replacement id of
available to use and of the same type as the OpPhi, and changes
the id in the OpPhi corresponding to the given predecessor.
For example, %id = OpPhi %type %v1 %p1 %v2 %p2
becomes %id = OpPhi %type %v3 %p1 %v2 %p2
if the transformation is given %id, %p1 and %v3, %p1 is a dead block,
%v3 is type type and it is available to use at the end of %p1.
The fuzzer pass randomly decides to apply the transformation to OpPhi
instructions for which at least one of the predecessors is dead
Fixes#3726.
A transformation that replaces the use of an irrelevant id with
another id of the same type.
The related fuzzer pass, for every use of an irrelevant id,
checks whether the id can be replaced in that use by another
id of the same type and randomly decides whether to replace
it.
Fixes#3503.
This PR extends CallGraph with functions to return:
- a list of functions in lexicographical order, with respect to
function calls
- the maximum loop nesting depth that a function can be called from
(computed interprocedurally, e.g. if foo() calls bar() at depth 2
and bar() calls baz() at depth 1, the maximum depth of baz() will
be 3).
When we update OpenCL.DebugInfo.100 lexical scopes e.g., DebugFunction,
we have to replace DebugScope of each instruction that uses the lexical
scope correctly.
A transformation that adds new OpPhi instructions to blocks with >=1
predecessors, so that its value depends on previously-defined ids of
the right type, which are all synonymous. This instruction is also
recorded as synonymous to the others.
The related fuzzer pass still needs to be implemented.
Fixes#3592 .
This change adds the notion of "overflow ids", which can be used
during shrinking to facilitate applying transformations that would
otherwise have become inapplicable due to earlier transformations
being removed.
It is possible that the result of a void function call is used. In case
it is used, we need something that still defines its id after inlining.
We use an undef for that purpose.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/3704
CCP should mark IR changed if it created new constants.
This fixes#3636.
When CCP is simulating statements, it will sometimes successfully fold
an instruction, which laters switches to varying. The initial fold of
the instruction may generate a new constant K.
The problem we were running into is when K never gets propagated to the
IR. Its definition will still exist, so CCP should mark the IR modified
in this case.
In fixing this bug, I noticed that an existing test was suffering from
the same bug. The change also makes PassTest::SinglePassRunAndMatch()
return the result from the pass, so that we can check that the pass
marks the IR modified in this case.
Adds FuzzerPassAddCompositeInserts, which randomly adds new
OpCompositeInsert instructions. Each OpCompositeInsert instruction
yields a copy of an original composite with one subcomponent replaced
with an existing or newly added object. Synonym facts are added for the
unchanged components in the original and added composite, and for the
replaced subcomponent and the object, if possible.
Fixes#2859
For FuzzerPassAddParameters, adds pointer types (that have the storage
class Function or Private) to the pool of available types for new
parameters. If there are no variables of the chosen pointer type, it
invokes TransformationAddLocalVariable / TransformationAddGlobalVariable
to add one.
Part of #3403
In the existing code, ADCE pass does not check DebugScope of an
instruction when it checks the users of each instruction, which results
in removing OpenCL.Debug.100 instructions that are only used by
DebugScope. This commit lets ADCE pass add DebugScope of an instruction
to the live instruction set when the instruction is added to the live
instruction set.
`TransformationAddDeadBlock` did not check whether the existing block
(that will become a selection header) dominates its successor block (that
will become its merge block).
This change adds the check.
Fixes#3690.
Some OpenCL.DebugInfo.100 instructions such as DebugGlobalVariable
and DebugLocalVariable have the Type operand. This commit allows them to
use DebugTypeTemplate for the Type operand.
Improves the code coverage of tests for the following transformations:
1. TransformationAddRelaxedDecoration
2. TransformationReplaceCopyMemoryWithLoadStore
3. TransformationReplaceCopyObjectWithStoreLoad
4. TransformationReplaceLoadStoreWithCopyMemory
5. TransformationReplaceAddSubMulWithCarryingExtended
Support identical predecessors in TransformationPropagateInstructionUp.
A basic block may have multiple identical predecessors as follows:
%1 = OpLabel
OpSelectionMerge %2 None
OpBranchConditional %true %2 %2
%2 = OpLabel
...
This case wasn't supported before.
* No longer blindly add global non-semantic info instructions to global
types and values
* functions now have a list of non-semantic instructions that succeed
them in the global scope
* global non-semantic instructions go in global types and values if
they appear before any function, otherwise they are attached to the
immediate function predecessor in the module
* changed ADCE to use the function removal utility
* Modified EliminateFunction to have special handling for non-semantic
instructions in the global scope
* non-semantic instructions are moved to an earlier function (or full
global set) if the function they are attached to is eliminated
* Added IRContext::KillNonSemanticInfo to remove the tree of
non-semantic instructions that use an instruction
* this is used in function elimination
* There is still significant work in the optimizer to handle
non-semantic instructions fully in the optimizer
`TransformationAddTypeFloat` and `TransformationAddTypeInt` did not check whether the required capabilities were present when adding 16-bit, 64-bit, and 8-bit types.
This change adds these checks in the `IsApplicable` method of each transformation.
Fixes#3669.
`TransformationReplaceIdWithSynonym` is careful to avoid replacing id uses that index into a struct with synonyms because the indices must only be `OpConstant` instructions. However, the check only considered `OpAccessChain` instructions, even though the same restriction applies to `OpInBoundsAccessChain`, `OpPtrAccessChain`, etc.
This change extends the check to include all access chain instructions.
Fixes#3671.
Given an instruction (that may use an OpPhi result from the same block as an input operand), try to clone the instruction into each predecessor block, replacing the input operand with the corresponding OpPhi input operand in each case, if necessary.
Fixes#3458.
Replaces OpIAdd with OpIAddCarry, OpISub with OpISubBorrow, OpIMul with
OpUMulExtended or OpSMulExtended and stores the result into a fresh_id
representing a structure. Extracts the first element of the result into
the original result_id. This value is the same as the result of the
original instruction.
Fixes#3577
This PR changes FuzzerPassOutlineFunctions so that it uses some
transformation that make the TransformationOutlineFunction
transformation applicable in more cases. See the discussion in
#3095 for more details.
Fixes#3095.
The updated OpenCL.DebugInfo.100 spec says that we can use
variable for "Component Count" operand of DebugTypeArray i.e.,
DebugLocalVariable or DebugGlobalVariable. This commit updates spirv-val
based on the spec update.
For each local variable, ssa-rewrite should remove its DebugDeclare
if and only if it is replaced by any number of DebugValues for store
and phi instructions.
For example, when we have two variables `a` whose DebugDeclare
will be replaced to DebugValues by ssa-rewrite pass and `b` whose
DebugDeclare will not be replaced, we have to remove only DebugDeclare
for `a`, not `b`.
This PR introduces TransformationAddLoopPreheader, which, given
a loop header and enough fresh ids, adds a loop preheader, updating
all the references so that this new block is the only out-of-loop
predecessor of the header, which branches unconditionally to the
header.
See the discussion in #3095.
When we copy the loop body to unroll it, we have to copy its
instructions but DebugDeclare or DebugValue used for the declaration
i.e., DebugValue with Deref must not be copied and only the first block
can contain those instructions.
* Generate ext inst table for reflection
* Change build to use grammar files from SPIRV-Headers instead of
SPIRV-Tools
* Add enum for clspv reflection extended instruction set
* count it as non-semantic
* validate clspv reflection extended instruction set
* Remove local extended inst sets
* update headers deps
* Update nbuilds to use grammars from SPIRV-Headers instead of
local duplicates
Rename the `${SPIRV_TOOLS}` target to `${SPIRV_TOOLS}-static` and alias `${SPIRV_TOOLS}` to either `${SPIRV_TOOLS}-static` or `${SPIRV_TOOLS}-shared` depending on `BUILD_SHARED_LIBS`.
Re-point all internal uses of `${SPIRV_TOOLS}` to `${SPIRV_TOOLS}-static`.
`${SPIRV_TOOLS}-static` is explicitly renamed to just `${SPIRV_TOOLS}` to ensure the name does not change from current behavior.
Build the `SPIRV-Tools-*` libraries as static, as this is what they always were.
Force the external targets `gmock` and `effcee` to be built statically. These either do not support being built as shared libraries, or require special flags.
Issue: #3482
* Avoid operand type range checks
Deprecates the SPV_OPERAND_TYPE_FIRST_* and SPV_OPERAND_TYPE_LAST_*
macros.
The "variable" and "optional" operand types are only for internal use.
Export spvOperandIsConcrete instead, as that should cover intended
external uses.
Test that each operand type is classified either as one of:
- a sentinel value
- a concrete operand type
- an optional operand type (which includes variable-expansion types)
Test that each concrete and optional non-variable operand type
has a name for use internally when generating messages.
Co-authored-by: Steven Perron <stevenperron@google.com>
1. Set the debug scope and line information for the new replacement
instructions.
2. Replace DebugDeclare and DebugValue if their OpVariable or value
operands are replaced by scalars. It uses 'Indexes' operand of
DebugValue. For example,
struct S { int a; int b;}
S foo; // before scalar replacement
int foo_a; // after scalar replacement
int foo_b;
DebugDeclare %dbg_foo %foo %null_expr // before
DebugValue %dbg_foo %foo_a %Deref_expr 0 // after
DebugValue %dbg_foo %foo_b %Deref_expr 1 // means Value(foo.members[1]) == Deref(%foo_b)
* spirv-val: Pipes are no longer required for OpenCL 1.2 EP
This was removed from the OpenCL SPIR-V Environment Specification via
https://github.com/KhronosGroup/OpenCL-Docs/pull/56.
* spirv-val: Sort headers according to clang-format
* spirv-val: Groups capability is required since OpenCL 2.0
Adds a transformation that takes a pair of instruction descriptors to
OpLoad and OpStore that have the same intermediate value and replaces
the OpStore with an equivalent OpCopyMemory.
Fixes#3353.
Right now, TransformationRecordSynonymousConstants requires the type
ids of two candidate constants to be exactly the same.
This PR adds an exception for integer constants, which can be
considered equivalent even if their signedness is different.
This applies to both integers and vector constants.
The IsApplicable method of ReplaceIdWithSynonym is also updated so
that, in the case of two integer constants which don't have the same
type, they can only be swapped in particular instructions (those
that don't take the signedness into consideration).
Fixes#3536.
This PR generalises TransformationAddAccessChain so that dynamic
indices for non-struct composites (with clamping to ensure that
accesses are in-bound) are allowed.
The transformation will add instructions to clamp any index to
a non-struct composite, regardless of whether it is a constant
or not.
Fixes#3179.
Fixes an issue with the shrinker, where the message consumer set for
the shrinker was not being passed on to the replay object that the
shrinker creates. This meant that messages generated during replay
would cause an exception to be thrown.
Adds a transformation that replaces instruction OpCopyMemory with
loading the source variable to an intermediate value and storing this
value into the target variable of the original OpCopyMemory instruction.
Fixes#3352
Adds a transformation that replaces instruction OpCopyObject with
storing into a new variable and immediately loading this variable to
|result_id| of the original OpCopyObject instruction.
Fixes#3351.
Essentially, it marks all DebugInfo instructions in functions (and their operands) as live. It treats DebugDeclare and DebugValue with Deref as loads and so marks Stores of their variables as live.
It marks each DebugGlobalVariables as live except for its variable. After closure, it rechecks if the variable is live. If not, the DebugGlobalVariable instruction's variable operand is set to DebugInfoNone, per the DebugInfo spec.
Implemented AreEquivalentConstants method to check equivalency of
constants, changing IsApplicable method of
TransformationRecordSynonymousConstants to allow recording equivalence
of composite constants; added some tests to check this.
Tests with arrays and matrices still need to be added.
Fixes#3533.
Add TransformationAddRelaxedDecoration, which adds the RelaxedPrecision decoration to ids of numeric instructions (those yielding 32-bit ints or floats) in dead blocks.
Fixes#3502
Now that WebGPU ingests WGSL instead of SPIR-V,
there is no need to be so strict about the memory model.
Allow any memory model that is already allowed by Vulkan 1.0,
either directly or via an existing.
Fixes#3529
* Make BasicBlock::reachable() only consider static reachability
* Fix reachability calculation to be independent of block order
* add tests
This pass basically follows the same process as ssa-rewrite: it adds a DebugValue after each Store and removes the DebugDeclare or DebugValue Deref. It only does this if all instructions that are dependent on the Store are Loads and are replaced.
This commit lets the vector DCE pass preserve the OpenCL.DebugInfo.100
information properly. When the vector DCE pass determines the liveness
of instructions, the debug instructions must not affect the decision. In
addition, when it kills some instructions, it has to kill DebugValue
instructions that use the killed instructions. When it updates some
composite values to meaningful values (not undef), it has to remove
DebugValue because the value information becomes incorrect.
The decision to reduce the load must be not affected by debug
instructions. For example, even when a DebugValue references a
result id of a loaded composite value, this change lets the
reduce-load-size pass reduce the load if the full composite value is not
used anywhere other than the DebugValue.
When the pass replaces the local variable `OpVariable` ids to their
corresponding pointers, we have to update operands of DebugValue or
DebugDeclare instructions.
When there are multiple entries and the shader has a variable with
WorkGroup storage class, those multiple entry functions store values to
the variable. Since ADCE pass uses def-use chains to propagate the work
list, some of instructions in the work list are not actually a part of
the currently processed function. As a result, it adds instructions in
other functions and put them in |live_insts_|. However, it does not
have the control flow information for those instructions in other
functions i.e., |block2headerBranch_| and |header2nextHeaderBranch_|.
When it processes those instructions (they are added when it processes a
different function), it skips handling them because they are already in
|live_insts_| and does not check |block2headerBranch_| and
|header2nextHeaderBranch_|, which results in skipping some branches.
Even though those branches are live branches, it considers they are dead
branches.
For many spirv-opt passes such as simplify-instructions pass, we have to
correctly clear the OpenCL.DebugInfo.100 debug information for
KillInst() and ReplaceAllUses(). If we keep some debug information that
disappeared because of KillInst() and ReplaceAllUses(), adding new
DebugValue instructions based on the existing DebugDeclare information
will generate incorrect information. This CL update DebugInfoManager
and IRContext to correctly clear debug information.
* Add validation that input/output locations are assigned correctly
* Account for component assignment
* Account for 4 components per location and track the combined
coordinate
* Account for multiple output indexes
* handle specifically arrayed variables
* Arrayed variables that specify a component get locations determined
per index of the array for the sub type
* Added tests that check locations and components can be assigned
to interleave an array's locations and components
* Fix up which interfaces are allowed to be arrayed for various shader
stages based on glslang
* Support load and extract pattern in desc_sroa.
* Fix typo in comments.
* Load replacement var before use; and added test.
* fix formatting
* Address code review comments.
Add OpenCL.DebugInfo.100 `DebugValue` instructions for store
and phi instructions of local variables to provide the debugger with
the updated values of local variables correctly.
* Debug info preservation in dead branch elimination
The DeadBranchElimPass class already handles the OpenCL.DebugInfo.100
instructions correctly. This commit just adds a unit test to make sure
it preserves the information properly.
* Add unit test of ReplaceAllWith for debug instruction
* Debug info preservation in ccp pass
For constant propagation, the ccp pass already replaces the result id of
a value with a result id of the corresponding constant value. As a part
of the replacement, it correctly updates the operands of
DebugValue/DebugDeclare as well. Since we do not have to any addition
work other than the ccp pass itself, this commit just adds unit tests to
check the debug information preservation.
Need to actually instantiate them with TEST_P in the same source file.
Remove the instantiation of RoundTripTest from unit.cpp because
it's viewed as having no instantiations.
ImageGatherBiasLodAMD makes it possible to use the Bias and Lod image
operands with OpImageGather and OpImageSparseGather. This commit makes
sure the validator checks for that capability before reporting errors
and adds a few positive tests.
Handles the OpenCL100Debug extension in inlining. It preserves the information that is available while also adding the debug inlined at for all of the inlining that it does.
Reject folding comparisons with unfoldable types.
Fixes#3343
When CCP is evaluating an instruction, it was trying to fold a
comparison with 64 bit integers. This was causing a fold failure later
since the folder still cannot deal with 64 bit integers.
ssa-rewrite fails in `MemPass::GetPtr()` when the SPIR-V code contains
`OpLoad` for the result id of `OpConstantNull` because of the out of
index access for an operand to get the base address. This commit fixes
it.
Fixes#3344
In this PR, the classes that represent the adjust branch weights
transformation and fuzzer pass were implemented. This transformation
adjusts the branch weights of a OpBranchConditional instruction.
- No longer inline functions with early exits. Merge return can modify them so they can be inlined.
- Otherwise no functional change, should be just refactoring.
Extends the pass for removing unused instructions so that it can
remove global declarations (such as types and variables) that are only
used by decorations with which they are intimately connected, such as
descriptor set and binding decorations.
* Do merge return if the return is not at the end of the function.
We will remove the code in inlining to handle a return in the middle of
a function. To inline those functions, we need to run merge return to
move the return to the end of the function.
Generalizes the IsReadOnlyVariable() method, and related methods, so
that they can be used to ask whether pointer result ids are read-only.
Fixes#3324.
Many high-level languages like HLSL and GLSL generate termination
instructions such as return and branch from the actual part of the
high-level language code like return and if statements. This commit lets
IrLoader set `DebugScope` for termination instructions.
The outliner would outline regions ending with a loop header, making
the block containing the call to the outlined function serve as the
loop header. This, however, is incorrect in general, since the whole
outlined function -- rather than just the exit block for the region --
would end up getting called every time the loop would iterate.
This change restricts the outliner so that the last block in a region
cannot be a loop header.
We need an analysis for OpenCL.DebugInfo.100 extension instructions such
as a map between function id and its DebugFunction. This commit add an
analysis for it.
It has been resolved that statically out-of-bounds accesses are not
invalid in SPIR-V (they lead to undefind behaviour at runtime but
should not cause a module to be rejected during validation). This
change tolerates such accesses in donated code, clamping them in-bound
as part of making a function live-safe.
The Sample argument of OpImageTexelPointer is sometimes required to be
a zero constant. It thus cannot be replaced with a synonym in
general. This change avoids replacing this argument with a synonym.
The fact manager maintains an equivalence relation on data descriptors
that tracks when one data descriptor could be used in place of
another. An algorithm to compute the closure of such facts allows
deducing new synonym facts from existing facts. E.g., for two 2D
vectors u and v it is known that u.x is synonymous with v.x and u.y is
synonymous with v.y, it can be deduced that u and v are synonymous.
The closure computation algorithm is very expensive if we get large
equivalence relations.
This change addresses this in three ways:
- The size of equivalence relations is reduced by limiting the extent
to which the components of a composite are recursively noted as
being equivalent, so that when we have large synonymous arrays we do
not record all array elements as being pairwise equivalent.
- When computing the closure of facts, equivalence classes above a
certain size are simply skipped (which can lead to missed facts)
- The closure computation is performed less frequently - it is invoked
explicitly before fuzzer passes that will benefit from data synonym
facts. A new transformation is used to control its invocation, so
that fuzzing and replaying do not get out of sync.
The change also tidies up the order in which some getters are declared
in FuzzerContext.
Adds an extra condition on when a region can be outlined to avoid the
case where a region ends with a loop head but such that the loop's
continue target is in the region. (Outlining such a region would mean
that the loop merge is in the original function and the continue target
in the outlined function.)
The function outliner uses a struct to return ids that a region
generates and that are used outside that region. If these ids have
pointer type this would result in a struct with pointer members, which
leads to illegal loading from non-logical pointers if logical
addressing is used. This change bans that outlining possibility.
Provides support for runtime arrays in the code that traverses
composite types when checking applicability of transformations that
replace ids with synonyms.
Demotes the image storage class to Private during donation. Also
fixes an issue where instructions that depended on non-donated global
values would not be handled properly.
The SPIR-V data rules say that all uses of an OpSampledImage
instruction must be in the same block as the instruction, and highly
restrict those instructions that can consume the result id of an
OpSampledImage.
This adapts the transformations that split blocks and create synonyms
to avoid separating an OpSampledImage use from its definition, and to
avoid synonym-creation instructions such as OpCopyObject consuming an
OpSampledImage result id.
* Preserve debug info in eliminate-dead-functions
The elimination of dead functions makes OpFunction operand of
DebugFunction invalid. This commit replaces the operand with
DebugInfoNone.
* Handle more cases in dead member elim
- Rewrite composite insert and extract operations on SpecConstnatOp.
- Leaves assert for Access chain instructions, which are only allowed
for kernels.
- Other operations do not require any extra code will no longer cause an
assert.
Fixes#3284.
Fixes#3282.
The management of equation facts suffered from two problems:
(1) The processing of an equation fact required the data descriptors
used in the equation to be in canonical form. However, during
fact processing it can be deduced that certain data descriptors
are equivalent, causing their equivalence classes to be merged,
and that could cause previously canonical data descriptors to no
longer be canonical.
(2) Related to this, if id equations were known about a canonical data
descriptor dd1, and other id equations known about a different
canonical data descriptor dd2, the equation facts about these data
descriptors were not being merged in the event that dd1 and dd2
were deduced to be equivalent.
This changes solves (1) by not requiring equation facts to be in
canonical form while processing them, but instead always checking
whether (not necessary canonical) data descriptors are equivalent when
looking for corollaries of equation facts, rather than comparing them
using ==.
Problem (2) is solved by adding logic to merge sets of equations when
data descriptors are made equivalent.
In addition, the change also requires elements to be registered in an
equivalence relation before they can be made equivalent, rather than
being added (if not already present) at the point of being made
equivalent.
This change increases the extent to which arbitrary SPIR-V can be used
by the fuzzer pass that donates modules. It handles the case where
various ingredients (such as types, variables and particular
instructions) cannot be donated by omitting them, and then either
omitting their dependencies or replacing their dependencies with
alternative instructions.
The change pays particular attention to allowing code that manipulates
image types to be handled (by skipping anything image-specific).
(1) Runtime arrays are turned into fixed-size arrays, by turning
OpTypeRuntimeArray into OpTypeArray and uses of OpArrayLength into
uses of the constant used for the length of the fixed-size array.
(2) Atomic instructions are not donated, and uses of their results are
replaced with uses of constants of the result type.
The fuzzer pass that constructs composites had an issue where it would
regard isomorphic but distinct structs (similarly arrays) as being
interchangeable when constructing composites. This change fixes the
problem by relying less on the type manager.
Some transformations (e.g. TransformationAddFunction) rely on running
the validator to decide whether the transformation is applicable. A
recent change allowed spirv-fuzz to take validator options, to cater
for the case where a module should be considered valid under
particular conditions. However, validation during the checking of
transformations had no access to these validator options.
This change introduced TransformationContext, which currently consists
of a fact manager and a set of validator options, but could in the
future have other fields corresponding to other objects that it is
useful to have access to when applying transformations. Now, instead
of checking and applying transformations in the context of a
FactManager, a TransformationContext is used. This gives access to
the fact manager as before, and also access to the validator options
when they are needed.
When DebugScope is given in SPIR-V, each instruction following the
DebugScope is from the lexical scope pointed by the DebugScope in
the high level language. We add DebugScope struction to keep the
scope information in Instruction class. When ir_loader loads
DebugScope/DebugNoScope, it keeps the scope information in
|last_dbg_scope_| and lets following instructions have that scope
information.
In terms of DebugDeclare/DebugValue, if it is in a function body
but outside of a basic block, we keep it in |debug_insts_in_header_|
of Function class. If it is in a basic block, we keep it as a normal
instruction i.e., in a instruction list of BasicBlock.
Create a pass to instrument OpDebugPrintf instructions. This pass replaces all OpDebugPrintf instructions with instructions to write a record containing the string id and the all specified values into a special printf output buffer (if space allows). This pass is designed to support the printf validation in the Vulkan validation layers.
Fixes#3210
In this PR, the classes that represent the toggle access chain
instruction transformation and fuzzer pass were implemented. This
transformation toggles the instructions OpAccessChain and
OpInBoundsAccessChain between them.
Fixes#3193.
This introduces a new fuzzer pass to add instructions to the module
that define equations, and support in the fact manager for recording
equation facts and deducing synonym facts from equation facts.
Initially the only equations that are supported involve OpIAdd,
OpISub, OpSNegate and OpLogicalNot, but there is scope for adding
support for equations over various other operators.
According to SPV_AMD_shader_image_load_store_lod spec, Lod operand is
valid with OpImageRead, OpImageWrite, or OpImageSparseRead if the
extension is enabled.
This change adds a fuzzer pass that sprinkles access chain
instructions into a module at random. This allows other passes to
have a richer set of pointers available to them, in particular the
passes that add loads and stores.
Adds a fuzzer pass that inserts function calls into the module at
random. Calls from dead blocks can be arbitrary (so long as they do
not introduce recursion), while calls from other blocks can only be to
livesafe functions.
The change fixes some oversights in transformations to replace
constants with uniforms and to obfuscate constants which testing of
this fuzzer pass identified.
This change ensures that global and local variables donated from other
modules are always initialized at their declaration in the module
being transformed. This is to help limit issues related to undefined
behaviour that might arise due to accessing uninitialized memory.
The change also introduces some helper functions in fuzzer_util to
make it easier to find the pointee types of pointer types.
This change adds fuzzer passes that sprinkle loads and stores into a
module at random, with stores restricted to occur in either dead
blocks, or to use pointers for which it is known that the pointee
value does not influence the module's overall behaviour.
The change also generalises the VariableValueIsArbitrary fact to
PointeeValueIsIrrelevant, to allow stores through access chains or
object copies of variables whose values are known to be irrelevant.
The change includes some other minor refactorings.
Adds two new fuzzer passes to add variables to a module: one that adds
Private storage class global variables, another that adds Function
storage class local variables.
If the fuzzer object-copies a pointer we would like to be able to
perform loads from the copy (and stores to it, if its value is known
not to matter). Undefined and null pointers present a problem here,
so this change disallows copying them.
This adds support for replacing TimeAMD with OpReadClockKHR. The scope
for OpReadClockKHR is fixed to be a subgroup because TimeAMD operates
only on subgroup.
* Implement constant folding for many transcendentals
This change adds support for folding of sin/cos/tan/asin/acos/atan,
exp/log/exp2/log2, sqrt, atan2 and pow.
The mechanism allows to use any C function to implement folding in the
future; for now I limited the actual additions to the most commonly used
intrinsics in the shaders.
Unary folder had to be tweaked to work with extended instructions - for
extended instructions, constants.size() == 2 and constants[0] ==
nullptr. This adjustment is similar to the one binary folder already
performs.
Fixes#1390.
* Fix Android build
On old versions of Android NDK, we don't get std::exp2/std::log2
because of partial C++11 support.
We do get ::exp2, but not ::log2 so we need to emulate that.
This change adds a new kind of fact to the fact manager, which records
when a variable (or pointer parameter) refers to an arbitrary value,
so that anything can be stored to it, without affecting the observable
behaviour of the module, and nothing can be guaranteed about values
loaded from it. Donated modules are the current source of such
variables, and other transformations, such as outlining, have been
adapted to propagate these facts appropriately.
This change allows the generator to (optionally and at random) make
the functions of a module "livesafe" during donation. This involves
introducing a loop limiter variable to each function and gating the
number of total loop iterations for the function using that variable.
It also involves eliminating OpKill and OpUnreachable instructions
(changing them to OpReturn/OpReturnValue), and clamping access chain
indices so that they are always in-bounds.
We must treat a branch to the merge node of a switch that is in the
header of a construct as a nested construced. The original merge
instruction is still needed in that case.
Fixes#3139
* If the header of the construct is also a merge block, jump to the
associated header instead of the immediate dominator
* prevents spurious failures from unrelated constructs
* new tests
This new API lets clients request a minimal spv_target_env value
that supports a given Vulkan and SPIR-V version, by a generic
numbering scheme (as already defined by Vulkan and SPIR-V specs).
This breaks a formal source dependency from Glslang to SPIRV-Tools.
When a new API is rolled out, such as Vulkan 1.2, Glslang currently
needs to reference a specific SPIRV-Tools enum by name.
* Allow OpExtInst for DebugInfo between secion 9 and 10
Fixes#3086
* Handle spirv-opt errors on DebugInfo Ext
* Add IR Loader test
* Fix ir loader bug
* Handle DebugFunction/DebugTypeMember forward reference
* Add test cases (forward reference to function)
* Support old DebugInfo extension
* Validate local debug info out of function
This change refactors some code for walking access chain indexes to
make it mirror the structure of other code (to improve readability in
the first instance and potentially enable a future refactoring to
extract common code), and fixes a problem related to module donation
and function types.
As explained in #3118, spirv-opt merge-blocks pass causes a
spirv-val error when an OpBranch has an OpLine in front of it.
OpLoopMerge
OpBranch ; Will be killed by merge-blocks pass
OpLabel ; Will be killed by merge-blocks pass
OpLine ; will be placed between OpLoopMerge and OpBranch - error!
OpBranch
To fix this issue, this commit moves line info of OpBranch to
OpLoopMerge.
Fixes#3118
This adds a new kind of fact to the fact manager that knows whether a
block is dead - i.e. guaranteed to be statically unreachable - and a
new transformation for adding a selection construct to a CFG that
conditionally branches to a fresh, dead block, such that the branch
will never be dynamically taken. Transformations that may create new
blocks ('split block' and 'outline function') are updated to propagate
dead block facts to newly-created blocks where appropriate. A fuzzer
pass randomly adds dead blocks to the module.
Future transformations will be able to exploit the fact that such
blocks are known to be dead.
This change adds a fuzzer pass that allows code from other SPIR-V
modules to be donated into the module under transformation. It also
changes the command-line options of the tools so that, in fuzzing
mode, a file must be specified that contains the names of available
donor modules.
* Clone opencl.debuginfo.100 grammar from debuginfo grammar
Update version number to 200 revision 2
* Apply content from OpenCL.DebugInfo.100 extension text
* Rename grammar file
* Support OpenCL.DebugInfo.100 extended instructions
Add support for prefixing operand type names, to disambiguate
them between different instruction sets.
* Add tests for OpenCL.DebugInfo.100
* Support lookup of OpenCL.DebugInfo.100 extinst
* Add tests for enum values
* Recognize 2017-2019 as copyright date range
* Android.mk: support OpenCL.DebugInfo.100 extended instruction set
Also, stop generating core instruction tables for non-unified1 versions
of the grammar.
* Imported entity operand type is concrete
* Bazel: Suppoort OpenCL.DebugInfo.100
* BUILD.gn: Support OpenCL.DebugInfo.100
In the context of SPIR-V 1.4 or higher, global variables cannot be
used by an instruction unless they are listed in the interface of all
entry points that might invoke the instruction. This change
conservatively adds new global variables to the interfaces of all
entry points (if the SPIR-V version is 1.4 or higher).
Issue #3111 notes that a more rigorous approach to entry point
interfaces could be taken in spirv-fuzz, which would allow being less
conservative here.
This change prevents the spirv-fuzz function outliner from outlining a
region that uses the result of an OpAccessChain not defined inside the
region. Such accesses were turning into parameters to the outlined
function, and the result of an OpAccessChain cannot be passed as a
function parameter according to the SPIR-V specification.
Add support for SPV_KHR_non_semantic_info
This entails a couple of changes:
- Allowing unknown OpExtInstImport that begin with the prefix `NonSemantic.`
- Allowing OpExtInst that reference any of those sets to contain unknown
ext inst instruction numbers, and assume the format is always a series of IDs
as guaranteed by the extension.
- Allowing those OpExtInst to appear in the types/variables/constants section.
- Not stripping OpString in the --strip-debug pass, since it may be referenced
by these non-semantic OpExtInsts.
- Stripping them instead in the --strip-reflect pass.
* Add adjacency validation of non-semantic OpExtInst
- We validate and test that OpExtInst cannot appear before or between
OpPhi instructions, or before/between OpFunctionParameter
instructions.
* Change non-semantic extinst type to single value
* Add helper function spvExtInstIsNonSemantic() which will check if the extinst
set is non-semantic or not, either the unknown generic value or any future
recognised non-semantic set.
* Add test of a complex non-semantic extinst
* Use DefUseManager in StripDebugInfoPass to strip some OpStrings
* Any OpString used by a non-semantic instruction cannot be stripped, all others
can so we search for uses to see if each string can be removed.
* We only do this if the non-semantic debug info extension is enabled, otherwise
all strings can be trivially removed.
* Silence -Winconsistent-missing-override in protobufs
* Make Instrumentation format version 2 the default (Step 1)
Add new interfaces without version number argument. Remove version 1
logic and tests. Version interfaces will be removed in step 2 after
layers have transitioned to new interface.
* Add error messages to InstrumentPass().
* Don't crash when folding construct of empty struct
An OpCompositeConstruct of an empty struct will be folded to a constant
under normal circumstances. However, if the id limit has been reached
and the constant cannot be generated, then other folding rules will be
tried.
These rules do not handle the case of an empty struct. We add allow it
to be handled.
Fixes http://crbug/1030194
* Changes based on the review.
A new transformation and associated fuzzer pass in spirv-fuzz that
selects single-entry single-exit control flow graph regions and for
each selected region outlines the region into a new function and
replaces the original region with a call to this function.
The passes that add dead breaks and continues suffer from the
challenge that a new control flow graph edge can change dominance
information, leading to the potenital for definitions to no longer
dominate their uses. The attempt at guarding against this was known
to be incomplete. This change calls on the SPIR-V validator to do the
necessary checking: in deciding whether adding such an edge would be
legitimate, we clone the module, add the edge, and use the validator
to check whether the transformed clone is valid.
This strategy is heavy-weight, and should be used sparingly, but seems
like a good option when the validity of transformations is intricate,
to avoid reimplementing swathes of validation logic in the fuzzer.
Fixes#2919.
Access chain indices are always interpreted as signed integers.
So use signed clamp instead of unsigned clamp. We must also
clamp to the max signed int for the index type.
Fixes#3072
* Validate that if a construct contains a header and it's merge is
reachable, the construct also contains the merge
* updated block merging to not merge into the continue
* update inlining to mark the original block of a single block loop as
the continue
* updated some tests
* remove dead code
* rename kBlockTypeHeader to kBlockTypeSelection for clarity
Adds an option to run the validator on the SPIR-V binary after each
fuzzer pass has been applied, to help identify when the fuzzer has
made the module invalid. Also adds a helper method to allow dumping
of the sequence of transformations that have been applied to a JSON
file.
Prior to this change, TransformationReplaceIdWithSynonym was designed
to be able to replace an id with some synonymous data descriptor,
possibly necessitating extracting from a composite into a fresh id in
order to get at the synonymous data. This change simplifies things so
that TransformationReplaceIdWithSynonym just allows one id to be
replaced by another id. It is the responsibility of the associated
fuzzer pass - FuzzerPassApplyIdSynonyms - to perform the extraction
operations, using e.g. TransformationCompositeExtract.
Re-enable OpReadClockKHR validation
Fixes#2952
* Refactor some common scope validation
* Perform correct validation for scope in OpReadClockKHR
* Scope must be Subgroup or Device
* new tests
Inroduces a new transformation that adds a vector shuffle instruction
to the module, with associated facts about how the result vector of
the shuffle relates to the input vectors.
A fuzzer pass to add such transformations is not yet in place.
When a data synonym fact about two composites is added, data synonym
facts between all sub-components of the composites are also added.
Furthermore, when data synonym facts been all sub-components of two
composites are known, a data synonym fact relating the two composites
is added. Identification of this case is done in a lazy manner, when
questions about data synonym facts are asked.
The change introduces helper methods to get the size of an array type
and the number of elements of a struct type, and fixes
TransformationCompositeExtract to invalidate analyses appropriately.
An equivalence relation is computed by traversing the tree of values
rooted at the class's representative. Children were represented by
unordered sets, meaning that the order of values in an equivalence
class could be nondeterministic. This change makes things
deterministic by representing children using a vector.
The path compression optimization employed in the implementation of
the underlying union-find data structure has the potential to change
the order in which elements appear in an equivalence class by changing
the structure of the tree, so the guarantee of determinism is limited
to being a deterministic function of the manner in which the
equivalence relation is updated and inspected.
This change fixes a bug in EquivalenceRelation, changes the interface
of EquivalenceRelation to avoid exposing (potentially
nondeterministic) unordered sets, and changes the interface of
FactManager to allow querying data synonyms directly. These interface
changes have required a lot of corresponding changes to client code
and tests.
Implements the following simplifications:
(a - b) + b => a
(a * b) + (a * c) => a * (b + c)
Also adds logic to simplification to handle rules that create new operations
that might need simplification, such as the second rule above.
Only perform the second simplification if the multiplies have the add as their
only use. Otherwise this is a deoptimization of size and performance.
At present, TransformationReplaceIdWithSynonym both extracts elements
from composite objects and replaces uses of ids with synonyms. This
new TransformationCompositeExtract class will allow that
transformation to be broken into smaller transformations.
Class TransformationConstructComposite has been renamed to
TransformationCompositeConstruct, to correspond to the name of the
SPIR-V instruction (as is done with e.g. TransformationCopyObject).
Running tests revealed an issue related to checking dominance in
TransformationReplaceIdWithSynonym, which is also fixed here.
This change uses the recently-added equivalence relation class to
re-work the way synonyms between data values are managed by the fact
manager.
The tests for 'transformation_replace_id_with_synonym' have been
temporarily removed. This is because those tests are going to be
split into a number of test classes in an upcoming PR, once some other
refactorings have been applied, and it would be burdensome to
temporarily refactor all the tests to be in a working state for this
intermediate change.
Adds a templated class for representing an equivalence relation on a
value data type. This will be used by spirv-fuzz for representing
sets of distinct pieces of data in a shader that are known to have
equal values.
A new pass that gives spirv-fuzz the ability to adjust the memory
operand masks associated with memory access instructions (such as
OpLoad and OpCopy Memory).
Fixes#2940.
We have a check that ensures that the optimizer did not change the
binary when it says that it did not. However, when the binary is
converted back to a binary, we made a decision to remove OpNop
instructions. This means that any spv file that contains a NOP
originally will fail this check.
To get around this, we convert the module to a second binary that keeps
the OpNop instructions. That binary is compared against the original.
Fixes https://crbug.com/1010191