This transformation, given a constant integer (scalar or vector) C,
constants I and S of compatible type and scalar 32-bit integer constant
N, such that C = I - S*N, adds a loop which subtracts S from I, N
times, creating a synonym for C.
The related fuzzer pass randomly chooses constants to which to add
synonyms using this transformation, and the location where they should
be added.
Fixes#3616.
In preparation for some upcoming work on the shrinker, this PR changes
the interfaces of Fuzzer, Replayer and Shrinker so that all data
relevant to each class is provided on construction, meaning that the
"Run" method can become a zero-argument method that returns a status,
transformed binary and sequence of applied transformations via a
struct.
This makes greater use of fields, so that -- especially in Fuzzer --
there is a lot less parameter passing.
This change introduces various strategies for controlling the manner
in which fuzzer passes are applied repeatedly, including infrastructure
to allow fuzzer passes to be recommended based on which passes ran
previously.
This PR modifies the FactManager methods IdIsIrrelevant and GetIrrelevantIds so
that an id is always considered irrelevant if it comes from a dead block.
Fixes#3733.
Introduces two changes:
- duplicated_exit_region refers to a correct block, regardless of the order
of the blocks in the enclosing function.
- Exclude the case where the continue target is the exit block.
This PR implements part of the add bit instruction synonym transformation.
For now, the implementation covers the OpBitwiseOr, OpBitwiseXor and
OpBitwiseAnd cases.
This PR changes the fact manager so that, when calling some of the
functions in submanagers, passes references to other submanagers if
necessary (e.g. to make consistency checks).
In particular:
- DataSynonymAndIdEquationFacts is passed to the AddFactIdIsIrrelevant
function of IrrelevantValueFacts
- IrrelevantValueFacts is passed to the AddFact functions of
DataSynonymAndIdEquationFacts
The IRContext is also passed when necessary and the calls to the
corresponding functions in FactManager were updated to be valid and
always use an updated context.
Fixes#3550.
This transformation, given the header of a selection construct with
branching instruction OpBranchConditional, flattens it.
Side-effecting operations such as OpLoad, OpStore and OpFunctionCall
are enclosed within smaller conditionals.
It is applicable if the construct does not contain inner selection
constructs or loops, or atomic or barrier instructions.
The corresponding fuzzer pass looks for selection headers and
tries to flatten them.
Needed for the issue #3544, but it does not fix it completely.
In #3636, I missed that the instruction folder may create more than a
single constant per call. Since CCP was only checking whether one
constant had been created after folding, it was wrongly thinking that
the IR had not changed.
Fixes#3738.
Adds a transformation that inserts a conditional statement with a
boolean expression of arbitrary value and duplicates a given
single-entry, single-exit region, so that it is present in each
conditional branch and will be executed regardless of which branch will
be taken.
Fixes#3614.
Motivated by integrating spirv-reduce into spirv-fuzz, so that an
added function can be made smaller during shrinking, this adds support
in spirv-reduce for asking reduction to be restricted to the
instructions of a single specified function.
This PR adds the NestingDepth function to StructuredCFGAnalysis.
This function, given a block id, returns the number of merge
constructs containing it.
This is needed by spirv-fuzz, but it makes sense to add it to
StructuredCFGAnalaysis, which contains related functionalities.
This transformation takes an OpSelect instruction and replaces it with
a conditional branch, selecting the correct value using an OpPhi
instruction.
Fixes part of the issue #3544.
The Vulkan 1.2.152 headers/spec now include Valid Usage IDs (VUID)
for every BuiltIn. This change adds labeling to help aid tracking
the coverage gap in Vulkan targeted validation. This is done with
a script in the Validation Layers that parses the source for VUID
strings, both that there is an implementation and a test for it.
There are 2 main changes:
1. A VUID string is added where applicable. It is wrapped in a function
that at runtimes checks if Vulkan is the target so that other targets
will not be effected as the output error only changes for Vulkan and
the overhead only occurs if there is an error at all.
2. For unit test, a new parameter value was added to allow make sure
that for each VUID there is a matching test. There are cases where the
same unit test checks multiple VUs via the parameter feature of gtest.
For this, I added a custom gMock Matcher to simply loop over a list
of VUID strings
Pointer (if VariablePointers is enabled) to find sets of potential
synonyms.
However, some instructions with these types cannot be used in an OpPhi:
- OpFunction cannot be used as a value
- OpUndef should not be used, because it yields an undefined value for
each use
Fixes#3761.
This transformation takes the id of an OpPhi instruction, of a dead
predecessor of the block containing it and a replacement id of
available to use and of the same type as the OpPhi, and changes
the id in the OpPhi corresponding to the given predecessor.
For example, %id = OpPhi %type %v1 %p1 %v2 %p2
becomes %id = OpPhi %type %v3 %p1 %v2 %p2
if the transformation is given %id, %p1 and %v3, %p1 is a dead block,
%v3 is type type and it is available to use at the end of %p1.
The fuzzer pass randomly decides to apply the transformation to OpPhi
instructions for which at least one of the predecessors is dead
Fixes#3726.
A transformation that replaces the use of an irrelevant id with
another id of the same type.
The related fuzzer pass, for every use of an irrelevant id,
checks whether the id can be replaced in that use by another
id of the same type and randomly decides whether to replace
it.
Fixes#3503.
This PR extends CallGraph with functions to return:
- a list of functions in lexicographical order, with respect to
function calls
- the maximum loop nesting depth that a function can be called from
(computed interprocedurally, e.g. if foo() calls bar() at depth 2
and bar() calls baz() at depth 1, the maximum depth of baz() will
be 3).
When we update OpenCL.DebugInfo.100 lexical scopes e.g., DebugFunction,
we have to replace DebugScope of each instruction that uses the lexical
scope correctly.
A transformation that adds new OpPhi instructions to blocks with >=1
predecessors, so that its value depends on previously-defined ids of
the right type, which are all synonymous. This instruction is also
recorded as synonymous to the others.
The related fuzzer pass still needs to be implemented.
Fixes#3592 .
This change adds the notion of "overflow ids", which can be used
during shrinking to facilitate applying transformations that would
otherwise have become inapplicable due to earlier transformations
being removed.
It is possible that the result of a void function call is used. In case
it is used, we need something that still defines its id after inlining.
We use an undef for that purpose.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/3704
CCP should mark IR changed if it created new constants.
This fixes#3636.
When CCP is simulating statements, it will sometimes successfully fold
an instruction, which laters switches to varying. The initial fold of
the instruction may generate a new constant K.
The problem we were running into is when K never gets propagated to the
IR. Its definition will still exist, so CCP should mark the IR modified
in this case.
In fixing this bug, I noticed that an existing test was suffering from
the same bug. The change also makes PassTest::SinglePassRunAndMatch()
return the result from the pass, so that we can check that the pass
marks the IR modified in this case.
Adds FuzzerPassAddCompositeInserts, which randomly adds new
OpCompositeInsert instructions. Each OpCompositeInsert instruction
yields a copy of an original composite with one subcomponent replaced
with an existing or newly added object. Synonym facts are added for the
unchanged components in the original and added composite, and for the
replaced subcomponent and the object, if possible.
Fixes#2859
For FuzzerPassAddParameters, adds pointer types (that have the storage
class Function or Private) to the pool of available types for new
parameters. If there are no variables of the chosen pointer type, it
invokes TransformationAddLocalVariable / TransformationAddGlobalVariable
to add one.
Part of #3403
In the existing code, ADCE pass does not check DebugScope of an
instruction when it checks the users of each instruction, which results
in removing OpenCL.Debug.100 instructions that are only used by
DebugScope. This commit lets ADCE pass add DebugScope of an instruction
to the live instruction set when the instruction is added to the live
instruction set.
`TransformationAddDeadBlock` did not check whether the existing block
(that will become a selection header) dominates its successor block (that
will become its merge block).
This change adds the check.
Fixes#3690.
Some OpenCL.DebugInfo.100 instructions such as DebugGlobalVariable
and DebugLocalVariable have the Type operand. This commit allows them to
use DebugTypeTemplate for the Type operand.
Improves the code coverage of tests for the following transformations:
1. TransformationAddRelaxedDecoration
2. TransformationReplaceCopyMemoryWithLoadStore
3. TransformationReplaceCopyObjectWithStoreLoad
4. TransformationReplaceLoadStoreWithCopyMemory
5. TransformationReplaceAddSubMulWithCarryingExtended
Support identical predecessors in TransformationPropagateInstructionUp.
A basic block may have multiple identical predecessors as follows:
%1 = OpLabel
OpSelectionMerge %2 None
OpBranchConditional %true %2 %2
%2 = OpLabel
...
This case wasn't supported before.
* No longer blindly add global non-semantic info instructions to global
types and values
* functions now have a list of non-semantic instructions that succeed
them in the global scope
* global non-semantic instructions go in global types and values if
they appear before any function, otherwise they are attached to the
immediate function predecessor in the module
* changed ADCE to use the function removal utility
* Modified EliminateFunction to have special handling for non-semantic
instructions in the global scope
* non-semantic instructions are moved to an earlier function (or full
global set) if the function they are attached to is eliminated
* Added IRContext::KillNonSemanticInfo to remove the tree of
non-semantic instructions that use an instruction
* this is used in function elimination
* There is still significant work in the optimizer to handle
non-semantic instructions fully in the optimizer
`TransformationAddTypeFloat` and `TransformationAddTypeInt` did not check whether the required capabilities were present when adding 16-bit, 64-bit, and 8-bit types.
This change adds these checks in the `IsApplicable` method of each transformation.
Fixes#3669.
`TransformationReplaceIdWithSynonym` is careful to avoid replacing id uses that index into a struct with synonyms because the indices must only be `OpConstant` instructions. However, the check only considered `OpAccessChain` instructions, even though the same restriction applies to `OpInBoundsAccessChain`, `OpPtrAccessChain`, etc.
This change extends the check to include all access chain instructions.
Fixes#3671.
Given an instruction (that may use an OpPhi result from the same block as an input operand), try to clone the instruction into each predecessor block, replacing the input operand with the corresponding OpPhi input operand in each case, if necessary.
Fixes#3458.
Replaces OpIAdd with OpIAddCarry, OpISub with OpISubBorrow, OpIMul with
OpUMulExtended or OpSMulExtended and stores the result into a fresh_id
representing a structure. Extracts the first element of the result into
the original result_id. This value is the same as the result of the
original instruction.
Fixes#3577
This PR changes FuzzerPassOutlineFunctions so that it uses some
transformation that make the TransformationOutlineFunction
transformation applicable in more cases. See the discussion in
#3095 for more details.
Fixes#3095.
The updated OpenCL.DebugInfo.100 spec says that we can use
variable for "Component Count" operand of DebugTypeArray i.e.,
DebugLocalVariable or DebugGlobalVariable. This commit updates spirv-val
based on the spec update.
For each local variable, ssa-rewrite should remove its DebugDeclare
if and only if it is replaced by any number of DebugValues for store
and phi instructions.
For example, when we have two variables `a` whose DebugDeclare
will be replaced to DebugValues by ssa-rewrite pass and `b` whose
DebugDeclare will not be replaced, we have to remove only DebugDeclare
for `a`, not `b`.
This PR introduces TransformationAddLoopPreheader, which, given
a loop header and enough fresh ids, adds a loop preheader, updating
all the references so that this new block is the only out-of-loop
predecessor of the header, which branches unconditionally to the
header.
See the discussion in #3095.
When we copy the loop body to unroll it, we have to copy its
instructions but DebugDeclare or DebugValue used for the declaration
i.e., DebugValue with Deref must not be copied and only the first block
can contain those instructions.
* Generate ext inst table for reflection
* Change build to use grammar files from SPIRV-Headers instead of
SPIRV-Tools
* Add enum for clspv reflection extended instruction set
* count it as non-semantic
* validate clspv reflection extended instruction set
* Remove local extended inst sets
* update headers deps
* Update nbuilds to use grammars from SPIRV-Headers instead of
local duplicates
Rename the `${SPIRV_TOOLS}` target to `${SPIRV_TOOLS}-static` and alias `${SPIRV_TOOLS}` to either `${SPIRV_TOOLS}-static` or `${SPIRV_TOOLS}-shared` depending on `BUILD_SHARED_LIBS`.
Re-point all internal uses of `${SPIRV_TOOLS}` to `${SPIRV_TOOLS}-static`.
`${SPIRV_TOOLS}-static` is explicitly renamed to just `${SPIRV_TOOLS}` to ensure the name does not change from current behavior.
Build the `SPIRV-Tools-*` libraries as static, as this is what they always were.
Force the external targets `gmock` and `effcee` to be built statically. These either do not support being built as shared libraries, or require special flags.
Issue: #3482
* Avoid operand type range checks
Deprecates the SPV_OPERAND_TYPE_FIRST_* and SPV_OPERAND_TYPE_LAST_*
macros.
The "variable" and "optional" operand types are only for internal use.
Export spvOperandIsConcrete instead, as that should cover intended
external uses.
Test that each operand type is classified either as one of:
- a sentinel value
- a concrete operand type
- an optional operand type (which includes variable-expansion types)
Test that each concrete and optional non-variable operand type
has a name for use internally when generating messages.
Co-authored-by: Steven Perron <stevenperron@google.com>
1. Set the debug scope and line information for the new replacement
instructions.
2. Replace DebugDeclare and DebugValue if their OpVariable or value
operands are replaced by scalars. It uses 'Indexes' operand of
DebugValue. For example,
struct S { int a; int b;}
S foo; // before scalar replacement
int foo_a; // after scalar replacement
int foo_b;
DebugDeclare %dbg_foo %foo %null_expr // before
DebugValue %dbg_foo %foo_a %Deref_expr 0 // after
DebugValue %dbg_foo %foo_b %Deref_expr 1 // means Value(foo.members[1]) == Deref(%foo_b)
* spirv-val: Pipes are no longer required for OpenCL 1.2 EP
This was removed from the OpenCL SPIR-V Environment Specification via
https://github.com/KhronosGroup/OpenCL-Docs/pull/56.
* spirv-val: Sort headers according to clang-format
* spirv-val: Groups capability is required since OpenCL 2.0
Adds a transformation that takes a pair of instruction descriptors to
OpLoad and OpStore that have the same intermediate value and replaces
the OpStore with an equivalent OpCopyMemory.
Fixes#3353.
Right now, TransformationRecordSynonymousConstants requires the type
ids of two candidate constants to be exactly the same.
This PR adds an exception for integer constants, which can be
considered equivalent even if their signedness is different.
This applies to both integers and vector constants.
The IsApplicable method of ReplaceIdWithSynonym is also updated so
that, in the case of two integer constants which don't have the same
type, they can only be swapped in particular instructions (those
that don't take the signedness into consideration).
Fixes#3536.
This PR generalises TransformationAddAccessChain so that dynamic
indices for non-struct composites (with clamping to ensure that
accesses are in-bound) are allowed.
The transformation will add instructions to clamp any index to
a non-struct composite, regardless of whether it is a constant
or not.
Fixes#3179.
Fixes an issue with the shrinker, where the message consumer set for
the shrinker was not being passed on to the replay object that the
shrinker creates. This meant that messages generated during replay
would cause an exception to be thrown.
Adds a transformation that replaces instruction OpCopyMemory with
loading the source variable to an intermediate value and storing this
value into the target variable of the original OpCopyMemory instruction.
Fixes#3352
Adds a transformation that replaces instruction OpCopyObject with
storing into a new variable and immediately loading this variable to
|result_id| of the original OpCopyObject instruction.
Fixes#3351.
Essentially, it marks all DebugInfo instructions in functions (and their operands) as live. It treats DebugDeclare and DebugValue with Deref as loads and so marks Stores of their variables as live.
It marks each DebugGlobalVariables as live except for its variable. After closure, it rechecks if the variable is live. If not, the DebugGlobalVariable instruction's variable operand is set to DebugInfoNone, per the DebugInfo spec.
Implemented AreEquivalentConstants method to check equivalency of
constants, changing IsApplicable method of
TransformationRecordSynonymousConstants to allow recording equivalence
of composite constants; added some tests to check this.
Tests with arrays and matrices still need to be added.
Fixes#3533.
Add TransformationAddRelaxedDecoration, which adds the RelaxedPrecision decoration to ids of numeric instructions (those yielding 32-bit ints or floats) in dead blocks.
Fixes#3502
Now that WebGPU ingests WGSL instead of SPIR-V,
there is no need to be so strict about the memory model.
Allow any memory model that is already allowed by Vulkan 1.0,
either directly or via an existing.
Fixes#3529
* Make BasicBlock::reachable() only consider static reachability
* Fix reachability calculation to be independent of block order
* add tests
This pass basically follows the same process as ssa-rewrite: it adds a DebugValue after each Store and removes the DebugDeclare or DebugValue Deref. It only does this if all instructions that are dependent on the Store are Loads and are replaced.
This commit lets the vector DCE pass preserve the OpenCL.DebugInfo.100
information properly. When the vector DCE pass determines the liveness
of instructions, the debug instructions must not affect the decision. In
addition, when it kills some instructions, it has to kill DebugValue
instructions that use the killed instructions. When it updates some
composite values to meaningful values (not undef), it has to remove
DebugValue because the value information becomes incorrect.
The decision to reduce the load must be not affected by debug
instructions. For example, even when a DebugValue references a
result id of a loaded composite value, this change lets the
reduce-load-size pass reduce the load if the full composite value is not
used anywhere other than the DebugValue.
When the pass replaces the local variable `OpVariable` ids to their
corresponding pointers, we have to update operands of DebugValue or
DebugDeclare instructions.
When there are multiple entries and the shader has a variable with
WorkGroup storage class, those multiple entry functions store values to
the variable. Since ADCE pass uses def-use chains to propagate the work
list, some of instructions in the work list are not actually a part of
the currently processed function. As a result, it adds instructions in
other functions and put them in |live_insts_|. However, it does not
have the control flow information for those instructions in other
functions i.e., |block2headerBranch_| and |header2nextHeaderBranch_|.
When it processes those instructions (they are added when it processes a
different function), it skips handling them because they are already in
|live_insts_| and does not check |block2headerBranch_| and
|header2nextHeaderBranch_|, which results in skipping some branches.
Even though those branches are live branches, it considers they are dead
branches.
For many spirv-opt passes such as simplify-instructions pass, we have to
correctly clear the OpenCL.DebugInfo.100 debug information for
KillInst() and ReplaceAllUses(). If we keep some debug information that
disappeared because of KillInst() and ReplaceAllUses(), adding new
DebugValue instructions based on the existing DebugDeclare information
will generate incorrect information. This CL update DebugInfoManager
and IRContext to correctly clear debug information.
* Add validation that input/output locations are assigned correctly
* Account for component assignment
* Account for 4 components per location and track the combined
coordinate
* Account for multiple output indexes
* handle specifically arrayed variables
* Arrayed variables that specify a component get locations determined
per index of the array for the sub type
* Added tests that check locations and components can be assigned
to interleave an array's locations and components
* Fix up which interfaces are allowed to be arrayed for various shader
stages based on glslang
* Support load and extract pattern in desc_sroa.
* Fix typo in comments.
* Load replacement var before use; and added test.
* fix formatting
* Address code review comments.
Add OpenCL.DebugInfo.100 `DebugValue` instructions for store
and phi instructions of local variables to provide the debugger with
the updated values of local variables correctly.
* Debug info preservation in dead branch elimination
The DeadBranchElimPass class already handles the OpenCL.DebugInfo.100
instructions correctly. This commit just adds a unit test to make sure
it preserves the information properly.
* Add unit test of ReplaceAllWith for debug instruction
* Debug info preservation in ccp pass
For constant propagation, the ccp pass already replaces the result id of
a value with a result id of the corresponding constant value. As a part
of the replacement, it correctly updates the operands of
DebugValue/DebugDeclare as well. Since we do not have to any addition
work other than the ccp pass itself, this commit just adds unit tests to
check the debug information preservation.
Need to actually instantiate them with TEST_P in the same source file.
Remove the instantiation of RoundTripTest from unit.cpp because
it's viewed as having no instantiations.
ImageGatherBiasLodAMD makes it possible to use the Bias and Lod image
operands with OpImageGather and OpImageSparseGather. This commit makes
sure the validator checks for that capability before reporting errors
and adds a few positive tests.
Handles the OpenCL100Debug extension in inlining. It preserves the information that is available while also adding the debug inlined at for all of the inlining that it does.
Reject folding comparisons with unfoldable types.
Fixes#3343
When CCP is evaluating an instruction, it was trying to fold a
comparison with 64 bit integers. This was causing a fold failure later
since the folder still cannot deal with 64 bit integers.
ssa-rewrite fails in `MemPass::GetPtr()` when the SPIR-V code contains
`OpLoad` for the result id of `OpConstantNull` because of the out of
index access for an operand to get the base address. This commit fixes
it.
Fixes#3344
In this PR, the classes that represent the adjust branch weights
transformation and fuzzer pass were implemented. This transformation
adjusts the branch weights of a OpBranchConditional instruction.
- No longer inline functions with early exits. Merge return can modify them so they can be inlined.
- Otherwise no functional change, should be just refactoring.
Extends the pass for removing unused instructions so that it can
remove global declarations (such as types and variables) that are only
used by decorations with which they are intimately connected, such as
descriptor set and binding decorations.
* Do merge return if the return is not at the end of the function.
We will remove the code in inlining to handle a return in the middle of
a function. To inline those functions, we need to run merge return to
move the return to the end of the function.
Generalizes the IsReadOnlyVariable() method, and related methods, so
that they can be used to ask whether pointer result ids are read-only.
Fixes#3324.
Many high-level languages like HLSL and GLSL generate termination
instructions such as return and branch from the actual part of the
high-level language code like return and if statements. This commit lets
IrLoader set `DebugScope` for termination instructions.
The outliner would outline regions ending with a loop header, making
the block containing the call to the outlined function serve as the
loop header. This, however, is incorrect in general, since the whole
outlined function -- rather than just the exit block for the region --
would end up getting called every time the loop would iterate.
This change restricts the outliner so that the last block in a region
cannot be a loop header.
We need an analysis for OpenCL.DebugInfo.100 extension instructions such
as a map between function id and its DebugFunction. This commit add an
analysis for it.
It has been resolved that statically out-of-bounds accesses are not
invalid in SPIR-V (they lead to undefind behaviour at runtime but
should not cause a module to be rejected during validation). This
change tolerates such accesses in donated code, clamping them in-bound
as part of making a function live-safe.
The Sample argument of OpImageTexelPointer is sometimes required to be
a zero constant. It thus cannot be replaced with a synonym in
general. This change avoids replacing this argument with a synonym.
The fact manager maintains an equivalence relation on data descriptors
that tracks when one data descriptor could be used in place of
another. An algorithm to compute the closure of such facts allows
deducing new synonym facts from existing facts. E.g., for two 2D
vectors u and v it is known that u.x is synonymous with v.x and u.y is
synonymous with v.y, it can be deduced that u and v are synonymous.
The closure computation algorithm is very expensive if we get large
equivalence relations.
This change addresses this in three ways:
- The size of equivalence relations is reduced by limiting the extent
to which the components of a composite are recursively noted as
being equivalent, so that when we have large synonymous arrays we do
not record all array elements as being pairwise equivalent.
- When computing the closure of facts, equivalence classes above a
certain size are simply skipped (which can lead to missed facts)
- The closure computation is performed less frequently - it is invoked
explicitly before fuzzer passes that will benefit from data synonym
facts. A new transformation is used to control its invocation, so
that fuzzing and replaying do not get out of sync.
The change also tidies up the order in which some getters are declared
in FuzzerContext.
Adds an extra condition on when a region can be outlined to avoid the
case where a region ends with a loop head but such that the loop's
continue target is in the region. (Outlining such a region would mean
that the loop merge is in the original function and the continue target
in the outlined function.)
The function outliner uses a struct to return ids that a region
generates and that are used outside that region. If these ids have
pointer type this would result in a struct with pointer members, which
leads to illegal loading from non-logical pointers if logical
addressing is used. This change bans that outlining possibility.
Provides support for runtime arrays in the code that traverses
composite types when checking applicability of transformations that
replace ids with synonyms.
Demotes the image storage class to Private during donation. Also
fixes an issue where instructions that depended on non-donated global
values would not be handled properly.
The SPIR-V data rules say that all uses of an OpSampledImage
instruction must be in the same block as the instruction, and highly
restrict those instructions that can consume the result id of an
OpSampledImage.
This adapts the transformations that split blocks and create synonyms
to avoid separating an OpSampledImage use from its definition, and to
avoid synonym-creation instructions such as OpCopyObject consuming an
OpSampledImage result id.
* Preserve debug info in eliminate-dead-functions
The elimination of dead functions makes OpFunction operand of
DebugFunction invalid. This commit replaces the operand with
DebugInfoNone.
* Handle more cases in dead member elim
- Rewrite composite insert and extract operations on SpecConstnatOp.
- Leaves assert for Access chain instructions, which are only allowed
for kernels.
- Other operations do not require any extra code will no longer cause an
assert.
Fixes#3284.
Fixes#3282.
The management of equation facts suffered from two problems:
(1) The processing of an equation fact required the data descriptors
used in the equation to be in canonical form. However, during
fact processing it can be deduced that certain data descriptors
are equivalent, causing their equivalence classes to be merged,
and that could cause previously canonical data descriptors to no
longer be canonical.
(2) Related to this, if id equations were known about a canonical data
descriptor dd1, and other id equations known about a different
canonical data descriptor dd2, the equation facts about these data
descriptors were not being merged in the event that dd1 and dd2
were deduced to be equivalent.
This changes solves (1) by not requiring equation facts to be in
canonical form while processing them, but instead always checking
whether (not necessary canonical) data descriptors are equivalent when
looking for corollaries of equation facts, rather than comparing them
using ==.
Problem (2) is solved by adding logic to merge sets of equations when
data descriptors are made equivalent.
In addition, the change also requires elements to be registered in an
equivalence relation before they can be made equivalent, rather than
being added (if not already present) at the point of being made
equivalent.
This change increases the extent to which arbitrary SPIR-V can be used
by the fuzzer pass that donates modules. It handles the case where
various ingredients (such as types, variables and particular
instructions) cannot be donated by omitting them, and then either
omitting their dependencies or replacing their dependencies with
alternative instructions.
The change pays particular attention to allowing code that manipulates
image types to be handled (by skipping anything image-specific).
(1) Runtime arrays are turned into fixed-size arrays, by turning
OpTypeRuntimeArray into OpTypeArray and uses of OpArrayLength into
uses of the constant used for the length of the fixed-size array.
(2) Atomic instructions are not donated, and uses of their results are
replaced with uses of constants of the result type.
The fuzzer pass that constructs composites had an issue where it would
regard isomorphic but distinct structs (similarly arrays) as being
interchangeable when constructing composites. This change fixes the
problem by relying less on the type manager.
Some transformations (e.g. TransformationAddFunction) rely on running
the validator to decide whether the transformation is applicable. A
recent change allowed spirv-fuzz to take validator options, to cater
for the case where a module should be considered valid under
particular conditions. However, validation during the checking of
transformations had no access to these validator options.
This change introduced TransformationContext, which currently consists
of a fact manager and a set of validator options, but could in the
future have other fields corresponding to other objects that it is
useful to have access to when applying transformations. Now, instead
of checking and applying transformations in the context of a
FactManager, a TransformationContext is used. This gives access to
the fact manager as before, and also access to the validator options
when they are needed.
When DebugScope is given in SPIR-V, each instruction following the
DebugScope is from the lexical scope pointed by the DebugScope in
the high level language. We add DebugScope struction to keep the
scope information in Instruction class. When ir_loader loads
DebugScope/DebugNoScope, it keeps the scope information in
|last_dbg_scope_| and lets following instructions have that scope
information.
In terms of DebugDeclare/DebugValue, if it is in a function body
but outside of a basic block, we keep it in |debug_insts_in_header_|
of Function class. If it is in a basic block, we keep it as a normal
instruction i.e., in a instruction list of BasicBlock.
Create a pass to instrument OpDebugPrintf instructions. This pass replaces all OpDebugPrintf instructions with instructions to write a record containing the string id and the all specified values into a special printf output buffer (if space allows). This pass is designed to support the printf validation in the Vulkan validation layers.
Fixes#3210
In this PR, the classes that represent the toggle access chain
instruction transformation and fuzzer pass were implemented. This
transformation toggles the instructions OpAccessChain and
OpInBoundsAccessChain between them.
Fixes#3193.
This introduces a new fuzzer pass to add instructions to the module
that define equations, and support in the fact manager for recording
equation facts and deducing synonym facts from equation facts.
Initially the only equations that are supported involve OpIAdd,
OpISub, OpSNegate and OpLogicalNot, but there is scope for adding
support for equations over various other operators.
According to SPV_AMD_shader_image_load_store_lod spec, Lod operand is
valid with OpImageRead, OpImageWrite, or OpImageSparseRead if the
extension is enabled.
This change adds a fuzzer pass that sprinkles access chain
instructions into a module at random. This allows other passes to
have a richer set of pointers available to them, in particular the
passes that add loads and stores.
Adds a fuzzer pass that inserts function calls into the module at
random. Calls from dead blocks can be arbitrary (so long as they do
not introduce recursion), while calls from other blocks can only be to
livesafe functions.
The change fixes some oversights in transformations to replace
constants with uniforms and to obfuscate constants which testing of
this fuzzer pass identified.
This change ensures that global and local variables donated from other
modules are always initialized at their declaration in the module
being transformed. This is to help limit issues related to undefined
behaviour that might arise due to accessing uninitialized memory.
The change also introduces some helper functions in fuzzer_util to
make it easier to find the pointee types of pointer types.
This change adds fuzzer passes that sprinkle loads and stores into a
module at random, with stores restricted to occur in either dead
blocks, or to use pointers for which it is known that the pointee
value does not influence the module's overall behaviour.
The change also generalises the VariableValueIsArbitrary fact to
PointeeValueIsIrrelevant, to allow stores through access chains or
object copies of variables whose values are known to be irrelevant.
The change includes some other minor refactorings.
Adds two new fuzzer passes to add variables to a module: one that adds
Private storage class global variables, another that adds Function
storage class local variables.
If the fuzzer object-copies a pointer we would like to be able to
perform loads from the copy (and stores to it, if its value is known
not to matter). Undefined and null pointers present a problem here,
so this change disallows copying them.
This adds support for replacing TimeAMD with OpReadClockKHR. The scope
for OpReadClockKHR is fixed to be a subgroup because TimeAMD operates
only on subgroup.
* Implement constant folding for many transcendentals
This change adds support for folding of sin/cos/tan/asin/acos/atan,
exp/log/exp2/log2, sqrt, atan2 and pow.
The mechanism allows to use any C function to implement folding in the
future; for now I limited the actual additions to the most commonly used
intrinsics in the shaders.
Unary folder had to be tweaked to work with extended instructions - for
extended instructions, constants.size() == 2 and constants[0] ==
nullptr. This adjustment is similar to the one binary folder already
performs.
Fixes#1390.
* Fix Android build
On old versions of Android NDK, we don't get std::exp2/std::log2
because of partial C++11 support.
We do get ::exp2, but not ::log2 so we need to emulate that.
This change adds a new kind of fact to the fact manager, which records
when a variable (or pointer parameter) refers to an arbitrary value,
so that anything can be stored to it, without affecting the observable
behaviour of the module, and nothing can be guaranteed about values
loaded from it. Donated modules are the current source of such
variables, and other transformations, such as outlining, have been
adapted to propagate these facts appropriately.
This change allows the generator to (optionally and at random) make
the functions of a module "livesafe" during donation. This involves
introducing a loop limiter variable to each function and gating the
number of total loop iterations for the function using that variable.
It also involves eliminating OpKill and OpUnreachable instructions
(changing them to OpReturn/OpReturnValue), and clamping access chain
indices so that they are always in-bounds.
We must treat a branch to the merge node of a switch that is in the
header of a construct as a nested construced. The original merge
instruction is still needed in that case.
Fixes#3139
* If the header of the construct is also a merge block, jump to the
associated header instead of the immediate dominator
* prevents spurious failures from unrelated constructs
* new tests
This new API lets clients request a minimal spv_target_env value
that supports a given Vulkan and SPIR-V version, by a generic
numbering scheme (as already defined by Vulkan and SPIR-V specs).
This breaks a formal source dependency from Glslang to SPIRV-Tools.
When a new API is rolled out, such as Vulkan 1.2, Glslang currently
needs to reference a specific SPIRV-Tools enum by name.
* Allow OpExtInst for DebugInfo between secion 9 and 10
Fixes#3086
* Handle spirv-opt errors on DebugInfo Ext
* Add IR Loader test
* Fix ir loader bug
* Handle DebugFunction/DebugTypeMember forward reference
* Add test cases (forward reference to function)
* Support old DebugInfo extension
* Validate local debug info out of function
This change refactors some code for walking access chain indexes to
make it mirror the structure of other code (to improve readability in
the first instance and potentially enable a future refactoring to
extract common code), and fixes a problem related to module donation
and function types.
As explained in #3118, spirv-opt merge-blocks pass causes a
spirv-val error when an OpBranch has an OpLine in front of it.
OpLoopMerge
OpBranch ; Will be killed by merge-blocks pass
OpLabel ; Will be killed by merge-blocks pass
OpLine ; will be placed between OpLoopMerge and OpBranch - error!
OpBranch
To fix this issue, this commit moves line info of OpBranch to
OpLoopMerge.
Fixes#3118
This adds a new kind of fact to the fact manager that knows whether a
block is dead - i.e. guaranteed to be statically unreachable - and a
new transformation for adding a selection construct to a CFG that
conditionally branches to a fresh, dead block, such that the branch
will never be dynamically taken. Transformations that may create new
blocks ('split block' and 'outline function') are updated to propagate
dead block facts to newly-created blocks where appropriate. A fuzzer
pass randomly adds dead blocks to the module.
Future transformations will be able to exploit the fact that such
blocks are known to be dead.
This change adds a fuzzer pass that allows code from other SPIR-V
modules to be donated into the module under transformation. It also
changes the command-line options of the tools so that, in fuzzing
mode, a file must be specified that contains the names of available
donor modules.
* Clone opencl.debuginfo.100 grammar from debuginfo grammar
Update version number to 200 revision 2
* Apply content from OpenCL.DebugInfo.100 extension text
* Rename grammar file
* Support OpenCL.DebugInfo.100 extended instructions
Add support for prefixing operand type names, to disambiguate
them between different instruction sets.
* Add tests for OpenCL.DebugInfo.100
* Support lookup of OpenCL.DebugInfo.100 extinst
* Add tests for enum values
* Recognize 2017-2019 as copyright date range
* Android.mk: support OpenCL.DebugInfo.100 extended instruction set
Also, stop generating core instruction tables for non-unified1 versions
of the grammar.
* Imported entity operand type is concrete
* Bazel: Suppoort OpenCL.DebugInfo.100
* BUILD.gn: Support OpenCL.DebugInfo.100
In the context of SPIR-V 1.4 or higher, global variables cannot be
used by an instruction unless they are listed in the interface of all
entry points that might invoke the instruction. This change
conservatively adds new global variables to the interfaces of all
entry points (if the SPIR-V version is 1.4 or higher).
Issue #3111 notes that a more rigorous approach to entry point
interfaces could be taken in spirv-fuzz, which would allow being less
conservative here.
This change prevents the spirv-fuzz function outliner from outlining a
region that uses the result of an OpAccessChain not defined inside the
region. Such accesses were turning into parameters to the outlined
function, and the result of an OpAccessChain cannot be passed as a
function parameter according to the SPIR-V specification.
Add support for SPV_KHR_non_semantic_info
This entails a couple of changes:
- Allowing unknown OpExtInstImport that begin with the prefix `NonSemantic.`
- Allowing OpExtInst that reference any of those sets to contain unknown
ext inst instruction numbers, and assume the format is always a series of IDs
as guaranteed by the extension.
- Allowing those OpExtInst to appear in the types/variables/constants section.
- Not stripping OpString in the --strip-debug pass, since it may be referenced
by these non-semantic OpExtInsts.
- Stripping them instead in the --strip-reflect pass.
* Add adjacency validation of non-semantic OpExtInst
- We validate and test that OpExtInst cannot appear before or between
OpPhi instructions, or before/between OpFunctionParameter
instructions.
* Change non-semantic extinst type to single value
* Add helper function spvExtInstIsNonSemantic() which will check if the extinst
set is non-semantic or not, either the unknown generic value or any future
recognised non-semantic set.
* Add test of a complex non-semantic extinst
* Use DefUseManager in StripDebugInfoPass to strip some OpStrings
* Any OpString used by a non-semantic instruction cannot be stripped, all others
can so we search for uses to see if each string can be removed.
* We only do this if the non-semantic debug info extension is enabled, otherwise
all strings can be trivially removed.
* Silence -Winconsistent-missing-override in protobufs
* Make Instrumentation format version 2 the default (Step 1)
Add new interfaces without version number argument. Remove version 1
logic and tests. Version interfaces will be removed in step 2 after
layers have transitioned to new interface.
* Add error messages to InstrumentPass().
* Don't crash when folding construct of empty struct
An OpCompositeConstruct of an empty struct will be folded to a constant
under normal circumstances. However, if the id limit has been reached
and the constant cannot be generated, then other folding rules will be
tried.
These rules do not handle the case of an empty struct. We add allow it
to be handled.
Fixes http://crbug/1030194
* Changes based on the review.
A new transformation and associated fuzzer pass in spirv-fuzz that
selects single-entry single-exit control flow graph regions and for
each selected region outlines the region into a new function and
replaces the original region with a call to this function.
The passes that add dead breaks and continues suffer from the
challenge that a new control flow graph edge can change dominance
information, leading to the potenital for definitions to no longer
dominate their uses. The attempt at guarding against this was known
to be incomplete. This change calls on the SPIR-V validator to do the
necessary checking: in deciding whether adding such an edge would be
legitimate, we clone the module, add the edge, and use the validator
to check whether the transformed clone is valid.
This strategy is heavy-weight, and should be used sparingly, but seems
like a good option when the validity of transformations is intricate,
to avoid reimplementing swathes of validation logic in the fuzzer.
Fixes#2919.
Access chain indices are always interpreted as signed integers.
So use signed clamp instead of unsigned clamp. We must also
clamp to the max signed int for the index type.
Fixes#3072
* Validate that if a construct contains a header and it's merge is
reachable, the construct also contains the merge
* updated block merging to not merge into the continue
* update inlining to mark the original block of a single block loop as
the continue
* updated some tests
* remove dead code
* rename kBlockTypeHeader to kBlockTypeSelection for clarity
Adds an option to run the validator on the SPIR-V binary after each
fuzzer pass has been applied, to help identify when the fuzzer has
made the module invalid. Also adds a helper method to allow dumping
of the sequence of transformations that have been applied to a JSON
file.
Prior to this change, TransformationReplaceIdWithSynonym was designed
to be able to replace an id with some synonymous data descriptor,
possibly necessitating extracting from a composite into a fresh id in
order to get at the synonymous data. This change simplifies things so
that TransformationReplaceIdWithSynonym just allows one id to be
replaced by another id. It is the responsibility of the associated
fuzzer pass - FuzzerPassApplyIdSynonyms - to perform the extraction
operations, using e.g. TransformationCompositeExtract.
Re-enable OpReadClockKHR validation
Fixes#2952
* Refactor some common scope validation
* Perform correct validation for scope in OpReadClockKHR
* Scope must be Subgroup or Device
* new tests
Inroduces a new transformation that adds a vector shuffle instruction
to the module, with associated facts about how the result vector of
the shuffle relates to the input vectors.
A fuzzer pass to add such transformations is not yet in place.
When a data synonym fact about two composites is added, data synonym
facts between all sub-components of the composites are also added.
Furthermore, when data synonym facts been all sub-components of two
composites are known, a data synonym fact relating the two composites
is added. Identification of this case is done in a lazy manner, when
questions about data synonym facts are asked.
The change introduces helper methods to get the size of an array type
and the number of elements of a struct type, and fixes
TransformationCompositeExtract to invalidate analyses appropriately.
An equivalence relation is computed by traversing the tree of values
rooted at the class's representative. Children were represented by
unordered sets, meaning that the order of values in an equivalence
class could be nondeterministic. This change makes things
deterministic by representing children using a vector.
The path compression optimization employed in the implementation of
the underlying union-find data structure has the potential to change
the order in which elements appear in an equivalence class by changing
the structure of the tree, so the guarantee of determinism is limited
to being a deterministic function of the manner in which the
equivalence relation is updated and inspected.
This change fixes a bug in EquivalenceRelation, changes the interface
of EquivalenceRelation to avoid exposing (potentially
nondeterministic) unordered sets, and changes the interface of
FactManager to allow querying data synonyms directly. These interface
changes have required a lot of corresponding changes to client code
and tests.
Implements the following simplifications:
(a - b) + b => a
(a * b) + (a * c) => a * (b + c)
Also adds logic to simplification to handle rules that create new operations
that might need simplification, such as the second rule above.
Only perform the second simplification if the multiplies have the add as their
only use. Otherwise this is a deoptimization of size and performance.
At present, TransformationReplaceIdWithSynonym both extracts elements
from composite objects and replaces uses of ids with synonyms. This
new TransformationCompositeExtract class will allow that
transformation to be broken into smaller transformations.
Class TransformationConstructComposite has been renamed to
TransformationCompositeConstruct, to correspond to the name of the
SPIR-V instruction (as is done with e.g. TransformationCopyObject).
Running tests revealed an issue related to checking dominance in
TransformationReplaceIdWithSynonym, which is also fixed here.
This change uses the recently-added equivalence relation class to
re-work the way synonyms between data values are managed by the fact
manager.
The tests for 'transformation_replace_id_with_synonym' have been
temporarily removed. This is because those tests are going to be
split into a number of test classes in an upcoming PR, once some other
refactorings have been applied, and it would be burdensome to
temporarily refactor all the tests to be in a working state for this
intermediate change.
Adds a templated class for representing an equivalence relation on a
value data type. This will be used by spirv-fuzz for representing
sets of distinct pieces of data in a shader that are known to have
equal values.
A new pass that gives spirv-fuzz the ability to adjust the memory
operand masks associated with memory access instructions (such as
OpLoad and OpCopy Memory).
Fixes#2940.
We have a check that ensures that the optimizer did not change the
binary when it says that it did not. However, when the binary is
converted back to a binary, we made a decision to remove OpNop
instructions. This means that any spv file that contains a NOP
originally will fail this check.
To get around this, we convert the module to a second binary that keeps
the OpNop instructions. That binary is compared against the original.
Fixes https://crbug.com/1010191
This change refactors the 'split blocks' transformation so that an
instruction is identified via a base, opcode, and number of those
opcodes to be skipped when searching from the base, as opposed to the
previous design which used a base and offset.
* Validate that selections are structured
WIP
* new checks that switch and conditional branch are proceeded by a
selection merge where necessary
* Don't consider unreachable blocks
* Add some tests
* Changed how labels are marked as seen
* Moved check to more appropriate place
* Labels are now marked as seen when there are encountered in a
terminator instead of when the block is checked
* more tests
* more tests
* Method comment
* new test for a bad case
A refactoring that separates the identification of an instruction from
the identification of a use in an instruction, to enable the former to
be used independently of the latter.
A new pass that allows the fuzzer to change the 'loop control' operand
(and associated literal operands) of OpLoopMerge instructions.
Fixes#2938.
Fixes#2943.
Adds a fuzzer pass and transformation to create a composite (array,
matrix, struct or vector) from available constituent components, and
inform the fact manager that each component of the new composite is
synonymous with the id that was used to construct it. This allows the
"replace id with synonym" pass to then replace uses of said ids with
uses of elements extracted from the composite.
Fixes#2858.
* Remove Impl struct in Reducer; we can re-add it later (in a cleaner fashion) if we need to.
* Add cleanup passes in Reducer; needed so that removal of constants can be disabled during the main passes, and then enabled during cleanup passes, otherwise some main passes can perform worse due to lack of available constants.
* Delete passes: remove op name, remove relaxed precision. And delete associated tests.
* Add more tests for remove unreferenced instructions.
* Always return and write the output file, even if there was a reduction failure.
* Only exit with 0 if the reduction completed or we hit the reduction step limit.
Issue #2919 identifies a problem in spirv-fuzz's ability to determine
when it is safe to add a new control flow edge without breaking
dominance rules. This change adds a (currently disabled) test to
expose the issue, and a comment to document that the current solution
is incomplete.
We want to handle OpKill better. The wrap opkill causes lots of extra
code to be generated, even when they are not needed to avoid the main
problem: OpKill cannot be found directly in a continue construct.
This change will be more selective on which functions the OpKill will be
wrapped and inlining will avoid inlining.
Fixes#2912
* Add continue construct analysis to struct cfg analysis
Add the ability to identify which blocks are in the continue construct for a
loop, and to get functions that are called from those blocks, directly or
indirectly.
Part of https://github.com/KhronosGroup/SPIRV-Tools/issues/2912.
There is nothing in the spir-v spec that says the last
instructions in a module cannot be OpLine or OpNoLine.
However, the code that parses the module will simply drop
these instructions.
We add code that will preserve these instructions.
Strip-debug-info is updated to remove these instructions.
Fixes https://crbug.com/1000689.
Because dominance information becomes a bit unreliable when blocks are
unreachable, this change makes it so that the 'dead break'
transformation will not introduce a break to an unreachable block.
Fixes#2907.
This change introduces a robust check for whether an index in an
access chain is indexing into a struct, in which case the index needs
to be an OpConstant and cannot be replaced with a synonym.
Fixes#2906.
Issues #2898 and #2900 identify some cases where adding a dead
continue would lead to an invalid module, and these turned out to be
due to the lack of sensible dominance information when a continue
target is unreachable. This change requires that the header of a loop
dominates the loop's continue target if a dead continue is to be
added.
Furthermore, issue #2905 identified a shortcoming in the algorithm
being used to identify when it is OK, from a dominance point of view,
to add a new break/continue edge to a control flow graph. This change
replaces that algorithm with a simpler and more obviously correct
algorithm (that incidentally does not require the new edge to be a
break/continue edge in particular).
Fixes#2898.
Fixes#2900.
Fixes#2905.
Before this change, spirv-fuzz would replace a pointer argument to a
function call with a synonym, which is problematic when the synonym is
not a memory object declaration, since function call arguments are
required to be memory object declarations. This change adds a check
to ensure that such a replacement is not made.
Fixes#2896.
Before this change, spirv-fuzz would replace a constant boolean
argument to an OpPhi with the result of a binary operation, inserting
the instruction to compute the binary operation right before the
OpPhi, leading to an invalid module. This change conservatively
disallows replacing OpPhi arguments. Issue #2902 notes that there is
scope for being less conservative.
Fixes#2897.
* Handle extract with no indexes
It is possible that OpCompositeExtract instructions will not have any
indexes. This is not handled well by scalar replacement and instruction
folding.
Fixes https://crbug.com/1006435
* Fix typo.
This change to spirv-fuzz uses ideas from "Swarm Testing" (Groce et al. 2012), so that a random subset of fuzzer passes are enabled. These passes are then applied repeatedly in a randomized fashion, with the aggression with which they are applied being randomly chosen per pass.
There is plenty of scope for refining the probabilities introduce in this change; this is just meant to be a reasonable first effort.
* Use OpReturn* in wrap-opkill
The warp-opkill pass is generating incorrect code. It is placing an
OpUnreachable at the end of a basic block, when the block can be
reached. We can't reach the end of the block, but we can reach the end.
Instead we will add a return instruction.
Fixes#2875.
A previous change that disabled long-running tests by default failed
to enable short-running tests when long-running tests are enabled.
This change fixes that problem.
To aid in debugging issues in spirv-fuzz, this change adds an option whereby the SPIR-V module is validated after each transformation is applied during replay. This can assist in finding a transformation that erroneously makes the module invalid, so that said transformation can be debugged.
spirv-fuzz has useful tests that run the fuzzer and shrinker, to give
the whole tool a good shake up, effectively "fuzzing the fuzzer". The
problems that this detects are sensitive to the source of randomness
that is used, which can change from test platform to test platform.
It is thus not a good idea to run these tests by default during
continuous integration - they may end up failing due to environtal
factors, making it look like an unrelated change has broken the fuzzer
when really the fuzzer has revealed an already-existing bug in itself.
This change makes the tests disabled by default; they can enabled
during dedicated testing of the fuzzer.
The warp-opkill pass is generating incorrect code. It is placing an
OpUnreachable at the end of a basic block, when the block can be
reached. We can't reach the end of the block, but we can reach the end.
Instead we will add a return instruction.
Fixes#2875.
Many of the places in copy propagate arrays assumes that integer constant will be defined by an OpConstant instruction. That is not always true. We fix these spots by allowing for an OpConstantNull.
If the fuzzer's fact manager knows that ids A and B are synonymous, it
can replace a use of A with a use of B, so long as various conditions
hold (e.g. the definition of B must dominate the use of A, and it is
not legal to replace a use of an OpConstant in a struct's access chain
with a synonym that is not an OpConstant).
This change adds a fuzzer pass to sprinke such synonym replacements
through the module.
* When input or result is a pointer type also allow 32-bit integer
vectors for the other type
* Relaxation only applies to SPIR-V 1.5 or in the presence of
SPV_KHR_physical_storage_buffer
* new tests
* Vulkan specific checks
* storage buffer variables must be structs or arrays of structs
* storage buffer struct must be Block decorated
* uniform struct must be Block or BufferBlock decorated
* new tests
* Ensure same enum values have consistent extension lists
* val: fix checking of capabilities
The operand for an OpCapability should only be
checked for the extension or core version.
The InstructionPass registers a capability, and all its implied
sub-capabilities before actually checking the operand to an
OpCapability.
* Add basic support for SPIR-V 1.5
- Adds SPV_ENV_UNIVERSAL_1_5
- Command line tools default to spv1.5 environment
- SPIR-V 1.5 incorporates several extensions. Now the disassembler
prefers outputing the non-EXT or non-KHR names. This requires
updates to many tests, to make strings match again.
- Command line tests: Expect SPIR-V 1.5 by default
* Test validation of SPIR-V 1.5 incorporated extensions
Starting with 1.5, incorporated features no longer require
the associated OpExtension instruction.
If an OpKill instruction is inlined into a continue construct, then the
spir-v is no longer valid. To avoid this issue, we do inline into an
OpKill at all. This method was chosen because it is difficult to keep
track of whether or not you are in a continue construct while changing
the function that is being inlined into. This will work well with wrap
OpKill because every will still be inlined except for the OpKill
instruction itself.
Fixes#2554Fixes#2433
This reverts commit aa9e8f5380.
The implementation of these passes had overlooked the fact that adding
a new edge to a control flow graph can change dominance information.
Adding a dead break/continue risks causing uses to no longer be
dominated by their definitions. This change introduces various tests
to expose such scenarios, and augments the preconditions for these
transformations with checks to guard against the situation.
* Handle id overflow in the ssa rewriter.
Remove LocalSSAElim pass at the same time. It does the same thing as the SSARewrite pass. Then even share almost all of the same code.
Fixes crbug.com/997246
The first pass applies the RelaxedPrecision decoration to all executable
instructions with float32 based type results. The second pass converts
all executable instructions with RelaxedPrecision result to the equivalent
float16 type, inserting converts where necessary.
Add the first steps to removing the AMD extension VK_AMD_shader_ballot.
Splitting up to make the PRs smaller.
Adding utilities to add capabilities and change the version of the
module.
Replaces the instructions:
OpGroupIAddNonUniformAMD = 5000
OpGroupFAddNonUniformAMD = 5001
OpGroupFMinNonUniformAMD = 5002
OpGroupUMinNonUniformAMD = 5003
OpGroupSMinNonUniformAMD = 5004
OpGroupFMaxNonUniformAMD = 5005
OpGroupUMaxNonUniformAMD = 5006
OpGroupSMaxNonUniformAMD = 5007
and extentend instructions
WriteInvocationAMD = 3
MbcntAMD = 4
Part of #2814
* Refactor instruction folders
We want to refactor the instruction folder to allow different sets of
rules to be added to the instruction folder. We might want different
sets of rules in different circumstances.
We also need a way to add rules for extended instructions. Changes are
made to the FoldingRules class and ConstFoldingRules class to enable
that.
We added tests to check that we can fold extended instructions using the
new framework.
At the same time, I noticed that there were two tests that did not tests
what they were suppose to. They could not be easily salvaged. #2813 was
opened to track adding the new tests.
Adds a reduction pass that removes OpDecorate and OpMemberDecorate
instructions that annotate instructions and members with
RelaxedPrecision. As well as being useful in its own right, removing
such references allows other passes to remove further instructions.
Now we need to handle id overflow when we overflow while replacing uses of the variable. While looking at this code, I noticed an error in the way we handle access chains that cannot be replaced because of overflow. Name it will make some change, and then give up by returning SuccessWithoutChange. But it was changed.
This is fixed up by returning Failure if we notice the error at the time of rewriting the users. This is for both id overflow or out-of-bounds accesses.
Code is added to "CheckUses" to remove variables that have out-of-bounds accesses from the candidate list, so we don't even try to rewrite its uses.
Fixes https://crbug.com/995032
If we run out of ids when creating a new variable, sroa does not recognize
the error, and continues doing work. This leads to segmentation faults.
Fixes https://crbug/969655
Fixes#2793
* Don't special case matrix validation compared to other composites
* just check the constituents are constants or undefs
* later checking validates the column type
* new test
We are no able to inline OpKill instructions into a continue construct.
See #2433. However, we have to be able to inline to correctly do
legalization. This commit creates a pass that will wrap OpKill
instructions into a function of its own. That way we are able to inline
the rest of the code.
The follow up to this will be to not inline any function that contains
an OpKill.
Fixes#2726
This also fixes ADCE to not remove possibly needed OpTypeForwardPointer.
The bug, its fix and the corresponding test have a circular dependency
with the extension, so they are packaged together.
If a member of a struct has a relaxed precision, sroa will not split the
struct. This means we do not get all cases. This commit handles these
cases. The other part is that the decoration needs to be passed on to
the new variables.
Fixes#2786
This transformation can introduce an instruction that uses
OpCopyObject to make a copy of some other result id. This change
introduces the transformation, but does not yet introduce a fuzzer
pass to actually apply it.
Fixes#2768
* In scalar replacement, interpret access chain indexes as signed counts
* Use Constant::GetSignExtendedValue and Constant::GetZeroExtendedValue
where appropriate
* new tests
spirv-opt: Add --graphics-robust-access
Clamps access chain indices so they are always
in bounds.
Assumes:
- Logical addressing mode
- No runtime-array-descriptor-indexing
- No variable pointers
Adds stub code for clamping coordinate and samples
for OpImageTexelPointer.
Adds SinglePassRunAndFail optimizer test fixture.
Android.mk: add source/opt/graphics_robust_access_pass.cpp
Adds Constant::GetSignExtendedValue, Constant::GetZeroExtendedValue
This fixes#2608.
The original test case had an out-of-bounds reference that ended up
folding into OpCompositeExtract that was indexing right outside the
constant composite.
The returned constant would then cause a segfault during constant
propagation.
Fixes#2764
* Don't replace all uses when simplifying instructions, instead only
update non-debug, non-decoration uses
* added a test
* Add a new version of RAUW that takes a predicate to decide whether to
replace the use or not
* used in simplification pass
* Fix#2609 - Handle out-of-bounds scalar replacements.
When SROA tries to do a replacement for an OpAccessChain that is exactly
one element out of bounds, the code was trying to access its internal
array of replacements and segfaulting.
This protects the code from doing this, and it additionally fixes the
way SROA works by not returning failure when it refuses to do a
replacement. Instead of failing the optimization pass, SROA will now
simply refuse to do the replacement and keep going.
Additionally, this patch fixes the SROA logic to now return a proper status so we can
correctly state that the pass made no changes to the IR if it only found
invalid references.
The recently added fuzzer_replayer and fuzzer_shrinker tests were
rather heavyweight and were leading to CI timeouts. This change
reduces the runtime of those tests by having them do fewer iterations.
Merge return expects unreachable merge block to look a certain way, and
unreachable continue blocks to look a certain way. What if an
unreachable block is both a merge and a continue? The continue is
suppose to take precedent, but merge-return implements it with the merge
taking precedent. This change flips that around.
Fixes#2746
Similar to the existing 'add dead breaks' pass, this adds a pass to
add dead continues to blocks in loops where such a transformation is
viable. Various functionality common to this new pass and 'add dead
breaks' has been factored into 'fuzzer_util', and some small
improvements to 'add dead breaks' that were identified while reviewing
that code again have been applied.
Fixes#2719.
* Process OpDecorateId in ADCE
When there is an OpDecorateId instruction that is live,
the ids that is references must be kept live. This change
adds them to the worklist.
I've also updated a validator check to allow OpDecorateId
to be able to apply to decoration groups.
Fixes#1759.
* Remove dead code.
In merge return, we need to know the original dominator for a block in order to
traverse code from the original dominator to the new dominator and add
appropriate Phi nodes. The current code gets this wrong because the dominator
tree is build as needed. The first time we get the immediate dominator for a
function we just built the dominator tree and it takes into account that a
block has been split. The second time it does not.
This inconsistency needs to be fixed. We do that by recording the original
dominator for all blocks at the start of the pass.
If we were to record just the basic block, that could change if the block is
split. We want to traverse the code in the body of the original dominator,
whatever block it ends up in. To make this easy to track, we not save the
terminator instruction to represent the original dominator.
Fixes#2745