Add name annotations to the generated instrumentation code to
make it easier to understand. Example spirv-cross output:
vec4 _140;
if (0u < inst_bindless_direct_read_4(0u, 0u, 1u, uint(_19)))
{
_140 = texture(textures[nonuniformEXT(_19)], inUV);
}
else
{
inst_bindless_stream_write_4(50u, 1u, uint(_19), 0u);
_140 = vec4(0.0);
}
* Improve algorithm to reorder blocks in a function
In dead branch elimination, blocks can end up in a the wrong order, so
there is code to reorder the blocks in structured order. The problem is
that the algorithm to do that is very poor. It involves many searchs in
the function for the correct position to place the block, as well as
moving many block in the vector.
The solution is to write a specialized function in the function class
that will reorder the blocks in structured order. After computing the
structured order, reordering the block can be done in linear time, with
very little overhead.
Using SinglePassRunAndMatch<> instead of SinglePassRunAndCheck<>
makes tests more concise and makes it possible to use pattern
matching features.
Using Effcee stateful pattern matching to make it less repetitive
to check for generated functions and global variables.
This approach isn't worth
it for DebugPrintf functions because the generated code will change
depending on how many parameters are passed to every debugPrintfEXT()
call.
* spirv-opt: fix copy-propagate-arrays index opti on structs.
As per SPIR-V spec:
OpAccessChain indices must be OpConstant when indexing into a structure.
This optimization tried to remove load cascade. But in some scenario
failed:
```c
cbuffer MyStruct {
uint my_field;
};
uint main(uint index) {
const uint my_array[1] = { my_field };
return my_array[index]
}
```
This is valid as the struct is indexed with a constant index, and then
the array is indexed using a dynamic index.
The optimization would consider the local array to be useless and
generated a load directly into the struct.
* spirv-opt: prevent creation of unused instructions
Copy-propagate-arrays optimization pass would create unused constants,
even if the optimization not completed.
This was caused by the way we handled OpAccessChain squashing: we
only referenced constants, and had to create them upfront.
Fixes#4887
Signed-off-by: Nathan Gauër <brioche@google.com>
Specifically, DebugSourceContinued, DebugCompilationUnit, and
DebugEntryPoint. These instructions are top-level instructions
which do not or may not have a user except for the tool and so
should not be eliminated.
When folding a vector shuffle with an undef literal, it is possible that the
literal is adjusted so that it will then be interpreted as an index into
the input operands. This is fixed by special casing that case, and not
adjusting those operands.
Fixes#4859
An access chain instruction interpretes its index operands as signed.
The composite insert and extract instruction interpret their index
operands as unsigned, so it is not possible to represent a negative
number.
This commit adds a check to the local-access-chain-convert pass to check
for a negative number in the access chain and to not do the conversion.
Fixes#4856
* spirv-as: Avoid overflow when parsing exponents on hex floats
When an exponent is so large that it would overflow the int
type in the parser, saturate the exponent.
This allows extremely large exponents, and saturates
to infinity when the exponent is positive, and zero when the exponent
is negative.
Fixes#4721.
* Avoid unexpected narrowing conversions from arithmetic operations
Co-authored-by: Alastair F. Donaldson <alastair.donaldson@imperial.ac.uk>
Excessive whitespace can lead to stack overflow during parsing as each
character of skipped whitespace involves a recursive call. An
iterative solution avoids this.
Fixes#4729.
This reverts commit d18d0d92e5.
This is reverted because it causes a 7X slowdown when legalizing
SPIR-V with NonSemantic.Shader.DebugInfo.100 instructions.
This is due to the creation of very large UseLists for several
heavily used operands for this extension combined with the fact
that the original commit changed the performance of Uselists to O(n).
Fixes https://crbug.com/oss-fuzz48578
* Adds structural reachability to basic blocks
* calculated in same manner as reachability, but using structural
successors
* Change structured CFG validation to use structural reachability
instead of reachability
* Fix some invalid reducer cases
Arrays do not have to have a size that is known at compile time. It
could be a spec constant. In these cases, treat the array
as if it is arbitrarily long. This commit will treat it like it is an
array of size UINT32_MAX.
Fixes https://crbug.com/oss-fuzz/47397.
If the `instruction` operand in an extended instruction instruction is
too large, it causes undefined behaviour when that value is cast to the
enum for the corresponding set. This is done with the
NonSemanticDebug100 instruction set. We need to avoid the undefined
behaviour.
Fixes#4727
Which functions are processed is determined by which ones are on the
call tree from the entry points before dead code is removed. So it is
possible that a function is process because it is called from an entry
point, but the CFG is not cleaned up because the call to the function
was removed.
The fix is to process and cleanup every function in the module. Since
all of the dead functions would have already been removed in an earlier
step of DCE, it should not make a different in compile time.
Fixes#4731
* Structural dominance introduced in SPIR-V 1.6 rev2
* Changes the structured cfg validation to use structural dominance
* structural dominance is based on a cfg where merge and continue
declarations are counted as graph edges
* Basic blocks now track structural predecessors and structural
successors
* Add validation for entry into a loop
* Fixed an issue with inlining a single block loop
* The continue target needs to be moved to the latch block
* Simplify the calculation of structured exits
* no longer requires block depth
* Update many invalid tests
When the `SPV_BINARY_TO_TEXT_OPTION_PRINT` option is specified, `spvtext` will not be created (see 37d2396cab/source/disassemble.cpp (L117)), and the attempt to dereference into its members (`spvtext->str` and `spvtext->length`) results in segmentation fault.
This is fixed by first checking if `spvtext` is nulll.
Co-authored-by: Steven Perron <stevenperron@google.com>
We currently build the structured order for all nodes reachable from the
loop header when unrolling a loop. However, unrolling only needs the
nodes in the loop and possibly the merge node.
To avoid needless computation, I have implemented a search that will
stop at the merge node.
Fixes#4827
The code in `CFG::SplitLoopHeader` assumes the loop header is not the
latch. This leads to it not being able to find the latch block. This
has been fixed, and a test added.
Fixes#4527
If the predecessor blocks are the same, then there is only 1 value for the
OpPhi. The simplition pass will simplify it, and it causes problems for
if-conversion. In these cases, if-conversion can just punt.
Fixes#3554.
An access chain could have a constant index that is an out of bounds
access. This is valid spir-v, even if it can cause problems at runtime.
However, it is not valid to have an OpCompositeExtract with an out of
bounds access. This means we have to stop local-access-chain-convert
from making that change.
Fixes#4605
A std::set is used instead of std::vector, where the elements are
ordered by member index first. Decorations for fields are now looked up
by going over the range of decorations for the member only, instead of
the whole set.
In an ANGLE test that generates a struct with 4096 members, validation
goes down from ~140ms to ~90ms. On debug builds, the difference is more
pronounced, going down from ~2.5s to ~600ms.
This change adds a folding rule which transforms x * y - a and a - x * y
into FMA(x, y, -a) and FMA(-x, y, a), respectively.
While the SPIR-V instruction count remains the same, target instruction
sets typically feature FMA instruction variants that can negate an
operand. Also this transformation may unlock further optimizations which
eliminate the negation.
(Google bug: b/226145988)