If the body of the module does not have any ids change, compact ids will
not change the id bound. This can cause problems because the id bound
could be much higher than the largest id in that is used. It should be
reset any time it is not the larger id used + 1.
Fixes#4604
When folding a vector shuffle feeding a vector shuffle, we do not
propagate an 0xFFFFFFFF, which has a special meaning, correctly. We
adjust the value making it lose it meaning as an undefined value.
Fixes#4581
CCP does not want to fold an instruction unless it folds to a constant.
There is an asser to check for this. The question if a spec constant
counts as a constant. The constant folder considers a spec constant a
constand, but CCP does not. I've fixed the assert in CCP to match what
the folder does. It should not require any new changes to CCP.
Swift shader needs a way to inline all functions, even those marked as
DontInline. See https://github.com/KhronosGroup/SPIRV-Tools/pull/4471.
This implements the suggestion I made in the PR. We add a pass that
will remove the DontInline function control, so that the inlining passes
will inline them.
SwiftShader will still have to modify their code to add this pass before
the other passes are run.
The function `BuildInvalideAnalyses` will be rebuilt for every analysis that
has been requested, but it is not necessary. It also can cause problems
because if the CFG needs to be rebuilt, so do the dominator trees.
This change will make the functionality match the description of the
function.
* Optimize DefUseManager allocations
Saves around 30-35% of compilation time.
For inst->use_ids, use a pool linked list instead of allocating vectors for every instruction. For inst->uses, use a "PooledLinkedList"' -- a linked list that has shared storage for all nodes. Neither re-use nodes, instead we do a bulk compaction operation when too much memory is being wasted (tuneable).
Includes separate PooledLinkedList templated datastructure, a very special case construct, but split out to make the code a little easier to understand.
Incrementally compute the hash instead of collecting words
Avoids allocating temporary space in a std::vector and std::u32string, and making three passes over all the hashed data.
Switch to using std::vector to prevent processing duplicates instead of std::unordered_set: avoids an allocation/deletion every call to ComputeHashValue, and ends up faster due to much better cache behaviour and smaller constant-factor when searching the (generally very small) list.
In my test case, made Type::HashValue go from 7.5% of compilation time to .5%
In newer versions of protobuf the Status building code has been made
internal, so that embedders cannot build their own instances like is
being done here.
Changing this code to just use the .ok() method on the status object,
since if the status is OK or not is what is actually being tested.
This will make it easier in the future to update external/protobuf.
* Update test for parsing memory access masks
Needed to support SPV_INTEL_memory_access_aliasing extension
There is a negative test that checks unused mask bits.
Some of those bits are now sued by the new Intel extension.
* Update deps for new SPIRV-Headers
Previously, array sizes were presumed to be OpConstant, which is not
necessarily true. This change ensures OpSpecConstant array sizes as
matched exactly, instead of taken as OpConstant and matched by value.
* Reimplement LCS used by spirv-diff
Two improvements are made to the LCS algorithm:
- The LCS algorithm is reimplemented to use a std::stack instead of
being recursive. This prevents stack overflow in the LCSTest.Large
test.
- The LCS algorithm uses an NxM table. Previously, entries of this
table were {size_t, bool, bool}, which is now packed in 32 bits. The
first entry can assume a maximum value of min(N, M), which
realistically for SPIR-V diff will not be larger than 1 billion
instructions. This reduces memory usage of LCS by 75%.
This partially reverts 845f3efb8a and
enables LCS tests.
* Stabilize the output of spirv-diff
std::map is used instead of std::unordered_map to ensure the output of
spirv-diff is identical everywhere.
This partially reverts 845f3efb8a and
enables spirv-diff tests.
Scalar replacement generates a null when there value for a member will
not be used. The null is used to make sure things are
deterministic in case there is an error.
However, some type cannot be null, so we will change that to use undef.
To keep the code simpler we will always use the undef.
Fixes#3996
spirv-diff is a new tool that produces diff-style output comparing two
SPIR-V modules. The instructions between the src and dst modules are
matched as best as the tool can, and output is produced (in src
id-space) that shows which instructions are removed in src, added in dst
or modified between them. The order of instructions are not retained.
Matching instructions between two SPIR-V modules is not trivial, and
thus a number of heuristics are applied in this tool. In particular,
without debug information, it's hard to match functions as they can be
reordered. As such, this tool is primarily useful to produce the diff
of two SPIR-V modules derived from the same source.
This tool can be useful in a number of scenarios:
- Compare the SPIR-V before and after modifying a shader
- Compare the SPIR-V produced from a shader before and after compiler
codegen changes.
- Compare the SPIR-V produced from a shader before and after some
transformation or optimization.
- Compare the SPIR-V produced from a shader with different compilers.
The handling of the RayQueryKHR type is not complete in the type
manager. The tests were not picking this up. I've added a test to make
sure that the `GenerateAllTypes` function actually does generate all of
the types. Once it is added there other tests should pick up on the
other parts that were missing.
Add a pass to spread Volatile semantics to variables with SMIDNV,
WarpIDNV, SubgroupSize, SubgroupLocalInvocationId, SubgroupEqMask,
SubgroupGeMask, SubgroupGtMask, SubgroupLeMask, or SubgroupLtMask BuiltIn
decorations or OpLoad for them when the shader model is the ray
generation, closest hit, miss, intersection, or callable shaders. This
pass can be used for VUID-StandaloneSpirv-VulkanMemoryModel-04678 and
VUID-StandaloneSpirv-VulkanMemoryModel-04679 (See "Standalone SPIR-V
Validation" section of Vulkan spec "Appendix A: Vulkan Environment for
SPIR-V").
Handle variables used by multiple entry points:
1. Update error check to make it working regardless of the order of
entry points.
2. For a variable, if it is used by two entry points E1 and E2 and
it needs the Volatile semantics for E1 while it does not for E2
- If VulkanMemoryModel capability is enabled, which means we have to
set memory operation of load instructions for the variable, we
update load instructions in E1, but do not update the ones in E2.
- If VulkanMemoryModel capability is disabled, which means we have
to add Volatile decoration for the variable, we report an error
because E1 needs to add Volatile decoration for the variable while
E2 does not.
For the simplicity of the implementation, we assume that all functions
other than entry point functions are inlined.
* linker: Address conversion error introduced in earlier rework
Also rework the code slightly to first compute the new ID bound and
validate it, and only then, offset all existing IDs.
This makes those two steps clearer, and avoids potentially overflowing
the IDs within a module (even if that would have been caught later on).
Fixes: c3849565 ("Linker improvements (#4679)")
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/4684
* test/linker: Disable IdsLimit tests for now
They are taking too long to run and results in the CI jobs timing out.
* test/linker: Code factorisation and small tweaks
* linker: Do not fail when going over limits
The limits are minima and implementations or APIs might support higher
limits, so just warn the user about it. And only check for the limits
right before emitting the binary, as limits might change earlier when
removing duplicate instructions, function prototypes, etc.
The only check performed right before merging, is making sure the ID
bound will not overflow the 32 bits following the merge.
Also, use the defines for the limits instead of hard-coding them.
* linker: Require a memory model in each input module
The existing code could run into weird situation. For example, if the
first module had no memory model, it would not emit any memory model
(sort of reasonable) and would accept without complains all possible mix
from later modules as it would not verify them.
* linker: Replace hex version with SPV_SPIRV_VERSION_WORD
* linker: Error out when linking together different versions
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/4135
* tools/linker: Do not write to disk if linking failed
Also, do not consider warnings as errors.
* tools/linker: Fix formatting in help message
* tools/linker: Further clarify the use of --target-env
Also update the text for the default version to reflect the change made
in 7d768812 ("Basic support for SPIR-V 1.6 (#4663)").
The test harness for the opt fuzzer was failing to consider that the
input might use a very large id bound, despite no id approaching this
bound actually being used.
This change modifies the test harness to use the module's id bound,
rather than looking through the module for large ids.
Fixes: oss-fuzz:42386
For a shader input/output interface variable of structure type:
If the structure has BuiltIn members, then the structure type
must be Block decorated.
Otherwise, the variable, or the struct members must have locations,
but nothing can have both a location be a BuiltIn.
Implements validation needed to reject the example in
https://github.com/KhronosGroup/SPIRV-Registry/issues/134
* Basic support for SPIR-V 1.6
* Update SPIRV-Headers deps
* Add new environment enum for SPIR-V 1.6
* Make default environment 1.6 for most tools
* Update tests
* Disallow conditional branch with duplicate labels
* Disallow Dim=Buffer with sampled images
* Do not require the non-semantic extension after SPIR-V 1.5
The pass to remove the nonsemantic information and instructions
is used for drivers or tools that may not support them. Debug
information was only partially handle, which is causing a
problem. We need to either fully remove debug information or
not remove it all. Since I can see it being useful to keep the
debug information even when the nonsemantic instructions are
removed, I propose we do not remove debug info.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/4269
In https://github.com/KhronosGroup/SPIRV-Tools/pull/3110, the strip reflect
pass was changed to also remove all explicitly nonsemantic instructions. This
makes it so that the name of the pass no longer reflects what the pass actually
does. This change renames the pass so that it reflects what the pass actaully does.
Use a very large id bound when fuzzing the optimizer, and check that the
input does not ids that are too close to this bound. This should make it
impossible in practice for an id overflow to occur.
Fixes#4657.
* Fix endianness of string literals
To get correct and consistent encoding and decoding of string literals
on big-endian platforms, use spvtools::utils::MakeString and MakeVector
(or wrapper functions) consistently for handling string literals.
- add variant of MakeVector that encodes a string literal into an
existing vector of words
- add variants of MakeString
- add a wrapper spvDecodeLiteralStringOperand in source/
- fix wrapper Operand::AsString to use MakeString (source/opt)
- remove Operand::AsCString as broken and unused
- add a variant of GetOperandAs for string literals (source/val)
... and apply those wrappers throughout the code.
Fixes #149
* Extend round trip test for StringLiterals to flip word order
In the encoding/decoding roundtrip tests for string literals, include
a case that flips byte order in words after encoding and then checks for
successful decoding. That is, on a little-endian host flip to big-endian
byte order and then decode, and vice versa.
* BinaryParseTest.InstructionWithStringOperand: also flip byte order
Test binary parsing of string operands both with the host's and with the
reversed byte order.
This prevents CCP from making constant -> constant transitions when
evaluating instruction values. In this case, FClamp is evaluated twice.
On the first evaluation, if computes FClamp(0.5, 0.5, -1) which returns
-1. On the second evaluation, it computes FClamp(0.5, 0.5, VARYING)
which returns 0.5.
Both fold() computations are correct given the semantics of FClamp() but
this causes a lateral transition in the constant lattice which was not
being considered VARYING by CCP.
Along with OpDecorate, also clone the OpDecorateString instructions for
variables created in the descriptor scalar replacement pass.
Fixesmicrosoft/DirectXShaderCompiler#3705
* https://github.com/KhronosGroup/Vulkan-Docs/issues/666 clearly
specified that interfaces do not require an input if there is an
associated output
* ADCE can now remove unused input variables (though they are kept if
the preserve interfaces option is used)
Fixes#4469
* Checks that decorations only usable with structure members are not
used by OpDecorate or OpDecorateId
* Checks that decorations not allowed on structure members are not used
with OpMemberDecorate
* Checks decoration targets for most core decorations
* Performs some Vulkan specific validation on deorations
* Add wasm build
* Run wasm ci on push
* Add copyright notice to wasm files
* [wasm] Update Emscripten
* [wasm] Change global lambda to regular function
* [wasm] Show detected core count during build
* [wasm] Set JS version from CHANGES, GITHUB_RUN_ID
Also remove custom docker emscripten build with brotli, as not used
* [wasm] Change github actions to use npm-publish
* [wasm] Us docker-compose up for CI
* [wasm] pass GITHUB_RUN_ID to docker
* [wasm] Change GITHUB_RUN_ID to GITHUB_RUN_NUMBER
* [wasm] Fix GITHUB_RUN_NUMBER in docker-compose.yml
If the ids overflow when creating an integer constant in the ir_builder, there will be a nullptr dereference. This is happening from inside merge return.
We need to propagate the error up, and make sure it is handled appropriately.
Consider the new test case. The conditional branch in the continue
block is never marked as live. However, `IsDead` will say it is not
dead, so it does not get deleted. Because it was never marked as live,
`%false` was not mark as live either, but it gets deleted. This results
in invalid code.
To fix this properly, we had to reconsider how branches are handle. We
make the following changes:
1) Terminator instructions that are not branch or OpUnreachable must be
kept, so they are marked as live when initializing the worklist.
2) Branches and OpUnreachable instructions are marked as live if
a) the block does not have a merge instruction and another instruction
in the block is marked as live, or
b) the merge instruction in the same block is marked as live.
3) Any instruction that is not marked as live is removed.
4) If a terminator is to be removed, an OpUnreachable is added. This
happens when the entire block is dead, and the block will be removed.
The OpUnreachable is generated to make sure the block still has a
terminator, and is valid.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/4509.
When checking the OpBranchConditional for selection headers,
we intend to register both the true and false targets.
Short circuiting was getting in the way.
Co-authored-by: Steven Perron <stevenperron@google.com>
Spirv-opt has not had to handle module with function declarations. This
lead many passes to assume that every function has a body. This is not
always true. This commit will modify a number of passes to handle
function declarations.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/4443
When setting default value for spec constants, for numeric bit types smaller
than 32 bits, follow the SPIR-V rules for narrow literals:
- signed integers are sign-extended
- otherwise, upper bits are zero.
Followup to #4588
* test: add a test to show 8/16-bit
* opt/spec_constants: fix bit pattern width checks.
The input bit patterns are always at least 32-bits, so let the test
pass for 8/16-bit values as well. This shouldn't have any effect on the
64-bit patterns I assume this was introduced for.
The OSS-Fuzz i386 build has been failing due to errors about
64-to-32-bit conversions, relating to random generation code. This
changre fixes the problem by explicitly using a 64-bit random generator,
and by adding a cast to size_t to avoid an implicit conversion.
Do this if Constant or DefUse managers are invalid. Using the
ConstantManager attempts to regenerate the DefUseManager
which is not valid during inlining.
* Account for strided components in arrays
Fixes#4567
* If the element type of an array takes less than a location in size,
calculate the location usage in a strided manner
* formatting
According to spec this opcode is a constant instruction - that's it
can appear outside of function bodies.
Co-authored-by: DmitryBushev <dmitry.bushev@intel.com>
Instead calculate a hash based on the input and use that as a seed
into random data generation for the target env.
Also fixes issue where input data was not actually being fed into
one fuzzer.
Fixes#4450
* Don't eliminate dead members from StructuredBuffer as layout(offset) qualifiers cannot be applied to structure fields.
* Traverse arrays when marking structs as fully used.
Co-authored-by: Steven Perron <stevenperron@google.com>
Debug[No]Line are tracked and optimized using the same mechanism that tracks
and optimizes Op[No]Line.
Also:
- Fix missing DebugScope at top of block.
- Allow scalar replacement of access chain in DebugDeclare
Pending a more general solution for constructing a target environment
based on the bytes of a test input, this change avoids a UBSan error
caused by the existing approach.
Fixes https://crbug.com/38087
* Fix extract with out-of-bounds index
When folding a OpCompositeExtract that is fed by an
OpCompositeConstruct, we handle and out of bounds
index, but only in the case where the result of the
OpCompostiteConstruct is a struct. This change
refactors that folding rule and then improves it to
handle an out-of-bounds access when the result of the
OpCompositeConstruct is a vector.
Includes:
- Shift to use of spirv-header extinst.nonsemantic.shader grammar.json
- Remove extinst.nonsemantic.vulkan.debuginfo.100.grammar.json
- Enable all optimizations for Shader.DebugInfo
Also fixes scalar replacement to only insert DebugValue after all
OpVariables. This is not necessary for OpenCL.DebugInfo, but it is
for Shader.DebugInfo.
Likewise, fixes Private-to-Local to insert DebugDeclare after all
OpVariables.
Also fixes inlining to handle FunctionDefinition which can show up
after first block if early return processing happens.
Co-authored-by: baldurk <baldurk@baldurk.org>
Makes the fuzzer pass and transformation that wraps vector synonyms
aware of the fact that integer operations can have arguments that
differ in signedness, and that the result type of such an operation
can have different sign from the argument types.
Fixes#4413.
In SPIR-V, integers use 2s complement representation, so that signed
integer overflow and underflow is well defined. However, the constant
folder was causing overflow / underflow at the C++ level. This change
avoids such overflows by performing constant folding for IAdd, ISub and
IMul in the context of unsigned values, which works because signedness
is irrelevant according to the SPIR-V semantics for these instructions.
Fixes#4510.
* Fix infinite loop in validation
Fixes https://crbug.com/38548
* Fixes an issue in structured exit checking where an invalid merge
could result in an infinite traversal
* formatting
A test has been removed which depends on casting to spv_target_env from a value
outside the range of that enum. This is an undefined behaviour, thus the
test is invalid.
It is possible that other optimization will propagate
a value into an OpCompositeExtract or OpVectorShuffle
instruction that is larger than the vector size.
Vector DCE has to be able to handle it.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/4513.
- The binary exponent must have some decimal digits
- A + or - after the binary exponent digits should not be interpreted as
part of the binary exponent.
Fixes: #4500
With OSS-Fuzz, the build system should not directly set options such as
-fsanitize=fuzzer. Instead, these are set by OSS-Fuzz, and
linker options are provided via the LIB_FUZZER_OPTIONS environment
variable. This change allows the fuzzers to be build stand-alone,
outside of OSS-Fuzz, in the way that was already supported, as well as
inside OSS-Fuzz, when the LIB_FUZZER_OPTIONS environment variable is
set.
ADCE does not handle exported functions. This was an explicit decision
because we did not believe that the linkage attribute could be used in
shaders, but it can now. This change has been made.
While fixing this error, I noticed that the OpName for labels is
sometimes removed because the label instructions are not marked
explicitly marked as live. This has able been fixed.
Currently, handles promotion of divergence due to reconvergence rules, but doesn't handle "late merges" caused by a later-than-necessary declared merge block.
Co-authored-by: Jakub Kuderski <kubak@google.com>
convert-to-sampled-image pass converts images and/or samplers with
given pairs of descriptor set and binding to sampled image.
If a pair of an image and a sampler have the same pair of descriptor
set and binding that is one of the given pairs, they will be
converted to a sampled image. In addition, if only an image has the
descriptor set and binding that is one of the given pairs, it will
be converted to a sampled image as well.
For example, when we have
%a = OpLoad %type_2d_image %texture
%b = OpLoad %type_sampler %sampler
%combined = OpSampledImage %type_sampled_image %a %b
%value = OpImageSampleExplicitLod %v4float %combined ...
1. If %texture and %sampler have the same descriptor set and binding
%combine_texture_and_sampler = OpVaraible %ptr_type_sampled_image_Uniform
...
%combined = OpLoad %type_sampled_image %combine_texture_and_sampler
%value = OpImageSampleExplicitLod %v4float %combined ...
2. If %texture and %sampler have different pairs of descriptor set and binding
%a = OpLoad %type_sampled_image %texture
%extracted_image = OpImage %type_2d_image %a
%b = OpLoad %type_sampler %sampler
%combined = OpSampledImage %type_sampled_image %extracted_image %b
%value = OpImageSampleExplicitLod %v4float %combined ...
* Disallow loading a runtime-sized array
Fixes#4472
* Disallow loading a runtime-sized array or a composite containing one
* Refactor type traversal into a separate function used by both runtime
array checks and sized int/float checks
* Update invalid tests
This PR adds a generic dataflow analysis framework to SPIRV-opt, with the intent of being used in SPIRV-lint. This may also be useful for SPIRV-opt, as existing ad-hoc analyses can be rewritten to use a common framework, but this is not the target of this PR.
This PR adds a new executable spirv-lint with a simple "Hello, world!"
program, along with its associated library and a dummy unit test.
For now, only adds to CMake and Bazel; other build systems will be added
in a future PR.
Issue: #3196
Testing on ClusterFuzz has revealed that the fuzzer sometimes goes
wrong when a shader is very simple - e.g., there have been bugs where
a fuzzer pass has assumed that at least one basic type exists in the
module. This change adds an almost empty SPIR-V example to the shaders
used for testing, to help catch such cases locally.
spirv-fuzz features transformations that should be applicable by
construction. Assertions are used to detect when such transformations
turn out to be inapplicable. Failures of such assertions indicate bugs
in the fuzzer. However, when using the fuzzer at scale (e.g. in
ClusterFuzz) reports of these assertion failures create noise, and
cause the fuzzer to exit early. This change adds an option whereby
inapplicable transformations can be ignored. This reduces noise and
allows fuzzing to continue even when a transformation that should be
applicable but is not has been erroneously created.
Control dependence analysis constructs a control dependence graph,
representing the conditions for a block's execution relative to the
results of other blocks with conditional branches, etc.
This is an analysis pass that will be useful for the linter and
potentially also useful in opt. Currently it is unused except for the
added unit tests.
Adaps the transformations that add OpConstantUndef and OpConstantNull
to a module so that pointer undefs are not allowed, and null pointers
are only allowed if suitable capabilities are present.
Fixes#4357.
Adds a new transformation that rewrites a scalar operation (like
OpFAdd, opISub) as an equivalent vector operation, adding a synonym
between the scalar result and an appropriate component of the vector
result.
Fixes#4195.
This change is responsible for avoiding the replacement of constant
operands with another one not constant, in the context of atomic
operations. The related rule from the SPIR-V spec is: "All used for
Scope and Memory Semantics in shader capability must be of an
OpConstant."
Fixes#4346.
This change allows spriv-reduce to get rid of a selection, switch or
loop construct if none of the instructions defined in the construct
are used outside the construct.
The new pass will removed interface variable on the OpEntryPoint instruction when they are not statically referenced in the call tree of the entry point.
It can be enabled on the command line using the options `remove-unused-interface-variables`.
This change allows the reducer to merge together blocks even when they
are unreachable, but keeps the restriction of reachability in place
for the optimizer.
Fixes#4302.
* Initial support for SPV_KHR_integer_dot_product
- Adds new operand types for packed-vector-format
- Moves ray tracing enums to the end
- PackedVectorFormat is a new optional operand type, so it requires
special handling in grammar table generation.
- Add SPV_KHR_integer_dot_product to optimizer whitelists.
- Pass-through validation: valid cases pass validation
Validation errors are not checked.
- Update SPIRV-Headers
Patch by David Neto <dneto@google.com>
Rebase and minor tweaks by Kevin Petit <kevin.petit@arm.com>
Signed-off-by: David Neto <dneto@google.com>
Signed-off-by: Kevin Petit <kevin.petit@arm.com>
Change-Id: Icb41741cb7f0f1063e5541ce25e5ba6c02266d2c
* format fixes
Change-Id: I35c82ec27bded3d1b62373fa6daec3ffd91105a3
Fixes#4170, by checking the signedness of bitwise operands in
TransformationAddBitInstructionSynonym, to avoid an "Expected Base
Type to be equal to Result Type" validation error.
PR #4118 (d71ac38b8e) let spirv-val report a validation error when we
use offset for an OpImage* instruction instead of ConstantOffset. Since
some compilers like DXC rely on spirv-opt for function inlining or loop
unrolling, the spirv-val change broke some working shaders when the
shader developers disable the optimization (spirv-opt).
For example, DXC recently got this issue from a few users e.g.,
https://github.com/microsoft/DirectXShaderCompiler/issues/3807
Since this error is reported only when the spirv-opt is disabled, it
looks like the exact case that we have to skip spirv-val when
`--before-legalize-hlsl` is given. Moreover, avoiding the error using
`--before-legalize-hlsl` on DXC is exactly what FXC and DXC's DXIL
do (they do not report the error if the offset becomes a constant after
function inlining or loop unrolling).
This change prevents TransformationOutlineFunction from outlining a
region of blocks if some block in the region has an unreachable
predecessor. This avoids a bug whereby the region would be outlined,
and the unreachable predecessors would be left behind, referring to
blocks that are no longer in the function.
The def-use manager was being incorrectly updated in
TransformationPermutePhiOperands, and this was causing future
transformations to go wrong during fuzzing. This change updates the
def-use manager in a correct manner, and adds a test exposing the
previous bug.
Fixes#4300.
The Transformation class tests did not cover the (trivial) ToMessage
methods of each transformation, nor the constructors that take a
protobuf message. This lac of coverage makes it hard to see which more
interesting pieces of code are not covered when looking at coverage
percentages. This change adapts the helper function for applying a
transformation and checking fresh ids so that it turns a
transformation into a protobuf message and back, thus covering
ToMessage and the protobuf constructor for every transformation. The
runtime overhead of doing this is very small.
Fixes https://crbug.com/tint/793
* When a loop has an empty loop construct, the loop construct and
continue construct share the same header so don't disallow the loop
header for the continue construct
Fix dangling phi bug from loop-unroll
When unrolling the following loop:
```
%const0 = OpConstant ...
%const1 = OpConstant ...
...
%LoopHeader = OpLabel
%phi0 = OpPhi %float %const0 %PreHeader %phi1 %Latch
%phi1 = OpPhi %float %const1 %PreHeader %x %Latch
...
%LoopBody = OpLabel
%x = OpFSub %float %phi1 %phi0
...
```
the loop-unroll pass sets the value of `%phi0` as `%phi1` for the second
copy of the loop body. For example, the second copy of
`%x = OpFSub %float %phi1 %phi0` will be
`%y = OpFSub %float %x %phi1`.
Since all phi instructions for inductions will are removed after the
loop unrolling, `%phi1` will be a dead dangling phi.
It happens only for the phi values of the first loop iteration. Replacing those
dangling phis with their initial values fixes this issue.
For example, the second copy of `%x = OpFSub %float %phi1 %phi0` should be
`%y = OpFSub %float %x %const1` because the value of `%phi1` from the
first loop iteration is `%const1`.
This pass converts an internal form of GLSLstd450 Interpolate ops
to the externally valid form. The external form takes the lvalue
of the interpolant. The internal form can do a load of the interpolant.
The pass replaces the load with its pointer. The internal form is
generated by glslang and possibly other frontends for HLSL shaders.
The new pass is called as part of HLSL legalization after all
propagation is complete.
Also adds internal interpolate form to pre-legalization validation
To help ensure that optimizations that do less cautious invalidation
of analyses are implemented correctly, this change adds checks to the
tests of various transformations to ensure that analyses such as
def-use are up to date.
This allows the GPU-AV layer to differentiate between errors with
uniform buffers versus storage buffers and map these to the relevant
VUIDs.
This is a resubmit of a previously reverted commit. The revert was
done as someone erroneously attempted to build the latest validation
layers with a TOT spirv-tools. The validation layers must be built with
their known-good glslang and its known-good spirv-tools and spirv-headers.
* Mark module as modified if convert-to-half removes decorations.
If the convert-to-half pass does not change the body of the function,
but removes decorations, it returns that nothing changed. This is
incorrect, and will be fixed.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/4117
* Update comment for RemoveDecorationsFrom
The existing spirv-opt `DebugInfoManager::AddDebugValueForDecl()` sets
the scope and line info of the new added DebugValue using the scope and
line of DebugDeclare. This is wrong because only a single DebugDeclare
must exist under a scope while we have to add DebugValue for all the
places where the variable's value is updated. Therefore, we have to set
the scope and line of DebugValue based on the places of the variable
updates.
This bug makes
https://github.com/google/amber/blob/main/tests/cases/debugger_hlsl_shadowed_vars.amber
fail. This commit fixes the bug.
* Validate SPV_KHR_workgroup_memory_explicit_layout
* Check if SPIR-V is at least 1.4 to use the extension.
* Check if either only Workgroup Blocks or only Workgroup non-Blocks
are used.
* Check that if more than one Workgroup Block is used, variables are
decorated with Aliased.
* Check layout decorations for Workgroup Blocks.
* Implicitly use main capability if the ...8BitAccess or
...16BitAccess are used.
* Allow 8-bit and 16-bit types when ...8BitAccess and ...16BitAccess
are used respectively.
* Update SPIRV-Headers dependency
Bump it to include SPV_KHR_workgroup_memory_explicit_layout.
* Add option to validate Workgroup blocks with scalar layout
Validate the equivalent of scalarBlockLayout for Workgroup storage
class Block variables from SPV_KHR_workgroup_memory_explicit_layout.
Add option to the API and command line tool.
Propagating the OpLine/OpNoLine to preserve the debug information
through transformations results in integrity check failures because of
the extra line instructions. This commit lets spirv-opt skip the
integrity check when the code contains OpLine or OpNoLine.
When there is an array of strutured buffers, desc sroa will only split
the array, but not a struct type in the structured buffer. However,
the calcualtion of the number of binding a struct requires does not take
this into consideration. This commit will fix that.
Avoid generating OpPhi on void types, and allow the transformation to
take place on regions that produce pointer and sampled image result
ids if such ids are not used after the region.
Fixes#3787.
* validation: tighter validation of multisampled images
- if MS=1, then Sample image operand is required
- Sampling operations are not permitted when MS=1
Fixes#4057, #4058
* Fail early for multisampled image for sampling, dref, gather
* validate StorageImageMultisampled capability
The StorageImageMultisampled capability is required when declaring
an image type with with Multisampled==1 and it's a storage image (Sampled == 2)
Fixes#4061
* Allow SubpassData with Multisampled and Sampled==2
The eliminate dead member pass is written assuming that the index to an
OpAccessChain will be a 32-bit integer or 64-bit integer. That is
changed to work for any width 64-bits or less.
Fixes https://crbug.com/1151727
* Add validation for ray tracing builtins
- Remove existing InstanceId testing that was combined with VertexId in awkward ways.
- Rather than adding a new set of functions for each ray tracing builtin, add
an error table that maps the builtin ID to the 3 common VUIDs for each builtin
(I could see this being extended for other builtins in the future).
- add F32 matrix validation function
- augment existing PrimitiveId validation to verify Input storage class for the
RT stages this is accepted in, and correct the list of stages that it is actually
accepted in (only Intersection / Any Hit / Closest Hit)
* add testing for ray tracing builtins
- remove exising InstanceId testing as it was tangled in with VertexId in now weird ways
and combine it with the new tests
- add testing for ray tracing builtins
- builtins accepted in the same stages and of the same types are combined into test functions
- add some new matrix types to the code generator so they can be used for testing
This instruments ImageRead, ImageWrite and ImageFetch when applied to
texel buffers.
Also add new (but not yet generated) buffer OOB error codes differentiated
for VUID classification.
* BuildModule: optionally avoid adding new OpLine instructions
Fixes#4029 for my use case
* Fix formatting
* Create last_line_inst_ only if doing extra line tracking
* Update to final ray tracing extensions
Drop Provisional from ray tracing enums
sed -ie 's/RayQueryProvisionalKHR/RayQueryKHR/g' **/*
sed -ie 's/RayTracingProvisionalKHR/RayTracingKHR/g' **/*
Add terminator support for SpvOpIgnoreIntersectionKHR and SpvOpTerminateRayKHR
Update deps for SPIRV-Headers
* Update capability dependencies for MeshShadingNV
Accommodate https://github.com/KhronosGroup/SPIRV-Headers/pull/180
MeshShadingNV: enables PrimitiveId, Layer, and ViewportIndex
Co-authored-by: Daniel Koch <dkoch@nvidia.com>
Fix buffer oob instrumentation for matrix refs.
Matrix stride decoration is not on matrix type but is a member decoration
on the enclosing struct type. Also correctly apply matrix stride depending
on row or column major.
spirv-opt has a bug that `DebugInfoManager::AddDebugValueWithIndex()` does not
preserve `Indexes` operands of
[DebugValue](https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.DebugInfo.100.html#DebugValue).
It has to preserve all of those `Indexes` operands, but it preserves only the first index
operand.
This PR removes `DebugInfoManager::AddDebugValueWithIndex()` and lets the spirv-opt
use `DebugInfoManager::AddDebugValueForDecl()`.
`DebugInfoManager::AddDebugValueForDecl()` preserves the Indexes operand correctly.
in 3943 the new tests should have failed, but didn't due to the AnyVUID
check not handling case with trailing whitespace. Added trimming to handle
it as easily might happen again in future
The front-end language compiler would simply emit DebugDeclare for
a variable when it is declared, which is effective through the variable's
scope. Since DebugDeclare only maps an OpVariable to a local variable,
the information can be removed when an optimization pass uses the
loaded value of the variable. DebugValue can be used to specify the
value of a variable. For each value update or phi instruction of a variable,
we can add DebugValue to help debugger inspect the variable at any
point of the program execution.
For example,
float a = 3;
... (complicated cfg) ...
foo(a); // <-- variable inspection: debugger can find DebugValue of `float a` in the nearest dominant
For the code with complicated CFG e.g., for-loop, if-statement, we
need help of ssa-rewrite to analyze the effective value of each variable
in each basic block.
If the value update of the variable happens only once and it dominates
all its uses, local-single-store-elim pass conducts the same value update
with ssa-rewrite and we have to let it add DebugValue for the value assignment.
One main issue is that we have to add DebugValue only when the value
update of a variable is visible to DebugDeclare. For example,
```
{ // scope1
%stack = OpVariable %ptr_int %int_3
{ // scope2
DebugDeclare %foo %stack <-- local variable "foo" in high-level language source code is declared as OpVariable "%stack"
// add DebugValue "foo = 3"
...
Store %stack %int_7 <-- foo = 7, add DebugValue "foo = 7"
...
// debugger can inspect the value of "foo"
}
Store %stack %int_11 <-- out of "scope2" i.e., scope of "foo". DO NOT add DebugValue "foo = 11"
}
```
However, the initalization of a variable is an exception.
For example, an argument passing of an inlined function must be done out of
the function's scope, but we must add a DebugValue for it.
```
// in HLSL
bar(float arg) { ... }
...
float foo = 3;
bar(foo);
// in SPIR-V
%arg = OpVariable
OpStore %arg %foo <-- Argument passing. Out of "float arg" scope, but we must add DebugValue for "float arg"
... body of function bar(float arg) ...
```
This PR handles the except case in local-single-store-elim pass. It adds
DebugValue for a store that is considered as an initialization.
The same exception handling code for ssa-rewrite is done by this commit: df4198e50e.
This fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/3873.
In the presence of variable pointers, the reaching definition may be
another pointer. For example, the following fragment:
%2 = OpVariable %_ptr_Input_float Input
%11 = OpVariable %_ptr_Function__ptr_Input_float Function
OpStore %11 %2
%12 = OpLoad %_ptr_Input_float %11
%13 = OpLoad %float %12
corresponds to the pseudo-code:
layout(location = 0) in flat float *%2
float %13;
float *%12;
float **%11;
*%11 = %2;
%12 = *%11;
%13 = *%12;
which ultimately, should correspond to:
%13 = *%2;
During rewriting, the pointer %12 is found to be replaceable by %2.
However, when processing the load %13 = *%12, the type of %12's reaching
definition is another float pointer (%2), instead of a float value.
When this happens, we need to continue looking up the reaching definition
chain until we get to a float value or a non-target var (i.e. a variable
that cannot be SSA replaced, like %2 in this case since it is a function
argument).
This PR fixes a bug related to the transformation applicability.
When the OpNot case was implemented, its opcode was not
added to the list of supported bit instructions in IsApplicable.
So, the changes made are the following.
- Add OpNot to the list of supported bit instructions.
- Update the tests.
This commit add support for optimizer to not inline functions with DontInline control flag, so that the [noinline] attribute in HLSL will be useful in DXC SPIR-V generation.
This is part of work of github.com/microsoft/DirectXShaderCompiler/issues/3158
Based on the OpLine spec, an OpLine instruction must be applied to
the instructions physically following it up to the first occurrence
of the next end of block, the next OpLine instruction, or the next
OpNoLine instruction.
```
OpLine %file 0 0
OpNoLine
OpLine %file 1 1
OpStore %foo %int_1
%value = OpLoad %int %foo
OpLine %file 2 2
```
For the above code, the current spirv-opt keeps three line
instructions `OpLine %file 0 0`, `OpNoLine`, and `OpLine %file 1 1`
in `std::vector<Instruction> dbg_line_insts_` of Instruction class
for `OpStore %foo %int_1`. It does not put any line instruction to
`std::vector<Instruction> dbg_line_insts_` of
`%value = OpLoad %int %foo` even though `OpLine %file 1 1` must be
applied to `%value = OpLoad %int %foo` based on the spec.
This results in the missing line information for
`%value = OpLoad %int %foo` while each spirv-opt pass optimizes the
code. We have to put `OpLine %file 1 1` to
`std::vector<Instruction> dbg_line_insts_` of
both `%value = OpLoad %int %foo` and `OpStore %foo %int_1`.
This commit conducts the line instruction propagation and skips
emitting the eliminated line instructions at the end, which are the same
with PropagateLineInfoPass and RedundantLineInfoElimPass. This
commit removes PropagateLineInfoPass and RedundantLineInfoElimPass.
KhronosGroup/glslang#2440 is a related PR that stop using
PropagateLineInfoPass and RedundantLineInfoElimPass from glslang.
When the code in this PR applied, the glslang tests will pass.
If enabled the following targets will be created:
* `${SPIRV_TOOLS}-static` - `STATIC` library. Has full public symbol visibility.
* `${SPIRV_TOOLS}-shared` - `SHARED` library. Has default-hidden symbol visibility.
* `${SPIRV_TOOLS}` - will alias to one of above, based on BUILD_SHARED_LIBS.
If disabled the following targets will be created:
* `${SPIRV_TOOLS}` - either `STATIC` or `SHARED` based on the new `SPIRV_TOOLS_LIBRARY_TYPE` flag. Has full public symbol visibility.
* `${SPIRV_TOOLS}-shared` - `SHARED` library. Has default-hidden symbol visibility.
Defaults to `ON`, matching existing build behavior.
This flag can be used by package maintainers to ensure that all libraries are built as shared objects.
* spirv-val: Allow the ViewportIndex and Layer built-ins when their corresponding SPIR-V 1.5 capabilities are present
* Added tests for OpCapability ShaderViewportIndex and OpCapability ShaderLayer
For some cases, we have DebugDecl invisible to a value assignment, but
the value assignment information is important i.e., debugger cannot inspect
the variable without the information. For example, a parameter of an inlined
function must have its value assignment i.e., argument passing out of its
function scope. If we simply remove DebugDecl because it is invisible to the
argument passing, we cannot inspec the variable.
This PR
- Adds DebugValue for DebugDecl invisible to a value assignment. We use
the value of the variable in the basic block that contains DebugDecl, which is
found by ssa-rewrite. If the value instruction does not dominate DebugDecl,
we use the value of the variable in the immediate dominator of the basic block.
- Checks the visibility of DebugDecl for Phi value assignment based on the
all value operands of the Phi. Since Phi just references multiple values from
multiple basic blocks, scopes of value operands must be regarded as the scope
of the Phi.
The following implementations are introduced:
- Transformation and fuzzer pass for expanding vector reduction.
- Unit tests to cover the instructions with different vector sizes.
Fixes#3768.
This fixes a problem where TransformationInlineFunction could lead to
distinct instructions having identical unique ids. It adds a validity
check to detect this problem in general.
Fixes#3911.
`DebugInfoManager::AddDebugValueIfVarDeclIsVisible()` adds
OpenCL.DebugInfo.100 DebugValue from DebugDeclare only when the
DebugDeclare is visible to the give scope. It helps us correctly
handle a reference variable e.g.,
{ // scope #1.
int foo = /* init */;
{ // scope #2.
int& bar = foo;
...
in the above code, we must not propagate DebugValue of `int& bar` for
store instructions in the scope #1 because it is alive only in
the scope #2.
We have an exception: If the given DebugDeclare is used for a function
parameter, `DebugInfoManager::AddDebugValueIfVarDeclIsVisible()` has
to always add DebugValue instruction regardless
of the scope. It is because the initializer (store instruction) for
the function parameter can be out of the function parameter's scope
(the function) in particular when the function was inlined.
Without this change, the function parameter value information always
disappears whenever we run the function inlining pass and
`DebugInfoManager::AddDebugValueIfVarDeclIsVisible()`.
The validity check during fuzzing and in unit tests is strengthened to
require that every block has its enclosing function as its parent.
TransformationMergeFunctionReturns is fixed so that it ensures parents
are set appropriately.
Fixes#3907.
Currently the validator, when checking an instruction is in the correct
section, always advances the current section. This means if we have an
instruction from a previous section we'll end up reporting it as invalid
in a function definition. This error is confusing.
This CL updates the validator to check if the given opcode is from a
previous layout section before advancing the current section. If it is
from a previous layout section an error is emitted.
1. DebugValue/DebugDeclare references of load/store must not change
the behaviors of the convert-local-access-chains pass
2. We have to properly set the scope and line information of new
instructions made by the convert-local-access-chains pass
Adds a virtual method, GetFreshIds(), to Transformation. Every
transformation uses this to indicate which ids in its protobuf message
are fresh ids. This means that when replaying a sequence of
transformations the replayer can obtain a smallest id that is not in
use by the module already and that will not be used by any
transformation by necessity. Ids greater than or equal to this id
can be used as overflow ids.
Fixes#3851.
The following changes are introduced:
1. Entry block might have more than one predecessor, even if it is not
a selection/loop merge block. However Apply method asserts that
there is only one predecessor. Now, IsApplicable method ensures
that there is only one predecessor.
2. In fuzzer pass we exclude both loop headers and selection headers
as potential exit blocks.
Fixes#3827.
Before this change, the replayer would return a SPIR-V binary. This
did not allow further transforming the resulting module: it would need
to be re-parsed, and the transformation context arising from the
replayed transformations was not available. This change makes it so
that after replay an IR context and transformation context are
returned instead; the IR context can subsequently be turned into a
binary if desired.
This change paves the way for an upcoming PR to integrate spirv-reduce
with the spirv-fuzz shrinker.
TransformationContext now holds a std::unique_ptr to a FactManager,
rather than a plain pointer. This makes it easier for clients of
TransformationContext to work with heap-allocated instances of
TransformationContext, which is needed in some upcoming work.
This transformation, given a constant integer (scalar or vector) C,
constants I and S of compatible type and scalar 32-bit integer constant
N, such that C = I - S*N, adds a loop which subtracts S from I, N
times, creating a synonym for C.
The related fuzzer pass randomly chooses constants to which to add
synonyms using this transformation, and the location where they should
be added.
Fixes#3616.
In preparation for some upcoming work on the shrinker, this PR changes
the interfaces of Fuzzer, Replayer and Shrinker so that all data
relevant to each class is provided on construction, meaning that the
"Run" method can become a zero-argument method that returns a status,
transformed binary and sequence of applied transformations via a
struct.
This makes greater use of fields, so that -- especially in Fuzzer --
there is a lot less parameter passing.
This change introduces various strategies for controlling the manner
in which fuzzer passes are applied repeatedly, including infrastructure
to allow fuzzer passes to be recommended based on which passes ran
previously.
This PR modifies the FactManager methods IdIsIrrelevant and GetIrrelevantIds so
that an id is always considered irrelevant if it comes from a dead block.
Fixes#3733.
Introduces two changes:
- duplicated_exit_region refers to a correct block, regardless of the order
of the blocks in the enclosing function.
- Exclude the case where the continue target is the exit block.
This PR implements part of the add bit instruction synonym transformation.
For now, the implementation covers the OpBitwiseOr, OpBitwiseXor and
OpBitwiseAnd cases.
This PR changes the fact manager so that, when calling some of the
functions in submanagers, passes references to other submanagers if
necessary (e.g. to make consistency checks).
In particular:
- DataSynonymAndIdEquationFacts is passed to the AddFactIdIsIrrelevant
function of IrrelevantValueFacts
- IrrelevantValueFacts is passed to the AddFact functions of
DataSynonymAndIdEquationFacts
The IRContext is also passed when necessary and the calls to the
corresponding functions in FactManager were updated to be valid and
always use an updated context.
Fixes#3550.
This transformation, given the header of a selection construct with
branching instruction OpBranchConditional, flattens it.
Side-effecting operations such as OpLoad, OpStore and OpFunctionCall
are enclosed within smaller conditionals.
It is applicable if the construct does not contain inner selection
constructs or loops, or atomic or barrier instructions.
The corresponding fuzzer pass looks for selection headers and
tries to flatten them.
Needed for the issue #3544, but it does not fix it completely.
In #3636, I missed that the instruction folder may create more than a
single constant per call. Since CCP was only checking whether one
constant had been created after folding, it was wrongly thinking that
the IR had not changed.
Fixes#3738.
Adds a transformation that inserts a conditional statement with a
boolean expression of arbitrary value and duplicates a given
single-entry, single-exit region, so that it is present in each
conditional branch and will be executed regardless of which branch will
be taken.
Fixes#3614.
Motivated by integrating spirv-reduce into spirv-fuzz, so that an
added function can be made smaller during shrinking, this adds support
in spirv-reduce for asking reduction to be restricted to the
instructions of a single specified function.
This PR adds the NestingDepth function to StructuredCFGAnalysis.
This function, given a block id, returns the number of merge
constructs containing it.
This is needed by spirv-fuzz, but it makes sense to add it to
StructuredCFGAnalaysis, which contains related functionalities.
This transformation takes an OpSelect instruction and replaces it with
a conditional branch, selecting the correct value using an OpPhi
instruction.
Fixes part of the issue #3544.
The Vulkan 1.2.152 headers/spec now include Valid Usage IDs (VUID)
for every BuiltIn. This change adds labeling to help aid tracking
the coverage gap in Vulkan targeted validation. This is done with
a script in the Validation Layers that parses the source for VUID
strings, both that there is an implementation and a test for it.
There are 2 main changes:
1. A VUID string is added where applicable. It is wrapped in a function
that at runtimes checks if Vulkan is the target so that other targets
will not be effected as the output error only changes for Vulkan and
the overhead only occurs if there is an error at all.
2. For unit test, a new parameter value was added to allow make sure
that for each VUID there is a matching test. There are cases where the
same unit test checks multiple VUs via the parameter feature of gtest.
For this, I added a custom gMock Matcher to simply loop over a list
of VUID strings
Pointer (if VariablePointers is enabled) to find sets of potential
synonyms.
However, some instructions with these types cannot be used in an OpPhi:
- OpFunction cannot be used as a value
- OpUndef should not be used, because it yields an undefined value for
each use
Fixes#3761.