* Add move folding for composite instructions
Fold chains of insert into construct
If a chain of OpCompositeInsert instruction write to every element of a
composite object, then we can replace it with an OpCompositeConstruct.
Fold a construct fed by extracts to a single extract
We already fold an OpCompositeConstruct when it is simlpy reconstructing
an object that was decomposed by a series of OpCompositeExtract
instructions. However, we do not do that if that object is an element
of a larger object.
I have updated the rule, so that if the original object is a an element
of a larger object, then the OpCompositeConstruct is replaced with a
single OpCompositeExtract from the larger object.
Fixes#4371.
* Handle 64-bit integers in local access chain convert
The local access chain convert pass does on run on module that have
64-bit integers, even if they have nothing to to with access chains.
This is very limiting because other passes rely on the access chains
being removed. So this commit will add this functionality to the pass.
spirv validation require OpFunctionCall with memory object, usually this
is non issue as all the functions are inlined.
This pass deal with some case for
DontInline function. accesschain input operand would be replaced new
created variable
- Add assembler/disassembler support
- Add validator support
Signed-off-by: Kevin Petit <kevin.petit@arm.com>
Change-Id: Iffcedd5d5e636a0e128a5906ffe634dd85727de1
This reverts commit 671f6e633f.
PR #4783 was reverted because it caused OpenCL CTS failures for clvk.
The was in clspv, which was not adding the no contract decoration when
it was required. This has been fixed in
https://github.com/google/clspv/pull/845. We can now reapply #4783.
* linker: Recalculate interface variables
By resolving extern symbols Entry Points might access variables they
hadn't declared before.
* test/linker: add test to verify linked spir-vs importing functions validate
Without the fix Validate will complain about:
"ERROR: 9: Interface variable id <5> is used by entry point 'bar' id <1>, but is not listed as an interface\n %5 = OpVariable %_ptr_Input_v3uint Input\n"
Adding Fma instruction can speed up the code. This was requested by
swiftshader, so they do not have to do this analysis themselves. It can
also help reduce the code size, and the work the ICD compilers have to
do.
The code accidentally expected OpTypeFunction operand count to match.
This is fixed so that OpTypeFunction instructions with different operand
counts are considered not matching.
Currently, the diff tool matches types bottom up, so on every
instruction it expects to know if its operands are already matched or
not. With cyclical references, it cannot know that. Type matching
would need significant rework to be able to support such a use case; for
example, it may need to maintain a set of plausable matches between type
pointers that are forward-referenced, and potentially back track when
later the types turn out to be incompatible.
In this change, OpTypeForwardPointer is supported in the more common and
trivial case. Firstly, forwarded type pointers are only matched if they
have they have the same storage class and point to the same type opcode:
- In the presence of debug info, matching is done only if the names are
unique in both src and dst.
- In the absence of debug info, matching is done only if there is only
one possible matching.
Fixes: #4754
spread-volatile-semantics pass spreads Volatile semantics for builtin
variables used by certain execution models based on
VUID-StandaloneSpirv-VulkanMemoryModel-04678 and
VUID-StandaloneSpirv-VulkanMemoryModel-04679 (See "Standalone SPIR-V
Validation" section of Vulkan spec "Appendix A: Vulkan Environment for
SPIR-V"). Therefore, shaders without execution model (e.g., used only
for linkage) are not the target of the pass. This commit lets the pass
just return SuccessWithoutChange in that case.
If the body of the module does not have any ids change, compact ids will
not change the id bound. This can cause problems because the id bound
could be much higher than the largest id in that is used. It should be
reset any time it is not the larger id used + 1.
Fixes#4604
When folding a vector shuffle feeding a vector shuffle, we do not
propagate an 0xFFFFFFFF, which has a special meaning, correctly. We
adjust the value making it lose it meaning as an undefined value.
Fixes#4581
CCP does not want to fold an instruction unless it folds to a constant.
There is an asser to check for this. The question if a spec constant
counts as a constant. The constant folder considers a spec constant a
constand, but CCP does not. I've fixed the assert in CCP to match what
the folder does. It should not require any new changes to CCP.
Swift shader needs a way to inline all functions, even those marked as
DontInline. See https://github.com/KhronosGroup/SPIRV-Tools/pull/4471.
This implements the suggestion I made in the PR. We add a pass that
will remove the DontInline function control, so that the inlining passes
will inline them.
SwiftShader will still have to modify their code to add this pass before
the other passes are run.
The function `BuildInvalideAnalyses` will be rebuilt for every analysis that
has been requested, but it is not necessary. It also can cause problems
because if the CFG needs to be rebuilt, so do the dominator trees.
This change will make the functionality match the description of the
function.
* Optimize DefUseManager allocations
Saves around 30-35% of compilation time.
For inst->use_ids, use a pool linked list instead of allocating vectors for every instruction. For inst->uses, use a "PooledLinkedList"' -- a linked list that has shared storage for all nodes. Neither re-use nodes, instead we do a bulk compaction operation when too much memory is being wasted (tuneable).
Includes separate PooledLinkedList templated datastructure, a very special case construct, but split out to make the code a little easier to understand.
Incrementally compute the hash instead of collecting words
Avoids allocating temporary space in a std::vector and std::u32string, and making three passes over all the hashed data.
Switch to using std::vector to prevent processing duplicates instead of std::unordered_set: avoids an allocation/deletion every call to ComputeHashValue, and ends up faster due to much better cache behaviour and smaller constant-factor when searching the (generally very small) list.
In my test case, made Type::HashValue go from 7.5% of compilation time to .5%
In newer versions of protobuf the Status building code has been made
internal, so that embedders cannot build their own instances like is
being done here.
Changing this code to just use the .ok() method on the status object,
since if the status is OK or not is what is actually being tested.
This will make it easier in the future to update external/protobuf.
* Update test for parsing memory access masks
Needed to support SPV_INTEL_memory_access_aliasing extension
There is a negative test that checks unused mask bits.
Some of those bits are now sued by the new Intel extension.
* Update deps for new SPIRV-Headers
Previously, array sizes were presumed to be OpConstant, which is not
necessarily true. This change ensures OpSpecConstant array sizes as
matched exactly, instead of taken as OpConstant and matched by value.
* Reimplement LCS used by spirv-diff
Two improvements are made to the LCS algorithm:
- The LCS algorithm is reimplemented to use a std::stack instead of
being recursive. This prevents stack overflow in the LCSTest.Large
test.
- The LCS algorithm uses an NxM table. Previously, entries of this
table were {size_t, bool, bool}, which is now packed in 32 bits. The
first entry can assume a maximum value of min(N, M), which
realistically for SPIR-V diff will not be larger than 1 billion
instructions. This reduces memory usage of LCS by 75%.
This partially reverts 845f3efb8a and
enables LCS tests.
* Stabilize the output of spirv-diff
std::map is used instead of std::unordered_map to ensure the output of
spirv-diff is identical everywhere.
This partially reverts 845f3efb8a and
enables spirv-diff tests.
Scalar replacement generates a null when there value for a member will
not be used. The null is used to make sure things are
deterministic in case there is an error.
However, some type cannot be null, so we will change that to use undef.
To keep the code simpler we will always use the undef.
Fixes#3996
spirv-diff is a new tool that produces diff-style output comparing two
SPIR-V modules. The instructions between the src and dst modules are
matched as best as the tool can, and output is produced (in src
id-space) that shows which instructions are removed in src, added in dst
or modified between them. The order of instructions are not retained.
Matching instructions between two SPIR-V modules is not trivial, and
thus a number of heuristics are applied in this tool. In particular,
without debug information, it's hard to match functions as they can be
reordered. As such, this tool is primarily useful to produce the diff
of two SPIR-V modules derived from the same source.
This tool can be useful in a number of scenarios:
- Compare the SPIR-V before and after modifying a shader
- Compare the SPIR-V produced from a shader before and after compiler
codegen changes.
- Compare the SPIR-V produced from a shader before and after some
transformation or optimization.
- Compare the SPIR-V produced from a shader with different compilers.
The handling of the RayQueryKHR type is not complete in the type
manager. The tests were not picking this up. I've added a test to make
sure that the `GenerateAllTypes` function actually does generate all of
the types. Once it is added there other tests should pick up on the
other parts that were missing.
Add a pass to spread Volatile semantics to variables with SMIDNV,
WarpIDNV, SubgroupSize, SubgroupLocalInvocationId, SubgroupEqMask,
SubgroupGeMask, SubgroupGtMask, SubgroupLeMask, or SubgroupLtMask BuiltIn
decorations or OpLoad for them when the shader model is the ray
generation, closest hit, miss, intersection, or callable shaders. This
pass can be used for VUID-StandaloneSpirv-VulkanMemoryModel-04678 and
VUID-StandaloneSpirv-VulkanMemoryModel-04679 (See "Standalone SPIR-V
Validation" section of Vulkan spec "Appendix A: Vulkan Environment for
SPIR-V").
Handle variables used by multiple entry points:
1. Update error check to make it working regardless of the order of
entry points.
2. For a variable, if it is used by two entry points E1 and E2 and
it needs the Volatile semantics for E1 while it does not for E2
- If VulkanMemoryModel capability is enabled, which means we have to
set memory operation of load instructions for the variable, we
update load instructions in E1, but do not update the ones in E2.
- If VulkanMemoryModel capability is disabled, which means we have
to add Volatile decoration for the variable, we report an error
because E1 needs to add Volatile decoration for the variable while
E2 does not.
For the simplicity of the implementation, we assume that all functions
other than entry point functions are inlined.
* linker: Address conversion error introduced in earlier rework
Also rework the code slightly to first compute the new ID bound and
validate it, and only then, offset all existing IDs.
This makes those two steps clearer, and avoids potentially overflowing
the IDs within a module (even if that would have been caught later on).
Fixes: c3849565 ("Linker improvements (#4679)")
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/4684
* test/linker: Disable IdsLimit tests for now
They are taking too long to run and results in the CI jobs timing out.
* test/linker: Code factorisation and small tweaks
* linker: Do not fail when going over limits
The limits are minima and implementations or APIs might support higher
limits, so just warn the user about it. And only check for the limits
right before emitting the binary, as limits might change earlier when
removing duplicate instructions, function prototypes, etc.
The only check performed right before merging, is making sure the ID
bound will not overflow the 32 bits following the merge.
Also, use the defines for the limits instead of hard-coding them.
* linker: Require a memory model in each input module
The existing code could run into weird situation. For example, if the
first module had no memory model, it would not emit any memory model
(sort of reasonable) and would accept without complains all possible mix
from later modules as it would not verify them.
* linker: Replace hex version with SPV_SPIRV_VERSION_WORD
* linker: Error out when linking together different versions
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/4135
* tools/linker: Do not write to disk if linking failed
Also, do not consider warnings as errors.
* tools/linker: Fix formatting in help message
* tools/linker: Further clarify the use of --target-env
Also update the text for the default version to reflect the change made
in 7d768812 ("Basic support for SPIR-V 1.6 (#4663)").
The test harness for the opt fuzzer was failing to consider that the
input might use a very large id bound, despite no id approaching this
bound actually being used.
This change modifies the test harness to use the module's id bound,
rather than looking through the module for large ids.
Fixes: oss-fuzz:42386
For a shader input/output interface variable of structure type:
If the structure has BuiltIn members, then the structure type
must be Block decorated.
Otherwise, the variable, or the struct members must have locations,
but nothing can have both a location be a BuiltIn.
Implements validation needed to reject the example in
https://github.com/KhronosGroup/SPIRV-Registry/issues/134
* Basic support for SPIR-V 1.6
* Update SPIRV-Headers deps
* Add new environment enum for SPIR-V 1.6
* Make default environment 1.6 for most tools
* Update tests
* Disallow conditional branch with duplicate labels
* Disallow Dim=Buffer with sampled images
* Do not require the non-semantic extension after SPIR-V 1.5
The pass to remove the nonsemantic information and instructions
is used for drivers or tools that may not support them. Debug
information was only partially handle, which is causing a
problem. We need to either fully remove debug information or
not remove it all. Since I can see it being useful to keep the
debug information even when the nonsemantic instructions are
removed, I propose we do not remove debug info.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/4269
In https://github.com/KhronosGroup/SPIRV-Tools/pull/3110, the strip reflect
pass was changed to also remove all explicitly nonsemantic instructions. This
makes it so that the name of the pass no longer reflects what the pass actually
does. This change renames the pass so that it reflects what the pass actaully does.
Use a very large id bound when fuzzing the optimizer, and check that the
input does not ids that are too close to this bound. This should make it
impossible in practice for an id overflow to occur.
Fixes#4657.
* Fix endianness of string literals
To get correct and consistent encoding and decoding of string literals
on big-endian platforms, use spvtools::utils::MakeString and MakeVector
(or wrapper functions) consistently for handling string literals.
- add variant of MakeVector that encodes a string literal into an
existing vector of words
- add variants of MakeString
- add a wrapper spvDecodeLiteralStringOperand in source/
- fix wrapper Operand::AsString to use MakeString (source/opt)
- remove Operand::AsCString as broken and unused
- add a variant of GetOperandAs for string literals (source/val)
... and apply those wrappers throughout the code.
Fixes #149
* Extend round trip test for StringLiterals to flip word order
In the encoding/decoding roundtrip tests for string literals, include
a case that flips byte order in words after encoding and then checks for
successful decoding. That is, on a little-endian host flip to big-endian
byte order and then decode, and vice versa.
* BinaryParseTest.InstructionWithStringOperand: also flip byte order
Test binary parsing of string operands both with the host's and with the
reversed byte order.
This prevents CCP from making constant -> constant transitions when
evaluating instruction values. In this case, FClamp is evaluated twice.
On the first evaluation, if computes FClamp(0.5, 0.5, -1) which returns
-1. On the second evaluation, it computes FClamp(0.5, 0.5, VARYING)
which returns 0.5.
Both fold() computations are correct given the semantics of FClamp() but
this causes a lateral transition in the constant lattice which was not
being considered VARYING by CCP.
Along with OpDecorate, also clone the OpDecorateString instructions for
variables created in the descriptor scalar replacement pass.
Fixesmicrosoft/DirectXShaderCompiler#3705
* https://github.com/KhronosGroup/Vulkan-Docs/issues/666 clearly
specified that interfaces do not require an input if there is an
associated output
* ADCE can now remove unused input variables (though they are kept if
the preserve interfaces option is used)
Fixes#4469
* Checks that decorations only usable with structure members are not
used by OpDecorate or OpDecorateId
* Checks that decorations not allowed on structure members are not used
with OpMemberDecorate
* Checks decoration targets for most core decorations
* Performs some Vulkan specific validation on deorations
* Add wasm build
* Run wasm ci on push
* Add copyright notice to wasm files
* [wasm] Update Emscripten
* [wasm] Change global lambda to regular function
* [wasm] Show detected core count during build
* [wasm] Set JS version from CHANGES, GITHUB_RUN_ID
Also remove custom docker emscripten build with brotli, as not used
* [wasm] Change github actions to use npm-publish
* [wasm] Us docker-compose up for CI
* [wasm] pass GITHUB_RUN_ID to docker
* [wasm] Change GITHUB_RUN_ID to GITHUB_RUN_NUMBER
* [wasm] Fix GITHUB_RUN_NUMBER in docker-compose.yml
If the ids overflow when creating an integer constant in the ir_builder, there will be a nullptr dereference. This is happening from inside merge return.
We need to propagate the error up, and make sure it is handled appropriately.
Consider the new test case. The conditional branch in the continue
block is never marked as live. However, `IsDead` will say it is not
dead, so it does not get deleted. Because it was never marked as live,
`%false` was not mark as live either, but it gets deleted. This results
in invalid code.
To fix this properly, we had to reconsider how branches are handle. We
make the following changes:
1) Terminator instructions that are not branch or OpUnreachable must be
kept, so they are marked as live when initializing the worklist.
2) Branches and OpUnreachable instructions are marked as live if
a) the block does not have a merge instruction and another instruction
in the block is marked as live, or
b) the merge instruction in the same block is marked as live.
3) Any instruction that is not marked as live is removed.
4) If a terminator is to be removed, an OpUnreachable is added. This
happens when the entire block is dead, and the block will be removed.
The OpUnreachable is generated to make sure the block still has a
terminator, and is valid.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/4509.
When checking the OpBranchConditional for selection headers,
we intend to register both the true and false targets.
Short circuiting was getting in the way.
Co-authored-by: Steven Perron <stevenperron@google.com>
Spirv-opt has not had to handle module with function declarations. This
lead many passes to assume that every function has a body. This is not
always true. This commit will modify a number of passes to handle
function declarations.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/4443
When setting default value for spec constants, for numeric bit types smaller
than 32 bits, follow the SPIR-V rules for narrow literals:
- signed integers are sign-extended
- otherwise, upper bits are zero.
Followup to #4588
* test: add a test to show 8/16-bit
* opt/spec_constants: fix bit pattern width checks.
The input bit patterns are always at least 32-bits, so let the test
pass for 8/16-bit values as well. This shouldn't have any effect on the
64-bit patterns I assume this was introduced for.
The OSS-Fuzz i386 build has been failing due to errors about
64-to-32-bit conversions, relating to random generation code. This
changre fixes the problem by explicitly using a 64-bit random generator,
and by adding a cast to size_t to avoid an implicit conversion.
Do this if Constant or DefUse managers are invalid. Using the
ConstantManager attempts to regenerate the DefUseManager
which is not valid during inlining.
* Account for strided components in arrays
Fixes#4567
* If the element type of an array takes less than a location in size,
calculate the location usage in a strided manner
* formatting
According to spec this opcode is a constant instruction - that's it
can appear outside of function bodies.
Co-authored-by: DmitryBushev <dmitry.bushev@intel.com>