* Fix#2478. The fix is to just not try to simplify such loops.
* Also added `BasicBlock::MergeBlockId()` and `BasicBlock::ContinueBlockId()`.
* Some minor changes to `structured_loop_to_selection_reduction_opportunity.cpp`.
* Added test.
Fix#2475. Fix#2476.
* Improve reducer algorithm: shrink granularity, remove an early return, no lazy initialization, notify pass if binary is interesting, add comments.
* Add fail-on-validation-error option to fail a reduction if an invalid state is reached; useful for tests.
* Set fail-on-validation-error in tests.
* Improve some documentation comments.
* Add Reducer::AddDefaultReductionPasses so tests (and other library consumers) can add the default reduction passes.
* Add CLIMessageConsumer in test_reduce so we can see messages for tricky tests.
* Remove test RemoveUnreferencedInstructionReductionPassTest_ApplyReduction because it was indirectly testing the reduction algorithm, not the RemoveUnreferencedInstruction pass.
* Tweak tests where needed.
Fix#2396
* Check that initial state is valid. Add kInitialStateInvalid.
* Fix RemoveOpnameAndRemoveUnreferenced test; turns out the original shader is invalid, but we never notice because we don't check this and the reduced shader is valid; fix original shader. Assert reduction status is kComplete.
* Always check return value from `Reducer::Run`.
* Change Reducer::Run to *not* immediately copy the input binary.
Changing the stored value for a sampled image consumer to be the
instruction instead of result ID, since not all instructions have
result IDs. Using result IDs led to a potential crash when using
OpReturnValue, which doesn't have a result ID. OpReturnValue is not a
legal consumer, but the validator needs to look at the instruction to
determine this, thus storing the pointer to the instruction, instead
of trying to fetch the pointer using the instruction.
Issue #1528 covers fixing the check.
Fixes#2463
If SPV_EXT_descriptor_indexing is enabled, add check that for a
descriptor-based reference, the descriptor is initialized. Initialization
data is stored in the debug input buffer, added to the length information
already there. This feature must be seperately enabled on the pass
creation routine. NOTE: Currently just supports image references; buffer
references are still TODO.
Adds an optimization pass to remove usages of AtomicCounterMemory
bit. This bit is ignored in Vulkan environments and outright forbidden
in WebGPU ones.
Fixes#2242
If relax-logical-pointer is enabled, this commit makes Validator
accept function param even when its Storage Class is different from
the expected one.
Related to #2423, #2430
In constant propagation, decoration are transfered from the original
expression to the constant that will replace it. This can be wrong
because there are no decorations that apply to constants. We choose to
simply delete the decorations.
Fixes#2441
In WebGPU all blocks are required to be reachable, unless they are one of two
specific degenerate cases for merge-block or continue-target. This PR adds in
checking for these conditions.
Fixes#2068
Updated script to work with python3 and python2.
Added required tools.
We added a section to the readme to mention the tools that are needed to
build and test spirv-tools. For the compiler, the compilers used by the
bots are mentioned.
The bots have been changed. The windows bots will not use python 3.6 for testing. The other bots will still use python 2.7. Both Python2 and Python3 will be tested.
Fixes#2407.
Fixes#1856.
* Handle back edges better in dead branch elim.
Loop header must have exactly one back edge. Sometimes the branch
with the back edge can be folded. However, it should not be folded
if it removes the back edge.
The code to check this simply avoids folding the branch in the
continue block. That needs to be changed to not fold the back edge,
wherever it is.
At the same time, the branch can be folded if it folds to a branch to
the header, because the back edge will still exist.
Fixes#2391.
In relaxed addressing mode, we want to accept non memory objects
because this is a very natural translation of hlsl. It should be fixed
by legalization by inlining the calls.
* Fix OpDot folding of half float vectors.
The code that folds OpDot does not handle half floats correctly. After
trying to multiple the first components, we get a nullptr because we
don't fold half float values. This nullptr gets passed to the code that
does the addition, and causes an assert.
Fixes#2405.
The types of input and output variables must match for the pipeline. We
cannot see the uses in all of the shader, so dead member
elimination cannot safely change the type of input and output variables.
Add a pass that looks for members of structs whose values do not affects
the output of the shader. Those members are then removed and just
treated like padding in the struct.
* Fixes#2358. Added to the reducer the ability to remove a function that is not directly called. Factored out some code from the optimizer to help with this.
Fixes#2120
Enhanced the reducer so that it can merge blocks together, leveraging the functionality extracted from the block_merge pass in the optimizer.
* Fixes#2338. Added check for phi node before merging blocks.
* Added functionality to merge blocks A and B even when B starts with OpPhi instructions, by replacing uses of the OpPhi results with the definitions coming from A. Added some tests for this.
* Fixed assertion.
* Remove use of deprecated googletest macro
INSTANTIATE_TEST_CASE_P has been deprecated. We need to use
INSTANTIATE_TEST_SUITE_P instead.
* Remove extra commas from test suites.
This CL adds in the specific checks required for WebGPU, enables
running the builtin checks for WebGPU, and refactors the existing
testing infrastructure to support testing the new checks.
This PR is part of resolving #2276
In a recent PR, we allowed a forward reference for the element type in
an array declaration. However, we do not have other check to make sure
the forward reference is a pointer type first reference in
OpTypeForawrdPointer. We add that check.
Fixes https://crbug.com/920074.
When looking at the uses of the result of an instruction, code sinking
assumes that all uses are in a basic block. However, this is not true
if there is a decoration or name for the result of that insturction.
This commit checks for this.
Fixes https://crbug.com/923243.
It is legal, but not generated by any SPIR-V producer: an OpCompositeExtract
with no indexes. This is essentially just a copy of the object, so we
treat them that way. We simply propagate the live variables of the
result to the operand.
Fixes https://crbug.com/919181.
It is legal, but not generated by any SPIR-V producer: an OpAccessChain
with no indexes. This is essentially just a copy of the pointer.
I have decided to treat it like an OpCopyObject. In CheckUses, we
return that it is not okay.
When looking at this I realized that we had code in GetUsedComponents
that cannot be reached. If there is a use in an OpCopyObject the it
will not call GetUsedComponents. I removed that dead code.
Fixes https://crbug.com/918311.
During unrolling a new loop is created, but its ownership is not clear
as it gets passed through the code. Changed something to unique_ptr to
make that clearer.
Fixes#2299.
Fixing other memory leaks at the same time.
Fixes#2296Fixes#2297
In C++, a bit shift of the same size as the type is undefined, but it is
defined in spir-v. When folding those cases, we have to be careful. We
cannot simply do the shift in C++.
Fixes https://crbug.com/917697.
Fixes#2138
* Modf and frexp are upgraded to use the struct version of the
instruction and generate an explicit store whose flags can be upgraded
separately
* Fixed major bug where availability and visibility were reversed for
non-copy memory instructions
* Fixed bug where availability and visibility scope operands were reversed for copy memory
* Upgraded all opt tests to use SPV_ENV_UNIVERSAL_1_3
* Upgrade tests moved into unified tests and removed standalone test
Broader check for ids that require a type
Fixes https://crbug.com/911700
* Adds a broader check for when id operands require a type
* updated a few tests
* added a test to catch the original issue
* Handle CompositeInsert with no indices in VDCE
In the spec, there it nothing that forces an OpCompositeInsert to have
an index, but VDCE assumes there is at least 1 in a couple places.
This commit updates VDCE to handle these cases.
* Added additional changes for the new AccelerationStructureNV type.
* Added NVIDIA ray tracing storage classes for checking in ValidateVariable.
* For NVIDIA ray tracing storage classes added test to load bool type (allowed) in new storage class.
This Cl changes the binary parser to keep track of the instruction count
being processed. The parser will then use that instruction number as the
error number, instead of the binary word.
This should make it easier to match the error up to what the
disassembler would output for the error.
Issue #2091
When processing options in a file, it does have access to the
ValidatorOptions and OptimizerOptions object, so options that change
those do not work. We just need to pass it in.
Fixes#2219.
* Added additional changes for the new AccelerationStructureNV type.
* Added additional changes for the new AccelerationStructureNV type. Change tabs to space...
* Added additional changes for the new accelerationStructureNV type -- add proper type name.
Fix TypeManager.TypeStrings test:
[----------] 29 tests from TypeManager
[ RUN ] TypeManager.TypeStrings
[ OK ] TypeManager.TypeStrings (7 ms)
When we are predicating the continue target for a loop, it can no longer
be the continue target because it will have a branch that exits the loop
and is not the bach edge. The continue target will have to be the
target of that branch that is still in the loop.
Fixes#2211.
The function `UpdatePhiNodes` was being called inconsistently. In one
case, the cfg had already been updated to include the new edge, and in
another place the cfg was not updated. This caused the function to
miss flagging a block as needing new phi nodes. I picked that the cfg
should not be updated before making the call. I documented it, and
change the call sites to match.
Fixes#2207.
We initially assumed that if the type manager returned the correct id
for the pointee type, that we would get the correct pointer type back,
but that is not true. See the unit test added with this commit. We
need to fall back to the linear search any time we are looking for a
pointer to a type that may not be unique.
At the same time, SROA considered an OpName on a variable to be a use of
the entire variable. That has been fixed.
Fixes#2209.
We currently place the load instructions at the start of the basic block
that dominates all of the loads. If that basic block contains OpPhi
instructions, then this will generate invalid code. We just need to
search for a location that comes after all of the OpPhi instructions.
Fixes#2204.
* Add OperandToUndefReductionPass.
Fixes#2115.
Also added some tests that are similar to those in OperandToConstantReductionPassTest.
In addition, refactor FindOrCreateGlobalUndef into reduction_util.cpp. Fixes#2184.
Removed many documentation comments that were identical or very similar to the overridden function's documentation comment.
* Don't fold specialized branchs in loop unswitch
Folding branches can have a lot of special cases, and can be a little
error prone. So I only want it in one place. That will be in dead
branch elimination. I will change loop unswitching to set the branches
that were being folded to have a constant condition. Then subsequent
pass of dead branch elimination will be able to remove the code.
At the same time, I added a check that loop unswitching will not
unswitch a branch with a constant condition. It is not useful to do it
because dead branch elimination will simple fold the branch anyway.
Also it avoid an infinite loop that would other wise be introduced by my
first change.
Fixes#2203.
Loop unswitching is unswitching the conditional branch that creates the
back-edge. In the version of the loop, where the bachedge is not taken,
there is no back-edge. This is what causes the validator to complain.
The solution I will go with will be to now unswitch a condition with a
back-edge. At this time we do not now if loop unswitching is used. We do
not include it in the optimization sets provided, nor is it used in
glslang's set. When there are opportunities and no breaks from the loop,
the loop with either be a single iteration loop, or an infinite loop.
There is no performance advantage to performing loop unswitching in
either of those cases. If there is a break, maintaining structured
control flow will be tricky. Unless we see a clear advantage to handling
these case, I would go with the safer simpler solution.
Fixes#2201.
If there are multiple edges to a basic block, then the ssa rewriter will
create OpPhi instructions with duplicate entries. This is invalid, and
it is fixed in this commit.
Fixes#2202.
* Run validator during reduction.
* Added functionality to validate modules after each reduction step, and some tests to check this is working. Also fixed an issue where reduction passes were not guaranteed to be executed at their minimum granularities.
There is inconsistencies between the different specs about whether or
not this capability is required/allowed, so tooling like glslang
currently ignores it. Once this is resolved the check and test can be
re-enabled.
* Invalidate the decoration manager at the start of ADCE.
If the decoration manager is kept live the the contex will try to keep
it up to date. ADCE deals with group decorations by changing the
operands in |OpGroupDecorate| instructions directly without informing
the decoration manager. This puts it in an invalid state, which will
cause an error when the context tries to update it. To Avoid this
problem, we will invalidate the decoration manager upfront.
At the same time, the decoration manager is now considered when checking
the consistency of the decoration manager.
Add a spirv-reduce pass which removes OpName and OpMemberName instructions.
This is useful to enable other reduction passes, e.g. RemoveUnreferencedInstruction may not be able to remove an instruction creating an id whose only usage is an OpName for this id.
* Fix invalid OpPhi generated by merge-return.
When we create a new phi node for a value say %10, we have to replace
all of the uses of %10 that are no longer dominated by the def of %10
by the result id of the new phi. However, if the use is in a phi node,
it is possible that the bb contains the use is not dominated by either.
In this case, needs to be handled differently.
* Split loop headers before add a new branch to them.
In merge return, Phi node in loop header that are also merges for loop
do not get updated correctly. Those cases do not fit in with our
current analysis. Doing this will simplify the code by reducing the
number of cases that have to be handled.
Added documentation to the ir context to indicates that TakeNextId()
returns 0 when the max id is reached. TODOs were added to each call
sight so that we know where we have to start to handle this case.
Handle id overflow in |SplitLoopHeader|.
Handle id overflow in |GetOrCreatePreHeaderBlock|.
Handle failure to create preheader in LICM.
Part of https://github.com/KhronosGroup/SPIRV-Tools/issues/1841.
* Only check for binding and descriptor set on variables that are
statically used by an entry point
* updated tests and added a couple new ones
* new method for collecting entry points that statically reference an
id
* Validate OpForwardPointer
The validator does not have a a check that OpForwardPointer is giving
a forward reference to a pointer type. We add that check.
https://crbug.com/910852
* Remove more specialized check.
There was a check that the forward pointer is actually a poiner type,
but it was only done if it was used in a struct. This was too specific.
Remove it in favour of the more general check that was added.
* Format
* Check the storage type in OpTypeForwardPointer
* Fix typo is test case epxected results.
We currently simulate all shift operations when the two operand are
constants. The problem is that if the shift amount is larger than
32, the result is undefined.
I'm changing the folder to return 0 if the shift value is too high.
That way, we will have defined behaviour.
https://crbug.com/910937.
Fixes#2147
* Checks that device scope is not used for availability and visibility
operations unless VulkanMemoryModelDeviceScopeKHR capability is present
* implemented for atomics, barriers and memory instructions currently
This CL changes the id/name output from the validator to always use a
consistent id[%name] style. This removes the need for getIdOrName. The
name lookup is changed to use the NameMapper so the output is consistent
with what the disassembler will produce.
Fixes#2137
* Validate uses of ids defined in unreachable blocks.
For some reason we do not make sure the uses of ids that are defined
in unreachable blocks are dominated by their def. This is causing
invalid code to pass the validator.
Fixes#2143
* Add test for unreachable code after a return.
We want to allow code like:
```
void foo() {
a = ...;
...
return; // for debugging
<use of a>;
...
}
```
I added a test to make sure that something like this is still accepted
by the validator.
* Add test for unreachable def used in phi.
Upgrade to VulkanKHR memory model
* Converts Logical GLSL450 memory model to Logical VulkanKHR
* Adds extension and capability
* Removes deprecated decorations and replaces them with appropriate
flags on downstream instructions
* Support for Workgroup upgrades
* Support for copy memory
* Adding support for image functions
* Adding barrier upgrades and tests
* Use QueueFamilyKHR scope instead of device
* Move ProcessFunction* function from pass to the context.
There are a few functions that are used to traverse the call tree.
They currently live in the Pass class, but they have nothing to do with
a pass, and may be needed outside of a pass. They would be better in
the ir context, or in a specific call tree class if we ever have a need
for it.
* Don't inline recursive functions.
Inlining does not check if a function is recursive or not. This has
been fine as long as the shader was a Vulkan shader, which forbid
recursive functions. However, not all shaders are vulkan, so either
we limit inlining to Vulkan shaders or we teach it to look for recursive
functions.
I prefer to keep the passes as general as is reasonable. The change
does not require much new code in inlining and gives a reason to refactor
some other code.
The changes are to add a member function to the Function class that
checks if that function is recursive or not.
Then this is used in inlining to not inlining a function call if it calls
a recursive function.
* Add id to function analysis
There are a few places that build a map from ids to Function whose
result is that id. I decided to add an analysis to the context for this
to reduce that code, and simplify some of the functions.
* Add missing file.