This pass tries to fix validation error due to a mismatch of storage classes
in instructions. There is no guarantee that all such error will be fixed,
and it is possible that in fixing these errors, it could lead to other
errors.
Fixes#2430.
Fixes#2452
Swaps priority of handling unreachable merge and continues so that the
back-edge is retained in the case a block is both a loop continue and
loop merge
* Check var pointer capability in ADCE.
* Check var ptr capability for common uniform.
* Check var ptr capability in access chain convert.
Since we want this pass to run even if there are variable pointer on
storage buffers, we had to remove asserts that assumed there were no
variable pointers. The functions with the asserts will now work, it
becomes the responsibility of the callers to deal with the output as
appropriate.
* Single block elimination and variable pointers.
It seems like the code in local single block elimination is able to
handle cases with variable pointers already. This is because the
function `HasOnlySupportedRefs` ensures that variables that feed a
variable pointer are not candidates.
* Single store elimination and variable pointers.
It seems like the code in local single stroe elimination is able to
handle cases with variable pointers already. This is because the
function `FindSingleStoreAndCheckUses` ensures that variables that feed
a variable pointer are not candidates.
* SSA rewriter and variable pointers.
It seems like the code in the two passes that call the SSA rewriter are
able to handle cases with variable pointers already. This is because the
function `HasOnlySupportedRefs` ensures that variables that feed
a variable pointer are not candidates.
Fixes#2458.
Fixes#2456
* When eliminating a structured construct that has an unreachable merge,
replace that unreachable terminator with an appropriate return
* New tests
Fixes#2453
* Enable addition of OpPhi instructions when the loop has multiple
predecessors of the merge due to a break
* This can result in some values no longer dominating their uses
* Track return blocks in structured flow to produce OpPhis that have
multiple undef and non-undef arguments
* New tests to catch the bug
* When a block is predicated, mark the new body as a return if the old
block as already a return
If SPV_EXT_descriptor_indexing is enabled, add check that for a
descriptor-based reference, the descriptor is initialized. Initialization
data is stored in the debug input buffer, added to the length information
already there. This feature must be seperately enabled on the pass
creation routine. NOTE: Currently just supports image references; buffer
references are still TODO.
Adds an optimization pass to remove usages of AtomicCounterMemory
bit. This bit is ignored in Vulkan environments and outright forbidden
in WebGPU ones.
Fixes#2242
In constant propagation, decoration are transfered from the original
expression to the constant that will replace it. This can be wrong
because there are no decorations that apply to constants. We choose to
simply delete the decorations.
Fixes#2441
* Handle back edges better in dead branch elim.
Loop header must have exactly one back edge. Sometimes the branch
with the back edge can be folded. However, it should not be folded
if it removes the back edge.
The code to check this simply avoids folding the branch in the
continue block. That needs to be changed to not fold the back edge,
wherever it is.
At the same time, the branch can be folded if it folds to a branch to
the header, because the back edge will still exist.
Fixes#2391.
* Fix OpDot folding of half float vectors.
The code that folds OpDot does not handle half floats correctly. After
trying to multiple the first components, we get a nullptr because we
don't fold half float values. This nullptr gets passed to the code that
does the addition, and causes an assert.
Fixes#2405.
The types of input and output variables must match for the pipeline. We
cannot see the uses in all of the shader, so dead member
elimination cannot safely change the type of input and output variables.
Add a pass that looks for members of structs whose values do not affects
the output of the shader. Those members are then removed and just
treated like padding in the struct.
* Fixes#2338. Added check for phi node before merging blocks.
* Added functionality to merge blocks A and B even when B starts with OpPhi instructions, by replacing uses of the OpPhi results with the definitions coming from A. Added some tests for this.
* Fixed assertion.
* Remove use of deprecated googletest macro
INSTANTIATE_TEST_CASE_P has been deprecated. We need to use
INSTANTIATE_TEST_SUITE_P instead.
* Remove extra commas from test suites.
When looking at the uses of the result of an instruction, code sinking
assumes that all uses are in a basic block. However, this is not true
if there is a decoration or name for the result of that insturction.
This commit checks for this.
Fixes https://crbug.com/923243.
It is legal, but not generated by any SPIR-V producer: an OpCompositeExtract
with no indexes. This is essentially just a copy of the object, so we
treat them that way. We simply propagate the live variables of the
result to the operand.
Fixes https://crbug.com/919181.
It is legal, but not generated by any SPIR-V producer: an OpAccessChain
with no indexes. This is essentially just a copy of the pointer.
I have decided to treat it like an OpCopyObject. In CheckUses, we
return that it is not okay.
When looking at this I realized that we had code in GetUsedComponents
that cannot be reached. If there is a use in an OpCopyObject the it
will not call GetUsedComponents. I removed that dead code.
Fixes https://crbug.com/918311.
During unrolling a new loop is created, but its ownership is not clear
as it gets passed through the code. Changed something to unique_ptr to
make that clearer.
Fixes#2299.
Fixing other memory leaks at the same time.
Fixes#2296Fixes#2297
In C++, a bit shift of the same size as the type is undefined, but it is
defined in spir-v. When folding those cases, we have to be careful. We
cannot simply do the shift in C++.
Fixes https://crbug.com/917697.
Fixes#2138
* Modf and frexp are upgraded to use the struct version of the
instruction and generate an explicit store whose flags can be upgraded
separately
* Fixed major bug where availability and visibility were reversed for
non-copy memory instructions
* Fixed bug where availability and visibility scope operands were reversed for copy memory
* Upgraded all opt tests to use SPV_ENV_UNIVERSAL_1_3
* Upgrade tests moved into unified tests and removed standalone test
* Handle CompositeInsert with no indices in VDCE
In the spec, there it nothing that forces an OpCompositeInsert to have
an index, but VDCE assumes there is at least 1 in a couple places.
This commit updates VDCE to handle these cases.
* Added additional changes for the new AccelerationStructureNV type.
* Added additional changes for the new AccelerationStructureNV type. Change tabs to space...
* Added additional changes for the new accelerationStructureNV type -- add proper type name.
Fix TypeManager.TypeStrings test:
[----------] 29 tests from TypeManager
[ RUN ] TypeManager.TypeStrings
[ OK ] TypeManager.TypeStrings (7 ms)
When we are predicating the continue target for a loop, it can no longer
be the continue target because it will have a branch that exits the loop
and is not the bach edge. The continue target will have to be the
target of that branch that is still in the loop.
Fixes#2211.
The function `UpdatePhiNodes` was being called inconsistently. In one
case, the cfg had already been updated to include the new edge, and in
another place the cfg was not updated. This caused the function to
miss flagging a block as needing new phi nodes. I picked that the cfg
should not be updated before making the call. I documented it, and
change the call sites to match.
Fixes#2207.
We initially assumed that if the type manager returned the correct id
for the pointee type, that we would get the correct pointer type back,
but that is not true. See the unit test added with this commit. We
need to fall back to the linear search any time we are looking for a
pointer to a type that may not be unique.
At the same time, SROA considered an OpName on a variable to be a use of
the entire variable. That has been fixed.
Fixes#2209.
We currently place the load instructions at the start of the basic block
that dominates all of the loads. If that basic block contains OpPhi
instructions, then this will generate invalid code. We just need to
search for a location that comes after all of the OpPhi instructions.
Fixes#2204.
* Don't fold specialized branchs in loop unswitch
Folding branches can have a lot of special cases, and can be a little
error prone. So I only want it in one place. That will be in dead
branch elimination. I will change loop unswitching to set the branches
that were being folded to have a constant condition. Then subsequent
pass of dead branch elimination will be able to remove the code.
At the same time, I added a check that loop unswitching will not
unswitch a branch with a constant condition. It is not useful to do it
because dead branch elimination will simple fold the branch anyway.
Also it avoid an infinite loop that would other wise be introduced by my
first change.
Fixes#2203.
Loop unswitching is unswitching the conditional branch that creates the
back-edge. In the version of the loop, where the bachedge is not taken,
there is no back-edge. This is what causes the validator to complain.
The solution I will go with will be to now unswitch a condition with a
back-edge. At this time we do not now if loop unswitching is used. We do
not include it in the optimization sets provided, nor is it used in
glslang's set. When there are opportunities and no breaks from the loop,
the loop with either be a single iteration loop, or an infinite loop.
There is no performance advantage to performing loop unswitching in
either of those cases. If there is a break, maintaining structured
control flow will be tricky. Unless we see a clear advantage to handling
these case, I would go with the safer simpler solution.
Fixes#2201.
If there are multiple edges to a basic block, then the ssa rewriter will
create OpPhi instructions with duplicate entries. This is invalid, and
it is fixed in this commit.
Fixes#2202.