* Add ComputeDerivativeGroup*NV capabilities to trim capabilities pass.
* Add SPV_NV_compute_shader_derivatives to allow lists
No tests needed for this. The code path is well tested. Just adding new
data.
Constexpr guaranteed no runtime init in addition to const semantics.
Moving all opt/ to constexpr.
Moving all compile-unit statics to anonymous namespaces to uniformize
the method used (anonymous namespace vs static has the same behavior
here AFAIK).
Signed-off-by: Nathan Gauër <brioche@google.com>
An access chain instruction interpretes its index operands as signed.
The composite insert and extract instruction interpret their index
operands as unsigned, so it is not possible to represent a negative
number.
This commit adds a check to the local-access-chain-convert pass to check
for a negative number in the access chain and to not do the conversion.
Fixes#4856
An access chain could have a constant index that is an out of bounds
access. This is valid spir-v, even if it can cause problems at runtime.
However, it is not valid to have an OpCompositeExtract with an out of
bounds access. This means we have to stop local-access-chain-convert
from making that change.
Fixes#4605
* Handle 64-bit integers in local access chain convert
The local access chain convert pass does on run on module that have
64-bit integers, even if they have nothing to to with access chains.
This is very limiting because other passes rely on the access chains
being removed. So this commit will add this functionality to the pass.
* Fix endianness of string literals
To get correct and consistent encoding and decoding of string literals
on big-endian platforms, use spvtools::utils::MakeString and MakeVector
(or wrapper functions) consistently for handling string literals.
- add variant of MakeVector that encodes a string literal into an
existing vector of words
- add variants of MakeString
- add a wrapper spvDecodeLiteralStringOperand in source/
- fix wrapper Operand::AsString to use MakeString (source/opt)
- remove Operand::AsCString as broken and unused
- add a variant of GetOperandAs for string literals (source/val)
... and apply those wrappers throughout the code.
Fixes #149
* Extend round trip test for StringLiterals to flip word order
In the encoding/decoding roundtrip tests for string literals, include
a case that flips byte order in words after encoding and then checks for
successful decoding. That is, on a little-endian host flip to big-endian
byte order and then decode, and vice versa.
* BinaryParseTest.InstructionWithStringOperand: also flip byte order
Test binary parsing of string operands both with the host's and with the
reversed byte order.
Includes:
- Shift to use of spirv-header extinst.nonsemantic.shader grammar.json
- Remove extinst.nonsemantic.vulkan.debuginfo.100.grammar.json
- Enable all optimizations for Shader.DebugInfo
Also fixes scalar replacement to only insert DebugValue after all
OpVariables. This is not necessary for OpenCL.DebugInfo, but it is
for Shader.DebugInfo.
Likewise, fixes Private-to-Local to insert DebugDeclare after all
OpVariables.
Also fixes inlining to handle FunctionDefinition which can show up
after first block if early return processing happens.
Co-authored-by: baldurk <baldurk@baldurk.org>
* Initial support for SPV_KHR_integer_dot_product
- Adds new operand types for packed-vector-format
- Moves ray tracing enums to the end
- PackedVectorFormat is a new optional operand type, so it requires
special handling in grammar table generation.
- Add SPV_KHR_integer_dot_product to optimizer whitelists.
- Pass-through validation: valid cases pass validation
Validation errors are not checked.
- Update SPIRV-Headers
Patch by David Neto <dneto@google.com>
Rebase and minor tweaks by Kevin Petit <kevin.petit@arm.com>
Signed-off-by: David Neto <dneto@google.com>
Signed-off-by: Kevin Petit <kevin.petit@arm.com>
Change-Id: Icb41741cb7f0f1063e5541ce25e5ba6c02266d2c
* format fixes
Change-Id: I35c82ec27bded3d1b62373fa6daec3ffd91105a3
1. DebugValue/DebugDeclare references of load/store must not change
the behaviors of the convert-local-access-chains pass
2. We have to properly set the scope and line information of new
instructions made by the convert-local-access-chains pass
* Check var pointer capability in ADCE.
* Check var ptr capability for common uniform.
* Check var ptr capability in access chain convert.
Since we want this pass to run even if there are variable pointer on
storage buffers, we had to remove asserts that assumed there were no
variable pointers. The functions with the asserts will now work, it
becomes the responsibility of the callers to deal with the output as
appropriate.
* Single block elimination and variable pointers.
It seems like the code in local single block elimination is able to
handle cases with variable pointers already. This is because the
function `HasOnlySupportedRefs` ensures that variables that feed a
variable pointer are not candidates.
* Single store elimination and variable pointers.
It seems like the code in local single stroe elimination is able to
handle cases with variable pointers already. This is because the
function `FindSingleStoreAndCheckUses` ensures that variables that feed
a variable pointer are not candidates.
* SSA rewriter and variable pointers.
It seems like the code in the two passes that call the SSA rewriter are
able to handle cases with variable pointers already. This is because the
function `HasOnlySupportedRefs` ensures that variables that feed
a variable pointer are not candidates.
Fixes#2458.
* Move ProcessFunction* function from pass to the context.
There are a few functions that are used to traverse the call tree.
They currently live in the Pass class, but they have nothing to do with
a pass, and may be needed outside of a pass. They would be better in
the ir context, or in a specific call tree class if we ever have a need
for it.
* Don't inline recursive functions.
Inlining does not check if a function is recursive or not. This has
been fine as long as the shader was a Vulkan shader, which forbid
recursive functions. However, not all shaders are vulkan, so either
we limit inlining to Vulkan shaders or we teach it to look for recursive
functions.
I prefer to keep the passes as general as is reasonable. The change
does not require much new code in inlining and gives a reason to refactor
some other code.
The changes are to add a member function to the Function class that
checks if that function is recursive or not.
Then this is used in inlining to not inlining a function call if it calls
a recursive function.
* Add id to function analysis
There are a few places that build a map from ids to Function whose
result is that id. I decided to add an analysis to the context for this
to reduce that code, and simplify some of the functions.
* Add missing file.
* Copy decorations when creating new ids.
When creating a new value based on an old value, we need to copy the
decorations to the new id. This change does this in 3 places:
1) The variable holding the return value of the function generated by
merge return should get decorations from the function.
2) The results of the OpPhi instructions should get decorations from the
variable they are replacing in the ssa writer.
3) In local access chain convert the intermediate struct (result of
OpCompositeInsert) generated for the store replacement should get its
decorations from the variable being stored to.
Fixes#1787.
In local-access-chain-convert, we replace loads by load the entire
variable, then doing the extract. The extract will have the same value
as the load. However, if the load has a decoration on it, the
decoration is lost because we do not copy any them to the new id.
This is fixed by rewritting the load into the extract and keeping the
same result id.
This change has the effect that we do not call DCEInst on the loads
because the load is not being deleted, but replaced. This could leave
OpAccessChain instructions around that are not used. This is not a
problem for -O and -Os. They run local_single_*_elim passes and then
dead code elimination. The dce will remove the unused access chains,
and the load elimination passes work even if there are unused access
chains. I have added test to them to ensure they will not loss
opportunities.
Fixes#1787.
Currently the IRContext is passed into the Pass::Process method. It is
then up to the individual pass to store the context into the context_
variable. This CL changes the Run method to store the context before
calling Process which no-longer receives the context as a parameter.
This CL moves the files in opt/ to consistenly be under the opt::
namespace. This frees up the ir:: namespace so it can be used to make a
shared ir represenation.
Optimizations should work in the presence of recent
SPV_GOOGLE_decorate_string and SPV_GOOGLE_hlsl_functionality1
SPV_GOOGLE_decorate_string:
- Adds operation OpDecorateStringGOOGLE to decorate an object with decorations
having string operands.
SPV_GOOGLE_hlsl_functionality1:
- Adds HlslSemanticGOOGLE, used to decorate an interface variable with
an HLSL semantic string. Optimizations already preserve those variables
as required because they are interface variables (with uses), independent
of whether they have HLSL decorations.
- Adds HlslCounterBufferGOOGLE, used to associate a buffer with a
counter variable.
Fixes#1391
The current method of removing an instruction is to call ToNop. The
problem with this is that it leaves around an instruction that later
passes will look at. We should just delete the instruction.
In MemPass there is a utility routine called DCEInst. It can delete
essentially any instruction, which can invalidate pointers now that they
are actually deleted. The interface was changed to add a call back that
can be used to update any local data structures that contain
ir::Intruction*.
Re-formatted the source tree with the command:
$ /usr/bin/clang-format -style=file -i \
$(find include source tools test utils -name '*.cpp' -or -name '*.h')
This required a fix to source/val/decoration.h. It was not including
spirv.h, which broke builds when the #include headers were re-ordered by
clang-format.
Replaced representation of uses
* Changed uses from unordered_map<uint32_t, UseList> to
set<pairInstruction*, Instruction*>>
* Replaced GetUses with ForEachUser and ForEachUse functions
* updated passes to use new functions
* partially updated tests
* lots of cleanup still todo
Adding an unique id to Instruction generated by IRContext
Each instruction is given an unique id that can be used for ordering
purposes. The ids are generated via the IRContext.
Major changes:
* Instructions now contain a uint32_t for unique id and a cached context
pointer
* Most constructors have been modified to take a context as input
* unfortunately I cannot remove the default and copy constructors, but
developers should avoid these
* Added accessors to parents of basic block and function
* Removed the copy constructors for BasicBlock and Function and replaced
them with Clone functions
* Reworked BuildModule to return an IRContext owning the built module
* Since all instructions require a context, the context now becomes the
basic unit for IR
* Added a constructor to context to create an owned module internally
* Replaced uses of Instruction's copy constructor with Clone whereever I
found them
* Reworked the linker functionality to perform clones into a different
context instead of moves
* Updated many tests to be consistent with the above changes
* Still need to add new tests to cover added functionality
* Added comparison operators to Instruction
Adding tests for Instruction, IRContext and IR loading
Fixed some header comments for BuildModule
Fixes to get tests passing again
* Reordered two linker steps to avoid use/def problems
* Fixed def/use manager uses in merge return pass
* Added early return for GetAnnotations
* Changed uses of Instruction::ToNop in passes to IRContext::KillInst
Simplifying the uses for some contexts in passes
Each instruction is given an unique id that can be used for ordering
purposes. The ids are generated via the IRContext.
Major changes:
* Instructions now contain a uint32_t for unique id and a cached context
pointer
* Most constructors have been modified to take a context as input
* unfortunately I cannot remove the default and copy constructors, but
developers should avoid these
* Added accessors to parents of basic block and function
* Removed the copy constructors for BasicBlock and Function and replaced
them with Clone functions
* Reworked BuildModule to return an IRContext owning the built module
* Since all instructions require a context, the context now becomes the
basic unit for IR
* Added a constructor to context to create an owned module internally
* Replaced uses of Instruction's copy constructor with Clone whereever I
found them
* Reworked the linker functionality to perform clones into a different
context instead of moves
* Updated many tests to be consistent with the above changes
* Still need to add new tests to cover added functionality
* Added comparison operators to Instruction
* Added an internal option to LinkerOptions to verify merged ids are
unique
* Added a test for the linker to verify merged ids are unique
* Updated MergeReturnPass to supply a context
* Updated DecorationManager to supply a context for cloned decorations
* Reworked several portions of the def use tests in anticipation of next
set of changes