The replayer takes an existing sequence of transformations and applies
them to a module. Replaying a sequence of transformations that were
obtained via fuzzing should lead to an identical module to the module
that was fuzzed. Tests have been added to check for this.
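The replay property above can be illustrated with a minimal Python sketch. It assumes a toy module represented as a list of strings and two hypothetical transformations (`add_nop`, `duplicate_first`); none of these names come from the fuzzer itself.

```python
from typing import Callable, List

Module = List[str]
Transformation = Callable[[Module], Module]


def replay(module: Module, transformations: List[Transformation]) -> Module:
    """Apply each recorded transformation, in order, to a copy of the module."""
    result = list(module)
    for transform in transformations:
        result = transform(result)
    return result


# Hypothetical transformations standing in for what a fuzzing run recorded.
def add_nop(module: Module) -> Module:
    return module + ["OpNop"]


def duplicate_first(module: Module) -> Module:
    return module[:1] + module


original = ["OpCapability Shader"]
recorded = [add_nop, duplicate_first]
fuzzed = replay(original, recorded)
```

Replaying `recorded` against `original` a second time should give a module equal to `fuzzed`, which is the kind of check the tests are described as making.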
The current tool can parse basic command-line arguments, but generates
a binary identical to the input binary, since no transformations are
yet implemented.
* Make pointers to logically matching types interchangeable with option.
DXC will be generating code where the function parameters will be a more
generic type than the actual parameter. They should be logically
matching and the decorations of the actual parameter must be a superset
of the decorations of the formal parameter.
We want to accept this code with an option so that spirv-opt can then
inline and fix the type mismatch. We will accept this under a new
option `--before-hlsl-legalization`.
The new option will also imply `relax-logical-pointer` so that HLSL
frontends will need to use just the one more generic option.
Moved the |LogicallyMatches| to the validation state to make it
available in more places. Also added a parameter to have it check the
decorations. I did not do a separate function for the decorations
because checking the decorations involves making sure the types
logically match anyway.
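A rough Python sketch of the check described above: types must match structurally, and when decorations are checked, the actual parameter's decorations must be a superset of the formal parameter's. The `Type` class and `logically_matches` function are hypothetical simplifications, not the validator's |LogicallyMatches|.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Type:
    kind: str                                  # e.g. "float", "struct"
    members: Tuple["Type", ...] = ()
    decorations: FrozenSet[str] = frozenset()


def logically_matches(formal: Type, actual: Type, check_decorations: bool) -> bool:
    """Structural match; optionally require actual decorations to cover formal ones."""
    if formal.kind != actual.kind or len(formal.members) != len(actual.members):
        return False
    if check_decorations and not formal.decorations <= actual.decorations:
        return False
    return all(
        logically_matches(f, a, check_decorations)
        for f, a in zip(formal.members, actual.members)
    )


formal = Type("struct", (Type("float"),))
actual = Type("struct", (Type("float"),), frozenset({"Block"}))
```

Here `logically_matches(formal, actual, True)` holds, while swapping the arguments fails because the formal side would then carry a decoration the actual side lacks.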
Fixes #2535
* SPIR-V 1.4 headers, add SPV_ENV_UNIVERSAL_1_4
* Support --target-env spv1.4 in help for command line tools
* Support asm/dis of UniformId decoration
* Validate UniformId decoration
* Fix version check on instructions and operands
Also register decorations used with OpDecorateId
* Extension lists can differ between enums that match
Example: SubgroupMaskEq vs SubgroupMaskEqKHR
* Validate scope value for Uniform decoration, for SPIR-V 1.4
* More unioning of exts
* Preserve grammar order within an enum value
* 1.4: Validate OpSelect over composites
* Tools default to 1.4
* Add asm/dis test for OpCopyLogical
* 1.4: asm/dis tests for PtrEqual, PtrNotEqual, PtrDiff
* Basic asm/dis test for OpCopyMemory
* Test asm/dis OpCopyMemory with 2-memory access
Add asm/dis tests for OpCopyMemorySized
Requires grammar update to add second optional memory access operand
to OpCopyMemory and OpCopyMemorySized
* Validate one or two memory accesses on OpCopyMemory*
* Check av/vis on CopyMemory source and target memory access
This is a proposed rule. See
https://gitlab.khronos.org/spirv/SPIR-V/issues/413
* Validate operation for OpSpecConstantOp
* Validate NonWritable decoration
Also permit NonWritable on members of UBO and SSBO.
* SPIR-V 1.4: NonWritable can decorate Function and Private vars
* Update optimizer CLI tests for SPIR-V 1.4
* Testing tools: Give expected SPIR-V version in message
* SPIR-V 1.4 validation for entry point interfaces
* Allow only unique interfaces
* Allow all global variables
* Check that all statically used global variables are listed
* new tests
* Add validation fixture CompileFailure
* Add 1.4 validation for pointer comparisons
* New tests
* Validate with image operands SignExtend, ZeroExtend
Since we don't actually know the image texel format, we can't fully
validate. We need more context.
But we can make sure we allow the new image operands in known-good
cases.
* Validate OpCopyLogical
* Recursively checks subtypes
* new tests
* Add SPIR-V 1.4 tests for NoSignedWrap, NoUnsignedWrap
* Allow scalar conditions in 1.4 with OpSelect
* Allows scalar conditions with vector operands
* new tests
* Validate uniform id scope as an execution scope
* Validate the values of memory and execution scopes are valid scope
values
* new test
* Remove SPIR-V 1.4 Vulkan 1.0 environment
* SPIR-V 1.4 requires Vulkan 1.1
* FIX: include string for spvLog
* FIX: validate nonwritable
* FIX: test case suite for member decorate string
* FIX: test case for hlsl functionality1
* Validation test fixture: ease debugging
* Use binary version for SPIR-V 1.4 specific features
* Switch checks based on the SPIR-V version from the target environment
to instead use the version from the binary
* Moved header parsing into the ValidationState_t constructor (where
version based features are set)
* Added new versions of tests that assemble a 1.3 binary and validate a
1.4 environment
* Fix test for update to SPIR-V 1.4 headers
* Fix formatting
* Ext inst lookup: Add Vulkan 1.1 env with SPIR-V 1.4
* Update spirv-val help
* Operand version checks should use module version
Use the module version instead of the target environment version.
* Fix comment about two-access form of OpCopyMemory
Renames the existing flag '--webgpu-mode' to '--vulkan-to-webgpu' for
the Vulkan->WebGPU operation, and adds a new flag '--webgpu-to-vulkan'
for the WebGPU->Vulkan operation.
Currently '--webgpu-to-vulkan' doesn't have any passes associated with
it yet, but further patches will implement them.
Fixes #2495
Fix #2475. Fix #2476.
* Improve reducer algorithm: shrink granularity, remove an early return, no lazy initialization, notify pass if binary is interesting, add comments.
* Add fail-on-validation-error option to fail a reduction if an invalid state is reached; useful for tests.
* Set fail-on-validation-error in tests.
* Improve some documentation comments.
* Add Reducer::AddDefaultReductionPasses so tests (and other library consumers) can add the default reduction passes.
* Add CLIMessageConsumer in test_reduce so we can see messages for tricky tests.
* Remove test RemoveUnreferencedInstructionReductionPassTest_ApplyReduction because it was indirectly testing the reduction algorithm, not the RemoveUnreferencedInstruction pass.
* Tweak tests where needed.
Fix #2396
* Check that initial state is valid. Add kInitialStateInvalid.
* Fix RemoveOpnameAndRemoveUnreferenced test; turns out the original shader is invalid, but we never notice because we don't check this and the reduced shader is valid; fix original shader. Assert reduction status is kComplete.
* Always check return value from `Reducer::Run`.
* Change Reducer::Run to *not* immediately copy the input binary.
Adds an optimization pass to remove usages of AtomicCounterMemory
bit. This bit is ignored in Vulkan environments and outright forbidden
in WebGPU ones.
Fixes #2242
* Fixes #2358. Added to the reducer the ability to remove a function that is not directly called. Factored out some code from the optimizer to help with this.
Fixes #2120
Enhanced the reducer so that it can merge blocks together, leveraging the functionality extracted from the block_merge pass in the optimizer.
During unrolling a new loop is created, but its ownership is not clear
as it gets passed through the code. Changed something to unique_ptr to
make that clearer.
Fixes #2299.
Fixing other memory leaks at the same time.
Fixes #2296. Fixes #2297.
When processing options in a file, it does not have access to the
ValidatorOptions and OptimizerOptions objects, so options that change
those do not work. We just need to pass them in.
Fixes #2219.
* Add OperandToUndefReductionPass.
Fixes #2115.
Also added some tests that are similar to those in OperandToConstantReductionPassTest.
In addition, refactor FindOrCreateGlobalUndef into reduction_util.cpp. Fixes #2184.
Removed many documentation comments that were identical or very similar to the overridden function's documentation comment.
Add a spirv-reduce pass which removes OpName and OpMemberName instructions.
This is useful to enable other reduction passes, e.g. RemoveUnreferencedInstruction may not be able to remove an instruction creating an id whose only usage is an OpName for this id.
Upgrade to VulkanKHR memory model
* Converts Logical GLSL450 memory model to Logical VulkanKHR
* Adds extension and capability
* Removes deprecated decorations and replaces them with appropriate
flags on downstream instructions
* Support for Workgroup upgrades
* Support for copy memory
* Adding support for image functions
* Adding barrier upgrades and tests
* Use QueueFamilyKHR scope instead of device
* Added a reduction pass to replace ids with ids of the same type that dominate them.
* Introduce helper method for querying whether an operand type is an input id.
Adds validator option to specify scalar block layout rules.
Both VK_KHR_relax_block_layout and VK_EXT_scalar_block_layout can be
enabled at the same time. But scalar block layout is as permissive
as relax block layout.
Also, scalar block layout does not require padding at the end of a
struct.
Add test for scalar layout testing ArrayStride 12 on array of vec3s
Cleanup: The internal getSize method does not need a round-up argument,
so remove it.
* Validate the id bound.
Validates that the id bound for the module is not larger than the max id
bound. Also adds an option to set the max id bound. Allows the
optimizer option to set the max id bound to also set the id bound for
the validation run done by the optimizer.
Fixes #2030.
Instead of using the source/table.h methods, this CL switches the stats
tool to use the spvtools::Context class and assign the message consumer
through the public API.
* Create a new entry point for the optimizer
Creates a new struct to hold the options for the optimizer, and creates
an entry point that takes the optimizer options as a parameter.
The old entry point that takes validator options is now deprecated.
The validator options will be one of the optimizer options.
Part of the optimizer options will also be the upper bound on the id bound.
* Add a command line option to set the max value for the id bound. The default is 0x3FFFFF.
* Modify `TakeNextIdBound` to return 0 when the limit is reached.
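A minimal Python sketch of the behaviour described for `TakeNextIdBound`, assuming a hypothetical `Module` class; the exact boundary condition used here (no id is handed out once the bound equals the maximum) is an assumption, not taken from the patch.

```python
class Module:
    """Toy stand-in for a module that tracks its id bound."""

    def __init__(self, id_bound: int, max_id_bound: int = 0x3FFFFF):
        self.id_bound = id_bound
        self.max_id_bound = max_id_bound

    def take_next_id_bound(self) -> int:
        """Return a fresh id, or 0 once the limit is reached."""
        if self.id_bound >= self.max_id_bound:
            return 0
        fresh = self.id_bound
        self.id_bound += 1
        return fresh


module = Module(id_bound=5, max_id_bound=7)
ids = [module.take_next_id_bound() for _ in range(4)]
```

Since 0 is never a valid SPIR-V id, callers can treat a return value of 0 as "out of ids" and stop transforming.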
This forks the testing harness from https://github.com/google/shaderc
to allow testing CLI tools.
New features needed for SPIRV-Tools include:
1- A new PlaceHolder subclass for spirv shaders. This place holder
calls spirv-as to convert assembly input into SPIRV bytecode. This is
required for most tools in SPIRV-Tools.
2- A minimal testing file for testing basic functionality of spirv-opt.
Add tests for all flags in spirv-opt.
1. Adds tests to check that known flags match the names that each pass
advertises.
2. Adds tests to check that -O, -Os and --legalize-hlsl schedule the
expected passes.
3. Adds more functionality to Expect classes to support regular
expression matching on stderr.
4. Add checks for integer arguments to optimization flags.
5. Fixes#1817 by modifying the parsing of integer arguments in
flags that take them.
6. Fixes -Oconfig file parsing (#1778). It reads every line of the file
into a string and then parses that string by tokenizing every group of
characters between whitespaces (using the standard cin reading
operator). This mimics shell command-line parsing, but it does not
support quoting (and I'm not planning to).
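The tokenization described in item 6 can be sketched in a few lines of Python. `parse_oconfig` is a hypothetical helper name; the real parser is C++ and may handle details (such as comments) that this sketch ignores.

```python
from typing import List


def parse_oconfig(text: str) -> List[str]:
    """Split every line on runs of whitespace; quoting is not supported."""
    flags: List[str] = []
    for line in text.splitlines():
        flags.extend(line.split())
    return flags


config = "-O\n--loop-unroll   --loop-fission\n\n--time-report\n"
flags = parse_oconfig(config)
quoted = parse_oconfig('"a b"')
```

Because quotes are not interpreted, the text `"a b"` is split into two tokens that still contain the quote characters.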
The code in source/message was only used in a single set of tests to
format the output results. This CL changes the test to verify the
message instead of all the error values and removes the source/message
code.
* Run the validator in the optimization fuzzers.
The optimizer assumes that its input is valid. Since
the fuzzers do not check that the input is valid before passing the
SPIR-V to the optimizer, we are getting a few errors.
The solution is to run the validator in the optimizer to validate the
input.
For the legalization passes, we need to add an extra option to the
validator to accept certain types of variable pointers, even if the
capability is not given. At the same time, we changed the option
"--legalize-hlsl" to relax the validator in the same way instead of
turning it off.
* Combines OpAccessChain, OpInBoundsAccessChain, OpPtrAccessChain and
OpInBoundsPtrAccessChain
* New folding rule to fold add with 0 for integers
* Converts to a bitcast if the result type does not match the operand
type
This re-implements the -Oconfig=<file> flag to use a new API that takes
a list of command-line flags representing optimization passes.
This moves the processing of flags that create new optimization passes
out of spirv-opt and into the library API. Useful for other tools that
want to incorporate a facility similar to -Oconfig.
The main changes are:
1- Add a new public function Optimizer::RegisterPassesFromFlags. This
takes a vector of strings. Each string is assumed to have the form
'--pass_name[=pass_args]'. It creates and registers into the pass
manager all the passes specified in the vector. Each pass is
validated internally. Failure to create a pass instance causes the
function to return false and a diagnostic is emitted to the
registered message consumer.
2- Re-implements -Oconfig in spirv-opt to use the new API.
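A Python sketch of the flag handling described in item 1, under stated assumptions: `split_flag` and `register_passes_from_flags` are hypothetical names, the set of known passes is passed in explicitly, and the diagnostic text is invented for illustration. It only mirrors the described contract (parse '--pass_name[=pass_args]', return false and report on failure).

```python
from typing import Callable, List, Optional, Set, Tuple


def split_flag(flag: str) -> Optional[Tuple[str, str]]:
    """Split '--pass_name[=pass_args]' into (pass_name, pass_args)."""
    if not flag.startswith("--"):
        return None
    name, _, args = flag[2:].partition("=")
    return (name, args) if name else None


def register_passes_from_flags(
    flags: List[str],
    known_passes: Set[str],
    registered: List[Tuple[str, str]],
    consumer: Callable[[str], None],
) -> bool:
    """Register every pass named in flags; stop and report on the first bad flag."""
    for flag in flags:
        parsed = split_flag(flag)
        if parsed is None or parsed[0] not in known_passes:
            consumer(f"unknown or malformed flag: {flag}")
            return False
        registered.append(parsed)
    return True


registered: List[Tuple[str, str]] = []
messages: List[str] = []
ok = register_passes_from_flags(
    ["--loop-unroll", "--example-limit=3"],      # second flag is hypothetical
    {"loop-unroll", "example-limit"},
    registered,
    messages.append,
)
```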
Currently the utils/ folder uses both spvutils:: and spvtools::utils.
This CL changes the namespace to consistently be spvtools::utils to match
the rest of the codebase.
Fixes #937
Stop std140/430 validation when runtime array is encountered.
Check for standard uniform/storage buffer layout instead of std140/430.
Added validator command line switch to skip block layout checking.
Validate structs decorated as Block/BufferBlock only when they
are used as variable with storage class of uniform or push
constant.
Expose --relax-block-layout to command line.
dneto0 modification:
- Use integer arithmetic instead of floor.
Add SPV_ENV_WEBGPU_0 for work-in-progress WebGPU.
val: Disallow OpUndef in WebGPU env
Silence unused variable warnings when !defined(SPIRV_EFFCE)
Limit visibility of validate_instruction.cpp's symbols
Only InstructionPass needs to be visible so all other functions are put
in an anonymous namespace inside the libspirv namespace.
[val] Add extra context to error messages.
This CL extends the error messages produced by the validator to output the
disassembly of the errored line.
The validation_id messages have also been updated to print the line number of
the error instead of the word number. Note, the error number is from the start
of the SPIR-V, it does not include any headers printed in the disassembled code.
Fixes #670, #1581
Removes the limit on scalar replacement for the legalization passes.
This is done by adding an option to the pass (and command line option)
to set the limit on maximum size of the composite that scalar
replacement is willing to divide.
Fixes #1494.
We have already disabled common uniform elimination because it created
sequences that load an entire uniform object and then extract just a
single element. This caused problems in some drivers, and is just
generally slow because it loads more memory than needed.
However, there are other ways to get into this situation, so I've added
a pass that looks specifically for this pattern and removes it when only
a portion of the load is used.
Fixes #1547.
This pass will look for adjacent loops that are compatible and legal to
be fused.
Loops are compatible if:
- they both have one induction variable
- they have the same upper and lower bounds
- same initial value
- same condition
- they have the same update step
- they are adjacent
- there are no break/continue in either of them
Fusion is legal if:
- fused loops do not have any dependencies with dependence distance
greater than 0 that did not exist in the original loops.
- there are no function calls in the loops (could have side-effects)
- there are no barriers in the loops
It will fuse all such loops as long as the number of registers used for
the fused loop stays under the threshold defined by
max_registers_per_loop.
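The compatibility conditions listed above can be sketched as a simple predicate in Python. `LoopSummary` and `are_compatible` are hypothetical names; the legality checks (dependence distances, function calls, barriers) and the register threshold are deliberately not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopSummary:
    induction_variable_count: int
    lower_bound: int
    upper_bound: int
    initial_value: int
    condition: str
    step: int
    has_break_or_continue: bool = False

    def shape(self):
        """The properties that must be identical in both loops."""
        return (self.lower_bound, self.upper_bound, self.initial_value,
                self.condition, self.step)


def are_compatible(first: LoopSummary, second: LoopSummary, adjacent: bool) -> bool:
    return (
        adjacent
        and first.induction_variable_count == 1
        and second.induction_variable_count == 1
        and first.shape() == second.shape()
        and not first.has_break_or_continue
        and not second.has_break_or_continue
    )


a = LoopSummary(1, 0, 10, 0, "<", 1)
b = LoopSummary(1, 0, 10, 0, "<", 1)
c = LoopSummary(1, 0, 20, 0, "<", 1)
```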
Adds support for splitting loops whose register pressure exceeds a user
provided level. This pass will split a loop into two or more loops given
that the loop is a top level loop and that splitting the loop is legal.
Control flow is left intact for dead code elimination to remove.
This pass is enabled with the --loop-fission flag to spirv-opt.
Introduce a pass that does a DCE type analysis for vector elements
instead of the whole vector as a single element.
It will then replace instructions whose values are not used with something else.
For example, an instruction whose value is not used, even though it is
referenced, is replaced with an OpUndef.
For each function, the pass walks the loops from the innermost to the outermost loop
and tries to peel loops for which a certain number of iterations can be done before or after the loop.
To limit code growth, peeling will not happen if the growth in code size goes above a configurable threshold.
The SPIR-V generated from HLSL code contains many copies of very large
arrays. Not only are these time-consuming, but they also cause
problems for drivers because they require too much space.
To work around this, we will implement an array copy propagation. Note
that we will not implement a complete array data flow analysis in order
to implement this. We will be looking for very simple cases:
1) The source must never be stored to.
2) The target must be stored to exactly once.
3) The store to the target must be a store to the entire array, and be a
copy of the entire source.
4) All loads of the target must be dominated by the store.
The hard part is keeping all of the types correct. We do not want to
have to do too large a search to update everything, which may not be
possible, so we give up if we see any instruction that might be hard to
update.
Also in types.h, the element decorations are now stored in an std::map.
This change was done so the hashing algorithm for a Struct is
consistent. With the std::unordered_map, the traversal order was
non-deterministic leading to the same type getting hashed to different
values. See |Struct::GetExtraHashWords|.
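To illustrate why traversal order matters for the hash, here is a small Python sketch. `extra_hash_words` is a hypothetical analogue of the C++ function: it visits element decorations in sorted key order, the order an ordered map provides, so two equal types always produce the same hash words. The decoration tuples are arbitrary example values.

```python
from typing import Dict, List, Tuple


def extra_hash_words(element_decorations: Dict[int, List[Tuple[int, ...]]]) -> List[int]:
    """Flatten decorations into hash words, visiting member indices in sorted order."""
    words: List[int] = []
    for index in sorted(element_decorations):
        words.append(index)
        for decoration in element_decorations[index]:
            words.extend(decoration)
    return words


# The same decorations inserted in two different orders.
first = {1: [(35, 4)], 0: [(35, 0)]}
second = {0: [(35, 0)], 1: [(35, 4)]}
```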
Contributes to #1416.
This patch adds a new option --time-report to spirv-opt. For each pass
executed by spirv-opt, the flag prints resource utilization for the pass
(CPU time, wall time, RSS and page faults)
This fixes issue #1378
This pass replaces the load/store elimination passes. It implements the
SSA re-writing algorithm proposed in
Simple and Efficient Construction of Static Single Assignment Form.
Braun M., Buchwald S., Hack S., Leißa R., Mallon C., Zwinkau A. (2013)
In: Jhala R., De Bosschere K. (eds)
Compiler Construction. CC 2013.
Lecture Notes in Computer Science, vol 7791.
Springer, Berlin, Heidelberg
https://link.springer.com/chapter/10.1007/978-3-642-37051-9_6
In contrast to common eager algorithms based on dominance and dominance
frontier information, this algorithm works backwards from load operations.
When a target variable is loaded, it queries the variable's reaching
definition. If the reaching definition is unknown at the current location,
it searches backwards in the CFG, inserting Phi instructions at join points
in the CFG along the way until it finds the desired store instruction.
The algorithm avoids repeated lookups using memoization.
For reducible CFGs, which are a superset of the structured CFGs in SPIRV,
this algorithm is proven to produce minimal SSA. That is, it inserts the
minimal number of Phi instructions required to ensure the SSA property, but
some Phi instructions may be dead
(https://en.wikipedia.org/wiki/Static_single_assignment_form).
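A simplified Python sketch of the backwards search described above. It assumes the whole CFG is known up front, that every block is reachable from the entry, and that only the last definition in each block is recorded. It omits the trivial-Phi removal step from the paper, so it may create redundant Phis. `SSABuilder` and its method names are illustrative, not the pass's API.

```python
from typing import Dict, List, Tuple


class SSABuilder:
    def __init__(self, predecessors: Dict[str, List[str]]):
        self.predecessors = predecessors                    # block -> predecessors
        self.current_def: Dict[Tuple[str, str], str] = {}   # memoized reaching defs
        self.phis: Dict[str, List[str]] = {}                # phi name -> operands

    def write_variable(self, variable: str, block: str, value: str) -> None:
        self.current_def[(variable, block)] = value

    def read_variable(self, variable: str, block: str) -> str:
        key = (variable, block)
        if key in self.current_def:          # already known here: no search needed
            return self.current_def[key]
        preds = self.predecessors[block]
        if not preds:                        # entry block without a definition
            value = "undef"
        elif len(preds) == 1:                # no join point: ask the predecessor
            value = self.read_variable(variable, preds[0])
        else:                                # join point: insert a Phi
            value = f"phi_{variable}_{block}"
            self.write_variable(variable, block, value)     # breaks cycles in loops
            self.phis[value] = [self.read_variable(variable, p) for p in preds]
        self.write_variable(variable, block, value)
        return value


loop = SSABuilder({"entry": [], "header": ["entry", "body"], "body": ["header"]})
loop.write_variable("i", "entry", "i0")
loop.write_variable("i", "body", "i1")
reaching = loop.read_variable("i", "header")
```

Recording the Phi as the block's definition before visiting the predecessors is what lets the search terminate on loops.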
We are seeing shaders that have multiple returns in a function. These
functions must get inlined for legalization purposes; however, the
inliner does not know how to inline functions that have multiple
returns.
The solution we will go with is to improve the merge return pass to
handle structured control flow.
Note that the merge return pass will assume the CFG has been cleaned up
by dead branch elimination.
Fixes #857.
Strips reflection info. This is limited to decorations and
decoration instructions related to the SPV_GOOGLE_hlsl_functionality1
extension.
It will remove the OpExtension for SPV_GOOGLE_hlsl_functionality1.
It will also remove the OpExtension for SPV_GOOGLE_decorate_string
if there are no further remaining uses of OpDecorateStringGOOGLE.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1398
The default target is SPIR-V 1.3.
For example, spirv-as will generate a SPIR-V 1.3 binary by default.
Use command line option "--target-env spv1.0" if you want to make a SPIR-V
1.0 binary or validate against SPIR-V 1.0 rules.
Example:
# Generate a SPIR-V 1.0 binary instead of SPIR-V 1.3
spirv-as --target-env spv1.0 a.spvasm -o a.spv
spirv-as --target-env vulkan1.0 a.spvasm -o a.spv
# Validate as SPIR-V 1.0.
spirv-val --target-env spv1.0 a.spv
# Validate as Vulkan 1.0
spirv-val --target-env vulkan1.0 a.spv
Use indirection through latest_version_spirv.h
Also, when generating enum tables, use the unified1 JSON grammar since
it now has FragmentFullyCoveredEXT but the other JSON grammars don't.
They are starting to fall behind.
It hoists out of the loop all conditional branches and switches whose
conditions are loop invariant and uniform. Before performing the loop
unswitch we check that the loop does not contain any instruction that
would prevent it (barriers, group instructions etc.).
This patch adds initial support for loop unrolling in the form of a
series of utility classes which perform the unrolling. The pass can
be run with the command spirv-opt --loop-unroll. This will unroll
loops within the module which have the unroll hint set. The unroller
imposes a number of requirements on the loops it can unroll. These are
documented in the comments for the LoopUtils::CanPerformUnroll method in
loop_utils.h. Some of the restrictions will be lifted in future patches.
Implementation of the simplification pass.
- Create pass that calls the instruction folder on each instruction and
propagates instructions that fold to a copy. This will do copy
propagation as well.
- Did not use the propagator engine because I want to modify the instruction
as we go along.
- Change folding to not allocate new instructions, but make changes in
place. This change had a big impact on compile time.
- Add simplification pass to the legalization passes in place of
insert-extract elimination.
- Added test cases for new folding rules.
- Added tests for the simplification pass
- Added a method to the CFG to apply a function to the basic blocks in
reverse post order.
Contributes to #1164.
Creates a pass that will remove instructions that are invalid for the
current shader stage. For the instruction to be considered for replacement
1) The opcode must be valid for a shader module.
2) The opcode must be invalid for the current shader stage.
3) All entry points to the module must be for the same shader stage.
4) The function containing the instruction must be reachable from an entry point.
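The four conditions can be read as a single predicate, sketched here in Python. `should_replace` and the tiny `VALID_STAGES` table are hypothetical; the table is a toy stand-in for the per-opcode stage information the pass would consult.

```python
from typing import Dict, Iterable, Set

# Toy table: shader opcode -> stages in which it is valid (illustrative only).
VALID_STAGES: Dict[str, Set[str]] = {
    "OpDPdx": {"Fragment"},
    "OpIAdd": {"Vertex", "Fragment"},
}


def should_replace(
    opcode: str,
    entry_point_stages: Iterable[str],
    reachable_from_entry_point: bool,
    valid_stages: Dict[str, Set[str]] = VALID_STAGES,
) -> bool:
    if opcode not in valid_stages:           # 1) must be a shader opcode
        return False
    stages = set(entry_point_stages)
    if len(stages) != 1:                     # 3) all entry points share one stage
        return False
    stage = next(iter(stages))
    if stage in valid_stages[opcode]:        # 2) must be invalid for that stage
        return False
    return reachable_from_entry_point        # 4) must be reachable
```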
Fixes #1247.
* Handles simple cases only
* Identifies phis in blocks with two predecessors and attempts to
convert the phi to a select
* does not perform code motion currently so the converted values must
dominate the join point (e.g. can't be defined in the branches)
* limited for now to two predecessors, but can be extended to handle
more cases
* Adding if conversion to -O and -Os
We have come across a driver bug where an OpUnreachable inside a loop
is causing the shader to go into an infinite loop. This commit will try
to avoid this bug by turning OpUnreachable instructions that are
contained in a loop into branches to the loop merge block.
This is not added to "-O" and "-Os" because it should only be used if
the driver being targeted has this problem.
Fixes #1209.
In HLSL structured buffer legalization, pointer to pointer types
are emitted to indicate a structured buffer variable should be
treated as an alias of some other variable. We need an option to
relax the check of pointer types in logical addressing mode to
catch other validation errors.
Turn `Linker::Link()` into free functions
As very little information was kept in the Linker class, we can get rid
of the whole class and have the `Link()` as free functions instead; the
environment target as well as the consumer are passed along through an
`spv_context` object.
The resulting linked_binary is passed as a pointer rather than a
reference to follow the Google C++ Style guidelines.
Addresses remaining comments from
https://github.com/KhronosGroup/SPIRV-Tools/pull/693 about the SPIR-V
linker.
Fix variable naming in the linker
Some of the variables were using mixed case, which did not follow the
Google C++ Style guidelines.
Linker: Use EXPECT_EQ when possible and update some test
* Replace occurrences of ASSERT_EQ by EXPECT_EQ when possible;
* Reformulated some of the error messages;
* Added the symbol name in the error message when there is a type or
decoration mismatch between the imported and exported declarations.
Opt: List all duplicates removed by RemoveDuplicatePass in the header
Opt: Make the const version of GetLabelInst() return a pointer
For consistency with the non-const version, as well as other similar
functions.
Opt: Rename function_end to EndInst()
As pointed out by dneto0 the previous name was quite confusing and could
be mistaken with a function returning an end iterator.
Also change the return type of the const version to a pointer rather
than a reference, for consistency.
Opt: Add performance comment to RemoveDuplicateTypes and decorations
This comment was requested during the review of
https://github.com/KhronosGroup/SPIRV-Tools/pull/693.
Opt: Add comments and fix variable naming in RemoveDuplicatePass
* Add missing comments to private functions;
* Rename variables that were using mixed case;
* Add TODO for moving AreTypesEqual out.
Linker: Remove commented out code and add TODOs
Linker: Merge strings that were split too much
Implement a C++ RAII wrapper around spv_context
Adds optimizer API to write disassembly to a given output stream
before each pass, and after the last pass.
Adds spirv-opt --print-all option to write disassembly to stderr
before each pass, and after the last pass.
This implements the conditional constant propagation pass proposed in
Constant propagation with conditional branches,
Wegman and Zadeck, ACM TOPLAS 13(2):181-210.
The main logic resides in CCPPass::VisitInstruction. Instructions that
may produce a constant value are evaluated with the constant folder. If
they produce a new constant, the instruction is considered interesting.
Otherwise, it's considered varying (for unfoldable instructions) or
just not interesting (when not enough operands have a constant value).
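A small Python sketch of the three outcomes described for instruction visits. The `visit_instruction` function, the `FOLDERS` table, and the string statuses are hypothetical simplifications; the real pass works on the IR and uses the constant folder and constant manager.

```python
import operator
from typing import Callable, Dict, List

# Toy constant folder: opcodes this sketch knows how to fold.
FOLDERS: Dict[str, Callable[[int, int], int]] = {
    "OpIAdd": operator.add,
    "OpIMul": operator.mul,
}


def visit_instruction(
    result_id: str, opcode: str, operand_ids: List[str], constants: Dict[str, int]
) -> str:
    """Classify an instruction and record its value if it folds to a constant."""
    folder = FOLDERS.get(opcode)
    if folder is None:
        return "varying"                     # unfoldable instruction
    if any(operand not in constants for operand in operand_ids):
        return "not interesting"             # not enough constant operands yet
    constants[result_id] = folder(*(constants[operand] for operand in operand_ids))
    return "interesting"                     # produced a new constant


constants = {"%a": 2, "%b": 3}
status_add = visit_instruction("%c", "OpIAdd", ["%a", "%b"], constants)
status_load = visit_instruction("%d", "OpLoad", ["%p"], constants)
status_mul = visit_instruction("%e", "OpIMul", ["%c", "%x"], constants)
```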
The other main piece of logic is in CCPPass::VisitBranch. This
evaluates the selector of the branch. When it's found to be a known
value, it computes the destination basic block and sets it. This tells
the propagator which branches to follow.
The patch required extensions to the constant manager as well. Instead
of hashing the Constant pointers, this patch changes the constant pool
to hash the contents of the Constant. This allows the lookups to be
done using the actual values of the Constant, preventing duplicate
definitions.
* changed the way duplicate types are removed to stop copying
instructions
* Reworked RemoveDuplicatesPass::AreTypesSame to use type manager and
type equality
* Reworked TypeManager memory management to store a pool of unique
pointers of types
* removed unique pointers from id map
* fixed instances where freed memory could be accessed
Changes the set of optimizations done for legalization. While doing
this, I added documentation to explain why we want each optimization.
A new option "--legalize-hlsl" is added so the legalization passes can
be easily run from the command line.
The legalize option implies skip-validation.