Sometimes when debugging or logging, SPIR-V may be dumped as a stream of
hex values. There are tools to convert such a stream to binary
(such as [1]) but they create an inconvenient extra step when for
example the disassembly of that hex stream is needed.
[1]: https://www.khronos.org/spir/visualizer/hexdump.html
In this change, the binary reader used by the tools is enhanced to
detect when the binary is actually a hex stream, and parse that instead.
The following formats are accepted, detected based on how the SPIR-V
magic number is output:
=== Words
If the first token of the hex stream is one of 0x07230203, 0x7230203,
x07230203, or x7230203, the hex stream is expected to consist of 32-bit
hex words prefixed with 0x or x. For example:
0x7230203, 0x10400, 0x180001, 0x79, 0x0
is parsed as:
0x07230203 0x00010400 0x00180001 0x00000079 0x00000000
Note that `,` is optional in the stream, but the hex values are expected
to be delimited by either `,` or whitespace.
=== Bytes With Prefix
If the first token of the hex stream is one of 0x07, 0x7, x07, x7, 0x03,
0x3, x03, or x3, the hex stream is expected to consist of 8-bit hex
bytes prefixed with 0x or x. If the first token has a value of 7, the
stream is big-endian. Otherwise it's little-endian. For example:
0x3, 0x2, 0x23, 0x7, 0x0, 0x4, 0x1, 0x0, 0x1, 0x0, 0x18, 0x0, 0x79, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0
is parsed as:
0x07230203 0x00010400 0x00180001 0x00000079 0x00000000
Similar to "Words", `,` is optional in the stream, but the hex values
are expected to be delimited by either `,` or whitespace.
=== Bytes Without Prefix
If the first two characters of the hex stream is 07, or 03, the hex
stream is expected to consist of 8-bit hex bytes of 2 characters each.
If the first token is 07, the stream is big-endian. Otherwise it's
little-endian. Unlike the other modes, delimiter is optional (which
automatically handles 32-bit word streams), but no 0-padding is done.
For example, all of the following:
03, 02, 23, 07, 00, 04, 01, 00, 01, 00, 18, 00, 79, 00, 00, 00, 00, 00, 00, 00
03 02 23 07 00 04 01 00 01 00 18 00 79 00 00 00 00 00 00 00
03022307 00040100 01001800 79000000 00000000
07,23,02,03,00,01,04,00,00,18,00,01,00,00,00,79,00,00,00,00
07230203, 00010400, 00180001, 00000079, 00000000
are parsed as:
0x07230203 0x00010400 0x00180001 0x00000079 0x00000000
This pass fixups the opcode used for OpExtInst instructions
to use OpExtInstWithForwardRefsKHR when it contains a forward
reference.
This pass is agnostic to the extension used, hence the validity
of the code depends of the validity of the usage:
If a forward reference is used on a non-semantic extended instruction,
the generated code will remain invalid, but the opcode will change.
What this pass guarantees is valid code won't become invalid.
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
Co-authored-by: Steven Perron <stevenperron@google.com>
Add a new legalization pass to dedupe invocation interlock instructions
DXC will be adding support for HLSL's rasterizer ordered views by using
the SPV_EXT_fragment_shader_interlock_extension. That extension
stipulates that if an entry point has an interlock ordering execution
mode, it must dynamically execute OpBeginInvocationInterlockEXT and
OpEndInvocationInterlockEXT, in that order, exactly once. This would be
difficult to determine in DXC's SPIR-V backend, so instead we will emit
these instructions potentially multiple times, and use this legalization
pass to ensure that the final SPIR-V follows the specification.
This PR uses data-flow analysis to determine where to place begin and
end instructions; in essence, determining whether a block contains or is
preceded by a begin instruction is similar to a specialized case of a
reaching definitions analysis, where we have only a single definition,
such as `bool has_begun = false`. For this simpler case, we can compute
the set of blocks using BFS to determine the reachability of the begin
instruction.
We need to do this for both begin and end instructions, so I have
generalized portions of the code to run both forward and backward over
the CFG for each respective case.
* tools: refactorize tools flags parsing.
Each SPIR-V tool was handing their own flag parsing. This PR
adds a new flag parsing utility that should be flexible enough
for our usage, but generic enough so it can be shared across tools while
remaining simple to use.
* cfg: replace cfg option parsing with the new one.
* change spirv-dis parsing + title + summary
* clang format
* flags: fix static init fiasco & remove help
Static initialization order is important, and was
working just by sheer luck.
Also removing the help generation tooling. It's less flexible than the
hand-written string, and making it as-good and as-flexible brings too
much complexity.
* review feedback
The CHANGES file was an alternative source of truth that was trying
to duplicate/replace the git history truth.
This allow us to change the way we handle releases so we don't have to make
sure our CHANGES PR are linked to the tag and tested PR, simplifying the
process.
This adds two passes to accomplish this: one pass to analyze a shader
to determine the input slots that are live. The second pass is run on
the preceding shader to eliminate any stores to output slots that are
not consumed by the following shader.
These passes support vert, tesc, tese, geom, and frag shaders.
These passes are currently only available through the API.
These passes together with dead code elimination, and elimination of
dead input and output components and variables (WIP), will allow users
to do dead code elimination across shader boundaries.
This reverts commit d18d0d92e5.
This is reverted because it causes a 7X slowdown when legalizing
SPIR-V with NonSemantic.Shader.DebugInfo.100 instructions.
This is due to the creation of very large UseLists for several
heavily used operands for this extension combined with the fact
that the original commit changed the performance of Uselists to O(n).
spirv validation require OpFunctionCall with memory object, usually this
is non issue as all the functions are inlined.
This pass deal with some case for
DontInline function. accesschain input operand would be replaced new
created variable
* Fix gen_build_version on Windows
The paths used in the gen_build_version script are causing some failures
in downstream builds. Switch to using the location of the CHANGES file
rather than the ".." parent directory workaround.
* Update other build files
Standalone projects that depend on both SPIRV-Tools and SwiftShader
failed to have Ninja build files generated because `build_with_chromium`
is false for both of them and cause the executables to be declared
twice.
This change causes `build_with_chromium` builds to declare the
executables, but only for the copy in `//third_party`. The path for that
was also corrected to include the `vulkan-deps` and `src`
subdirectories.
Builds from the Fuchsia tree also declare the executable targets.
This change was tested with a Chromium build and a "standalone" Dawn
build.
Bug: b/158002593
GN does not allow multiple build files to declare the same executables,
so for SwiftShader to be able to use its own internal copy of SPIRV-
Tools libraries it needs to exclude the executable targets.
This is achieved by checking if the current path is //third-party/
SPIRV-Tools/.
Bug: b/158002593
Swift shader needs a way to inline all functions, even those marked as
DontInline. See https://github.com/KhronosGroup/SPIRV-Tools/pull/4471.
This implements the suggestion I made in the PR. We add a pass that
will remove the DontInline function control, so that the inlining passes
will inline them.
SwiftShader will still have to modify their code to add this pass before
the other passes are run.
* Optimize DefUseManager allocations
Saves around 30-35% of compilation time.
For inst->use_ids, use a pool linked list instead of allocating vectors for every instruction. For inst->uses, use a "PooledLinkedList"' -- a linked list that has shared storage for all nodes. Neither re-use nodes, instead we do a bulk compaction operation when too much memory is being wasted (tuneable).
Includes separate PooledLinkedList templated datastructure, a very special case construct, but split out to make the code a little easier to understand.
Incrementally compute the hash instead of collecting words
Avoids allocating temporary space in a std::vector and std::u32string, and making three passes over all the hashed data.
Switch to using std::vector to prevent processing duplicates instead of std::unordered_set: avoids an allocation/deletion every call to ComputeHashValue, and ends up faster due to much better cache behaviour and smaller constant-factor when searching the (generally very small) list.
In my test case, made Type::HashValue go from 7.5% of compilation time to .5%
Add a pass to spread Volatile semantics to variables with SMIDNV,
WarpIDNV, SubgroupSize, SubgroupLocalInvocationId, SubgroupEqMask,
SubgroupGeMask, SubgroupGtMask, SubgroupLeMask, or SubgroupLtMask BuiltIn
decorations or OpLoad for them when the shader model is the ray
generation, closest hit, miss, intersection, or callable shaders. This
pass can be used for VUID-StandaloneSpirv-VulkanMemoryModel-04678 and
VUID-StandaloneSpirv-VulkanMemoryModel-04679 (See "Standalone SPIR-V
Validation" section of Vulkan spec "Appendix A: Vulkan Environment for
SPIR-V").
Handle variables used by multiple entry points:
1. Update error check to make it working regardless of the order of
entry points.
2. For a variable, if it is used by two entry points E1 and E2 and
it needs the Volatile semantics for E1 while it does not for E2
- If VulkanMemoryModel capability is enabled, which means we have to
set memory operation of load instructions for the variable, we
update load instructions in E1, but do not update the ones in E2.
- If VulkanMemoryModel capability is disabled, which means we have
to add Volatile decoration for the variable, we report an error
because E1 needs to add Volatile decoration for the variable while
E2 does not.
For the simplicity of the implementation, we assume that all functions
other than entry point functions are inlined.
In https://github.com/KhronosGroup/SPIRV-Tools/pull/3110, the strip reflect
pass was changed to also remove all explicitly nonsemantic instructions. This
makes it so that the name of the pass no longer reflects what the pass actually
does. This change renames the pass so that it reflects what the pass actaully does.
Includes:
- Shift to use of spirv-header extinst.nonsemantic.shader grammar.json
- Remove extinst.nonsemantic.vulkan.debuginfo.100.grammar.json
- Enable all optimizations for Shader.DebugInfo
Also fixes scalar replacement to only insert DebugValue after all
OpVariables. This is not necessary for OpenCL.DebugInfo, but it is
for Shader.DebugInfo.
Likewise, fixes Private-to-Local to insert DebugDeclare after all
OpVariables.
Also fixes inlining to handle FunctionDefinition which can show up
after first block if early return processing happens.
Co-authored-by: baldurk <baldurk@baldurk.org>
convert-to-sampled-image pass converts images and/or samplers with
given pairs of descriptor set and binding to sampled image.
If a pair of an image and a sampler have the same pair of descriptor
set and binding that is one of the given pairs, they will be
converted to a sampled image. In addition, if only an image has the
descriptor set and binding that is one of the given pairs, it will
be converted to a sampled image as well.
For example, when we have
%a = OpLoad %type_2d_image %texture
%b = OpLoad %type_sampler %sampler
%combined = OpSampledImage %type_sampled_image %a %b
%value = OpImageSampleExplicitLod %v4float %combined ...
1. If %texture and %sampler have the same descriptor set and binding
%combine_texture_and_sampler = OpVaraible %ptr_type_sampled_image_Uniform
...
%combined = OpLoad %type_sampled_image %combine_texture_and_sampler
%value = OpImageSampleExplicitLod %v4float %combined ...
2. If %texture and %sampler have different pairs of descriptor set and binding
%a = OpLoad %type_sampled_image %texture
%extracted_image = OpImage %type_2d_image %a
%b = OpLoad %type_sampler %sampler
%combined = OpSampledImage %type_sampled_image %extracted_image %b
%value = OpImageSampleExplicitLod %v4float %combined ...
This PR adds a generic dataflow analysis framework to SPIRV-opt, with the intent of being used in SPIRV-lint. This may also be useful for SPIRV-opt, as existing ad-hoc analyses can be rewritten to use a common framework, but this is not the target of this PR.
Control dependence analysis constructs a control dependence graph,
representing the conditions for a block's execution relative to the
results of other blocks with conditional branches, etc.
This is an analysis pass that will be useful for the linter and
potentially also useful in opt. Currently it is unused except for the
added unit tests.
Some generated headers are exposed by headers in the spvtools_opt
target, but its dependency on them is private. This can result in build
flake, since the headers don't need to be generated before compiling
any spvtools_opt dependents.
This fixes the build flake by correctly expressing these as public
dependencies.
Tint reaches out into SPIRV-Tools BUILD.gn file for some targets:
- Adds spvtools_include_gen_dirs so that Tint can add $target_gen_dir
and find autogenerated files.
- Adds spvtools_language_headers that Tint can reference so it
automatically picks up new language headers that are added.
Adds a new transformation that rewrites a scalar operation (like
OpFAdd, opISub) as an equivalent vector operation, adding a synonym
between the scalar result and an appropriate component of the vector
result.
Fixes#4195.