mirror of
https://github.com/KhronosGroup/SPIRV-Tools
synced 2025-01-13 01:40:14 +00:00
215 lines
9.1 KiB
Markdown
# SPIR-V Assembly language syntax

## Overview

The assembly attempts to adhere to the binary form as closely as possible,
using text names from section 3 of the SPIR-V spec.

Here is an example:

```
OpCapability Shader
OpMemoryModel Logical Simple
OpEntryPoint GLCompute %3 "main"
OpExecutionMode %3 LocalSize 64 64 1
OpTypeVoid %1
OpTypeFunction %2 %1
OpFunction %1 %3 None %2
OpLabel %4
OpReturn
OpFunctionEnd
```

A module is a sequence of instructions, separated by whitespace.
An instruction is an opcode name followed by operands, separated by
whitespace. Typically each instruction is presented on its own line,
but the assembler does not enforce this rule.

The opcode names and expected operands are described in section 3 of
the SPIR-V specification. An operand is one of:
* a literal integer: a decimal integer, or a hexadecimal integer
  (indicated by a leading `0x`).
* a literal floating point number.
* a literal string, surrounded by double-quotes (`"`). TODO: describe quoting and
  escaping rules.
* a named enumerated value, specific to that operand position. For example,
  `OpMemoryModel` takes a named Addressing Model operand (e.g. `Logical` or
  `Physical32`), and a named Memory Model operand (e.g. `Simple` or `OpenCL`).
  Named enumerated values are only meaningful in specific positions, and will
  otherwise generate an error.
* a mask expression, consisting of one or more mask enum names separated
  by `|`. For example, the expression `NotNaN|NotInf|NSZ` denotes the mask
  which is the combination of the `NotNaN`, `NotInf`, and `NSZ` flags.
* an injected immediate integer: `!<integer>`. See [below](#immediate).
* an ID, e.g. `%foo`. See [below](#id).

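
The operand kinds listed above can be told apart mostly by their first character. The following Python sketch is illustrative only and is not how the SPIRV-Tools assembler is implemented: `classify_operand`, `eval_mask`, and the `FP_FAST_MATH_BITS` table are hypothetical names, and the table is a hand-copied subset of the FP Fast Math Mode bits from the SPIR-V specification.

```python
import re

# Hypothetical subset of the FP Fast Math Mode mask bits (SPIR-V spec).
# A real assembler would take these from its grammar tables.
FP_FAST_MATH_BITS = {"NotNaN": 0x1, "NotInf": 0x2, "NSZ": 0x4}

INTEGER_RE = re.compile(r"-?(0x[0-9a-fA-F]+|\d+)")
FLOAT_RE = re.compile(r"[-+]?(\d+\.\d*|\.\d+|\d+)([eE][-+]?\d+)?")


def classify_operand(token: str) -> str:
    """Return a coarse label for one operand token (illustrative only)."""
    if token.startswith("!"):
        return "immediate"          # injected immediate integer
    if token.startswith("%"):
        return "id"
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        return "string"
    if INTEGER_RE.fullmatch(token):
        return "integer"
    if FLOAT_RE.fullmatch(token):
        return "float"
    if "|" in token:
        return "mask"               # a single mask name looks like an enum here
    return "enum"                   # only meaningful in a specific operand position


def eval_mask(expr: str, bits: dict) -> int:
    """Combine the named flags of a mask expression with bitwise OR."""
    value = 0
    for name in expr.split("|"):
        value |= bits[name]
    return value
```

For example, `eval_mask("NotNaN|NotInf|NSZ", FP_FAST_MATH_BITS)` combines the three flags into the value `0x7`. Whether a bare name is an enum or a one-element mask cannot be decided from the token alone, which is why the real assembler resolves names per operand position.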
## Assignment-oriented Assembly Form
<a name="assignment-form"></a>
The description and examples above describe the Canonical Assembly
Form for SPIR-V assembly language.

We also define the Assignment-oriented Assembly Form (AAF), aimed at improving
the text's readability. In AAF, the `<result-id>` generated by an
instruction is moved to the beginning of that instruction and followed by
an `=` sign. This makes it easier to distinguish definitions from uses
and to locate where a value is defined. So, the above example
can also be written as:

```
OpCapability Shader
OpMemoryModel Logical Simple
OpEntryPoint GLCompute %3 "main"
OpExecutionMode %3 LocalSize 64 64 1
%1 = OpTypeVoid
%2 = OpTypeFunction %1
%3 = OpFunction %1 None %2
%4 = OpLabel
OpReturn
OpFunctionEnd
```

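
The move from canonical form to AAF is mechanical once the position of the `<result-id>` operand is known for each opcode. The Python sketch below is illustrative, not the tool's implementation: `to_assignment_form` and `RESULT_ID_INDEX` are hypothetical names, and the table covers only the opcodes used in the example (the positions follow the operand order shown in the canonical listing).

```python
# Position of the <result-id> among the operands (0-based) for the
# opcodes in the example. Hypothetical, hand-written table; a real tool
# would derive this from the SPIR-V grammar.
RESULT_ID_INDEX = {
    "OpTypeVoid": 0,
    "OpTypeFunction": 0,
    "OpFunction": 1,   # operand 0 is the result type
    "OpLabel": 0,
}


def to_assignment_form(line: str) -> str:
    """Rewrite one canonical-form instruction in assignment-oriented form.

    Instructions whose opcode is not in the table are returned unchanged.
    Assumes one instruction per line and no whitespace inside string literals.
    """
    tokens = line.split()
    opcode, operands = tokens[0], tokens[1:]
    index = RESULT_ID_INDEX.get(opcode)
    if index is None:
        return line
    result_id = operands.pop(index)
    return " ".join([result_id, "=", opcode] + operands)
```

Applied to `OpFunction %1 %3 None %2`, this yields `%3 = OpFunction %1 None %2`, matching the listing above.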
## ID Definitions & Usage
<a name="id"></a>

An ID definition pertains to the `<result-id>` of an instruction, and an ID usage is a
use of an ID as an input to an instruction.

An ID in the assembly language begins with `%` and must be followed by a name
consisting of one or more letters, digits, or underscore characters.

For every ID in the assembly program, the assembler generates a unique number
called the ID's internal number. Each ID reference then translates into its
internal number in the SPIR-V output. Internal numbers are unique within the
compilation unit: no two IDs in the same unit will share internal numbers.

The disassembler generates IDs where the name is always a decimal number
greater than 0.

So the example can be rewritten using more user-friendly names, as follows:
```
OpCapability Shader
OpMemoryModel Logical Simple
OpEntryPoint GLCompute %main "main"
OpExecutionMode %main LocalSize 64 64 1
%void = OpTypeVoid
%fnMain = OpTypeFunction %void
%main = OpFunction %void None %fnMain
%lbMain = OpLabel
OpReturn
OpFunctionEnd
```

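
The mapping from ID names to internal numbers described in this section can be pictured as a small table that hands out a fresh number the first time a name is seen. This is a minimal sketch under assumptions: `IdTable` is a hypothetical name, and numbering sequentially from 1 in order of first appearance is an assumption made for illustration, since the text only guarantees uniqueness.

```python
class IdTable:
    """Assigns each distinct ID name a unique internal number."""

    def __init__(self):
        self._numbers = {}
        self._next = 1  # assumption: numbering starts at 1

    def number_for(self, name: str) -> int:
        """Return the internal number for `name`, allocating one if new."""
        if name not in self._numbers:
            self._numbers[name] = self._next
            self._next += 1
        return self._numbers[name]


table = IdTable()
first_use = table.number_for("%main")    # allocated on first sight
other = table.number_for("%void")        # a different name gets a different number
second_use = table.number_for("%main")   # same name, same number
```

Every later reference to `%main` resolves to the same number, and no two names share one, which is the property the assembler relies on when it writes IDs into the binary.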
## Arbitrary Integers
<a name="immediate"></a>

When writing tests it can be useful to emit an invalid 32-bit word into the
binary stream at arbitrary positions within the assembly. To inject an
arbitrary word into the stream, use the prefix `!`, in the form
`!<integer>`. Here is an example:

```
OpCapability !0x0000FF00
```

Any token in a valid assembly program may be replaced by `!<integer>` -- even
tokens that dictate how the rest of the instruction is parsed. Consider, for
example, the following assembly program:

```
%4 = OpConstant %1 123 456 789 OpExecutionMode %2 LocalSize 11 22 33
OpExecutionMode %3 InputLines
```

The tokens `OpConstant`, `LocalSize`, and `InputLines` may be replaced by random
`!<integer>` values, and the assembler will still assemble an output binary with
three instructions. It will not necessarily be valid SPIR-V, but it will
faithfully reflect the input text.

You may wonder how the assembler recognizes the instruction structure (including
instruction boundaries) in the text with certain crucial tokens replaced by
arbitrary integers. If, say, `OpConstant` becomes a `!<integer>` whose value
differs from the binary representation of `OpConstant` (remember that this
feature is intended for fine-grained control in SPIR-V testing), the assembler
generally has no idea what that value stands for. So how does it know there is
exactly one `<id>` and three number literals following in that instruction,
before the next one begins? And if `LocalSize` is replaced by an arbitrary
`!<integer>`, how does it know to take the next three tokens (instead of zero or
one, both of which are possible without the certainty that `LocalSize`
provided)? The answer is a simple rule governing the parsing of instructions
with `!<integer>` in them:

When a token in the assembly program is a `!<integer>`, that integer value is
emitted into the binary output, and parsing proceeds differently than before:
each subsequent token not recognized as an OpCode is emitted into the binary
output without any checking; when a recognizable OpCode is eventually
encountered, it begins a new instruction and parsing returns to normal. (If a
subsequent OpCode is never found, then this alternate parsing mode handles all
the remaining tokens in the program. If a subsequent OpCode is in an
[assignment form](#assignment-form), the ID preceding it begins a new
instruction.)

The assembler processes the tokens encountered in alternate parsing mode as
follows:

* If the token is a number literal, it outputs that number as one or more words,
  as defined in the SPIR-V specification for Literal Number. The number must
  fit within the unsigned 32-bit range. All formats supported by `strtoul()`
  are accepted.
* If the token is a string literal, it outputs a sequence of words representing
  the string as defined in the SPIR-V specification for Literal String.
* If the token is an ID, it outputs the ID's internal number.
* If the token is another `!<integer>`, it outputs that integer.
* Any other token causes the assembler to quit with an error.

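
The per-token rules above can be sketched in Python as follows. This is an illustration under assumptions, not the assembler's code: `encode_string` and `encode_token` are hypothetical names, the `ids` dictionary stands in for the assembler's ID table, and Python's `int(text, 0)` only approximates `strtoul()` (for instance, it rejects a leading-zero octal form such as `010`). The string packing follows the SPIR-V Literal String definition: UTF-8 bytes, a terminating nul, zero padding to a word boundary, with the earliest bytes in the low-order bits of each word.

```python
import struct


def encode_string(text: str) -> list:
    """Pack a string as nul-terminated UTF-8, padded to whole 32-bit words."""
    data = text.encode("utf-8") + b"\x00"
    data += b"\x00" * (-len(data) % 4)
    return [word for (word,) in struct.iter_unpack("<I", data)]


def encode_token(token: str, ids: dict) -> list:
    """Return the words emitted for one token in alternate parsing mode."""
    if token.startswith("!"):
        return [int(token[1:], 0)]            # another !<integer>
    if token.startswith('"'):
        return encode_string(token[1:-1])     # string literal (no escapes handled)
    if token.startswith("%"):
        # ID: look up, or allocate, its internal number (assumed sequential).
        return [ids.setdefault(token, len(ids) + 1)]
    try:
        value = int(token, 0)                 # number literal
    except ValueError:
        # Enum names and any other token are errors in this mode.
        raise ValueError(f"unexpected token in alternate parsing mode: {token}") from None
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError(f"number does not fit in 32 bits: {token}")
    return [value]
```

For instance, the string literal `"abc"` occupies exactly one word, `0x00636261`, because its three bytes plus the terminating nul fill four bytes.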
Note that this has some interesting consequences, including:

* When an OpCode is replaced by `!<integer>`, the integer value should encode
  the instruction's word count, as specified in the physical-layout section of
  the SPIR-V specification.

* Consecutive instructions may have their OpCode replaced by `!<integer>` and
  still produce valid SPIR-V. For example, `!262187 %1 %2 "abc" !327739 %1 %3 6 %2`
  will successfully assemble into SPIR-V declaring a constant and a
  PrivateGlobal variable.

* Enums (such as `DontInline` or `SubgroupMemory`) are not handled
  by the alternate parsing mode. They must be replaced by `!<integer>` for
  successful assembly.

* The `<result-id>` on the left-hand side of an assignment cannot be a
  `!<integer>`. The `<result-id>` can still be manually controlled if desired
  by using the [Canonical Assembly Form](#assignment-form) or by simply
  expressing the entire instruction as `!<integer>` tokens for its opcode and
  operands.

* The `=` sign cannot be processed by the alternate parsing mode if the OpCode
  following it is a `!<integer>`.

* When replacing a named ID with `!<integer>`, it is possible to generate
  unintentionally valid SPIR-V. If the integer provided happens to equal a
  number generated for an existing named ID, it will result in a reference to
  that named ID being output. This may be valid SPIR-V, contrary to the
  presumed intention of the writer.

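
The integers in the `!262187` and `!327739` example follow the physical layout of an instruction's first word: the word count in the high 16 bits and the opcode number in the low 16 bits. The helper names in this Python sketch are hypothetical; the opcode numbers 43 (`OpConstant`) and 59 (`OpVariable`) come from the SPIR-V specification.

```python
def instruction_word(word_count: int, opcode: int) -> int:
    """Build the first word of an instruction: word count high, opcode low."""
    return (word_count << 16) | opcode


def split_instruction_word(word: int) -> tuple:
    """Recover (word_count, opcode) from an instruction's first word."""
    return word >> 16, word & 0xFFFF


OP_CONSTANT = 43   # OpConstant
OP_VARIABLE = 59   # OpVariable

# 4 words: first word, result type, result id, one word for "abc"
constant_word = instruction_word(4, OP_CONSTANT)
# 5 words: first word, result type, result id, storage class, initializer
variable_word = instruction_word(5, OP_VARIABLE)
```

Here `constant_word` is 262187 and `variable_word` is 327739, the two values used in the example, so each `!<integer>` carries both the opcode and the word count that a consumer of the binary needs to find the next instruction.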
## Notes

* Some enumerants cannot be used by name, because the target instruction
  in which they are meaningful takes an ID reference instead of a literal value.
  For example:
  * Named enumerated value `CmdExecTime` from section 3.30 Kernel
    Profiling Info is used in constructing a mask value supplied as
    an ID for `OpCaptureEventProfilingInfo`. But no other instruction
    has enough context to bring the enumerant names from section 3.30
    into scope.
  * Similarly, the names in section 3.29 Kernel Enqueue Flags are used to
    construct a value supplied as an ID to the Flags argument of
    `OpEnqueueKernel`.
  * Similarly for the names in section 3.25 Memory Semantics.
  * Similarly for the names in section 3.27 Scope.
* Some enumerants cannot be used by name, because they only name values
  returned by an instruction:
  * Enumerants from 3.12 Image Channel Order name possible values returned
    by the `OpImageQueryOrder` instruction.
  * Enumerants from 3.13 Image Channel Data Type name possible values
    returned by the `OpImageQueryFormat` instruction.