diff --git a/syntax.md b/syntax.md
index be6cb189e..28b516c23 100644
--- a/syntax.md
+++ b/syntax.md
@@ -2,67 +2,15 @@

 ## Overview

-The assembly attempts to adhere the binary form as closely as possible
-using text names from section 3 of the SPIR-V spec.
+The assembly attempts to adhere to the binary form from Section 3 of the SPIR-V
+spec as closely as possible, with one exception aiming at improving the text's
+readability. The `<result-id>` generated by an instruction is moved to the
+beginning of that instruction and followed by an `=` sign. This allows us to
+distinguish between variable definitions and uses and locate value definitions
+more easily.

 Here is an example:

-```
-OpCapability Shader
-OpMemoryModel Logical Simple
-OpEntryPoint GLCompute %3 "main"
-OpExecutionMode %3 LocalSize 64 64 1
-OpTypeVoid %1
-OpTypeFunction %2 %1
-OpFunction %1 %3 None %2
-OpLabel %4
-OpReturn
-OpFunctionEnd
-```
-
-A module is a sequence of instructions, separated by whitespace.
-An instruction is an opcode name followed by operands, separated by
-whitespace. Typically each instruction is presented on its own line,
-but the assembler does not enforce this rule.
-
-The opcode names and expected operands are described in section 3 of
-the SPIR-V specification. An operand is one of:
-* a literal integer: A decimal integer, or a hexadecimal integer.
-  A hexadecimal integer is indicated by a leading `0x` or `0X`. A hex
-  integer supplied for a signed integer value will be sign-extended.
-  For example, `0xffff` supplied as the literal for an `OpConstant`
-  on a signed 16-bit integer type will be interpreted as the value `-1`.
-* a literal floating point number.
-* a literal string.
-  * A literal string is everything following a double-quote `"` until the
-    following un-escaped double-quote. This includes special characters such as
-    newlines.
-  * A backslash `\` may be used to escape characters in the string. The `\`
-    may be used to escape a double-quote or a `\` but is simply ignored when
-    preceding any other character.
-* a named enumerated value, specific to that operand position. For example,
-the `OpMemoryModel` takes a named Addressing Model operand (e.g. `Logical` or
-`Physical32`), and a named Memory Model operand (e.g. `Simple` or `OpenCL`).
-Named enumerated values are only meaningful in specific positions, and will
-otherwise generate an error.
-* a mask expression, consisting of one or more mask enum names separated
-  by `|`. For example, the expression `NotNaN|NotInf|NSZ` denotes the mask
-  which is the combination of the `NotNaN`, `NotInf`, and `NSZ` flags.
-* an injected immediate integer: `!<integer>`. See [below](#immediate).
-* an ID, e.g. `%foo`. See [below](#id).
-
-## Assignment-oriented Assembly Form
-
-The description and examples from above describe the Canonical Assembly
-Form for SPIR-V assembly language.
-
-We also define the Assignment-oriented Assembly Form, aimed at improving
-the text's readability. In AAF, the `<result-id>` generated by an
-instruction is moved to the beginning of that instruction and followed by
-an `=` sign. This allows us to distinguish between variable definitions
-and uses and locate value definitions more easily. So, the above example
-can also be written as:
-
 ```
 OpCapability Shader
 OpMemoryModel Logical Simple
@@ -76,11 +24,42 @@ can also be written as:
 OpFunctionEnd
 ```

+A module is a sequence of instructions, separated by whitespace.
+An instruction is an opcode name followed by operands, separated by
+whitespace. Typically each instruction is presented on its own line,
+but the assembler does not enforce this rule.
+
+The opcode names and expected operands are described in Section 3 of
+the SPIR-V specification. An operand is one of:
+* a literal integer: A decimal integer, or a hexadecimal integer.
+  A hexadecimal integer is indicated by a leading `0x` or `0X`. A hex
+  integer supplied for a signed integer value will be sign-extended.
+  For example, `0xffff` supplied as the literal for an `OpConstant`
+  on a signed 16-bit integer type will be interpreted as the value `-1`.
+* a literal floating point number.
+* a literal string.
+  * A literal string is everything following a double-quote `"` until the
+    following un-escaped double-quote. This includes special characters such
+    as newlines.
+  * A backslash `\` may be used to escape characters in the string. The `\`
+    may be used to escape a double-quote or a `\` but is simply ignored when
+    preceding any other character.
+* a named enumerated value, specific to that operand position. For example,
+the `OpMemoryModel` takes a named Addressing Model operand (e.g. `Logical` or
+`Physical32`), and a named Memory Model operand (e.g. `Simple` or `OpenCL`).
+Named enumerated values are only meaningful in specific positions, and will
+otherwise generate an error.
+* a mask expression, consisting of one or more mask enum names separated
+  by `|`. For example, the expression `NotNaN|NotInf|NSZ` denotes the mask
+  which is the combination of the `NotNaN`, `NotInf`, and `NSZ` flags.
+* an injected immediate integer: `!<integer>`. See [below](#immediate).
+* an ID, e.g. `%foo`. See [below](#id).
+
 ## ID Definitions & Usage

-An ID definition pertains to the `<result-id>` of an instruction, and ID usage is a
-use of an ID as an input to an instruction.
+An ID _definition_ pertains to the `<result-id>` of an instruction, and ID
+_usage_ is a use of an ID as an input to an instruction.

 An ID in the assembly language begins with `%` and must be followed by a name
 consisting of one or more letters, numbers or underscore characters.
@@ -148,13 +127,11 @@ with `!<integer>` in them:

 When a token in the assembly program is a `!<integer>`, that integer value is
 emitted into the binary output, and parsing proceeds differently than before:
-each subsequent token not recognized as an OpCode is emitted into the binary
-output without any checking; when a recognizable OpCode is eventually
-encountered, it begins a new instruction and parsing returns to normal. (If a
-subsequent OpCode is never found, then this alternate parsing mode handles all
-the remaining tokens in the program. If a subsequent OpCode is in an
-[assignment form](#assignment-form), the ID preceding it begins a new
-instruction.)
+each subsequent token not recognized as an OpCode or a <result-id> is emitted
+into the binary output without any checking; when a recognizable OpCode or a
+<result-id> is eventually encountered, it begins a new instruction and parsing
+returns to normal. (If a subsequent OpCode is never found, then this alternate
+parsing mode handles all the remaining tokens in the program.)

 The assembler processes the tokens encountered in alternate parsing mode as
 follows:
@@ -187,8 +164,7 @@ Note that this has some interesting consequences, including:

 * The `<result-id>` on the left-hand side of an assignment cannot be a
   `!<integer>`. The `<result-id>` can be still be manually controlled if desired
-  by using the [Canonical Assembly Form](#assignment-form) or by simply
-  expressing the entire instruction as `!<integer>` tokens for its opcode and
+  by expressing the entire instruction as `!<integer>` tokens for its opcode and
   operands.

 * The `=` sign cannot be processed by the alternate parsing mode if the OpCode