* Fix switch case construct validation
Fixes https://crbug.com/tint/372311599
* Stop using block depth in switch validation and instead use the more
robust structured exit logic from the switch construct
* This is valid because the function has already handled the
additional valid cases for case constructs
* formatting
* Add support for SPV_KHR_compute_shader_derivatives
* Update tests for SPV_KHR_compute_shader_derivatives
---------
Co-authored-by: MagicPoncho <magicponcho@gmail.com>
Ensure that the validator rejects stores to objects of types
`OpTypeImage`, `OpTypeSampler`, `OpTypeSampledImage`,
`OpTypeAccelerationStructureKHR`, and arrays of these types, according
to `VUID-StandaloneSpirv-OpTypeImage-06924`.
Guard the check behind the before_hlsl_legalization option, as
sometimes we may have temporaries or local variables that are expected
to get optimized away.
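
The rule above can be sketched as follows. This is a minimal illustration,
not the validator code: `TypeKind`, `Type`, `IsOpaqueOrArrayOfOpaque` and
`IsStoreAllowed` are hypothetical names, and the type model is reduced to
what the check needs.

```cpp
#include <cassert>

// Hypothetical, simplified type model; not the SPIRV-Tools validator types.
enum class TypeKind {
  Int,
  Image,
  Sampler,
  SampledImage,
  AccelerationStructureKHR,
  Array
};

struct Type {
  TypeKind kind;
  const Type* element;  // Element type for arrays, nullptr otherwise.
};

// Returns true for the opaque types named above and for (nested) arrays of
// them.
bool IsOpaqueOrArrayOfOpaque(const Type& type) {
  const Type* current = &type;
  while (current->kind == TypeKind::Array) {
    assert(current->element != nullptr);
    current = current->element;
  }
  switch (current->kind) {
    case TypeKind::Image:
    case TypeKind::Sampler:
    case TypeKind::SampledImage:
    case TypeKind::AccelerationStructureKHR:
      return true;
    default:
      return false;
  }
}

// The check is skipped when validating before HLSL legalization, because
// such stores may target temporaries that are optimized away later.
bool IsStoreAllowed(const Type& object_type, bool before_hlsl_legalization) {
  if (before_hlsl_legalization) return true;
  return !IsOpaqueOrArrayOfOpaque(object_type);
}
```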
Fixes #4796
Change-Id: Ie035c01c5f94e7bdfc16b5c6c85705f302b7bda3
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
* Validate Stride operand to OpCooperativeMatrix{Load,Store}KHR
The specification requires the Stride operand for the RowMajorKHR and
ColumnMajorKHR layouts.
Signed-off-by: Kevin Petit <kevin.petit@arm.com>
Change-Id: I51084b9b8dedebf9cab7ae25334ee56b75ef0126
* Update source/val/validate_memory.cpp
Co-authored-by: alan-baker <alanbaker@google.com>
* add test to exercise memory layout from spec constant and fix validation
Change-Id: I06d7308c4a2b62d26d69e88e03bfa009a7f8fff3
* format fixes
Change-Id: I9cbabec0ed2172dcd228cc385551cb7a5b79df1a
---------
Signed-off-by: Kevin Petit <kevin.petit@arm.com>
Co-authored-by: alan-baker <alanbaker@google.com>
* Support SPV_KHR_untyped_pointers
Covers:
- assembler
- disassembler
- validator
fix copyright
Validate OpTypeUntypedPointerKHR
* Disallow an untyped pointer in a typed pointer
* Validate capability requirements for untyped pointer
* Allow duplicate untyped pointer declarations
Add round trip tests
Validate OpUntypedVariableKHR
Validate untyped access chains
* Add a test for opcodes that generate untyped pointers
* simplify some checks for operands needing types
* validate OpUntypedAccessChainKHR, OpUntypedInBoundsAccessChainKHR,
OpUntypedPtrAccessChainKHR, OpUntypedInBoundsPtrAccessChainKHR
Unify variable validation
Validate OpCopyMemorySized
* Fix some opcode tests to account for untyped pointers
* Add validation for OpCopyMemorySized for shaders and untyped pointers
* fix up tests
Validate pointer comparisons and bitcast
* Update more helpers
* Fix entry validation to allow OpUntypedVariableKHR
* Validate OpPtrEqual, OpPtrNotEqual and OpPtrDiff
* Validate OpBitcast
Validate atomics and untyped pointers
Make interface variable validation aware of untyped pointers
* Check OpUntypedVariableKHR in interface validation
More untyped pointer validation
* Validate interfaces more thoroughly
* Validate layouts for untyped pointer uses
* Improve capability checks for vulkan with OpTypeUntypedPointerKHR
* workgroup member explicit layout validation updates
More validation
* validate function arguments and parameters
* handle untyped pointer and variable in more places
Add a friendly assembly name for untyped pointers
Update OpCopyMemory validation and tests
Fix test for token update
Fixes for validation
* Allow typed pointers to contain untyped pointers
* Fix decoration validation
* add untyped pointer as a case for size and alignments
Fix interface validation
* Grabbed the wrong storage class operand for untyped variables
* Add ability to specify assembler options in validation tests
Add passthrough validation for OpUntypedArrayLengthKHR
More validation of untyped pointers
* Validate OpUntypedArrayLengthKHR
* Validate layout for OpLoad, OpStore, and OpUntypedArrayLengthKHR
Validation support for cooperative matrix and untyped pointers
* Allow untyped pointers for cooperative matrix KHR load and store
Updates to match spec
* Remove extra capability references
* Swap untyped variable data type and storage class operands
* update validation of variables
* update deps
---------
Co-authored-by: David Neto <dneto@google.com>
* In spirv-val allow format arg to printf to be an array of i8 in Generic space
Signed-off-by: Lu, John <john.lu@intel.com>
* Allow more addr spaces for printf format string
Signed-off-by: Lu, John <john.lu@intel.com>
* Update printf format arg testcase
Signed-off-by: Lu, John <john.lu@intel.com>
* Apply clang-format
Signed-off-by: Lu, John <john.lu@intel.com>
* Reorder code for clarity
Signed-off-by: Lu, John <john.lu@intel.com>
* Only allow other addr spaces if extension is seen
Signed-off-by: Lu, John <john.lu@intel.com>
* Add test to check printf format with extension
Signed-off-by: Lu, John <john.lu@intel.com>
* Add extension correctly
Signed-off-by: Lu, John <john.lu@intel.com>
---------
Signed-off-by: Lu, John <john.lu@intel.com>
The KHR suffix was missing from the published SPIR-V extension.
This is now fixed, but requires some patches in SPIRV-Tools.
Signed-off-by: Nathan Gauër <brioche@google.com>
* val, core: add support for OpExtInstWithForwardRefs
This commit adds validation and support for
OpExtInstWithForwardRefs. This new instruction will be used
for non-semantic debug info, when forward references are
required.
For now, this commit only fixes the code to handle this new instruction,
and adds validation rules. But it does not add the pass to generate/fix
the OpExtInst instruction when forward references are in use.
Such a pass would be useful for DXC or other tools, but I wanted to land
validation rules first.
This commit also bumps SPIRV-Headers to get this new opcode.
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
The Scope operand of `OpReadClockKHR` was always validated using the
Vulkan environment rules, which only allow `Subgroup` or `Device`.
For the OpenCL environment, `Workgroup` is also a valid Scope, so
`Workgroup` should not be rejected in the universal environment.
Guard the existing Scope check behind `spvIsVulkanEnv` and add a new
Scope check for the OpenCL environment.
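
A minimal sketch of the environment-dependent check, for illustration only.
`Env` and `IsReadClockScopeAllowed` are hypothetical names rather than
SPIRV-Tools APIs, and the OpenCL set (the Vulkan set plus `Workgroup`) is an
assumption based on the description above. The `Scope` values are the
SPIR-V enumerant values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical environment tag; the validator uses its own target env type.
enum class Env { Universal, Vulkan, OpenCL };

// SPIR-V Scope enumerants.
enum class Scope : uint32_t {
  CrossDevice = 0,
  Device = 1,
  Workgroup = 2,
  Subgroup = 3,
  Invocation = 4
};

// Returns true if `scope` is accepted for OpReadClockKHR in `env`.
bool IsReadClockScopeAllowed(Env env, Scope scope) {
  switch (env) {
    case Env::Vulkan:
      return scope == Scope::Subgroup || scope == Scope::Device;
    case Env::OpenCL:
      // Assumption: OpenCL accepts the Vulkan scopes plus Workgroup.
      return scope == Scope::Subgroup || scope == Scope::Device ||
             scope == Scope::Workgroup;
    case Env::Universal:
      // No environment-specific restriction is applied in this sketch.
      return true;
  }
  return true;
}
```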
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
* Check for matrix decorations on arrays of matrices
* MatrixStride, RowMajor and ColMajor can be applied to matrix or
arrays of matrix members
* Check that matrix stride satisfies alignment in arrays
Reject `OpCooperativeMatrixStoreKHR` with a `MakePointerVisibleKHR`
MemoryAccess operand, as `MakePointerVisibleKHR` is not supposed to be
used with store operations.
The `CoopMatKHRStoreMemoryAccessFail` test failed to catch this
because it used the helper function `GenCoopMatLoadStoreShader` which
generates `...NV` instead of `...KHR` instructions. Add a new helper
function to generate similar shaders for the KHR extension, as the NV
and KHR extensions have various subtle differences that make
parameterizing the original helper function non-trivial.
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
* Disallow duplicate decorations generally
* Only FuncParamAttr and UserSemantic can be applied to the same target
multiple times
* Unchecked: completely duplicate UserSemantic and FuncParamAttr
* Disallow duplicate execution modes generally
* Exceptions for float controls, float controls2 and some intel
execution modes
* Fix invalid fuzzer transforms
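
The duplicate-decoration rule can be sketched as below. This is a simplified
illustration under stated assumptions: `Decoration`, `IsRepeatable` and
`FindDuplicateDecoration` are hypothetical names, decorations are identified
by name only, and member indices and decoration operands are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record of one decoration applied to one target id.
struct Decoration {
  uint32_t target_id;
  std::string kind;
};

// Only these decorations may be applied to the same target multiple times.
bool IsRepeatable(const std::string& kind) {
  return kind == "FuncParamAttr" || kind == "UserSemantic";
}

// Returns the index of the first disallowed duplicate, or -1 if there is
// none.
int FindDuplicateDecoration(const std::vector<Decoration>& decorations) {
  std::set<std::pair<uint32_t, std::string>> seen;
  for (std::size_t i = 0; i < decorations.size(); ++i) {
    const Decoration& d = decorations[i];
    if (IsRepeatable(d.kind)) continue;
    if (!seen.insert(std::make_pair(d.target_id, d.kind)).second) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```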
This commit adds a forward iterator and renames functions so that the
interface matches std::unordered_set/std::set more closely.
This goes against the SPIR-V coding style, but might be better in
the long run, especially when this set is used alongside real STL
sets.
(Right now, they are not compatible and require two syntaxes.)
This container could in theory support bidirectional
iterators, but for now only forward iteration seemed to be required for
our use cases.
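
To illustrate what a forward iterator over a bitmask-backed set can look
like, here is a minimal sketch. `SmallBitSet` and `ToVector` are
hypothetical names, the set is limited to values 0..63, and none of this is
the actual EnumSet code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical single-word set holding values in [0, 64).
class SmallBitSet {
 public:
  class Iterator {
   public:
    using iterator_category = std::forward_iterator_tag;
    using value_type = uint32_t;
    using difference_type = std::ptrdiff_t;
    using pointer = const uint32_t*;
    using reference = uint32_t;

    Iterator(uint64_t mask, uint32_t pos) : mask_(mask), pos_(pos) { Skip(); }

    uint32_t operator*() const { return pos_; }
    Iterator& operator++() {
      ++pos_;
      Skip();
      return *this;
    }
    Iterator operator++(int) {
      Iterator old = *this;
      ++*this;
      return old;
    }
    bool operator==(const Iterator& other) const {
      return mask_ == other.mask_ && pos_ == other.pos_;
    }
    bool operator!=(const Iterator& other) const { return !(*this == other); }

   private:
    // Advances to the next set bit, or to 64 (the end position).
    void Skip() {
      while (pos_ < 64 && ((mask_ >> pos_) & 1u) == 0) ++pos_;
    }
    uint64_t mask_;
    uint32_t pos_;
  };

  void insert(uint32_t value) {
    assert(value < 64);
    mask_ |= uint64_t{1} << value;
  }
  Iterator begin() const { return Iterator(mask_, 0); }
  Iterator end() const { return Iterator(mask_, 64); }

 private:
  uint64_t mask_ = 0;
};

// Works with range constructors and STL algorithms, like std::set would.
std::vector<uint32_t> ToVector(const SmallBitSet& set) {
  return std::vector<uint32_t>(set.begin(), set.end());
}
```

With `begin()`/`end()` in place, a range-based for loop such as
`for (uint32_t v : set)` visits the values in increasing order.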
Signed-off-by: Nathan Gauër <brioche@google.com>
The current EnumSet implementation is only efficient for enums with
values less than 64. The reason is that values below 64 are stored as a
bitmask in a 64-bit unsigned integer, and the other values are stored
in a std::set.
For small enums, this is fine (most SPIR-V enums have values less than 64),
but performance starts to drop with larger enums (capabilities,
opcodes).
Design considerations:
----------------------
This PR changes the internal behavior of the EnumSet to handle enums
with arbitrary values while staying performant.
The idea is to extend the 64-bits buckets sparsely:
- each bucket can store 64 values, starting from a multiple of 64.
This could be considered as a hashset with linear probing.
- For small enums, there is a slight memory overhead due to the bucket
storage, but lookup is still constant.
- For linearly distributed values, lookup is constant.
- The worst case for storage is an enum whose values are multiples of 64.
But lookup is constant.
- The worst case for lookup is an enum with a lot of small ranges scattered
in the space (requires linear probing).
For enums like capabilities/opcodes, this bucketing is useful as values
are usually scattered in distinct, but almost contiguous blocks.
(vendors usually have allocated ranges, like [5000;5500], while [1000;5000]
is mostly unused).
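
The bucketing idea can be sketched as follows. This is a simplified
illustration, not the EnumSet implementation: `BucketedSet` and its members
are hypothetical names, and buckets are located with a plain linear scan
over a sorted vector rather than the probing scheme used in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sparse set: one 64-bit mask per populated range of 64 values.
class BucketedSet {
 public:
  // Inserts `value`; returns true if it was not already present.
  bool insert(uint32_t value) {
    const uint32_t start = BucketStart(value);
    const std::size_t i = FindSlot(start);
    if (i == buckets_.size() || buckets_[i].start != start) {
      // Allocate a bucket only for ranges that are actually used.
      buckets_.insert(buckets_.begin() + i, Bucket{start, 0});
    }
    const uint64_t bit = BitFor(value);
    const bool added = (buckets_[i].mask & bit) == 0;
    buckets_[i].mask |= bit;
    return added;
  }

  bool contains(uint32_t value) const {
    const uint32_t start = BucketStart(value);
    const std::size_t i = FindSlot(start);
    return i < buckets_.size() && buckets_[i].start == start &&
           (buckets_[i].mask & BitFor(value)) != 0;
  }

  std::size_t bucket_count() const { return buckets_.size(); }

 private:
  struct Bucket {
    uint32_t start;  // Always a multiple of 64.
    uint64_t mask;   // Bit n set means (start + n) is in the set.
  };

  static uint32_t BucketStart(uint32_t value) { return value - (value % 64); }
  static uint64_t BitFor(uint32_t value) { return uint64_t{1} << (value % 64); }

  // Index of the first bucket whose start is >= `start` (linear scan).
  std::size_t FindSlot(uint32_t start) const {
    std::size_t i = 0;
    while (i < buckets_.size() && buckets_[i].start < start) ++i;
    return i;
  }

  std::vector<Bucket> buckets_;  // Sorted by `start`.
};
```

In this sketch, the values 1, 63 and 5000 occupy two buckets (starting at 0
and 4992), while values spaced exactly 64 apart would need one bucket each,
which is the storage worst case described above.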
Benchmarking:
-------------
Benchmarking was done in 2 ways:
- a benchmark built for the occasion, which only measures the EnumSet
performance.
- SPIRV-Tools tests, to measure a more realistic scenario.
Running SPIR-V tests with both implementations shows the same
performance (delta < noise), so there appear to be no regressions.
This method is noisy by nature (I/O, etc), but the most representative
of a real-life scenario.
Protocol:
- run spirv-tests with no stdout using perf, multiple times.
Result:
- measurement noise is larger than the observed difference.
The custom benchmark tested the EnumSet interfaces using SPIR-V enums,
performing thousands of insertions/deletions/lookups in 2 kinds of scenarios:
- add once, lookup many times.
- add/delete/lookup many times.
For small enums, results are similar (delta < noise). This is consistent
with the previously observed results, as most SPIR-V enums are small and
SPIRV-Tools does not perform that many intensive operations on EnumSets.
Performance on large enums (opcode/capabilities) shows an improvement:
+-----------------------------+---------+---------+---------+
| Metric | Old | New | Delta % |
+-----------------------------+---------+---------+---------+
| Execution time | 27s | 7s | -72% |
| Instruction count | 174b | 129b | -25% |
| Branch count | 28b | 33b | +17% |
| Branch miss | 490m | 26m | -94% |
| Cache-misses | 149k | 26k | -82% |
+-----------------------------+---------+---------+---------+
Future work
-----------
This was by design an NFC (no functional change) PR, to allow an
apples-to-apples comparison.
The next PR aims to add STL-like iterators to the EnumSet to allow
using it with STL algorithms, and range-based for loops.
Signed-off-by: Nathan Gauër <brioche@google.com>