Some support for subgroups is present starting in Metal 2.0 on both iOS
and macOS. macOS gains more complete support in 10.14 (Metal 2.1).
Some restrictions are present. On iOS and on macOS 10.13, the
implementation of `OpGroupNonUniformElect` is incorrect: if thread 0 has
already terminated or is not executing a conditional branch, the first
thread that *is* executing will wrongly conclude that it was not
elected. Unfortunately, this operation is part of the "basic" feature
set; without it, subgroups cannot be supported at all.
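To make the elect problem concrete, here is a small CPU-side model in
C++. It is only a sketch: the helper names are hypothetical, the
subgroup is assumed to have at most 32 lanes, and the "lane 0 only"
rule is an assumed model that matches the behaviour described above,
not a statement about how Metal implements it.

```cpp
#include <cassert>
#include <cstdint>

// CPU-side model of OpGroupNonUniformElect for a subgroup of up to 32 lanes.
// Bit i of active_mask is set if lane i is currently executing.
// All names are hypothetical; none of this is SPIRV-Cross or Metal API.

// Expected behaviour: the lowest-numbered active lane is elected.
inline bool elect_expected(uint32_t active_mask, uint32_t lane)
{
    assert(lane < 32);
    const uint32_t lane_bit = 1u << lane;
    if ((active_mask & lane_bit) == 0)
        return false; // an inactive lane is never elected
    // Active lanes with a lower index than this one.
    const uint32_t lower_active = active_mask & (lane_bit - 1u);
    return lower_active == 0;
}

// Assumed model of the faulty behaviour: only lane 0 can ever be elected.
inline bool elect_lane0_only(uint32_t active_mask, uint32_t lane)
{
    assert(lane < 32);
    return lane == 0 && (active_mask & 1u) != 0;
}

// Counts how many lanes consider themselves elected under a given rule.
template <typename Elect>
inline int count_elected(uint32_t active_mask, Elect elect)
{
    int elected = 0;
    for (uint32_t lane = 0; lane < 32; ++lane)
        elected += elect(active_mask, lane) ? 1 : 0;
    return elected;
}
```

With only lanes 2 and 3 active (mask `0xC`), `elect_expected` elects
lane 2, while `elect_lane0_only` elects no lane at all, so code guarded
by the elect result would never run.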
The `SubgroupSize` and `SubgroupLocalInvocationId` builtins are only
available in compute shaders (and, by extension, tessellation control
shaders), despite SPIR-V making them available in all stages. This
limits the usefulness of some of the subgroup operations in fragment
shaders.
Although Metal on macOS supports some clustered, inclusive, and
exclusive operations, it does not support them all. In particular,
inclusive and exclusive min, max, and, or, and xor are not supported,
nor are cluster sizes other than 4. If this becomes a problem, they
could be emulated, but at a significant performance cost due to the need
for non-uniform operations.
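For reference, the following C++ sketch models what the unsupported
operations are expected to compute, using unsigned min as the example.
It is a CPU-side reference model with hypothetical names, where each
vector element stands for one lane's value. It is not a proposal for
how an emulation would be written in MSL.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// CPU-side reference model; element i holds the value contributed by lane i.
// All function names are hypothetical.

// Inclusive min scan: lane i receives min(values[0..i]).
inline std::vector<uint32_t> inclusive_min(const std::vector<uint32_t> &values)
{
    std::vector<uint32_t> out(values.size());
    uint32_t running = std::numeric_limits<uint32_t>::max(); // identity for unsigned min
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        running = std::min(running, values[i]);
        out[i] = running;
    }
    return out;
}

// Exclusive min scan: lane i receives min(values[0..i-1]); lane 0 receives the identity.
inline std::vector<uint32_t> exclusive_min(const std::vector<uint32_t> &values)
{
    std::vector<uint32_t> out(values.size());
    uint32_t running = std::numeric_limits<uint32_t>::max();
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        out[i] = running; // excludes this lane's own value
        running = std::min(running, values[i]);
    }
    return out;
}

// Clustered min reduction: lanes are grouped into consecutive clusters and
// every lane receives the minimum of its own cluster.
inline std::vector<uint32_t> clustered_min(const std::vector<uint32_t> &values, std::size_t cluster_size)
{
    // Cluster sizes are assumed to be non-zero powers of two.
    assert(cluster_size != 0 && (cluster_size & (cluster_size - 1)) == 0);
    std::vector<uint32_t> out(values.size());
    for (std::size_t base = 0; base < values.size(); base += cluster_size)
    {
        const std::size_t end = std::min(base + cluster_size, values.size());
        uint32_t cluster_min = values[base];
        for (std::size_t i = base; i < end; ++i)
            cluster_min = std::min(cluster_min, values[i]);
        for (std::size_t i = base; i < end; ++i)
            out[i] = cluster_min;
    }
    return out;
}
```

Calling `clustered_min(values, 4)` corresponds to the one cluster size
mentioned above as supported; any other `cluster_size` is a case that
would need emulation.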
MSL does not seem to have a qualifier for this, but HLSL SM 5.1 does.
glslangValidator for HLSL does not support this, so skip any validation,
but it passes in FXC.
Buffer objects can contain arbitrary pointers to blocks.
We can also implement ConvertPtrToU and ConvertUToPtr.
The latter can cast a uint64_t to any type as it pleases,
so we will need to generate fake buffer reference blocks to be able to
cast the type.
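The following C++ analogue illustrates why the destination type matters.
It is a sketch with hypothetical names (`Block`, `ptr_to_u`, `u_to_ptr`)
and does not show the code SPIRV-Cross generates. The integer carries no
type information, so the cast back has to name a pointee type, which is
the role a generated buffer reference block would play.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical block type standing in for a buffer reference block.
struct Block
{
    uint32_t value;
};

// Analogue of OpConvertPtrToU: the result is a plain 64-bit integer
// that no longer remembers what it pointed to.
inline uint64_t ptr_to_u(const void *pointer)
{
    return static_cast<uint64_t>(reinterpret_cast<uintptr_t>(pointer));
}

// Analogue of OpConvertUToPtr: the caller must supply the pointee type T,
// so a declaration of T has to exist before the cast can be written.
template <typename T>
inline T *u_to_ptr(uint64_t address)
{
    return reinterpret_cast<T *>(static_cast<uintptr_t>(address));
}
```

A round trip such as `u_to_ptr<Block>(ptr_to_u(&block))` yields the
original pointer, but only because the caller names `Block` explicitly.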
We made the mistake of registering a dependency on the atomic variable
even if the atomic result was forced to a temporary. There is no need to
register reads from atomic variables like this as we always force atomic
results to a temporary and argument read/writes do not need to be
tracked.
Atomics are not supported on images or texture_buffers in MSL.
Properly throw an error if OpImageTexelPointer is used (since it can
only be used for atomic operations anyway).
* origin/master:
Support running {,update_}test_shader.sh with CMake builds.
Don't apply vertex attribute remapping to non-vertex or non-input interface blocks
Force complex loop in certain rare access chain scenarios.
Fix guard around [[noreturn]].
Deal with mismatched signs in S/U/F conversion opcodes.
Workaround lack of lvalue/rvalue operator overload on MSVC 2013.
Support direct conversions to std::vector from SmallVector.
Fix some minor copy constructor issues in Variant.
Make sure ids_for_types are moved correctly in move operator.
Run format_all.sh.
Refactor out error handling and containers to new headers.
Do not use SmallVector as input type in public interfaces.
Fix various bugs found in testing.
Explicitly implement move operators for ParsedIR.
Try another MSVC 2013 workaround.
Implement edge cases in insert/end and add a simple test case.
Fix GCC 4.x warnings.
Workaround lack of alignas on MSVC 2013.
Reduce pressure on global allocation.
CLI: Make --iterations more useful.