Commit Graph

733 Commits

Author SHA1 Message Date
Hans-Kristian Arntzen
43842cefb3 MSL: Cleanup decoration forwarding for SampleMask.
Don't want to create Meta instances unless we have to.
2019-10-24 11:15:35 +02:00
Hans-Kristian Arntzen
db55d474f9 MSL: Do not declare complex composite array in main for non-inlined.
Need to consider that complex composite arrays may be used in leaf
functions, and avoid the MSL library link fix unless everything is
nicely inlined.
2019-10-24 11:12:01 +02:00
Lukas Hermanns
b0d616aa6d Removed 'argument_buffer_offset' and fixed packed matrix Metal output. 2019-10-23 16:28:32 -04:00
Lukas Hermanns
6673a675ba Simplified overriding of 'access_chain_internal' function in CompilerMSL. 2019-10-22 11:06:16 -04:00
Lukas Hermanns
84351d3aed Merge remote-tracking branch 'upstream/master' 2019-10-21 18:55:36 -04:00
Lukas Hermanns
e1b161b54b Removed bounds checks in favor of SPIRV-Tools pass '--graphics-robust-access' 2019-10-21 16:39:53 -04:00
Hans-Kristian Arntzen
4bb673a626 MSL: Add opt-in support for huge IABs.
If there are enough members in an IAB, we cannot use the constant
address space as MSL compiler complains about there being too many
members. Support emitting the device address space instead.
2019-10-14 16:20:34 +02:00
Lukas Hermanns
0853bcaee1 Disabled spvUnsafeArray<> type for packed vectors and added test cases for those arrays. 2019-10-09 17:59:47 -04:00
Lukas Hermanns
ffbd801853 Added '--msl-invariant-float-math' option and new test case for it. 2019-10-09 14:03:06 -04:00
Hans-Kristian Arntzen
2d20b1ab93 Run format_all.sh. 2019-10-07 10:29:04 +02:00
Lukas Hermanns
f3a6d28a1d Further updates for pull request #1162; also added two test cases for spvCubemapTo2DArrayFace function and added '--msl-framebuffer-fetch'/ '--msl-emulate-cube-array' compiler options. 2019-09-27 15:49:54 -04:00
Lukas Hermanns
c3d6022956 Update for pull request #1162 rev. 1 2019-09-24 18:13:04 -04:00
Lukas Hermanns
7ad0a84778 Updates for pull request #1162 2019-09-24 14:35:25 -04:00
Lukas Hermanns
37df74035b Merge branch 'ue4_dev' 2019-09-20 09:42:42 -04:00
Lukas Hermanns
9f9276f5ce Fixed false-positive optimization of builtin variables (may happen when 'spvOut' is emitted). 2019-09-19 14:44:30 -04:00
Hans-Kristian Arntzen
3c11254ece MSL: Fix 16-bit integer literals.
There is no suffix, so bitcasts failed.
2019-09-19 10:19:51 +02:00
Lukas Hermanns
50ac6862ac Rearranged all 'UE Change' comments to match to project's coding style. 2019-09-18 14:03:54 -04:00
Lukas Hermanns
137e9d6d98 Removed reference specifiers in 'spvFMul*' functions to avoid address specifiers. 2019-09-17 16:50:33 -04:00
Lukas Hermanns
51be601922 Avoid emitting 'spvUnsafeArray<>', 'spvFMul*', and 'spvFAdd' custom functions if they are not needed. 2019-09-17 15:10:39 -04:00
Lukas Hermanns
36eab88b23 Further adjustments to make Metal backend work again in UE4 on Mac. 2019-09-17 11:40:01 -04:00
Lukas Hermanns
7cf5d4f7a1 Added a new 'emulate_cube_array' option to SPIRV-Cross to cope with translating TextureCubeArray into texture2d_array for iOS where this type is not available. (Original Author: Mark Satterthwaite) 2019-09-13 17:24:27 -04:00
Lukas Hermanns
a9f3c981d9 Adjustments after rebase of ue4_dev branch. 2019-09-13 14:03:02 -04:00
Mark Satterthwaite
c4f9704af0 OpImageTexelPointer needs to use an int coordinate type for GLSL, but not for MSL. 2019-09-12 08:52:08 -04:00
Mark Satterthwaite
fdaf9b47bd Remove obsolete memory barrier scope specification from Metal output, this API has been removed. 2019-09-12 08:35:28 -04:00
Mark Satterthwaite
69b703f1da Add an option to SPIRV-Cross to enforce invariant floating point math to prevent different depth calculation between prepass & basepass when running on Metal 2.0 and earlier. 2019-09-12 08:35:15 -04:00
Mark Satterthwaite
e4c6388571 More fixes to handling packing & access elements in an array. Made in two parts. 1. Don't allow AccessChain operations to add duplicated swizzles when accessing packed arrays. 2. Only pack arrays when there is the proper amount of space between members in a struct, otherwise it will definitely be wrong. 2019-09-11 16:15:10 -04:00
Mark Satterthwaite
b491806b47 Fix texture swizzling. 2019-09-11 14:56:54 -04:00
Mark Satterthwaite
9e54a8dd7b Slight modifications to IAB support for Metal output, so that the caller can specify an offset for the IAB start index, as for HLSL shaders UAVs need to occupy slots 0-7. The runtime support for SSBO robustness is also much simpler if the buffer size block is at index 0. Change made in two parts. 1. Allow the caller to specify the Metal translation should use argument buffers. 2. Move this to the front of IABs for convenience of the runtime. 2019-09-10 13:09:49 -04:00
Mark Satterthwaite
d9f3576305 Metal doesn't automatically enforce robust access to buffers unlike other APIs, so for storage-buffers, which become raw T* buffers in Metal, we need to fetch the buffer size and clamp the access to a valid index within the buffer ourselves. This is essential for shaders converted from HLSL which expects all resource access to be robust, though this implementation is technically different to the HLSL specification of return-0 for OOB reads, ignore OOB writes. 2019-09-10 12:32:32 -04:00
Mark Satterthwaite
0428faada3 HLSL makes position calculations invariant by default to eliminate problems with depth-precision, Apple added a similar qualifier for Metal 2.1 that can and should be used in Vertex & Domain/TessEval shaders for the same effect. 2019-09-10 11:47:40 -04:00
Mark Satterthwaite
9ce3158193 When compiling from HLSL which pads and aligns float[]/float2[] within structures to float4[] we need to unpack the original type in Metal from the float4. 2019-09-10 11:21:43 -04:00
Mark Satterthwaite
40a4456a54 Fix conversion of the SampleMask intrinsic from SPIRV, where it is an array to Metal where it isn't. 2019-09-10 10:46:42 -04:00
Mark Satterthwaite
42b8a62870 Fixes to the generation of Metal tessellation shaders from SPIRV so that it works correctly in more complicated cases.
First, when generating from HLSL before invoking the code that comes from the HLSL patch-function a control-flow and full memory-barrier are required to ensure that all the temporary values in thread-local storage for the patch are available.
    Second, the inputs to control and evaluation shaders must be properly forwarded from the global variables in SPIRV to the member variables in the relevant input structure.
    Finally when arrays of interpolators are used for input or output we need to add an extra level of array indirection because Metal works at a different granularity than SPIRV.

    Five parts.
    1. Fix tessellation patch function processing.
    2. Fix loads from tessellation control inputs not being forwarded to the gl_in structure array.
    3. Fix loads from tessellation evaluation inputs not being forwarded to the stage_in structure array.
    4. Workaround SPIRV losing an array indirection in tessellation shaders - not the best solution but enough to keep things progressing.
    5. Apparently gl_TessLevelInner/Outer is special and needs to not be placed into the input array.
2019-09-10 10:37:07 -04:00
Mark Satterthwaite
de6441af88 Work-around HLSL using zero-based InstanceID and VertexID variables, but SPIRV, like Metal, includes BaseInstance & BaseVertex. Until this can be fixed in DXC, which is really the proper place to solve this, we can decrement InstanceID & VertexID when the source is HLSL. Made in two parts. 1. Handle HLSL-style 0-based vertex/instance index. 2. We zero-base the InstanceID & VertexID variables for HLSL emulation elsewhere, so don't do it twice. 2019-09-09 16:55:59 -04:00
Mark Satterthwaite
97a66ff906 On iOS sub-passes can be implemented using the frame-buffer fetch API which is much more efficient than binding the textures. Change was made in three parts. 1. Use Metal's native frame-buffer fetch API for subpass inputs. 2. Make sure that frame-buffer-fetch is only available on iOS. 3. Default to using Metal's native frame-buffer fetch for subpass inputs on iOS. 2019-09-09 15:02:11 -04:00
Wade Brainerd
f2a1b4320f MSL: Fix array copies to/from interpolators 2019-09-06 18:23:57 -07:00
Mark Satterthwaite
32557e9093 SPIRV doesn't distinguish depth textures from regular textures, but Metal does, so if we've ever seen a depth comparison operation we must ensure that the texture is specified as a depth-texture. 2019-09-06 16:58:27 -04:00
Hans-Kristian Arntzen
2082e7e801 Run format_all.sh. 2019-09-06 14:23:16 +02:00
Hans-Kristian Arntzen
333980ae91 Refactor into stronger types in public API.
Some fallout where internal functions are using stronger types.
Overkill to move everything over to strong types right now, but perhaps
move over to it slowly over time.
2019-09-06 12:29:47 +02:00
Hans-Kristian Arntzen
1935f1a8e3 Fix some issues on certain compilers. 2019-09-06 10:11:18 +02:00
Chip Davis
cb35934248 MSL: Support dynamic offsets for buffers in argument buffers.
Vulkan has two types of buffer descriptors,
`VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC` and
`VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC`, which allow the client to
offset the buffers by an amount given when the descriptor set is bound
to a pipeline. Metal provides no direct support for this when the buffer
in question is in an argument buffer, so once again we're on our own.
These offsets cannot be stored or associated in any way with the
argument buffer itself, because they are set at bind time.  Different
pipelines may have different offsets set. Therefore, we must use a
separate buffer, not in any argument buffer, to hold these offsets. Then
the shader must manually offset the buffer pointer.

This change fully supports arrays, including arrays of arrays, even
though Vulkan forbids them. It does not, however, support runtime
arrays. Perhaps later.
2019-09-05 23:29:00 -05:00
Mark Satterthwaite
5e8590a23d Emulate texture atomics in Metal by binding the underlying buffer that backs the resource to a separate binding point and using that for Metal's atomic operations. This will work with texture_buffer and texture2d created from an MTLBuffer, so is perfect for emulating HLSL atomics on RWBuffer and sufficient, but not ideal, for RWTexture2D with some restrictions (limited format support and can't be used for render-targets). 2019-09-05 15:13:28 -04:00
Mark Satterthwaite
239e04762b Support Metal 2.1's texture_buffer type which is the equivalent to HLSL's Buffer/RWBuffer, so doesn't require modifying buffer sizes to match alignments. 2019-09-05 14:46:15 -04:00
Mark Satterthwaite
8596bf5ee2 In order to use Metal shader libraries properly you have to ensure that you have no duplicated global symbol names for different entities, otherwise 'metallib' won't be able to combine multiple shaders into a single library. This is broken into two parts. 1. Constant arrays of non-primitive types (i.e. matrices) won't link properly into Metal libraries. 2. Metal helper functions must be static force-inline otherwise they will cause problems when linked together in a single Metallib. 2019-09-05 14:39:06 -04:00
Mark Satterthwaite
d50659af92 Rework the way arrays are handled in Metal to remove the array copies as they are unnecessary from Metal 1.2. There were cases where copies were not being inserted and others appeared unncessary, using the template type should allow the 'metal' compiler to do the best possible optimisation. The changes are broken into three stages. 1. Allow Metal to use the array<T> template to make arrays a value type. 2. Force the use of C style array declaration for some cases which cannot be wrapped with a template. 3. Threadgroup arrays can't have a wrapper type. 4. Tweak the code to use unsafe_array in a few more places so that we can handle passing arrays of resources into the shader and then through shaders into sub-functions. 5. Handle packed matrix types inside arrays within structs. 6. Make sure that builtin arguments still retain their array qualifiers when used in leaf functions. 7. Fix declaration of array-of-array constants for Metal so we can use the array<T> template. 2019-09-05 12:39:44 -04:00
Chip Davis
103817009c MSL: Force storage images on iOS to use discrete descriptors.
Writable textures cannot use argument buffers on iOS. They must be
passed as arguments directly to the shader function. Since we won't know
if a given storage image will have the `NonWritable` decoration at the
time we encode the argument buffer, we must therefore pass all storage
images as discrete arguments. Previously, we were throwing an error if
we encountered an argument buffer with a writable texture in it on iOS.
2019-09-05 11:01:05 -05:00
Hans-Kristian Arntzen
261b46982a Deal with complex interlock cases in GLSL. 2019-09-04 12:18:04 +02:00
Chip Davis
2eff420d9a Support the SPV_EXT_fragment_shader_interlock extension.
This was straightforward to implement in GLSL. The
`ShadingRateInterlockOrderedEXT` and `ShadingRateInterlockUnorderedEXT`
modes aren't implemented yet, because we don't support
`SPV_NV_shading_rate` or `SPV_EXT_fragment_invocation_density` yet.

HLSL and MSL were more interesting. They don't support this directly,
but they do support marking resources as "rasterizer ordered," which
does roughly the same thing. So this implementation scans all accesses
inside the critical section and marks all storage resources found
therein as rasterizer ordered. They also don't support the fine-grained
controls on pixel- vs. sample-level interlock and disabling ordering
guarantees that GLSL and SPIR-V do, but that's OK. "Unordered" here
merely means the order is undefined; that it just so happens to be the
same as rasterizer order is immaterial. As for pixel- vs. sample-level
interlock, Vulkan explicitly states:

> With sample shading enabled, [the `PixelInterlockOrderedEXT` and
> `PixelInterlockUnorderedEXT`] execution modes are treated like
> `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT`
> respectively.

and:

> If [the `SampleInterlockOrderedEXT` or `SampleInterlockUnorderedEXT`]
> execution modes are used in single-sample mode they are treated like
> `PixelInterlockOrderedEXT` or `PixelInterlockUnorderedEXT`
> respectively.

So this will DTRT for MoltenVK and gfx-rs, at least.

MSL additionally supports multiple raster order groups; resources that
are not accessed together can be placed in different ROGs to allow them
to be synchronized separately. A more sophisticated analysis might be
able to place resources optimally, but that's outside the scope of this
change. For now, we assign all resources to group 0, which should do for
our purposes.

`glslang` doesn't support the `RasterizerOrdered` UAVs this
implementation produces for HLSL, so the test case needs `fxc.exe`.

It also insists on GLSL 4.50 for `GL_ARB_fragment_shader_interlock`,
even though the spec says it needs either 4.20 or
`GL_ARB_shader_image_load_store`; and it doesn't support the
`GL_NV_fragment_shader_interlock` extension at all. So I haven't been
able to test those code paths.

Fixes #1002.
2019-09-02 12:31:10 -05:00
Chip Davis
39dce88d3b MSL: Add support for sampler Y'CbCr conversion.
This change introduces functions and in one case, a class, to support
the `VK_KHR_sampler_ycbcr_conversion` extension. Except in the case of
GBGR8 and BGRG8 formats, for which Metal natively supports implicit
chroma reconstruction, we're on our own here. We have to do everything
ourselves. Much of the complexity comes from the need to support
multiple planes, which must now be passed to functions that use the
corresponding combined image-samplers. The rest is from the actual
Y'CbCr conversion itself, which requires additional post-processing of
the sample retrieved from the image.

Passing sampled images to a function was a particular problem. To
support this, I've added a new class which is emitted to MSL shaders
that pass sampled images with Y'CbCr conversions attached around. It
can handle sampled images with or without Y'CbCr conversion. This is an
awful abomination that should not exist, but I'm worried that there's
some shader out there which does this. This support requires Metal 2.0
to work properly, because it uses default-constructed texture objects,
which were only added in MSL 2. I'm not even going to get into arrays of
combined image-samplers--that's a whole other can of worms.  They are
deliberately unsupported in this change.

I've taken the liberty of refactoring the support for texture swizzling
while I'm at it. It's now treated as a post-processing step similar to
Y'CbCr conversion. I'd like to think this is cleaner than having
everything in `to_function_name()`/`to_function_args()`. It still looks
really hairy, though. I did, however, get rid of the explicit type
arguments to `spvGatherSwizzle()`/`spvGatherCompareSwizzle()`.

Update the C API. In addition to supporting this new functionality, add
some compiler options that I added in previous changes, but for which I
neglected to update the C API.
2019-09-01 18:35:53 -05:00
Hans-Kristian Arntzen
9b845a4788
Merge pull request #1141 from troughton/inline-everything
MSL: Inline all non-entry-point functions
2019-08-30 11:05:04 +02:00
Thomas Roughton
6b5403206e Clang-format changes 2019-08-30 20:25:40 +12:00
Hans-Kristian Arntzen
07c76f66b5 MSL: Add {Base,}{Vertex,Instance}Index to bitcast_from_builtin_load.
Totally missed these, so float(index) would not work correctly for
negative numbers.
2019-08-29 13:56:37 +02:00
Thomas Roughton
e5f9e2c203 Inline all non-entry-point functions 2019-08-29 17:07:57 +12:00
Thomas Roughton
6338f0aa0f MSL: inline all emitted functions
# Conflicts:
#	spirv_msl.cpp
2019-08-29 17:07:27 +12:00
Hans-Kristian Arntzen
3ccfbce264 Run format_all.sh. 2019-08-28 14:25:26 +02:00
Hans-Kristian Arntzen
9436cd3036 MSL: Deal with array copies from and to threadgroup. 2019-08-27 13:18:01 +02:00
Hans-Kristian Arntzen
b3305799a8 Deal correctly with sign on bitfield operations.
Need a lot of special purpose implementation functions for these.
2019-08-26 11:36:36 +02:00
Hans-Kristian Arntzen
ffca8735ff
Merge pull request #1105 from cdavis5e/msl-unify-as
MSL: Unify the get_*_address_space() methods.
2019-07-29 10:19:12 +02:00
Chip Davis
df18d98bea MSL: Unify the get_*_address_space() methods.
These methods have largely the same logic, with minor differences. That
I felt compelled to duplicate the logic into another method was one of
the things that bothered me about the variable pointers change. This
cleans that part of the code up; now we don't have two places to change.
2019-07-26 09:43:28 -05:00
Hans-Kristian Arntzen
d378413040
Merge pull request #1103 from KhronosGroup/fix-1100
MSL: Cleanup temporary use with emit_uninitialized_temporary.
2019-07-26 14:35:18 +02:00
Hans-Kristian Arntzen
c3e8e728d8 MSL: Cleanup temporary use with emit_uninitialized_temporary. 2019-07-26 11:16:43 +02:00
Hans-Kristian Arntzen
abb345d0b3 MSL: Deal with Modf/Frexp where output is access chain to scalar.
This is not allowed as we cannot take mutable reference to a
vec.{x,y,z,w}. We only care about scalar since entire vectors are fine.
2019-07-26 11:02:38 +02:00
Hans-Kristian Arntzen
3c03b55c46 Workaround MSVC 2013 compiler issues. 2019-07-25 10:28:11 +02:00
Chip Davis
fb5ee4cb5c MSL: Adjust BuiltInWorkgroupId for vkCmdDispatchBase().
This command allows the caller to set the base value of
`BuiltInWorkgroupId`, and thus of `BuiltInGlobalInvocationId`. Metal
provides no direct support for this... but it does provide a builtin,
`[[grid_origin]]`, normally used to pass the base values for the stage
input region, which we will now abuse to pass the dispatch base and
avoid burning a buffer binding.

`[[grid_origin]]`, as part of Metal's support for compute stage input,
requires MSL 1.2. For 1.0 and 1.1, we're forced to provide a buffer.

(Curiously, this builtin was undocumented until the MSL 2.2 release. Go
figure.)
2019-07-24 08:56:15 -05:00
Hans-Kristian Arntzen
c62503bca7 Do not attempt to pack types which are already scalar. 2019-07-24 11:52:28 +02:00
Hans-Kristian Arntzen
646e04294a Fix some warnings when building in MoltenVK. 2019-07-23 16:39:13 +02:00
Hans-Kristian Arntzen
5c1cb7accf Recursively pack struct types when we find scalar packed structs. 2019-07-23 15:24:53 +02:00
Hans-Kristian Arntzen
3fa2b14634 Run format_all.sh. 2019-07-23 12:23:41 +02:00
Hans-Kristian Arntzen
7277c7ac46 Use to_unpacked_row_major_expression to unify row-major in MSL/GLSL. 2019-07-23 11:36:54 +02:00
Hans-Kristian Arntzen
47a18b9f1b Simplify row-major matrix/vector multiplies. 2019-07-23 10:56:57 +02:00
Hans-Kristian Arntzen
6224199c76 Add struct size padding tests. 2019-07-23 10:30:37 +02:00
Hans-Kristian Arntzen
2172b19be2 Remove obsolete matrix workaround code. 2019-07-22 16:27:47 +02:00
Hans-Kristian Arntzen
609d087f8f Only transpose unpacked expressions. 2019-07-22 16:06:09 +02:00
Hans-Kristian Arntzen
6057ffcbb1 Deal correctly with complete stores to row_major matrices. 2019-07-22 15:49:17 +02:00
Hans-Kristian Arntzen
19f5cd3e90 Declare correct matrix type when unpacking. 2019-07-22 13:25:45 +02:00
Hans-Kristian Arntzen
f2d6a77c95 Don't forget to register a write to LHS expression in certain case. 2019-07-22 13:06:30 +02:00
Hans-Kristian Arntzen
745a2f7b0e Deal with swizzled stores to std140 matrices. 2019-07-22 13:05:23 +02:00
Hans-Kristian Arntzen
180a6b38c5 Fix some row-major column store cases. 2019-07-22 12:56:14 +02:00
Hans-Kristian Arntzen
4ab2829cf6 Fix more stray parens. 2019-07-22 12:13:07 +02:00
Hans-Kristian Arntzen
d6004bfc97 Fixup stray parent in output. 2019-07-22 12:08:56 +02:00
Hans-Kristian Arntzen
14afb968dd Correctly unpack row-major matrices when storing to LHS. 2019-07-22 12:03:12 +02:00
Hans-Kristian Arntzen
249f8e5180 MSL: Support storing to row-major column.
Defer transposes to actual Load or Store.
2019-07-22 11:13:44 +02:00
Hans-Kristian Arntzen
be2fccd837 Tests run clean. 2019-07-22 10:23:39 +02:00
Hans-Kristian Arntzen
b66a53a979 Traverse correct types when checking scalar layout. 2019-07-19 14:43:42 +02:00
Hans-Kristian Arntzen
e90d816cdd Deal with scalar layout of entire structs.
Mark all candidate struct types.
2019-07-19 14:18:14 +02:00
Hans-Kristian Arntzen
12c5020854 Pass down row-major state to unpacking functions. 2019-07-19 13:03:08 +02:00
Hans-Kristian Arntzen
27b75c2c5a Deal with all forms of matrix writes ... 2019-07-19 12:53:10 +02:00
Hans-Kristian Arntzen
f6251e4699 Can deal with std140 matrices now.
Refactor is coming together.
2019-07-19 11:21:02 +02:00
Hans-Kristian Arntzen
dd7ebaf9f7 Start considering how to emit physical type ID. 2019-07-19 10:06:19 +02:00
Hans-Kristian Arntzen
b09b8d3fa9 Deal more cleanly with matrices and row-major. 2019-07-19 10:06:19 +02:00
Hans-Kristian Arntzen
c160d5227f Reintroduce struct_member_* MSL queries.
Need to remap to physical type + packed qualifier, and this is handy to
do in a helper function.
2019-07-19 10:06:19 +02:00
Hans-Kristian Arntzen
a86308bce1 MSL: Begin rewrite of buffer packing logic. 2019-07-19 10:06:19 +02:00
Chip Davis
12a8654784 Don't forward uses of an OpIsHelperInvocationEXT op.
If this is computed *before* a `demote`, but used *after*, forwarding it
will produce the wrong value. This does make for uglier shaders, but
it's necessary right now to ensure correctness.

I needed to use an assembly shader to produce the test for this.
`spirv-opt` is not smart enough (or too smart?) to eliminate the
variable that would be used in GLSL to express this.
2019-07-18 17:32:35 -05:00
Chip Davis
50dce10c5d Support the SPV_EXT_demote_to_helper_invocation extension.
This extension provides a new operation which causes a fragment to be
discarded without terminating the fragment shader invocation. The
invocation for the discarded fragment becomes a helper invocation, so
that derivatives will remain defined. The old `HelperInvocation` builtin
becomes undefined when this occurs, so a second new instruction queries
the current helper invocation status.

This is only fully supported for GLSL. HLSL doesn't support the
`IsHelperInvocation` operation and MSL doesn't support the
`DemoteToHelperInvocation` op.

Fixes #1052.
2019-07-17 09:12:22 -05:00
Hans-Kristian Arntzen
c7eda1bce9 Test glsl.std450 more exhaustively.
Make sure to test everything with scalar as well to catch any weird edge
cases.

Not all opcodes are covered here, just the arithmetic ones. FP64 packing
is also ignored.
2019-07-17 11:53:05 +02:00
Chip Davis
bc646574a6 MSL: Support the SPV_INTEL_shader_integer_functions2 extension.
This provides a few functions normally available in OpenCL to the SPIR-V
shader environment. These functions happen to be available in Metal as
well.

No GLSL, unfortunately. Intel has yet to publish a
`GL_INTEL_shader_integer_functions2` spec.
2019-07-15 09:42:36 -05:00
Hans-Kristian Arntzen
33d2bbcf69 Merge branch 'msl-amd-trinary-functions' of git://github.com/cdavis5e/SPIRV-Cross 2019-07-15 09:46:31 +02:00
Chip Davis
6a58554568 Support the SPV_KHR_device_group extension.
The only piece added by this extension is the `DeviceIndex` builtin,
which tells the shader which device in a grouped logical device it is
running on.

Metal's pipeline state objects are owned by the `MTLDevice` that created
them. Since Metal doesn't support logical grouping of devices the way
Vulkan does, we'll thus have to create a pipeline state for each device
in a grouped logical device. The upcoming peer group support in Metal 3
will not change this. For this reason, for Metal, the device index is
supplied as a constant at pipeline compile time.

There's an interaction between `VK_KHR_device_group` and
`VK_KHR_multiview` in the
`VK_PIPELINE_CREATE_VIEW_INDEX_FROM_DEVICE_INDEX_BIT`, which defines the
view index to be the same as the device index. The new
`view_index_from_device_index` MSL option supports this functionality.
2019-07-13 16:45:54 -05:00
Chip Davis
ca91fcfe5f MSL: Support the SPV_AMD_shader_trinary_minmax extension.
This requires MSL 2.1.
2019-07-13 16:43:57 -05:00
Hans-Kristian Arntzen
92e5255570 Run format_all.sh. 2019-07-12 10:59:53 +02:00
Hans-Kristian Arntzen
932ee0e328 Deal correctly with return sign of bitscan operations. 2019-07-12 10:57:56 +02:00
Hans-Kristian Arntzen
19ebbd48c7
Merge pull request #1077 from cdavis5e/msl-spirv-qualifiers
MSL: Handle coherent, volatile, and restrict.
2019-07-12 10:03:06 +02:00
Hans-Kristian Arntzen
ad5eae46ed
Merge pull request #1078 from cdavis5e/post-depth-coverage
Support the SPV_KHR_post_depth_coverage extension.
2019-07-12 09:56:26 +02:00
Chip Davis
6628ea6e48 MSL: Use the select() function for OpSelect.
This significantly improves codegen for vector `OpSelect` in MSL.
2019-07-11 10:30:37 -05:00
Chip Davis
1df47db6ba Support the SPV_KHR_post_depth_coverage extension.
Using the `PostDepthCoverage` mode specifies that the `gl_SampleMaskIn`
variable is to contain the computed coverage mask following the early
fragment tests, which this mode requires and implicitly enables.

Note that unlike Vulkan and OpenGL, Metal places this on the sample mask
input itself, and furthermore does *not* implicitly enable early
fragment testing. If it isn't enabled explicitly with an
`[[early_fragment_tests]]` attribute, the compiler will error out. So we
have to enable that mode explicitly if `PostDepthCoverage` is enabled
but `EarlyFragmentTests` isn't.

For Metal, only iOS supports this; for some reason, Apple has yet to
implement it on macOS, even though many desktop cards support it.
2019-07-11 10:28:43 -05:00
Chip Davis
058f1a0933 MSL: Handle coherent, volatile, and restrict.
This maps them to their MSL equivalents. I've mapped `Coherent` to
`volatile` since MSL doesn't have anything weaker than `volatile` but
stronger than nothing.

As part of this, I had to remove the implicit `volatile` added for
atomic operation casts. If the buffer is already `coherent` or
`volatile`, then we would add a second `volatile`, which would be
redundant. I think this is OK even when the buffer *doesn't* have
`coherent`: `T *` is implicitly convertible to `volatile T *`, but not
vice-versa. It seems to compile OK at any rate. (Note that the
non-`volatile` overloads of the atomic functions documented in the spec
aren't present in the MSL 2.2 stdlib headers.)

`restrict` is tricky, because in MSL, as in C++, it needs to go *after*
the asterisk or ampersand for the pointer type it's modifying.

Another issue is that, in the `Simple`, `GLSL450`, and `Vulkan` memory
models, `Restrict` is the default (i.e. does not need to be specified);
but MSL likely follows the `OpenCL` model where `Aliased` is the
default. We probably need to implicitly set either `Restrict` or
`Aliased` depending on the module's declared memory model.
2019-07-11 10:22:30 -05:00
Hans-Kristian Arntzen
1a592b7c0f
Merge pull request #1067 from cdavis5e/msl-scalar-block-layout
MSL: Support scalar block layout.
2019-07-11 13:03:03 +02:00
Chip Davis
28454facbb MSL: Handle packed matrices.
The old method of using a different unpacked matrix type doesn't work
for scalar alignment. It certainly wouldn't have any effect for a square
matrix, since the number of columns and rows are the same. So now we'll
store them as arrays of packed vectors.
2019-07-10 18:37:31 -05:00
Chip Davis
ea5c0ed82f MSL: Fix alignment of packed types.
Packed types have scalar alignment.
2019-07-10 11:57:04 -05:00
Hans-Kristian Arntzen
6b010e0cbc
Merge pull request #1069 from KhronosGroup/fix-1053
MSL: Re-roll array expressions in initializers.
2019-07-10 12:15:12 +02:00
Hans-Kristian Arntzen
f6f849397e MSL: Re-roll array expressions in initializers.
We cannot rely on copy path when using an array as part of a struct
initializer, so reroll such expressions to an initializer list again.
2019-07-10 11:19:33 +02:00
Chip Davis
e5fa7edfd6 MSL: Support scalar block layout.
Relaxed block layout relaxed the restrictions on vector alignment,
allowing them to be aligned on scalar boundaries. Scalar block layout
relaxes this further, allowing *any* member to be aligned on a scalar
boundary. The requirement that a vector not improperly straddle a
16-byte boundary is also relaxed.

I've also added a test showing that `std430` layout works with UBOs.

I'm troubled by the dual meaning of the `Packed` extended decoration. In
some instances (struct, `float[]`, and `vec2[]` members), it actually
means the exact opposite, that the member needs extra padding. This is
especially problematic for `vec2[]`, because now we need to distinguish
the two cases by checking the array stride. I wonder if this should
actually be split into two decorations.
2019-07-09 20:59:32 -05:00
Hans-Kristian Arntzen
909040e2eb MSVC 2013: Work around another compiler bug with array init. 2019-07-09 15:31:01 +02:00
Hans-Kristian Arntzen
4056d0b74e Don't use scalar dot(). 2019-07-03 14:32:06 +02:00
Hans-Kristian Arntzen
041f103d44 MSL/HLSL: Support scalar reflect and refract. 2019-07-03 12:31:52 +02:00
Chip Davis
31b6c93516 MSL: Support SubgroupLocalInvocationId and SubgroupSize in all stages.
MSL prior to 2.2 doesn't support these natively in any stage but
compute. But, we can (assuming no threads were terminated prematurely)
get their values with some creative uses of the
`simd_prefix_exclusive_sum()` and `simd_sum()` functions.

Also, fix a missing `to_expression()` with `BuiltInSubgroupEqMask`.

For KhronosGroup/MoltenVK#629.
2019-07-02 11:48:59 -05:00
Hans-Kristian Arntzen
f8b084de61 MSL/HLSL: Support OpOuterProduct. 2019-07-01 10:57:27 +02:00
Chip Davis
7eecf5a46b MSL: Support SPV_KHR_multiview.
This is needed to support `VK_KHR_multiview`, which is in turn needed
for Vulkan 1.1 support. Unfortunately, Metal provides no native support
for this, and Apple is once again less than forthcoming, so we have to
implement it all ourselves.

Tessellation and geometry shaders are deliberately unsupported for now.
The problem is that the current implementation encodes the `ViewIndex`
as part of the `InstanceIndex`, which in the SPIR-V environment at least
only exists in the vertex shader. So we need to work out a way to pass
the view index along to the later stages.

This implementation runs vertex shaders for all views up to the highest
bit set in the view mask, even those whose bits are clear. The fragments
for the inactive views are then discarded. Avoiding this is difficult:
calculating the view indices becomes far more complicated if we can only
run for those views which are set in the mask.
2019-06-29 09:43:55 -05:00
Hans-Kristian Arntzen
ff87419607 Deal with scalar input values for distance/length/normalize.
HLSL and MSL don't support it, so fall back to simpler intrinsics.
2019-06-28 11:20:14 +02:00
Hans-Kristian Arntzen
1543bdaf7b Run format_all.sh. 2019-06-27 15:10:59 +02:00
Hans-Kristian Arntzen
c76b99b711 Handle more cases with FP16 and texture sampling. 2019-06-27 15:04:22 +02:00
Hans-Kristian Arntzen
45805857e5 MSL: De-virtualize get_declared_struct_member_size.
It does not make sense to use a virtual call in the Compiler base class
here. Make it clearer by renaming the MSL-specific version to _msl.
2019-06-26 19:11:38 +02:00
Hans-Kristian Arntzen
02b2a1015d MSL: Fix minor XCode /analyze warning.
Written variable, but never read.
2019-06-26 16:10:58 +02:00
Hans-Kristian Arntzen
8f6939cb0d
Merge pull request #1041 from KhronosGroup/fix-1011
MSL: Add support for SubgroupSize / SubgroupInvocationID in fragment.
2019-06-26 15:01:13 +02:00
Hans-Kristian Arntzen
ab3798fd91 MSL: Add support for SubgroupSize / SubgroupInvocationID in fragment. 2019-06-24 12:31:54 +02:00
Hans-Kristian Arntzen
048f2380f3 MSL: Support custom bindings for argument buffer itself. 2019-06-24 11:10:20 +02:00
Hans-Kristian Arntzen
b4e0163749 Run format_all.sh. 2019-06-21 16:02:22 +02:00
Hans-Kristian Arntzen
3a4a9acac9 MSL: Add C API for querying automatic resource bindings. 2019-06-21 13:19:59 +02:00
Hans-Kristian Arntzen
e2c95bdcbc MSL: Rewrite how resource indices are fallback-assigned.
We used to use the Binding decoration for this, but this method is
hopelessly broken. If no explicit MSL resource remapping exists, we
remap automatically in a manner which should always "just work".
2019-06-21 12:54:08 +02:00
Hans-Kristian Arntzen
a1f7c8dc8e
Merge pull request #1031 from KhronosGroup/fix-1009
MSL: Support 64-bit integers.
2019-06-19 15:29:27 +02:00
Hans-Kristian Arntzen
7fdb418f18
Merge pull request #1028 from KhronosGroup/fix-1010
MSL: Support barycentrics and PrimitiveID in fragment shaders
2019-06-19 15:29:14 +02:00
Hans-Kristian Arntzen
4c20c941f0
Merge pull request #1025 from KhronosGroup/fix-1013
MSL: Support OpImageQueryLod.
2019-06-19 14:07:39 +02:00
Hans-Kristian Arntzen
a6798d06a2 MSL: Error out on int64_t/uint64_t buffer members.
Not supported for whatever reason.
2019-06-19 10:14:46 +02:00
Hans-Kristian Arntzen
a6b71ae999 MSL: Support 64-bit integers. 2019-06-19 09:55:00 +02:00
Hans-Kristian Arntzen
2e1cee5e1e MSL: Support PrimitiveID in fragment and barycentrics. 2019-06-19 09:52:35 +02:00
Hans-Kristian Arntzen
0671b3c35b MSL: Support OpImageQueryLod.
Correctness is a bit unclear at the moment. The spec document for 2.2 is
not updated for query-lod, but this is the best we can do anyways.
2019-06-19 09:51:56 +02:00
Hans-Kristian Arntzen
f171d82590 MSL: Support MinLod operand. 2019-06-19 09:43:03 +02:00
Hans-Kristian Arntzen
95053ea4bc
Merge pull request #1024 from KhronosGroup/fix-1016
GLSL/MSL: Support stencil export
2019-06-12 12:48:10 +02:00
Hans-Kristian Arntzen
14d0a1eb0c MSL: Support stencil export. 2019-06-12 10:21:20 +02:00
Hans-Kristian Arntzen
a7b2ba28a0 MSL: Support Invariant qualifier on position. 2019-06-12 09:39:12 +02:00
Hans-Kristian Arntzen
30bb197a5d MSL: Support remapping constexpr samplers by set/binding.
Older API was oriented around IDs which are not available unless you're
doing full reflection, which is awkward for certain use cases which know
their set/bindings up front.

Optimize resource bindings to be hashmap rather than doing linear seeks
all the time.
2019-06-10 15:41:36 +02:00
Hans-Kristian Arntzen
314efdcc42 MSL: Fix declaration of unused input variables.
In multiple-entry-point modules, we declared builtin inputs which were
not supposed to be used for that entry point.

Fix this, by being more strict when checking which builtins to emit.
2019-05-31 13:23:34 +02:00
Hans-Kristian Arntzen
b3094cd02a Run format_all.sh. 2019-05-27 16:54:13 +02:00
Hans-Kristian Arntzen
fd0feb1ec1 MSL: Use correct address space when passing array-of-buffers.
Need to check if the descriptor set is actually an argument buffer.
2019-05-27 16:53:30 +02:00
Hans-Kristian Arntzen
7b9e0fb428 MSL: Implement OpArrayLength.
This gets rather complicated because MSL does not support OpArrayLength
natively. We need to pass down a buffer which contains buffer sizes, and
we compute the array length on-demand.

Support both discrete descriptors as well as argument buffers.
2019-05-27 16:13:09 +02:00
Hans-Kristian Arntzen
96492648d4 MSL: Fix struct declaration order with complex type aliases.
MSL generally emits the aliases, which means we cannot always place the
master type first, unlike GLSL and HLSL. The logic fix is just to
reorder after we have tagged types with packing information, rather than
doing it in the parser fixup.
2019-05-23 14:54:04 +02:00
Hans-Kristian Arntzen
eaf7afed97 MSL: Support argument buffers and image swizzling.
Change aux buffer to swizzle buffer.
There is no good reason to expand the aux buffer, so name it
appropriately.

Make the code cleaner by emitting a straight pointer to uint rather than
a dummy struct which only contains a single unsized array member anyways.

This will also end up being very similar to how we implement swizzle
buffers for argument buffers.

Do not use implied binding if it overflows int32_t.
2019-05-18 10:30:06 +02:00
Chip Davis
8983920edf Remove fallback for OpGroupNonUniformElect.
It's not safe to enable subgroup support without this actually working
correctly.
2019-05-16 13:42:09 -05:00
Chip Davis
9d9415754b MSL: Add support for subgroup operations.
Some support for subgroups is present starting in Metal 2.0 on both iOS
and macOS. macOS gains more complete support in 10.14 (Metal 2.1).

Some restrictions are present. On iOS and on macOS 10.13, the
implementation of `OpGroupNonUniformElect` is incorrect: if thread 0 has
already terminated or is not executing a conditional branch, the first
thread that *is* will falsely believe itself not to be. Unfortunately,
this operation is part of the "basic" feature set; without it, subgroups
cannot be supported at all.

The `SubgroupSize` and `SubgroupLocalInvocationId` builtins are only
available in compute shaders (and, by extension, tessellation control
shaders), despite SPIR-V making them available in all stages. This
limits the usefulness of some of the subgroup operations in fragment
shaders.

Although Metal on macOS supports some clustered, inclusive, and
exclusive operations, it does not support them all. In particular,
inclusive and exclusive min, max, and, or, and xor; as well as cluster
sizes other than 4 are not supported. If this becomes a problem, they
could be emulated, but at a significant performance cost due to the need
for non-uniform operations.
2019-05-15 17:40:04 -05:00
Hans-Kristian Arntzen
647ddaee42 HLSL/MSL: Deal correctly with nonuniformEXT qualifier.
MSL does not seem to have a qualifier for this, but HLSL SM 5.1 does.
glslangValidator for HLSL does not support this, so skip any validation,
but it passes in FXC.
2019-05-13 14:58:27 +02:00