Commit Graph

108 Commits

Author SHA1 Message Date
Hans-Kristian Arntzen
711300baed MSL: Do not emit swizzled writes in packing fixups.
Similar to scalar access chain fix, this causes a read-modify-write on
memory we're not supposed to write to.
2020-07-06 10:03:46 +02:00
Hans-Kristian Arntzen
fa5b206d97 MSL: Workaround broken vector -> scalar access chain in MSL.
On MSL, the compiler refuses to allow access chains into a normal vector type.
What happens in practice instead is a read-modify-write where a vector type is
loaded, modified and written back.

The workaround is to convert a vector into a pointer-to-scalar before
the access chain continues to add the scalar index.
2020-07-06 10:03:44 +02:00
Hans-Kristian Arntzen
02db4c1f16 MSL: Add tests for array copies in and out of buffers. 2020-06-18 11:59:02 +02:00
Hans-Kristian Arntzen
107ab7c2b7 MSL: Avoid packed arrays in more cases.
Extend the array stride relaxation to non-packed arrays as well, as
long as the array in question contains a single array element.
2020-05-06 10:27:12 +02:00
Hans-Kristian Arntzen
17ad62eea4 MSL: Support edge case with DX layout in scalar block layout.
DX may emit ArrayStride and MatrixStride of 16, but the size of the
object does not align with that and expect to pack other members inside
its last member.

The workaround is to emit array size/col/row one less than we expect and
rely on padding to carve out a "dead zone" for the last member.
2020-04-20 15:29:24 +02:00
Hans-Kristian Arntzen
d91e134500 MSL: Add native array test for composite array initialization. 2020-02-24 13:34:51 +01:00
Hans-Kristian Arntzen
20b28f72fa MSL: Reinstate workaround for returning arrays. 2020-02-24 13:04:10 +01:00
Hans-Kristian Arntzen
c9d4f9cd74 MSL: Add a workaround path to force native arrays for everything. 2020-02-24 12:47:14 +01:00
Chip Davis
ae6c05f6f4 MSL: Move inline uniform blocks to the end of the argument buffer.
Limit inline blocks to one per descriptor set.

This should avoid the need for complicated code to calculate the
argument buffer ID stride of an inline uniform block. If there's demand
for more inline blocks, we can revisit this.
2020-01-25 13:40:51 -06:00
Chip Davis
fedbc35315 MSL: Support inline uniform blocks in argument buffers.
Here, the inline uniform block is explicit: we instantiate the buffer
block itself in the argument buffer, instead of a pointer to the buffer.
I just hope this will work with the `MTLArgumentDescriptor` API...

Note that Metal recursively assigns individual members of embedded
structs IDs. This means for automatic assignment that we have to
calculate the binding stride for a given buffer block. For MoltenVK,
we'll simply increment the ID by the size of the inline uniform block.
Then the later IDs will never conflict with the inline uniform block. We
can get away with this because Metal doesn't require that IDs be
contiguous, only monotonically increasing.
2020-01-24 18:51:24 -06:00
Hans-Kristian Arntzen
b85ab5f5ff MSL: Fix automatic binding allocation for image atomic buffers.
The Primary decoration was used by the atomic buffer, causing the
texture binding to be potentially overlapping with other resources.
2019-11-28 11:07:44 +01:00
Dan Sinclair
d409210ee5 Move all .invalid shaders into no-opt folders. 2019-11-05 13:19:19 -05:00
Hans-Kristian Arntzen
fa011f8547 MSL: Declare arrays with proper type wrapper.
Need to construct with value type spvUnsafeArray<T, N>({ elem0, elem1 })
to make array initialization work in complex scenarios.
2019-10-26 17:57:34 +02:00
Hans-Kristian Arntzen
2767257adc MSL: Do not declare array of UBO/SSBO as spvUnsafeArray<T>.
There is no need for these to be copied, and cuts down on template
stamping bloat.
2019-10-26 16:10:08 +02:00
Hans-Kristian Arntzen
d1479f871a MSL: Do not generate UnsafeArray<> for any array inside buffer objects.
This avoids a lot of huge code changes.
Arrays generally cannot be copied in and out of buffers, at least no
compiler frontend seems to do it.

Also avoids a lot of issues surrounding packed vectors and matrices.
2019-10-24 12:22:30 +02:00
Hans-Kristian Arntzen
db55d474f9 MSL: Do not declare complex composite array in main for non-inlined.
Need to consider that complex composite arrays may be used in leaf
functions, and avoid the MSL library link fix unless everything is
nicely inlined.
2019-10-24 11:12:01 +02:00
Lukas Hermanns
c3d6022956 Update for pull request #1162 rev. 1 2019-09-24 18:13:04 -04:00
Lukas Hermanns
7ad0a84778 Updates for pull request #1162 2019-09-24 14:35:25 -04:00
Lukas Hermanns
37df74035b Merge branch 'ue4_dev' 2019-09-20 09:42:42 -04:00
Ryan Harrison
cf1bf1c6ae Update external/ to SPIR-V 1.5
Rolled the hashes used for glslang, SPIRV-Tools, and SPIRV-Headers to
HEAD, which includes the update to 1.5.

Added passing '--amb' to glslang, so I didn't have to explicitly set
bindings in a large number of test shaders that currently don't, and
now glslang considers them invalid.

Marked all shaders that no longer pass spirv-val as .invalid.
2019-09-18 16:04:27 -04:00
Lukas Hermanns
744cc3e595 Updated test shaders. 2019-09-18 14:18:22 -04:00
Lukas Hermanns
cb3ecb9e1b Updated reference Metal shaders. 2019-09-17 15:11:19 -04:00
Lukas Hermanns
9573faa56d Removed all '.DS_Store' files. 2019-09-13 14:04:32 -04:00
Mark Satterthwaite
564cb3c08d Update the Metal shaders to account for changes in the shader compilation. 2019-09-11 15:06:05 -04:00
Chip Davis
cb35934248 MSL: Support dynamic offsets for buffers in argument buffers.
Vulkan has two types of buffer descriptors,
`VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC` and
`VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC`, which allow the client to
offset the buffers by an amount given when the descriptor set is bound
to a pipeline. Metal provides no direct support for this when the buffer
in question is in an argument buffer, so once again we're on our own.
These offsets cannot be stored or associated in any way with the
argument buffer itself, because they are set at bind time.  Different
pipelines may have different offsets set. Therefore, we must use a
separate buffer, not in any argument buffer, to hold these offsets. Then
the shader must manually offset the buffer pointer.

This change fully supports arrays, including arrays of arrays, even
though Vulkan forbids them. It does not, however, support runtime
arrays. Perhaps later.
2019-09-05 23:29:00 -05:00
Chip Davis
103817009c MSL: Force storage images on iOS to use discrete descriptors.
Writable textures cannot use argument buffers on iOS. They must be
passed as arguments directly to the shader function. Since we won't know
if a given storage image will have the `NonWritable` decoration at the
time we encode the argument buffer, we must therefore pass all storage
images as discrete arguments. Previously, we were throwing an error if
we encountered an argument buffer with a writable texture in it on iOS.
2019-09-05 11:01:05 -05:00
Chip Davis
39dce88d3b MSL: Add support for sampler Y'CbCr conversion.
This change introduces functions and in one case, a class, to support
the `VK_KHR_sampler_ycbcr_conversion` extension. Except in the case of
GBGR8 and BGRG8 formats, for which Metal natively supports implicit
chroma reconstruction, we're on our own here. We have to do everything
ourselves. Much of the complexity comes from the need to support
multiple planes, which must now be passed to functions that use the
corresponding combined image-samplers. The rest is from the actual
Y'CbCr conversion itself, which requires additional post-processing of
the sample retrieved from the image.

Passing sampled images to a function was a particular problem. To
support this, I've added a new class which is emitted to MSL shaders
that pass sampled images with Y'CbCr conversions attached around. It
can handle sampled images with or without Y'CbCr conversion. This is an
awful abomination that should not exist, but I'm worried that there's
some shader out there which does this. This support requires Metal 2.0
to work properly, because it uses default-constructed texture objects,
which were only added in MSL 2. I'm not even going to get into arrays of
combined image-samplers--that's a whole other can of worms.  They are
deliberately unsupported in this change.

I've taken the liberty of refactoring the support for texture swizzling
while I'm at it. It's now treated as a post-processing step similar to
Y'CbCr conversion. I'd like to think this is cleaner than having
everything in `to_function_name()`/`to_function_args()`. It still looks
really hairy, though. I did, however, get rid of the explicit type
arguments to `spvGatherSwizzle()`/`spvGatherCompareSwizzle()`.

Update the C API. In addition to supporting this new functionality, add
some compiler options that I added in previous changes, but for which I
neglected to update the C API.
2019-09-01 18:35:53 -05:00
Thomas Roughton
91b2f34a3d Update tests to account for all non-entry-point functions being inlined 2019-08-30 09:39:06 +12:00
Hans-Kristian Arntzen
9436cd3036 MSL: Deal with array copies from and to threadgroup. 2019-08-27 13:18:01 +02:00
Chip Davis
fb5ee4cb5c MSL: Adjust BuiltInWorkgroupId for vkCmdDispatchBase().
This command allows the caller to set the base value of
`BuiltInWorkgroupId`, and thus of `BuiltInGlobalInvocationId`. Metal
provides no direct support for this... but it does provide a builtin,
`[[grid_origin]]`, normally used to pass the base values for the stage
input region, which we will now abuse to pass the dispatch base and
avoid burning a buffer binding.

`[[grid_origin]]`, as part of Metal's support for compute stage input,
requires MSL 1.2. For 1.0 and 1.1, we're forced to provide a buffer.

(Curiously, this builtin was undocumented until the MSL 2.2 release. Go
figure.)
2019-07-24 08:56:15 -05:00
Hans-Kristian Arntzen
47a18b9f1b Simplify row-major matrix/vector multiplies. 2019-07-23 10:56:57 +02:00
Hans-Kristian Arntzen
be2fccd837 Tests run clean. 2019-07-22 10:23:39 +02:00
Chip Davis
058f1a0933 MSL: Handle coherent, volatile, and restrict.
This maps them to their MSL equivalents. I've mapped `Coherent` to
`volatile` since MSL doesn't have anything weaker than `volatile` but
stronger than nothing.

As part of this, I had to remove the implicit `volatile` added for
atomic operation casts. If the buffer is already `coherent` or
`volatile`, then we would add a second `volatile`, which would be
redundant. I think this is OK even when the buffer *doesn't* have
`coherent`: `T *` is implicitly convertible to `volatile T *`, but not
vice-versa. It seems to compile OK at any rate. (Note that the
non-`volatile` overloads of the atomic functions documented in the spec
aren't present in the MSL 2.2 stdlib headers.)

`restrict` is tricky, because in MSL, as in C++, it needs to go *after*
the asterisk or ampersand for the pointer type it's modifying.

Another issue is that, in the `Simple`, `GLSL450`, and `Vulkan` memory
models, `Restrict` is the default (i.e. does not need to be specified);
but MSL likely follows the `OpenCL` model where `Aliased` is the
default. We probably need to implicitly set either `Restrict` or
`Aliased` depending on the module's declared memory model.
2019-07-11 10:22:30 -05:00
Hans-Kristian Arntzen
1a592b7c0f
Merge pull request #1067 from cdavis5e/msl-scalar-block-layout
MSL: Support scalar block layout.
2019-07-11 13:03:03 +02:00
Chip Davis
28454facbb MSL: Handle packed matrices.
The old method of using a different unpacked matrix type doesn't work
for scalar alignment. It certainly wouldn't have any effect for a square
matrix, since the number of columns and rows are the same. So now we'll
store them as arrays of packed vectors.
2019-07-10 18:37:31 -05:00
Hans-Kristian Arntzen
f6f849397e MSL: Re-roll array expressions in initializers.
We cannot rely on copy path when using an array as part of a struct
initializer, so reroll such expressions to an initializer list again.
2019-07-10 11:19:33 +02:00
Hans-Kristian Arntzen
f8b084de61 MSL/HLSL: Support OpOuterProduct. 2019-07-01 10:57:27 +02:00
Hans-Kristian Arntzen
ff87419607 Deal with scalar input values for distance/length/normalize.
HLSL and MSL don't support it, so fall back to simpler intrinsics.
2019-06-28 11:20:14 +02:00
Hans-Kristian Arntzen
e2c95bdcbc MSL: Rewrite how resource indices are fallback-assigned.
We used to use the Binding decoration for this, but this method is
hopelessly broken. If no explicit MSL resource remapping exists, we
remap automatically in a manner which should always "just work".
2019-06-21 12:54:08 +02:00
Hans-Kristian Arntzen
a6798d06a2 MSL: Error out on int64_t/uint64_t buffer members.
Not supported for whatever reason.
2019-06-19 10:14:46 +02:00
Hans-Kristian Arntzen
a6b71ae999 MSL: Support 64-bit integers. 2019-06-19 09:55:00 +02:00
Hans-Kristian Arntzen
fd0feb1ec1 MSL: Use correct address space when passing array-of-buffers.
Need to check if the descriptor set is actually an argument buffer.
2019-05-27 16:53:30 +02:00
Hans-Kristian Arntzen
42e64597a7 OpArrayLength must trigger active variables. 2019-05-27 16:44:02 +02:00
Hans-Kristian Arntzen
7b9e0fb428 MSL: Implement OpArrayLength.
This gets rather complicated because MSL does not support OpArrayLength
natively. We need to pass down a buffer which contains buffer sizes, and
we compute the array length on-demand.

Support both discrete descriptors as well as argument buffers.
2019-05-27 16:13:09 +02:00
Hans-Kristian Arntzen
55ff233526 MSL: Add test case for complex type alias. 2019-05-23 15:05:30 +02:00
Hans-Kristian Arntzen
96492648d4 MSL: Fix struct declaration order with complex type aliases.
MSL generally emits the aliases, which means we cannot always place the
master type first, unlike GLSL and HLSL. The logic fix is just to
reorder after we have tagged types with packing information, rather than
doing it in the parser fixup.
2019-05-23 14:54:04 +02:00
Hans-Kristian Arntzen
eaf7afed97 MSL: Support argument buffers and image swizzling.
Change aux buffer to swizzle buffer.
There is no good reason to expand the aux buffer, so name it
appropriately.

Make the code cleaner by emitting a straight pointer to uint rather than
a dummy struct which only contains a single unsized array member anyways.

This will also end up being very similar to how we implement swizzle
buffers for argument buffers.

Do not use implied binding if it overflows int32_t.
2019-05-18 10:30:06 +02:00
Chip Davis
9d9415754b MSL: Add support for subgroup operations.
Some support for subgroups is present starting in Metal 2.0 on both iOS
and macOS. macOS gains more complete support in 10.14 (Metal 2.1).

Some restrictions are present. On iOS and on macOS 10.13, the
implementation of `OpGroupNonUniformElect` is incorrect: if thread 0 has
already terminated or is not executing a conditional branch, the first
thread that *is* will falsely believe itself not to be. Unfortunately,
this operation is part of the "basic" feature set; without it, subgroups
cannot be supported at all.

The `SubgroupSize` and `SubgroupLocalInvocationId` builtins are only
available in compute shaders (and, by extension, tessellation control
shaders), despite SPIR-V making them available in all stages. This
limits the usefulness of some of the subgroup operations in fragment
shaders.

Although Metal on macOS supports some clustered, inclusive, and
exclusive operations, it does not support them all. In particular,
inclusive and exclusive min, max, and, or, and xor; as well as cluster
sizes other than 4 are not supported. If this becomes a problem, they
could be emulated, but at a significant performance cost due to the need
for non-uniform operations.
2019-05-15 17:40:04 -05:00
Bill Hollings
efbe7ca16f MSL: Fix infinite CAS loop on atomic_compare_exchange_weak_explicit(). 2019-04-05 21:28:57 -04:00
Hans-Kristian Arntzen
0909975655 MSL: Declare gl_WorkGroupSize constant with [[maybe_unused]].
Avoids ugly warnings on nearly every compute shader.
We could do analysis to detect whether we need to emit this constant,
but it's a bit tedious to figure out if an OpConstantComponent is
actually used by opcodes, so just make it simple.
2019-03-28 10:54:18 +01:00