2170 Commits

Author SHA1 Message Date
Chip Davis
39dce88d3b MSL: Add support for sampler Y'CbCr conversion.
This change introduces functions and, in one case, a class, to support
the `VK_KHR_sampler_ycbcr_conversion` extension. Except in the case of
GBGR8 and BGRG8 formats, for which Metal natively supports implicit
chroma reconstruction, we're on our own here. We have to do everything
ourselves. Much of the complexity comes from the need to support
multiple planes, which must now be passed to functions that use the
corresponding combined image-samplers. The rest is from the actual
Y'CbCr conversion itself, which requires additional post-processing of
the sample retrieved from the image.

Passing sampled images to a function was a particular problem. To
support this, I've added a new class which is emitted to MSL shaders
that pass around sampled images with Y'CbCr conversions attached. It
can handle sampled images with or without Y'CbCr conversion. This is an
awful abomination that should not exist, but I'm worried that there's
some shader out there which does this. This support requires Metal 2.0
to work properly, because it uses default-constructed texture objects,
which were only added in MSL 2. I'm not even going to get into arrays of
combined image-samplers--that's a whole other can of worms. They are
deliberately unsupported in this change.

I've taken the liberty of refactoring the support for texture swizzling
while I'm at it. It's now treated as a post-processing step similar to
Y'CbCr conversion. I'd like to think this is cleaner than having
everything in `to_function_name()`/`to_function_args()`. It still looks
really hairy, though. I did, however, get rid of the explicit type
arguments to `spvGatherSwizzle()`/`spvGatherCompareSwizzle()`.

Update the C API. In addition to supporting this new functionality, add
some compiler options that I added in previous changes, but for which I
neglected to update the C API.
2019-09-01 18:35:53 -05:00
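The "additional post-processing of the sample" that commit 39dce88d3b describes comes down to a colour-model conversion applied after the planes are sampled. The following is a minimal sketch of one such conversion, assuming the BT.601 model with full-range encoding. The function name `ycbcr601_to_rgb` is hypothetical and is not taken from SPIRV-Cross; the code the commit actually emits is MSL and also has to handle other models, narrow range, and chroma reconstruction.

```cpp
#include <array>

// Hypothetical helper for illustration only; not part of SPIRV-Cross.
// Inputs are normalized to [0, 1]. Chroma is assumed to be stored with
// a 0.5 bias (full-range encoding), so 0.5 means "no colour difference".
std::array<float, 3> ycbcr601_to_rgb(float y, float cb, float cr)
{
    // BT.601 luma weights for red and blue; green is what remains.
    const float kr = 0.299f;
    const float kb = 0.114f;
    const float kg = 1.0f - kr - kb;

    // Remove the bias so chroma is centred on zero.
    const float pb = cb - 0.5f;
    const float pr = cr - 0.5f;

    // Invert Y = kr*R + kg*G + kb*B together with the definitions
    // Pb = (B - Y) / (2 * (1 - kb)) and Pr = (R - Y) / (2 * (1 - kr)).
    const float r = y + 2.0f * (1.0f - kr) * pr;
    const float b = y + 2.0f * (1.0f - kb) * pb;
    const float g = y - (2.0f * kb * (1.0f - kb) / kg) * pb
                      - (2.0f * kr * (1.0f - kr) / kg) * pr;
    return { r, g, b };
}
```

A sample with neutral chroma (`cb == cr == 0.5`) maps to a grey whose three channels all equal `y`, which is a quick sanity check for any variant of this routine.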
Hans-Kristian Arntzen
9b845a4788
Merge pull request #1141 from troughton/inline-everything
MSL: Inline all non-entry-point functions
2019-08-30 11:05:04 +02:00
Thomas Roughton
6b5403206e Clang-format changes 2019-08-30 20:25:40 +12:00
Thomas Roughton
91b2f34a3d Update tests to account for all non-entry-point functions being inlined 2019-08-30 09:39:06 +12:00
Hans-Kristian Arntzen
ee7357f2a6
Merge pull request #1140 from KhronosGroup/fix-1139
MSL: Add {Base,}{Vertex,Instance}Index to bitcast_from_builtin_load.
2019-08-29 16:06:32 +02:00
Hans-Kristian Arntzen
07c76f66b5 MSL: Add {Base,}{Vertex,Instance}Index to bitcast_from_builtin_load.
Totally missed these, so float(index) would not work correctly for
negative numbers.
2019-08-29 13:56:37 +02:00
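The failure mode fixed in commit 07c76f66b5 can be reproduced outside a shader. The sketch below assumes the builtin arrives as an unsigned 32-bit value while the shader declared it as signed. Both function names are hypothetical; the point is only that converting the raw unsigned value to float gives a different result from first reinterpreting the bits as signed.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; not part of SPIRV-Cross.

// Converts the raw value as if the index were unsigned.
float index_to_float_as_unsigned(uint32_t raw)
{
    return static_cast<float>(raw);
}

// Reinterprets the bits as a signed integer first, then converts.
// memcpy is used so the reinterpretation is well defined in every
// C++ standard version.
float index_to_float_as_signed(uint32_t raw)
{
    int32_t as_signed;
    std::memcpy(&as_signed, &raw, sizeof(as_signed));
    return static_cast<float>(as_signed);
}
```

For the bit pattern `0xFFFFFFFF`, the signed path yields `-1.0f`, while the unsigned path yields a value near 2^32. For non-negative indices both paths agree, which is presumably why the missing cast went unnoticed.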
Hans-Kristian Arntzen
761d3da677
Merge pull request #1137 from cdavis5e/post-depth-coverage-essl
GLSL: Fix post-depth coverage for ESSL.
2019-08-29 13:11:59 +02:00
Thomas Roughton
e5f9e2c203 Inline all non-entry-point functions 2019-08-29 17:07:57 +12:00
Thomas Roughton
6338f0aa0f MSL: inline all emitted functions
# Conflicts:
#	spirv_msl.cpp
2019-08-29 17:07:27 +12:00
Chip Davis
5fe1ecc324 GLSL: Fix post-depth coverage for ESSL.
ESSL does not support `GL_ARB_post_depth_coverage`. There, we must use
`GL_EXT_post_depth_coverage`. I've added this as a fallback for desktop
as well.

Note that `GL_EXT_post_depth_coverage` also requires the fragment shader
to set `early_fragment_tests` explicitly, while
`GL_ARB_post_depth_coverage` does not. It doesn't really matter either
way, since `SPV_KHR_post_depth_coverage` *also* requires both execution
modes to be explicitly set.
2019-08-28 13:40:13 -05:00
Hans-Kristian Arntzen
3ccfbce264 Run format_all.sh. 2019-08-28 14:25:26 +02:00
Hans-Kristian Arntzen
de26e08195
Merge pull request #1136 from KhronosGroup/fix-1132
GLSL: Assume image and sampler can be RelaxedPrecision.
2019-08-28 14:24:28 +02:00
Hans-Kristian Arntzen
d5a65b4190 GLSL: Assume image and sampler can be RelaxedPrecision.
When merging combined image samplers, we only looked at sampler, but DXC
emits RelaxedPrecision only for texture. Does not hurt to check for more
things.
2019-08-27 17:15:19 +02:00
Hans-Kristian Arntzen
563e994486
Merge pull request #1135 from KhronosGroup/fix-1119
MSL: Deal with array copies from and to threadgroup.
2019-08-27 15:48:08 +02:00
Hans-Kristian Arntzen
aec826222d
Merge pull request #1134 from KhronosGroup/fix-1117
Do not allow base expressions for non-native row-major matrices.
2019-08-27 15:47:33 +02:00
Hans-Kristian Arntzen
9436cd3036 MSL: Deal with array copies from and to threadgroup. 2019-08-27 13:18:01 +02:00
Hans-Kristian Arntzen
1017a02aad
Merge pull request #1133 from KhronosGroup/fix-1115
Deal with ldexp taking uint input.
2019-08-27 13:17:43 +02:00
Hans-Kristian Arntzen
b198b15b27
Merge pull request #1131 from KhronosGroup/fix-1114
Remove unnecessary continue block statements
2019-08-27 13:17:31 +02:00
Hans-Kristian Arntzen
7ff2db4570 Do not allow base expressions for non-native row-major matrices. 2019-08-27 11:41:54 +02:00
Hans-Kristian Arntzen
2f7848dcda Deal with ldexp taking uint input.
Need to value cast to int first.
2019-08-27 11:19:54 +02:00
Hans-Kristian Arntzen
5d97dae1eb Move branchless analysis to CFG.
Traverse backwards instead, far more robust. Should elide basically all
redundant continue; statements now.
2019-08-27 10:19:19 +02:00
Hans-Kristian Arntzen
55c2ca90ae Elide branches to continue block when continue block is also a merge. 2019-08-27 10:19:01 +02:00
Hans-Kristian Arntzen
903ef0e40a
Merge pull request #1130 from KhronosGroup/fix-1112
Deal correctly with sign on bitfield operations.
2019-08-26 16:23:00 +02:00
Hans-Kristian Arntzen
cf95dc2ef7
Merge pull request #1129 from KhronosGroup/fix-1110
Fix variable scope when switch block exits multiple times.
2019-08-26 11:39:11 +02:00
Hans-Kristian Arntzen
b3305799a8 Deal correctly with sign on bitfield operations.
Need a lot of special purpose implementation functions for these.
2019-08-26 11:36:36 +02:00
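Why sign matters for bitfield operations (commit b3305799a8) is easiest to see with extraction: the same bits yield different values depending on whether the result is sign-extended or zero-extended. The following sketch is a CPU-side model under the assumption that `offset + bits` does not exceed 32. The function names are hypothetical and are not the implementation functions the commit added.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not part of SPIRV-Cross.
// Both assume offset < 32 and offset + bits <= 32.

// Unsigned extract: the result is zero-extended.
uint32_t bitfield_extract_unsigned(uint32_t base, uint32_t offset, uint32_t bits)
{
    if (bits == 0)
        return 0;
    const uint32_t mask = bits >= 32 ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    return (base >> offset) & mask;
}

// Signed extract: the top bit of the extracted field is the sign bit.
int32_t bitfield_extract_signed(int32_t base, uint32_t offset, uint32_t bits)
{
    if (bits == 0)
        return 0;
    // Work in 64 bits so the sign adjustment cannot overflow.
    int64_t value = bitfield_extract_unsigned(static_cast<uint32_t>(base), offset, bits);
    if (value & (int64_t(1) << (bits - 1)))
        value -= int64_t(1) << bits;
    return static_cast<int32_t>(value);
}
```

Extracting four bits at offset 4 from `0xF0` gives 15 through the unsigned path and -1 through the signed path, so picking the wrong variant changes results rather than just types.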
Hans-Kristian Arntzen
e3d4dddfec Fix variable scope when switch block exits multiple times.
Inner scope can still dominate here, so we need to be conservative when
we observe switch blocks specifically. Normal selection merges cannot
merge from multiple paths.
2019-08-26 10:05:43 +02:00
Hans-Kristian Arntzen
4ce04480ec
Merge pull request #1111 from KhronosGroup/fix-1108
Fix severe performance issue with invariant expression invalidation.
2019-08-01 10:01:15 +02:00
Hans-Kristian Arntzen
b97e9b0499 Fix severe performance issue with invariant expression invalidation.
We were going down a tree of expressions multiple times and this caused
an exponential explosion in time, which was not caught until recently.

Fix this by blocking any traversal going through an ID more than one
time.

This fix overall improves performance by almost an order of magnitude on a
particular test shader rather than slowing it down by ~75x.
2019-08-01 09:55:21 +02:00
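The exponential blow-up described in commit b97e9b0499, and the "visit each ID at most once" remedy, can be sketched on a toy dependency graph. Everything here is a hypothetical illustration: `DependencyGraph`, `count_visits_naive`, `count_visits_once`, and `make_ladder` are not SPIRV-Cross names, and the real traversal invalidates expressions rather than counting visits.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical types for illustration only; not part of SPIRV-Cross.
using ID = uint32_t;
using DependencyGraph = std::unordered_map<ID, std::vector<ID>>;

// Naive traversal: shared sub-expressions are walked once per path.
std::size_t count_visits_naive(const DependencyGraph &graph, ID id)
{
    std::size_t visits = 1;
    auto itr = graph.find(id);
    if (itr != graph.end())
        for (ID dep : itr->second)
            visits += count_visits_naive(graph, dep);
    return visits;
}

// Guarded traversal: an ID that was already seen is skipped entirely.
std::size_t count_visits_once(const DependencyGraph &graph, ID id,
                              std::unordered_set<ID> &visited)
{
    if (!visited.insert(id).second)
        return 0;
    std::size_t visits = 1;
    auto itr = graph.find(id);
    if (itr != graph.end())
        for (ID dep : itr->second)
            visits += count_visits_once(graph, dep, visited);
    return visits;
}

// Builds a chain where each node uses the next node twice,
// e.g. expressions of the form x_i = x_{i+1} * x_{i+1}.
DependencyGraph make_ladder(uint32_t depth)
{
    DependencyGraph graph;
    for (ID i = 0; i < depth; i++)
        graph[i] = { i + 1, i + 1 };
    return graph;
}
```

On a ladder of depth `n`, the naive walk performs 2^(n+1) - 1 visits while the guarded walk performs n + 1, which matches the exponential-versus-linear behaviour the commit message describes.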
Hans-Kristian Arntzen
ffca8735ff
Merge pull request #1105 from cdavis5e/msl-unify-as
MSL: Unify the get_*_address_space() methods.
2019-07-29 10:19:12 +02:00
Chip Davis
df18d98bea MSL: Unify the get_*_address_space() methods.
These methods have largely the same logic, with minor differences. That
I felt compelled to duplicate the logic into another method was one of
the things that bothered me about the variable pointers change. This
cleans that part of the code up; now we don't have two places to change.
2019-07-26 09:43:28 -05:00
Hans-Kristian Arntzen
d378413040
Merge pull request #1103 from KhronosGroup/fix-1100
MSL: Cleanup temporary use with emit_uninitialized_temporary.
2019-07-26 14:35:18 +02:00
Hans-Kristian Arntzen
87513f9ac0
Merge pull request #1102 from KhronosGroup/fix-1096
MSL: Deal with Modf/Frexp where output is access chain to scalar.
2019-07-26 14:28:16 +02:00
Hans-Kristian Arntzen
0630a8533c
Merge pull request #1101 from KhronosGroup/fix-1095
Do not force temporary unless continue-only for loop dominates.
2019-07-26 14:27:13 +02:00
Hans-Kristian Arntzen
c3e8e728d8 MSL: Cleanup temporary use with emit_uninitialized_temporary. 2019-07-26 11:16:43 +02:00
Hans-Kristian Arntzen
abb345d0b3 MSL: Deal with Modf/Frexp where output is access chain to scalar.
This is not allowed as we cannot take mutable reference to a
vec.{x,y,z,w}. We only care about scalar since entire vectors are fine.
2019-07-26 11:02:38 +02:00
Hans-Kristian Arntzen
d620f1dd26 Do not force temporary unless continue-only for loop dominates.
We would force temporaries in unexpected places, causing assertions to
throw if access chains were consumed in such loops.
2019-07-26 10:39:05 +02:00
Hans-Kristian Arntzen
301eab1b7a
Merge pull request #1099 from KhronosGroup/fix-1091
Missed case where DoWhile continue block deals with Phi.
2019-07-25 17:44:17 +02:00
Hans-Kristian Arntzen
798282d303
Merge pull request #1098 from KhronosGroup/fix-1090
Vulkan GLSL: Support disabling samplerless texture function EXT.
2019-07-25 16:10:26 +02:00
Hans-Kristian Arntzen
e06efb7259 Missed case where DoWhile continue block deals with Phi. 2019-07-25 12:30:50 +02:00
Hans-Kristian Arntzen
12ca9d1982 Vulkan GLSL: Support disabling samplerless texture function EXT.
Some platforms support Vulkan GLSL, but not this extension apparently
...
2019-07-25 11:07:14 +02:00
Hans-Kristian Arntzen
78fccc4d5c Merge branch 'msl-dispatch-base' 2019-07-25 10:32:14 +02:00
Hans-Kristian Arntzen
3c03b55c46 Workaround MSVC 2013 compiler issues. 2019-07-25 10:28:11 +02:00
Hans-Kristian Arntzen
35fc810a0c Merge branch 'msl-dispatch-base' of git://github.com/cdavis5e/SPIRV-Cross into msl-dispatch-base 2019-07-25 10:26:44 +02:00
Chip Davis
fb5ee4cb5c MSL: Adjust BuiltInWorkgroupId for vkCmdDispatchBase().
This command allows the caller to set the base value of
`BuiltInWorkgroupId`, and thus of `BuiltInGlobalInvocationId`. Metal
provides no direct support for this... but it does provide a builtin,
`[[grid_origin]]`, normally used to pass the base values for the stage
input region, which we will now abuse to pass the dispatch base and
avoid burning a buffer binding.

`[[grid_origin]]`, as part of Metal's support for compute stage input,
requires MSL 1.2. For 1.0 and 1.1, we're forced to provide a buffer.

(Curiously, this builtin was undocumented until the MSL 2.2 release. Go
figure.)
2019-07-24 08:56:15 -05:00
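The arithmetic behind the dispatch-base adjustment in commit fb5ee4cb5c is small enough to model directly. This sketch assumes the hardware reports workgroup IDs starting from zero and that the base is supplied separately (through `[[grid_origin]]` or a buffer, per the commit message). The struct and function names are hypothetical, and the MSL that SPIRV-Cross actually emits may arrange the computation differently.

```cpp
#include <cstdint>

// Hypothetical types and helpers for illustration only; not part of SPIRV-Cross.
struct UVec3
{
    uint32_t x, y, z;
};

// WorkgroupId as the shader should see it under vkCmdDispatchBase():
// the zero-based ID reported by the hardware plus the dispatch base.
UVec3 adjusted_workgroup_id(UVec3 raw_id, UVec3 dispatch_base)
{
    return { raw_id.x + dispatch_base.x,
             raw_id.y + dispatch_base.y,
             raw_id.z + dispatch_base.z };
}

// GlobalInvocationId = WorkgroupId * WorkgroupSize + LocalInvocationId,
// so shifting WorkgroupId shifts GlobalInvocationId as well.
UVec3 global_invocation_id(UVec3 workgroup_id, UVec3 workgroup_size, UVec3 local_id)
{
    return { workgroup_id.x * workgroup_size.x + local_id.x,
             workgroup_id.y * workgroup_size.y + local_id.y,
             workgroup_id.z * workgroup_size.z + local_id.z };
}
```

With a base of (2, 0, 0), a raw workgroup ID of (1, 0, 0), a workgroup size of (8, 1, 1), and a local ID of (3, 0, 0), the adjusted workgroup ID is (3, 0, 0) and the global invocation ID is (27, 0, 0).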
Hans-Kristian Arntzen
07bb1a53e0
Merge pull request #1089 from KhronosGroup/msl-packing-refactor
MSL: Refactor buffer packing logic from ground up.
2019-07-24 15:35:00 +02:00
Hans-Kristian Arntzen
d90eeddcf1 Fix some typos in comments. 2019-07-24 12:14:19 +02:00
Hans-Kristian Arntzen
c62503bca7 Do not attempt to pack types which are already scalar. 2019-07-24 11:52:28 +02:00
Hans-Kristian Arntzen
4bc8729c0e HLSL query lod cleanups. 2019-07-24 11:34:28 +02:00
Hans-Kristian Arntzen
461f1506e7 Do not eagerly invalidate all active variables on a branch.
This is not necessary, as we must emit an invalidating store before we
potentially consume an invalid expression. In fact, we're a bit
conservative here. In this case, for example:

int tmp = variable;
if (...)
{
    variable = 10;
}
else
{
    // Consuming tmp here is fine, but it was
    // invalidated while emitting other branch.
    // Technically, we need to study if there is an invalidating store
    // in the CFG between the loading block and this block, and the other
    // branch will not be a part of that analysis.
    int tmp2 = tmp * tmp;
}

Fixing this case means complex CFG traversal *everywhere*, and it feels like overkill.

Fixing this exposed a bug with access chains, where expression dependencies were not
inherited properly, so fix that as well. Access chains are now considered forwarded if there
is at least one dependency which is also forwarded.
2019-07-24 11:17:30 +02:00
Hans-Kristian Arntzen
18bcc9b790 Do not disable temporary forwarding when we suppress usage tracking.
This subtle bug removed any expression validation for trivially swizzled
variables. Make usage suppression a more explicit concept rather than
just hacking off forwarded_temporaries.

There is some fallout here with loop generation since our expression
invalidation is currently a bit too naive to handle loops properly.
The forwarding bug masked this problem until now.

If part of the loop condition is also used in the body, we end up
reading an invalid expression, which in turn forces a temporary to be
generated in the condition block, not good. We'll need to be smarter
here ...
2019-07-23 19:18:44 +02:00