If a shader includes a mixture of several stages, such as HS and GS,
the non-stage output geometry should be ignored, lest it conflict
with the stage output.
Two unrelated, minor tweaks:
(1) Use std::array for shiftBindingForSet. Now matches shiftBinding.
(2) Add parens in shouldFlatten() to make compiler warning happy.
Fixes#1092. Allows arrays of opaques to keep arrayness, unless
needed by uniform array flattening.
Can handle assignments of mixed amounts of flattening.
There was some code replication around getting string and integer
values out of an attribute map. This adds new methods to the
TAttributeMap class to encapsulate some accessor details.
A single texture can statically appear in code mixed with a shadow sampler
and a non-shadow sampler. This would be create invalid SPIR-V, unless
one of them is provably dead.
The previous detection of this happened before DCE, so some shaders would
trigger the error even though they wouldn't after DCE. To handle that
case, this PR splits the texture into two: one with each mode. It sets
"needsLegalization" (if that happens for any texture) to warn that this shader
will need post-compilation legalization.
If the texture is only used with one of the two modes, behavior is as it
was before.
Add support for Subpass Input proposal of issue #1069.
Subpass input types are given as:
layout(input_attachment_index = 1) SubpassInput<float4> subpass_f;
layout(input_attachment_index = 2) SubpassInput<int4> subpass_i;
layout(input_attachment_index = 3) SubpassInput<uint4> subpass_u;
layout(input_attachment_index = 1) SubpassInputMS<float4> subpass_ms_f;
layout(input_attachment_index = 2) SubpassInputMS<int4> subpass_ms_i;
layout(input_attachment_index = 3) SubpassInputMS<uint4> subpass_ms_u;
The input attachment may also be specified using attribute syntax:
[[vk::input_attachment_index(7)]] SubpassInput subpass_2;
The template type may be a shorter-than-vec4 vector, but currently user
structs are not supported. (An unimplemented error will be issued).
The load operations are methods on objects of the above type:
float4 result = subpass_f.SubpassLoad();
int4 result = subpass_i.SubpassLoad();
uint4 result = subpass_u.SubpassLoad();
float4 result = subpass_ms_f.SubpassLoad(samp);
int4 result = subpass_ms_i.SubpassLoad(samp);
uint4 result = subpass_ms_u.SubpassLoad(samp);
Additionally, the AST printer could not print EOpSubpass* nodes. Now it can.
Fixes#1069
- support C++11 style brackets [[...]]
- support namespaces [[vk::...]]
- support these on parameter declarations in functions
- support location, binding/set, input attachments
Texture shadow mode must match the state of the sampler they are
combined with. This change does that, both for the AST and the
symbol table. Note that the texture cannot easily be *created*
the right way, because this may not be known at that time. Instead,
the texture is subsequently patched.
This cannot work if a single texture is used with both a shadow and
non-shadow sampler, so that case is detected and generates an error.
This is permitted by the HLSL language, however. See #1073 discussion.
Fixed one test source that was using a texture with both shadow and
non-shadow samplers.
InputPatch parameters to patch constant functions were not using the
internal (temporary) variable type. That could cause validation errors
if the input patch had a mixture of builtins and user qualified members.
This uses the entry point's internal form.
There is currently a limitation: if an InputPatch is used in a PCF,
it must also have appeared in the main entry point's parameter list.
That is not a limitation of HLSL. Currently that situation is detected
and an "implemented" error results. The limitation can be addressed,
but isn't yet in the current form of the PR.
Hull shaders have an implicitly arrayed output. This is handled by creating an arrayed form of the
provided output type, and writing to the element of it indexed by InvocationID.
The implicit indirection into that array was causing some troubles when copying to a split
structure. handleAssign was able to handle simple symbol lvalues, but not an lvalue composed
of an indirection into an array.
Changes:
(1) Allow clip/cull builtins as both input and output in the same shader stage. Previously,
not enough data was tracked to handle this.
(2) Handle the extra array dimension in GS inputs. The synthesized external variable can
now be created with the extra array dimension if needed, and the form conversion code is
able to handle it as well.
For example, both of these GS inputs would result in the same synthesized external type:
triangle in float4 clip[3] : SV_ClipDistance
triangle in float2 clip[3][2] : SV_ClipDistance
In the second case, the inner array dimension packs with the 2-vector of floats into an array[4],
which there is an array[3] of due to the triangle geometry.
While adding geometry stage support for clip/cull, it transpired that the
existing clip/cull support was not setting the type of the result of indexing
into the clup/cull variable. That's a defect independent of the geometry
support, so to simplify the geometry PR, this is addressed separately.
It doesn't appear to change the generated SPIR-V, but that's probably down to
something else tolerating a bad input.
HLSL allows a range of types for clip and cull distances. There are
three dimensions, including arrayness, vectorness, and semantic ID.
SPIR-V requires clip and cull distance be a single array of floats in
all cases.
This code provides input side conversion between the SPIR-V form and
the HLSL form. (Output conversion was added in PR #947 and #997).
This PR extends HlslParseContext::assignClipCullDistance to cope with
the input side conversion. Not as much changed as appears: there was
also a lot of renaming to reflect the fact that the code now handles
either direction.
Currently, non-{frag,vert} stages are not handled, and are explicitly
rejected.
Fixes#1026.
Some languages allow a restricted set of user structure types returned from texture sampling
operations. Restrictions include the total vector size of all components may not exceed 4,
and the basic types of all members must be identical.
This adds underpinnings for that ability. Because storing a whole TType or even a simple
TTypeList in the TSampler would be expensive, the structure definition is held in a
table outside the TType. The TSampler contains a small bitfield index, currently 4 bits
to support up to 15 separate texture template structure types, but that can be adjusted
up or down. Vector returns are handled as before.
There are abstraction methods accepting and returning a TType (such as may have been parsed
from a grammar). The new methods will accept a texture template type and set the
sampler to the structure if possible, checking a range of error conditions such as whether
the total structure vector components exceed 4, or whether their basic types differe, or
whether the struct contains non-vector-or-scalar members. Another query returns the
appropriate TType for the sampler.
High level summary of design:
In the TSampler, this holds an index into the texture structure return type table:
unsigned int structReturnIndex : structReturnIndexBits;
These are the methods to set or get the return type from the TSampler. They work for vector or structure returns, and potentially could be expanded to handle other things (small arrays?) if ever needed.
bool setTextureReturnType(TSampler& sampler, const TType& retType, const TSourceLoc& loc);
void getTextureReturnType(const TSampler& sampler, const TType& retType, const TSourceLoc& loc) const;
The ``convertReturn`` lambda in ``HlslParseContext::decomposeSampleMethods`` is greatly expanded to know how to copy a vec4 sample return to whatever the structure type should be. This is a little awkward since it involves introducing a comma expression to return the proper aggregate value after a set of memberwise copies.
This adds support for #pragma pack_matrix() to the HLSL front end.
The pragma sets the default matrix layout for subsequent unqualified matrices
in structs or buffers. Explicit qualification overrides the pragma value. Matrix
layout is not permitted at the structure level in HLSL, so only leaves which are
matrix types can be so qualified.
Note that due to the semantic (not layout) difference in first matrix indirections
between HLSL and SPIR-V, the sense of row and column major are flipped. That's
independent of this PR: just a factor to note. A column_major qualifier appears
as a RowMajor member decoration in SPIR-V modules, and vice versa.
The HLSL FE tracks four versions of a declared type to avoid losing information, since it
is not (at type-decl time) known how the type will be used downstream. If such a type
was used in a cbuffer declaration, the cbuffer type's members should have been using
the uniform form of the original user structure type, but were not.
This would manifest as matrix qualifiers (and other things, such as pack offsets) on user struct
members going missing in the SPIR-V module if the struct type was a member of a cbuffer, like so:
struct MyBuffer
{
row_major float4x4 mat1;
column_major float4x4 mat2;
};
cbuffer Example
{
MyBuffer g_MyBuffer;
};
Fixes: #789
This allows removal of isPerVertexBuiltIn(). It also leads to
removal of addInterstageIoToLinkage(), which is no longer needed.
Includes related name improvements.
The goal is to flatten all I/O, but there are multiple categories and
steps to complete, likely including a final unification of splitting
and flattening.
Most of this was obsoleted by entry-point wrapping.
Some other is just unnecessary.
Also, includes some spelling/name improvements.
This is to help lay ground work for flattening user I/O.
Two non-functional changes:
1. Remove flattenLevel, which is unneeded since at or around d1be7545c6.
2. Fix build warining about unused variable in executeInitializer.
HLSL allows several variables to be declared. There are packing rules involved:
e.g, a float3 and a float1 can be packed into a single array[4], while for a
float3 and another float3, the second one will skip the third array entry to
avoid straddling
This is implements that ability. Because there can be multiple variables involved,
and the final output array will often be a different type altogether (to fuse
the values into a single destination), a new variable is synthesized, unlike the prior
clip/cull support which used the declared variable. The new variable name is
taken from one of the declared ones, so the old tests are unchanged.
Several new tests are added to test various packing scenarios.
Only two semantic IDs are supported: 0, and 1, per HLSL rules. This is
encapsulated in
static const int maxClipCullRegs = 2;
and the algorithm (probably :) ) generalizes to larger values, although there
are a few issues around how HLSL would pack (e.g, would 4 scalars be packed into
a single HLSL float4 out reg? Probably, and this algorithm assumes so).
--resource-set-binding has a mode which allows per-register assignments of
bindings and descriptor sets on the command line, and another accepting a
single descriptor set value to assign to all variables.
The former worked, but the latter would crash when assigning the values.
This fixes it, and makes the former case a bit more robust against premature
termination of the pre-register values, which must come in (regname,set,binding)
triples.
This also allows the form "--resource-set-binding stage setnum", which was
mentioned in the usage message, but did not parse.
The operation of the per-register form of this option is unchanged.
Semantic test left over from other source languages is removed, since this is permitted by HLSL.
Also, to support the functionality, a targeted test is performed for this case and it is
turned into a EvqGlobal qualifier to create an AST initialization segment when needed.
Constness is now propagated up aggregate chains during initializer construction. This
handles hierarchical cases such as the distinction between:
static const float2 a[2] = { { 1, 2 }, { 3, 4} };
vs
static const float2 a[2] = { { 1, 2 }, { cbuffer_member, 4} };
The first of which can use a first class constant initalization, and the second cannot.
HLSL allows float/etc types for any/all intrinsics, while the
SPIR-V opcode requires bool. This adds a simple decomposition
to type convert the argument. It could get a little more clever
in some of the type cases if it ever had to.
Lays the groundwork for fixing issue #954.
Partial flattenings were previously tracked through a stack of active subsets
in the parse context, but full functionality needs AST nodes to represent
this across time, removing the need for parsecontext tracking.
In HLSL, there are three (TODO: ??) dimensions of clip and cull
distance values:
* The semantic's value N, ala SV_ClipDistanceN.
* The array demension, if the value is an array.
* The vector element, if the value is a vector or array of vectors.
In SPIR-V, clip and cull distance are arrays of scalar floats, always.
This PR currently ignores the semantic N axis, and handles the other
two axes by sequentially copying each vector element of each array member
into sequential floats in the output array.
Fixes: #946
This fixes:
1. A compilation error when assigning scalars to matricies
2. A semantic error in matrix construction from scalars. This was
initializing the diagonal, where HLSL semantics require the scalar be
replicated to every matrix element.
3. Functions accepting mats can be called with scalars, which will
be shape-converted to the matrix type. This was previously failing
to match the function signature.
NOTE: this does not yet handle complex scalars (a function call,
say) used to construct matricies. That'll be added when the
node replicator service is available. For now, there's an assert.
There's one new test (hlsl.scalar2matrix.frag). An existing test
lsl.type.half.frag changes, because of (2) above, and a negative
test error message changes due to (3) above.
Fixes#923.
For "s.m = t", a sampler member assigned a sampler, make t an alias
for s.m, and when s.m is flattened, it will flatten to the alias t.
Normally, assignments to samplers are disallowed.
This modifies function parameter passing to pass the counter
buffer associated with a struct buffer to a function as a
hidden parameter. Similarly function declarations will have
hidden parameters added to accept the associated counter buffers.
There is a limitation: if a SB type may or may not have an associated
counter, passing it as a function parameter will assume that it does, and
the counter will appear in the linkage whether or not there is a counter
method used on the object.
Marking as WIP since it might deserve discussion or at least explicit consideration.
During type sanitization, the TQualifier's TBuiltInVariable type is lost. However,
sometimes it's needed downstream. There were already two methods in use to track
it through to places it was needed: one in the TParameter, and one in a map in the
HlslParseContext used for structured buffers.
The latter was going to be insufficient when SB types with counters are passed to
user functions, and it's proving awkward to track the data to where it's needed.
This PR holds a proposal: track the original declared builtin type in the TType,
so it's trivially available where needed.
This lets the other two mechanisms be removed (and they are in this PR). There's a
side benefit of not losing certain types of information before the reflection interface.
This PR is only that proposal, so it changes no test results. If it's acceptable,
I'll use it for the last piece of SB counter functionality.
This implements mytex.mips[mip][coord] for texture types. There is
some error testing, but not comprehensive. The constructs can be
nested, e.g in this case the inner .mips is parsed before the completion
of the outer [][] operator.
tx.mips[tx.mips[a][b].x][c]
Using GS methods such as Append() in non-GS stages should be ignored, but was
creating errors due to the lack of a stream output symbol for the non-GS stage.
This adds infrastructure suitable for any front end to create SPIR-V loop
control flags. The only current front end doing so is HLSL.
[unroll] turns into spv::LoopControlUnrollMask
[loop] turns into spv::LoopControlDontUnrollMask
no specification means spv::LoopControlMaskNone
This reverts commit cfc69d95af.
* Change CMAKE_INSTALL_PREFIX default on Windows in order
to prevent permission denied errors when trying to install
to "Program Files".
* Use `GNUInstallDirs` in order to respect GNU conventions.
This is especially important for multi-arch/multi-lib setups.
* Specify position independent mode building properly, without
using the historic hack of adding `-fPIC` as a definition.
This makes the build system more portable.
* Only detect C++ (and not C) to slightly speed up configuring.
* Specify C++11 mode using modern CMake idioms.
* Fix some whitespace issues.
Multisample textures support a GetSamplePosition() method intended to query
positions given a sample index. This cannot be truly implemented in SPIR-V,
but #753 requested returning standard positions for the 1..16 cases, which
this PR adds. Anything besides that returns (0,0). If the standard positions
are not used, this will be wrong.
This should be revisited when there is a real query available.
glslang/MachineIndependent/iomapper.cpp:207:9: error: field 'resolver' will be initialized after field 'stage' [-Werror,-Wreorder]
: resolver(r)
^
glslang/MachineIndependent/iomapper.cpp:263:9: error: field 'resolver' will be initialized after field 'stage' [-Werror,-Wreorder]
: resolver(r)
^
hlsl/hlslParseHelper.cpp:70:5: error: field 'gsStreamOutput' will be initialized after field 'inputPatch' [-Werror,-Wreorder]
gsStreamOutput(nullptr),
^
Byte address buffers were failing to detect that they were byte address
buffers when used as fn parameters.
Note: this detection is a little awkward, and could be simplified if
it was easy to obtain the declared builtin type for an object.
Some texture and SB operations can take non-integer indexes, which should be
cast to integers before use if they are not already. This adds makeIntegerIndex()
for the purpose. Int types are left alone.
(This was done before for operator[], but needs to apply to some other things
too, hence its extraction into common function now)
This adds TProgram::getUniformBlockCounterIndex(int index), which returns the
index the block of the counter buffer associated with the block of the passed in
index, if any, or -1 if none.
This is WIP, heavy on the IP part. There's not yet enough to use in real workloads.
Currently present:
* Creation of separate counter buffers for structured buffer types needing them.
* IncrementCounter / DecrementCounter methods
* Postprocess to remove unused counter buffers from linkage
* Associated counter buffers are given @count suffix (invalid as a user identifier)
Not yet present:
* reflection queries to obtain bindings for counter buffers
* Append/Consume buffers
* Ability to use SB references passed as fn parameters
The prior decomposition of isfinite was not setting the return type on the
sequence node. (Sequence was used because there's an internal temporary
to avoid the complex rvalue problem).
HLSL requires vec2 tessellation coordinate declarations in some cases
(e.g, isoline topology), where SPIR-V requires the TessCoord qualified
builtin to be a vec3 in all cases. This alters the IO form of the
variable to be a vec3, which will be copied to the shader's declared
type if needed. This is not a validation; the shader type must be correct.
Improves foundation for adding scalar casts.
Makes handle/make names more sane, better commented, uses more
precise subclass typing, and removes mutual recursion between
converting initializer lists and making constructors.
Previously, patch constant functions only accepted OutputPatch. This
adds InputPatch support, via a pseudo-builtin variable type, so that
the patch can be tracked clear through from the qualifier.
The prior implementation of GS did not work with the new EP wrapping architecture.
This fixes it: the Append() method now looks up the actual output rather
than the internal sanitized temporary type, and writes to that.
In the hull shader, the PCF output does not participate in an argument list,
so has no defined ordering. It is always put at the end of the linkage. That
means the DS input reading PCF data must be be at the end of the DS linkage
as well, no matter where it may appear in the argument list. This change
makes sure that happens.
The detection is by looking for arguments that contain tessellation factor
builtins, even as a struct member. The whole struct is taken as the PCF output
if any members are so qualified.
The SPIR-V generator had assumed tessellation modes such as
primitive type and vertex order would only appear in tess eval
(domain) shaders. SPIR-V allows either, and HLSL allows and
possibly requires them to be in the hull shader.
This change:
1. Passes them through for either tessellation stage, and,
2. Does not set up defaults in the domain stage for HLSl compilation,
to avoid conflicting definitions.
Unknown how extensive the semantics need to be yet. Need real
feedback from workloads. This is just done as part of unifying it
with the class/struct namespaces and grammar productions.
HLSL HS outputs a per ctrl point value, and the DS reads an array
of that type. (It also has a per patch frequency). The per-ctrl-pt
frequency is arrayed on just one side, as opposed to SPIR-V which
is arrayed on both. To match semantics, the compiler creates an
array behind the scenes and indexes it by invocation ID, assigning
the HS return value to it.