including SPV generation using SPV_EXT_fragment_invocation_density.
This is an alias of the functionality in SPV_NV_shading_rate, and thus in some
cases we can only have one set of the tokens present (switch statements), so
we have picked the EXT version. This required updating the expected test
results for SPV_NV_shading_rate.
Also updated the known-good for spirv-headers so that the validator in
spirv-tools knows about the new extension.
This change adds unary conversion folding when the source is a constant.
This fixes an ISV issue whereby:
```
const float16_t f = float16_t(42.0);
```
Wouldn't compile because the conversion operator would always produce an
EvqTemporary when it could have produced an EvqConst.
I've also added a test case that proves out that all basic-type to
basic-type conversions work.
If a block has assigned a XfbOffset it is assumed that it would
inherit the current global XfbBuffer. This commit fixes two use cases:
1) Getting the members of a Block with a XfbOffset to be assigned an
offset, as explained on GLSL 4.60 spec, section "4.4.2 Output
Layout Qualifiers", subsection "Transform Feedback Layout
Qualifiers".
2) Compute properly an error on overlapping ranges if a block is
assigned a XfbOffset and one of it members is assigned a explicit
one. This gets working because when the members of a block get
assigned a Offset/Buffer at fixBlockXfbOffsets, then the block is
deassigned the Offsets, so ranges are computed only with the block
members.
BTW, this is already done when redeclaring block builtins.
Fixes#1535
If the out variable is a struct type, with a xfb_offset explicitly
assigned, the members need to get their xfb_offset assigned. This is
specially relevant, as we cannot use layout qualifiers on struct
members.
UniformAndStorageBuffer8BitAccess capability.
When using the 8-bit storage extension it basically always used the
`UniformAndStorageBuffer8BitAccess` capability, even in cases where it
wasn't required. For instance if we are targeting Vulkan 1.1 (SPIR-V 1.3
or higher), and we are only using 8-bit types in an SSBO, we only need
the `StorageBuffer8BitAccess` capability.
I fixed this by enabling storage buffer use in Vulkan 1.1 / SPIR-V 1.3
or higher, and then changing the logic to match.
I also added some tests that will output different capabilities when run
on Vulkan 1.0 and 1.1, thus they are added twice to the test list (one
for each version).
Fixes#1539
- Emit relevant capability/extension for use of perprimitiveNV in fragment shader
- Remove redundant checks for mesh shader qualifiers in glslang.y
- Add profile version check for use of extension GL_NV_mesh_shader
- Add a new gtest for use of perprimitiveNV in fragment shader
Apart from allowing redeclaration of gl_MeshPerVertexNV and gl_MeshPerPrimitiveNV blocks, this change also -
- Resize clip/cull perview distances based on static index use
- Error out use of both single-view and per-view builtins
- Add new gtests with redeclared blocks and edit existing test output
- Fix couple of typos
This is one step in providing full linker functionality for creating
correct SPIR-V from multiple compilation units for the same stage.
(This was the only remaining "hard" part. The rest should be simple.)
These introduce limited support for 8/16-bit types such that they can only be accessed in buffer memory and converted to/from 32-bit types.
Contributed from Khronos-internal work.
- Adds a pragma to see binary output of double values (not portable)
- Print decimals that show more values, but in a portable way
(lots of portability issues)
- Expand the tests to test more double values
Note: it is quite difficult to have 100% portable tests for floating point.
The current situation works by not printing full precision, and working around
several portability issues.
Previously, casting an object of a struct type to an identical type
would produce an error. This PR allows this case.
As a side-effect of the change, several self-type casts in existing
tests go away. For example:
0:10 Construct float ( temp float)
0:10 'f' ( in float)
becomes this (without the unneeded constructor op):
0:10 'f' ( in float)
For vector or array types this can result in somewhat less overall code.
Fixes: #1218
SPIR-V requires the coverage mask to be an array of integers, but HLSL
allows scalar integers. This adds the requisite type conversion and
wrapped entry point handling.
Fixes: #1202
This PR forces the external definition of SV_GroupID variables to 3-vectors.
The conversion process between the shader-declared type and the external type
happens in wrapped main IO variable conversion.
The same applies to SV_DispatchThreadID and SV_GroupThreadID.
Fixes: #1371
Append() method is special: unlike most outputs, it does not copy
some temporary data to a symbol in the entry point epilogue, but
rather uses an emit builtin after each write to the output stream.
This had been handled by remembering the special output symbol for
the stream as it was declared in the shader entry point before
symbol sanitization. However the prior code was too simple and
only handled cases where the Append() method happened after the
entry point, so that the output symbol had been seen.
This PR adds a patching step so that the Append()s may appear in
any order WRT the entry point. They are patched in an epilogue,
whereupon it is guaranteed in a well formed shader that we have
seen the appropriate declaration.
Fixes#1217.
There a couple functional problems, which when reduced down also led to
some good simplifications and rationalization. So, this commit:
- corrects "mixed" functionality: int[A] f[B] -> f[B][A]
- correct multi-identifier decls: int[A] f[B], g[C] -> f and g are independently sized.
- increases symmetry between different places in the code that do this
- makes fewer ways to do the same thing; several methods are just gone now
- makes more clear when something is copied or shared
HLSL allows image and texture types to be templatized on sub-vec4 types,
or even structures. This was mostly handled already during creation of
sampling operations. However, for operator[] which can generate image
loads, this wasn't happening.
It also isn't very easy to do at that point in time, because operator[]
does not know where the results it produces will end up. They may be
an lvalue or an rvalue, and there's a post-process to convert loads to
stores. They may end up in atomic ops.
To bypass that difficulty, GlslangToSpv now looks for this case and
adds the appropriate conversion. LIMITATION: this only works for
cases for which a simple conversion opcode suffices. That is to say,
it will not work if the type is templatized on a struct.
- make it sharable with GLSL
- correct the case insensitivity
- remove the map; queries are not needed, all entries need processing
- make it easier to build bottom up (will help GLSL parsing)
- support semantic checking and reporting
- allow front-end dependent semantics and attribute name mapping
- correct inheritence (or not) of the right XFB buffer
- compute implicit stride (fixes#1212)
- semantic check block-member redeclarations
- inherit stride from a member
Also, only emit this XFB information where the SPIR-V spec says
it should be emitted: essentially, on objects.
This and the previous commit together fix#1185.
Issue #791 was partially fixed by PR #1161 (the mat mul implicit
truncations were its main point), but it still wouldn't compile due to
the use of ConstantBuffer as an identifier. Apparently those fall into
the same class as "float float", where float is both a type and an
identifier.
This allows struct definitions with such keyword-identifiers,
and adds ConstantBuffer to the set. 'cbuffer int' is legal in HLSL,
and 'struct int' appears to only be rejected due to the redefinition
of the 'int' type.
Fixes#791
HLSL truncates the vector, or one of the two matrix dimensions if there is a
dimensional mismatch in m*v, v*m, or m*m.
This PR adds that ability. Conversion constructors are added as required.
If a shader includes a mixture of several stages, such as HS and GS,
the non-stage output geometry should be ignored, lest it conflict
with the stage output.
Both debug and release clang builds have segfaulted on recent
changes, non deterministically, while doing the single/multi-thread
test all test files. Removing recent test files, to see if it gives
a clue.
Fixes#1092. Allows arrays of opaques to keep arrayness, unless
needed by uniform array flattening.
Can handle assignments of mixed amounts of flattening.
A single texture can statically appear in code mixed with a shadow sampler
and a non-shadow sampler. This would be create invalid SPIR-V, unless
one of them is provably dead.
The previous detection of this happened before DCE, so some shaders would
trigger the error even though they wouldn't after DCE. To handle that
case, this PR splits the texture into two: one with each mode. It sets
"needsLegalization" (if that happens for any texture) to warn that this shader
will need post-compilation legalization.
If the texture is only used with one of the two modes, behavior is as it
was before.
Add support for Subpass Input proposal of issue #1069.
Subpass input types are given as:
layout(input_attachment_index = 1) SubpassInput<float4> subpass_f;
layout(input_attachment_index = 2) SubpassInput<int4> subpass_i;
layout(input_attachment_index = 3) SubpassInput<uint4> subpass_u;
layout(input_attachment_index = 1) SubpassInputMS<float4> subpass_ms_f;
layout(input_attachment_index = 2) SubpassInputMS<int4> subpass_ms_i;
layout(input_attachment_index = 3) SubpassInputMS<uint4> subpass_ms_u;
The input attachment may also be specified using attribute syntax:
[[vk::input_attachment_index(7)]] SubpassInput subpass_2;
The template type may be a shorter-than-vec4 vector, but currently user
structs are not supported. (An unimplemented error will be issued).
The load operations are methods on objects of the above type:
float4 result = subpass_f.SubpassLoad();
int4 result = subpass_i.SubpassLoad();
uint4 result = subpass_u.SubpassLoad();
float4 result = subpass_ms_f.SubpassLoad(samp);
int4 result = subpass_ms_i.SubpassLoad(samp);
uint4 result = subpass_ms_u.SubpassLoad(samp);
Additionally, the AST printer could not print EOpSubpass* nodes. Now it can.
Fixes#1069
- support C++11 style brackets [[...]]
- support namespaces [[vk::...]]
- support these on parameter declarations in functions
- support location, binding/set, input attachments
Also added known-good mechanism to fetch latest validated spirv-tools.
Also added -Od and -Os to disable optimizer and optimize for size.
Fetching spirv-tools is optional for both glsl and hlsl. Legalization
of hlsl is done by default if spirv-opt is present at cmake time.
Optimization for glsl is currently done through the option -Os.
Legalization testing is currently only done on four existing shaders.
A separate baseLegalResults directory holds those results. All previous
testing is done with the optimizer disabled.
InputPatch parameters to patch constant functions were not using the
internal (temporary) variable type. That could cause validation errors
if the input patch had a mixture of builtins and user qualified members.
This uses the entry point's internal form.
There is currently a limitation: if an InputPatch is used in a PCF,
it must also have appeared in the main entry point's parameter list.
That is not a limitation of HLSL. Currently that situation is detected
and an "implemented" error results. The limitation can be addressed,
but isn't yet in the current form of the PR.
Hull shaders have an implicitly arrayed output. This is handled by creating an arrayed form of the
provided output type, and writing to the element of it indexed by InvocationID.
The implicit indirection into that array was causing some troubles when copying to a split
structure. handleAssign was able to handle simple symbol lvalues, but not an lvalue composed
of an indirection into an array.
There were several locations in TGlslangToSpvTraverser::handleUserFunctionCall testing for
whether a fn argument should be in the lvalue or rvalue array. They must get the same
result for indexing sanity, but had slightly different logic.
They're now forced into the same test.
Changes:
(1) Allow clip/cull builtins as both input and output in the same shader stage. Previously,
not enough data was tracked to handle this.
(2) Handle the extra array dimension in GS inputs. The synthesized external variable can
now be created with the extra array dimension if needed, and the form conversion code is
able to handle it as well.
For example, both of these GS inputs would result in the same synthesized external type:
triangle in float4 clip[3] : SV_ClipDistance
triangle in float2 clip[3][2] : SV_ClipDistance
In the second case, the inner array dimension packs with the 2-vector of floats into an array[4],
which there is an array[3] of due to the triangle geometry.
HLSL allows a range of types for clip and cull distances. There are
three dimensions, including arrayness, vectorness, and semantic ID.
SPIR-V requires clip and cull distance be a single array of floats in
all cases.
This code provides input side conversion between the SPIR-V form and
the HLSL form. (Output conversion was added in PR #947 and #997).
This PR extends HlslParseContext::assignClipCullDistance to cope with
the input side conversion. Not as much changed as appears: there was
also a lot of renaming to reflect the fact that the code now handles
either direction.
Currently, non-{frag,vert} stages are not handled, and are explicitly
rejected.
Fixes#1026.
Some languages allow a restricted set of user structure types returned from texture sampling
operations. Restrictions include the total vector size of all components may not exceed 4,
and the basic types of all members must be identical.
This adds underpinnings for that ability. Because storing a whole TType or even a simple
TTypeList in the TSampler would be expensive, the structure definition is held in a
table outside the TType. The TSampler contains a small bitfield index, currently 4 bits
to support up to 15 separate texture template structure types, but that can be adjusted
up or down. Vector returns are handled as before.
There are abstraction methods accepting and returning a TType (such as may have been parsed
from a grammar). The new methods will accept a texture template type and set the
sampler to the structure if possible, checking a range of error conditions such as whether
the total structure vector components exceed 4, or whether their basic types differe, or
whether the struct contains non-vector-or-scalar members. Another query returns the
appropriate TType for the sampler.
High level summary of design:
In the TSampler, this holds an index into the texture structure return type table:
unsigned int structReturnIndex : structReturnIndexBits;
These are the methods to set or get the return type from the TSampler. They work for vector or structure returns, and potentially could be expanded to handle other things (small arrays?) if ever needed.
bool setTextureReturnType(TSampler& sampler, const TType& retType, const TSourceLoc& loc);
void getTextureReturnType(const TSampler& sampler, const TType& retType, const TSourceLoc& loc) const;
The ``convertReturn`` lambda in ``HlslParseContext::decomposeSampleMethods`` is greatly expanded to know how to copy a vec4 sample return to whatever the structure type should be. This is a little awkward since it involves introducing a comma expression to return the proper aggregate value after a set of memberwise copies.
This adds support for #pragma pack_matrix() to the HLSL front end.
The pragma sets the default matrix layout for subsequent unqualified matrices
in structs or buffers. Explicit qualification overrides the pragma value. Matrix
layout is not permitted at the structure level in HLSL, so only leaves which are
matrix types can be so qualified.
Note that due to the semantic (not layout) difference in first matrix indirections
between HLSL and SPIR-V, the sense of row and column major are flipped. That's
independent of this PR: just a factor to note. A column_major qualifier appears
as a RowMajor member decoration in SPIR-V modules, and vice versa.
The HLSL FE tracks four versions of a declared type to avoid losing information, since it
is not (at type-decl time) known how the type will be used downstream. If such a type
was used in a cbuffer declaration, the cbuffer type's members should have been using
the uniform form of the original user structure type, but were not.
This would manifest as matrix qualifiers (and other things, such as pack offsets) on user struct
members going missing in the SPIR-V module if the struct type was a member of a cbuffer, like so:
struct MyBuffer
{
row_major float4x4 mat1;
column_major float4x4 mat2;
};
cbuffer Example
{
MyBuffer g_MyBuffer;
};
Fixes: #789
HLSL allows several variables to be declared. There are packing rules involved:
e.g, a float3 and a float1 can be packed into a single array[4], while for a
float3 and another float3, the second one will skip the third array entry to
avoid straddling
This is implements that ability. Because there can be multiple variables involved,
and the final output array will often be a different type altogether (to fuse
the values into a single destination), a new variable is synthesized, unlike the prior
clip/cull support which used the declared variable. The new variable name is
taken from one of the declared ones, so the old tests are unchanged.
Several new tests are added to test various packing scenarios.
Only two semantic IDs are supported: 0, and 1, per HLSL rules. This is
encapsulated in
static const int maxClipCullRegs = 2;
and the algorithm (probably :) ) generalizes to larger values, although there
are a few issues around how HLSL would pack (e.g, would 4 scalars be packed into
a single HLSL float4 out reg? Probably, and this algorithm assumes so).
Semantic test left over from other source languages is removed, since this is permitted by HLSL.
Also, to support the functionality, a targeted test is performed for this case and it is
turned into a EvqGlobal qualifier to create an AST initialization segment when needed.
Constness is now propagated up aggregate chains during initializer construction. This
handles hierarchical cases such as the distinction between:
static const float2 a[2] = { { 1, 2 }, { 3, 4} };
vs
static const float2 a[2] = { { 1, 2 }, { cbuffer_member, 4} };
The first of which can use a first class constant initalization, and the second cannot.
In HLSL, there are three (TODO: ??) dimensions of clip and cull
distance values:
* The semantic's value N, ala SV_ClipDistanceN.
* The array demension, if the value is an array.
* The vector element, if the value is a vector or array of vectors.
In SPIR-V, clip and cull distance are arrays of scalar floats, always.
This PR currently ignores the semantic N axis, and handles the other
two axes by sequentially copying each vector element of each array member
into sequential floats in the output array.
Fixes: #946
This fixes:
1. A compilation error when assigning scalars to matricies
2. A semantic error in matrix construction from scalars. This was
initializing the diagonal, where HLSL semantics require the scalar be
replicated to every matrix element.
3. Functions accepting mats can be called with scalars, which will
be shape-converted to the matrix type. This was previously failing
to match the function signature.
NOTE: this does not yet handle complex scalars (a function call,
say) used to construct matricies. That'll be added when the
node replicator service is available. For now, there's an assert.
There's one new test (hlsl.scalar2matrix.frag). An existing test
lsl.type.half.frag changes, because of (2) above, and a negative
test error message changes due to (3) above.
Fixes#923.
For "s.m = t", a sampler member assigned a sampler, make t an alias
for s.m, and when s.m is flattened, it will flatten to the alias t.
Normally, assignments to samplers are disallowed.
This changes no functional code. There was a bit of a testing hole
in that textures templatized on sub-vec4 types were not being exercised
with any intrinsics. This adds some basic sanity coverage of that case.
Adds a transformation step to the post processing step.
Two modes are available:
1) keep
- Keeps samplers, textures and sampled textures as is
2) transform pure texture into sampled texture and remove pure samplers
- removes all pure samplers
- transforms all pure textures into its sampled counter part
Change-Id: If54972e8052961db66c23f4b7e719d363cf6edbd
Name mangling did not account for the vector size in the template type of a texture.
This adds that. The mangle is as it ever was for the vec4 case, which leaves
all GLSL behavior and most HLSL behavior uneffected. For vec1-3 the size is added
to the mangle.
Current limitation: textures cannot presently be templatized on structured types,
so this works only for vectors of basic types.
Fixes#895.
OpSpecConstantOp contains an embedded opcode which is given as a literal
argument to the OpSpecConstantOp. The subsequent arguments are as the
embedded op would expect, which may be a mixture of IDs and literals. This
adds support for that to the remapper binary parser. Upon seeing such an
embedded op, the parser flips over to parsing the argument list as
appropriate for that opcode.
Fixes#882.
Also, provides an option to auto-assign locations.
Existing tests use this option, to avoid the error message,
however, it is not fully implemented yet.
This modifies function parameter passing to pass the counter
buffer associated with a struct buffer to a function as a
hidden parameter. Similarly function declarations will have
hidden parameters added to accept the associated counter buffers.
There is a limitation: if a SB type may or may not have an associated
counter, passing it as a function parameter will assume that it does, and
the counter will appear in the linkage whether or not there is a counter
method used on the object.
This implements mytex.mips[mip][coord] for texture types. There is
some error testing, but not comprehensive. The constructs can be
nested, e.g in this case the inner .mips is parsed before the completion
of the outer [][] operator.
tx.mips[tx.mips[a][b].x][c]
Using GS methods such as Append() in non-GS stages should be ignored, but was
creating errors due to the lack of a stream output symbol for the non-GS stage.
This adds infrastructure suitable for any front end to create SPIR-V loop
control flags. The only current front end doing so is HLSL.
[unroll] turns into spv::LoopControlUnrollMask
[loop] turns into spv::LoopControlDontUnrollMask
no specification means spv::LoopControlMaskNone
This reverts commit cfc69d95af.
* Change CMAKE_INSTALL_PREFIX default on Windows in order
to prevent permission denied errors when trying to install
to "Program Files".
* Use `GNUInstallDirs` in order to respect GNU conventions.
This is especially important for multi-arch/multi-lib setups.
* Specify position independent mode building properly, without
using the historic hack of adding `-fPIC` as a definition.
This makes the build system more portable.
* Only detect C++ (and not C) to slightly speed up configuring.
* Specify C++11 mode using modern CMake idioms.
* Fix some whitespace issues.
Byte address buffers were failing to detect that they were byte address
buffers when used as fn parameters.
Note: this detection is a little awkward, and could be simplified if
it was easy to obtain the declared builtin type for an object.
Some texture and SB operations can take non-integer indexes, which should be
cast to integers before use if they are not already. This adds makeIntegerIndex()
for the purpose. Int types are left alone.
(This was done before for operator[], but needs to apply to some other things
too, hence its extraction into common function now)
This is WIP, heavy on the IP part. There's not yet enough to use in real workloads.
Currently present:
* Creation of separate counter buffers for structured buffer types needing them.
* IncrementCounter / DecrementCounter methods
* Postprocess to remove unused counter buffers from linkage
* Associated counter buffers are given @count suffix (invalid as a user identifier)
Not yet present:
* reflection queries to obtain bindings for counter buffers
* Append/Consume buffers
* Ability to use SB references passed as fn parameters
HLSL requires vec2 tessellation coordinate declarations in some cases
(e.g, isoline topology), where SPIR-V requires the TessCoord qualified
builtin to be a vec3 in all cases. This alters the IO form of the
variable to be a vec3, which will be copied to the shader's declared
type if needed. This is not a validation; the shader type must be correct.
Previously, patch constant functions only accepted OutputPatch. This
adds InputPatch support, via a pseudo-builtin variable type, so that
the patch can be tracked clear through from the qualifier.
In the hull shader, the PCF output does not participate in an argument list,
so has no defined ordering. It is always put at the end of the linkage. That
means the DS input reading PCF data must be be at the end of the DS linkage
as well, no matter where it may appear in the argument list. This change
makes sure that happens.
The detection is by looking for arguments that contain tessellation factor
builtins, even as a struct member. The whole struct is taken as the PCF output
if any members are so qualified.
The SPIR-V generator had assumed tessellation modes such as
primitive type and vertex order would only appear in tess eval
(domain) shaders. SPIR-V allows either, and HLSL allows and
possibly requires them to be in the hull shader.
This change:
1. Passes them through for either tessellation stage, and,
2. Does not set up defaults in the domain stage for HLSl compilation,
to avoid conflicting definitions.
Unknown how extensive the semantics need to be yet. Need real
feedback from workloads. This is just done as part of unifying it
with the class/struct namespaces and grammar productions.
This PR emulates per control point inputs to patch constant functions.
Without either an extension to look across SIMD lanes or a dedicated
stage, the emulation must use separate invocations of the wrapped
entry point to obtain the per control point values. This is provided
since shaders are wanting this functionality now, but such an extension
is not yet available.
Entry point arguments qualified as an invocation ID are replaced by the
current control point number when calling the wrapped entry point. There
is no particular optimization for the case of the entry point not having
such an input but the PCF still accepting ctrl pt frequency data. It'll
work, but anyway makes no so much sense.
The wrapped entry point must return the per control point data by value.
At this time it is not supported as an output parameter.
This PR adds the ability to pass structuredbuffer types by reference
as function parameters.
It also changes the representation of structuredbuffers from anonymous
blocks with named members, to named blocks with pseudonymous members.
That should not be an externally visible change.
New command line option --shift-ssbo-binding mirrors --shift-ubo-binding, etc.
New reflection query getLocalSize(int dim) queries local size, e.g, CS threads.
This is a partial implemention of structurebuffers supporting:
* structured buffer types of:
* StructuredBuffer
* RWStructuredBuffer
* ByteAddressBuffer
* RWByteAddressBuffer
* Atomic operations on RWByteAddressBuffer
* Load/Load[234], Store/Store[234], GetDimensions methods (where allowed by type)
* globallycoherent flag
But NOT yet supporting:
* AppendStructuredBuffer / ConsumeStructuredBuffer types
* IncrementCounter/DecrementCounter methods
Please note: the stride returned by GetDimensions is as calculated by glslang for std430,
and may not match other environments in all cases.
This obsoletes WIP PR #704, which was built on the pre entry point wrapping master. New version
here uses entry point wrapping.
This is a limited implementation of tessellation shaders. In particular, the following are not functional,
and will be added as separate stages to reduce the size of each PR.
* patchconstantfunctions accepting per-control-point input values, such as
const OutputPatch <hs_out_t, 3> cpv are not implemented.
* patchconstantfunctions whose signature requires an aggregate input type such as
a structure containing builtin variables. Code to synthesize such calls is not
yet present.
These restrictions will be relaxed as soon as possible. Simple cases can compile now: see for example
Test/hulsl.hull.1.tesc - e.g, writing to inner and outer tessellation factors.
PCF invocation is synthesized as an entry point epilogue protected behind a barrier and a test on
invocation ID == 0. If there is an existing invocation ID variable it will be used, otherwise one is
added to the linkage. The PCF and the shader EP interfaces are unioned and builtins appearing in
the PCF but not the EP are also added to the linkage and synthesized as shader inputs.
Parameter matching to (eventually arbitrary) PCF signatures is by builtin variable type. Any user
variables in the PCF signature will result in an error. Overloaded PCF functions will also result in
an error.
[domain()], [partitioning()], [outputtopology()], [outputcontrolpoints()], and [patchconstantfunction()]
attributes to the shader entry point are in place, with the exception of the Pow2 partitioning mode.
Structs are split to remove builtin members to create valid SPIR-V. In this
process, an outer structure array dimension may be propegated onto the
now-removed builtin variables. For example, a mystruct[3].position ->
position[3]. The copy between the split and unsplit forms would handle
this in some cases, but not if the array dimension was at different levels
of aggregate.
It now does this, but may not handle arbitrary composite types. Unclear if
that has any semantic meaning for builtins though.
Previously, a type graph would turn into a type tree. That is,
a deep node that is shared would have multiple copies made.
This is important when creating IO and non-IO versions of deep types.
This needs some render testing, but is destined to be part of master.
This also leads to a variety of other simplifications.
- IO are global symbols, so only need one list of linkage nodes (deferred)
- no longer need parse-context-wide 'inEntryPoint' state, entry-point is localized
- several parts of splitting/flattening are now localized
When copying split types with mixtures of user variables and buitins,
where the builtins are extracted, there is a parallel structures traversal.
The traversal was not obtaining the derefenced types in the array case.
- Add support for invocation functions with "InclusiveScan" and
"ExclusiveScan" modes.
- Add support for invocation functions taking int64/uint64/doube/float16
as inout data types.
This partially addressess issue #670, for when the matrix swizzle
degenerates to a component or column: m[c], m[c][r] (where HLSL
swaps rows and columns for user's view).
An error message is given for the arbitrary cases not covered.
These cases will work for arbitrary use of l-values.
Future work will handle more arbitrary swizzles, which might
not work as arbitrary l-values.
Any previous use would only be for "", which would probably mean changing
include(...) -> includeLocal(...)
See comments about includeLocal() being an additional search over
includeSystem(), not a superset search.
This also removed ForbidIncluder, as
- the message in ForbidIncluder was redundant: error results were
already returned to the caller, which then gives the error it
wants to
- there is a trivial default implementation that a subclass can
override any subset of (I still like abstract base classes though)
- trying to get less implementation out of the interface file anyway
Reads and write syntax to UAV objects is turned into EOpImageLoad/Store
operations. This translation did not support destination swizzles,
for example, "mybuffer[tc].zyx = 3;", so such statements would fail to
compile. Now they work.
Parial updates are explicitly prohibited.
New test: hlsl.rw.swizzle.frag
This PR adds support for default function parameters in the following cases:
1. Simple constants, such as void fn(int x, float myparam = 3)
2. Expressions that can be const folded, such a ... myparam = sin(some_const)
3. Initializer lists that can be const folded, such as ... float2 myparam = {1,2}
New tests are added: hlsl.params.default.frag and hlsl.params.default.err.frag
(for testing error situations, such as ambiguity or non-const-foldable).
In order to avoid sampler method ambiguity, the hlsl better() lambda now
considers sampler matches. Previously, all sampler types looked identical
since only the basic type of EbtSampler was considered.
HLSL allows type keywords to also be identifiers, so a sequence such as "float half = 3" is
valid, or more bizzarely, something like "float.float = int.uint + bool;"
There are places this is not supported. E.g, it's permitted for struct members, but not struct
names or functions. Also, vector or matrix types such as "float3" are not permitted as
identifiers.
This PR adds that support, as well as support for the "half" type. In production shaders,
this was seen with variables named "half". The PR attempts to support this without breaking
useful grammar errors such as "; expected" at the end of unterminated statements, so it errs
on that side at the possible expense of failing to accept valid constructs containing a type
keyword identifier. If others are discovered, they can be added.
Also, half is now accepted as a valid type, alongside the min*float types.
This commit adds support for copying nested hierarchical types of split
types. E.g, a struct of a struct containing both user and builtin interstage
IO variables.
When copying split types, if any subtree does NOT contain builtin interstage
IO, we can copy the whole subtree with one assignment, which saves a bunch
of AST verbosity for memberwise copies of that subtree.
This adds structure splitting, which among other things will enable GS support where input structs
are passed, and thus become input arrays of structs in the GS inputs. That is a common GS case.
The salient points of this PR are:
* Structure splitting has been changed from "always between stages" to "only into the VS and out of
the PS". It had previously happened between stages because it's not legal to pass a struct
containing a builtin IO variable.
* Structs passed between stages are now split into a struct containing ONLY user types, and a
collection of loose builtin IO variables, if any. The user-part is passed as a normal struct
between stages, which is valid SPIR-V now that the builtin IO is removed.
* Internal to the shader, a sanitized struct (with IO qualifiers removed) is used, so that e.g,
functions can work unmodified.
* If a builtin IO such as Position occurs in an arrayed struct, for example as an input to a GS,
the array reference is moved to the split-off loose variable, which is given the array dimension
itself.
When passing things around inside the shader, such as over a function call, the the original type
is used in a sanitized form that removes the builtIn qualifications and makes them temporaries.
This means internal function calls do not have to change. However, the type when returned from
the shader will be member-wise copied from the internal sanitized one to the external type.
The sanitized type is used in variable declarations.
When copying split types and unsplit, if a sub-struct contains only user variables, it is copied
as a single entity to avoid more AST verbosity.
Above strategy arrived at with talks with @johnkslang.
This is a big complex change. I'm inclined to leave it as a WIP until it can get some exposure to
real world cases.
Implement token pasting as per the C++ specification, within the current
style of the PP code.
Non-identifiers (turning 12 ## 10 into the numeral 1210) is not yet covered;
they should be a simple incremental change built on this one.
Addresses issue #255.