Previously, patch constant functions only accepted OutputPatch. This
adds InputPatch support, via a pseudo-builtin variable type, so that
the patch can be tracked clear through from the qualifier.
The prior implementation of GS did not work with the new EP wrapping architecture.
This fixes it: the Append() method now looks up the actual output rather
than the internal sanitized temporary type, and writes to that.
In the hull shader, the PCF output does not participate in an argument list,
so has no defined ordering. It is always put at the end of the linkage. That
means the DS input reading PCF data must be at the end of the DS linkage
as well, no matter where it may appear in the argument list. This change
makes sure that happens.
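As an illustration only, the reordering can be sketched as a stable partition over the linkage. LinkageEntry and movePcfDataLast are hypothetical names made up for this sketch, not glslang types or functions:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for one linkage entry; not a glslang type.
struct LinkageEntry {
    std::string name;
    bool isPatchConstantData;  // true for the DS input that reads PCF output
};

// Move the PCF data to the end of the linkage while keeping the relative
// order of every other entry unchanged.
inline void movePcfDataLast(std::vector<LinkageEntry>& linkage)
{
    std::stable_partition(linkage.begin(), linkage.end(),
                          [](const LinkageEntry& e) { return !e.isPatchConstantData; });
}
```

A stable partition is used so that the remaining inputs keep their argument-list order.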
The detection is by looking for arguments that contain tessellation factor
builtins, even as a struct member. The whole struct is taken as the PCF output
if any members are so qualified.
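A minimal sketch of that detection rule, assuming a simplified type model. BuiltIn, TypeInfo and containsTessFactor are hypothetical names for this example and do not correspond to the real glslang classes:

```cpp
#include <vector>

// Hypothetical, simplified builtin set.
enum class BuiltIn { None, Position, TessLevelOuter, TessLevelInner };

// Hypothetical type description: a struct has a non-empty member list.
struct TypeInfo {
    BuiltIn builtIn;
    std::vector<TypeInfo> members;
};

inline bool isTessFactor(BuiltIn b)
{
    return b == BuiltIn::TessLevelOuter || b == BuiltIn::TessLevelInner;
}

// True if the type itself, or any (nested) struct member, is qualified as a
// tessellation factor builtin. The whole struct then counts as PCF output.
inline bool containsTessFactor(const TypeInfo& type)
{
    if (isTessFactor(type.builtIn))
        return true;
    for (const TypeInfo& member : type.members)
        if (containsTessFactor(member))
            return true;
    return false;
}
```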
The SPIR-V generator had assumed tessellation modes such as
primitive type and vertex order would only appear in tess eval
(domain) shaders. SPIR-V allows either, and HLSL allows and
possibly requires them to be in the hull shader.
This change:
1. Passes them through for either tessellation stage, and,
2. Does not set up defaults in the domain stage for HLSL compilation,
to avoid conflicting definitions.
HLSL HS outputs a per ctrl point value, and the DS reads an array
of that type. (It also has a per patch frequency). The per-ctrl-pt
frequency is arrayed on just one side, as opposed to SPIR-V which
is arrayed on both. To match semantics, the compiler creates an
array behind the scenes and indexes it by invocation ID, assigning
the HS return value to it.
SPIR-V requires that tessellation factor arrays be size 4 (outer) or 2 (inner).
HLSL allows other sizes such as 3, or even scalars. This commit converts
between them by forcing the IO types to be the SPIR-V size, and allowing
copies between the internal and IO types to handle these cases.
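The size conversion can be pictured with the sketch below. toIoFactors is a hypothetical helper, and zero-filling the unused trailing elements is an assumption of this example, not something stated above:

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

// Copy an HLSL-sized list of tessellation factors (e.g. 3 entries, or 1 for
// a scalar) into an IO array of the fixed SPIR-V size N (4 outer, 2 inner).
// Assumption: elements with no HLSL counterpart are left at zero.
template <std::size_t N>
std::array<float, N> toIoFactors(const std::vector<float>& hlslFactors)
{
    std::array<float, N> io{};  // value-initialized to all zeros
    const std::size_t count = std::min<std::size_t>(hlslFactors.size(), N);
    std::copy_n(hlslFactors.begin(), count, io.begin());
    return io;
}
```

For example, toIoFactors<4>({1.0f, 2.0f, 3.0f}) fills the first three outer factors and leaves the fourth at zero.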
This PR emulates per control point inputs to patch constant functions.
Without either an extension to look across SIMD lanes or a dedicated
stage, the emulation must use separate invocations of the wrapped
entry point to obtain the per control point values. This is provided
because shaders need this functionality now, and no such extension
is available yet.
Entry point arguments qualified as an invocation ID are replaced by the
current control point number when calling the wrapped entry point. There
is no particular optimization for the case of the entry point not having
such an input but the PCF still accepting ctrl pt frequency data. That
case works, but it does not make much sense.
The wrapped entry point must return the per control point data by value.
At this time it is not supported as an output parameter.
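Conceptually, the emulation behaves like the loop below. This is a CPU-side sketch with hypothetical names (ControlPoint, wrappedEntryPoint, gatherPatch), not the code the compiler actually generates:

```cpp
#include <vector>

// Hypothetical per control point output of the hull shader.
struct ControlPoint {
    float x;
};

// Stand-in for the wrapped entry point. It receives the value that replaces
// the invocation ID and returns its per control point data by value.
inline ControlPoint wrappedEntryPoint(unsigned invocationId)
{
    return ControlPoint{ invocationId * 10.0f };
}

// Build the patch handed to the PCF by invoking the wrapped entry point once
// per control point, passing the control point number as the invocation ID.
inline std::vector<ControlPoint> gatherPatch(unsigned controlPointCount)
{
    std::vector<ControlPoint> patch;
    patch.reserve(controlPointCount);
    for (unsigned cp = 0; cp < controlPointCount; ++cp)
        patch.push_back(wrappedEntryPoint(cp));
    return patch;
}
```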
It would have been possible for globally scoped user functions to collide
with builtin method names. This adds a prefix to avoid polluting the
namespace.
Ideally this would be an invalid character to use in user identifiers, but
as that requires changing the scanner, for the moment it's an unlikely yet
valid prefix.
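A sketch of the idea, where the prefix string "__BI_" and both helper names are placeholders chosen for this example:

```cpp
#include <string>

// Placeholder prefix for this sketch: unlikely in user code, yet still a
// valid identifier prefix.
const std::string kBuiltInMethodPrefix = "__BI_";

// Name under which a builtin method is stored, so that a globally scoped
// user function with the same spelling cannot collide with it.
inline std::string mangleBuiltInMethod(const std::string& method)
{
    return kBuiltInMethodPrefix + method;
}

// True if the name carries the builtin-method prefix.
inline bool isBuiltInMethodName(const std::string& name)
{
    return name.compare(0, kBuiltInMethodPrefix.size(), kBuiltInMethodPrefix) == 0;
}
```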
Also use this to move deferred member-function-body parsing to a better
place.
This should also be well poised for implementing the 'namespace' keyword.
This is slightly cleaner today for entry-point wrapping, which sometimes made
two subtrees for a function definition instead of just one subtree. It will be
critical though for recognizing a struct with multiple member functions.
The non-LOD form of image size query is prohibited in certain cases:
see the OpImageQuerySize and OpImageQuerySizeLod sections of the SPIR-V
spec for details. Sometimes we were generating the non-LOD form when
we should have been using the LOD form. Sometimes the LOD form is required
even if the underlying HLSL query did not supply a MIP level itself,
in which case level 0 is now queried.
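The selection can be sketched roughly as follows. The rule encoded here is one reading of the SPIR-V spec (the non-LOD query on 1D, 2D, 3D or Cube images requires a multisampled image or one not known to be used with a sampler), so treat it as an assumption and check the spec sections named above. ImageTraits and needsLodQuery are hypothetical names:

```cpp
// Hypothetical, simplified image description.
enum class Dim { Dim1D, Dim2D, Dim3D, Cube, Rect, Buffer };

struct ImageTraits {
    Dim dim;
    bool multisampled;  // MS operand is 1
    bool sampled;       // Sampled operand is 1 (used with a sampler)
};

// True when the LOD form of the size query has to be used. If the HLSL
// source supplied no MIP level, the caller would then query level 0.
inline bool needsLodQuery(const ImageTraits& image)
{
    const bool mipCapableDim = image.dim == Dim::Dim1D || image.dim == Dim::Dim2D ||
                               image.dim == Dim::Dim3D || image.dim == Dim::Cube;
    return mipCapableDim && !image.multisampled && image.sampled;
}
```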
This change propagates the storage qualifier from the buffer object to its contained
array type so that isStructBufferType() realizes it is one. That propagation was
happening before only for global variable declarations, so compilation defects would
result if the use of a function parameter happened before a global declaration.
This fixes that case, whether or not there ever is a global declaration, and
regardless of the relative order.
This changes the hlsl.structbuffer.fn.frag test to exercise the alternate order.
There are no differences to generated SPIR-V for the cases which successfully compiled before.
Use an explicit cast from size_t to int to avoid errors like the following:
glslang\glslang\MachineIndependent\preprocessor\Pp.cpp(1053) : error C2220: warning treated as error - no 'object' file generated
glslang\glslang\MachineIndependent\preprocessor\Pp.cpp(1053) : warning C4267: '=' : conversion from 'size_t' to 'int', possible loss of data
affects Pp.cpp, hlslParseHelper.cpp.
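The shape of the fix, shown on a made-up helper (tokenLength is not a function in either file):

```cpp
#include <cstddef>
#include <string>

// std::string::size() returns size_t. Assigning it to an int implicitly
// triggers C4267 on 64-bit MSVC builds; the explicit cast states the
// narrowing intent. Assumes the length fits in an int.
inline int tokenLength(const std::string& token)
{
    return static_cast<int>(token.size());
}
```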
Initialize local variable to get rid of warnings about potentially
uninitialized variables:
glslang\hlsl\hlslparsehelper.cpp(3667) : error C2220: warning treated as error - no 'object' file generated
glslang\hlsl\hlslparsehelper.cpp(3667) : warning C4701: potentially uninitialized local variable 'builtIn' used
affects hlslParseHelper.cpp
The f16tof32 opcode was indexing a vector with a float 0, rather
than an int 0. It may have made no functional difference due to the
identical bit pattern, but code looking at the type could be
confused.
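The "identical bit pattern" point can be checked with a few lines, assuming a 32-bit IEEE-754 float. bitsOf is a helper written for this example:

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 32-bit float");

// Return the raw bit pattern of a float.
inline std::uint32_t bitsOf(float value)
{
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

bitsOf(0.0f) is 0, the same bits as the integer 0, which is why the wrong constant type may have gone unnoticed at run time.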
This PR adds the ability to pass structuredbuffer types by reference
as function parameters.
It also changes the representation of structuredbuffers from anonymous
blocks with named members, to named blocks with pseudonymous members.
That should not be an externally visible change.
This is a partial implementation of structured buffers supporting:
* structured buffer types of:
  * StructuredBuffer
  * RWStructuredBuffer
  * ByteAddressBuffer
  * RWByteAddressBuffer
* Atomic operations on RWByteAddressBuffer
* Load/Load[234], Store/Store[234], GetDimensions methods (where allowed by type)
* globallycoherent flag
But NOT yet supporting:
* AppendStructuredBuffer / ConsumeStructuredBuffer types
* IncrementCounter/DecrementCounter methods
Please note: the stride returned by GetDimensions is as calculated by glslang for std430,
and may not match other environments in all cases.
This obsoletes WIP PR #704, which was built on the pre entry point wrapping master. New version
here uses entry point wrapping.
This is a limited implementation of tessellation shaders. In particular, the following are not functional,
and will be added as separate stages to reduce the size of each PR.
* patchconstantfunctions accepting per-control-point input values, such as
const OutputPatch <hs_out_t, 3> cpv are not implemented.
* patchconstantfunctions whose signature requires an aggregate input type such as
a structure containing builtin variables. Code to synthesize such calls is not
yet present.
These restrictions will be relaxed as soon as possible. Simple cases can compile now: see for example
Test/hlsl.hull.1.tesc, e.g., writing to inner and outer tessellation factors.
PCF invocation is synthesized as an entry point epilogue protected behind a barrier and a test on
invocation ID == 0. If there is an existing invocation ID variable it will be used, otherwise one is
added to the linkage. The PCF and the shader EP interfaces are unioned and builtins appearing in
the PCF but not the EP are also added to the linkage and synthesized as shader inputs.
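The control flow of the synthesized epilogue can be pictured with this sequential CPU-side sketch. PatchState and runPatch are hypothetical names; the real invocations run in parallel and the barrier is a real barrier, which the end of the first loop merely stands in for here:

```cpp
#include <vector>

// Hypothetical record of what happened while processing one patch.
struct PatchState {
    std::vector<float> controlPoints;
    int pcfCalls = 0;
};

inline void runPatch(PatchState& state, unsigned invocationCount)
{
    state.controlPoints.assign(invocationCount, 0.0f);

    // Entry point body: every invocation writes its own control point.
    for (unsigned id = 0; id < invocationCount; ++id)
        state.controlPoints[id] = static_cast<float>(id);

    // (barrier) All control point outputs are written past this point.

    // Epilogue: only the invocation with ID 0 calls the PCF.
    for (unsigned id = 0; id < invocationCount; ++id)
        if (id == 0)
            ++state.pcfCalls;
}
```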
Parameter matching to (eventually arbitrary) PCF signatures is by builtin variable type. Any user
variables in the PCF signature will result in an error. Overloaded PCF functions will also result in
an error.
[domain()], [partitioning()], [outputtopology()], [outputcontrolpoints()], and [patchconstantfunction()]
attributes to the shader entry point are in place, with the exception of the Pow2 partitioning mode.