Commit Graph

697 Commits

Author SHA1 Message Date
Arseny Kapoulkine
0265a9d4de
Implement constant folding for many transcendentals (#3166)
* Implement constant folding for many transcendentals

This change adds support for folding of sin/cos/tan/asin/acos/atan,
exp/log/exp2/log2, sqrt, atan2 and pow.

The mechanism allows to use any C function to implement folding in the
future; for now I limited the actual additions to the most commonly used
intrinsics in the shaders.

Unary folder had to be tweaked to work with extended instructions - for
extended instructions, constants.size() == 2 and constants[0] ==
nullptr. This adjustment is similar to the one binary folder already
performs.

Fixes #1390.

* Fix Android build

On old versions of Android NDK, we don't get std::exp2/std::log2
because of partial C++11 support.

We do get ::exp2, but not ::log2 so we need to emulate that.
2020-02-03 09:20:47 -05:00
Steven Perron
97f1d485b7 Dead branch elim fix (#3160)
We must treat a branch to the merge node of a switch that is in the
header of a construct as a nested construced.  The original merge
instruction is still needed in that case.
2020-01-28 10:17:43 -05:00
greg-lunarg
e7afeb060e Use dummy switch instead of dummy loop in MergeReturn pass. (#3151)
Fixes #3127
2020-01-24 12:20:14 -05:00
Jaebaek Seo
dd37d73c5e Handle conflict between debug info and existing validation rule (#3104)
* Allow OpExtInst for DebugInfo between secion 9 and 10

Fixes #3086

* Handle spirv-opt errors on DebugInfo Ext

* Add IR Loader test

* Fix ir loader bug

* Handle DebugFunction/DebugTypeMember forward reference

* Add test cases (forward reference to function)

* Support old DebugInfo extension

* Validate local debug info out of function
2020-01-23 17:04:30 -05:00
Jaebaek Seo
f8d7df760c
Fix OpLine bug of merge-blocks pass (#3130)
As explained in #3118, spirv-opt merge-blocks pass causes a
spirv-val error when an OpBranch has an OpLine in front of it.

OpLoopMerge
OpBranch ; Will be killed by merge-blocks pass
OpLabel  ; Will be killed by merge-blocks pass
OpLine   ; will be placed between OpLoopMerge and OpBranch - error!
OpBranch

To fix this issue, this commit moves line info of OpBranch to
OpLoopMerge.

Fixes #3118
2020-01-14 14:35:21 -05:00
David Neto
8aa423930d
Avoid pessimizing std::move (#3124)
Should fix a warning
2019-12-27 12:05:58 -05:00
greg-lunarg
9215c1b7df Fix convert-relax-to-half invalid code (#3099) (#3106) 2019-12-20 21:08:12 -05:00
David Neto
e70b009b0f
Add support for SPV_KHR_non_semantic_info (#3110)
Add support for SPV_KHR_non_semantic_info

This entails a couple of changes:

- Allowing unknown OpExtInstImport that begin with the prefix `NonSemantic.`
- Allowing OpExtInst that reference any of those sets to contain unknown
  ext inst instruction numbers, and assume the format is always a series of IDs
  as guaranteed by the extension.
- Allowing those OpExtInst to appear in the types/variables/constants section.
- Not stripping OpString in the --strip-debug pass, since it may be referenced
  by these non-semantic OpExtInsts.
- Stripping them instead in the --strip-reflect pass.

* Add adjacency validation of non-semantic OpExtInst

- We validate and test that OpExtInst cannot appear before or between
  OpPhi instructions, or before/between OpFunctionParameter
  instructions.

* Change non-semantic extinst type to single value

* Add helper function spvExtInstIsNonSemantic() which will check if the extinst
  set is non-semantic or not, either the unknown generic value or any future
  recognised non-semantic set.

* Add test of a complex non-semantic extinst

* Use DefUseManager in StripDebugInfoPass to strip some OpStrings

* Any OpString used by a non-semantic instruction cannot be stripped, all others
  can so we search for uses to see if each string can be removed.
* We only do this if the non-semantic debug info extension is enabled, otherwise
  all strings can be trivially removed.

* Silence -Winconsistent-missing-override in protobufs
2019-12-18 18:10:29 -05:00
greg-lunarg
fccbc00aca Make Instrumentation format version 2 the default (Step 1) (#3096)
* Make Instrumentation format version 2 the default (Step 1)

Add new interfaces without version number argument. Remove version 1
logic and tests. Version interfaces will be removed in step 2 after
layers have transitioned to new interface.

* Add error messages to InstrumentPass().
2019-12-16 14:18:47 -05:00
Steven Perron
00ca4e5bdf
Don't crash when folding construct of empty struct (#3092)
* Don't crash when folding construct of empty struct

An OpCompositeConstruct of an empty struct will be folded to a constant
under normal circumstances.  However, if the id limit has been reached
and the constant cannot be generated, then other folding rules will be
tried.

These rules do not handle the case of an empty struct.  We add allow it
to be handled.

Fixes http://crbug/1030194

* Changes based on the review.
2019-12-10 14:58:30 -05:00
Sarah
0a5d99d02c Permit the debug instructions in WebGPU SPIR-V - remove from the optimizer (#3083)
continuing #3063
fixing #3052
2019-12-03 11:21:26 -05:00
David Neto
af7410597e graphics robust access: use signed clamp (#3073)
Access chain indices are always interpreted as signed integers.
So use signed clamp instead of unsigned clamp.  We must also
clamp to the max signed int for the index type.

Fixes #3072
2019-12-03 11:18:56 -05:00
Steven Perron
3ed4586044
Folding: perform add and sub on mismatched integer types (#3084)
Fixes #3040
2019-12-02 17:51:20 -05:00
alan-baker
b334829a91 Validate nested constructs (#3068)
* Validate that if a construct contains a header and it's merge is
reachable, the construct also contains the merge
* updated block merging to not merge into the continue
* update inlining to mark the original block of a single block loop as
the continue
* updated some tests
* remove dead code
* rename kBlockTypeHeader to kBlockTypeSelection for clarity
2019-11-27 16:45:57 -05:00
Steven Perron
54385458ca
Handle unreachable block when computing register pressure (#3070)
Fixes #3053
2019-11-27 09:45:17 -05:00
David Neto
a62012cede
Add test with explicit example of stripping reflection info (#3064)
* Add test with explicit example of stripping reflection info

* Improve the comment on StripReflectEnd2EndExample
2019-11-26 16:20:45 -05:00
Steven Perron
0391d0823e
Handle OpPhi with no in operands in value numbering (#3056)
Fixes #3043
2019-11-19 09:45:39 -05:00
alan-baker
ab3cdcaef5 Fix operand access of composite in upgrade memory model (#3021)
Fixes #2992

* Accessing aggregate subtype used the wrong operand
* Added a test
2019-11-12 13:41:38 -05:00
alan-baker
1a18d491f2 Validate array stride does not cause overlap (#3028)
Fixes #3027

* Disallow array stride 0
* Check array stride against element size
* Fix up tests
* Add new tests
2019-11-12 13:36:53 -05:00
Ehsan
12e54dae16 Update Offset to ConstOffset bitmask if operand is constant. (#3024)
Update Offset to ConstOffset bitmask if operand is constant.

Fixes #3005
2019-11-11 22:35:14 -05:00
David Neto
618ee50942
Fix some clang-tidy issues in graphics_robust_access_pass (#2998)
One remains: the fact that the image-texel-pointer modification
is mostly dead code. But that's intentional for now.
2019-10-30 14:00:34 -04:00
greg-lunarg
5ea7099374 Add two new simplifications. (#2984)
Implements the following simplifications:

(a - b) + b => a
(a * b) + (a * c) => a * (b + c)

Also adds logic to simplification to handle rules that create new operations
that might need simplification, such as the second rule above.

Only perform the second simplification if the multiplies have the add as their
only use. Otherwise this is a deoptimization of size and performance.
2019-10-28 08:19:38 -07:00
greg-lunarg
02910ffdff Instrument: Add missing def-use analysis. (#2985) 2019-10-22 07:24:54 -07:00
Steven Perron
6a9be627c7
Keep NOPs when comparing with original binary (#2931)
We have a check that ensures that the optimizer did not change the
binary when it says that it did not.  However, when the binary is
converted back to a binary, we made a decision to remove OpNop
instructions.  This means that any spv file that contains a NOP
originally will fail this check.

To get around this, we convert the module to a second binary that keeps
the OpNop instructions.  That binary is compared against the original.

Fixes https://crbug.com/1010191
2019-10-18 09:53:29 -04:00
Jakub Kuderski
e3da3143b2
Disallow use of OpCompositeExtract/OpCompositeInsert with no indices (#2980) 2019-10-17 13:53:34 -04:00
Jakub Kuderski
e99b918221
Support constant-folding UConvert and SConvert (#2960) 2019-10-16 16:29:55 -04:00
alan-baker
2276e59788 Validate that selections are structured (#2962)
* Validate that selections are structured

WIP

* new checks that switch and conditional branch are proceeded by a
selection merge where necessary

* Don't consider unreachable blocks

* Add some tests

* Changed how labels are marked as seen

* Moved check to more appropriate place
* Labels are now marked as seen when there are encountered in a
terminator instead of when the block is checked
* more tests

* more tests

* Method comment

* new test for a bad case
2019-10-11 17:01:30 -04:00
Steven Perron
32f76efa6c
Link cfg and dominator analysis in the context (#2946)
Fixes #2889
2019-10-08 10:16:18 -04:00
Jeremy Hayes
3c7ff8d4f0 Enable OpTypeCooperativeMatrix specialization (#2927) 2019-10-07 09:52:48 -04:00
Steven Perron
c18c9ff6bc
Handle OpKill better (#2933)
We want to handle OpKill better.  The wrap opkill causes lots of extra
code to be generated, even when they are not needed to avoid the main
problem: OpKill cannot be found directly in a continue construct.

This change will be more selective on which functions the OpKill will be
wrapped and inlining will avoid inlining.

Fixes #2912
2019-10-04 13:05:32 -04:00
greg-lunarg
ad3d23f478 Generate null pointer by converting uint64 zero to pointer. (#2935)
Fixes #2929.
2019-10-04 12:26:38 -04:00
alan-baker
9d7428b052
Validate physical storage buffer restrictions (#2930)
* Physical storage buffer cannot be used with OpConstantNull,
OpPtrEqual, OpPtrNotEqual or OpPtrDiff
  * new tests
  * see also #2929
2019-10-02 21:12:57 -04:00
Steven Perron
9eb1c9a4c4
Add continue construct analysis to struct cfg analysis (#2922)
* Add continue construct analysis to struct cfg analysis

Add the ability to identify which blocks are in the continue construct for a
loop, and to get functions that are called from those blocks, directly or
indirectly.

Part of https://github.com/KhronosGroup/SPIRV-Tools/issues/2912.
2019-10-01 10:27:09 -04:00
Steven Perron
85c67b5e08
Record trailing line dbg instructions (#2926)
There is nothing in the spir-v spec that says the last
instructions in a module cannot be OpLine or OpNoLine.
However, the code that parses the module will simply drop
these instructions.

We add code that will preserve these instructions.

Strip-debug-info is updated to remove these instructions.

Fixes https://crbug.com/1000689.
2019-09-27 16:03:45 -04:00
Ryan Harrison
4075b921f9
Add removing references to debug instructions when removing them (#2923)
Fixes #2921
2019-09-27 13:23:06 -05:00
Steven Perron
2a11f365bc
Handle id overflow in wrap-opkill (#2916)
New code in wrap-opkill does not handle id overflow correctly.  We fix that up.

Fixes https://crbug.com/1007144
2019-09-25 17:42:58 -04:00
Steven Perron
55ea57a785
Handle extract with no indexes (#2910)
* Handle extract with no indexes

It is possible that OpCompositeExtract instructions will not have any
indexes.  This is not handled well by scalar replacement and instruction
folding.

Fixes https://crbug.com/1006435

* Fix typo.
2019-09-24 16:19:31 -04:00
Steven Perron
6f26d9ad81
Handle id overflow in convert local access chains (#2908)
Fixes https://crbug.com/1004453
2019-09-24 14:04:54 -04:00
Steven Perron
6b07212659
Use OpReturn* in wrap-opkill (#2886)
* Use OpReturn* in wrap-opkill

The warp-opkill pass is generating incorrect code.  It is placing an
OpUnreachable at the end of a basic block, when the block can be
reached.  We can't reach the end of the block, but we can reach the end.
Instead we will add a return instruction.

Fixes #2875.
2019-09-20 10:32:27 -04:00
Steven Perron
61edde52a0 Revert "Use OpReturn* in wrap-opkill"
This reverts commit 87f0fa432f.
2019-09-19 22:39:56 -04:00
Steven Perron
87f0fa432f Use OpReturn* in wrap-opkill
The warp-opkill pass is generating incorrect code.  It is placing an
OpUnreachable at the end of a basic block, when the block can be
reached.  We can't reach the end of the block, but we can reach the end.
Instead we will add a return instruction.

Fixes #2875.
2019-09-19 22:34:57 -04:00
Steven Perron
248c80b049
Handle OpConstantNull in copy-prop-arrays. (#2870)
Many of the places in copy propagate arrays assumes that integer constant will be defined by an OpConstant instruction.  That is not always true.  We fix these spots by allowing for an OpConstantNull.
2019-09-19 10:24:00 -04:00
alan-baker
5a48c0da15 SPIRV-Tools support for SPIR-V 1.5 (#2865)
* Ensure same enum values have consistent extension lists

* val: fix checking of capabilities

The operand for an OpCapability should only be
checked for the extension or core version.
The InstructionPass registers a capability, and all its implied
sub-capabilities before actually checking the operand to an
OpCapability.

* Add basic support for SPIR-V 1.5

- Adds SPV_ENV_UNIVERSAL_1_5
- Command line tools default to spv1.5 environment
- SPIR-V 1.5 incorporates several extensions.  Now the disassembler
  prefers outputing the non-EXT or non-KHR names.  This requires
  updates to many tests, to make strings match again.
- Command line tests: Expect SPIR-V 1.5 by default

* Test validation of SPIR-V 1.5 incorporated extensions

Starting with 1.5, incorporated features no longer require
the associated OpExtension instruction.
2019-09-13 14:59:02 -04:00
Steven Perron
c7a39bc40f
Don't inline function containing OpKill (#2842)
If an OpKill instruction is inlined into a continue construct, then the
spir-v is no longer valid.  To avoid this issue, we do inline into an
OpKill at all.  This method was chosen because it is difficult to keep
track of whether or not you are in a continue construct while changing
the function that is being inlined into.  This will work well with wrap
OpKill because every will still be inlined except for the OpKill
instruction itself.

Fixes #2554
Fixes #2433

This reverts commit aa9e8f5380.
2019-09-11 13:26:55 -04:00
Steven Perron
4f9256db35
Handle id overflow in wrap op kill. (#2851)
Fixes https://crbug.com/997729
2019-09-11 13:26:42 -04:00
David Neto
9f188e3374 Assembler: Can't set an ID in instruction without result ID (#2852)
Fix tests that violated this rule.

Fixes #2257
2019-09-11 13:15:25 -04:00
Steven Perron
35c9518c4e
Handle id overflow in the ssa rewriter. (#2845)
* Handle id overflow in the ssa rewriter.

Remove LocalSSAElim pass at the same time.  It does the same thing as the SSARewrite pass. Then even share almost all of the same code.

Fixes crbug.com/997246
2019-09-10 09:38:23 -04:00
Steven Perron
7f7236f1eb
Handle id overflow in the constant manager. (#2844)
Fixes crbug.com/997246
2019-09-09 15:12:26 -04:00
Steven Perron
76261e2a7d
Replace CubeFaceCoord and CubeFaceIndexAMD (#2840)
Part of #2814.
2019-09-06 17:11:37 -04:00
Steven Perron
b218ad1994
Fold Min, Max, and Clamp instructions. (#2836)
Fixes #2830.
2019-09-05 13:30:03 -04:00
Steven Perron
a41520eaa4
Replace uses of SPV_AMD_shader_trinary_minmax extension (#2835)
Part of #2814
2019-09-05 09:29:04 -04:00
Ryan Harrison
19b256616d
For WebGPU<->Vulkan optimization, set correct execution environment (#2834)
Fixes #2833
2019-09-04 13:08:58 -04:00
greg-lunarg
d11725b1d4 Add --relax-float-ops and --convert-relaxed-to-half (#2808)
The first pass applies the RelaxedPrecision decoration to all executable
instructions with float32 based type results. The second pass converts
all executable instructions with RelaxedPrecision result to the equivalent
float16 type, inserting converts where necessary.
2019-09-03 13:22:13 -04:00
Steven Perron
b54d950298
Fold Fmix should accept vector operands. (#2826)
Fixes #2819
2019-09-03 09:17:18 -04:00
Steven Perron
d67130caca
Replace SwizzleInvocationsAMD extended instruction. (#2823)
Part of #2814
2019-08-30 14:07:24 -04:00
Steven Perron
ad71c057c7
Replace SwizzleInvocationsMaskedAMD extended instruction. (#2822)
Part of #2814
2019-08-30 10:48:42 -04:00
Steven Perron
35d98be3bc
Amd ext to khr (#2811)
Add the first steps to removing the AMD extension VK_AMD_shader_ballot.
Splitting up to make the PRs smaller.

Adding utilities to add capabilities and change the version of the
module.

Replaces the instructions:

OpGroupIAddNonUniformAMD = 5000
OpGroupFAddNonUniformAMD = 5001
OpGroupFMinNonUniformAMD = 5002
OpGroupUMinNonUniformAMD = 5003
OpGroupSMinNonUniformAMD = 5004
OpGroupFMaxNonUniformAMD = 5005
OpGroupUMaxNonUniformAMD = 5006
OpGroupSMaxNonUniformAMD = 5007

and extentend instructions

WriteInvocationAMD = 3
MbcntAMD = 4

Part of #2814
2019-08-29 12:48:17 -04:00
Steven Perron
73422a0a5e
Check feature mgr in context consistency check (#2818)
We add a check that the feature manager is correcter after each pass.

This resulted in a couple failing tests cases.  Those are fixed.

Part of #2814
2019-08-28 11:49:16 -04:00
Steven Perron
15fc19d091
Refactor instruction folders (#2815)
* Refactor instruction folders

We want to refactor the instruction folder to allow different sets of
rules to be added to the instruction folder.  We might want different
sets of rules in different circumstances.

We also need a way to add rules for extended instructions.  Changes are
made to the FoldingRules class and ConstFoldingRules class to enable
that.

We added tests to check that we can fold extended instructions using the
new framework.

At the same time, I noticed that there were two tests that did not tests
what they were suppose to.  They could not be easily salvaged. #2813 was
opened to track adding the new tests.
2019-08-26 18:54:11 -04:00
Steven Perron
b00ef0d26e
Handle Id overflow in private-to-local (#2807)
We need to handle id overflow in the private to local pass.

Fixes https://crbug.com/962295
2019-08-22 09:14:48 -04:00
Steven Perron
aef8f92b2b
Even more id overflow in sroa (#2806)
Now we need to handle id overflow when we overflow while replacing uses of the variable.  While looking at this code, I noticed an error in the way we handle access chains that cannot be replaced because of overflow.  Name it will make some change, and then give up by returning SuccessWithoutChange.  But it was changed.

This is fixed up by returning Failure if we notice the error at the time of rewriting the users.  This is for both id overflow or out-of-bounds accesses.

Code is added to "CheckUses" to remove variables that have out-of-bounds accesses from the candidate list, so we don't even try to rewrite its uses.

Fixes https://crbug.com/995032
2019-08-21 13:12:42 -04:00
Steven Perron
c5d1dab99e
Add name for variables in desc sroa (#2805)
Fixes #2802.
2019-08-21 10:55:02 -04:00
Steven Perron
bc62722b80
Handle overflow in wrap-opkill (#2801)
Fixes https://crbug/994203
2019-08-18 19:00:18 -04:00
Steven Perron
9cd07272a6
More handle overflow in sroa (#2800)
If we run out of ids when creating a new variable, sroa does not recognize
the error, and continues doing work.  This leads to segmentation faults.

Fixes https://crbug/969655
2019-08-16 13:15:17 -04:00
greg-lunarg
06407250a1 Instrument: Add support for Buffer Device Address extension (#2792) 2019-08-16 09:18:34 -04:00
Steven Perron
60043edfa1
Replace OpKill With function call. (#2790)
We are no able to inline OpKill instructions into a continue construct.
See #2433.  However, we have to be able to inline to correctly do
legalization.  This commit creates a pass that will wrap OpKill
instructions into a function of its own.  That way we are able to inline
the rest of the code.

The follow up to this will be to not inline any function that contains
an OpKill.

Fixes #2726
2019-08-14 09:27:12 -04:00
Steven Perron
f701237f2d
Remove useless semi-colons (#2789)
Later versions of clang seem to pick up more useless semi-colons.  I've removed them.
2019-08-12 08:52:39 -04:00
greg-lunarg
95386f9e45 Instrument: Fix version 2 output record write for tess eval shaders. (#2782)
Fix output record write for tess eval shaders.

Also change command line for bindless instrumentation to use use
output record version 2.
2019-08-09 08:22:41 -04:00
Steven Perron
4b64beb1ae
Add descriptor array scalar replacement (#2742)
Creates a pass that will replace a descriptor array with individual variables.  See #2740 for details.

Fixes #2740.
2019-08-08 10:53:19 -04:00
greg-lunarg
29af42df12 Add SPV_EXT_physical_storage_buffer to opt whitelists (#2779)
This also fixes ADCE to not remove possibly needed OpTypeForwardPointer.
The bug, its fix and the corresponding test have a circular dependency
with the extension, so they are packaged together.
2019-08-08 09:45:59 -04:00
Steven Perron
b029d3697e
Handle RelaxedPrecision in SROA (#2788)
If a member of a struct has a relaxed precision, sroa will not split the
struct.  This means we do not get all cases.  This commit handles these
cases.  The other part is that the decoration needs to be passed on to
the new variables.

Fixes #2786
2019-08-07 12:17:26 -04:00
alan-baker
3726b500b1
Treat access chain indexes as signed in SROA (#2776)
Fixes #2768

* In scalar replacement, interpret access chain indexes as signed counts
* Use Constant::GetSignExtendedValue and Constant::GetZeroExtendedValue
where appropriate
* new tests
2019-07-31 15:39:33 -04:00
David Neto
31590104ec
Add pass to inject code for robust-buffer-access semantics (#2771)
spirv-opt: Add --graphics-robust-access

Clamps access chain indices so they are always
in bounds.

Assumes:
- Logical addressing mode
- No runtime-array-descriptor-indexing
- No variable pointers

Adds stub code for clamping coordinate and samples
for OpImageTexelPointer.

Adds SinglePassRunAndFail optimizer test fixture.

Android.mk: add source/opt/graphics_robust_access_pass.cpp

Adds Constant::GetSignExtendedValue, Constant::GetZeroExtendedValue
2019-07-30 19:52:46 -04:00
David Neto
7621034aae
Add opt test fixture method SinglePassRunAndFail (#2770)
Checks for failure status code and matches against
the expected error message.
2019-07-30 10:38:46 -04:00
Diego Novillo
49797609b7
Protect against out-of-bounds references when folding OpCompositeExtract (#2774)
This fixes #2608.

The original test case had an out-of-bounds reference that ended up
folding into OpCompositeExtract that was indexing right outside the
constant composite.

The returned constant would then cause a segfault during constant
propagation.
2019-07-29 13:27:40 -07:00
alan-baker
7fd2365b06
Don't move debug or decorations when folding (#2772)
Fixes #2764

* Don't replace all uses when simplifying instructions, instead only
update non-debug, non-decoration uses
  * added a test
* Add a new version of RAUW that takes a predicate to decide whether to
replace the use or not
  * used in simplification pass
2019-07-29 16:20:43 -04:00
Diego Novillo
9559cdbdf0
Fix #2609 - Handle out-of-bounds scalar replacements. (#2767)
* Fix #2609 - Handle out-of-bounds scalar replacements.

When SROA tries to do a replacement for an OpAccessChain that is exactly
one element out of bounds, the code was trying to access its internal
array of replacements and segfaulting.

This protects the code from doing this, and it additionally fixes the
way SROA works by not returning failure when it refuses to do a
replacement.  Instead of failing the optimization pass, SROA will now
simply refuse to do the replacement and keep going.

Additionally, this patch fixes the SROA logic to now return a proper status so we can
correctly state that the pass made no changes to the IR if it only found
invalid references.
2019-07-26 12:33:40 -04:00
Steven Perron
bb0e2f65bb
Fix check for unreachable blocks in merge-return (#2762)
Merge return expects unreachable merge block to look a certain way, and
unreachable continue blocks to look a certain way.  What if an
unreachable block is both a merge and a continue?  The continue is
suppose to take precedent, but merge-return implements it with the merge
taking precedent.  This change flips that around.

Fixes #2746
2019-07-25 09:34:18 -04:00
Steven Perron
c7fcb8c3b9
Process OpDecorateId in ADCE (#2761)
* Process OpDecorateId in ADCE

When there is an OpDecorateId instruction that is live,
the ids that is references must be kept live.  This change
adds them to the worklist.

I've also updated a validator check to allow OpDecorateId
to be able to apply to decoration groups.

Fixes #1759.

* Remove dead code.
2019-07-24 14:43:49 -04:00
Steven Perron
fb83b6fbb5
Record correct dominators in merge return (#2760)
In merge return, we need to know the original dominator for a block in order to
traverse code from the original dominator to the new dominator and add
appropriate Phi nodes.  The current code gets this wrong because the dominator
tree is build as needed.  The first time we get the immediate dominator for a
function we just built the dominator tree and it takes into account that a
block has been split.  The second time it does not.

This inconsistency needs to be fixed.  We do that by recording the original
dominator for all blocks at the start of the pass.

If we were to record just the basic block, that could change if the block is
split.  We want to traverse the code in the body of the original dominator,
whatever block it ends up in.  To make this easy to track, we not save the
terminator instruction to represent the original dominator.

Fixes #2745
2019-07-24 13:56:54 -04:00
Steven Perron
c9190a54da
SSA rewriter: Don't use trivial phis (#2757)
When a phi candidate is marked as trivial, we are suppose to update all
of its uses to the reference the value that it is being folded to.
However, the code updates the uses misses `defs_at_block_`.  So at a
later time, the id for the trivial phi can reemerge.

Fixes #2744
2019-07-23 17:59:30 -04:00
greg-lunarg
3855447d93 Bindless Instrument: Make init check depend solely on input_init_enabled (#2753)
* Bindless Instrument: Make init check depend solely on input_init_enabled

Previously was dependent on presense of descriptor_indexing extension
in SPIR-V, but this missed some cases. Tests updated to refect this new
policy.

* Fix format.
2019-07-22 13:51:39 -04:00
Steven Perron
aa9e8f5380
Revert "Do not inline OpKill Instructions (#2713)" (#2749)
This reverts commit fe7cc9c612.
2019-07-17 14:59:05 -04:00
Steven Perron
230c9e4371
Fix bug in merge return (#2734)
* Fix bug in merge return

The merge return pass seems to assume that the only new edges in the cfg
are from return block to merge blocks.  However, it is possible that a
merge block branches to a merge block when it did not before.

This change add a new variable to track all of the new edges.  It also
renames some other variables and cleans us the code to make it a bit
easier to read.

Fixes #2702.
2019-07-16 09:11:22 -04:00
Jason Macnak
1fedf72e50 Allow ray tracing shaders in inst bindle check pass. (#2733)
Adds the ray tracing stages (ray gen, intersection, any hit, closest hit,
miss, and callable) to the allowed stages in pass instrumentation and add
debug records for these stages to output the global launch id.

More information for ray tracing shaders:
- https://github.com/KhronosGroup/GLSL/blob/master/extensions/nv/GLSL_NV_ray_tracing.txt
2019-07-15 16:24:42 -04:00
greg-lunarg
92c41ff1e7 Remove Common Uniform Elimination Pass (#2731)
Remove Common Uniform Elimination Pass

Fixes #2520.
2019-07-12 11:02:10 -04:00
Steven Perron
5ce8cf781f
Change the order branches are simplified in dead branch elim (#2728)
Dead branch elimination needs to know about the constructs that a block is contained it when determining what to do with its merge instruction.  We currently fold branches in block as we see them, which is parent constructs before their children.  This causes the struct cfg analysis to crash because it tries to get the parent construct for a block after the parent has been folded.

This can be fixed by folding the branch of the children before the parents.

Fixes #2667.
2019-07-10 14:59:44 -04:00
Thomas Roughton
cd153db8ed Add —preserve-bindings and —preserve-spec-constants (#2693)
Add optimizer options to for preservation of spec constants and variable with
binding decorations.  They are to be preserved even if they are unused.
2019-07-10 14:12:19 -04:00
Steven Perron
86e45efe15
Handle decorations better in some optimizations (#2716)
There are a couple spots where we are not looking at decorations when we should.

1. Value numbering is suppose to assign a different value number to ids if they have different decorations.  However that is not being done for OpCopyObject and OpPhi.

1. Instruction simplification is propagating OpCopyObject instruction without checking for decorations.  It should only do that if no decorations are being lost.

Add a new function to the decoration manager to check if the decorations of one id are a subset of the decorations of another.

Fixes #2715.
2019-07-10 11:37:16 -04:00
Steven Perron
37e8f79946
Perform merge return with single return in loop. (#2714)
Inlining does not inline functions that have a single return that is in a loop.  This is because the return cannot be replaced by a branch outside of the loop easily.  Merge return knows how to rewrite the function so the return is replaced by a branch.

Fixes #2038.
2019-07-04 14:14:49 -04:00
Steven Perron
fe7cc9c612
Do not inline OpKill Instructions (#2713)
It is illegal to inline an OpKill instruction into a continue construct because the continue header will no longer dominate the backedge.

This commit adds a check for this, and does not inline.

If we still want to be able to inline a function that contains an OpKill, we can add a new pass that will wrap OpKill instructions into its own function with just the single instruction.

I do not believe that this is a common case right now, so I will not do that yet.

Fixes #2433.
2019-07-04 12:08:23 -04:00
Jason Macnak
e6e3e2ccc6 Update type for loaded builtin GlobalInvocationID in pass instrumentation (#2705)
When working on descriptor indexing validation for compute shaders, the
gl_GlobalInvocationID builtin was being loaded as uint which would cause
compute shaders instrumented by the bindless check pass to have:

%83 = OpLoad %uint %gl_GlobalInvocationID
%84 = OpCompositeExtract %uint %83 0
%85 = OpCompositeExtract %uint %83 1
%86 = OpCompositeExtract %uint %83 2

which results in validation failures:

error: line 127: Reached non-composite type while indexes still remain
to be traversed.
%84 = OpCompositeExtract %uint %83 0

for trying to extract a uint from a uint.
2019-06-28 09:46:16 -04:00
alan-baker
2090d7a2d2
Handle volatile memory semantics in upgrade (#2674)
* If an atomic is decorated with volatile add the volatile bit to its
memory semantics
2019-06-17 16:01:37 -04:00
alan-baker
59983a6010 Validate variable initializer type (#2668)
Fixes #249

* The pointed to type of Result Type must match the initializer type
* Had to update some opt tests to be valid
2019-06-15 00:34:18 -04:00
greg-lunarg
43fb2403a6 Instrument: Fix code for version 2 output format. (#2655)
Correct record size. Also bring version 2 tests up to version 1
equivalence.
2019-06-06 11:35:34 -04:00
David Neto
d01a3c3b4b
Optimizer: Handle array type with OpSpecConstantOp length (#2652)
When it's an OpConstant or OpSpecConstant, then the literal
values are compared.  If the OpSpecConstant also has a SpecId
decoration, then that's also compared.

Otherwise, it's an OpSpecConstantOp and we only compare the
ID of the OpSpecConstantOp instruction itself.

Fixes #2649
2019-06-05 16:35:50 -04:00
Pierre Moreau
e7866de4b1 Linker: Better type comparison for OpTypeArray and OpTypeForwardPointer (#2580)
* Types: Avoid comparing IDs for in Type::IsSameImpl

When linking, we end up with duplicate types for imported and exported
types, that needs to be removed. The current code would reject valid
import/export pairs of symbols due to IDs mismatch, even if the types or
constants behind those ID were the same.

Enabled remaining type_match_test

Fixes #2442
2019-05-29 16:12:02 -04:00
Ryan Harrison
0125b28ed4
Add compact ids to WebGPU <-> Vulkan transformations (#2639)
Fixes #2634
2019-05-29 12:58:37 -07:00
greg-lunarg
3d62cb8148 Instrument: Add version 2 of record formats (#2630)
New version has additional word in stage-specific section. Also
some changes in content for tesselation and compute shaders. Either
version can be invoked at pass creation. This is done to ease integration
and updating of validation layers. Version 1 is deprecated and eventually
will go away.

Also sneaking in fix to version 1 compute shaders.
2019-05-29 15:08:21 -04:00
Steven Perron
6c7db9c630
Handle nested breaks from switches. (#2624)
* Handle nested breaks from switches.

There was a recent decision made to allow branches to the merge node of
a switch even if the switch is not the first enclosing construct.  They
can be generated by glslang from break statements in switches.

Dead branch elimination seems to be the only optimization that will
break because of this change, so I will update that optimizations.

The change made are:

- Track switches in structured cfg analysis.
- In Dead branch elimination:
  - Look for nested breaks that will require a switch instruction.
  - Rewrite, but don't delete, switchs that are required even if it
    could be replaced by an unconditional branch.
  - When looking for the first break, consider the merge of a switch
    as well.

See #2612.

* Fix variable names and comments.

* Add tests for the struct cfg analysis and switches.

* Fix typos in comments.
2019-05-27 16:28:14 -04:00
Steven Perron
d9c00e1d2d Add folding rules for OpQuantizeToF16 (#2614)
Adding the folding rules for OpQuantizeToF16, and fixed some matching
tests to check identify new lines.
2019-05-21 23:15:01 -07:00
Steven Perron
0982f0212e
Using the instruction folder to fold OpSpecConstantOp (#2598)
In order to try to reduce code duplication and to be able
to fold more cases, we want to use the instruction folder
when folding an OpSpecConstantOp with constant operands.

A couple other changes are need to make this work.  First
GetDefiningInstruction| in the constant manager is able
to handle |type_id| being logically equivalent to another
type, so we updated the interface, and removed the assert.

Some tests were also updated because we not generate
better code because constants are not duplicated as much
as before.

No need for new tests.  The functionality of the instruction folder is
already tested.  There are tests check that the instruction folder is
being used correctly for OpCompositeExtract and OpVectorShuffle in the
existing test cases.

Fixes #2585.
2019-05-21 12:45:00 -04:00
greg-lunarg
9dfd4b8358 Bindless Validation: Instrument descriptor-based loads and stores (#2583)
Essentially, support UBOs and SSBOs, scalar and array (sized and unsized).
2019-05-15 19:43:23 -04:00
alan-baker
fc7b5d8c6a Mem model spv 1.4 (#2565)
* Update memory model support for SPIR-V 1.4

Fixes #2552

* Upgrade memory model now supports two memory access operands for
OpCopyMemory*
  * in all cases the pass will first generate two operands by either
  adding them or copying
  * updates accounts for multiple operands
  * tests
2019-05-15 19:06:37 -04:00
Steven Perron
84503583c6
Handle id overflow in sroa better. (#2582)
There is a case where sroa is not handling id overflow gracefully.  It
is handled and an error message is output when the ids overflow.

Fixes https://crbug.com/961030.
2019-05-15 09:29:28 -04:00
alan-baker
2947e88f79 Update instrumentation passes to handle 1.4 interfaces (#2573)
Fixes #2556

Added variables get added to entry point interfaces
Add to input buffer too
2019-05-10 11:08:28 -04:00
alan-baker
87c4ef8a9c
Do not fold floating point if float controls used (#2569)
Fixes #2558

* Mark floating point instructions as non-foldable if any
SPV_KHR_float_controls capabilities are present
  * tests
2019-05-10 11:03:22 -04:00
Ryan Harrison
f6d9a17843
Add pass to fix some invalid unreachable blocks for WebGPU (#2563)
Attempts to split up unreachable blocks that are used both as a
merge-block and a continue-target.

Fixes #2429
2019-05-09 12:56:10 -04:00
alan-baker
cc3e93c4e6
Add tests for folding 1.4 selects (#2568)
Fixes #2554

* Folding rules already handle 1.4 selects so I simply added some tests
2019-05-08 14:06:04 -04:00
alan-baker
ea5e1b62e1
Update priv-to-local for SPIR-V 1.4 (#2567)
Fixes #2555

* Fix a bug in validation where interfaces were considered non-unique
between different entry points targeting the same function
  * added a test
* Update private to local pass to remove localized private variables
from entry point interfaces
  * added tests
2019-05-08 12:38:49 -04:00
alan-baker
b74d92a8c3
ADCE support for SPIR-V 1.4 entry points (#2561)
Fixes #2551

* Add support for 1.4 entry point interface lists
  * only input and output variables are automatically live
  * can clean up interfaces after DCE
  * added tests
* allow opt tests to specify a target environment
2019-05-07 14:52:22 -04:00
David Neto
63f57d95d6
Support SPIR-V 1.4 (#2550)
* SPIR-V 1.4 headers, add SPV_ENV_UNIVERSAL_1_4

* Support --target-env spv1.4 in help for command line tools

* Support asm/dis of UniformId decoration

* Validate UniformId decoration

* Fix version check on instructions and operands

Also register decorations used with OpDecorateId

* Extension lists can differ between enums that match

Example: SubgroupMaskEq vs SubgroupMaskEqKHR

* Validate scope value for Uniform decoration, for SPIR-V 1.4

* More unioning of exts

* Preserve grammar order within an enum value

* 1.4: Validate OpSelect over composites

* Tools default to 1.4

* Add asm/dis test for OpCopyLogical

* 1.4: asm/dis tests for PtrEqual, PtrNotEqual, PtrDiff

* Basic asm/Dis test for OpCopyMemory

* Test asm/dis OpCopyMemory with 2-memory access

Add asm/dis tests for OpCopyMemorySized

Requires grammar update to add second optional memory access operand
to OpCopyMemory and OpCopyMemorySized

* Validate one or two memory accesses on OpCopyMemory*

* Check av/vis on CopyMemory source and target memory access

This is a proposed rule. See
https://gitlab.khronos.org/spirv/SPIR-V/issues/413

* Validate operation for OpSpecConstantOp

* Validate NonWritable decoration

Also permit NonWritable on members of UBO and SSBO.

* SPIR-V 1.4: NonWrtiable can decorate Function and Private vars

* Update optimizer CLI tests for SPIR-V 1.4

* Testing tools: Give expected SPIR-V version in message

* SPIR-V 1.4 validation for entry point interfaces

* Allow only unique interfaces
* Allow all global variables
* Check that all statically used global variables are listed
* new tests

* Add validation fixture CompileFailure

* Add 1.4 validation for pointer comparisons

* New tests

* Validate with image operands SignExtend, ZeroExtend

Since we don't actually know the image texel format, we can't fully
validate.  We need more context.

But we can make sure we allow the new image operands in known-good
cases.

* Validate OpCopyLogical

* Recursively checks subtypes
* new tests

* Add SPIR-V 1.4 tests for NoSignedWrap, NoUnsignedWrap

* Allow scalar conditions in 1.4 with OpSelect

* Allows scalar conditions with vector operands
* new tests

* Validate uniform id scope as an execution scope

* Validate the values of memory and execution scopes are valid scope
values
* new test

* Remove SPIR-V 1.4 Vulkan 1.0 environment

* SPIR-V 1.4 requires Vulkan 1.1

* FIX: include string for spvLog

* FIX: validate nonwritable

* FIX: test case suite for member decorate string

* FIX: test case for hlsl functionality1

* Validation test fixture: ease debugging

* Use binary version for SPIR-V 1.4 specific features

* Switch checks based on the SPIR-V version from the target environment
to instead use the version from the binary
* Moved header parsing into the ValidationState_t constructor (where
version based features are set)
* Added new versions of tests that assemble a 1.3 binary and validate a
1.4 environment

* Fix test for update to SPIR-V 1.4 headers

* Fix formatting

* Ext inst lookup: Add Vulkan 1.1 env with SPIR-V 1.4

* Update spirv-val help

* Operand version checks should use module version

Use the module version instead of the target environment version.

* Fix comment about two-access form of OpCopyMemory
2019-05-07 12:27:18 -04:00
Steven Perron
6d04da22c6
Fix up type mismatches. (#2545)
Add functionality to fix-storage-class so that it can fix up mismatched
data types for pointers as well.

Fixes bugs in when fixing up storage class.

Move GenerateCopy to the Pass class to be reused.

The spirv-opt change for #2535.
2019-05-02 09:31:46 -04:00
Steven Perron
32af42616a
Change implementation of post order CFG traversal (#2543)
* Change implementation of post order CFG traversal

It seems like the recursion is going very deep, and causing some problem
is particular situations.  I've reimplemented the CFG post order
traversal to not use recursion.

Fixes #2539.
2019-04-29 17:09:20 -04:00
Ryan Harrison
b68af7ca8e
Add support for Private & Output to initializer decompose flag (#2537)
Fixes #2388
2019-04-25 16:24:32 -04:00
Ryan Harrison
048dcd38ce
Implement WebGPU->Vulkan initializer conversion for 'Function' variables (#2513)
WebGPU requires certain variables to be initialized, whereas there are
known issues with using initializers in Vulkan. This PR is the first
of three implementing a pass to decompose initialized variables into
a variable declaration followed by a store. This has been broken up
into multiple PRs, because there 3 distinct cases that need to be
handled, which require separate implementations.

This first PR implements the basic infrastructure that is needed, and
handling of Function storage class variables. Private and Output will
be handled in future PRs.

This is part of resolving #2388
2019-04-16 14:31:36 -04:00
Ryan Harrison
102e430a88
Add pass to legalize OpVectorShuffle for WebGPU (#2509)
In WebGPU, the component operand 0xFFFFFFFF is forbidden, but in
Vulkan it is used to indicate a value is undefined. When converting to
WebGPU, 0xFFFFFFFF needs to converted to a legal value, though the
specific one does not matter, since it was used to indicate an
undefined entry in the original code. Choosing to use 0, since the
operands are required to be on [0, N-1], so 0 is guaranteed to always
be valid.

Fixes #2349
2019-04-12 12:14:23 -04:00
Steven Perron
9047de51cb
Accept OpBitCast in fix storage class. (#2505)
Fixes http://crbug.com/950889.
2019-04-09 14:10:35 -04:00
Ryan Harrison
0cb2d4079e
Add WebGPU->Vulkan and Vulkan->WebGPU flags in spirv-opt (#2496)
Renames the existing flag '--webgpu-mode' to '--vulkan-to-webgpu' for
the Vulkan->WebGPU operation, and adds a new flag '--webgpu-to-vulkan'
for the WebGPU->Vulkan operation.

Currently '--webgpu-to-vulkan' doesn't have any passes associated with
it yet, but further patches will implement them.

Fixes #2495
2019-04-05 15:12:26 -04:00
Steven Perron
3a0bc9e724
Add fix storage class code. (#2434)
This pass tries to fix validation error due to a mismatch of storage classes
in instructions.  There is no guarantee that all such error will be fixed,
and it is possible that in fixing these errors, it could lead to other
errors.

Fixes #2430.
2019-04-05 13:12:08 -04:00
alan-baker
236bdc0065 Change prioritization of unreachable merge and continue (#2460)
Fixes #2452

Swaps priority of handling unreachable merge and continues so that the
back-edge is retained in the case a block is both a loop continue and
loop merge
2019-04-03 12:50:08 -04:00
Steven Perron
12e4a7b649
Handle variable pointer in some optimizations (#2490)
* Check var pointer capability in ADCE.

* Check var ptr capability for common uniform.

* Check var ptr capability in access chain convert.

Since we want this pass to run even if there are variable pointer on
storage buffers, we had to remove asserts that assumed there were no
variable pointers.  The functions with the asserts will now work, it
becomes the responsibility of the callers to deal with the output as
appropriate.

* Single block elimination and variable pointers.

It seems like the code in local single block elimination is able to
handle cases with variable pointers already.  This is because the
function `HasOnlySupportedRefs` ensures that variables that feed a
variable pointer are not candidates.

* Single store elimination and variable pointers.

It seems like the code in local single stroe elimination is able to
handle cases with variable pointers already.  This is because the
function `FindSingleStoreAndCheckUses` ensures that variables that feed
a variable pointer are not candidates.

* SSA rewriter and variable pointers.

It seems like the code in the two passes that call the SSA rewriter are
able to  handle cases with variable pointers already.  This is because the
function `HasOnlySupportedRefs` ensures that variables that feed
a variable pointer are not candidates.

Fixes #2458.
2019-04-03 12:47:51 -04:00
Ryan Harrison
01964e325f
Add pass to generate needed initializers for WebGPU (#2481)
Fixes #2387
2019-04-03 11:44:09 -04:00
alan-baker
4bd106b089
Handle dead infinite loops in DCE (#2471)
Fixes #2456

* When eliminating a structured construct that has an unreachable merge,
replace that unreachable terminator with an appropriate return
* New tests
2019-04-03 10:30:12 -04:00
alan-baker
c9874e5090
Fix merge return in the face of breaks (#2466)
Fixes #2453

* Enable addition of OpPhi instructions when the loop has multiple
predecessors of the merge due to a break
 * This can result in some values no longer dominating their uses
* Track return blocks in structured flow to produce OpPhis that have
multiple undef and non-undef arguments
* New tests to catch the bug
* When a block is predicated, mark the new body as a return if the old
block as already a return
2019-04-02 10:05:28 -04:00
alan-baker
0300a464a4 Maintain inst to block mapping in merge return (#2469)
Fixes #2455

Properly maintains instruction to block mapping for newly created phi instructions in merge return
2019-04-01 13:14:10 -04:00
alan-baker
320a7de5c9
Validate that OpUnreacahble is not statically reachable (#2473)
* Adds a validator check that ensures no block reachable from the entry
block is terminated by OpUnreachable
* Updated tests
* Added new tests
2019-03-29 10:49:37 -04:00
alan-baker
2ff54e34ed
Handle function decls in Structured CFG analysis (#2474)
Fixes #2451

* Structured cfg analysis now handles functions with no basic blocks
* Added a test
2019-03-26 14:39:16 -04:00
greg-lunarg
e1a76269b6 Bindless Validation: Descriptor Initialization Check (#2419)
If SPV_EXT_descriptor_indexing is enabled, add check that for a
descriptor-based reference, the descriptor is initialized. Initialization
data is stored in the debug input buffer, added to the length information
already there. This feature must be seperately enabled on the pass
creation routine. NOTE: Currently just supports image references; buffer
references are still TODO.
2019-03-19 09:53:43 -04:00
Ryan Harrison
e545522146
Add --strip-atomic-counter-memory (#2413)
Adds an optimization pass to remove usages of AtomicCounterMemory
bit. This bit is ignored in Vulkan environments and outright forbidden
in WebGPU ones.

Fixes #2242
2019-03-14 13:34:33 -04:00
Steven Perron
5186ffedb3
Remove duplicates from list of interface IDs in OpEntryPoint instruction (#2449)
* Remove duplicates from list of interface IDs in OpEntryPoint instruction

Fixes #2002.
2019-03-13 15:46:31 -04:00
Steven Perron
9d29c37ac5
Removing decorations when doing constant propagation. (#2444)
In constant propagation, decoration are transfered from the original
expression to the constant that will replace it.  This can be wrong
because there are no decorations that apply to constants.  We choose to
simply delete the decorations.

Fixes #2441
2019-03-13 10:40:49 -04:00
Steven Perron
d800bbbac9
Handle back edges better in dead branch elim. (#2417)
* Handle back edges better in dead branch elim.

Loop header must have exactly one back edge.  Sometimes the branch
with the back edge can be folded.  However, it should not be folded
if it removes the back edge.

The code to check this simply avoids folding the branch in the
continue block.  That needs to be changed to not fold the back edge,
wherever it is.

At the same time, the branch can be folded if it folds to a branch to
the header, because the back edge will still exist.

Fixes #2391.
2019-02-26 09:06:51 -05:00
Sarah
4c43afcade
It is invalid to apply both Restrict and Aliased to the same <id> (#2408)
to fix #2408 - It is invalid to apply both Restrict and Aliased to the same
2019-02-21 12:03:52 -05:00
Steven Perron
fde69dcd80
Fix OpDot folding of half float vectors. (#2411)
* Fix OpDot folding of half float vectors.

The code that folds OpDot does not handle half floats correctly.  After
trying to multiple the first components, we get a nullptr because we
don't fold half float values.  This nullptr gets passed to the code that
does the addition, and causes an assert.

Fixes #2405.
2019-02-20 20:05:08 -05:00
Steven Perron
8eddde2e70
Don't change type of input and output var in dead member elim (#2412)
The types of input and output variables must match for the pipeline.  We
cannot see the uses in all of the shader, so dead member
elimination cannot safely change the type of input and output variables.
2019-02-20 18:59:41 -05:00
greg-lunarg
2f84b5de9a Bindless: Fix computation of set and binding for runtime bounds check (#2384)
Also fix test to use non-zero set and binding which will make error
more obvious.
2019-02-19 11:43:30 -05:00
Ryan Harrison
6d20f62570
Refactor webgpu-mode pass ran tests to be parameterized (#2395)
Fixes #2394
2019-02-15 11:08:05 -05:00
Steven Perron
78ac954c41
Mark type id of unknown instructions at fully used. (#2399) 2019-02-15 10:49:49 -05:00
greg-lunarg
9540f2d981 Instrumentation: Fix instruction index when multiple functions (#2389) 2019-02-15 09:49:18 -05:00
Steven Perron
1b0047f210
Add pass to remove dead members. (#2379)
Add a pass that looks for members of structs whose values do not affects
the output of the shader. Those members are then removed and just
treated like padding in the struct.
2019-02-14 13:42:35 -05:00
alan-baker
354205b3dc
Don't merge unreachable blocks (#2375)
Fixes #2374

* Block merging no longer merges unreachable blocks into their
successors
 * added a test
2019-02-12 09:24:01 -05:00
Ryan Harrison
12b3d7e9d6 Add strip-debug to webgpu-mode passes (#2368)
Fixes #2366
2019-02-08 14:26:17 -05:00
greg-lunarg
cf21146137 Expand bindless bounds checking to runtime-sized descriptor arrays (#2316) 2019-02-07 14:00:36 -05:00
Ryan Harrison
0f4bf0720a
Add flatten-decorations flag to webgpu-mode flags (#2348)
Fixes #2272
2019-02-05 14:07:53 -05:00
Alastair Donaldson
3b6fee3dae Fixes #2338. Added functionality to remove OpPhi instructions (replacing their uses) when merging blocks (#2339)
* Fixes #2338.  Added check for phi node before merging blocks.

* Added functionality to merge blocks A and B even when B starts with OpPhi instructions, by replacing uses of the OpPhi results with the definitions coming from A.  Added some tests for this.

* Fixed assertion.
2019-01-31 09:36:05 -05:00
Steven Perron
464111eaef
Remove use of deprecated googletest macro (#2286)
* Remove use of deprecated googletest macro

INSTANTIATE_TEST_CASE_P has been deprecated.  We need to use
INSTANTIATE_TEST_SUITE_P instead.

* Remove extra commas from test suites.
2019-01-29 18:56:52 -05:00
Steven Perron
8df947d2d6
Handle instructions not in blocks in code sinking. (#2308)
When looking at the uses of the result of an instruction, code sinking
assumes that all uses are in a basic block.  However, this is not true
if there is a decoration or name for the result of that insturction.
This commit checks for this.

Fixes https://crbug.com/923243.
2019-01-21 12:09:56 -05:00
Steven Perron
d6c067630d Handle extract with no index in VDCE. (#2305)
It is legal, but not generated by any SPIR-V producer: an OpCompositeExtract
with no indexes.  This is essentially just a copy of the object, so we
treat them that way.  We simply propagate the live variables of the
result to the operand.

Fixes https://crbug.com/919181.
2019-01-18 15:43:36 -05:00
Steven Perron
81fb2649bf
Handle access chain with no index in SROA. (#2304)
It is legal, but not generated by any SPIR-V producer: an OpAccessChain
with no indexes.  This is essentially just a copy of the pointer.

I have decided to treat it like an OpCopyObject.  In CheckUses, we
return that it is not okay.

When looking at this I realized that we had code in GetUsedComponents
that cannot be reached.  If there is a use in an OpCopyObject the it
will not call GetUsedComponents.  I removed that dead code.

Fixes https://crbug.com/918311.
2019-01-18 14:19:43 -05:00
Steven Perron
213e15e100
Fix overflow when negating INT_MIN. (#2293)
When doing (-INT_MIN) is considered overflow, so we cannot fold it by
actually performing the negation.

Fixes https://crbug.com/917991
2019-01-17 17:01:55 -05:00
Steven Perron
99c2c21cf4
Fix memory leak in unrolling. (#2301)
During unrolling a new loop is created, but its ownership is not clear
as it gets passed through the code. Changed something to unique_ptr to
make that clearer.

Fixes #2299.

Fixing other memory leaks at the same time.

Fixes #2296
Fixes #2297
2019-01-17 16:02:43 -05:00
Steven Perron
dd4157dcee
Sink (#2284)
Add code sinking pass. It will move OpLoad and OpAccessChain instructions as close as possible to their uses.

Part of #1611.
2019-01-17 15:56:36 -05:00
greg-lunarg
8d2d66f30c Fix vertex instrumentation to use VertexIndex and InstanceIndex (#2294)
...instead of VertexId and InstanceId
2019-01-16 18:02:07 -05:00
Steven Perron
49b5b0abc6
Fix up bit shifts by 32. (#2292)
In C++, a bit shift of the same size as the type is undefined, but it is
defined in spir-v.  When folding those cases, we have to be careful.  We
cannot simply do the shift in C++.

Fixes https://crbug.com/917697.
2019-01-16 15:52:23 -05:00
greg-lunarg
83bfdc976a Instrumentation: Add ArrayStride decoration to debug output buffer array (#2290) 2019-01-16 10:01:40 -05:00
alan-baker
06c9dc07bd
Upgrade modf and frexp (#2266)
Fixes #2138

* Modf and frexp are upgraded to use the struct version of the
instruction and generate an explicit store whose flags can be upgraded
separately
* Fixed major bug where availability and visibility were reversed for
non-copy memory instructions
* Fixed bug where availability and visibility scope operands were reversed for copy memory
* Upgraded all opt tests to use SPV_ENV_UNIVERSAL_1_3
* Upgrade tests moved into unified tests and removed standalone test
2019-01-07 12:36:38 -05:00
Steven Perron
241644a5a3
Have replace load size handle extact with no index. (#2261)
Fixes https://crbug.com/917774
2019-01-03 13:02:10 -05:00
Steven Perron
9f36c8bb72
Handle CompositeInsert with no indices in VDCE (#2258)
* Handle CompositeInsert with no indices in VDCE

In the spec, there it nothing that forces an OpCompositeInsert to have
an index, but VDCE assumes there is at least 1 in a couple places.

This commit updates VDCE to handle these cases.
2019-01-02 14:00:04 -05:00
Steven Perron
bdc2ab9356
In LICM don't place code between merge instruction and branch. (#2252)
Fixes #2210.
2018-12-20 18:33:52 -05:00
kholtnv
e49bd96f2c Added additional changes for the new AccelerationStructureNV type. (#2218)
* Added additional changes for the new AccelerationStructureNV type.

* Added additional changes for the new AccelerationStructureNV type.  Change tabs to space...

* Added additional changes for the new accelerationStructureNV type -- add proper type name.

Fix TypeManager.TypeStrings test:
[----------] 29 tests from TypeManager
[ RUN      ] TypeManager.TypeStrings
[       OK ] TypeManager.TypeStrings (7 ms)
2018-12-19 21:42:39 +00:00
Steven Perron
68b69e16aa
Update the continue target in merge return. (#2249)
When we are predicating the continue target for a loop, it can no longer
be the continue target because it will have a branch that exits the loop
and is not the bach edge.  The continue target will have to be the
target of that branch that is still in the loop.

Fixes #2211.
2018-12-19 21:24:49 +00:00
Steven Perron
ac7feace90
Fix missing OpPhi after merge return. (#2248)
The function `UpdatePhiNodes` was being called inconsistently.  In one
case, the cfg had already been updated to include the new edge, and in
another place the cfg was not updated.  This caused the function to
miss flagging a block as needing new phi nodes.  I picked that the cfg
should not be updated before making the call.  I documented it, and
change the call sites to match.

Fixes #2207.
2018-12-19 18:17:42 +00:00
Steven Perron
9d04f82bef
Ensure SROA gets the correct pointer type. (#2247)
We initially assumed that if the type manager returned the correct id
for the pointee type, that we would get the correct pointer type back,
but that is not true.  See the unit test added with this commit.  We
need to fall back to the linear search any time we are looking for a
pointer to a type that may not be unique.

At the same time, SROA considered an OpName on a variable to be a use of
the entire variable.  That has been fixed.

Fixes #2209.
2018-12-19 17:07:29 +00:00
Steven Perron
9e81c337f9
Place load after OpPhi instructions in block. (#2246)
We currently place the load instructions at the start of the basic block
that dominates all of the loads.  If that basic block contains OpPhi
instructions, then this will generate invalid code.  We just need to
search for a location that comes after all of the OpPhi instructions.

Fixes #2204.
2018-12-19 15:18:22 +00:00
Steven Perron
5ec2d1a8cd
Don't fold specialized branches in loop unswitch (#2245)
* Don't fold specialized branchs in loop unswitch

Folding branches can have a lot of special cases, and can be a little
error prone.  So I only want it in one place.  That will be in dead
branch elimination.  I will change loop unswitching to set the branches
that were being folded to have a constant condition.  Then subsequent
pass of dead branch elimination will be able to remove the code.

At the same time, I added a check that loop unswitching will not
unswitch a branch with a constant condition.  It is not useful to do it
because dead branch elimination will simple fold the branch anyway.
Also it avoid an infinite loop that would other wise be introduced by my
first change.

Fixes #2203.
2018-12-19 04:40:30 +00:00
Steven Perron
1254335d13
Don't unswitch the latch block. (#2205)
Loop unswitching is unswitching the conditional branch that creates the
back-edge. In the version of the loop, where the bachedge is not taken,
there is no back-edge. This is what causes the validator to complain.

The solution I will go with will be to now unswitch a condition with a
back-edge. At this time we do not now if loop unswitching is used. We do
not include it in the optimization sets provided, nor is it used in
glslang's set. When there are opportunities and no breaks from the loop,
the loop with either be a single iteration loop, or an infinite loop.
There is no performance advantage to performing loop unswitching in
either of those cases. If there is a break, maintaining structured
control flow will be tricky. Unless we see a clear advantage to handling
these case, I would go with the safer simpler solution.

Fixes #2201.
2018-12-18 18:15:00 +00:00
Steven Perron
ff07c6df83
SSA-rewriter: make sure phi entries are unique. (#2206)
If there are multiple edges to a basic block, then the ssa rewriter will
create OpPhi instructions with duplicate entries.  This is invalid, and
it is fixed in this commit.

Fixes #2202.
2018-12-18 18:14:27 +00:00
Jeff Bolz
24328a0554 Recognize OpTypeAccelerationStructureNV as a type instruction (#2190) 2018-12-11 19:03:55 -05:00
Steven Perron
e07dabc25f
Invalidate the decoration manager at the start of ADCE. (#2189)
* Invalidate the decoration manager at the start of ADCE.

If the decoration manager is kept live the the contex will try to keep
it up to date.  ADCE deals with group decorations by changing the
operands in |OpGroupDecorate| instructions directly without informing
the decoration manager.  This puts it in an invalid state, which will
cause an error when the context tries to update it.  To Avoid this
problem, we will invalidate the decoration manager upfront.

At the same time, the decoration manager is now considered when checking
the consistency of the decoration manager.
2018-12-10 13:24:33 -05:00
Jeff Bolz
8fc8dfe4a5 Add PCH_FILE for upgrade_memory_model target. MSVC doesn't like building pass_utils.cpp twice in the same folder with different PCH settings. (#2186) 2018-12-10 10:54:19 -05:00
Steven Perron
0bc66a8ba9
Fix invalid OpPhi generated by merge-return. (#2172)
* Fix invalid OpPhi generated by merge-return.

When we create a new phi node for a value say %10, we have to replace
all of the uses of %10 that are no longer dominated by the def of %10
by the result id of the new phi.  However, if the use is in a phi node,
it is possible that the bb contains the use is not dominated by either.
In this case, needs to be handled differently.

* Split loop headers before add a new branch to them.

In merge return, Phi node in loop header that are also merges for loop
do not get updated correctly.  Those cases do not fit in with our
current analysis.  Doing this will simplify the code by reducing the
number of cases that have to be handled.
2018-12-07 14:10:30 -05:00
David Neto
6df6194db8
Validate Uniform decoration (#2181) 2018-12-07 09:32:57 -05:00
Steven Perron
2e4563d94f
Document in the context what happens with id overflow. (#2159)
Added documentation to the ir context to indicates that TakeNextId()
returns 0 when the max id is reached.  TODOs were added to each call
sight so that we know where we have to start to handle this case.

Handle id overflow in |SplitLoopHeader|.

Handle id overflow in |GetOrCreatePreHeaderBlock|.

Handle failure to create preheader in LICM.

Part of https://github.com/KhronosGroup/SPIRV-Tools/issues/1841.
2018-12-06 09:07:00 -05:00
Steven Perron
17cba4695c
Remove undefined behaviour when folding shifts. (#2157)
We currently simulate all shift operations when the two operand are
constants.  The problem is that if the shift amount is larger than
32, the result is undefined.

I'm changing the folder to return 0 if the shift value is too high.
That way, we will have defined behaviour.

https://crbug.com/910937.
2018-12-04 10:04:02 -05:00
alan-baker
e510b1bac5
Update memory model (#1904)
Upgrade to VulkanKHR memory model

* Converts Logical GLSL450 memory model to Logical VulkanKHR
* Adds extension and capability
* Removes deprecated decorations and replaces them with appropriate
flags on downstream instructions
* Support for Workgroup upgrades
* Support for copy memory
* Adding support for image functions
* Adding barrier upgrades and tests
* Use QueueFamilyKHR scope instead of device
2018-11-30 14:15:51 -05:00
Steven Perron
2d2a512691
Don't inline recursive functions. (#2130)
* Move ProcessFunction* function from pass to the context.

There are a few functions that are used to traverse the call tree.
They currently live in the Pass class, but they have nothing to do with
a pass, and may be needed outside of a pass.  They would be better in
the ir context, or in a specific call tree class if we ever have a need
for it.

* Don't inline recursive functions.

Inlining does not check if a function is recursive or not.  This has
been fine as long as the shader was a Vulkan shader, which forbid
recursive functions.  However, not all shaders are vulkan, so either
we limit inlining to Vulkan shaders or we teach it to look for recursive
functions.

I prefer to keep the passes as general as is reasonable.  The change
does not require much new code in inlining and gives a reason to refactor
some other code.

The changes are to add a member function to the Function class that
checks if that function is recursive or not.

Then this is used in inlining to not inlining a function call if it calls
a recursive function.

* Add id to function analysis

There are a few places that build a map from ids to Function whose
result is that id.  I decided to add an analysis to the context for this
to reduce that code, and simplify some of the functions.

* Add missing file.
2018-11-29 14:24:58 -05:00
alan-baker
3d56cddb75
Validate pointer variables (#2111)
Fixes #2104

* Checks the rules for logical addressing and variable pointers
 * Has an out for relaxed logical pointers
* Updated PassFixture to expose validator options
 * enabled relaxed logical pointers for some tests
* New validator tests
2018-11-27 16:47:10 -05:00
Ryan Harrison
d7cd1203a4 Ensure for OpVariable that result type and storage class operand agree (#2052)
From SPIR-V spec, section 3.32.8 on OpVariable:
  Its Storage Class operand must be the same as the Storage Class
  operand of the result type.

Fixes #941
2018-11-16 11:22:11 -05:00
greg-lunarg
c37388f1ad Add passes to propagate and eliminate redundant line instructions (#2027). (#2039)
These are bookend passes designed to help preserve line information
across passes which delete, move and clone instructions. The propagation
pass attaches a debug line instruction to every instruction based on
SPIR-V line propagation rules. It should be performed before optimization.
The redundant line elimination pass eliminates all line instructions
which match the previous line instruction. This pass should be performed
at the end of optimization to reduce physical SPIR-V file size.

Fixes #2027.
2018-11-15 14:06:17 -05:00
Steven Perron
dc9d155d62
Fix folding of volatile store. (#2048)
When looking for the Volatile mask on a store, the instruction folder
accesses an out-of-bounds element.  We fix that up.

Fixes crbug.com/903530.
2018-11-14 13:52:18 -05:00
Steven Perron
ec5574a9c6
Instruction::GetBaseAddress to handle OpPtrAccessChain (#2050)
That function currently only handled OpPtrAccessChain if it was in the
middle of the chain, but not at the start.  Fixing that up.

Fixes crbug.com/905271.
2018-11-14 12:42:25 -05:00
greg-lunarg
1e9fc1aac1 Add base and core bindless validation instrumentation classes (#2014)
* Add base and core bindless validation instrumentation classes

* Fix formatting.

* Few more formatting fixes

* Fix build failure

* More build fixes

* Need to call non-const functions in order.

Specifically, these are functions which call TakeNextId(). These need to
be called in a specific order to guarantee that tests which do exact
compares will work across all platforms. c++ pretty much does not
guarantee order of evaluation of operands, so any such functions need to
be called separately in individual statements to guarantee order.

* More ordering.

* And more ordering.

* And more formatting.

* Attempt to fix NDK build

* Another attempt to address NDK build problem.

* One more attempt at NDK build failure

* Add instrument.hpp to BUILD.gn

* Some name improvement in instrument.hpp

* Change all types in instrument.hpp to int.

* Improve documentation in instrument.hpp

* Format fixes

* Comment clean up in instrument.hpp

* imageInst -> image_inst

* Fix GetLabel() issue.
2018-11-08 13:54:54 -05:00
greg-lunarg
6721478ef1 Don't assume one return means function can be inlined. (#2018) (#2025)
If there is only 1 return and it is in a loop, then the function cannot be inlined.

Fix condition when inlined code needs one-trip loop wrapper.  The dummy loop is needed when there is a return inside a selection construct.  Even if there is only 1 return.
2018-11-08 09:11:20 -05:00
Jeff Bolz
c06a35b902 Rename PCH macro to spvtools_pch to avoid conflicts with other projects. Also add pch to test/opt. (#2034) 2018-11-07 09:15:04 -05:00
Jeff Bolz
60fac96c6b Enable precompiled headers for spirv-tools(-shared) and some unit tests (#2026) 2018-11-06 09:26:23 -05:00
Steven Perron
f2cc71e5cb
Handle OpMemberDecorateStringGOOGLE in ACDE (#2029)
Add missing case to the switch statement for the annotation
instructions.

See https://github.com/KhronosGroup/glslang/issues/1561.
2018-11-02 13:42:45 -04:00
dan sinclair
9e6f5134d1
Reduce number of test targets (#2024)
This CL takes the various opt unit tests and makes a single executable
instead of one per test. This reduces the number of build targets by
~125 when building with ninja.
2018-11-01 10:19:37 -04:00
Steven Perron
6647884a13
Remove MemberDecorateStringGOOGLE during stript-refect. (#2021)
The strip-reflect pass is not removing the reflection decorations that
are decorating members.  With this commit, they will now be removed.

Fixes #2019.
2018-10-30 16:17:35 -04:00
Steven Perron
18fe6d59e5
Fix dead branch elim infinite loop. (#2009)
When looking for a break from a selection construct, we do not realize
that a jump to the continue target of a loop containing the selection
is a break.  This causes and infinit loop, or possibly other failures.

Fixes #2004.
2018-10-24 09:10:30 -04:00
Steven Perron
0ba35798c3
Fix dead branch elim infinite loop. (#1997)
When looking for a break from a selection construct, we do not need to
look inside nested constructs.  However, if a loop header has an
unconditional branch, then we enter the loop.  Entering the loop causes
an infinite loop because we keep going through the loop.

The solution is to look for a merge block, if one exsits, even for block
terminated by an OpBranch.

Fixes #1979.
2018-10-22 13:59:20 -04:00
alan-baker
6e85d1a6fc
Fix restrictions in if conversion (#1998)
Fixes #1991

* Improved identification of potential conditional branches
* Pass changed to only work for shaders
* added a test to catch the bug
2018-10-19 15:16:46 -04:00
Steven Perron
715afb0cea
Add a nullptr check to array copy propagation. (#1987)
We are missing a check for a nullptr that is causing things to fail.

Added an extra test case, and fixed up others.

This is the fix for https://github.com/Microsoft/DirectXShaderCompiler/issues/1598.
2018-10-19 12:53:40 -04:00
greg-lunarg
c4687889b7 Fix ADCE to treat OpUnreachable correctly during liveness analysis (#1984)
ADCE liveness algorithm should treat OpUnreachable at least like other
branch instructions. It was being treated as always live which was
preventing useless structured constructs from being eliminated.
OpUnreachable is generated by dead branch elimination which is now
being required by merge return, so this fix should accompany that
change.
2018-10-19 10:16:35 -04:00
Steven Perron
0e68bb3632
Only run merge-returnon reachable functions. (#1983)
We currently run merge-return on all functions, but
dead-branch-elimination only runs on function reachable from an entry
point or exported function.  Since dead-branch-elimination is needed for
merge-return, they have to match.

Fixes #1976.
2018-10-18 08:48:27 -04:00
greg-lunarg
ab45d69154 Fix ADCE liveness to include all enclosing control structures. (#1975)
Was removing control structures which didn't have data dependency
with enclosed live loop and otherwise did not contain live code.
An example is a counting loop around a live loop.

Fixes #1967.
2018-10-16 08:00:07 -04:00
Nuno Subtil
5bc30788fd Fix gtest.h include in test/opt/pass_utils.h
Fixes builds where googletest is outside the SPIRV-Tools tree.
2018-10-12 10:22:25 -04:00
Steven Perron
82663f34c9
Check for unreachable blocks in merge-return. (#1966)
Merge return assumes that the only unreachable blocks are those needed
to keep the structured cfg valid.  Even those must be essentially empty
blocks.

If this is not the case, we get unpredictable behaviour.  This commit
add a check in merge return, and emits an error if it is not the case.

Added a pass of dead branch elimination before merge return in both the
performance and size passes.  It is a precondition of merge return.

Fixes #1962.
2018-10-10 15:18:15 -04:00
Steven Perron
4e266f775a
Fold divisions by 0. (#1963)
The current implementation in the folder when seeing a division by zero
is to assert.  In the release build, the compiler will attempt to
compute the value, which causes its own problems.

The solution I will go with is to fold the division, and just give it
the value of 0.  The same goes for remainder and mod operations.

Fixes #1961.
2018-10-10 11:17:26 -04:00
alan-baker
fae1e61ab8
Fix bug in construct block calculation (#1964)
Fixes #1960

* Only allows blocks that are dominated by the header
* Fixed a bad loop fusion test
* Added a test derived from the reported bug
2018-10-10 11:14:01 -04:00