Commit Graph

584 Commits

Author SHA1 Message Date
Steven Perron
99c2c21cf4
Fix memory leak in unrolling. (#2301)
During unrolling a new loop is created, but its ownership is not clear
as it gets passed through the code. Changed something to unique_ptr to
make that clearer.

Fixes #2299.

Fixing other memory leaks at the same time.

Fixes #2296
Fixes #2297
2019-01-17 16:02:43 -05:00
Steven Perron
dd4157dcee
Sink (#2284)
Add code sinking pass. It will move OpLoad and OpAccessChain instructions as close as possible to their uses.

Part of #1611.
2019-01-17 15:56:36 -05:00
greg-lunarg
8d2d66f30c Fix vertex instrumentation to use VertexIndex and InstanceIndex (#2294)
...instead of VertexId and InstanceId
2019-01-16 18:02:07 -05:00
Steven Perron
49b5b0abc6
Fix up bit shifts by 32. (#2292)
In C++, a bit shift of the same size as the type is undefined, but it is
defined in spir-v.  When folding those cases, we have to be careful.  We
cannot simply do the shift in C++.

Fixes https://crbug.com/917697.
2019-01-16 15:52:23 -05:00
greg-lunarg
83bfdc976a Instrumentation: Add ArrayStride decoration to debug output buffer array (#2290) 2019-01-16 10:01:40 -05:00
alan-baker
06c9dc07bd
Upgrade modf and frexp (#2266)
Fixes #2138

* Modf and frexp are upgraded to use the struct version of the
instruction and generate an explicit store whose flags can be upgraded
separately
* Fixed major bug where availability and visibility were reversed for
non-copy memory instructions
* Fixed bug where availability and visibility scope operands were reversed for copy memory
* Upgraded all opt tests to use SPV_ENV_UNIVERSAL_1_3
* Upgrade tests moved into unified tests and removed standalone test
2019-01-07 12:36:38 -05:00
Steven Perron
241644a5a3
Have replace load size handle extact with no index. (#2261)
Fixes https://crbug.com/917774
2019-01-03 13:02:10 -05:00
Steven Perron
9f36c8bb72
Handle CompositeInsert with no indices in VDCE (#2258)
* Handle CompositeInsert with no indices in VDCE

In the spec, there it nothing that forces an OpCompositeInsert to have
an index, but VDCE assumes there is at least 1 in a couple places.

This commit updates VDCE to handle these cases.
2019-01-02 14:00:04 -05:00
Steven Perron
bdc2ab9356
In LICM don't place code between merge instruction and branch. (#2252)
Fixes #2210.
2018-12-20 18:33:52 -05:00
Steven Perron
c2013e248b
Make the constant and type manager analyses. (#2250)
Currently it is impossible to invalidate the constnat and type manager.
However, the compact ids pass changes the ids for the types and
constants, which makes them invalid.  This change will make them
analyses that have to been explicitly marked as preserved by passes.
This will allow compact ids to invalidate them.

Fixes #2220.
2018-12-20 18:00:05 +00:00
kholtnv
e49bd96f2c Added additional changes for the new AccelerationStructureNV type. (#2218)
* Added additional changes for the new AccelerationStructureNV type.

* Added additional changes for the new AccelerationStructureNV type.  Change tabs to space...

* Added additional changes for the new accelerationStructureNV type -- add proper type name.

Fix TypeManager.TypeStrings test:
[----------] 29 tests from TypeManager
[ RUN      ] TypeManager.TypeStrings
[       OK ] TypeManager.TypeStrings (7 ms)
2018-12-19 21:42:39 +00:00
Steven Perron
68b69e16aa
Update the continue target in merge return. (#2249)
When we are predicating the continue target for a loop, it can no longer
be the continue target because it will have a branch that exits the loop
and is not the bach edge.  The continue target will have to be the
target of that branch that is still in the loop.

Fixes #2211.
2018-12-19 21:24:49 +00:00
Steven Perron
ac7feace90
Fix missing OpPhi after merge return. (#2248)
The function `UpdatePhiNodes` was being called inconsistently.  In one
case, the cfg had already been updated to include the new edge, and in
another place the cfg was not updated.  This caused the function to
miss flagging a block as needing new phi nodes.  I picked that the cfg
should not be updated before making the call.  I documented it, and
change the call sites to match.

Fixes #2207.
2018-12-19 18:17:42 +00:00
Steven Perron
9d04f82bef
Ensure SROA gets the correct pointer type. (#2247)
We initially assumed that if the type manager returned the correct id
for the pointee type, that we would get the correct pointer type back,
but that is not true.  See the unit test added with this commit.  We
need to fall back to the linear search any time we are looking for a
pointer to a type that may not be unique.

At the same time, SROA considered an OpName on a variable to be a use of
the entire variable.  That has been fixed.

Fixes #2209.
2018-12-19 17:07:29 +00:00
Steven Perron
9e81c337f9
Place load after OpPhi instructions in block. (#2246)
We currently place the load instructions at the start of the basic block
that dominates all of the loads.  If that basic block contains OpPhi
instructions, then this will generate invalid code.  We just need to
search for a location that comes after all of the OpPhi instructions.

Fixes #2204.
2018-12-19 15:18:22 +00:00
Steven Perron
5ec2d1a8cd
Don't fold specialized branches in loop unswitch (#2245)
* Don't fold specialized branchs in loop unswitch

Folding branches can have a lot of special cases, and can be a little
error prone.  So I only want it in one place.  That will be in dead
branch elimination.  I will change loop unswitching to set the branches
that were being folded to have a constant condition.  Then subsequent
pass of dead branch elimination will be able to remove the code.

At the same time, I added a check that loop unswitching will not
unswitch a branch with a constant condition.  It is not useful to do it
because dead branch elimination will simple fold the branch anyway.
Also it avoid an infinite loop that would other wise be introduced by my
first change.

Fixes #2203.
2018-12-19 04:40:30 +00:00
Ryan Harrison
47c08a79c4
Implement initial --webgpu-mode flag (#2217)
Fixes #2166
2018-12-18 15:10:34 -05:00
Steven Perron
acd2781952
Handle id overflow in inlining. (#2196)
Have inlining return Failure if the ids overflow.

Part of #1841.
2018-12-18 19:34:03 +00:00
Steven Perron
1254335d13
Don't unswitch the latch block. (#2205)
Loop unswitching is unswitching the conditional branch that creates the
back-edge. In the version of the loop, where the bachedge is not taken,
there is no back-edge. This is what causes the validator to complain.

The solution I will go with will be to now unswitch a condition with a
back-edge. At this time we do not now if loop unswitching is used. We do
not include it in the optimization sets provided, nor is it used in
glslang's set. When there are opportunities and no breaks from the loop,
the loop with either be a single iteration loop, or an infinite loop.
There is no performance advantage to performing loop unswitching in
either of those cases. If there is a break, maintaining structured
control flow will be tricky. Unless we see a clear advantage to handling
these case, I would go with the safer simpler solution.

Fixes #2201.
2018-12-18 18:15:00 +00:00
Steven Perron
ff07c6df83
SSA-rewriter: make sure phi entries are unique. (#2206)
If there are multiple edges to a basic block, then the ssa rewriter will
create OpPhi instructions with duplicate entries.  This is invalid, and
it is fixed in this commit.

Fixes #2202.
2018-12-18 18:14:27 +00:00
Ryan Harrison
e0292c269d
Add --target-env flag to spirv-opt (#2216)
Fixes #2199
2018-12-17 16:54:23 -05:00
Jeff Bolz
24328a0554 Recognize OpTypeAccelerationStructureNV as a type instruction (#2190) 2018-12-11 19:03:55 -05:00
Steven Perron
e07dabc25f
Invalidate the decoration manager at the start of ADCE. (#2189)
* Invalidate the decoration manager at the start of ADCE.

If the decoration manager is kept live the the contex will try to keep
it up to date.  ADCE deals with group decorations by changing the
operands in |OpGroupDecorate| instructions directly without informing
the decoration manager.  This puts it in an invalid state, which will
cause an error when the context tries to update it.  To Avoid this
problem, we will invalidate the decoration manager upfront.

At the same time, the decoration manager is now considered when checking
the consistency of the decoration manager.
2018-12-10 13:24:33 -05:00
Steven Perron
0bc66a8ba9
Fix invalid OpPhi generated by merge-return. (#2172)
* Fix invalid OpPhi generated by merge-return.

When we create a new phi node for a value say %10, we have to replace
all of the uses of %10 that are no longer dominated by the def of %10
by the result id of the new phi.  However, if the use is in a phi node,
it is possible that the bb contains the use is not dominated by either.
In this case, needs to be handled differently.

* Split loop headers before add a new branch to them.

In merge return, Phi node in loop header that are also merges for loop
do not get updated correctly.  Those cases do not fit in with our
current analysis.  Doing this will simplify the code by reducing the
number of cases that have to be handled.
2018-12-07 14:10:30 -05:00
Steven Perron
2e4563d94f
Document in the context what happens with id overflow. (#2159)
Added documentation to the ir context to indicates that TakeNextId()
returns 0 when the max id is reached.  TODOs were added to each call
sight so that we know where we have to start to handle this case.

Handle id overflow in |SplitLoopHeader|.

Handle id overflow in |GetOrCreatePreHeaderBlock|.

Handle failure to create preheader in LICM.

Part of https://github.com/KhronosGroup/SPIRV-Tools/issues/1841.
2018-12-06 09:07:00 -05:00
Steven Perron
17cba4695c
Remove undefined behaviour when folding shifts. (#2157)
We currently simulate all shift operations when the two operand are
constants.  The problem is that if the shift amount is larger than
32, the result is undefined.

I'm changing the folder to return 0 if the shift value is too high.
That way, we will have defined behaviour.

https://crbug.com/910937.
2018-12-04 10:04:02 -05:00
alan-baker
e510b1bac5
Update memory model (#1904)
Upgrade to VulkanKHR memory model

* Converts Logical GLSL450 memory model to Logical VulkanKHR
* Adds extension and capability
* Removes deprecated decorations and replaces them with appropriate
flags on downstream instructions
* Support for Workgroup upgrades
* Support for copy memory
* Adding support for image functions
* Adding barrier upgrades and tests
* Use QueueFamilyKHR scope instead of device
2018-11-30 14:15:51 -05:00
Steven Perron
2d2a512691
Don't inline recursive functions. (#2130)
* Move ProcessFunction* function from pass to the context.

There are a few functions that are used to traverse the call tree.
They currently live in the Pass class, but they have nothing to do with
a pass, and may be needed outside of a pass.  They would be better in
the ir context, or in a specific call tree class if we ever have a need
for it.

* Don't inline recursive functions.

Inlining does not check if a function is recursive or not.  This has
been fine as long as the shader was a Vulkan shader, which forbid
recursive functions.  However, not all shaders are vulkan, so either
we limit inlining to Vulkan shaders or we teach it to look for recursive
functions.

I prefer to keep the passes as general as is reasonable.  The change
does not require much new code in inlining and gives a reason to refactor
some other code.

The changes are to add a member function to the Function class that
checks if that function is recursive or not.

Then this is used in inlining to not inlining a function call if it calls
a recursive function.

* Add id to function analysis

There are a few places that build a map from ids to Function whose
result is that id.  I decided to add an analysis to the context for this
to reduce that code, and simplify some of the functions.

* Add missing file.
2018-11-29 14:24:58 -05:00
Alastair Donaldson
3b13040cf9 New spirv-reduce reduction pass: operand to dominating id. (#2099)
* Added a reduction pass to replace ids with ids of the same type that dominate them.
* Introduce helper method for querying whether an operand type is an input id.
2018-11-26 17:06:21 -05:00
Daniel Koch
3b210d6a63 Add basic support for EXT_fragment_invocation_density (#2100)
Whitelisting the extension in optimizations
* copying what was done for NV_shading_rate
2018-11-23 10:21:19 -05:00
dan sinclair
15fdcf94d7 Add missing override to ProcessLinesPass 2018-11-19 19:24:48 -05:00
greg-lunarg
c37388f1ad Add passes to propagate and eliminate redundant line instructions (#2027). (#2039)
These are bookend passes designed to help preserve line information
across passes which delete, move and clone instructions. The propagation
pass attaches a debug line instruction to every instruction based on
SPIR-V line propagation rules. It should be performed before optimization.
The redundant line elimination pass eliminates all line instructions
which match the previous line instruction. This pass should be performed
at the end of optimization to reduce physical SPIR-V file size.

Fixes #2027.
2018-11-15 14:06:17 -05:00
Greg Fischer
d4a10590b7 Fix Instruction::IsFloatingPointFoldingAllowed()
Was looking for decorations based on opcode. Should use result_id.
2018-11-14 15:25:51 -07:00
Steven Perron
dc9d155d62
Fix folding of volatile store. (#2048)
When looking for the Volatile mask on a store, the instruction folder
accesses an out-of-bounds element.  We fix that up.

Fixes crbug.com/903530.
2018-11-14 13:52:18 -05:00
Steven Perron
a6150a3fe7
Don't assert on void function parameters. (#2047)
The type manager in spirv-opt currently asserts if a function parameter
has type void.  It is not exactly clear from the spec that this is
disallowed, even if it probably will be disallowed.  In either case,
asserts should be used to verify assumptions that will actually make a
difference to the code.  As far as the optimizer is concerned, a void
parameter does not matter.  I don't see the point of the assert.  I'll
just remove it and let the validator decide whether to accept it or not.

No test was added because it is not clear that it is legal, and should
not force us to accept it in the future unless the spec make it clear
that it is legal.

Fixes crbug.com/903088.
2018-11-14 12:43:43 -05:00
Steven Perron
ec5574a9c6
Instruction::GetBaseAddress to handle OpPtrAccessChain (#2050)
That function currently only handled OpPtrAccessChain if it was in the
middle of the chain, but not at the start.  Fixing that up.

Fixes crbug.com/905271.
2018-11-14 12:42:25 -05:00
dan sinclair
f343a15764
Add missing overrides (#2041) 2018-11-12 15:11:32 -05:00
greg-lunarg
1e9fc1aac1 Add base and core bindless validation instrumentation classes (#2014)
* Add base and core bindless validation instrumentation classes

* Fix formatting.

* Few more formatting fixes

* Fix build failure

* More build fixes

* Need to call non-const functions in order.

Specifically, these are functions which call TakeNextId(). These need to
be called in a specific order to guarantee that tests which do exact
compares will work across all platforms. c++ pretty much does not
guarantee order of evaluation of operands, so any such functions need to
be called separately in individual statements to guarantee order.

* More ordering.

* And more ordering.

* And more formatting.

* Attempt to fix NDK build

* Another attempt to address NDK build problem.

* One more attempt at NDK build failure

* Add instrument.hpp to BUILD.gn

* Some name improvement in instrument.hpp

* Change all types in instrument.hpp to int.

* Improve documentation in instrument.hpp

* Format fixes

* Comment clean up in instrument.hpp

* imageInst -> image_inst

* Fix GetLabel() issue.
2018-11-08 13:54:54 -05:00
greg-lunarg
6721478ef1 Don't assume one return means function can be inlined. (#2018) (#2025)
If there is only 1 return and it is in a loop, then the function cannot be inlined.

Fix condition when inlined code needs one-trip loop wrapper.  The dummy loop is needed when there is a return inside a selection construct.  Even if there is only 1 return.
2018-11-08 09:11:20 -05:00
Jeff Bolz
c06a35b902 Rename PCH macro to spvtools_pch to avoid conflicts with other projects. Also add pch to test/opt. (#2034) 2018-11-07 09:15:04 -05:00
Jeff Bolz
60fac96c6b Enable precompiled headers for spirv-tools(-shared) and some unit tests (#2026) 2018-11-06 09:26:23 -05:00
Steven Perron
f2cc71e5cb
Handle OpMemberDecorateStringGOOGLE in ACDE (#2029)
Add missing case to the switch statement for the annotation
instructions.

See https://github.com/KhronosGroup/glslang/issues/1561.
2018-11-02 13:42:45 -04:00
Jeff Bolz
fb996dce75 Add /Zm flag as a workaround for VS2013 build (#2023) 2018-10-31 07:59:43 -04:00
Steven Perron
6647884a13
Remove MemberDecorateStringGOOGLE during stript-refect. (#2021)
The strip-reflect pass is not removing the reflection decorations that
are decorating members.  With this commit, they will now be removed.

Fixes #2019.
2018-10-30 16:17:35 -04:00
alelenv
1c1e749f0b Add support for nv-raytracing-final (#2010)
Add support for nv-raytracing (non-experimental)
2018-10-25 14:07:46 -04:00
Steven Perron
18fe6d59e5
Fix dead branch elim infinite loop. (#2009)
When looking for a break from a selection construct, we do not realize
that a jump to the continue target of a loop containing the selection
is a break.  This causes and infinit loop, or possibly other failures.

Fixes #2004.
2018-10-24 09:10:30 -04:00
Steven Perron
0ba35798c3
Fix dead branch elim infinite loop. (#1997)
When looking for a break from a selection construct, we do not need to
look inside nested constructs.  However, if a loop header has an
unconditional branch, then we enter the loop.  Entering the loop causes
an infinite loop because we keep going through the loop.

The solution is to look for a merge block, if one exsits, even for block
terminated by an OpBranch.

Fixes #1979.
2018-10-22 13:59:20 -04:00
alan-baker
6e85d1a6fc
Fix restrictions in if conversion (#1998)
Fixes #1991

* Improved identification of potential conditional branches
* Pass changed to only work for shaders
* added a test to catch the bug
2018-10-19 15:16:46 -04:00
Jeff Bolz
dd1e837e1c Use per-configuration location for pch file (#1989) 2018-10-19 14:58:26 -04:00
Steven Perron
715afb0cea
Add a nullptr check to array copy propagation. (#1987)
We are missing a check for a nullptr that is causing things to fail.

Added an extra test case, and fixed up others.

This is the fix for https://github.com/Microsoft/DirectXShaderCompiler/issues/1598.
2018-10-19 12:53:40 -04:00
greg-lunarg
c4687889b7 Fix ADCE to treat OpUnreachable correctly during liveness analysis (#1984)
ADCE liveness algorithm should treat OpUnreachable at least like other
branch instructions. It was being treated as always live which was
preventing useless structured constructs from being eliminated.
OpUnreachable is generated by dead branch elimination which is now
being required by merge return, so this fix should accompany that
change.
2018-10-19 10:16:35 -04:00
Steven Perron
0e68bb3632
Only run merge-returnon reachable functions. (#1983)
We currently run merge-return on all functions, but
dead-branch-elimination only runs on function reachable from an entry
point or exported function.  Since dead-branch-elimination is needed for
merge-return, they have to match.

Fixes #1976.
2018-10-18 08:48:27 -04:00
greg-lunarg
ab45d69154 Fix ADCE liveness to include all enclosing control structures. (#1975)
Was removing control structures which didn't have data dependency
with enclosed live loop and otherwise did not contain live code.
An example is a counting loop around a live loop.

Fixes #1967.
2018-10-16 08:00:07 -04:00
Jeff Bolz
339d23275d Enable precompiled headers for MSVC (#1969) 2018-10-15 11:12:02 -04:00
greg-lunarg
e545564887 Consider atomics that load when analyzing live stores in ADCE (#1956) (#1958)
Consider atomics that load when analyzing live stores in ADCE.

Previously it asserted that the base of an OpImageTexelPointer should
be an image. It is actually a pointer to an image, so IsValidBasePointer
should suffice.
2018-10-12 08:46:35 -04:00
Steven Perron
82663f34c9
Check for unreachable blocks in merge-return. (#1966)
Merge return assumes that the only unreachable blocks are those needed
to keep the structured cfg valid.  Even those must be essentially empty
blocks.

If this is not the case, we get unpredictable behaviour.  This commit
add a check in merge return, and emits an error if it is not the case.

Added a pass of dead branch elimination before merge return in both the
performance and size passes.  It is a precondition of merge return.

Fixes #1962.
2018-10-10 15:18:15 -04:00
Steven Perron
4e266f775a
Fold divisions by 0. (#1963)
The current implementation in the folder when seeing a division by zero
is to assert.  In the release build, the compiler will attempt to
compute the value, which causes its own problems.

The solution I will go with is to fold the division, and just give it
the value of 0.  The same goes for remainder and mod operations.

Fixes #1961.
2018-10-10 11:17:26 -04:00
Steven Perron
497958d899 Removing HLSLCounterBuffer decorations when not needed. (#1954)
The HlslCounterBufferGOOGLE that was introduced changed the OpDecorateId
so that is can now reference an id other than the target.  If that other
id is used only in the decoration, then the definition of the id will be
removed because decoration do not count as real uses.

However, if the target of the decoration is still live the decoration
will not be removed.  This leaves a reference to an id that is not
defined.

There are two solutions to consider.  The first is that is the decoration
is kept, then the definition of the id should be kept live.  Implementing
this change would be involved because the way ADCE handles decorations
will have to be reimplemented.

The other solution is to remove the decoration the id is otherwise dead.
This works for this specific case.  Also this is the more desirable
behaviour in this case.  The id will always be the id of a variable that
belongs to a descriptor set.  If that variable is not bound and we do
not remove it, the driver will complain.

I chose to implement the second solution.  The first will be left to when
a case for it comes up.

Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1885.
2018-10-05 08:23:09 -04:00
Alan Baker
3b5960174f Don't scalarize spec constant sized arrays
Fixes #1952

* Prevent scalarization of arrays that are sized by a specialization
constant
2018-10-04 11:58:23 -04:00
Steven Perron
146eb3bdcf
Fix erroneous uses of the type manager in copy-prop-arrays. (#1942)
There are a few spots where copy propagate arrays is trying
to go from a Type to an id, but the type is not unique.  When generating
code this pass needs specific ids, otherwise we get type mismatches.
However, the ambigous types means we can sometimes get the wrong type
and generate invalid code.

That code has been rewritten to not rely on the type manager, and just
look at the instructions instead.

I have opened https://github.com/KhronosGroup/SPIRV-Tools/issues/1939 to
try to get a way to make this more robust.
2018-10-01 14:45:44 -04:00
Jeff Bolz
fe90a1d2dc Enable /MP4 (parallel build across 4 cores for MSVC) for SPIRV-Tools/source[/opt] (#1930) 2018-10-01 10:47:39 -04:00
Steven Perron
ddc705933d
Analyze uses for all instructions. (#1937)
* Analyze uses for all instructions.

The def-use manager needs to fill in the `inst_to_used_ids_` field for
every instruction.  This means we have to analyze the uses for every
instruction, even if they do not have any uses.

This mistake was not found earlier because there was a typo in the
equality check for def-use managers.  No new tests are needed.

While looking into this I found redundant work in block merge.  Cleaning
that up at the same time.

* Fix other transformations

Aggressive dead code elimination did not update the OpGroupDecorate
and the OpGroupMemberDecorate instructions properly when they are
updated.  That is fixed.

Dead branch elimination did not analyze the OpUnreachable instructions
that is would add.  That is taken care of.
2018-09-28 14:39:06 -04:00
Steven Perron
32381e30ef
Handle decoration groups with no decorations. (#1921)
In DecorationManager::RemoveDecorationsFrom, we do not remove the id
from a decoration group if the group has no decorations.  This causes
problems because KillNamesAndDecorates is suppose to remove all
references to the id, but in this case, there is still a reference.

This is fixed by adding a special case.

Also, there is the possibility of a double free because
RemoveDecorationsFrom will delete the instructions defining |id| when
|id| is a decoration group.  Later, KillInst would later write to memory
that has been deleted when trying to turn it into a Nop.  To fix this,
we will only remove the decorations that use |id| and not its definition
in RemoveDecorationsFrom.
2018-09-28 14:16:04 -04:00
Steven Perron
80564a56ec
Keep analyses live in unrolling (#1929)
Add code to keep the def-use manger and the inst-to-block mapping up-to-date. This means we do not have to rebuild them later.

To make this work, we will have to have to find places to update the
def-use manager. Updating the def-use manager is not straight forward
because we are unrolling loops, and we have circular references.

This forces one pass to register all of the definitions. A second one
to analyze the uses. Also because there will be references to the new
instructions in the old code, we want to register the definitions of the
new instructions early, so we can update the uses of the older code as
we go along.

The inst-to-block mapping is not too difficult. It can be done as instructions are created.

Fixes #1928.
2018-09-26 17:36:27 -04:00
Steven Perron
0e5fc7d75e
Allow 0 as argument to scalar replacement. (#1917)
A limit of 0 for the scalar replacement options it used to indicate that
there is no limit.  The current implementation does not allow 0.  This
should be fixed.
2018-09-26 09:58:28 -04:00
Steven Perron
b85fb4a300
Get KillNameAndDecorates to handle group decorations. (#1919)
It seems like the current implementation of KillNameAndDecorates does
not handle group decorations correctly.  The id being removed is not
removed from the OpGroupDecorate instructions.  Even worst, any
decorations that apply to that group are removed.

The solution is to use the function in the decoration manager that will
remove the decorations and update the instructions instead of doing the
work itself.
2018-09-25 12:57:44 -04:00
Chao Chen
6e2dab2ffd Add support for Nvidia Turing extensions 2018-09-19 20:46:14 -04:00
Steven Perron
9fbcce4ca1
Add unrolling to the legalization passes (#1903)
Adds unrolling to the legalization passes.

After enabling unrolling I found a bug when there is a self-referencing
phi node.  That has been fixed.

The test that checks for that the order of optimizations is correct also
needed to be updated.
2018-09-19 16:40:09 -04:00
Steven Perron
7075c49923
Add dummy loop in merge-return. (#1896)
The current implementation of merge return can create bad, but correct,
code.  When it is not in a loop construct, it will insert a lot of
extra branch around code.  The potentially large number of branches are
bad.  At the same time, it can separate code store to variables from
its uses hiding the fact that the store dominates the load.

This hurts the later analysis because the compiler thinks that multiple
values can reach a load, when there is really only 1.  This poorer
analysis leads to missed optimizations.

The solution is to create a dummy loop around the entire body of the
function, then we can break from that loop with a single branch.  Also
only new merge nodes would be those at the end of loops meaning that
most analysies will not be hurt.

Remove dead code for cases that are no longer possible.

It seems like some drivers expect there the be an OpSelectionMerge
before conditional branches, even if they are not strictly needed.
So we add them.
2018-09-18 08:52:47 -04:00
Steven Perron
5f599e700e
Fix infinite loop in dead-branch-elimination (#1891)
* Create structed cfg analysis.

There are lots of optimization that have to traverse the CFG in a
structured order just because it wants to know which constructs a
basic block in contained in.  This adds extra complexity to these
optimizations, for causes too much refactoring of older optimizations.

To help with this problem, I have written an analysis that can give this
information.

* Identify branches breaking from loops.

Dead branch elimination does a search for a conditional branch to the
end of the current selection construct.  This search assumes that the
only way to leave the construct is through the merge node.  But that is
not true.  The code can jump to the merge node of a loop that contains
the construct.

The search needs to take this into consideration.
2018-09-17 13:00:24 -04:00
Diego Novillo
4a4632264e Add IR dumping functions to use during debugging.
When using lldb and/or gdb I frequently get odd std::string failures
when using the IR printing instructions we have now.  This adds the
methods  Instruction::Dump(), BasicBlock::Dump() and Function::Dump() to
emit the output of the pretty print to stderr.

With this I can now reliably print IR from gdb and lldb sessions.
2018-09-14 14:28:34 -04:00
Steven Perron
6d5f1bc2e8
Allow merge blocks to merge two header blocks in some cases. (#1890)
In merge blocks, we do not allow the merging of two blocks with merge
instructions.  This is because if the two block are merged only 1 of
those instructions can exists.  However, if the successor block is the
merge block of the predecessor, then we can delete the merge instruction
in the predecessor.  In this case, we are able to merge the blocks.
2018-09-14 13:37:18 -04:00
Steven Perron
75c1bf2843
Add option for the max id bound. (#1870)
* Create a new entry point for the optimizer

Creates a new struct to hold the options for the optimizer, and creates
an entry point that take the optimizer options as a parameter.

The old entry point that takes validator options are now deprecated.
The validator options will be one of the optimizer options.

Part of the optimizer options will also be the upper bound on the id bound.

* Add a command line option to set the max value for the id bound.  The default is 0x3FFFFF.

* Modify `TakeNextIdBound` to return 0 when the limit is reached.
2018-09-10 11:49:41 -04:00
Steven Perron
416b1ab4f3
Have the constant manager take ownership of constants. (#1866)
* Have the constant manager take ownership of constants.

Right now the owner of an object of type contant that is in the
|const_pool_| of the constant manager is unclear.  The constant
manager does not delete them, there is no other reasonable owner.  This
causes memory leaks.

This change fixes the memory leaks by having the constant manager
take ownership of the constant that is stores in |const_pool_|.  Other
changes include interface changes to make it explicit that the constant
manager takes ownership of the object when a constant is registered
with the constant manager.

Fixes #1865.
2018-08-27 09:53:47 -04:00
Steven Perron
47ee776a2c Revert "Have the constant manager take ownership of constants."
This reverts commit b938b74bac.
2018-08-24 15:12:49 -04:00
Steven Perron
b938b74bac Have the constant manager take ownership of constants.
Right now the owner of an object of type contant that is in the
|const_pool_| of the constant manager is unclear.  The constant
manager does not delete them, there is no other reasonable owner.  This
causes memory leaks.

This change fixes the memory leaks by having the constant manager
take ownership of the constant that is stores in |const_pool_|.  Other
changes include interface changes to make it explicit that the constant
manager takes ownership of the object when a constant is registered
with the constant manager.
2018-08-24 15:08:12 -04:00
Steven Perron
d746681fe9
Copy decorations when creating new ids. (#1843)
* Copy decorations when creating new ids.

When creating a new value based on an old value, we need to copy the
decorations to the new id.  This change does this in 3 places:

1) The variable holding the return value of the function generated by
merge return should get decorations from the function.

2) The results of the OpPhi instructions should get decorations from the
variable they are replacing in the ssa writer.

3) In local access chain convert the intermediate struct (result of
OpCompositeInsert) generated for the store replacement should get its
decorations from the variable being stored to.

Fixes #1787.
2018-08-24 11:55:39 -04:00
Steven Perron
b4d3618f77
Don't "break" from selection constructs. (#1862)
If seems like at least 1 driver does not like a condition jump to the end
of a selection construct.  We are generating these in the merge return
pass.  This change stops merge return from generating this sequence.

Part of #1861.
2018-08-23 14:38:25 -04:00
Steven Perron
6c73b1fb70
Update the order when predicating blocks. (#1859)
When doing predicate blocks, we need to traverse every block in
structured order in order to keep track of which construct a block is
contained in.  The standard way of traversing code in structured order
is to create a list with all of the nodes in order.  However, when
predicating blocks, new blocks are created, and those blocks are missed.
This causes branches that go too far.

The solution is to update the order as new blocks are created.  Since
we are using an std::list, we do not have to worry about invalidation of
iterators when changing the list.
2018-08-23 12:59:31 -04:00
Steven Perron
d91d34e150
Fix VS2013 build break. (#1853) 2018-08-21 13:50:47 -04:00
Steven Perron
19264ef42c
Have PredicateBlocks jump the existing merge blocks. (#1849)
* Refactor PredicateBlocks

Refactor PredicateBlocks so that we know which constructs a return
is contained in.  Will be used later.

* Have PredicateBlocks jump the existing merge blocks.

In PredicateBlocks, we currently skip instructions with side effects,
but it still follows the same control flow (sort-of).  This causes a
problem, when we are trying to predicate code in a loop.  We skip all
of the code with side effects (IV increment), but still follow the
same control flow (jump back the start of the loop).  This creates an
infinite loop because the code will keep jumping back to the start of
the loop without changing the values that effect the exit condition.

This is a large change to merge-return.  When predicating a block that
is in a loop or merge construct, it will jump to the merge block of the
construct.  Once out of all constructs we will generate code as we did
before.
2018-08-21 12:04:08 -04:00
Steven Perron
d693a83e36
Handle breaks from structured-ifs in DCE. (#1848)
* Handle breaks from structured-ifs in DCE.

dead code elimination assumes that are conditional branches except for
breaks and continues in loops will have an OpSelectionMerge before them.
That is not true when breaking out of a selection construct.

The fix is to look for breaks in selection constructs in the same place
we look for breaks and continues for loops.
2018-08-21 11:54:44 -04:00
Steven Perron
45c235d41f
Have dead-branch-elim handle conditional exits from selections. (#1850)
When dead-branch-elim folds a conditional branch, it also deletes the
OpSelectionMerge instruction.  If that construct contains a
conditional branch to the merge node, it will not have its own
OpSelectionMerge.  When the headers merge instruction is deleted, the
the inner conditional branch will no longer be legal.  It will be a
selection to a node that is not a merge node.

We fix this up by moving the OpSelectionMerge to a new location if it is
still needed.
2018-08-21 11:49:56 -04:00
Diego Novillo
03000a3a38 Add testing framework for tools.
This forks the testing harness from https://github.com/google/shaderc
to allow testing CLI tools.

New features needed for SPIRV-Tools include:

1- A new PlaceHolder subclass for spirv shaders.  This place holder
   calls spirv-as to convert assembly input into SPIRV bytecode. This is
   required for most tools in SPIRV-Tools.

2- A minimal testing file for testing basic functionality of spirv-opt.

Add tests for all flags in spirv-opt.

1. Adds tests to check that known flags match the names that each pass
   advertises.
2. Adds tests to check that -O, -Os and --legalize-hlsl schedule the
   expected passes.
3. Adds more functionality to Expect classes to support regular
   expression matching on stderr.
4. Add checks for integer arguments to optimization flags.
5. Fixes #1817 by modifying the parsing of integer arguments in
   flags that take them.
6. Fixes -Oconfig file parsing (#1778). It reads every line of the file
   into a string and then parses that string by tokenizing every group of
   characters between whitespaces (using the standard cin reading
   operator).  This mimics shell command-line parsing, but it does not
   support quoting (and I'm not planning to).
2018-08-17 15:03:14 -04:00
Steven Perron
e065cc208f
Keep decorations when replacing loads in access-chain-convert. (#1829)
In local-access-chain-convert, we replace loads by load the entire
variable, then doing the extract.  The extract will have the same value
as the load.  However, if the load has a decoration on it, the
decoration is lost because we do not copy any them to the new id.

This is fixed by rewritting the load into the extract and keeping the
same result id.

This change has the effect that we do not call DCEInst on the loads
because the load is not being deleted, but replaced.  This could leave
OpAccessChain instructions around that are not used.  This is not a
problem for -O and -Os.  They run local_single_*_elim passes and then
dead code elimination.  The dce will remove the unused access chains,
and the load elimination passes work even if there are unused access
chains.  I have added test to them to ensure they will not loss
opportunities.

Fixes #1787.
2018-08-15 09:14:21 -04:00
dan sinclair
1963a2dbda
Use MakeUnique. (#1837)
This CL replaces instances of reset(new ..) with MakeUnique.
2018-08-14 15:01:50 -04:00
dan sinclair
1553025f4c
Move make_unique to source/util. (#1836)
This MakeUnique code is used in places other then source/opt so move it
to source/utils.
2018-08-14 12:44:54 -04:00
Steven Perron
bf24d9b4ac
Don't copy decorations twice when rebuilding a type. (#1835)
In `TypeManager::RebuildType`, the base cases call `Clone`, which will
copy the decorations for the type.  After that it breaks out of the
switch statement and copies the decorations again.

This has not causes any real problems yet because none of those types
are allowed to have decorations.  However to make the code more robust
it is best to not copy twice because it should be empty.

This way if a new base type or decoration is added that changes this
rule the code will be correct.
2018-08-14 11:26:14 -04:00
Steven Perron
bcb0b6935c
Reenable --skip-validation. (#1820)
In previous changes, the option `--skip-validation` was disabled.  This
change is to reenable it.
2018-08-13 13:18:46 -04:00
Steven Perron
5c8b4f5a1c
Validate the input to Optimizer::Run (#1799)
* Run the validator in the optimization fuzzers.

The optimizers assumes that the input to the optimizer is valid.  Since
the fuzzers do not check that the input is valid before passing the
spir-v to the optimizer, we are getting a few errors.

The solution is to run the validator in the optimizer to validate the
input.

For the legalization passes, we need to add an extra option to the
validator to accept certain types of variable pointers, even if the
capability is not given.  At the same time, we changed the option
"--legalize-hlsl" to relax the validator in the same way instead of
turning it off.
2018-08-08 11:16:19 -04:00
dan sinclair
9991d661f8
Fix readbility/braces warnings (#1804) 2018-08-07 09:09:47 -04:00
dan sinclair
eda2cfbe12
Cleanup includes. (#1795)
This Cl cleans up the include paths to be relative to the top level
directory. Various include-what-you-use fixes have been added.
2018-08-03 15:06:09 -04:00
dan sinclair
58a6876cee
Rewrite include guards (#1793)
This CL rewrites the include guards to make PRESUBMIT.py include guard
check happy.
2018-08-03 08:05:33 -04:00
Steven Perron
ce644d4a24
Update OpPhi instructions after splitting block. (#1783)
In the merge return pass, we will split a block, but not update the phi
instructions that reference the block.  Since the branch in the original
block is now part of the block with the new id, the phi nodes must be
updated.

This commit will change this.

I have also considered other places where an id of a basic block could
be referenced, and I don't think any of them need to change.

1) Branch and merge instructions: These jump to the start of the
original block, and so we want them to jump to the block that uses the
original id.  Nothing needs to change.

2) Names and decorations: I don't think it matters with block keeps the
name, and there are no decorations that apply to basic blocks.

Fixes #1736.
2018-08-02 11:02:50 -04:00
dan sinclair
a5a5ea0e2d
Remove using std::<foo> statements. (#1756)
Many of the files have using std::<foo> statements in them, but then the
use of <foo> will be inconsistently std::<foo> or <foo> scattered
through the file. This CL removes all of the using statements and
updates the code to have the required std:: prefix.
2018-08-01 14:58:12 -04:00
Steven Perron
c8c724cba7
Don't change decorations and names in merge return. (#1777)
When creating a new phi for a value in the function, merge return will
rewrite all uses of an id that are no longer dominated by its
definition.  Uses that are not in a basic block, like OpName or
decorations, are not dominated, but they should not be replaced.

Fixes #1736.
2018-08-01 13:47:09 -04:00
Alan Baker
755e5c9420 Transform to combine consecutive access chains
* Combines OpAccessChain, OpInBoundsAccessChain, OpPtrAccessChain and
OpInBoundsPtrAccessChain
* New folding rule to fold add with 0 for integers
 * Converts to a bitcast if the result type does not match the operand
 type
V
2018-07-31 13:42:47 -04:00
Diego Novillo
99fe61e724 Add API to create passes out of a list of command-line flags.
This re-implements the -Oconfig=<file> flag to use a new API that takes
a list of command-line flags representing optimization passes.

This moves the processing of flags that create new optimization passes
out of spirv-opt and into the library API.  Useful for other tools that
want to incorporate a facility similar to -Oconfig.

The main changes are:

1- Add a new public function Optimizer::RegisterPassesFromFlags. This
   takes a vector of strings.  Each string is assumed to have the form
   '--pass_name[=pass_args]'.  It creates and registers into the pass
   manager all the passes specified in the vector.  Each pass is
   validated internally.  Failure to create a pass instance causes the
   function to return false and a diagnostic is emitted to the
   registered message consumer.

2- Re-implements -Oconfig in spirv-opt to use the new API.
2018-07-27 15:10:08 -04:00
Alan Baker
b49f76fd62 Handle undef literal value in vector shuffle
Fixes #1731

* Updated folding rules related to vector shuffle to account for the
undef literal value:
 * FoldVectorShuffleFeedingShuffle
 * FoldVectorShuffleFeedingExtract
 * FoldVectorShuffleWithConstants
* These rules would commit memory violations due to treating the undef
literal value as an accessible composite component
2018-07-20 11:32:43 -04:00
dan sinclair
effafedcee
Replace opt::Instruction type and result cache with flags. (#1718)
Currentlty opt::Instruction class holds a cache of the result_id and
type_id for the instruction. That cache needs to be updated if the
underlying operand values are changes.

This CL changes the cache to being a flag if there is a type or result
id for the instruction. We then retrieve the value if needed from the
operands.
2018-07-20 11:09:30 -04:00