Commit Graph

23 Commits

Author SHA1 Message Date
dan sinclair
eda2cfbe12
Cleanup includes. (#1795)
This Cl cleans up the include paths to be relative to the top level
directory. Various include-what-you-use fixes have been added.
2018-08-03 15:06:09 -04:00
dan sinclair
58a6876cee
Rewrite include guards (#1793)
This CL rewrites the include guards to make PRESUBMIT.py include guard
check happy.
2018-08-03 08:05:33 -04:00
dan sinclair
c7da51a085
Cleanup extraneous namespace qualifies in source/opt. (#1716)
This CL follows up on the opt namespacing CLs by removing the
unnecessary opt:: and opt::analysis:: namespace prefixes.
2018-07-12 15:14:43 -04:00
dan sinclair
f96b7f1cb9
use Pass::Run to set the context on each pass. (#1708)
Currently the IRContext is passed into the Pass::Process method. It is
then up to the individual pass to store the context into the context_
variable. This CL changes the Run method to store the context before
calling Process which no-longer receives the context as a parameter.
2018-07-12 09:08:45 -04:00
dan sinclair
e6b953361d
Move the ir namespace to opt. (#1680)
This CL moves the files in opt/ to consistenly be under the opt::
namespace. This frees up the ir:: namespace so it can be used to make a
shared ir represenation.
2018-07-09 11:32:29 -04:00
Steven Perron
2cb589cc14 Remove uses DCEInst and call ADCE
The algorithm used in DCEInst to remove dead code is very slow.  It is
fine if you only want to remove a small number of instructions, but, if
you need to remove a large number of instructions, then the algorithm in
ADCE is much faster.

This PR removes the calls to DCEInst in the load-store removal passes
and adds a pass of ADCE afterwards.

A number of different iterations of the order of optimization, and I
believe this is the best I could find.

The results I have on 3 sets of shaders are:

Legalization:

Set 1: 5.39 -> 5.01
Set 2: 13.98 -> 8.38
Set 3: 98.00 -> 96.26

Performance passes:

Set 1: 6.90 -> 5.23
Set 2: 10.11 -> 6.62
Set 3: 253.69 -> 253.74

Size reduction passes:

Set 1: 7.16 -> 7.25
Set 2: 17.17 -> 16.81
Set 3: 112.06 -> 107.71

Note that the third set's compile time is large because of the large
number of basic blocks, not so much because of the number of
instructions.  That is why we don't see much gain there.
2018-02-27 21:06:08 -05:00
Alan Baker
eb0c73dad6 Maintain instruction to block mapping in phi insertion
* Changed MemPass::InsertPhiInstructions to set basic blocks for new
phis
* Local SSA elim now maintains instr to block mapping
 * Added a test and confirmed it fails without the updated phis
* IRContext::set_instr_block no longer builds the map if the analysis is
invalid
* Added instruction to block mapping verification to
IRContext::IsConsistent()
2018-01-12 10:16:53 -05:00
Steven Perron
efe12ff5a1 Have all MemPasses preserve the def-use manager.
Originally the passes that extended from MemPass were those that are
of the def-use manager.  I am assuming they would be able to preserve
it because of that.

Added a check to verify consistency of the IRContext. The IRContext
relies on the pass to tell it if something is invalidated.
It is possible that the pass lied.  To help identify those situations,
we will check if the valid analyses are correct after each pass.

This will be enabled by default for the debug build, and disabled in the
production build.  It can be disabled in the debug build by adding
"-DSPIRV_CHECK_CONTEXT=OFF" to the cmake command.
2017-11-10 11:17:12 -05:00
Diego Novillo
d2938e4842 Re-format files in source, source/opt, source/util, source/val and tools.
NFC. This just makes sure every file is formatted following the
formatting definition in .clang-format.

Re-formatted with:

$ clang-format -i $(find source tools include -name '*.cpp')
$ clang-format -i $(find source tools include -name '*.h')
2017-11-08 14:03:08 -05:00
Steven Perron
476cae6f7d Add the IRContext (part 1)
This is the first part of adding the IRContext.  This class is meant to
hold the extra data that is build on top of the module that it
owns.

The first part will simply create the IRContext class and get it passed
to the passes in place of the module.  For now it does not have any
functionality of its own, but it acts more as a wrapper for the module.

The functions that I added to the IRContext are those that either
traverse the headers or add to them.  I did this because we may decide
to have other ways of dealing with these sections (for example adding a
type pool, or use the decoration manager).

I also added the function that add to the header because the IRContext
needs to know when an instruction is added to update other data
structures appropriately.

Note that there is still lots of work that needs to be done.  There are
still many places that change the module, and do not inform the context.
That will be the next step.
2017-10-31 13:46:05 -04:00
Diego Novillo
1040a95b3f Re-factor Phi insertion code out of LocalMultiStoreElimPass
Including a re-factor of common behaviour into class Pass:

The following functions are now in class Pass:

- IsLoopHeader.
- ComputeStructuredOrder
- ComputeStructuredSuccessors (annoyingly, I could not re-factor all
  instances of this function, the copy in common_uniform_elim_pass.cpp
  is slightly different and fails with the common implementation).
- GetPointeeTypeId
- TakeNextId
- FinalizeNextId
- MergeBlockIdIfAny

This is a NFC (non-functional change)
2017-10-27 15:28:08 -04:00
GregF
1e7994c085 Opt: Move *NextId functionality into MemPass 2017-10-13 15:22:19 -04:00
David Neto
33b879c105 elim-multi-store: only patch loop header phis that we created
There can already be OpPhi instructions in a loop header that
are unrelated to the optimization.  We should not be patching those.

Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/826
2017-09-21 10:01:30 -04:00
Greg Fischer
8be28f7524 ElimLocalMultiStore: Reset structured successors for each function 2017-09-19 13:47:28 -06:00
GregF
c8c86a0d36 Opt: Have "size" passes process full entry point call tree.
Includes code to deal correctly with OpFunctionParameter. This
is needed by opaque propagation which may not exhaustively inline
entry point functions.

Adds ProcessEntryPointCallTree: a method to do work on the
functions in the entry point call trees in a deterministic order.
2017-08-18 10:16:01 -04:00
GregF
f0fe601dc8 AccessChainConvert: Add HasOnlySupportedRefs()
This avoids conversion on variables which will not ultimately be optimized.
Also removed an obsolete restriction from FindTargetVars(). Also added
decorates to supported refs (eg. RelaxedPrecision). Also fixed name to
IsNonTypeDecorate().
2017-08-04 18:11:44 -04:00
GregF
c1b46eedbd Add MemPass, move all shared functions to it. 2017-08-02 14:24:02 -04:00
GregF
7954740d54 Opt: Delete names and decorations of dead instructions 2017-07-26 18:36:41 -04:00
Lei Zhang
4a539d77ef Revert "Revert "Opt: LocalBlockElim: Add HasOnlySupportedRefs""
This reverts commit df96e243c6.
2017-07-25 23:22:09 -04:00
GregF
1182415581 Add extension whitelists to size-reduction passes.
Currently only SPV_KHR_variable_pointers is disallowed in passes which
do pointer analysis. Positive and negative tests of the general extensions
mechanism were added to aggressive_dce but cover all passes.
2017-07-25 19:14:02 -04:00
Lei Zhang
df96e243c6 Revert "Opt: LocalBlockElim: Add HasOnlySupportedRefs"
This reverts commit 2d0f7fbc11.
2017-07-22 10:48:56 -04:00
greg-lunarg
2d0f7fbc11 Opt: LocalBlockElim: Add HasOnlySupportedRefs
Verifies that targeted variables have only access chain and direct
loads and stores as references.
2017-07-22 10:32:19 -04:00
GregF
cc8bad3a5b Add LocalMultiStoreElim pass
A SSA local variable load/store elimination pass.
For every entry point function, eliminate all loads and stores of function
scope variables only referenced with non-access-chain loads and stores.
Eliminate the variables as well.

The presence of access chain references and function calls can inhibit
the above optimization.

Only shader modules with logical addressing are currently processed.
Currently modules with any extensions enabled are not processed. This
is left for future work.

This pass is most effective if preceeded by Inlining and
LocalAccessChainConvert. LocalSingleStoreElim and LocalSingleBlockElim
will reduce the work that this pass has to do.
2017-07-07 17:54:21 -04:00