Commit Graph

30 Commits

Author SHA1 Message Date
Chao Chen
6e2dab2ffd Add support for Nvidia Turing extensions 2018-09-19 20:46:14 -04:00
dan sinclair
eda2cfbe12
Cleanup includes. (#1795)
This Cl cleans up the include paths to be relative to the top level
directory. Various include-what-you-use fixes have been added.
2018-08-03 15:06:09 -04:00
dan sinclair
c7da51a085
Cleanup extraneous namespace qualifies in source/opt. (#1716)
This CL follows up on the opt namespacing CLs by removing the
unnecessary opt:: and opt::analysis:: namespace prefixes.
2018-07-12 15:14:43 -04:00
dan sinclair
f96b7f1cb9
use Pass::Run to set the context on each pass. (#1708)
Currently the IRContext is passed into the Pass::Process method. It is
then up to the individual pass to store the context into the context_
variable. This CL changes the Run method to store the context before
calling Process which no-longer receives the context as a parameter.
2018-07-12 09:08:45 -04:00
dan sinclair
e6b953361d
Move the ir namespace to opt. (#1680)
This CL moves the files in opt/ to consistenly be under the opt::
namespace. This frees up the ir:: namespace so it can be used to make a
shared ir represenation.
2018-07-09 11:32:29 -04:00
David Neto
a91cbfbf75 Optimizer: update extension whitelists
Add two new extensions:
- SPV_NV_shader_subgroup_partitioned
- SPV_EXT_descriptor_indexing
2018-04-06 15:56:20 -04:00
Eleni Maria Stea
045cc8f75b Fixes compile errors generated with -Wpedantic
This patch fixes the compile errors generated when the options
SPIRV_WARN_EVERYTHING and SPIRV_WERROR (that force -Wpedantic) are
set to cmake.
2018-03-22 09:40:11 -04:00
Diego Novillo
735d8a579e SSA rewrite pass.
This pass replaces the load/store elimination passes.  It implements the
SSA re-writing algorithm proposed in

     Simple and Efficient Construction of Static Single Assignment Form.
     Braun M., Buchwald S., Hack S., Leißa R., Mallon C., Zwinkau A. (2013)
     In: Jhala R., De Bosschere K. (eds)
     Compiler Construction. CC 2013.
     Lecture Notes in Computer Science, vol 7791.
     Springer, Berlin, Heidelberg

     https://link.springer.com/chapter/10.1007/978-3-642-37051-9_6

In contrast to common eager algorithms based on dominance and dominance
frontier information, this algorithm works backwards from load operations.

When a target variable is loaded, it queries the variable's reaching
definition.  If the reaching definition is unknown at the current location,
it searches backwards in the CFG, inserting Phi instructions at join points
in the CFG along the way until it finds the desired store instruction.

The algorithm avoids repeated lookups using memoization.

For reducible CFGs, which are a superset of the structured CFGs in SPIRV,
this algorithm is proven to produce minimal SSA.  That is, it inserts the
minimal number of Phi instructions required to ensure the SSA property, but
some Phi instructions may be dead
(https://en.wikipedia.org/wiki/Static_single_assignment_form).
2018-03-20 20:56:55 -04:00
David Neto
2e3aec23ca Add recent Google extensions to optimizer whitelists
Optimizations should work in the presence of recent
SPV_GOOGLE_decorate_string and SPV_GOOGLE_hlsl_functionality1

SPV_GOOGLE_decorate_string:
- Adds operation OpDecorateStringGOOGLE to decorate an object with decorations
  having string operands.

SPV_GOOGLE_hlsl_functionality1:
- Adds HlslSemanticGOOGLE, used to decorate an interface variable with
  an HLSL semantic string.  Optimizations already preserve those variables
  as required because they are interface variables (with uses), independent
  of whether they have HLSL decorations.

- Adds HlslCounterBufferGOOGLE, used to associate a buffer with a
  counter variable.

Fixes #1391
2018-03-15 11:16:20 -04:00
Rex Xu
314cfa29b2 Add missing SPV extension strings 2018-03-08 21:54:00 +08:00
Steven Perron
2cb589cc14 Remove uses DCEInst and call ADCE
The algorithm used in DCEInst to remove dead code is very slow.  It is
fine if you only want to remove a small number of instructions, but, if
you need to remove a large number of instructions, then the algorithm in
ADCE is much faster.

This PR removes the calls to DCEInst in the load-store removal passes
and adds a pass of ADCE afterwards.

A number of different iterations of the order of optimization, and I
believe this is the best I could find.

The results I have on 3 sets of shaders are:

Legalization:

Set 1: 5.39 -> 5.01
Set 2: 13.98 -> 8.38
Set 3: 98.00 -> 96.26

Performance passes:

Set 1: 6.90 -> 5.23
Set 2: 10.11 -> 6.62
Set 3: 253.69 -> 253.74

Size reduction passes:

Set 1: 7.16 -> 7.25
Set 2: 17.17 -> 16.81
Set 3: 112.06 -> 107.71

Note that the third set's compile time is large because of the large
number of basic blocks, not so much because of the number of
instructions.  That is why we don't see much gain there.
2018-02-27 21:06:08 -05:00
Steven Perron
756b277fb8 Store all enabled capabilities in the feature manger.
In order to keep track of all of the implicit capabilities as well as
the explicit ones, we will add them all to the feature manager.  That is
the object that needs to be queried when checking if a capability is
enabled.

The name of the "HasCapability" function in the module was changed to
make it more obvious that it does not check for implied capabilities.

Keep an spv_context and AssemblyGrammar in IRContext
2017-12-21 11:14:53 -05:00
Steven Perron
79a00649b4 Allow pointers to pointers in logical addressing mode.
A few optimizations are updates to handle code that is suppose to be
using the logical addressing mode, but still has variables that contain
pointers as long as the pointer are to opaque objects.  This is called
"relaxed logical addressing".

|Instruction::GetBaseAddress| will check that pointers that are use meet
the relaxed logical addressing rules.  Optimization that now handle
relaxed logical addressing instead of logical addressing are:

 - aggressive dead-code elimination
 - local access chain convert
 - local store elimination passes.
2017-12-19 14:29:14 -05:00
Steven Perron
65046eca7c Change IRContext::KillInst to delete instructions.
The current method of removing an instruction is to call ToNop.  The
problem with this is that it leaves around an instruction that later
passes will look at.  We should just delete the instruction.

In MemPass there is a utility routine called DCEInst.  It can delete
essentially any instruction, which can invalidate pointers now that they
are actually deleted.  The interface was changed to add a call back that
can be used to update any local data structures that contain
ir::Intruction*.
2017-12-04 11:07:45 -05:00
Diego Novillo
83228137e1 Re-format source tree - NFC.
Re-formatted the source tree with the command:

$ /usr/bin/clang-format -style=file -i \
    $(find include source tools test utils -name '*.cpp' -or -name '*.h')

This required a fix to source/val/decoration.h.  It was not including
spirv.h, which broke builds when the #include headers were re-ordered by
clang-format.
2017-11-27 14:31:49 -05:00
Diego Novillo
d2938e4842 Re-format files in source, source/opt, source/util, source/val and tools.
NFC. This just makes sure every file is formatted following the
formatting definition in .clang-format.

Re-formatted with:

$ clang-format -i $(find source tools include -name '*.cpp')
$ clang-format -i $(find source tools include -name '*.h')
2017-11-08 14:03:08 -05:00
Steven Perron
476cae6f7d Add the IRContext (part 1)
This is the first part of adding the IRContext.  This class is meant to
hold the extra data that is build on top of the module that it
owns.

The first part will simply create the IRContext class and get it passed
to the passes in place of the module.  For now it does not have any
functionality of its own, but it acts more as a wrapper for the module.

The functions that I added to the IRContext are those that either
traverse the headers or add to them.  I did this because we may decide
to have other ways of dealing with these sections (for example adding a
type pool, or use the decoration manager).

I also added the function that add to the header because the IRContext
needs to know when an instruction is added to update other data
structures appropriately.

Note that there is still lots of work that needs to be done.  There are
still many places that change the module, and do not inform the context.
That will be the next step.
2017-10-31 13:46:05 -04:00
Diego Novillo
632e2068f3 More re-factoring to simplify pass initialization.
This implements two cleanups suggested by @s-perron
(https://github.com/KhronosGroup/SPIRV-Tools/pull/921):

- Move FindNamedOrDecoratedIds() into MemPass::InitializeProcessing().
- Remove FinalizeNextId(). Always call SetIdBound() from
  Pass::TakeNextId().
2017-10-30 09:06:17 -04:00
Diego Novillo
1040a95b3f Re-factor Phi insertion code out of LocalMultiStoreElimPass
Including a re-factor of common behaviour into class Pass:

The following functions are now in class Pass:

- IsLoopHeader.
- ComputeStructuredOrder
- ComputeStructuredSuccessors (annoyingly, I could not re-factor all
  instances of this function, the copy in common_uniform_elim_pass.cpp
  is slightly different and fails with the common implementation).
- GetPointeeTypeId
- TakeNextId
- FinalizeNextId
- MergeBlockIdIfAny

This is a NFC (non-functional change)
2017-10-27 15:28:08 -04:00
GregF
1e7994c085 Opt: Move *NextId functionality into MemPass 2017-10-13 15:22:19 -04:00
David Neto
33b879c105 elim-multi-store: only patch loop header phis that we created
There can already be OpPhi instructions in a loop header that
are unrelated to the optimization.  We should not be patching those.

Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/826
2017-09-21 10:01:30 -04:00
Greg Fischer
8be28f7524 ElimLocalMultiStore: Reset structured successors for each function 2017-09-19 13:47:28 -06:00
GregF
c8c86a0d36 Opt: Have "size" passes process full entry point call tree.
Includes code to deal correctly with OpFunctionParameter. This
is needed by opaque propagation which may not exhaustively inline
entry point functions.

Adds ProcessEntryPointCallTree: a method to do work on the
functions in the entry point call trees in a deterministic order.
2017-08-18 10:16:01 -04:00
GregF
f0fe601dc8 AccessChainConvert: Add HasOnlySupportedRefs()
This avoids conversion on variables which will not ultimately be optimized.
Also removed an obsolete restriction from FindTargetVars(). Also added
decorates to supported refs (eg. RelaxedPrecision). Also fixed name to
IsNonTypeDecorate().
2017-08-04 18:11:44 -04:00
GregF
c1b46eedbd Add MemPass, move all shared functions to it. 2017-08-02 14:24:02 -04:00
GregF
7954740d54 Opt: Delete names and decorations of dead instructions 2017-07-26 18:36:41 -04:00
GregF
1182415581 Add extension whitelists to size-reduction passes.
Currently only SPV_KHR_variable_pointers is disallowed in passes which
do pointer analysis. Positive and negative tests of the general extensions
mechanism were added to aggressive_dce but cover all passes.
2017-07-25 19:14:02 -04:00
GregF
adb237f3bd Fix handling of CopyObject in GetPtr and its call sites 2017-07-21 18:08:01 -04:00
Greg Fischer
fe24e0316f LocalMultiStore: Always put varId for backedge on loop phi function.
And always patch the backedge operand when patching phi functions. This
approach is more correct and cleaner. The previous code was generating
incorrect phis when the backedge block had no predecessors.
2017-07-12 16:42:07 -04:00
GregF
cc8bad3a5b Add LocalMultiStoreElim pass
A SSA local variable load/store elimination pass.
For every entry point function, eliminate all loads and stores of function
scope variables only referenced with non-access-chain loads and stores.
Eliminate the variables as well.

The presence of access chain references and function calls can inhibit
the above optimization.

Only shader modules with logical addressing are currently processed.
Currently modules with any extensions enabled are not processed. This
is left for future work.

This pass is most effective if preceeded by Inlining and
LocalAccessChainConvert. LocalSingleStoreElim and LocalSingleBlockElim
will reduce the work that this pass has to do.
2017-07-07 17:54:21 -04:00