Commit Graph

33 Commits

Author SHA1 Message Date
Jaebaek Seo
e8bd26e1f8
Set correct scope and line info for DebugValue (#4125)
The existing spirv-opt `DebugInfoManager::AddDebugValueForDecl()` sets
the scope and line info of the new added DebugValue using the scope and
line of DebugDeclare. This is wrong because only a single DebugDeclare
must exist under a scope while we have to add DebugValue for all the
places where the variable's value is updated. Therefore, we have to set
the scope and line of DebugValue based on the places of the variable
updates.

This bug makes
https://github.com/google/amber/blob/main/tests/cases/debugger_hlsl_shadowed_vars.amber
fail. This commit fixes the bug.
2021-01-28 12:57:35 -05:00
Jaebaek Seo
f686518cee
spirv-opt: properly preserve DebugValue indexes operand (#4022)
spirv-opt has a bug that `DebugInfoManager::AddDebugValueWithIndex()` does not
preserve `Indexes` operands of
[DebugValue](https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.DebugInfo.100.html#DebugValue).
It has to preserve all of those `Indexes` operands, but it preserves only the first index
operand.

This PR removes `DebugInfoManager::AddDebugValueWithIndex()` and lets the spirv-opt
use `DebugInfoManager::AddDebugValueForDecl()`.
`DebugInfoManager::AddDebugValueForDecl()` preserves the Indexes operand correctly.
2020-11-13 12:06:38 -05:00
Diego Novillo
c74d5523bb
Fix SSA re-writing in the presence of variable pointers. (#4010)
This fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/3873.

In the presence of variable pointers, the reaching definition may be
another pointer.  For example, the following fragment:

 %2 = OpVariable %_ptr_Input_float Input
%11 = OpVariable %_ptr_Function__ptr_Input_float Function
      OpStore %11 %2
%12 = OpLoad %_ptr_Input_float %11
%13 = OpLoad %float %12

corresponds to the pseudo-code:

layout(location = 0) in flat float *%2
float %13;
float *%12;
float **%11;
*%11 = %2;
%12 = *%11;
%13 = *%12;

which ultimately, should correspond to:

%13 = *%2;

During rewriting, the pointer %12 is found to be replaceable by %2.
However, when processing the load %13 = *%12, the type of %12's reaching
definition is another float pointer (%2), instead of a float value.

When this happens, we need to continue looking up the reaching definition
chain until we get to a float value or a non-target var (i.e. a variable
that cannot be SSA replaced, like %2 in this case since it is a function
argument).
2020-11-04 10:23:53 -05:00
Jaebaek Seo
df4198e50e
Add DebugValue for DebugDecl invisible to value assignment (#3973)
For some cases, we have DebugDecl invisible to a value assignment, but
the value assignment information is important i.e., debugger cannot inspect
the variable without the information. For example, a parameter of an inlined
function must have its value assignment i.e., argument passing out of its
function scope. If we simply remove DebugDecl because it is invisible to the
argument passing, we cannot inspec the variable.

This PR
- Adds DebugValue for DebugDecl invisible to a value assignment. We use
the value of the variable in the basic block that contains DebugDecl, which is
found by ssa-rewrite. If the value instruction does not dominate DebugDecl,
we use the value of the variable in the immediate dominator of the basic block.
- Checks the visibility of DebugDecl for Phi value assignment based on the
all value operands of the Phi. Since Phi just references multiple values from
multiple basic blocks, scopes of value operands must be regarded as the scope
of the Phi.
2020-10-27 15:10:08 -04:00
greg-lunarg
12df3cafee
Fix SSA-rewrite to remove DebugDeclare for variables without loads (#3719) 2020-08-24 15:33:01 -04:00
Jaebaek Seo
b78f4b1518
Remove DebugDeclare only for target variables in ssa-rewrite (#3511)
For each local variable, ssa-rewrite should remove its DebugDeclare
if and only if it is replaced by any number of DebugValues for store
and phi instructions.

For example, when we have two variables `a` whose DebugDeclare
will be replaced to DebugValues by ssa-rewrite pass and `b` whose
DebugDeclare will not be replaced, we have to remove only DebugDeclare
for `a`, not `b`.
2020-07-31 10:00:30 -04:00
Jaebaek Seo
efaae24d00
Clear debug information for kill and replacement (#3459)
For many spirv-opt passes such as simplify-instructions pass, we have to
correctly clear the OpenCL.DebugInfo.100 debug information for
KillInst() and ReplaceAllUses(). If we keep some debug information that
disappeared because of KillInst() and ReplaceAllUses(), adding new
DebugValue instructions based on the existing DebugDeclare information
will generate incorrect information. This CL update DebugInfoManager
and IRContext to correctly clear debug information.
2020-06-25 15:48:26 -04:00
Jaebaek Seo
d4b9f576eb
[spirv-opt] debug info preservation in ssa-rewrite (#3356)
Add OpenCL.DebugInfo.100 `DebugValue` instructions for store
and phi instructions of local variables to provide the debugger with
the updated values of local variables correctly.
2020-06-19 14:57:43 -04:00
Jaebaek Seo
2b987c49a4
Handle OpConstantNull in ssa-rewrite (#3362)
ssa-rewrite fails in `MemPass::GetPtr()` when the SPIR-V code contains
`OpLoad` for the result id of `OpConstantNull` because of the out of
index access for an operand to get the base address. This commit fixes
it.

Fixes #3344
2020-05-20 12:00:51 -04:00
alan-baker
b334829a91 Validate nested constructs (#3068)
* Validate that if a construct contains a header and it's merge is
reachable, the construct also contains the merge
* updated block merging to not merge into the continue
* update inlining to mark the original block of a single block loop as
the continue
* updated some tests
* remove dead code
* rename kBlockTypeHeader to kBlockTypeSelection for clarity
2019-11-27 16:45:57 -05:00
Steven Perron
35c9518c4e
Handle id overflow in the ssa rewriter. (#2845)
* Handle id overflow in the ssa rewriter.

Remove LocalSSAElim pass at the same time.  It does the same thing as the SSARewrite pass. Then even share almost all of the same code.

Fixes crbug.com/997246
2019-09-10 09:38:23 -04:00
Steven Perron
c9190a54da
SSA rewriter: Don't use trivial phis (#2757)
When a phi candidate is marked as trivial, we are suppose to update all
of its uses to the reference the value that it is being folded to.
However, the code updates the uses misses `defs_at_block_`.  So at a
later time, the id for the trivial phi can reemerge.

Fixes #2744
2019-07-23 17:59:30 -04:00
Steven Perron
12e4a7b649
Handle variable pointer in some optimizations (#2490)
* Check var pointer capability in ADCE.

* Check var ptr capability for common uniform.

* Check var ptr capability in access chain convert.

Since we want this pass to run even if there are variable pointer on
storage buffers, we had to remove asserts that assumed there were no
variable pointers.  The functions with the asserts will now work, it
becomes the responsibility of the callers to deal with the output as
appropriate.

* Single block elimination and variable pointers.

It seems like the code in local single block elimination is able to
handle cases with variable pointers already.  This is because the
function `HasOnlySupportedRefs` ensures that variables that feed a
variable pointer are not candidates.

* Single store elimination and variable pointers.

It seems like the code in local single stroe elimination is able to
handle cases with variable pointers already.  This is because the
function `FindSingleStoreAndCheckUses` ensures that variables that feed
a variable pointer are not candidates.

* SSA rewriter and variable pointers.

It seems like the code in the two passes that call the SSA rewriter are
able to  handle cases with variable pointers already.  This is because the
function `HasOnlySupportedRefs` ensures that variables that feed
a variable pointer are not candidates.

Fixes #2458.
2019-04-03 12:47:51 -04:00
Steven Perron
ff07c6df83
SSA-rewriter: make sure phi entries are unique. (#2206)
If there are multiple edges to a basic block, then the ssa rewriter will
create OpPhi instructions with duplicate entries.  This is invalid, and
it is fixed in this commit.

Fixes #2202.
2018-12-18 18:14:27 +00:00
alan-baker
3d56cddb75
Validate pointer variables (#2111)
Fixes #2104

* Checks the rules for logical addressing and variable pointers
 * Has an out for relaxed logical pointers
* Updated PassFixture to expose validator options
 * enabled relaxed logical pointers for some tests
* New validator tests
2018-11-27 16:47:10 -05:00
Jaebaek Seo
ebcc58b5f8 Validator: function scope variable at start of entry block #1923
All OpVariable instructions in a function must be the first
instructions in the first block.
2018-10-04 15:05:47 -04:00
Steven Perron
c4c68712c4
Make EFFCEE required (#1943)
Fixes #1912.

Remove the non-effcee build as EFFCEE is now required.
2018-10-04 10:00:11 -04:00
Steven Perron
d746681fe9
Copy decorations when creating new ids. (#1843)
* Copy decorations when creating new ids.

When creating a new value based on an old value, we need to copy the
decorations to the new id.  This change does this in 3 places:

1) The variable holding the return value of the function generated by
merge return should get decorations from the function.

2) The results of the OpPhi instructions should get decorations from the
variable they are replacing in the ssa writer.

3) In local access chain convert the intermediate struct (result of
OpCompositeInsert) generated for the store replacement should get its
decorations from the variable being stored to.

Fixes #1787.
2018-08-24 11:55:39 -04:00
dan sinclair
eda2cfbe12
Cleanup includes. (#1795)
This Cl cleans up the include paths to be relative to the top level
directory. Various include-what-you-use fixes have been added.
2018-08-03 15:06:09 -04:00
dan sinclair
2cce2c5b97
Move tests into namespaces (#1689)
This CL moves the test into namespaces based on their directories.
2018-07-11 09:24:49 -04:00
dan sinclair
e6b953361d
Move the ir namespace to opt. (#1680)
This CL moves the files in opt/ to consistenly be under the opt::
namespace. This frees up the ir:: namespace so it can be used to make a
shared ir represenation.
2018-07-09 11:32:29 -04:00
Alan Baker
4f866abfd8 Validate static uses of interfaces
Fixes #1120

Checks that all static uses of the Input and Output variables are listed
as interfaces in each corresponding entry point declaration.
 * Changed validation state to track interface lists
 * updated many tests
* Modified validation state to store entry point names
 * Combined with interface list and called EntryPointDescription
 * Updated uses
* Changed interface validation error messages to output entry point name
in addtion to ID
2018-06-13 10:56:14 -04:00
GregF
6fbfe1c016 Fix SSA rewrite for nested loops.
From the test case, the slice of the CFG that is interesting for the bug
is

25
|
v
30
|
v
31<-+
|   |
v   |
34--+

1. In block 25, we have a Phi candidate for %f with arguments
   %47 = Phi[%float_0, %0]. This merges %float_0 and a yet unknown
   argument from the external loop backedge.
2. We are now processing block 34:
   i. The load %35 = OpLoad %f triggers a Phi candidate to be placed in
      block 31.
  ii. The Phi candidate %50 = Phi needs two arguments. The one coming
      from block 30 is %47. But the one coming from block 34 (which we
      are now processing and have marked sealed), finds %50 itself as
      the reaching def for %f.
3. This wrongfully marks %50 as a copy-of Phi, which ultimately makes
   both %47 and %50 copy-of Phis that get eliminated.
2018-04-06 15:17:52 -04:00
Neil Roberts
57a2441791 hex_float: Use max_digits10 for the float precision
CPPreference.com has this description of digits10:

“The value of std::numeric_limits<T>::digits10 is the number of
 base-10 digits that can be represented by the type T without change,
 that is, any number with this many significant decimal digits can be
 converted to a value of type T and back to decimal form, without
 change due to rounding or overflow.”

This means that any number with this many digits can be represented
accurately in the corresponding type. A change in any digit in a
number after that may or may not cause it a different bitwise
representation. Therefore this isn’t necessarily enough precision to
accurately represent the value in text. Instead we need max_digits10
which has the following description:

“The value of std::numeric_limits<T>::max_digits10 is the number of
 base-10 digits that are necessary to uniquely represent all distinct
 values of the type T, such as necessary for
 serialization/deserialization to text.”

The patch includes a test case in hex_float_test which tries to do a
round-robin conversion of a number that requires more than 6 decimal
places to be accurately represented. This would fail without the
patch.

Sadly this also breaks a bunch of other tests. Some of the tests in
hex_float_test use ldexp and then compare it with a value which is not
the same as the one returned by ldexp but instead is the value rounded
to 6 decimals. Others use values that are not evenly representable as
a binary floating fraction but then happened to generate the same
value when rounded to 6 decimals. Where the actual value didn’t seem
to matter these have been changed with different values that can be
represented as a binary fraction.
2018-04-03 12:53:10 -04:00
Diego Novillo
735d8a579e SSA rewrite pass.
This pass replaces the load/store elimination passes.  It implements the
SSA re-writing algorithm proposed in

     Simple and Efficient Construction of Static Single Assignment Form.
     Braun M., Buchwald S., Hack S., Leißa R., Mallon C., Zwinkau A. (2013)
     In: Jhala R., De Bosschere K. (eds)
     Compiler Construction. CC 2013.
     Lecture Notes in Computer Science, vol 7791.
     Springer, Berlin, Heidelberg

     https://link.springer.com/chapter/10.1007/978-3-642-37051-9_6

In contrast to common eager algorithms based on dominance and dominance
frontier information, this algorithm works backwards from load operations.

When a target variable is loaded, it queries the variable's reaching
definition.  If the reaching definition is unknown at the current location,
it searches backwards in the CFG, inserting Phi instructions at join points
in the CFG along the way until it finds the desired store instruction.

The algorithm avoids repeated lookups using memoization.

For reducible CFGs, which are a superset of the structured CFGs in SPIRV,
this algorithm is proven to produce minimal SSA.  That is, it inserts the
minimal number of Phi instructions required to ensure the SSA property, but
some Phi instructions may be dead
(https://en.wikipedia.org/wiki/Static_single_assignment_form).
2018-03-20 20:56:55 -04:00
Steven Perron
2cb589cc14 Remove uses DCEInst and call ADCE
The algorithm used in DCEInst to remove dead code is very slow.  It is
fine if you only want to remove a small number of instructions, but, if
you need to remove a large number of instructions, then the algorithm in
ADCE is much faster.

This PR removes the calls to DCEInst in the load-store removal passes
and adds a pass of ADCE afterwards.

A number of different iterations of the order of optimization, and I
believe this is the best I could find.

The results I have on 3 sets of shaders are:

Legalization:

Set 1: 5.39 -> 5.01
Set 2: 13.98 -> 8.38
Set 3: 98.00 -> 96.26

Performance passes:

Set 1: 6.90 -> 5.23
Set 2: 10.11 -> 6.62
Set 3: 253.69 -> 253.74

Size reduction passes:

Set 1: 7.16 -> 7.25
Set 2: 17.17 -> 16.81
Set 3: 112.06 -> 107.71

Note that the third set's compile time is large because of the large
number of basic blocks, not so much because of the number of
instructions.  That is why we don't see much gain there.
2018-02-27 21:06:08 -05:00
Alan Baker
eb0c73dad6 Maintain instruction to block mapping in phi insertion
* Changed MemPass::InsertPhiInstructions to set basic blocks for new
phis
* Local SSA elim now maintains instr to block mapping
 * Added a test and confirmed it fails without the updated phis
* IRContext::set_instr_block no longer builds the map if the analysis is
invalid
* Added instruction to block mapping verification to
IRContext::IsConsistent()
2018-01-12 10:16:53 -05:00
Steven Perron
79a00649b4 Allow pointers to pointers in logical addressing mode.
A few optimizations are updates to handle code that is suppose to be
using the logical addressing mode, but still has variables that contain
pointers as long as the pointer are to opaque objects.  This is called
"relaxed logical addressing".

|Instruction::GetBaseAddress| will check that pointers that are use meet
the relaxed logical addressing rules.  Optimization that now handle
relaxed logical addressing instead of logical addressing are:

 - aggressive dead-code elimination
 - local access chain convert
 - local store elimination passes.
2017-12-19 14:29:14 -05:00
GregF
78c025abe9 MultiStore: Support OpVariable Initialization
Treat an OpVariable with initialization as if it was an OpStore.
With PR #1073, this completes work for issue #1017.
2017-12-11 10:37:14 -05:00
Diego Novillo
83228137e1 Re-format source tree - NFC.
Re-formatted the source tree with the command:

$ /usr/bin/clang-format -style=file -i \
    $(find include source tools test utils -name '*.cpp' -or -name '*.h')

This required a fix to source/val/decoration.h.  It was not including
spirv.h, which broke builds when the #include headers were re-ordered by
clang-format.
2017-11-27 14:31:49 -05:00
GregF
ac04b2faea Opt: Fix HasLoads to not report decoration as load. 2017-11-07 17:39:58 -05:00
David Neto
33b879c105 elim-multi-store: only patch loop header phis that we created
There can already be OpPhi instructions in a loop header that
are unrelated to the optimization.  We should not be patching those.

Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/826
2017-09-21 10:01:30 -04:00
GregF
cc8bad3a5b Add LocalMultiStoreElim pass
A SSA local variable load/store elimination pass.
For every entry point function, eliminate all loads and stores of function
scope variables only referenced with non-access-chain loads and stores.
Eliminate the variables as well.

The presence of access chain references and function calls can inhibit
the above optimization.

Only shader modules with logical addressing are currently processed.
Currently modules with any extensions enabled are not processed. This
is left for future work.

This pass is most effective if preceeded by Inlining and
LocalAccessChainConvert. LocalSingleStoreElim and LocalSingleBlockElim
will reduce the work that this pass has to do.
2017-07-07 17:54:21 -04:00