* Refactor instruction folders
We want to refactor the instruction folder to allow different sets of
rules to be added to the instruction folder. We might want different
sets of rules in different circumstances.
We also need a way to add rules for extended instructions. Changes are
made to the FoldingRules class and ConstFoldingRules class to enable
that.
We added tests to check that we can fold extended instructions using the
new framework.
At the same time, I noticed that there were two tests that did not test
what they were supposed to. They could not be easily salvaged. #2813 was
opened to track adding the new tests.
This fixes #2608.
The original test case had an out-of-bounds reference that ended up
folding into an OpCompositeExtract that indexed just past the end of the
constant composite.
The returned constant would then cause a segfault during constant
propagation.
* Fix OpDot folding of half float vectors.
The code that folds OpDot does not handle half floats correctly. After
trying to multiply the first components, we get a nullptr because we
don't fold half float values. This nullptr gets passed to the code that
does the addition, and causes an assert.
Fixes #2405.
Fixes #1731
* Updated folding rules related to vector shuffle to account for the
undef literal value:
* FoldVectorShuffleFeedingShuffle
* FoldVectorShuffleFeedingExtract
* FoldVectorShuffleWithConstants
* These rules would commit memory violations due to treating the undef
literal value as an accessible composite component
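The guard these rules need can be sketched as follows. This is a minimal illustration, not the code in the folding rules: `SelectShuffleComponent` and `kUndefLiteral` are hypothetical names, and vectors are modeled as plain `std::vector<uint32_t>`. It relies only on the SPIR-V convention that a shuffle component literal of 0xFFFFFFFF marks an undefined result component.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// An OpVectorShuffle component literal of 0xFFFFFFFF means the result
// component has no source and is undefined.
constexpr uint32_t kUndefLiteral = 0xFFFFFFFFu;

// Hypothetical helper: returns the element selected by `component` from the
// concatenation of `vec1` and `vec2`, or std::nullopt when the component is
// the undef literal or does not name a real element.
std::optional<uint32_t> SelectShuffleComponent(
    const std::vector<uint32_t>& vec1, const std::vector<uint32_t>& vec2,
    uint32_t component) {
  // Check for the undef literal before using the value as an index.
  if (component == kUndefLiteral) return std::nullopt;
  if (component < vec1.size()) return vec1[component];
  const std::size_t index = component - vec1.size();
  if (index < vec2.size()) return vec2[index];
  return std::nullopt;  // Out of range: refuse to fold.
}
```

A caller that gets `std::nullopt` would either produce an undef result or skip the fold, rather than reading past the end of the composite.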
With the current implementation, the constant manager does not keep
around two constants with the same value but different types when the
types hash to the same value. So when you start looking for that
constant you will get a constant with the wrong type back.
I've made a few changes to the constant manager to fix this. First off,
I have changed the map from constants to ids to be a std::multimap.
This way a single constant can be mapped to multiple ids, each
representing a different type.
Then when asking for an id of a constant, we can search all of the ids
associated with that constant in order to find the one with the correct
type.
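The lookup described above can be sketched roughly like this. It is a simplified model, not the constant manager's real interface: `ConstantIdTable`, `Add`, and `FindId` are hypothetical names, a constant is reduced to its value bits, and a type is reduced to an integer id.

```cpp
#include <cstdint>
#include <map>
#include <unordered_map>

// Hypothetical stand-ins for the real constant, type, and result-id classes.
using ConstantValue = uint64_t;
using TypeId = uint32_t;
using ResultId = uint32_t;

class ConstantIdTable {
 public:
  // Registers `id` as a constant with the given value and type.
  void Add(ConstantValue value, TypeId type, ResultId id) {
    value_to_ids_.emplace(value, id);
    id_to_type_[id] = type;
  }

  // Searches every id mapped to `value` and returns the one whose type is
  // `type`, or 0 (not a valid id) when there is no such id.
  ResultId FindId(ConstantValue value, TypeId type) const {
    auto range = value_to_ids_.equal_range(value);
    for (auto it = range.first; it != range.second; ++it) {
      auto type_it = id_to_type_.find(it->second);
      if (type_it != id_to_type_.end() && type_it->second == type) {
        return it->second;
      }
    }
    return 0;
  }

 private:
  std::multimap<ConstantValue, ResultId> value_to_ids_;
  std::unordered_map<ResultId, TypeId> id_to_type_;
};
```

With a plain `std::map`, the second `Add` for the same value would have been dropped or would have overwritten the first, which is the wrong-type behavior described above.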
This CL moves the files in opt/ to consistently be under the opt::
namespace. This frees up the ir:: namespace so it can be used to make a
shared ir representation.
Currently the utils/ folder uses both spvutils:: and spvtools::utils.
This CL changes the namespace to consistently be spvtools::utils to match
the rest of the codebase.
The folding routines are currently global functions. They also rely on
data in a std::map that holds the folding rules for each opcode. As a
result, that map has no clear owner and is never deleted.
There has been a request to delete this map. To implement this, we will
create an InstructionFolder class that owns the maps. The IRContext will
own the InstructionFolder instance. Then the global functions will
become public member functions of the InstructionFolder.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1659.
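The ownership change described above might look roughly like the sketch below. The two classes here are toy models that only borrow the names from the description; `AddRule`, `Fold`, `folder()`, and the integer-based `FoldingRule` signature are assumptions made for illustration and do not reflect the real interfaces.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <optional>
#include <utility>
#include <vector>

using Opcode = uint32_t;
// Hypothetical rule shape: returns a folded value, or nullopt if the rule
// does not apply to these operands.
using FoldingRule =
    std::function<std::optional<int64_t>(const std::vector<int64_t>&)>;

// Owns the per-opcode rule map, so the map is destroyed with the folder.
class InstructionFolder {
 public:
  void AddRule(Opcode op, FoldingRule rule) {
    rules_[op].push_back(std::move(rule));
  }

  // Formerly a global function; now a member that reads the owned map.
  std::optional<int64_t> Fold(Opcode op,
                              const std::vector<int64_t>& operands) const {
    auto it = rules_.find(op);
    if (it == rules_.end()) return std::nullopt;
    for (const FoldingRule& rule : it->second) {
      if (auto result = rule(operands)) return result;
    }
    return std::nullopt;
  }

 private:
  std::map<Opcode, std::vector<FoldingRule>> rules_;
};

// The context owns the folder, which gives the map a clear lifetime.
class IRContext {
 public:
  InstructionFolder& folder() { return folder_; }

 private:
  InstructionFolder folder_;
};
```

When the context goes out of scope, the folder and its rule map are released with it, which is the point of the refactoring.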
An FClamp instruction forces a value to be within a certain interval.
When the upper or lower bound of the FClamp is a constant and the value
being compared with is a constant, then in some cases we can fold the
comparison because the entire range is, say, less than the value.
Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1549.
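As a concrete case of the FClamp reasoning, consider a less-than comparison. The sketch below is illustrative only: `FoldClampLessThan` is a hypothetical helper, both bounds are taken as known for simplicity, and it assumes `lo <= hi` with no NaN inputs.

```cpp
#include <optional>

// Hypothetical helper: tries to decide `clamp(x, lo, hi) < c` without
// knowing x. Assumes lo <= hi and that none of the values is NaN.
std::optional<bool> FoldClampLessThan(double lo, double hi, double c) {
  if (hi < c) return true;    // The whole interval lies below c.
  if (lo >= c) return false;  // The whole interval lies at or above c.
  return std::nullopt;        // The interval straddles c: cannot fold.
}
```

Note that only one bound is needed for each answer, which matches the case where just the upper or just the lower bound is a constant.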
If one of the operands to an OpVectorTimesScalar instruction is zero,
then the result will be the 0 vector. Currently we do not fold the
instruction unless both operands are constants. This change fixes that.
We also allow folding of OpPhi instructions where the incoming values
are either an OpUndef or the OpPhi instruction itself. As with other
cases, this can be simplified to the OpUndef.
Adding three rules to fold OpDot (implemented as two).
- When an OpDot has two constants, then fold to the resulting const.
- When one of the inputs is the 0 vector, then fold to zero.
- When one of the inputs is a single 1 with 0s, then rewrite to an
OpCompositeExtract of the appropriate element. This will help find
even more folding opportunities.
Contributes to #709.
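The three OpDot cases can be sketched on plain vectors of doubles. The helper names (`FoldDotConstants`, `IsZeroVector`, `FindSingleOne`) are hypothetical, and the real rules operate on IR constants rather than `std::vector<double>`.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Case 1: both inputs are constants, so compute the dot product directly.
// Returns nullopt if the vectors differ in length.
std::optional<double> FoldDotConstants(const std::vector<double>& a,
                                       const std::vector<double>& b) {
  if (a.size() != b.size()) return std::nullopt;
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] * b[i];
  return sum;
}

// Case 2: one input is the zero vector, so the result folds to zero.
bool IsZeroVector(const std::vector<double>& v) {
  for (double x : v) {
    if (x != 0.0) return false;
  }
  return true;
}

// Case 3: one input is a single 1 with 0s elsewhere. Returns the index of
// the 1, which is the element an OpCompositeExtract of the other input
// would select; returns nullopt for any other vector.
std::optional<std::size_t> FindSingleOne(const std::vector<double>& v) {
  std::optional<std::size_t> index;
  for (std::size_t i = 0; i < v.size(); ++i) {
    if (v[i] == 0.0) continue;
    if (v[i] != 1.0 || index) return std::nullopt;
    index = i;
  }
  return index;
}
```

For example, a dot product with the constant (0, 1, 0) reduces to extracting element 1 of the other operand.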
Adding basis of arithmetic merging
* Refactored constant collection in ConstantManager
* New rules:
* consecutive negates
* negate of arithmetic op with a constant
* consecutive muls
* reciprocal of div
* Removed IRContext::CanFoldFloatingPoint
* replaced by Instruction::IsFloatingPointFoldingAllowed
* Fixed some bad tests
* added some header comments
Added PerformIntegerOperation
* minor fixes to constants and tests
* fixed IntMultiplyBy1 to work with 64 bit ints
* added tests for integer mul merging
Adding test for vector integer multiply merging
Adding support for merging integer add and sub through negate
* Added tests
Adding rules to merge mult with preceding divide
* Has a couple tests, but needs more
* Added more comments
Fixed bug in integer division folding
* Will no longer merge through integer division if there would be a
remainder in the division
* Added a bunch more tests
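One plausible reading of the remainder check is sketched below: when two constants would be merged by dividing one by the other, the merge is allowed only if the division is exact. `MergeConstantsThroughDivide` is a hypothetical name, and the exact expression shapes the real rule handles are not spelled out here, so treat this as an assumption.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: returns c1 / c2 only when the integer division is
// exact, so that merging the two constants does not change the result.
std::optional<int64_t> MergeConstantsThroughDivide(int64_t c1, int64_t c2) {
  if (c2 == 0) return std::nullopt;  // Division by zero: never fold.
  // INT64_MIN / -1 overflows, so refuse that case as well.
  if (c1 == std::numeric_limits<int64_t>::min() && c2 == -1) {
    return std::nullopt;
  }
  if (c1 % c2 != 0) return std::nullopt;  // A remainder would be lost.
  return c1 / c2;
}
```

With this check, constants 6 and 3 merge to 2, while 7 and 2 are left alone because truncation would discard the remainder.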
Adding rules to merge divide and multiply through divide
* Improved comments
* Added tests
Adding rules to handle mul or div of a negation
* Added tests
Changes for review
* Early exit if no constants are involved in more functions
* fixed some comments
* removed unused declaration
* clarified some logic
Adding new rules for add and subtract
* Fold adds of adds, subtracts or negates
* Fold subtracts of adds, subtracts or negates
* Added tests
This change implements instruction folding for arithmetic operations
that are redundant, specifically:
x + 0 = 0 + x = x
x - 0 = x
0 - x = -x
x * 0 = 0 * x = 0
x * 1 = 1 * x = x
0 / x = 0
x / 1 = x
mix(a, b, 0) = a
mix(a, b, 1) = b
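The binary identities above can be summarized in a small decision function. This is a sketch under simplifying assumptions: `Op`, `Simplified`, and `SimplifyRedundant` are hypothetical names, operands are modeled as optional doubles (set only when the operand is a known constant), and the mix cases are left out. It also assumes the caller has already established that floating-point folding is allowed, since x * 0 = 0 does not hold for NaN or infinite x.

```cpp
#include <optional>

enum class Op { kAdd, kSub, kMul, kDiv };

// What the instruction can be replaced with.
enum class Simplified { kNone, kLhs, kRhs, kNegateRhs, kZero };

// Holds a value only when the operand is a known constant.
using MaybeConst = std::optional<double>;

Simplified SimplifyRedundant(Op op, MaybeConst lhs, MaybeConst rhs) {
  const bool lhs_zero = lhs && *lhs == 0.0;
  const bool rhs_zero = rhs && *rhs == 0.0;
  const bool lhs_one = lhs && *lhs == 1.0;
  const bool rhs_one = rhs && *rhs == 1.0;
  switch (op) {
    case Op::kAdd:  // x + 0 = 0 + x = x
      if (rhs_zero) return Simplified::kLhs;
      if (lhs_zero) return Simplified::kRhs;
      break;
    case Op::kSub:  // x - 0 = x, 0 - x = -x
      if (rhs_zero) return Simplified::kLhs;
      if (lhs_zero) return Simplified::kNegateRhs;
      break;
    case Op::kMul:  // x * 0 = 0 * x = 0, x * 1 = 1 * x = x
      if (lhs_zero || rhs_zero) return Simplified::kZero;
      if (rhs_one) return Simplified::kLhs;
      if (lhs_one) return Simplified::kRhs;
      break;
    case Op::kDiv:  // 0 / x = 0, x / 1 = x
      if (lhs_zero) return Simplified::kZero;
      if (rhs_one) return Simplified::kLhs;
      break;
  }
  return Simplified::kNone;
}
```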
Cache ExtInst import id in feature manager
This allows us to avoid string lookups during optimization; for now we
just cache GLSL std450 import id but I can imagine caching more sets as
they become utilized by the optimizer.
Add tests for add/sub/mul/div/mix folding
The tests cover scalar float/double cases, and some vector cases.
Since most of the code for floating point folding is shared, the tests
for vector folding are not as exhaustive as scalar.
To test sub->negate folding I had to implement a custom fixture.
This change handles all 6 regular comparison types in two variations,
ordered (true if values are ordered *and* comparison is true) and
unordered (true if values are unordered *or* comparison is true).
Ordered comparison matches the default floating-point behavior on host
but we use std::isnan to check ordering explicitly anyway.
This change also slightly reworks the floating-point folding support
code to make it possible to define a folding operation that returns
boolean instead of floating point.
These tests exhaustively test ordered/unordered comparisons for
float/double.
Since for NaN inputs the comparison result doesn't depend on the
comparison function, we just test == and !=; NaN inputs result in true
unordered comparisons and false ordered comparisons.
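The ordered and unordered variants described above can be sketched as two small templates. `FoldOrdered` and `FoldUnordered` are hypothetical names; the comparison is passed in as a callable so that the same wrapper covers all six comparison types.

```cpp
#include <cmath>

// Ordered: true only if neither input is NaN *and* the comparison holds.
template <typename T, typename Compare>
bool FoldOrdered(T a, T b, Compare compare) {
  return !std::isnan(a) && !std::isnan(b) && compare(a, b);
}

// Unordered: true if either input is NaN *or* the comparison holds.
template <typename T, typename Compare>
bool FoldUnordered(T a, T b, Compare compare) {
  return std::isnan(a) || std::isnan(b) || compare(a, b);
}
```

With a NaN input the result no longer depends on `compare`: `FoldOrdered` always yields false and `FoldUnordered` always yields true, which is why testing NaN against only == and != is sufficient.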