Commit Graph

66 Commits

Author SHA1 Message Date
Markus Tavenrath
b64a423b44
Workaround issue in MSVC arm64 compiler returning random upper 32-bits in function spvtools::util::CountSetBits. (#5763)
Fix issue #5762
2024-08-07 14:11:11 -04:00
Steven Perron
581279dedd
[OPT] Zero-extend unsigned 16-bit integers when bitcasting (#5714)
The folding rule `BitCastScalarOrVector` was incorrectly handling
bitcasting to unsigned integers smaller than 32-bits. It was simply
copying the entire 32-bit word containing the integer. This conflicts with the
requirement in section 2.2.1 of the SPIR-V spec which states that
unsigned numeric types with a bit width less than 32-bits must have the
high-order bits set to 0.

This change include a refactor of the bit extension code to be able to
test it better, and to use it in multiple files.

Fixes https://github.com/microsoft/DirectXShaderCompiler/issues/6319.
2024-06-19 19:17:05 +02:00
Steven Perron
048faf2e53
Remove use of deprecated std::aligned_storage (#5183)
* Remove use of deprecated std::aligned_storage

Fixes #4806.
2023-04-06 17:22:28 +02:00
David Neto
2937538210
Fix undef behaviour in hex float parsing (#5025)
When the parser saw more significant hex digits than fit in
the target type, it would compute a nonsensical shift amount, resulting
in undefined behaviour.

Now, drop the excess bits, effectively truncating the significand.

Also guard against overflow of the exponent in the extraordinary (and untested)
case where we see more than, for example, 2**(32-4+1) significant hex digits
for a 32-bit float, or 2**(16-4+1) significant hex digits for a 16-bit
float.

Also guard against overflow of the indexing counting the number of
significant bits.  When that would occur silently drop any further
significant bits.  (Untested)

Avoid hex floats in C++ code. It's a C++17 feature.

Fixes: #4724
2022-12-19 15:46:57 -05:00
David Neto
8dc0030ecb
spirv-as: Avoid overflow when parsing exponents on hex floats (#4874)
* spirv-as: Avoid overflow when parsing exponents on hex floats

When an exponent is so large that it would overflow the int
type in the parser, saturate the exponent.
This allows extremely large exponents, and saturates
to infinity when the exponent is positive, and zero when the exponent
is negative.

Fixes #4721.

* Avoid unexpected narrowing conversions from arithmetic operations

Co-authored-by: Alastair F. Donaldson <alastair.donaldson@imperial.ac.uk>
2022-07-28 09:40:07 -04:00
Greg Fischer
faa8d6a653
Revert "Optimize DefUseManager allocations (#4709)" (#4846)
This reverts commit d18d0d92e5.

This is reverted because it causes a 7X slowdown when legalizing
SPIR-V with NonSemantic.Shader.DebugInfo.100 instructions.
This is due to the creation of very large UseLists for several
heavily used operands for this extension combined with the fact
that the original commit changed the performance of Uselists to O(n).
2022-07-12 13:14:47 -06:00
Nico Weber
b0ce31fd2d
Suppress -Wunused-but-set-variable on variable (#4777)
sentinel_count is not used in non-assert builds.

Recent clang improvements to -Wunused-but-set-variable trigger a warning on this.

Fixes #4772.
2022-04-01 17:07:05 +01:00
pd-valve
d18d0d92e5
Optimize DefUseManager allocations (#4709)
* Optimize DefUseManager allocations

Saves around 30-35% of compilation time.

For inst->use_ids, use a pool linked list instead of allocating vectors for every instruction.  For inst->uses, use a "PooledLinkedList"' -- a linked list that has shared storage for all nodes.  Neither re-use nodes, instead we do a bulk compaction operation when too much memory is being wasted (tuneable).

Includes separate PooledLinkedList templated datastructure, a very special case construct, but split out to make the code a little easier to understand.
2022-02-15 19:17:30 -05:00
pd-valve
a123632ed9
Optimize Type::HashValue (#4707)
Incrementally compute the hash instead of collecting words

Avoids allocating temporary space in a std::vector and std::u32string, and making three passes over all the hashed data.
Switch to using std::vector to prevent processing duplicates instead of std::unordered_set: avoids an allocation/deletion every call to ComputeHashValue, and ends up faster due to much better cache behaviour and smaller constant-factor when searching the (generally very small) list.

In my test case, made Type::HashValue go from 7.5% of compilation time to .5%
2022-02-15 18:57:39 +00:00
pd-valve
44923beb52
Optimize Instruction::Instruction (#4705)
Avoid constructing temporary vector + copying operands multiple times.
Add SmallVector(InputIt first, InputIt last), matching std::vector.
2022-02-10 18:31:07 +00:00
luzpaz
65ecfd1093
Fix various source comment (doxygen) typos (#4680)
Found via `codespell -q 3 -L fo,lod,parm
2022-01-26 15:13:08 -05:00
smikims
e8439c1c9d
Avoid infinite recursion in comparison operators on SmallVector (#4681)
C++20 automatically adds reversed versions of operator overloads for
consideration; in this particular instance this results in infinite
recursion, which has now been pointed out elsewhere as a known issue
when migrating to C++20. Here we just disable one of the overloads in
C++20 mode and let the auto-reversing take care of it for us.
2022-01-25 09:07:40 -05:00
Marius Hillenbrand
1ed847f438
Fix endianness of string literals (#4622)
* Fix endianness of string literals

To get correct and consistent encoding and decoding of string literals
on big-endian platforms, use spvtools::utils::MakeString and MakeVector
(or wrapper functions) consistently for handling string literals.

- add variant of MakeVector that encodes a string literal into an
  existing vector of words
- add variants of MakeString
- add a wrapper spvDecodeLiteralStringOperand in source/
- fix wrapper Operand::AsString to use MakeString (source/opt)
- remove Operand::AsCString as broken and unused
- add a variant of GetOperandAs for string literals (source/val)
... and apply those wrappers throughout the code.

Fixes  #149

* Extend round trip test for StringLiterals to flip word order

In the encoding/decoding roundtrip tests for string literals, include
a case that flips byte order in words after encoding and then checks for
successful decoding. That is, on a little-endian host flip to big-endian
byte order and then decode, and vice versa.

* BinaryParseTest.InstructionWithStringOperand: also flip byte order

Test binary parsing of string operands both with the host's and with the
reversed byte order.
2021-12-08 12:01:26 -05:00
Alastair Donaldson
1f3f0f185b
Avoid uninitialised read when parsing hex float (#4646)
Ensures that when an attempt to read a floating-point value from an
input stream fails, the recieving variable for the read does not end up
uninitialised.

Fixes OSS-Fuzz:40432
Fixes #4644
2021-12-08 11:09:25 -05:00
David Neto
7e860e3831
fix parsing of bad binary exponents in hex floats (#4501)
- The binary exponent must have some decimal digits
- A + or - after the binary exponent digits should not be interpreted as
  part of the binary exponent.

Fixes: #4500
2021-09-03 12:27:12 -04:00
Jakub Kuderski
e99b918221
Support constant-folding UConvert and SConvert (#2960) 2019-10-16 16:29:55 -04:00
Steven Perron
4b64beb1ae
Add descriptor array scalar replacement (#2742)
Creates a pass that will replace a descriptor array with individual variables.  See #2740 for details.

Fixes #2740.
2019-08-08 10:53:19 -04:00
Steven Perron
72d4e5414b
Change HexFloat to work with gcc8. (#2109)
When we want to set a the value of a HexFloat to inf or nan, we
construct the specific bit pattern in an appropriately sized integer.
That integer is copied to a FloatProxy object through a memcpy.  GCC8
complains about the memcpy because it is overwriting a private member of
the class.

The original solution worked well because the template to the HexFloat
could be anything.  However, we only used some instantiation of FloatProxy,
which has a construction from that takes its uint_type, so I decided to use
that constructor instead of the memcpy.  This puts an extra requirement
on the templace for HexFloat, but it will be fine for us.

Part of #1541.
2018-11-26 15:47:48 -05:00
Steven Perron
75c1bf2843
Add option for the max id bound. (#1870)
* Create a new entry point for the optimizer

Creates a new struct to hold the options for the optimizer, and creates
an entry point that take the optimizer options as a parameter.

The old entry point that takes validator options are now deprecated.
The validator options will be one of the optimizer options.

Part of the optimizer options will also be the upper bound on the id bound.

* Add a command line option to set the max value for the id bound.  The default is 0x3FFFFF.

* Modify `TakeNextIdBound` to return 0 when the limit is reached.
2018-09-10 11:49:41 -04:00
dan sinclair
1963a2dbda
Use MakeUnique. (#1837)
This CL replaces instances of reset(new ..) with MakeUnique.
2018-08-14 15:01:50 -04:00
dan sinclair
1553025f4c
Move make_unique to source/util. (#1836)
This MakeUnique code is used in places other then source/opt so move it
to source/utils.
2018-08-14 12:44:54 -04:00
dan sinclair
5fc011b453
Move bit_stream, move_to_front and huffman_codec. (#1833)
bit_stream, move_to_front and huffman_codec are only used by
source/tools. Move into that directory to make the usage clearer.
2018-08-14 09:52:05 -04:00
dan sinclair
508df9a387
Remove unused bit stream methods. (#1807)
This CL deletes methods from bit stream which are never used and moves
several to the anonymous namespace in the bit_stream test file.
2018-08-07 09:10:54 -04:00
dan sinclair
e3ea909ebe
Simplify MoveToFront (#1806)
This CL removes the templating from the MoveToFront code as all non-test
code uses uint32_t as the variable.
2018-08-07 09:10:25 -04:00
dan sinclair
eda2cfbe12
Cleanup includes. (#1795)
This Cl cleans up the include paths to be relative to the top level
directory. Various include-what-you-use fixes have been added.
2018-08-03 15:06:09 -04:00
dan sinclair
58a6876cee
Rewrite include guards (#1793)
This CL rewrites the include guards to make PRESUBMIT.py include guard
check happy.
2018-08-03 08:05:33 -04:00
Dan Sinclair
89901a8a48 Wrap entire timer.cpp in SPIRV_TIMER_ENABLED.
This CL moves the SPIRV_TIMER_ENABLED preprocesser guard to encompass
the includes along with the source. Currently we will try to pull in
sys/resource.h on machines which may not have the file available and the
build will fail. If we don't need timers, then we don't need the
includes as well.
2018-07-31 10:38:18 -04:00
dan sinclair
76e0bde196 Move utils/ to spvtools::utils
Currently the utils/ folder uses both spvutils:: and spvtools::utils.
This CL changes the namespace to consistenly be spvtools::utils to match
the rest of the codebase.
2018-07-06 16:47:46 -04:00
Steven Perron
1f7b1f1bf7 Small vector optimization for operands.
We replace the std::vector in the Operand class by a new class that does
a small size optimization.  This helps improve compile time on Windows.

Tested on three sets of shaders.  Trying various values for the small
vector.  The optimal value for the operand class was 2.  However, for
the Instruction class, using an std::vector was optimal.  Size of "0"
means that an std::vector was used.

                Instruction size
	        0      4      8
Operand Size

0               489    544    684
1               593    487
2               469    570
4               473
8               505

This is a single thread run of ~120 shaders.  For the multithreaded run
the results were the similar.  The basline time was ~62sec.  The
optimal configuration was an 2 for the OperandData and an
std::vector for the OperandList with a compile time of ~38sec.  Similar
expiriments were done with other sets of shaders.  The compile time still
improved, but not as much.

Contributes to https://github.com/KhronosGroup/SPIRV-Tools/issues/1609.
2018-06-12 13:41:08 -04:00
Cort Stratton
72524db2de Fixes #1521: PadToWord() should use std::move() in && variant 2018-04-25 22:03:14 -04:00
Steven Perron
2c0ce87210
Vector DCE (#1512)
Introduce a pass that does a DCE type analysis for vector elements
instead of the whole vector as a single element.

It will then rewrite instructions that are not used with something else.
For example, an instruction whose value are not used, even though it is
referenced, is replaced with an OpUndef.
2018-04-23 11:13:07 -04:00
Steven Perron
d42f65e7c1 Use a bit vector in ADCE
The unordered_set in ADCE that holds all of the live instructions takes
a very long time to be destroyed.  In some shaders, it takes over 40% of
the time.

If we look at the unique ids of the live instructions, I believe they
are dense enough make a simple bit vector a good choice for to hold that
data.  When I check the density of the bit vector for larger shaders, we
are usually using less than 4 bytes per element in the vector, and
almost always less than 16.

So, in this commit, I introduce a simple bit vector class, and
use it in ADCE.

This help improve the compile time for some shaders on windows by the
40% mentioned above.

Contributes to https://github.com/KhronosGroup/SPIRV-Tools/issues/1328.
2018-04-13 16:38:02 -04:00
Neil Roberts
57a2441791 hex_float: Use max_digits10 for the float precision
CPPreference.com has this description of digits10:

“The value of std::numeric_limits<T>::digits10 is the number of
 base-10 digits that can be represented by the type T without change,
 that is, any number with this many significant decimal digits can be
 converted to a value of type T and back to decimal form, without
 change due to rounding or overflow.”

This means that any number with this many digits can be represented
accurately in the corresponding type. A change in any digit in a
number after that may or may not cause it a different bitwise
representation. Therefore this isn’t necessarily enough precision to
accurately represent the value in text. Instead we need max_digits10
which has the following description:

“The value of std::numeric_limits<T>::max_digits10 is the number of
 base-10 digits that are necessary to uniquely represent all distinct
 values of the type T, such as necessary for
 serialization/deserialization to text.”

The patch includes a test case in hex_float_test which tries to do a
round-robin conversion of a number that requires more than 6 decimal
places to be accurately represented. This would fail without the
patch.

Sadly this also breaks a bunch of other tests. Some of the tests in
hex_float_test use ldexp and then compare it with a value which is not
the same as the one returned by ldexp but instead is the value rounded
to 6 decimals. Others use values that are not evenly representable as
a binary floating fraction but then happened to generate the same
value when rounded to 6 decimals. Where the actual value didn’t seem
to matter these have been changed with different values that can be
represented as a binary fraction.
2018-04-03 12:53:10 -04:00
Jaebaek Seo
3b594e1630 Add --time-report to spirv-opt
This patch adds a new option --time-report to spirv-opt.  For each pass
executed by spirv-opt, the flag prints resource utilization for the pass
(CPU time, wall time, RSS and page faults)

This fixes issue #1378
2018-03-20 21:30:06 -04:00
Steven Perron
b3daa93b46 Change merge return pass to handle structured cfg.
We are seeing shaders that have multiple returns in a functions.  These
functions must get inlined for legalization purposes; however, the
inliner does not know how to inline functions that have multiple
returns.

The solution we will go with it to improve the merge return pass to
handle structured control flow.

Note that the merge return pass will assume the cfg has been cleanedup
by dead branch elimination.

Fixes #857.
2018-03-19 13:49:04 -04:00
Alan Baker
802cf053c7 Merge arithmetic with non-trivial constant operands
Adding basis of arithmetic merging

* Refactored constant collection in ConstantManager
* New rules:
 * consecutive negates
 * negate of arithmetic op with a constant
 * consecutive muls
 * reciprocal of div

* Removed IRContext::CanFoldFloatingPoint
 * replaced by Instruction::IsFloatingPointFoldingAllowed
* Fixed some bad tests
* added some header comments

Added PerformIntegerOperation

* minor fixes to constants and tests
* fixed IntMultiplyBy1 to work with 64 bit ints
* added tests for integer mul merging

Adding test for vector integer multiply merging

Adding support for merging integer add and sub through negate

* Added tests

Adding rules to merge mult with preceding divide

* Has a couple tests, but needs more
* Added more comments

Fixed bug in integer division folding

* Will no longer merge through integer division if there would be a
remainder in the division
* Added a bunch more tests

Adding rules to merge divide and multiply through divide

* Improved comments
* Added tests

Adding rules to handle mul or div of a negation

* Added tests

Changes for review

* Early exit if no constants are involved in more functions
* fixed some comments
* removed unused declaration
* clarified some logic

Adding new rules for add and subtract

* Fold adds of adds, subtracts or negates
* Fold subtracts of adds, subtracts or negates
* Added tests
2018-02-27 13:02:13 -05:00
Andrew Woloszyn
e543b195df Removed warnings from hex_float.h
Bitcasting FloatProxy<->uint_type was hitting a warning
with g++8.0.1. Replace bitcasts with new casting traits for FloatProxy.
2018-02-16 21:15:51 -05:00
Andrey Tuganov
12e6860d07 Add barrier instructions validation pass 2018-02-05 13:14:55 -05:00
Andrey Tuganov
905536c519 Fixed harmless uninit var warning 2018-01-31 17:49:01 -05:00
Diego Novillo
83228137e1 Re-format source tree - NFC.
Re-formatted the source tree with the command:

$ /usr/bin/clang-format -style=file -i \
    $(find include source tools test utils -name '*.cpp' -or -name '*.h')

This required a fix to source/val/decoration.h.  It was not including
spirv.h, which broke builds when the #include headers were re-ordered by
clang-format.
2017-11-27 14:31:49 -05:00
Diego Novillo
d2938e4842 Re-format files in source, source/opt, source/util, source/val and tools.
NFC. This just makes sure every file is formatted following the
formatting definition in .clang-format.

Re-formatted with:

$ clang-format -i $(find source tools include -name '*.cpp')
$ clang-format -i $(find source tools include -name '*.h')
2017-11-08 14:03:08 -05:00
Andrey Tuganov
7299fb5b7c Lowered initial capacity of move-to-front sequence
Also fixed outdated comments.
2017-10-31 12:00:42 -04:00
Steven Perron
94dc66b74d Change the sections in the module to use the InstructionList class.
This change will replace a number of the
std::vector<std::unique_ptr<Instruction>> member of the module to
InstructionList.  This is for consistency and to make it easier to
delete instructions that are no longer needed.
2017-10-25 15:52:06 -04:00
Andrey Tuganov
cfd95f3d5a Refactored compression debugger
Markv codec now receives two optional callbacks:
LogConsumer for internal codec logging
DebugConsumer for testing if encoding->decoding produces the original
results.
2017-10-23 22:12:40 -04:00
Steven Perron
8d6e4dbc72 Run dead variable elimination when using -O and -Os
We want to run the optimization when using -O and -Os, but it was not
added at part of https://github.com/KhronosGroup/SPIRV-Tools/pull/905.
This change will add that a well as some minor formatting changes
requested in that same pull request.
2017-10-23 22:09:12 -04:00
Steven Perron
bb7802b18c Change BasicBlock to use InstructionList to hold instructions.
This is the first step in replacing the std::vector of Instruction
pointers to using and intrusive linked list.

To this end, we created the InstructionList class.  It inherites from
the IntrusiveList class, but add the extra concept of ownership.  An
InstructionList owns the instruction that are in it.  This is to be
consistent with the current ownership rules where the vector owns the
instruction that are in it.

The other larger change is that the inst_ member of the BasicBlock class
was changed to using the InstructionList class.

Added test for the InsertBefore functions, and making sure that the
InstructionList destructor will delete the elements that it contains.

I've also add extra comments to explain ownership a little better.
2017-10-20 12:37:44 -04:00
Steven Perron
720beb161a Generic intrusive linked list class.
This commit is the initial implementation of the intrusive linked list
class.  It includes the implementation in the header files, and unit
test.

The iterators are circular: incrementing end() gives begin() and
decrementing begin() gives end().  Also made it valid to
decrement end().

Expliticly defines move constructor and move assignment
- Visual Studio 2013 does not implicitly generate the move constructor or
  move assignments.  So they need to be explicit, otherwise it will try to
  use the copy constructor, which we explicitly deleted.
- Can't use "= default" either.
  Seems like VS2013 does not support explicitly using the default move
  constructors and move assignments, so I wrote them out.
2017-10-12 12:40:18 -04:00
Andrew Woloszyn
d7f199b5d4 Hack around bug in gcc-4.8.1 templates.
This keeps the previous behavior for other compilers that will
throw warnings on a negative shift operation, but works around
the internal compiler error in GCC.
2017-10-06 10:26:17 -04:00
Andrey Tuganov
b36acbec0e Update MARK-V to version 1.01
Includes:
- Multi-sequence move-to-front
- Coding by id descriptor
- Statistical coding of non-id words
- Joint coding of opcode and num_operands

Removed explicit form Huffman codec constructor
- The standard use case for it is to be constructed from initializer list.

Using serialization for Huffman codecs
2017-09-06 16:03:16 -04:00
Andrey Tuganov
d41a52415a Fix encode zero bits on word boundary bug
Bit stream writer was manifesting incorrect behaviour when the following
two conditions were met:
- writer was on 64-bit word boundary
- WriteBits was invoked with num_bits=0 (can happen when a Huffman codec has only one
value)

The bug was causing very rare sporadic corruption which was detected by
tests after a random experimental change in MARK-V model.
2017-08-28 13:36:39 -04:00