This CL moves the files in opt/ to consistenly be under the opt::
namespace. This frees up the ir:: namespace so it can be used to make a
shared ir represenation.
The local-single-store-elim algorithm is not fundamentally bad.
However, when there are a large number of variables, some of the
maps that are used can become very large. These large data structures
then take a very long time to be destroyed. I've seen cases around 40%
if the time.
I've rewritten that algorithm to not use as much memory. This give a
significant improvement when running a large number of shader through
DXC.
I've also made a small change to local-single-block-elim to delete the
loads that is has replaced. That way local-single-store-elim will not
have to look at those. local-single-store-elim now does the same thing.
The time for one set goes from 309s down to 126s. For another set, the
time goes from 102s down to 88s.
The sprir-v generated from HLSL code contain many copyies of very large
arrays. Not only are these time consumming, but they also cause
problems for drivers because they require too much space.
To work around this, we will implement an array copy propagation. Note
that we will not implement a complete array data flow analysis in order
to implement this. We will be looking for very simple cases:
1) The source must never be stored to.
2) The target must be stored to exactly once.
3) The store to the target must be a store to the entire array, and be a
copy of the entire source.
4) All loads of the target must be dominated by the store.
The hard part is keeping all of the types correct. We do not want to
have to do too large a search to update everything, which may not be
possible, do we give up if we see any instruction that might be hard to
update.
Also in types.h, the element decorations are not stored in an std::map.
This change was done so the hashing algorithm for a Struct is
consistent. With the std::unordered_map, the traversal order was
non-deterministic leading to the same type getting hashed to different
values. See |Struct::GetExtraHashWords|.
Contributes to #1416.
* Handles simple cases only
* Identifies phis in blocks with two predecessors and attempts to
convert the phi to an select
* does not perform code motion currently so the converted values must
dominate the join point (e.g. can't be defined in the branches)
* limited for now to two predecessors, but can be extended to handle
more cases
* Adding if conversion to -O and -Os