SPIRV-Tools

mirror of https://github.com/KhronosGroup/SPIRV-Tools synced 2024-11-25 13:00:04 +00:00

Author	SHA1	Message	Date
Stephen McGroarty	1c2cbaf569	Add GetContinueBlock to loop class. Previously, the loop class used the terms latch and continue block interchangeably. This patch splits the two, and corrects and tests some of the old uses of GetLatchBlock.	2018-05-03 14:30:41 -04:00
Steven Perron	70bb3c1cc2	Fold divide and multiply by same value. We want to fold code like (x*y)/x and other permutations of this. Fixes #1531.	2018-05-02 10:18:37 -04:00
Diego Novillo	9843484736	Fix build.	2018-05-01 17:46:47 -04:00
Toomas Remmelg	1dc2458060	Add a loop fusion pass. This pass will look for adjacent loops that are compatible and legal to be fused. Loops are compatible if: - they both have one induction variable - they have the same upper and lower bounds - same initial value - same condition - they have the same update step - they are adjacent - there are no break/continue in either of them Fusion is legal if: - fused loops do not have any dependencies with dependence distance greater than 0 that did not exist in the original loops. - there are no function calls in the loops (could have side-effects) - there are no barriers in the loops It will fuse all such loops as long as the number of registers used for the fused loop stays under the threshold defined by max_registers_per_loop.	2018-05-01 15:40:37 -04:00
Stephen McGroarty	9a5dd6fe88	Support loop fission. Adds support for splitting loops whose register pressure exceeds a user-provided level. This pass will split a loop into two or more loops given that the loop is a top level loop and that splitting the loop is legal. Control flow is left intact for dead code elimination to remove. This pass is enabled with the --loop-fission flag to spirv-opt.	2018-05-01 15:15:10 -04:00
Steven Perron	9ba0879ddf	Improve Vector DCE Track live scalars in VDCE as if they were single element vectors. Handle the extended instructions for GLSL in VDCE. Handle composite construct instructions in VDCE.	2018-04-30 11:55:50 -04:00
Steven Perron	a00a0a09ae	Revert "Improvements to vector dce." This reverts commit `2813722993`. A regression was found. Undoing the change until it is fixed.	2018-04-27 10:33:19 -04:00
Alan Baker	4246abdc74	Fixes handling of kill and unreachable ops in inlining. Fixes #1527 * Adds handling for copying OpKill and OpUnreachable and forces the generation of a new basic block * Adds tests to check	2018-04-27 09:42:37 -04:00
Steven Perron	e1bcd2b2d8	Fold OpVectorTimesScalar and OpPhi better. If one of the operands to an OpVectorTimesScalar instruction is zero, then the result will be the 0 vector. Currently we do not fold the instruction unless both operands are constants. This change fixes that. We also allow folding of OpPhi instructions where the incoming values are either an OpUndef or the OpPhi instruction itself. As with other cases, this can be simplified to the OpUndef.	2018-04-26 12:41:16 -04:00
Steven Perron	2813722993	Improvements to vector dce. Track live scalars in VDCE as if they were single element vectors. Handle the extended instructions for GLSL in VDCE. Handle composite construct instructions in VDCE. Fixes #1511.	2018-04-26 11:07:48 -04:00
Cort Stratton	72524db2de	Fixes #1521: PadToWord() should use std::move() in && variant	2018-04-25 22:03:14 -04:00
Greg Fischer	268be6143d	LocalSingleBlockElim: Add store-store elimination Eliminate unused store to variable if followed by store to same variable in same block. Most significantly, this cleans up stores made unused by this pass. These useless stores can inhibit subsequent optimizations, specifically LocalSingleStoreElim. Eliminating them makes subsequent optimization more effective. The main effect of this pass is to simplify the work done by the SSA rewriter. It catches many local loads/stores that help speeding up the work done by the main rewriter.	2018-04-25 10:30:18 -04:00
Steven Perron	ee8cd5c847	Add Dead insert elimination back in.	2018-04-24 10:10:30 -04:00
Steven Perron	2c0ce87210	Vector DCE (#1512) Introduce a pass that does a DCE type analysis for vector elements instead of the whole vector as a single element. It will then rewrite instructions that are not used with something else. For example, an instruction whose value is not used, even though it is referenced, is replaced with an OpUndef.	2018-04-23 11:13:07 -04:00
David Neto	7a59283587	Another fix for old XCode: std::set explicit ctor in test code	2018-04-20 15:58:01 -04:00
Victor Lomuller	efc5061929	Dominator analysis interface clean. Remove the CFG requirement when querying a dominator/post-dominator from an IRContext. Updated all uses of the function and tests.	2018-04-20 15:41:59 -04:00
Jaebaek Seo	48802bad72	Constant folding for OpVectorTimesScalar	2018-04-20 13:43:04 -04:00
Victor Lomuller	0ec08c28c1	Add register liveness analysis. For each function, the analysis determines which SSA registers are live at the beginning of each basic block and which ones are killed at the end of the basic block. It also includes utilities to simulate the register pressure for loop fusion and fission. The implementation is based on the paper "A non-iterative data-flow algorithm for computing liveness sets in strict SSA programs" from Boissinot et al.	2018-04-20 09:45:15 -04:00
Alan Baker	09c206b6fb	Fixes #1480. Validate group non-uniform scopes. * Adds new pass for validating non-uniform group instructions * Currently only checks execution scope for Vulkan 1.1 and SPIR-V 1.3 * Added test framework	2018-04-20 09:25:00 -04:00
David Neto	e7c2e91ded	Fix for old XCode: std::set has explicit ctor	2018-04-19 16:33:12 -04:00
GregF	1c89da46ff	Test/DependencyAnalysis: Fix uninitialized variables	2018-04-19 15:34:15 -04:00
Greg Fischer	df7f00f60e	DeadInsertElim: Don't revisit select phi nodes during MarkInsertChain Fixes #1487.	2018-04-19 14:40:00 -04:00
Jaebaek Seo	430a29335e	Fix broken pointer of CommonUniformElimPass	2018-04-19 09:36:10 -04:00
Steven Perron	c20a718e00	Rewrite local-single-store-elim to not create large data structures. The local-single-store-elim algorithm is not fundamentally bad. However, when there are a large number of variables, some of the maps that are used can become very large. These large data structures then take a very long time to be destroyed. I've seen cases around 40% of the time. I've rewritten that algorithm to not use as much memory. This gives a significant improvement when running a large number of shaders through DXC. I've also made a small change to local-single-block-elim to delete the loads that it has replaced. That way local-single-store-elim will not have to look at those. local-single-store-elim now does the same thing. The time for one set goes from 309s down to 126s. For another set, the time goes from 102s down to 88s.	2018-04-18 16:38:18 -04:00
Jaebaek Seo	0fa42996b5	Merge pull request #1461 from jaebaek/fnegate Add constant folding for OpFNegate Contributes to #709	2018-04-18 13:46:10 -04:00
Jaebaek Seo	3c5bd26668	Typo	2018-04-17 14:13:19 -04:00
Toomas Remmelg	0f335cf87e	Add support for MIV and Delta test dependence analysis. GCD MIV test as described in Chapter 3 of "Optimizing Compilers for Modern Architectures: A Dependence-Based Approach" by Randy Allen and Ken Kennedy. Delta test as described in Figure 3 of "Practical Dependence Testing" by Gina Goff, Ken Kennedy, and Chau-Wen Tseng from PLDI '91.	2018-04-17 13:57:02 -04:00
Jaebaek Seo	ff92339fff	Format	2018-04-17 12:12:48 -04:00
Jaebaek Seo	d8b9306a4f	Add more unit tests	2018-04-17 12:08:45 -04:00
Jaebaek Seo	79491259e0	Add constant folding for FNegate	2018-04-17 12:08:45 -04:00
Alan Baker	38359ba800	Fixes #1483. Validating Vulkan 1.1 barrier execution scopes * Reworked how execution model limitations are checked * Now OpFunction checks which entry points call it and checks its registered limitations instead of building a call stack in the entry point * New tests * Moving function to entry point mapping into VState	2018-04-17 10:26:38 -04:00
David Neto	152b9a681e	ADCE: Remove OpDecorateStringGOOGLE Also fix a few failures to set "modified" status when removing global values. Add OpDecorateStringGOOGLE to decoration ordering Fixes #1492	2018-04-17 10:24:30 -04:00
Alan Baker	0e80b86dbe	Fixes #1472. Per-vertex variable validation fixes. Relaxes checks for per-vertex builtin variables. If the builtin decoration is applied to a variable, then those checks now allow a level of arraying on the variable before checking the type consistency. * Allows arrays of variables to be present for the per-vertex variables: * Position * PointSize * ClipDistance * CullDistance * Updated tests	2018-04-16 12:58:35 -04:00
Rex Xu	7fe186476a	Fix validation issues relevant to SPV_AMD_gpu_shader_int16. Frexp/FrexpStruct allows exp to be either 16-bit or 32 bit integer if SPV_AMD_gpu_shader_int16 is enabled.	2018-04-16 10:49:01 -04:00
Lei Zhang	a3bb782745	Travis CI: change to use the default email notification behavior	2018-04-16 10:44:42 -04:00
David Neto	e8814be732	Add validator test for OpBranch Add test for case where OpBranch branches to a value (a function value). Previous tests only checked a label value (name of a block). Update validate_id.cpp to remove the TODO for OpBranch and say that it is already checked in validate_cfg.cpp	2018-04-16 10:27:51 -04:00
Steven Perron	d42f65e7c1	Use a bit vector in ADCE The unordered_set in ADCE that holds all of the live instructions takes a very long time to be destroyed. In some shaders, it takes over 40% of the time. If we look at the unique ids of the live instructions, I believe they are dense enough to make a simple bit vector a good choice to hold that data. When I check the density of the bit vector for larger shaders, we are usually using less than 4 bytes per element in the vector, and almost always less than 16. So, in this commit, I introduce a simple bit vector class, and use it in ADCE. This helps improve the compile time for some shaders on Windows by the 40% mentioned above. Contributes to https://github.com/KhronosGroup/SPIRV-Tools/issues/1328.	2018-04-13 16:38:02 -04:00
Steven Perron	8190c26270	Change parameter to Mempass::RemovePhiOperands Pass a hashtable by const ref instead of by value. Big impact on compile time.	2018-04-13 09:53:37 -04:00
Alan Baker	e805d1f8d7	Fixes #1469. Allow subgroup memory scope for Vulkan 1.1 * New error that prevents CrossDevice memory scope for all Vulkan * Old error specifically references Vulkan 1.0 * New tests	2018-04-12 13:16:04 -04:00
Alan Baker	c3ee210563	Fixes #1471. Adds missing environments to spirv-val help * spirv-val: Added environments referenced in --version, but not mentioned in --help	2018-04-12 13:11:52 -04:00
Alan Baker	c522b697bf	Fixes #1470. Don't restrict WGS storage class * Removed restriction that workgroup size can only be on Input storage class * added test	2018-04-12 09:22:34 -04:00
Steven Perron	bc648fd76a	Delete unused code in MemPass Since the SSA rewriter was added, the old phi insertion code is no longer used. It is going stale and should be deleted.	2018-04-11 15:40:33 -04:00
Steven Perron	c584ac4fc6	Don't allow an instance of a pass to be run multiple times.	2018-04-11 12:02:30 -04:00
Victor Lomuller	10e5d7cf13	Add a loop peeling pass. For each loop in a function, the pass walks the loops from innermost to outermost and tries to peel loops for which a certain number of iterations can be done before or after the loop. To limit code growth, peeling will not happen if the growth in code size goes above a configurable threshold.	2018-04-11 15:41:29 +01:00
Alexander Johnston	61b50b3bfa	ZIV and SIV loop dependence analysis. Provides functionality to perform ZIV and SIV dependency analysis tests between a load and store within the same loop. Dependency tests rely on scalar analysis to prove and disprove dependencies with regard to the loop being analysed. Based on the 1990 paper Practical Dependence Testing by Goff, Kennedy, Tseng. Adds support for marking loops in the loop nest as IRRELEVANT. Loops are marked IRRELEVANT if the analysed instructions contain no induction variables for the loops, i.e. the loop's induction variable is not relevant to the dependence of the store and load.	2018-04-11 09:32:42 -04:00
Steven Perron	53bc1623ec	Fold OpDot Adding three rules to fold OpDot (implemented as two). - When an OpDot has two constants, then fold to the resulting const. - When one of the inputs is the 0 vector, then fold to zero. - When one of the inputs is a single 1 with 0s, then rewrite to an OpCompositeExtract of the appropriate element. This will help find even more folding opportunities. Contributes to #709.	2018-04-10 13:09:37 -04:00
Alan Baker	3020104ff2	Adding tests for OpenCL 1.2 and embedded profiles	2018-04-09 09:02:50 -04:00
Alan Baker	42840d15e4	Fixes #1433. Validate binary version * Validates SPIR-V binary version against target environment	2018-04-06 22:41:50 -04:00
Lei Zhang	26a698c347	Fix PrimitiveId builtin check for Vulkan According to Vulkan spec 1.1.72: > The PrimitiveId decoration must be used only within fragment, > tessellation control, tessellation evaluation, and geometry shaders. > In a tessellation control or tessellation evaluation shader, any > variable decorated with PrimitiveId must be declared using the Input > storage class. We were enforcing that PrimitiveId can only be used with Output storage class for TCS and TES before.	2018-04-06 22:38:32 -04:00
David Neto	5f53c42a1e	Update CHANGES	2018-04-06 16:42:56 -04:00


1445 Commits