SPIRV-Tools

mirror of https://github.com/KhronosGroup/SPIRV-Tools synced 2025-01-13 18:00:05 +00:00

Author	SHA1	Message	Date
Jaebaek Seo	d997c83b10	Add spirv-opt pass to replace descriptor accesses based on variable indices (#4574) This commit adds a spirv-opt pass to replace accesses to a descriptor array based on variable indices with constant elements. Before: ``` %descriptor = OpVariable %_ptr_array_Image Uniform ... %ac = OpAccessChain %_ptr_Image %descriptor %variable_index (some image instructions using %ac) ``` After: ``` %descriptor = OpVariable %_ptr_array_Image Uniform ... OpSwitch %variable_index 0 %case0 1 %case1 ... ... %case0 = OpLabel %ac = OpAccessChain %_ptr_Image %descriptor %uint_0 ... %case1 = OpLabel %ac = OpAccessChain %_ptr_Image %descriptor %uint_1 ... (use OpPhi for value with concrete type) ```	2021-10-26 17:20:58 -04:00
Jaebaek Seo	57e1d8ebe3	Add spirv-opt convert-to-sampled-image pass (#4340) The convert-to-sampled-image pass converts images and/or samplers with given pairs of descriptor set and binding to a sampled image. If an image and a sampler have the same pair of descriptor set and binding, and that pair is one of the given pairs, they will be converted to a sampled image. In addition, if only an image has a descriptor set and binding that is one of the given pairs, it will be converted to a sampled image as well. For example, when we have %a = OpLoad %type_2d_image %texture %b = OpLoad %type_sampler %sampler %combined = OpSampledImage %type_sampled_image %a %b %value = OpImageSampleExplicitLod %v4float %combined ... 1. If %texture and %sampler have the same descriptor set and binding %combine_texture_and_sampler = OpVariable %ptr_type_sampled_image_Uniform ... %combined = OpLoad %type_sampled_image %combine_texture_and_sampler %value = OpImageSampleExplicitLod %v4float %combined ... 2. If %texture and %sampler have different pairs of descriptor set and binding %a = OpLoad %type_sampled_image %texture %extracted_image = OpImage %type_2d_image %a %b = OpLoad %type_sampler %sampler %combined = OpSampledImage %type_sampled_image %extracted_image %b %value = OpImageSampleExplicitLod %v4float %combined ...	2021-08-18 08:30:48 -04:00
ZHOU He	f9893c4549	spirv-opt: A pass to remove unused inputs on OpEntryPoint instructions. (#4275) The new pass will remove interface variables on the OpEntryPoint instruction when they are not statically referenced in the call tree of the entry point. It can be enabled on the command line using the option `remove-unused-interface-variables`.	2021-06-29 11:33:58 -04:00
Greg Fischer	48007a5c7f	Add interpolate legalization pass (#4220) This pass converts an internal form of GLSLstd450 Interpolate ops to the externally valid form. The external form takes the lvalue of the interpolant. The internal form can do a load of the interpolant. The pass replaces the load with its pointer. The internal form is generated by glslang and possibly other frontends for HLSL shaders. The new pass is called as part of HLSL legalization after all propagation is complete. Also adds internal interpolate form to pre-legalization validation.	2021-03-31 14:26:36 -04:00
Ryan Harrison	9150cd441f	Remove WebGPU support (#4108) Leaves SPV_ENV_WEBGPU_0 enum in place, but marked deprecated, so users of the library are not broken by an API enum being removed. Fixes #4101	2021-01-14 16:45:18 -05:00
Jaebaek Seo	f7da527757	Temporarily add EmptyPass to prevent glslang from failing (#4004) Removing PropagateLineInfoPass and RedundantLineInfoElimPass in `56d0f5035` makes the unit tests of many open source projects fail. This will keep happening until the glslang PR https://github.com/KhronosGroup/glslang/pull/2440 is submitted. This commit will be git-reverted after merging the glslang PR.	2020-10-30 18:03:56 -04:00
Jaebaek Seo	56d0f50357	Propagate OpLine to all applied instructions in spirv-opt (#3951) Based on the OpLine spec, an OpLine instruction must be applied to the instructions physically following it up to the first occurrence of the next end of block, the next OpLine instruction, or the next OpNoLine instruction. ``` OpLine %file 0 0 OpNoLine OpLine %file 1 1 OpStore %foo %int_1 %value = OpLoad %int %foo OpLine %file 2 2 ``` For the above code, the current spirv-opt keeps three line instructions `OpLine %file 0 0`, `OpNoLine`, and `OpLine %file 1 1` in `std::vector<Instruction> dbg_line_insts_` of Instruction class for `OpStore %foo %int_1`. It does not put any line instruction to `std::vector<Instruction> dbg_line_insts_` of `%value = OpLoad %int %foo` even though `OpLine %file 1 1` must be applied to `%value = OpLoad %int %foo` based on the spec. This results in missing line information for `%value = OpLoad %int %foo` while each spirv-opt pass optimizes the code. We have to put `OpLine %file 1 1` to `std::vector<Instruction> dbg_line_insts_` of both `%value = OpLoad %int %foo` and `OpStore %foo %int_1`. This commit conducts the line instruction propagation and skips emitting the eliminated line instructions at the end, which is the same as what PropagateLineInfoPass and RedundantLineInfoElimPass do. This commit removes PropagateLineInfoPass and RedundantLineInfoElimPass. KhronosGroup/glslang#2440 is a related PR that stops using PropagateLineInfoPass and RedundantLineInfoElimPass from glslang. When the code in this PR is applied, the glslang tests will pass.	2020-10-29 13:06:30 -04:00
greg-lunarg	1fe9bcc108	Instrument: Debug Printf support (#3215) Create a pass to instrument OpDebugPrintf instructions. This pass replaces all OpDebugPrintf instructions with instructions to write a record containing the string id and all the specified values into a special printf output buffer (if space allows). This pass is designed to support the printf validation in the Vulkan validation layers. Fixes #3210	2020-03-12 09:19:52 -04:00
Steven Perron	35c9518c4e	Handle id overflow in the ssa rewriter. (#2845) Remove LocalSSAElim pass at the same time. It does the same thing as the SSARewrite pass. They even share almost all of the same code. Fixes crbug.com/997246	2019-09-10 09:38:23 -04:00
greg-lunarg	d11725b1d4	Add --relax-float-ops and --convert-relaxed-to-half (#2808) The first pass applies the RelaxedPrecision decoration to all executable instructions with float32 based type results. The second pass converts all executable instructions with RelaxedPrecision result to the equivalent float16 type, inserting converts where necessary.	2019-09-03 13:22:13 -04:00
Steven Perron	35d98be3bc	Amd ext to khr (#2811) Add the first steps to removing the AMD extension VK_AMD_shader_ballot. Splitting up to make the PRs smaller. Adding utilities to add capabilities and change the version of the module. Replaces the instructions: OpGroupIAddNonUniformAMD = 5000 OpGroupFAddNonUniformAMD = 5001 OpGroupFMinNonUniformAMD = 5002 OpGroupUMinNonUniformAMD = 5003 OpGroupSMinNonUniformAMD = 5004 OpGroupFMaxNonUniformAMD = 5005 OpGroupUMaxNonUniformAMD = 5006 OpGroupSMaxNonUniformAMD = 5007 and extended instructions WriteInvocationAMD = 3 MbcntAMD = 4 Part of #2814	2019-08-29 12:48:17 -04:00
greg-lunarg	06407250a1	Instrument: Add support for Buffer Device Address extension (#2792)	2019-08-16 09:18:34 -04:00
Steven Perron	60043edfa1	Replace OpKill with function call. (#2790) We are not able to inline OpKill instructions into a continue construct. See #2433. However, we have to be able to inline to correctly do legalization. This commit creates a pass that will wrap OpKill instructions into a function of its own. That way we are able to inline the rest of the code. The follow-up to this will be to not inline any function that contains an OpKill. Fixes #2726	2019-08-14 09:27:12 -04:00
Steven Perron	4b64beb1ae	Add descriptor array scalar replacement (#2742) Creates a pass that will replace a descriptor array with individual variables. See #2740 for details. Fixes #2740.	2019-08-08 10:53:19 -04:00
David Neto	31590104ec	Add pass to inject code for robust-buffer-access semantics (#2771) spirv-opt: Add --graphics-robust-access Clamps access chain indices so they are always in bounds. Assumes: - Logical addressing mode - No runtime-array-descriptor-indexing - No variable pointers Adds stub code for clamping coordinate and samples for OpImageTexelPointer. Adds SinglePassRunAndFail optimizer test fixture. Android.mk: add source/opt/graphics_robust_access_pass.cpp Adds Constant::GetSignExtendedValue, Constant::GetZeroExtendedValue	2019-07-30 19:52:46 -04:00
greg-lunarg	92c41ff1e7	Remove Common Uniform Elimination Pass (#2731) Fixes #2520.	2019-07-12 11:02:10 -04:00
Ryan Harrison	f6d9a17843	Add pass to fix some invalid unreachable blocks for WebGPU (#2563) Attempts to split up unreachable blocks that are used both as a merge-block and a continue-target. Fixes #2429	2019-05-09 12:56:10 -04:00
Ryan Harrison	048dcd38ce	Implement WebGPU->Vulkan initializer conversion for 'Function' variables (#2513) WebGPU requires certain variables to be initialized, whereas there are known issues with using initializers in Vulkan. This PR is the first of three implementing a pass to decompose initialized variables into a variable declaration followed by a store. This has been broken up into multiple PRs, because there are 3 distinct cases that need to be handled, which require separate implementations. This first PR implements the basic infrastructure that is needed, and handling of Function storage class variables. Private and Output will be handled in future PRs. This is part of resolving #2388	2019-04-16 14:31:36 -04:00
Ryan Harrison	102e430a88	Add pass to legalize OpVectorShuffle for WebGPU (#2509) In WebGPU, the component operand 0xFFFFFFFF is forbidden, but in Vulkan it is used to indicate a value is undefined. When converting to WebGPU, 0xFFFFFFFF needs to be converted to a legal value, though the specific one does not matter, since it was used to indicate an undefined entry in the original code. Choosing to use 0, since the operands are required to be in [0, N-1], so 0 is guaranteed to always be valid. Fixes #2349	2019-04-12 12:14:23 -04:00
Steven Perron	3a0bc9e724	Add fix storage class code. (#2434) This pass tries to fix validation errors due to a mismatch of storage classes in instructions. There is no guarantee that all such errors will be fixed, and it is possible that fixing these errors could lead to other errors. Fixes #2430.	2019-04-05 13:12:08 -04:00
Ryan Harrison	01964e325f	Add pass to generate needed initializers for WebGPU (#2481) Fixes #2387	2019-04-03 11:44:09 -04:00
Ryan Harrison	e545522146	Add --strip-atomic-counter-memory (#2413) Adds an optimization pass to remove usages of AtomicCounterMemory bit. This bit is ignored in Vulkan environments and outright forbidden in WebGPU ones. Fixes #2242	2019-03-14 13:34:33 -04:00
Steven Perron	1b0047f210	Add pass to remove dead members. (#2379) Add a pass that looks for members of structs whose values do not affect the output of the shader. Those members are then removed and just treated like padding in the struct.	2019-02-14 13:42:35 -05:00
Steven Perron	dd4157dcee	Sink (#2284) Add code sinking pass. It will move OpLoad and OpAccessChain instructions as close as possible to their uses. Part of #1611.	2019-01-17 15:56:36 -05:00
alan-baker	e510b1bac5	Update memory model (#1904 ) Upgrade to VulkanKHR memory model * Converts Logical GLSL450 memory model to Logical VulkanKHR * Adds extension and capability * Removes deprecated decorations and replaces them with appropriate flags on downstream instructions * Support for Workgroup upgrades * Support for copy memory * Adding support for image functions * Adding barrier upgrades and tests * Use QueueFamilyKHR scope instead of device	2018-11-30 14:15:51 -05:00
greg-lunarg	c37388f1ad	Add passes to propagate and eliminate redundant line instructions (#2027). (#2039) These are bookend passes designed to help preserve line information across passes which delete, move and clone instructions. The propagation pass attaches a debug line instruction to every instruction based on SPIR-V line propagation rules. It should be performed before optimization. The redundant line elimination pass eliminates all line instructions which match the previous line instruction. This pass should be performed at the end of optimization to reduce physical SPIR-V file size. Fixes #2027.	2018-11-15 14:06:17 -05:00
greg-lunarg	1e9fc1aac1	Add base and core bindless validation instrumentation classes (#2014 ) * Add base and core bindless validation instrumentation classes * Fix formatting. * Few more formatting fixes * Fix build failure * More build fixes * Need to call non-const functions in order. Specifically, these are functions which call TakeNextId(). These need to be called in a specific order to guarantee that tests which do exact compares will work across all platforms. c++ pretty much does not guarantee order of evaluation of operands, so any such functions need to be called separately in individual statements to guarantee order. * More ordering. * And more ordering. * And more formatting. * Attempt to fix NDK build * Another attempt to address NDK build problem. * One more attempt at NDK build failure * Add instrument.hpp to BUILD.gn * Some name improvement in instrument.hpp * Change all types in instrument.hpp to int. * Improve documentation in instrument.hpp * Format fixes * Comment clean up in instrument.hpp * imageInst -> image_inst * Fix GetLabel() issue.	2018-11-08 13:54:54 -05:00
Steven Perron	75c1bf2843	Add option for the max id bound. (#1870) * Create a new entry point for the optimizer Creates a new struct to hold the options for the optimizer, and creates an entry point that takes the optimizer options as a parameter. The old entry point that takes validator options is now deprecated. The validator options will be one of the optimizer options. Part of the optimizer options will also be the upper bound on the id bound. * Add a command line option to set the max value for the id bound. The default is 0x3FFFFF. * Modify `TakeNextIdBound` to return 0 when the limit is reached.	2018-09-10 11:49:41 -04:00
dan sinclair	eda2cfbe12	Cleanup includes. (#1795) This CL cleans up the include paths to be relative to the top level directory. Various include-what-you-use fixes have been added.	2018-08-03 15:06:09 -04:00
dan sinclair	58a6876cee	Rewrite include guards (#1793) This CL rewrites the include guards to make PRESUBMIT.py include guard check happy.	2018-08-03 08:05:33 -04:00
Alan Baker	755e5c9420	Transform to combine consecutive access chains * Combines OpAccessChain, OpInBoundsAccessChain, OpPtrAccessChain and OpInBoundsPtrAccessChain * New folding rule to fold add with 0 for integers * Converts to a bitcast if the result type does not match the operand type V	2018-07-31 13:42:47 -04:00
Steven Perron	fe2fbee294	Delete the insert-extract-elim pass. Replaces anything that creates an insert-extract-elim pass so that it creates a simplification pass instead. Then deletes the implementation of the pass. Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1570.	2018-06-01 10:13:39 -04:00
Steven Perron	af430ec822	Add pass to fold a load feeding an extract. We have already disabled common uniform elimination because it created sequences that load an entire uniform object and then extract just a single element. This caused problems in some drivers, and is just generally slow because it loads more memory than needed. However, there are other ways to get into this situation, so I've added a pass that looks specifically for this pattern and removes it when only a portion of the load is used. Fixes #1547.	2018-05-14 15:40:34 -04:00
Toomas Remmelg	1dc2458060	Add a loop fusion pass. This pass will look for adjacent loops that are compatible and legal to be fused. Loops are compatible if: - they both have one induction variable - they have the same upper and lower bounds - same initial value - same condition - they have the same update step - they are adjacent - there are no break/continue in either of them Fusion is legal if: - fused loops do not have any dependencies with dependence distance greater than 0 that did not exist in the original loops. - there are no function calls in the loops (could have side-effects) - there are no barriers in the loops It will fuse all such loops as long as the number of registers used for the fused loop stays under the threshold defined by max_registers_per_loop.	2018-05-01 15:40:37 -04:00
Stephen McGroarty	9a5dd6fe88	Support loop fission. Adds support for splitting loops whose register pressure exceeds a user-provided level. This pass will split a loop into two or more loops given that the loop is a top level loop and that splitting the loop is legal. Control flow is left intact for dead code elimination to remove. This pass is enabled with the --loop-fission flag to spirv-opt.	2018-05-01 15:15:10 -04:00
Steven Perron	2c0ce87210	Vector DCE (#1512) Introduce a pass that does a DCE type analysis for vector elements instead of the whole vector as a single element. It will then rewrite instructions that are not used with something else. For example, an instruction whose values are not used, even though it is referenced, is replaced with an OpUndef.	2018-04-23 11:13:07 -04:00
Victor Lomuller	10e5d7cf13	Add a loop peeling pass. For each loop in a function, the pass walks the loops from the innermost to the outermost loop and tries to peel loops for which a certain number of iterations can be done before or after the loop. To limit code growth, peeling will not happen if the growth in code size goes above a configurable threshold.	2018-04-11 15:41:29 +01:00
Steven Perron	c4dc046399	Copy propagate arrays The SPIR-V generated from HLSL code contains many copies of very large arrays. Not only are these time-consuming, but they also cause problems for drivers because they require too much space. To work around this, we will implement an array copy propagation. Note that we will not implement a complete array data flow analysis in order to implement this. We will be looking for very simple cases: 1) The source must never be stored to. 2) The target must be stored to exactly once. 3) The store to the target must be a store to the entire array, and be a copy of the entire source. 4) All loads of the target must be dominated by the store. The hard part is keeping all of the types correct. We do not want to have to do too large a search to update everything, which may not be possible, so we give up if we see any instruction that might be hard to update. Also in types.h, the element decorations are now stored in a std::map. This change was done so the hashing algorithm for a Struct is consistent. With the std::unordered_map, the traversal order was non-deterministic, leading to the same type getting hashed to different values. See |Struct::GetExtraHashWords|. Contributes to #1416.	2018-03-26 14:44:41 -04:00
Diego Novillo	735d8a579e	SSA rewrite pass. This pass replaces the load/store elimination passes. It implements the SSA re-writing algorithm proposed in Simple and Efficient Construction of Static Single Assignment Form. Braun M., Buchwald S., Hack S., Leißa R., Mallon C., Zwinkau A. (2013) In: Jhala R., De Bosschere K. (eds) Compiler Construction. CC 2013. Lecture Notes in Computer Science, vol 7791. Springer, Berlin, Heidelberg https://link.springer.com/chapter/10.1007/978-3-642-37051-9_6 In contrast to common eager algorithms based on dominance and dominance frontier information, this algorithm works backwards from load operations. When a target variable is loaded, it queries the variable's reaching definition. If the reaching definition is unknown at the current location, it searches backwards in the CFG, inserting Phi instructions at join points in the CFG along the way until it finds the desired store instruction. The algorithm avoids repeated lookups using memoization. For reducible CFGs, which are a superset of the structured CFGs in SPIRV, this algorithm is proven to produce minimal SSA. That is, it inserts the minimal number of Phi instructions required to ensure the SSA property, but some Phi instructions may be dead (https://en.wikipedia.org/wiki/Static_single_assignment_form).	2018-03-20 20:56:55 -04:00
David Neto	844e186cf7	Add --strip-reflect pass Strips reflection info. This is limited to decorations and decoration instructions related to the SPV_GOOGLE_hlsl_functionality1 extension. It will remove the OpExtension for SPV_GOOGLE_hlsl_functionality1. It will also remove the OpExtension for SPV_GOOGLE_decorate_string if there are no further remaining uses of OpDecorateStringGOOGLE. Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1398	2018-03-15 21:20:42 -04:00
Victor Lomuller	3497a94460	Add loop unswitch pass. It moves all conditional branches and switches whose conditions are loop invariant and uniform. Before performing the loop unswitch we check that the loop does not contain any instruction that would prevent it (barriers, group instructions etc.).	2018-02-27 08:52:46 -05:00
Stephen McGroarty	dd8400e150	Initial support for loop unrolling. This patch adds initial support for loop unrolling in the form of a series of utility classes which perform the unrolling. The pass can be run with the command spirv-opt --loop-unroll. This will unroll loops within the module which have the unroll hint set. The unroller imposes a number of requirements on the loops it can unroll. These are documented in the comments for the LoopUtils::CanPerformUnroll method in loop_utils.h. Some of the restrictions will be lifted in future patches.	2018-02-14 15:44:38 -05:00
Alexander Johnston	84ccd0b9ae	Loop invariant code motion initial implementation	2018-02-08 22:55:47 -05:00
Steven Perron	61d8c0384b	Add pass to replace invalid opcodes Creates a pass that will remove instructions that are invalid for the current shader stage. For the instruction to be considered for replacement 1) The opcode must be valid for a shader module. 2) The opcode must be invalid for the current shader stage. 3) All entry points to the module must be for the same shader stage. 4) The function containing the instruction must be reachable from an entry point. Fixes #1247.	2018-02-01 15:25:09 -05:00
GregF	f28b106173	InsertExtractElim: Split out DeadInsertElim as separate pass	2018-01-30 08:52:14 -05:00
Alan Baker	2e93e806e4	Initial implementation of if conversion * Handles simple cases only * Identifies phis in blocks with two predecessors and attempts to convert the phi to a select * does not perform code motion currently so the converted values must dominate the join point (e.g. can't be defined in the branches) * limited for now to two predecessors, but can be extended to handle more cases * Adding if conversion to -O and -Os	2018-01-25 09:42:00 -08:00
Steven Perron	34d4294c2c	Create a pass to work around a driver bug related to OpUnreachable. We have come across a driver bug where an OpUnreachable inside a loop is causing the shader to go into an infinite loop. This commit will try to avoid this bug by turning OpUnreachable instructions that are contained in a loop into branches to the loop merge block. This is not added to "-O" and "-Os" because it should only be used if the driver being targeted has this problem. Fixes #1209.	2018-01-18 20:31:46 -05:00
Diego Novillo	4ba9dcc8a0	Implement SSA CCP (SSA Conditional Constant Propagation). This implements the conditional constant propagation pass proposed in Constant propagation with conditional branches, Wegman and Zadeck, ACM TOPLAS 13(2):181-210. The main logic resides in CCPPass::VisitInstruction. Instructions that may produce a constant value are evaluated with the constant folder. If they produce a new constant, the instruction is considered interesting. Otherwise, it's considered varying (for unfoldable instructions) or just not interesting (when not enough operands have a constant value). The other main piece of logic is in CCPPass::VisitBranch. This evaluates the selector of the branch. When it's found to be a known value, it computes the destination basic block and sets it. This tells the propagator which branches to follow. The patch required extensions to the constant manager as well. Instead of hashing the Constant pointers, this patch changes the constant pool to hash the contents of the Constant. This allows the lookups to be done using the actual values of the Constant, preventing duplicate definitions.	2017-12-21 14:29:45 -05:00
Steven Perron	b86eb6842b	Convert private variables to function scope. When a private variable is used in a single function, it can be converted to a function scope variable in that function. This adds a pass that does that. The pass can be enabled using the option `--private-to-local`. This transformation allows other transformations to act on these variables. Also moved `FindPointerToType` from the inline class to the type manager.	2017-12-19 14:21:04 -05:00
Alan Baker	616908503d	Improving the usability of the type manager. The type manager hashes types. This allows the lookup of type declaration ids from arbitrarily constructed types. Users should be cautious when dealing with non-unique types (structs and potentially pointers) to get the exact id if necessary. * Changed the spec composite constant folder to handle ambiguous composites * Added functionality to create necessary instructions for a type * Added ability to remove ids from the type manager	2017-12-18 08:20:56 -05:00
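
The OpLine rule described in commit 56d0f50357 (a line instruction applies to every following instruction up to the next OpLine, the next OpNoLine, or the end of the block) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the spirv-opt implementation: instructions are modeled as plain strings, and `opcode_of`, `propagate_lines`, and `BLOCK_TERMINATORS` are hypothetical names introduced only for this example.

```python
from typing import List, Optional, Tuple

# Hypothetical, simplified model: each instruction is a line of SPIR-V text.
# Assumed subset of opcodes that end a basic block (enough for this sketch).
BLOCK_TERMINATORS = {
    "OpBranch", "OpBranchConditional", "OpSwitch",
    "OpReturn", "OpReturnValue", "OpKill", "OpUnreachable",
}


def opcode_of(inst: str) -> str:
    """Return the opcode of a textual instruction, skipping a '%id =' prefix."""
    tokens = inst.split()
    if len(tokens) > 2 and tokens[1] == "=":
        return tokens[2]
    return tokens[0]


def propagate_lines(insts: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Pair every non-line instruction with the OpLine in effect for it."""
    current: Optional[str] = None  # OpLine currently in effect, if any
    result: List[Tuple[str, Optional[str]]] = []
    for inst in insts:
        op = opcode_of(inst)
        if op == "OpLine":
            current = inst            # a new OpLine replaces the previous one
        elif op == "OpNoLine":
            current = None            # OpNoLine clears the line information
        else:
            result.append((inst, current))
            if op in BLOCK_TERMINATORS:
                current = None        # line information stops at end of block
    return result


# The instruction sequence from the commit message.
example = [
    "OpLine %file 0 0",
    "OpNoLine",
    "OpLine %file 1 1",
    "OpStore %foo %int_1",
    "%value = OpLoad %int %foo",
    "OpLine %file 2 2",
]

for inst, line in propagate_lines(example):
    print(f"{inst} <- {line}")
```

In this model both `OpStore %foo %int_1` and `%value = OpLoad %int %foo` end up paired with `OpLine %file 1 1`, which is the outcome the commit message says the spec requires.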

84 Commits