* Initial support for SPV_KHR_integer_dot_product
- Adds new operand types for packed-vector-format
- Moves ray tracing enums to the end
- PackedVectorFormat is a new optional operand type, so it requires
special handling in grammar table generation.
- Add SPV_KHR_integer_dot_product to optimizer whitelists.
- Pass-through validation: valid cases pass validation
Validation errors are not checked.
- Update SPIRV-Headers
Patch by David Neto <dneto@google.com>
Rebase and minor tweaks by Kevin Petit <kevin.petit@arm.com>
Signed-off-by: David Neto <dneto@google.com>
Signed-off-by: Kevin Petit <kevin.petit@arm.com>
Change-Id: Icb41741cb7f0f1063e5541ce25e5ba6c02266d2c
* format fixes
Change-Id: I35c82ec27bded3d1b62373fa6daec3ffd91105a3
The existing spirv-opt `DebugInfoManager::AddDebugValueForDecl()` sets
the scope and line info of the newly added DebugValue using the scope and
line of the DebugDeclare. This is wrong because only a single DebugDeclare
exists under a scope, whereas a DebugValue has to be added at every place
where the variable's value is updated. Therefore, we have to set the
scope and line of each DebugValue based on the location of the variable
update.
This bug makes
https://github.com/google/amber/blob/main/tests/cases/debugger_hlsl_shadowed_vars.amber
fail. This commit fixes the bug.
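A minimal sketch of the intended fix, with a hypothetical helper name and
simplified accessors (the real DebugInfoManager/Instruction API may differ):
the new DebugValue takes its scope and line from the instruction that updates
the variable, not from the DebugDeclare.
```
// Hypothetical helper; |value_update| is the store (or phi) that changes the
// variable's value. Namespace qualifiers are omitted for brevity.
void AttachDebugValueAtUpdateSite(Instruction* dbg_value,
                                  Instruction* value_update) {
  // Copy the scope *and* line from the update site, not from the DebugDeclare.
  dbg_value->UpdateDebugInfoFrom(value_update);
  dbg_value->InsertAfter(value_update);  // report the value where it changes
}
```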
spirv-opt has a bug where `DebugInfoManager::AddDebugValueWithIndex()` does not
preserve the `Indexes` operands of
[DebugValue](https://www.khronos.org/registry/spir-v/specs/unified1/OpenCL.DebugInfo.100.html#DebugValue).
It has to preserve all of those `Indexes` operands, but it preserves only the first index
operand.
This PR removes `DebugInfoManager::AddDebugValueWithIndex()` and lets spirv-opt
use `DebugInfoManager::AddDebugValueForDecl()`, which preserves the `Indexes`
operands correctly.
The front-end language compiler would simply emit a DebugDeclare for
a variable when it is declared, which is effective throughout the variable's
scope. Since DebugDeclare only maps an OpVariable to a local variable,
that information can be removed when an optimization pass replaces loads
of the variable with the loaded value. DebugValue can be used to specify the
value of a variable. For each value update or phi instruction of a variable,
we can add a DebugValue to help the debugger inspect the variable at any
point of the program execution.
For example,
  float a = 3;
  ... (complicated cfg) ...
  foo(a); // <-- variable inspection: the debugger can find the DebugValue of `float a` in the nearest dominating block
For code with a complicated CFG, e.g., a for-loop or an if-statement, we
need the help of ssa-rewrite to analyze the effective value of each variable
in each basic block.
If the value update of the variable happens only once and it dominates
all its uses, the local-single-store-elim pass performs the same value update
as ssa-rewrite, and we have to let it add a DebugValue for the value assignment as well.
One main issue is that we have to add DebugValue only when the value
update of a variable is visible to DebugDeclare. For example,
```
{ // scope1
%stack = OpVariable %ptr_int %int_3
{ // scope2
DebugDeclare %foo %stack <-- local variable "foo" in high-level language source code is declared as OpVariable "%stack"
// add DebugValue "foo = 3"
...
Store %stack %int_7 <-- foo = 7, add DebugValue "foo = 7"
...
// debugger can inspect the value of "foo"
}
Store %stack %int_11 <-- out of "scope2" i.e., scope of "foo". DO NOT add DebugValue "foo = 11"
}
```
However, the initialization of a variable is an exception.
For example, the argument passing for an inlined function is done outside of
the function's scope, but we must add a DebugValue for it.
```
// in HLSL
bar(float arg) { ... }
...
float foo = 3;
bar(foo);
// in SPIR-V
%arg = OpVariable
OpStore %arg %foo <-- Argument passing. Out of "float arg" scope, but we must add DebugValue for "float arg"
... body of function bar(float arg) ...
```
This PR handles this exception case in the local-single-store-elim pass. It adds
a DebugValue for a store that is considered an initialization.
The same exception handling for ssa-rewrite was done in commit df4198e50e.
In some cases, the DebugDecl is invisible to a value assignment, but
the value assignment information is important, i.e., the debugger cannot inspect
the variable without it. For example, a parameter of an inlined
function gets its value assignment, i.e., the argument passing, outside of its
function scope. If we simply remove the DebugDecl because it is invisible to the
argument passing, we cannot inspect the variable.
This PR
- Adds DebugValue for DebugDecl invisible to a value assignment. We use
the value of the variable in the basic block that contains DebugDecl, which is
found by ssa-rewrite. If the value instruction does not dominate DebugDecl,
we use the value of the variable in the immediate dominator of the basic block.
- Checks the visibility of DebugDecl for a Phi value assignment based on
all value operands of the Phi. Since a Phi just references multiple values from
multiple basic blocks, the scopes of its value operands must be regarded as the scope
of the Phi.
When we copy the loop body to unroll it, we have to copy its
instructions, but DebugDeclare, and the DebugValue with Deref that is used
for the declaration, must not be copied; only the first block
may contain those instructions.
1. Set the debug scope and line information for the new replacement
instructions.
2. Replace DebugDeclare and DebugValue if their OpVariable or value
operands are replaced by scalars. This uses the 'Indexes' operand of
DebugValue. For example,
struct S { int a; int b;}
S foo; // before scalar replacement
int foo_a; // after scalar replacement
int foo_b;
DebugDeclare %dbg_foo %foo %null_expr // before
DebugValue %dbg_foo %foo_a %Deref_expr 0 // after
DebugValue %dbg_foo %foo_b %Deref_expr 1 // means Value(foo.members[1]) == Deref(%foo_b)
This pass basically follows the same process as ssa-rewrite: it adds a DebugValue after each Store and removes the DebugDeclare or the DebugValue with Deref. It only does this if all instructions that depend on the Store are Loads and are replaced.
This also fixes ADCE to not remove possibly needed OpTypeForwardPointer.
The bug, its fix and the corresponding test have a circular dependency
with the extension, so they are packaged together.
* Check var pointer capability in ADCE.
* Check var ptr capability for common uniform.
* Check var ptr capability in access chain convert.
Since we want this pass to run even if there are variable pointers on
storage buffers, we had to remove asserts that assumed there were no
variable pointers. The functions that had the asserts will now work; it
becomes the responsibility of the callers to deal with the output as
appropriate.
* Single block elimination and variable pointers.
It seems like the code in local single block elimination is able to
handle cases with variable pointers already. This is because the
function `HasOnlySupportedRefs` ensures that variables that feed a
variable pointer are not candidates.
* Single store elimination and variable pointers.
It seems like the code in local single store elimination is able to
handle cases with variable pointers already. This is because the
function `FindSingleStoreAndCheckUses` ensures that variables that feed
a variable pointer are not candidates.
* SSA rewriter and variable pointers.
It seems like the code in the two passes that call the SSA rewriter are
able to handle cases with variable pointers already. This is because the
function `HasOnlySupportedRefs` ensures that variables that feed
a variable pointer are not candidates.
Fixes #2458.
* Move the ProcessFunction* functions from the Pass class to the context.
There are a few functions that are used to traverse the call tree.
They currently live in the Pass class, but they have nothing to do with
a pass, and may be needed outside of a pass. They would be better in
the ir context, or in a specific call tree class if we ever have a need
for it.
* Don't inline recursive functions.
Inlining does not check if a function is recursive or not. This has
been fine as long as the shader was a Vulkan shader, which forbids
recursive functions. However, not all shaders are Vulkan, so either
we limit inlining to Vulkan shaders or we teach it to look for recursive
functions.
I prefer to keep the passes as general as is reasonable. The change
does not require much new code in inlining and gives a reason to refactor
some other code.
The change adds a member function to the Function class that
checks whether the function is recursive.
This is then used in inlining to not inline a function call if it calls
a recursive function.
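A rough sketch of such a check (illustrative only; the real member walks the
module's call instructions rather than a prebuilt map): do a depth-first walk
over call edges and report whether the function can reach itself.
```
#include <cstdint>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// |callees| maps a function's result id to the ids of the functions it calls
// (one entry per OpFunctionCall). Returns true if |func_id| can reach itself.
bool IsRecursive(
    uint32_t func_id,
    const std::unordered_map<uint32_t, std::vector<uint32_t>>& callees) {
  std::unordered_set<uint32_t> visited;
  std::vector<uint32_t> stack = {func_id};
  while (!stack.empty()) {
    uint32_t current = stack.back();
    stack.pop_back();
    auto it = callees.find(current);
    if (it == callees.end()) continue;
    for (uint32_t callee : it->second) {
      if (callee == func_id) return true;  // call chain leads back to the start
      if (visited.insert(callee).second) stack.push_back(callee);
    }
  }
  return false;
}
```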
* Add id to function analysis
There are a few places that build a map from an id to the Function whose
result id it is. I decided to add an analysis to the context for this
to reduce that code and simplify some of the functions.
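A minimal sketch of the mapping such an analysis maintains (names are
illustrative, not the exact analysis interface): walk the module once and
record each function under its result id.
```
#include <cstdint>
#include <unordered_map>

// Resolve a function id without re-scanning the module each time.
// Namespace qualifiers for Module/Function are omitted for brevity.
std::unordered_map<uint32_t, Function*> BuildIdToFuncMap(Module* module) {
  std::unordered_map<uint32_t, Function*> id_to_func;
  for (Function& fn : *module) {
    id_to_func[fn.result_id()] = &fn;
  }
  return id_to_func;
}
```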
* Add missing file.
Many of the files have using std::<foo> statements in them, but the
uses of <foo> are then inconsistently written as std::<foo> or <foo>
throughout the file. This CL removes all of the using statements and
updates the code to have the required std:: prefix.
Currently the IRContext is passed into the Pass::Process method. It is
then up to the individual pass to store the context into the context_
variable. This CL changes the Run method to store the context before
calling Process, which no longer receives the context as a parameter.
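A simplified sketch of the new calling convention (the real Run does more
bookkeeping, and namespace qualifiers are omitted):
```
// Run caches the context in |context_| so that Process and the pass helpers
// can read it; Process itself no longer takes a parameter.
Pass::Status Pass::Run(IRContext* ctx) {
  context_ = ctx;
  return Process();
}
```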
This CL moves the files in opt/ to consistently be under the opt::
namespace. This frees up the ir:: namespace so it can be used to make a
shared ir representation.
The local-single-store-elim algorithm is not fundamentally bad.
However, when there are a large number of variables, some of the
maps that are used can become very large. These large data structures
then take a very long time to be destroyed. I've seen cases where this is
around 40% of the time.
I've rewritten that algorithm to not use as much memory. This gives a
significant improvement when running a large number of shaders through
DXC.
I've also made a small change to local-single-block-elim to delete the
loads that it has replaced. That way local-single-store-elim will not
have to look at those. local-single-store-elim now does the same thing.
The time for one set goes from 309s down to 126s. For another set, the
time goes from 102s down to 88s.
Optimizations should work in the presence of recent
SPV_GOOGLE_decorate_string and SPV_GOOGLE_hlsl_functionality1
SPV_GOOGLE_decorate_string:
- Adds operation OpDecorateStringGOOGLE to decorate an object with decorations
having string operands.
SPV_GOOGLE_hlsl_functionality1:
- Adds HlslSemanticGOOGLE, used to decorate an interface variable with
an HLSL semantic string. Optimizations already preserve those variables
as required because they are interface variables (with uses), independent
of whether they have HLSL decorations.
- Adds HlslCounterBufferGOOGLE, used to associate a buffer with a
counter variable.
Fixes #1391
The algorithm used in DCEInst to remove dead code is very slow. It is
fine if you only want to remove a small number of instructions, but, if
you need to remove a large number of instructions, then the algorithm in
ADCE is much faster.
This PR removes the calls to DCEInst in the load-store removal passes
and adds a pass of ADCE afterwards.
I tried a number of different orderings of the optimizations, and I
believe this is the best I could find.
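As an illustration of the resulting ordering (not the exact recipe from this
change), the public C++ optimizer API can register the load/store elimination
passes followed by a single ADCE cleanup:
```
#include <cstdint>
#include <vector>
#include "spirv-tools/optimizer.hpp"

void RunCleanup(std::vector<uint32_t>* binary) {
  spvtools::Optimizer opt(SPV_ENV_UNIVERSAL_1_3);
  opt.RegisterPass(spvtools::CreateLocalSingleBlockLoadStoreElimPass());
  opt.RegisterPass(spvtools::CreateLocalSingleStoreElimPass());
  // One ADCE pass afterwards replaces the per-instruction DCEInst cleanup.
  opt.RegisterPass(spvtools::CreateAggressiveDCEPass());
  opt.Run(binary->data(), binary->size(), binary);
}
```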
The results I have on 3 sets of shaders are:
Legalization:
Set 1: 5.39 -> 5.01
Set 2: 13.98 -> 8.38
Set 3: 98.00 -> 96.26
Performance passes:
Set 1: 6.90 -> 5.23
Set 2: 10.11 -> 6.62
Set 3: 253.69 -> 253.74
Size reduction passes:
Set 1: 7.16 -> 7.25
Set 2: 17.17 -> 16.81
Set 3: 112.06 -> 107.71
Note that the third set's compile time is large because of the large
number of basic blocks, not so much because of the number of
instructions. That is why we don't see much gain there.
In order to keep track of all of the implicit capabilities as well as
the explicit ones, we will add them all to the feature manager. That is
the object that needs to be queried when checking if a capability is
enabled.
The name of the "HasCapability" function in the module was changed to
make it more obvious that it does not check for implied capabilities.
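A hedged usage sketch (namespace qualifiers simplified): capability queries go
through the feature manager, which also knows about implied capabilities,
rather than through the module's own check.
```
// Returns true whether the capability was declared explicitly or is implied
// by another declared capability.
bool HasVarPtrStorageBuffer(IRContext* context) {
  return context->get_feature_mgr()->HasCapability(
      SpvCapabilityVariablePointersStorageBuffer);
}
```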
Keep an spv_context and AssemblyGrammar in IRContext
A few optimizations are updated to handle code that is supposed to be
using the logical addressing mode, but still has variables that contain
pointers, as long as the pointers are to opaque objects. This is called
"relaxed logical addressing".
|Instruction::GetBaseAddress| will check that pointers that are used meet
the relaxed logical addressing rules. Optimizations that now handle
relaxed logical addressing instead of logical addressing are:
- aggressive dead-code elimination
- local access chain convert
- local store elimination passes.
The current method of removing an instruction is to call ToNop. The
problem with this is that it leaves around an instruction that later
passes will look at. We should just delete the instruction.
In MemPass there is a utility routine called DCEInst. It can delete
essentially any instruction, and now that instructions are actually
deleted, pointers to them can be invalidated. The interface was changed
to add a callback that can be used to update any local data structures
that contain ir::Instruction*.
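A rough usage sketch from inside a pass method (the exact parameter type is an
approximation): the caller hands DCEInst a callback that is invoked for each
instruction about to be deleted, so pass-local structures can drop their
pointers first.
```
// |worklist_| stands in for any pass-local container of ir::Instruction*
// that must not keep pointers to deleted instructions.
DCEInst(inst, [this](ir::Instruction* dead_inst) {
  worklist_.erase(dead_inst);  // forget the instruction before it is deleted
});
```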
Re-formatted the source tree with the command:
$ /usr/bin/clang-format -style=file -i \
$(find include source tools test utils -name '*.cpp' -or -name '*.h')
This required a fix to source/val/decoration.h. It was not including
spirv.h, which broke builds when the #include headers were re-ordered by
clang-format.
Replaced representation of uses
* Changed uses from unordered_map<uint32_t, UseList> to
set<pair<Instruction*, Instruction*>>
* Replaced GetUses with ForEachUser and ForEachUse functions (see the
sketch after this list)
* updated passes to use new functions
* partially updated tests
* lots of cleanup still todo
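A rough usage sketch of the new traversal (exact def-use manager accessor
names may vary): instead of materializing a use list, the pass visits each
user in place.
```
// Visit every instruction that uses |def| and decide per user what to do.
get_def_use_mgr()->ForEachUser(def, [](ir::Instruction* user) {
  // e.g., check whether |user| is a load that can be replaced
});
```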
Adding a unique id to Instruction generated by IRContext
Each instruction is given a unique id that can be used for ordering
purposes. The ids are generated via the IRContext.
Major changes:
* Instructions now contain a uint32_t for unique id and a cached context
pointer
* Most constructors have been modified to take a context as input
* unfortunately I cannot remove the default and copy constructors, but
developers should avoid these
* Added accessors to parents of basic block and function
* Removed the copy constructors for BasicBlock and Function and replaced
them with Clone functions
* Reworked BuildModule to return an IRContext owning the built module
(see the sketch after this list)
* Since all instructions require a context, the context now becomes the
basic unit for IR
* Added a constructor to context to create an owned module internally
* Replaced uses of Instruction's copy constructor with Clone wherever I
found them
* Reworked the linker functionality to perform clones into a different
context instead of moves
* Updated many tests to be consistent with the above changes
* Still need to add new tests to cover added functionality
* Added comparison operators to Instruction
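A hedged example of the reworked entry point (consumer, target environment,
and assembly text are placeholders; see build_module.h for the exact
declaration): BuildModule now hands back an IRContext that owns the assembled
module.
```
std::unique_ptr<ir::IRContext> context =
    spvtools::BuildModule(SPV_ENV_UNIVERSAL_1_2, consumer, assembly_text);
ir::Module* module = context->module();  // the module is owned by the context
```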
Adding tests for Instruction, IRContext and IR loading
Fixed some header comments for BuildModule
Fixes to get tests passing again
* Reordered two linker steps to avoid use/def problems
* Fixed def/use manager uses in merge return pass
* Added early return for GetAnnotations
* Changed uses of Instruction::ToNop in passes to IRContext::KillInst
Simplifying the uses for some contexts in passes
NFC. This just makes sure every file is formatted following the
formatting definition in .clang-format.
Re-formatted with:
$ clang-format -i $(find source tools include -name '*.cpp')
$ clang-format -i $(find source tools include -name '*.h')
This CL moves some of the CFG-related functionality into a new
class opt::CFG. There is some other code related to the CFG in the
inliner and in opt::LocalSingleStoreElimPass that should also be moved,
but that requires more changes than this pure restructuring.
I will move those bits in a follow-up PR.
Currently, the CFG is computed every time a pass is instantiated, but
this should be later moved to the new IRContext class that @s-perron is
working on.
Other re-factoring:
- Add BasicBlock::ContinueBlockIdIfAny. Re-factored out of MergeBlockIdIfAny
- Rewrite IsLoopHeader in terms of GetLoopMergeInst.
- Run clang-format on some files.
This is the first part of adding the IRContext. This class is meant to
hold the extra data that is built on top of the module that it
owns.
The first part will simply create the IRContext class and get it passed
to the passes in place of the module. For now it does not have any
functionality of its own, but it acts more as a wrapper for the module.
The functions that I added to the IRContext are those that either
traverse the headers or add to them. I did this because we may decide
to have other ways of dealing with these sections (for example adding a
type pool, or use the decoration manager).
I also added the functions that add to the headers because the IRContext
needs to know when an instruction is added, to update other data
structures appropriately.
Note that there is still lots of work that needs to be done. There are
still many places that change the module, and do not inform the context.
That will be the next step.
This implements two cleanups suggested by @s-perron
(https://github.com/KhronosGroup/SPIRV-Tools/pull/921):
- Move FindNamedOrDecoratedIds() into MemPass::InitializeProcessing().
- Remove FinalizeNextId(). Always call SetIdBound() from
Pass::TakeNextId().
Including a re-factor of common behaviour into class Pass:
The following functions are now in class Pass:
- IsLoopHeader.
- ComputeStructuredOrder
- ComputeStructuredSuccessors (annoyingly, I could not re-factor all
instances of this function; the copy in common_uniform_elim_pass.cpp
is slightly different and fails with the common implementation).
- GetPointeeTypeId
- TakeNextId
- FinalizeNextId
- MergeBlockIdIfAny
This is an NFC (non-functional change)