SPIRV-Tools

mirror of https://github.com/KhronosGroup/SPIRV-Tools synced 2024-12-01 23:40:04 +00:00

Author	SHA1	Message	Date
Nathan Gauër	1a7f71afb4	clean: constexpr-ify and unify anon namespace use (#4991 ) Constexpr guaranteed no runtime init in addition to const semantics. Moving all opt/ to constexpr. Moving all compile-unit statics to anonymous namespaces to uniformize the method used (anonymous namespace vs static has the same behavior here AFAIK). Signed-off-by: Nathan Gauër <brioche@google.com>	2022-11-17 19:02:50 +01:00
alan-baker	d35a78db57	Switch SPIRV-Tools to use spirv.hpp11 internally (#4981 ) Fixes #4960 * Switches to using enum classes with an underlying type to avoid undefined behaviour	2022-11-04 17:27:10 -04:00
stu-s	c267127846	Add SPV_KHR_fragment_shader_barycentric support (#4805 ) * Add SPV_KHR_fragment_shader_barycentric support	2022-05-25 09:20:39 -04:00
Nikita	a3fbc9331b	Support SPV_KHR_uniform_group_instructions (#4734 )	2022-03-25 08:32:50 -04:00
Marius Hillenbrand	1ed847f438	Fix endianness of string literals (#4622 ) * Fix endianness of string literals To get correct and consistent encoding and decoding of string literals on big-endian platforms, use spvtools::utils::MakeString and MakeVector (or wrapper functions) consistently for handling string literals. - add variant of MakeVector that encodes a string literal into an existing vector of words - add variants of MakeString - add a wrapper spvDecodeLiteralStringOperand in source/ - fix wrapper Operand::AsString to use MakeString (source/opt) - remove Operand::AsCString as broken and unused - add a variant of GetOperandAs for string literals (source/val) ... and apply those wrappers throughout the code. Fixes #149 * Extend round trip test for StringLiterals to flip word order In the encoding/decoding roundtrip tests for string literals, include a case that flips byte order in words after encoding and then checks for successful decoding. That is, on a little-endian host flip to big-endian byte order and then decode, and vice versa. * BinaryParseTest.InstructionWithStringOperand: also flip byte order Test binary parsing of string operands both with the host's and with the reversed byte order.	2021-12-08 12:01:26 -05:00
JiaoluAMD	387cae472e	Opt passes should apply to the exported functions (#4554 ) This is follow-up to the commit `bd3a271ce3`	2021-10-18 13:18:16 -04:00
Greg Fischer	1454c95d1b	spirv-opt: Switch from Vulkan.DebugInfo to Shader.DebugInfo (#4493 ) Includes: - Shift to use of spirv-header extinst.nonsemantic.shader grammar.json - Remove extinst.nonsemantic.vulkan.debuginfo.100.grammar.json - Enable all optimizations for Shader.DebugInfo Also fixes scalar replacement to only insert DebugValue after all OpVariables. This is not necessary for OpenCL.DebugInfo, but it is for Shader.DebugInfo. Likewise, fixes Private-to-Local to insert DebugDeclare after all OpVariables. Also fixes inlining to handle FunctionDefinition which can show up after first block if early return processing happens. Co-authored-by: baldurk <baldurk@baldurk.org>	2021-09-15 14:38:53 -04:00
Greg Fischer	d9f8925785	spirv-opt: Where possible make code agnostic of opencl/vulkan debuginfo (#4385 ) Co-authored-by: baldurk <baldurk@baldurk.org>	2021-07-21 12:04:38 -04:00
Jaebaek Seo	4baf3affe3	spirv-opt: support SPV_EXT_shader_image_int64 (#4379 )	2021-07-14 08:43:35 -04:00
Kévin Petit	e065c482c6	Initial support for SPV_KHR_integer_dot_product (#4327 ) * Initial support for SPV_KHR_integer_dot_product - Adds new operand types for packed-vector-format - Moves ray tracing enums to the end - PackedVectorFormat is a new optional operand type, so it requires special handling in grammar table generation. - Add SPV_KHR_integer_dot_product to optimizer whitelists. - Pass-through validation: valid cases pass validation Validation errors are not checked. - Update SPIRV-Headers Patch by David Neto <dneto@google.com> Rebase and minor tweaks by Kevin Petit <kevin.petit@arm.com> Signed-off-by: David Neto <dneto@google.com> Signed-off-by: Kevin Petit <kevin.petit@arm.com> Change-Id: Icb41741cb7f0f1063e5541ce25e5ba6c02266d2c * format fixes Change-Id: I35c82ec27bded3d1b62373fa6daec3ffd91105a3	2021-06-23 13:32:24 -04:00
alan-baker	4d22f58a81	Support SPV_KHR_subgroup_uniform_control_flow (#4318 ) * Support SPV_KHR_subgroup_uniform_control_flow Covers: - assembler - disassembler - validator - optimizer (add to whitelists) * fix copyright Co-authored-by: David Neto <dneto@google.com>	2021-06-15 10:07:42 -04:00
Steven Perron	a187dd58a0	Allow SPV_KHR_8bit_storage extension. (#3780 )	2020-09-08 14:13:01 -04:00
Jaebaek Seo	ebaefda666	Debug info preservation in loop-unroll pass (#3548 ) When we copy the loop body to unroll it, we have to copy its instructions but DebugDeclare or DebugValue used for the declaration i.e., DebugValue with Deref must not be copied and only the first block can contain those instructions.	2020-07-30 12:18:06 -04:00
alan-baker	f3cec93665	Support SPV_KHR_terminate_invocation (#3568 ) Covers: - assembler - disassembler - validator - optimizer Co-authored-by: David Neto <dneto@google.com>	2020-07-22 11:45:02 -04:00
Jaebaek Seo	94667fbf66	Fix build failure (#3508 )	2020-07-09 21:12:21 -04:00
greg-lunarg	44428352ba	Upgrade elim-local-single-block for OpenCL.DebugInfo.100 (#3451 ) Creates a DebugValue when removing a store to a local variable.	2020-07-09 17:21:39 -04:00
dan sinclair	52a5f074e9	Update access control lists. (#3433 ) This CL updates the access control lists used in SPIRV-Tools to the more descriptive allow/deny naming.	2020-06-15 13:20:40 -04:00
Ehsan	e6f372c5c2	Whitelist SPV_KHR_ray_tracing (#3241 )	2020-03-23 09:31:05 -05:00
Ehsan	18d3896a15	Whitelist SPV_EXT_demote_to_helper_invocation for opt passes (#3236 )	2020-03-17 22:36:02 -05:00
Daniel Koch	5a97e3a391	Add support for KHR_ray_{query,tracing} extensions (#3235 ) Update validator for SPV_KHR_ray_tracing. * Added handling for new enum types * Add SpvScopeShaderCallKHR as a valid scope * update spirv-headers Co-authored-by: alelenv <alele@nvidia.com> Co-authored-by: Torosdagli <ntorosda@amd.com> Co-authored-by: Tobias Hector <tobias.hector@amd.com> Co-authored-by: Steven Perron <stevenperron@google.com>	2020-03-17 15:30:19 -04:00
greg-lunarg	29af42df12	Add SPV_EXT_physical_storage_buffer to opt whitelists (#2779 ) This also fixes ADCE to not remove possibly needed OpTypeForwardPointer. The bug, its fix and the corresponding test have a circular dependency with the extension, so they are packaged together.	2019-08-08 09:45:59 -04:00
Ehsan	a132c9b640	Whitelist SPV_GOOGLE_user_type. (#2673 )	2019-06-19 12:18:13 -04:00
Steven Perron	12e4a7b649	Handle variable pointer in some optimizations (#2490 ) * Check var pointer capability in ADCE. * Check var ptr capability for common uniform. * Check var ptr capability in access chain convert. Since we want this pass to run even if there are variable pointer on storage buffers, we had to remove asserts that assumed there were no variable pointers. The functions with the asserts will now work, it becomes the responsibility of the callers to deal with the output as appropriate. * Single block elimination and variable pointers. It seems like the code in local single block elimination is able to handle cases with variable pointers already. This is because the function `HasOnlySupportedRefs` ensures that variables that feed a variable pointer are not candidates. * Single store elimination and variable pointers. It seems like the code in local single stroe elimination is able to handle cases with variable pointers already. This is because the function `FindSingleStoreAndCheckUses` ensures that variables that feed a variable pointer are not candidates. * SSA rewriter and variable pointers. It seems like the code in the two passes that call the SSA rewriter are able to handle cases with variable pointers already. This is because the function `HasOnlySupportedRefs` ensures that variables that feed a variable pointer are not candidates. Fixes #2458.	2019-04-03 12:47:51 -04:00
Steven Perron	2d2a512691	Don't inline recursive functions. (#2130 ) * Move ProcessFunction* function from pass to the context. There are a few functions that are used to traverse the call tree. They currently live in the Pass class, but they have nothing to do with a pass, and may be needed outside of a pass. They would be better in the ir context, or in a specific call tree class if we ever have a need for it. * Don't inline recursive functions. Inlining does not check if a function is recursive or not. This has been fine as long as the shader was a Vulkan shader, which forbid recursive functions. However, not all shaders are vulkan, so either we limit inlining to Vulkan shaders or we teach it to look for recursive functions. I prefer to keep the passes as general as is reasonable. The change does not require much new code in inlining and gives a reason to refactor some other code. The changes are to add a member function to the Function class that checks if that function is recursive or not. Then this is used in inlining to not inlining a function call if it calls a recursive function. * Add id to function analysis There are a few places that build a map from ids to Function whose result is that id. I decided to add an analysis to the context for this to reduce that code, and simplify some of the functions. * Add missing file.	2018-11-29 14:24:58 -05:00
Daniel Koch	3b210d6a63	Add basic support for EXT_fragment_invocation_density (#2100 ) Whitelisting the extension in optimizations * copying what was done for NV_shading_rate	2018-11-23 10:21:19 -05:00
alelenv	1c1e749f0b	Add support for nv-raytracing-final (#2010 ) Add support for nv-raytracing (non-experimental)	2018-10-25 14:07:46 -04:00
Chao Chen	6e2dab2ffd	Add support for Nvidia Turing extensions	2018-09-19 20:46:14 -04:00
Steven Perron	e065cc208f	Keep decorations when replacing loads in access-chain-convert. (#1829 ) In local-access-chain-convert, we replace loads by load the entire variable, then doing the extract. The extract will have the same value as the load. However, if the load has a decoration on it, the decoration is lost because we do not copy any them to the new id. This is fixed by rewritting the load into the extract and keeping the same result id. This change has the effect that we do not call DCEInst on the loads because the load is not being deleted, but replaced. This could leave OpAccessChain instructions around that are not used. This is not a problem for -O and -Os. They run local_single_*_elim passes and then dead code elimination. The dce will remove the unused access chains, and the load elimination passes work even if there are unused access chains. I have added test to them to ensure they will not loss opportunities. Fixes #1787.	2018-08-15 09:14:21 -04:00
dan sinclair	eda2cfbe12	Cleanup includes. (#1795 ) This Cl cleans up the include paths to be relative to the top level directory. Various include-what-you-use fixes have been added.	2018-08-03 15:06:09 -04:00
dan sinclair	c7da51a085	Cleanup extraneous namespace qualifies in source/opt. (#1716 ) This CL follows up on the opt namespacing CLs by removing the unnecessary opt:: and opt::analysis:: namespace prefixes.	2018-07-12 15:14:43 -04:00
dan sinclair	f96b7f1cb9	use Pass::Run to set the context on each pass. (#1708 ) Currently the IRContext is passed into the Pass::Process method. It is then up to the individual pass to store the context into the context_ variable. This CL changes the Run method to store the context before calling Process which no-longer receives the context as a parameter.	2018-07-12 09:08:45 -04:00
dan sinclair	e6b953361d	Move the ir namespace to opt. (#1680 ) This CL moves the files in opt/ to consistenly be under the opt:: namespace. This frees up the ir:: namespace so it can be used to make a shared ir represenation.	2018-07-09 11:32:29 -04:00
Steven Perron	f46f2d3e5d	Remove redundant stores. The code patterns generated by DXC around function calls can cause many store to be storing the same value that was just loaded from the same location: ``` %10 = OpLoad %type %var OpStore %var %10 ``` We want to clean these up very early on because they can cause other transformations to do a lot of work. For the cases I see, they can be removed during local-single-block-elim. For one set of shaders the compile time goes from 248s to 182s. A 26% improvement. Part of https://github.com/KhronosGroup/SPIRV-Tools/issues/1494.	2018-05-15 10:24:05 -04:00
Greg Fischer	268be6143d	LocalSingleBlockElim: Add store-store elimination Eliminate unused store to variable if followed by store to same variable in same block. Most significantly, this cleans up stores made unused by this pass. These useless stores can inhibit subsequent optimizations, specifically LocalSingleStoreElim. Eliminating them makes subsequent optimization more effective. The main effect of this pass is to simplify the work done by the SSA rewriter. It catches many local loads/stores that help speeding up the work done by the main rewriter.	2018-04-25 10:30:18 -04:00
Steven Perron	c20a718e00	Rewrite local-single-store-elim to not create large data structures. The local-single-store-elim algorithm is not fundamentally bad. However, when there are a large number of variables, some of the maps that are used can become very large. These large data structures then take a very long time to be destroyed. I've seen cases around 40% if the time. I've rewritten that algorithm to not use as much memory. This give a significant improvement when running a large number of shader through DXC. I've also made a small change to local-single-block-elim to delete the loads that is has replaced. That way local-single-store-elim will not have to look at those. local-single-store-elim now does the same thing. The time for one set goes from 309s down to 126s. For another set, the time goes from 102s down to 88s.	2018-04-18 16:38:18 -04:00
David Neto	a91cbfbf75	Optimizer: update extension whitelists Add two new extensions: - SPV_NV_shader_subgroup_partitioned - SPV_EXT_descriptor_indexing	2018-04-06 15:56:20 -04:00
Eleni Maria Stea	045cc8f75b	Fixes compile errors generated with -Wpedantic This patch fixes the compile errors generated when the options SPIRV_WARN_EVERYTHING and SPIRV_WERROR (that force -Wpedantic) are set to cmake.	2018-03-22 09:40:11 -04:00
David Neto	2e3aec23ca	Add recent Google extensions to optimizer whitelists Optimizations should work in the presence of recent SPV_GOOGLE_decorate_string and SPV_GOOGLE_hlsl_functionality1 SPV_GOOGLE_decorate_string: - Adds operation OpDecorateStringGOOGLE to decorate an object with decorations having string operands. SPV_GOOGLE_hlsl_functionality1: - Adds HlslSemanticGOOGLE, used to decorate an interface variable with an HLSL semantic string. Optimizations already preserve those variables as required because they are interface variables (with uses), independent of whether they have HLSL decorations. - Adds HlslCounterBufferGOOGLE, used to associate a buffer with a counter variable. Fixes #1391	2018-03-15 11:16:20 -04:00
Rex Xu	314cfa29b2	Add missing SPV extension strings	2018-03-08 21:54:00 +08:00
Steven Perron	2cb589cc14	Remove uses DCEInst and call ADCE The algorithm used in DCEInst to remove dead code is very slow. It is fine if you only want to remove a small number of instructions, but, if you need to remove a large number of instructions, then the algorithm in ADCE is much faster. This PR removes the calls to DCEInst in the load-store removal passes and adds a pass of ADCE afterwards. A number of different iterations of the order of optimization, and I believe this is the best I could find. The results I have on 3 sets of shaders are: Legalization: Set 1: 5.39 -> 5.01 Set 2: 13.98 -> 8.38 Set 3: 98.00 -> 96.26 Performance passes: Set 1: 6.90 -> 5.23 Set 2: 10.11 -> 6.62 Set 3: 253.69 -> 253.74 Size reduction passes: Set 1: 7.16 -> 7.25 Set 2: 17.17 -> 16.81 Set 3: 112.06 -> 107.71 Note that the third set's compile time is large because of the large number of basic blocks, not so much because of the number of instructions. That is why we don't see much gain there.	2018-02-27 21:06:08 -05:00
Alan Baker	6587d3f8a3	Adding early exit versions of several ForEach* methods * Looked through code for instances where code would benefit from early exit * Added a corresponding WhileEach* method and updated the code	2018-01-12 17:05:09 -05:00
Steven Perron	756b277fb8	Store all enabled capabilities in the feature manger. In order to keep track of all of the implicit capabilities as well as the explicit ones, we will add them all to the feature manager. That is the object that needs to be queried when checking if a capability is enabled. The name of the "HasCapability" function in the module was changed to make it more obvious that it does not check for implied capabilities. Keep an spv_context and AssemblyGrammar in IRContext	2017-12-21 11:14:53 -05:00
Steven Perron	79a00649b4	Allow pointers to pointers in logical addressing mode. A few optimizations are updates to handle code that is suppose to be using the logical addressing mode, but still has variables that contain pointers as long as the pointer are to opaque objects. This is called "relaxed logical addressing". \|Instruction::GetBaseAddress\| will check that pointers that are use meet the relaxed logical addressing rules. Optimization that now handle relaxed logical addressing instead of logical addressing are: - aggressive dead-code elimination - local access chain convert - local store elimination passes.	2017-12-19 14:29:14 -05:00
Steven Perron	65046eca7c	Change IRContext::KillInst to delete instructions. The current method of removing an instruction is to call ToNop. The problem with this is that it leaves around an instruction that later passes will look at. We should just delete the instruction. In MemPass there is a utility routine called DCEInst. It can delete essentially any instruction, which can invalidate pointers now that they are actually deleted. The interface was changed to add a call back that can be used to update any local data structures that contain ir::Intruction*.	2017-12-04 11:07:45 -05:00
Diego Novillo	83228137e1	Re-format source tree - NFC. Re-formatted the source tree with the command: $ /usr/bin/clang-format -style=file -i \ $(find include source tools test utils -name '.cpp' -or -name '.h') This required a fix to source/val/decoration.h. It was not including spirv.h, which broke builds when the #include headers were re-ordered by clang-format.	2017-11-27 14:31:49 -05:00
Alan Baker	746bfd210a	Adding new def -> use mapping container Replaced representation of uses * Changed uses from unordered_map<uint32_t, UseList> to set<pairInstruction, Instruction>> * Replaced GetUses with ForEachUser and ForEachUse functions * updated passes to use new functions * partially updated tests * lots of cleanup still todo Adding an unique id to Instruction generated by IRContext Each instruction is given an unique id that can be used for ordering purposes. The ids are generated via the IRContext. Major changes: * Instructions now contain a uint32_t for unique id and a cached context pointer * Most constructors have been modified to take a context as input * unfortunately I cannot remove the default and copy constructors, but developers should avoid these * Added accessors to parents of basic block and function * Removed the copy constructors for BasicBlock and Function and replaced them with Clone functions * Reworked BuildModule to return an IRContext owning the built module * Since all instructions require a context, the context now becomes the basic unit for IR * Added a constructor to context to create an owned module internally * Replaced uses of Instruction's copy constructor with Clone whereever I found them * Reworked the linker functionality to perform clones into a different context instead of moves * Updated many tests to be consistent with the above changes * Still need to add new tests to cover added functionality * Added comparison operators to Instruction Adding tests for Instruction, IRContext and IR loading Fixed some header comments for BuildModule Fixes to get tests passing again * Reordered two linker steps to avoid use/def problems * Fixed def/use manager uses in merge return pass * Added early return for GetAnnotations * Changed uses of Instruction::ToNop in passes to IRContext::KillInst Simplifying the uses for some contexts in passes	2017-11-23 16:40:02 -05:00
Lei Zhang	b02c9a5802	Allow derived access chain without uses in access chain conversion	2017-11-23 16:00:28 -05:00
Diego Novillo	d2938e4842	Re-format files in source, source/opt, source/util, source/val and tools. NFC. This just makes sure every file is formatted following the formatting definition in .clang-format. Re-formatted with: $ clang-format -i $(find source tools include -name '.cpp') $ clang-format -i $(find source tools include -name '.h')	2017-11-08 14:03:08 -05:00
Steven Perron	f32d11f74b	Add the IRContext (part 2): Add def-use manager This change will move the instances of the def-use manager to the IRContext. This allows it to persists across optimization, and does not have to be rebuilt multiple times. Added test to ensure that the IRContext is validating and invalidating the analyses correctly.	2017-11-08 13:35:34 -05:00
Steven Perron	476cae6f7d	Add the IRContext (part 1) This is the first part of adding the IRContext. This class is meant to hold the extra data that is build on top of the module that it owns. The first part will simply create the IRContext class and get it passed to the passes in place of the module. For now it does not have any functionality of its own, but it acts more as a wrapper for the module. The functions that I added to the IRContext are those that either traverse the headers or add to them. I did this because we may decide to have other ways of dealing with these sections (for example adding a type pool, or use the decoration manager). I also added the function that add to the header because the IRContext needs to know when an instruction is added to update other data structures appropriately. Note that there is still lots of work that needs to be done. There are still many places that change the module, and do not inform the context. That will be the next step.	2017-10-31 13:46:05 -04:00

1 2

65 Commits