SPIRV-Tools

mirror of https://github.com/KhronosGroup/SPIRV-Tools synced 2024-10-21 04:20:05 +00:00

Author	SHA1	Message	Date
Arseny Kapoulkine	0265a9d4de	Implement constant folding for many transcendentals (#3166 ) * Implement constant folding for many transcendentals This change adds support for folding of sin/cos/tan/asin/acos/atan, exp/log/exp2/log2, sqrt, atan2 and pow. The mechanism allows any C function to be used to implement folding in the future; for now I limited the actual additions to the intrinsics most commonly used in shaders. The unary folder had to be tweaked to work with extended instructions - for extended instructions, constants.size() == 2 and constants[0] == nullptr. This adjustment is similar to the one the binary folder already performs. Fixes #1390. * Fix Android build On old versions of the Android NDK, we don't get std::exp2/std::log2 because of partial C++11 support. We do get ::exp2, but not ::log2, so we need to emulate that.	2020-02-03 09:20:47 -05:00
Alastair Donaldson	7a2d408dea	Fix typo in comment. (#3163 )	2020-01-30 15:01:05 -05:00
Steven Perron	97f1d485b7	Dead branch elim fix (#3160 ) We must treat a branch to the merge node of a switch that is in the header of a construct as a nested construct. The original merge instruction is still needed in that case.	2020-01-28 10:17:43 -05:00
greg-lunarg	e7afeb060e	Use dummy switch instead of dummy loop in MergeReturn pass. (#3151 ) Fixes #3127	2020-01-24 12:20:14 -05:00
Jaebaek Seo	dd37d73c5e	Handle conflict between debug info and existing validation rule (#3104 ) * Allow OpExtInst for DebugInfo between section 9 and 10 Fixes #3086 * Handle spirv-opt errors on DebugInfo Ext * Add IR Loader test * Fix ir loader bug * Handle DebugFunction/DebugTypeMember forward reference * Add test cases (forward reference to function) * Support old DebugInfo extension * Validate local debug info out of function	2020-01-23 17:04:30 -05:00
Jaebaek Seo	f8d7df760c	Fix OpLine bug of merge-blocks pass (#3130 ) As explained in #3118, spirv-opt merge-blocks pass causes a spirv-val error when an OpBranch has an OpLine in front of it. OpLoopMerge OpBranch ; Will be killed by merge-blocks pass OpLabel ; Will be killed by merge-blocks pass OpLine ; will be placed between OpLoopMerge and OpBranch - error! OpBranch To fix this issue, this commit moves line info of OpBranch to OpLoopMerge. Fixes #3118	2020-01-14 14:35:21 -05:00
David Neto	c8bf14393c	GetOperandConstants operand can be const (#3126 )	2020-01-06 11:14:04 -05:00
greg-lunarg	9215c1b7df	Fix convert-relax-to-half invalid code (#3099 ) (#3106 )	2019-12-20 21:08:12 -05:00
David Neto	e70b009b0f	Add support for SPV_KHR_non_semantic_info (#3110 ) Add support for SPV_KHR_non_semantic_info This entails a couple of changes: - Allowing unknown OpExtInstImport that begin with the prefix `NonSemantic.` - Allowing OpExtInst that reference any of those sets to contain unknown ext inst instruction numbers, and assume the format is always a series of IDs as guaranteed by the extension. - Allowing those OpExtInst to appear in the types/variables/constants section. - Not stripping OpString in the --strip-debug pass, since it may be referenced by these non-semantic OpExtInsts. - Stripping them instead in the --strip-reflect pass. * Add adjacency validation of non-semantic OpExtInst - We validate and test that OpExtInst cannot appear before or between OpPhi instructions, or before/between OpFunctionParameter instructions. * Change non-semantic extinst type to single value * Add helper function spvExtInstIsNonSemantic() which will check if the extinst set is non-semantic or not, either the unknown generic value or any future recognised non-semantic set. * Add test of a complex non-semantic extinst * Use DefUseManager in StripDebugInfoPass to strip some OpStrings * Any OpString used by a non-semantic instruction cannot be stripped, all others can so we search for uses to see if each string can be removed. * We only do this if the non-semantic debug info extension is enabled, otherwise all strings can be trivially removed. * Silence -Winconsistent-missing-override in protobufs	2019-12-18 18:10:29 -05:00
greg-lunarg	fccbc00aca	Make Instrumentation format version 2 the default (Step 1) (#3096 ) * Make Instrumentation format version 2 the default (Step 1) Add new interfaces without version number argument. Remove version 1 logic and tests. Version interfaces will be removed in step 2 after layers have transitioned to new interface. * Add error messages to InstrumentPass().	2019-12-16 14:18:47 -05:00
Steven Perron	00ca4e5bdf	Don't crash when folding construct of empty struct (#3092 ) * Don't crash when folding construct of empty struct An OpCompositeConstruct of an empty struct will be folded to a constant under normal circumstances. However, if the id limit has been reached and the constant cannot be generated, then other folding rules will be tried. These rules do not handle the case of an empty struct. We now allow it to be handled. Fixes http://crbug/1030194 * Changes based on the review.	2019-12-10 14:58:30 -05:00
Sarah	0a5d99d02c	Permit the debug instructions in WebGPU SPIR-V - remove from the optimizer (#3083 ) continuing #3063 fixing #3052	2019-12-03 11:21:26 -05:00
David Neto	af7410597e	graphics robust access: use signed clamp (#3073 ) Access chain indices are always interpreted as signed integers. So use signed clamp instead of unsigned clamp. We must also clamp to the max signed int for the index type. Fixes #3072	2019-12-03 11:18:56 -05:00
Steven Perron	3ed4586044	Folding: perform add and sub on mismatched integer types (#3084 ) Fixes #3040	2019-12-02 17:51:20 -05:00
alan-baker	b334829a91	Validate nested constructs (#3068 ) * Validate that if a construct contains a header and its merge is reachable, the construct also contains the merge * updated block merging to not merge into the continue * update inlining to mark the original block of a single block loop as the continue * updated some tests * remove dead code * rename kBlockTypeHeader to kBlockTypeSelection for clarity	2019-11-27 16:45:57 -05:00
Steven Perron	54385458ca	Handle unreachable block when computing register pressure (#3070 ) Fixes #3053	2019-11-27 09:45:17 -05:00
greg-lunarg	868ca3954c	Improve RegisterSizePasses (#3059 )	2019-11-27 09:41:50 -05:00
Steven Perron	0391d0823e	Handle OpPhi with no in operands in value numbering (#3056 ) Fixes #3043	2019-11-19 09:45:39 -05:00
Steven Perron	ca703c8877	Kill the id-to-func map after wrap-opkill (#3055 ) Wrap-opkill will create a new function, invalidating the id-to-func map. The preserved analyses for the pass have been updated to reflect that. Also adding consistency check for the id-to-func map. With this new check, old tests identify this problem. No new tests are needed. Fixes #3038	2019-11-19 09:44:53 -05:00
alan-baker	ab3cdcaef5	Fix operand access of composite in upgrade memory model (#3021 ) Fixes #2992 * Accessing aggregate subtype used the wrong operand * Added a test	2019-11-12 13:41:38 -05:00
Ehsan	12e54dae16	Update Offset to ConstOffset bitmask if operand is constant. (#3024 ) Update Offset to ConstOffset bitmask if operand is constant. Fixes #3005	2019-11-11 22:35:14 -05:00
David Neto	618ee50942	Fix some clang-tidy issues in graphics_robust_access_pass (#2998 ) One remains: the fact that the image-texel-pointer modification is mostly dead code. But that's intentional for now.	2019-10-30 14:00:34 -04:00
Jakub Kuderski	f893d4d41d	[opt] Do not compare optimized binary with an invalidated buffer (#2999 )	2019-10-30 10:01:28 -07:00
greg-lunarg	5ea7099374	Add two new simplifications. (#2984 ) Implements the following simplifications: (a - b) + b => a; (a * b) + (a * c) => a * (b + c). Also adds logic to simplification to handle rules that create new operations that might need simplification, such as the second rule above. Only perform the second simplification if the multiplies have the add as their only use. Otherwise this is a deoptimization of size and performance.	2019-10-28 08:19:38 -07:00
greg-lunarg	02910ffdff	Instrument: Add missing def-use analysis. (#2985 )	2019-10-22 07:24:54 -07:00
Steven Perron	6a9be627c7	Keep NOPs when comparing with original binary (#2931 ) We have a check that ensures that the optimizer did not change the binary when it says that it did not. However, when the binary is converted back to a binary, we made a decision to remove OpNop instructions. This means that any spv file that contains a NOP originally will fail this check. To get around this, we convert the module to a second binary that keeps the OpNop instructions. That binary is compared against the original. Fixes https://crbug.com/1010191	2019-10-18 09:53:29 -04:00
Jakub Kuderski	e3da3143b2	Disallow use of OpCompositeExtract/OpCompositeInsert with no indices (#2980 )	2019-10-17 13:53:34 -04:00
Aaron Barany	9c0ae6bb8e	Improved CMake install step. (#2963 ) Added exports for libraries. External libraries that themselves use libraries require all dependencies have exports, so not having exports can cause major problems when used within other projects. Install paths for exports are now placed in the proper directories expected by Windows and *nix systems. Config files are generated as well, which should work with CMake's find_package() function once installed.	2019-10-17 11:36:55 -04:00
Jakub Kuderski	e99b918221	Support constant-folding UConvert and SConvert (#2960 )	2019-10-16 16:29:55 -04:00
Steven Perron	32f76efa6c	Link cfg and dominator analysis in the context (#2946 ) Fixes #2889	2019-10-08 10:16:18 -04:00
Jeremy Hayes	3c7ff8d4f0	Enable OpTypeCooperativeMatrix specialization (#2927 )	2019-10-07 09:52:48 -04:00
Steven Perron	c18c9ff6bc	Handle OpKill better (#2933 ) We want to handle OpKill better. The wrap opkill causes lots of extra code to be generated, even when they are not needed to avoid the main problem: OpKill cannot be found directly in a continue construct. This change will be more selective on which functions the OpKill will be wrapped and inlining will avoid inlining. Fixes #2912	2019-10-04 13:05:32 -04:00
greg-lunarg	ad3d23f478	Generate null pointer by converting uint64 zero to pointer. (#2935 ) Fixes #2929.	2019-10-04 12:26:38 -04:00
Steven Perron	9eb1c9a4c4	Add continue construct analysis to struct cfg analysis (#2922 ) * Add continue construct analysis to struct cfg analysis Add the ability to identify which blocks are in the continue construct for a loop, and to get functions that are called from those blocks, directly or indirectly. Part of https://github.com/KhronosGroup/SPIRV-Tools/issues/2912.	2019-10-01 10:27:09 -04:00
Steven Perron	85c67b5e08	Record trailing line dbg instructions (#2926 ) There is nothing in the spir-v spec that says the last instructions in a module cannot be OpLine or OpNoLine. However, the code that parses the module will simply drop these instructions. We add code that will preserve these instructions. Strip-debug-info is updated to remove these instructions. Fixes https://crbug.com/1000689.	2019-09-27 16:03:45 -04:00
Ryan Harrison	4075b921f9	Add removing references to debug instructions when removing them (#2923 ) Fixes #2921	2019-09-27 13:23:06 -05:00
Steven Perron	2a11f365bc	Handle id overflow in wrap-opkill (#2916 ) New code in wrap-opkill does not handle id overflow correctly. We fix that up. Fixes https://crbug.com/1007144	2019-09-25 17:42:58 -04:00
Steven Perron	55ea57a785	Handle extract with no indexes (#2910 ) * Handle extract with no indexes It is possible that OpCompositeExtract instructions will not have any indexes. This is not handled well by scalar replacement and instruction folding. Fixes https://crbug.com/1006435 * Fix typo.	2019-09-24 16:19:31 -04:00
Steven Perron	6f26d9ad81	Handle id overflow in convert local access chains (#2908 ) Fixes https://crbug.com/1004453	2019-09-24 14:04:54 -04:00
David Neto	8d0ca43da5	Add method comment for opt::Function::WhileEachInst (#2867 ) Also, say that ForEachInst and ForEachParam process instructions/parameters in order.	2019-09-23 09:36:48 -04:00
Steven Perron	6b07212659	Use OpReturn* in wrap-opkill (#2886 ) * Use OpReturn* in wrap-opkill The wrap-opkill pass is generating incorrect code. It is placing an OpUnreachable at the end of a basic block, when the block can be reached. We can't reach the end of the block, but we can reach the end. Instead we will add a return instruction. Fixes #2875.	2019-09-20 10:32:27 -04:00
Steven Perron	61edde52a0	Revert "Use OpReturn* in wrap-opkill" This reverts commit `87f0fa432f`.	2019-09-19 22:39:56 -04:00
Steven Perron	87f0fa432f	Use OpReturn* in wrap-opkill The wrap-opkill pass is generating incorrect code. It is placing an OpUnreachable at the end of a basic block, when the block can be reached. We can't reach the end of the block, but we can reach the end. Instead we will add a return instruction. Fixes #2875.	2019-09-19 22:34:57 -04:00
Ehsan	08fcf8a4ab	Fix header include syntax. (#2882 )	2019-09-19 09:26:24 -05:00
Steven Perron	248c80b049	Handle OpConstantNull in copy-prop-arrays. (#2870 ) Many of the places in copy propagate arrays assumes that integer constant will be defined by an OpConstant instruction. That is not always true. We fix these spots by allowing for an OpConstantNull.	2019-09-19 10:24:00 -04:00
Ryan Harrison	67b87f22cf	Handle another case where creating a constant can fail (#2854 ) Fixes #2847	2019-09-11 17:18:05 -04:00
Steven Perron	c7a39bc40f	Don't inline function containing OpKill (#2842 ) If an OpKill instruction is inlined into a continue construct, then the spir-v is no longer valid. To avoid this issue, we do not inline functions containing an OpKill at all. This method was chosen because it is difficult to keep track of whether or not you are in a continue construct while changing the function that is being inlined into. This will work well with wrap OpKill because everything will still be inlined except for the OpKill instruction itself. Fixes #2554 Fixes #2433 This reverts commit `aa9e8f5380`.	2019-09-11 13:26:55 -04:00
Steven Perron	4f9256db35	Handle id overflow in wrap op kill. (#2851 ) Fixes https://crbug.com/997729	2019-09-11 13:26:42 -04:00
Ryan Harrison	c0e9807094	Handle creating a new constant failing gracefully (#2848 ) Fixes #2847	2019-09-10 12:51:19 -04:00
Steven Perron	35c9518c4e	Handle id overflow in the ssa rewriter. (#2845 ) * Handle id overflow in the ssa rewriter. Remove LocalSSAElim pass at the same time. It does the same thing as the SSARewrite pass. They even share almost all of the same code. Fixes crbug.com/997246	2019-09-10 09:38:23 -04:00
Steven Perron	7f7236f1eb	Handle id overflow in the constant manager. (#2844 ) Fixes crbug.com/997246	2019-09-09 15:12:26 -04:00
Steven Perron	76261e2a7d	Replace CubeFaceCoord and CubeFaceIndexAMD (#2840 ) Part of #2814.	2019-09-06 17:11:37 -04:00
Steven Perron	b218ad1994	Fold Min, Max, and Clamp instructions. (#2836 ) Fixes #2830.	2019-09-05 13:30:03 -04:00
Steven Perron	a41520eaa4	Replace uses of SPV_AMD_shader_trinary_minmax extension (#2835 ) Part of #2814	2019-09-05 09:29:04 -04:00
rumblehhh	1dfb5fc12e	Export SPIRV-Tools targets on installation (#2785 ) This allows the targets to be used in other cmake projects. See the following for more details: https://cmake.org/cmake/help/latest/manual/cmake-packages.7.html#creating-packages https://foonathan.net/blog/2016/07/07/cmake-dependency-handling.html	2019-09-04 12:45:26 -04:00
greg-lunarg	c77045b4a0	Instrument: Be sure Float16 capability on when generating float16 null (#2831 )	2019-09-03 15:19:36 -04:00
greg-lunarg	d11725b1d4	Add --relax-float-ops and --convert-relaxed-to-half (#2808 ) The first pass applies the RelaxedPrecision decoration to all executable instructions with float32 based type results. The second pass converts all executable instructions with RelaxedPrecision result to the equivalent float16 type, inserting converts where necessary.	2019-09-03 13:22:13 -04:00
Steven Perron	b54d950298	Fold Fmix should accept vector operands. (#2826 ) Fixes #2819	2019-09-03 09:17:18 -04:00
Ben Clayton	65e362b7ae	AggressiveDCEPass: Set modified to true when appending to to_kill_ (#2825 ) Also add an assertion that `modified` is true if to_kill_ has a non-zero size to catch this sort of issue in the pass. Fixes: #2824	2019-08-30 16:27:22 -04:00
Steven Perron	d67130caca	Replace SwizzleInvocationsAMD extended instruction. (#2823 ) Part of #2814	2019-08-30 14:07:24 -04:00
Steven Perron	ad71c057c7	Replace SwizzleInvocationsMaskedAMD extended instruction. (#2822 ) Part of #2814	2019-08-30 10:48:42 -04:00
Steven Perron	35d98be3bc	Amd ext to khr (#2811 ) Add the first steps to removing the AMD extension VK_AMD_shader_ballot. Splitting up to make the PRs smaller. Adding utilities to add capabilities and change the version of the module. Replaces the instructions: OpGroupIAddNonUniformAMD = 5000 OpGroupFAddNonUniformAMD = 5001 OpGroupFMinNonUniformAMD = 5002 OpGroupUMinNonUniformAMD = 5003 OpGroupSMinNonUniformAMD = 5004 OpGroupFMaxNonUniformAMD = 5005 OpGroupUMaxNonUniformAMD = 5006 OpGroupSMaxNonUniformAMD = 5007 and extended instructions WriteInvocationAMD = 3 MbcntAMD = 4 Part of #2814	2019-08-29 12:48:17 -04:00
Ben Clayton	5a581e738c	spvtools::Optimizer - don't assume original_binary and optimized_binary are aliased (#2799 ) If they are not aliased, the function will always print the message: "Binary unexpectedly changed despite optimizer saying there was no change" Which is (usually) totally bogus. Fixes #2798	2019-08-29 10:04:55 -04:00
Steven Perron	73422a0a5e	Check feature mgr in context consistency check (#2818 ) We add a check that the feature manager is correct after each pass. This resulted in a couple of failing test cases. Those are fixed. Part of #2814	2019-08-28 11:49:16 -04:00
Steven Perron	15fc19d091	Refactor instruction folders (#2815 ) * Refactor instruction folders We want to refactor the instruction folder to allow different sets of rules to be added to the instruction folder. We might want different sets of rules in different circumstances. We also need a way to add rules for extended instructions. Changes are made to the FoldingRules class and ConstFoldingRules class to enable that. We added tests to check that we can fold extended instructions using the new framework. At the same time, I noticed that there were two tests that did not test what they were supposed to. They could not be easily salvaged. #2813 was opened to track adding the new tests.	2019-08-26 18:54:11 -04:00
Steven Perron	b00ef0d26e	Handle Id overflow in private-to-local (#2807 ) We need to handle id overflow in the private to local pass. Fixes https://crbug.com/962295	2019-08-22 09:14:48 -04:00
Steven Perron	aef8f92b2b	Even more id overflow in sroa (#2806 ) Now we need to handle id overflow when we overflow while replacing uses of the variable. While looking at this code, I noticed an error in the way we handle access chains that cannot be replaced because of overflow. Namely, it will make some change and then give up by returning SuccessWithoutChange, even though something was changed. This is fixed up by returning Failure if we notice the error at the time of rewriting the users. This is for both id overflow and out-of-bounds accesses. Code is added to "CheckUses" to remove variables that have out-of-bounds accesses from the candidate list, so we don't even try to rewrite their uses. Fixes https://crbug.com/995032	2019-08-21 13:12:42 -04:00
Steven Perron	c5d1dab99e	Add name for variables in desc sroa (#2805 ) Fixes #2802.	2019-08-21 10:55:02 -04:00
David Neto	0cbdc7a2c3	Remove unimplemented method declaration (#2804 )	2019-08-20 08:53:27 -04:00
Steven Perron	bc62722b80	Handle overflow in wrap-opkill (#2801 ) Fixes https://crbug/994203	2019-08-18 19:00:18 -04:00
Steven Perron	9cd07272a6	More handle overflow in sroa (#2800 ) If we run out of ids when creating a new variable, sroa does not recognize the error, and continues doing work. This leads to segmentation faults. Fixes https://crbug/969655	2019-08-16 13:15:17 -04:00
greg-lunarg	06407250a1	Instrument: Add support for Buffer Device Address extension (#2792 )	2019-08-16 09:18:34 -04:00
Jaebaek Seo	ff872dc6bf	Change the way to include header (#2795 ) `#include <source/util/string_utils.h>` works only when we specify `include_directories(${CMAKE_CURRENT_SOURCE_DIR}/)` in cmake. It is hard to set the source directory as an include path in some build systems, e.g., bazel. Using the relative path easily solves this issue. This commit uses `#include "source/util/string_utils.h"` instead of `#include <source/util/string_utils.h>`.	2019-08-14 18:09:20 -04:00
Steven Perron	60043edfa1	Replace OpKill With function call. (#2790 ) We are not able to inline OpKill instructions into a continue construct. See #2433. However, we have to be able to inline to correctly do legalization. This commit creates a pass that will wrap OpKill instructions into a function of its own. That way we are able to inline the rest of the code. The follow up to this will be to not inline any function that contains an OpKill. Fixes #2726	2019-08-14 09:27:12 -04:00
greg-lunarg	95386f9e45	Instrument: Fix version 2 output record write for tess eval shaders. (#2782 ) Fix output record write for tess eval shaders. Also change command line for bindless instrumentation to use output record version 2.	2019-08-09 08:22:41 -04:00
Steven Perron	4b64beb1ae	Add descriptor array scalar replacement (#2742 ) Creates a pass that will replace a descriptor array with individual variables. See #2740 for details. Fixes #2740.	2019-08-08 10:53:19 -04:00
greg-lunarg	29af42df12	Add SPV_EXT_physical_storage_buffer to opt whitelists (#2779 ) This also fixes ADCE to not remove possibly needed OpTypeForwardPointer. The bug, its fix and the corresponding test have a circular dependency with the extension, so they are packaged together.	2019-08-08 09:45:59 -04:00
Steven Perron	b029d3697e	Handle RelaxedPrecision in SROA (#2788 ) If a member of a struct has a relaxed precision, sroa will not split the struct. This means we do not get all cases. This commit handles these cases. The other part is that the decoration needs to be passed on to the new variables. Fixes #2786	2019-08-07 12:17:26 -04:00
Geoff Lang	0b70972a29	Remove extra ';' after member function definition. (#2780 ) This fixes a clang compiler warning about extra semicolons.	2019-08-01 19:33:55 -04:00
alan-baker	3726b500b1	Treat access chain indexes as signed in SROA (#2776 ) Fixes #2768 * In scalar replacement, interpret access chain indexes as signed counts * Use Constant::GetSignExtendedValue and Constant::GetZeroExtendedValue where appropriate * new tests	2019-07-31 15:39:33 -04:00
David Neto	31590104ec	Add pass to inject code for robust-buffer-access semantics (#2771 ) spirv-opt: Add --graphics-robust-access Clamps access chain indices so they are always in bounds. Assumes: - Logical addressing mode - No runtime-array-descriptor-indexing - No variable pointers Adds stub code for clamping coordinate and samples for OpImageTexelPointer. Adds SinglePassRunAndFail optimizer test fixture. Android.mk: add source/opt/graphics_robust_access_pass.cpp Adds Constant::GetSignExtendedValue, Constant::GetZeroExtendedValue	2019-07-30 19:52:46 -04:00
David Neto	ac3d131054	Element type is const for analysis::Vector,Matrix,RuntimeArray (#2765 ) This makes it symmetric with the result type of ...->element_type which returns a const Type. So now we can write code like this: analysis::Vector v = ... analysis::Vector(v->element_type(), 2);	2019-07-29 22:55:18 -04:00
Diego Novillo	49797609b7	Protect against out-of-bounds references when folding OpCompositeExtract (#2774 ) This fixes #2608. The original test case had an out-of-bounds reference that ended up folding into OpCompositeExtract that was indexing right outside the constant composite. The returned constant would then cause a segfault during constant propagation.	2019-07-29 13:27:40 -07:00
alan-baker	7fd2365b06	Don't move debug or decorations when folding (#2772 ) Fixes #2764 * Don't replace all uses when simplifying instructions, instead only update non-debug, non-decoration uses * added a test * Add a new version of RAUW that takes a predicate to decide whether to replace the use or not * used in simplification pass	2019-07-29 16:20:43 -04:00
Diego Novillo	9559cdbdf0	Fix #2609 - Handle out-of-bounds scalar replacements. (#2767 ) * Fix #2609 - Handle out-of-bounds scalar replacements. When SROA tries to do a replacement for an OpAccessChain that is exactly one element out of bounds, the code was trying to access its internal array of replacements and segfaulting. This protects the code from doing this, and it additionally fixes the way SROA works by not returning failure when it refuses to do a replacement. Instead of failing the optimization pass, SROA will now simply refuse to do the replacement and keep going. Additionally, this patch fixes the SROA logic to now return a proper status so we can correctly state that the pass made no changes to the IR if it only found invalid references.	2019-07-26 12:33:40 -04:00
Steven Perron	bb0e2f65bb	Fix check for unreachable blocks in merge-return (#2762 ) Merge return expects unreachable merge block to look a certain way, and unreachable continue blocks to look a certain way. What if an unreachable block is both a merge and a continue? The continue is supposed to take precedence, but merge-return implements it with the merge taking precedence. This change flips that around. Fixes #2746	2019-07-25 09:34:18 -04:00
Steven Perron	c7fcb8c3b9	Process OpDecorateId in ADCE (#2761 ) * Process OpDecorateId in ADCE When there is an OpDecorateId instruction that is live, the ids that it references must be kept live. This change adds them to the worklist. I've also updated a validator check to allow OpDecorateId to be able to apply to decoration groups. Fixes #1759. * Remove dead code.	2019-07-24 14:43:49 -04:00
Steven Perron	fb83b6fbb5	Record correct dominators in merge return (#2760 ) In merge return, we need to know the original dominator for a block in order to traverse code from the original dominator to the new dominator and add appropriate Phi nodes. The current code gets this wrong because the dominator tree is built as needed. The first time we get the immediate dominator for a function we just built the dominator tree and it takes into account that a block has been split. The second time it does not. This inconsistency needs to be fixed. We do that by recording the original dominator for all blocks at the start of the pass. If we were to record just the basic block, that could change if the block is split. We want to traverse the code in the body of the original dominator, whatever block it ends up in. To make this easy to track, we now save the terminator instruction to represent the original dominator. Fixes #2745	2019-07-24 13:56:54 -04:00
Steven Perron	c9190a54da	SSA rewriter: Don't use trivial phis (#2757 ) When a phi candidate is marked as trivial, we are supposed to update all of its uses to reference the value that it is being folded to. However, the code that updates the uses misses `defs_at_block_`. So at a later time, the id for the trivial phi can reemerge. Fixes #2744	2019-07-23 17:59:30 -04:00
greg-lunarg	3855447d93	Bindless Instrument: Make init check depend solely on input_init_enabled (#2753 ) * Bindless Instrument: Make init check depend solely on input_init_enabled Previously was dependent on presence of descriptor_indexing extension in SPIR-V, but this missed some cases. Tests updated to reflect this new policy. * Fix format.	2019-07-22 13:51:39 -04:00
David Neto	76b75c40a1	Document opt::Instruction::InsertBefore methods (#2751 )	2019-07-18 11:37:28 -04:00
Steven Perron	aa9e8f5380	Revert "Do not inline OpKill Instructions (#2713 )" (#2749 ) This reverts commit `fe7cc9c612`.	2019-07-17 14:59:05 -04:00
Steven Perron	230c9e4371	Fix bug in merge return (#2734 ) * Fix bug in merge return The merge return pass seems to assume that the only new edges in the cfg are from return block to merge blocks. However, it is possible that a merge block branches to a merge block when it did not before. This change adds a new variable to track all of the new edges. It also renames some other variables and cleans up the code to make it a bit easier to read. Fixes #2702.	2019-07-16 09:11:22 -04:00
Jason Macnak	1fedf72e50	Allow ray tracing shaders in inst bindless check pass. (#2733 ) Adds the ray tracing stages (ray gen, intersection, any hit, closest hit, miss, and callable) to the allowed stages in pass instrumentation and adds debug records for these stages to output the global launch id. More information for ray tracing shaders: - https://github.com/KhronosGroup/GLSL/blob/master/extensions/nv/GLSL_NV_ray_tracing.txt	2019-07-15 16:24:42 -04:00
greg-lunarg	92c41ff1e7	Remove Common Uniform Elimination Pass (#2731 ) Remove Common Uniform Elimination Pass Fixes #2520.	2019-07-12 11:02:10 -04:00
Steven Perron	5ce8cf781f	Change the order branches are simplified in dead branch elim (#2728 ) Dead branch elimination needs to know about the constructs that a block is contained in when determining what to do with its merge instruction. We currently fold branches in blocks as we see them, which is parent constructs before their children. This causes the struct cfg analysis to crash because it tries to get the parent construct for a block after the parent has been folded. This can be fixed by folding the branch of the children before the parents. Fixes #2667.	2019-07-10 14:59:44 -04:00
Thomas Roughton	cd153db8ed	Add --preserve-bindings and --preserve-spec-constants (#2693 ) Add optimizer options for preservation of spec constants and variables with binding decorations. They are to be preserved even if they are unused.	2019-07-10 14:12:19 -04:00
Steven Perron	86e45efe15	Handle decorations better in some optimizations (#2716 ) There are a couple of spots where we are not looking at decorations when we should. 1. Value numbering is supposed to assign a different value number to ids if they have different decorations. However that is not being done for OpCopyObject and OpPhi. 2. Instruction simplification is propagating OpCopyObject instruction without checking for decorations. It should only do that if no decorations are being lost. Add a new function to the decoration manager to check if the decorations of one id are a subset of the decorations of another. Fixes #2715.	2019-07-10 11:37:16 -04:00
alan-baker	0c4feb643b	Remove extra semis (#2717 ) * Remove extra semi-colons * Update re2 dep	2019-07-08 15:07:36 -04:00
Steven Perron	37e8f79946	Perform merge return with single return in loop. (#2714 ) Inlining does not inline functions that have a single return that is in a loop. This is because the return cannot be replaced by a branch outside of the loop easily. Merge return knows how to rewrite the function so the return is replaced by a branch. Fixes #2038.	2019-07-04 14:14:49 -04:00

756 Commits