SPIRV-Tools

mirror of https://github.com/KhronosGroup/SPIRV-Tools synced 2024-10-20 20:10:05 +00:00

Author	SHA1	Message	Date
Jaebaek Seo	6a3eb679bd	Preserve debug info in scalar replacement pass (#3461 ) 1. Set the debug scope and line information for the new replacement instructions. 2. Replace DebugDeclare and DebugValue if their OpVariable or value operands are replaced by scalars. It uses 'Indexes' operand of DebugValue. For example, struct S { int a; int b;} S foo; // before scalar replacement int foo_a; // after scalar replacement int foo_b; DebugDeclare %dbg_foo %foo %null_expr // before DebugValue %dbg_foo %foo_a %Deref_expr 0 // after DebugValue %dbg_foo %foo_b %Deref_expr 1 // means Value(foo.members[1]) == Deref(%foo_b)	2020-07-27 13:02:25 -04:00
Jaebaek Seo	7c901a49c9	Preserve OpenCL.DebugInfo.100 through private-to-local pass (#3571 ) A debug instruction must not have any impact on the private-to-local optimization.	2020-07-27 09:27:47 -04:00
alan-baker	f3cec93665	Support SPV_KHR_terminate_invocation (#3568 ) Covers: - assembler - disassembler - validator - optimizer Co-authored-by: David Neto <dneto@google.com>	2020-07-22 11:45:02 -04:00
Steven Perron	dca2c86bc8	Sink pointer instructions in merge return (#3569 ) We cannot create an OpPhi for pointers, so we have to regenerate these instructions instead. Fixes #3030 Fixes #3266	2020-07-22 11:10:58 -04:00
greg-lunarg	cf7e922e70	Preserve OpenCL.DebugInfo.100 through elim-dead-code-aggressive (#3542 ) Essentially, it marks all DebugInfo instructions in functions (and their operands) as live. It treats DebugDeclare and DebugValue with Deref as loads and so marks Stores of their variables as live. It marks each DebugGlobalVariables as live except for its variable. After closure, it rechecks if the variable is live. If not, the DebugGlobalVariable instruction's variable operand is set to DebugInfoNone, per the DebugInfo spec.	2020-07-21 16:10:09 -04:00
vkushwaha-nv	e4aebf99fa	Add changes for SPV_EXT_shader_atomic_float (#3562 )	2020-07-21 10:31:05 -04:00
Vasyl Teliman	8e0215afe0	spirv-opt: Add support for OpLabel to dominator analysis (#3516 ) Fixes #3515.	2020-07-15 12:59:35 +01:00
Jaebaek Seo	4c33fb0d3d	Rewrite KillDebugDeclares() (#3513 ) DebugInfoManager::KillDebugDeclares() must erase the variable id from |var_id_to_dbg_decl_| after killing its DebugDeclare instructions.	2020-07-14 14:47:16 -04:00
greg-lunarg	282392dda2	Add support to GPU-AV instrumentation for Task and Mesh shaders (#3512 )	2020-07-14 11:55:24 -04:00
greg-lunarg	cf8c86a2d9	Preserve OpenCL.DebugInfo.100 through elim-local-single-store (#3498 ) This pass basically follows the same process as ssa-rewrite: it adds a DebugValue after each Store and removes the DebugDeclare or DebugValue Deref. It only does this if all instructions that are dependent on the Store are Loads and are replaced.	2020-07-10 15:17:14 -04:00
Jaebaek Seo	a687057a83	Preserve debug info in vector DCE pass (#3497 ) This commit lets the vector DCE pass preserve the OpenCL.DebugInfo.100 information properly. When the vector DCE pass determines the liveness of instructions, the debug instructions must not affect the decision. In addition, when it kills some instructions, it has to kill DebugValue instructions that use the killed instructions. When it updates some composite values to meaningful values (not undef), it has to remove DebugValue because the value information becomes incorrect.	2020-07-10 10:19:34 -04:00
Jaebaek Seo	94667fbf66	Fix build failure (#3508 )	2020-07-09 21:12:21 -04:00
greg-lunarg	44428352ba	Upgrade elim-local-single-block for OpenCL.DebugInfo.100 (#3451 ) Creates a DebugValue when removing a store to a local variable.	2020-07-09 17:21:39 -04:00
Jaebaek Seo	f8eddbbe59	Preserve OpenCL.DebugInfo.100 in reduce-load-size pass (#3492 ) The decision to reduce the load must not be affected by debug instructions. For example, even when a DebugValue references a result id of a loaded composite value, this change lets the reduce-load-size pass reduce the load if the full composite value is not used anywhere other than the DebugValue.	2020-07-08 16:34:00 -04:00
Jaebaek Seo	6a4da9da42	Debug info preservation in copy-prop-array pass (#3444 ) When the pass replaces the local variable `OpVariable` ids to their corresponding pointers, we have to update operands of DebugValue or DebugDeclare instructions.	2020-07-06 13:48:12 -04:00
Jaebaek Seo	fc0dc3a9c7	Fix ADCE pass bug for multiple entries (#3470 ) When there are multiple entries and the shader has a variable with WorkGroup storage class, those multiple entry functions store values to the variable. Since the ADCE pass uses def-use chains to propagate the work list, some of the instructions in the work list are not actually part of the currently processed function. As a result, it adds instructions from other functions and puts them in |live_insts_|. However, it does not have the control flow information for those instructions in other functions, i.e., |block2headerBranch_| and |header2nextHeaderBranch_|. When it later processes the functions that contain those instructions, it skips handling them because they are already in |live_insts_| and does not check |block2headerBranch_| and |header2nextHeaderBranch_|, which results in skipping some branches. Even though those branches are live, it treats them as dead.	2020-06-29 13:08:48 -04:00
Jaebaek Seo	efaae24d00	Clear debug information for kill and replacement (#3459 ) For many spirv-opt passes such as simplify-instructions pass, we have to correctly clear the OpenCL.DebugInfo.100 debug information for KillInst() and ReplaceAllUses(). If we keep some debug information that disappeared because of KillInst() and ReplaceAllUses(), adding new DebugValue instructions based on the existing DebugDeclare information will generate incorrect information. This CL updates DebugInfoManager and IRContext to correctly clear debug information.	2020-06-25 15:48:26 -04:00
Ehsan	7a1af58785	Support OpCompositeExtract pattern in desc_sroa (#3456 ) * Support load and extract pattern in desc_sroa. * Fix typo in comments. * Load replacement var before use; and added test. * fix formatting * Address code review comments.	2020-06-23 12:24:53 -05:00
Jaebaek Seo	d4b9f576eb	[spirv-opt] debug info preservation in ssa-rewrite (#3356 ) Add OpenCL.DebugInfo.100 `DebugValue` instructions for store and phi instructions of local variables to provide the debugger with the updated values of local variables correctly.	2020-06-19 14:57:43 -04:00
Ehsan	2a1b8c0622	Updated desc_sroa to support flattening structures (#3448 ) Not all structures should be flattened. Code patterns used by DXC are used to create checks for which structures should be flattened.	2020-06-19 14:35:18 -04:00
Steven Perron	545d158a2f	Use structured order to unroll loops. (#3443 ) Fixes #3441	2020-06-18 16:00:34 -04:00
Vasyl Teliman	99651228b2	Add RemoveParameter method (#3437 )	2020-06-17 10:15:50 -04:00
Vasyl Teliman	57d9e360c6	Fix return type (#3435 )	2020-06-17 10:10:06 -04:00
Ehsan	a7112d544b	Eliminate branches with condition of OpConstantNull (#3438 )	2020-06-16 13:31:03 -04:00
dan sinclair	52a5f074e9	Update access control lists. (#3433 ) This CL updates the access control lists used in SPIRV-Tools to the more descriptive allow/deny naming.	2020-06-15 13:20:40 -04:00
greg-lunarg	4410272bdd	Remove deprecated interfaces from instrument passes (#3361 )	2020-05-21 13:10:42 -04:00
Jaebaek Seo	50b1557886	Preserve debug info in inline pass (#3349 ) Handles the OpenCL100Debug extension in inlining. It preserves the information that is available while also adding the debug inlined at for all of the inlining that it does.	2020-05-21 13:09:43 -04:00
Diego Novillo	4dbe18b0c8	Reject folding comparisons with unfoldable types. (#3370 ) Reject folding comparisons with unfoldable types. Fixes #3343 When CCP is evaluating an instruction, it was trying to fold a comparison with 64 bit integers. This was causing a fold failure later since the folder still cannot deal with 64 bit integers.	2020-05-21 12:58:08 -04:00
Steven Perron	3c47dac282	Add unrolling to performance passes (#3082 ) Unroll loops that are marked as unroll when doing -O. Add cleanup optimizations after unrolling. Fixes #3067	2020-05-20 15:43:13 -04:00
Jaebaek Seo	2b987c49a4	Handle OpConstantNull in ssa-rewrite (#3362 ) ssa-rewrite fails in `MemPass::GetPtr()` when the SPIR-V code contains `OpLoad` for the result id of `OpConstantNull` because of the out of index access for an operand to get the base address. This commit fixes it. Fixes #3344	2020-05-20 12:00:51 -04:00
Steven Perron	85c7e7956b	Don't register edges twice in merge return (#3350 ) Fixes #3267	2020-05-19 10:28:04 -04:00
Steven Perron	bd0a2da946	Revert "Revert "[spirv-opt] refactor inlining pass (#3328 )" (#3342 )" (#3345 ) This reverts commit `d4fac3451b`.	2020-05-14 10:55:47 -04:00
André Perez	a6b0e132ec	Add adjust branch weights transformation (#3336 ) In this PR, the classes that represent the adjust branch weights transformation and fuzzer pass were implemented. This transformation adjusts the branch weights of an OpBranchConditional instruction.	2020-05-14 11:38:34 +01:00
Steven Perron	d4fac3451b	Revert "[spirv-opt] refactor inlining pass (#3328 )" (#3342 ) This reverts commit `233246bc9c`.	2020-05-13 23:44:19 -04:00
Jaebaek Seo	233246bc9c	[spirv-opt] refactor inlining pass (#3328 ) - No longer inline functions with early exits. Merge return can modify them so they can be inlined. - Otherwise no functional change, should be just refactoring.	2020-05-13 23:17:19 -04:00
Steven Perron	63fa9114a9	Do merge return if the return is not at the end of the function. (#3337 ) * Do merge return if the return is not at the end of the function. We will remove the code in inlining to handle a return in the middle of a function. To inline those functions, we need to run merge return to move the return to the end of the function.	2020-05-12 11:56:16 -04:00
Jaebaek Seo	c8590c18bd	Preserve debug info for wrap-opkill (#3331 ) Preserve debug info for wrap-opkill	2020-05-06 12:57:57 -04:00
Alastair Donaldson	49842b88ee	Generalize IsReadOnlyVariable() to apply to pointers (#3325 ) Generalizes the IsReadOnlyVariable() method, and related methods, so that they can be used to ask whether pointer result ids are read-only. Fixes #3324.	2020-04-30 22:47:20 +01:00
Steven Perron	49ca250b44	Delete nullptr in function bb list immediately (#3326 ) When moving blocks around, we ended up with a nullptr for a basic block, and it was left in the list for a little bit. However, in that time, it would end up being dereferenced while traversing the function. To fix this, we delete it right away. This was found in an asan build that runs our current tests. No new tests are needed, but I did add extra asan checks for our asan bot.	2020-04-28 21:54:08 -04:00
Jaebaek Seo	d0a87194f7	Set DebugScope for termination instructions (#3323 ) Many high-level languages like HLSL and GLSL generate termination instructions such as return and branch from the actual part of the high-level language code like return and if statements. This commit lets IrLoader set `DebugScope` for termination instructions.	2020-04-28 09:30:44 -04:00
Jaebaek Seo	42268740c9	Add debug information analysis (#3305 ) We need an analysis for OpenCL.DebugInfo.100 extension instructions such as a map between function id and its DebugFunction. This commit adds an analysis for it.	2020-04-27 15:18:55 -04:00
David Neto	eed48ae479	Add spvtools::opt::Operand::AsLiteralUint64 (#3320 )	2020-04-27 09:38:06 -04:00
Steven Perron	61b7de3c39	Remove unreachable code. (#3304 )	2020-04-15 14:41:52 -04:00
Steven Perron	7d65bce0bb	Sampled images as read-only storage (#3295 ) There are some cases where a variable that is declared as a sampled image could be read only. That is when the image type has sampled == 1. Fixes #3288	2020-04-14 12:58:05 -04:00
Steven Perron	4956644894	Add tests for recently added command line option (#3297 ) We have not added tests for the new command line options recently. I've updated the test and fixed one option that was incorrect. Fixes #3247	2020-04-14 12:57:06 -04:00
Steven Perron	e70d25f6fa	Struct CFG analysis and single block loop (#3293 ) Loop headers must be marked as in the continue if the loop header is also the continue target. Fixes #3264	2020-04-13 10:08:31 -04:00
Jaebaek Seo	000040e707	Preserve debug info in eliminate-dead-functions (#3251 ) * Preserve debug info in eliminate-dead-functions The elimination of dead functions makes OpFunction operand of DebugFunction invalid. This commit replaces the operand with DebugInfoNone.	2020-04-13 09:29:36 -04:00
Steven Perron	34be23373b	Handle more cases in dead member elim (#3289 ) * Handle more cases in dead member elim - Rewrite composite insert and extract operations on SpecConstantOp. - Leaves assert for Access chain instructions, which are only allowed for kernels. - Other operations that do not require any extra code will no longer cause an assert. Fixes #3284. Fixes #3282.	2020-04-09 15:44:20 -04:00
alan-baker	af01d57b5e	Update dominates to check for null nodes (#3271 ) * Update dominates to check for null nodes Fixes #3270	2020-04-02 08:19:54 -04:00
alan-baker	f20c0d7971	Set wrapped kill basic block's parent (#3269 ) Fixes #3268 * Set the parent of the basic block for the wrapper function * Add a test	2020-04-01 12:31:57 -04:00
alan-baker	022da4d0e0	Fix identification of Vulkan images and buffers (#3253 ) Fixes #3252 * Image and buffer queries did not account for optional level of arrayness on the variable * new tests	2020-03-25 17:38:24 -04:00
Jaebaek Seo	1c8bda3721	Add data structure for DebugScope, DebugDeclare in spirv-opt (#3183 ) When DebugScope is given in SPIR-V, each instruction following the DebugScope is from the lexical scope pointed to by the DebugScope in the high-level language. We add a DebugScope structure to keep the scope information in the Instruction class. When ir_loader loads DebugScope/DebugNoScope, it keeps the scope information in |last_dbg_scope_| and lets following instructions have that scope information. In terms of DebugDeclare/DebugValue, if it is in a function body but outside of a basic block, we keep it in |debug_insts_in_header_| of the Function class. If it is in a basic block, we keep it as a normal instruction, i.e., in an instruction list of BasicBlock.	2020-03-23 11:01:18 -04:00
Ehsan	e6f372c5c2	Whitelist SPV_KHR_ray_tracing (#3241 )	2020-03-23 09:31:05 -05:00
David Neto	60104cd974	Add opt::Operand::AsCString and AsString (#3240 ) It only works when the operand is a literal string.	2020-03-19 12:44:28 -04:00
JiaoluAMD	da52d0875c	Add RayQueryProvisionalKHR to opt types (#3239 ) Add missing RayQueryProvisionalKHR types	2020-03-19 12:41:30 -04:00
Ehsan	18d3896a15	Whitelist SPV_EXT_demote_to_helper_invocation for opt passes (#3236 )	2020-03-17 22:36:02 -05:00
Daniel Koch	5a97e3a391	Add support for KHR_ray_{query,tracing} extensions (#3235 ) Update validator for SPV_KHR_ray_tracing. * Added handling for new enum types * Add SpvScopeShaderCallKHR as a valid scope * update spirv-headers Co-authored-by: alelenv <alele@nvidia.com> Co-authored-by: Torosdagli <ntorosda@amd.com> Co-authored-by: Tobias Hector <tobias.hector@amd.com> Co-authored-by: Steven Perron <stevenperron@google.com>	2020-03-17 15:30:19 -04:00
greg-lunarg	1fe9bcc108	Instrument: Debug Printf support (#3215 ) Create a pass to instrument OpDebugPrintf instructions. This pass replaces all OpDebugPrintf instructions with instructions to write a record containing the string id and all the specified values into a special printf output buffer (if space allows). This pass is designed to support the printf validation in the Vulkan validation layers. Fixes #3210	2020-03-12 09:19:52 -04:00
Diego Novillo	a9624b4d5d	Handle TimeAMD in AmdExtensionToKhrPass. (#3168 ) This adds support for replacing TimeAMD with OpReadClockKHR. The scope for OpReadClockKHR is fixed to be a subgroup because TimeAMD operates only on subgroup.	2020-02-03 12:13:32 -05:00
Arseny Kapoulkine	0265a9d4de	Implement constant folding for many transcendentals (#3166 ) * Implement constant folding for many transcendentals This change adds support for folding of sin/cos/tan/asin/acos/atan, exp/log/exp2/log2, sqrt, atan2 and pow. The mechanism allows any C function to be used to implement folding in the future; for now I limited the actual additions to the most commonly used intrinsics in the shaders. Unary folder had to be tweaked to work with extended instructions - for extended instructions, constants.size() == 2 and constants[0] == nullptr. This adjustment is similar to the one binary folder already performs. Fixes #1390. * Fix Android build On old versions of Android NDK, we don't get std::exp2/std::log2 because of partial C++11 support. We do get ::exp2, but not ::log2 so we need to emulate that.	2020-02-03 09:20:47 -05:00
Alastair Donaldson	7a2d408dea	Fix typo in comment. (#3163 )	2020-01-30 15:01:05 -05:00
Steven Perron	97f1d485b7	Dead branch elim fix (#3160 ) We must treat a branch to the merge node of a switch that is in the header of a construct as a nested construct. The original merge instruction is still needed in that case.	2020-01-28 10:17:43 -05:00
greg-lunarg	e7afeb060e	Use dummy switch instead of dummy loop in MergeReturn pass. (#3151 ) Fixes #3127	2020-01-24 12:20:14 -05:00
Jaebaek Seo	dd37d73c5e	Handle conflict between debug info and existing validation rule (#3104 ) * Allow OpExtInst for DebugInfo between section 9 and 10 Fixes #3086 * Handle spirv-opt errors on DebugInfo Ext * Add IR Loader test * Fix ir loader bug * Handle DebugFunction/DebugTypeMember forward reference * Add test cases (forward reference to function) * Support old DebugInfo extension * Validate local debug info out of function	2020-01-23 17:04:30 -05:00
Jaebaek Seo	f8d7df760c	Fix OpLine bug of merge-blocks pass (#3130 ) As explained in #3118, spirv-opt merge-blocks pass causes a spirv-val error when an OpBranch has an OpLine in front of it. OpLoopMerge OpBranch ; Will be killed by merge-blocks pass OpLabel ; Will be killed by merge-blocks pass OpLine ; will be placed between OpLoopMerge and OpBranch - error! OpBranch To fix this issue, this commit moves line info of OpBranch to OpLoopMerge. Fixes #3118	2020-01-14 14:35:21 -05:00
David Neto	c8bf14393c	GetOperandConstants operand can be const (#3126 )	2020-01-06 11:14:04 -05:00
greg-lunarg	9215c1b7df	Fix convert-relax-to-half invalid code (#3099 ) (#3106 )	2019-12-20 21:08:12 -05:00
David Neto	e70b009b0f	Add support for SPV_KHR_non_semantic_info (#3110 ) Add support for SPV_KHR_non_semantic_info This entails a couple of changes: - Allowing unknown OpExtInstImport that begin with the prefix `NonSemantic.` - Allowing OpExtInst that reference any of those sets to contain unknown ext inst instruction numbers, and assume the format is always a series of IDs as guaranteed by the extension. - Allowing those OpExtInst to appear in the types/variables/constants section. - Not stripping OpString in the --strip-debug pass, since it may be referenced by these non-semantic OpExtInsts. - Stripping them instead in the --strip-reflect pass. * Add adjacency validation of non-semantic OpExtInst - We validate and test that OpExtInst cannot appear before or between OpPhi instructions, or before/between OpFunctionParameter instructions. * Change non-semantic extinst type to single value * Add helper function spvExtInstIsNonSemantic() which will check if the extinst set is non-semantic or not, either the unknown generic value or any future recognised non-semantic set. * Add test of a complex non-semantic extinst * Use DefUseManager in StripDebugInfoPass to strip some OpStrings * Any OpString used by a non-semantic instruction cannot be stripped, all others can so we search for uses to see if each string can be removed. * We only do this if the non-semantic debug info extension is enabled, otherwise all strings can be trivially removed. * Silence -Winconsistent-missing-override in protobufs	2019-12-18 18:10:29 -05:00
greg-lunarg	fccbc00aca	Make Instrumentation format version 2 the default (Step 1) (#3096 ) * Make Instrumentation format version 2 the default (Step 1) Add new interfaces without version number argument. Remove version 1 logic and tests. Version interfaces will be removed in step 2 after layers have transitioned to new interface. * Add error messages to InstrumentPass().	2019-12-16 14:18:47 -05:00
Steven Perron	00ca4e5bdf	Don't crash when folding construct of empty struct (#3092 ) * Don't crash when folding construct of empty struct An OpCompositeConstruct of an empty struct will be folded to a constant under normal circumstances. However, if the id limit has been reached and the constant cannot be generated, then other folding rules will be tried. These rules do not handle the case of an empty struct. We now allow it to be handled. Fixes http://crbug/1030194 * Changes based on the review.	2019-12-10 14:58:30 -05:00
Sarah	0a5d99d02c	Permit the debug instructions in WebGPU SPIR-V - remove from the optimizer (#3083 ) continuing #3063 fixing #3052	2019-12-03 11:21:26 -05:00
David Neto	af7410597e	graphics robust access: use signed clamp (#3073 ) Access chain indices are always interpreted as signed integers. So use signed clamp instead of unsigned clamp. We must also clamp to the max signed int for the index type. Fixes #3072	2019-12-03 11:18:56 -05:00
Steven Perron	3ed4586044	Folding: perform add and sub on mismatched integer types (#3084 ) Fixes #3040	2019-12-02 17:51:20 -05:00
alan-baker	b334829a91	Validate nested constructs (#3068 ) * Validate that if a construct contains a header and its merge is reachable, the construct also contains the merge * updated block merging to not merge into the continue * update inlining to mark the original block of a single block loop as the continue * updated some tests * remove dead code * rename kBlockTypeHeader to kBlockTypeSelection for clarity	2019-11-27 16:45:57 -05:00
Steven Perron	54385458ca	Handle unreachable block when computing register pressure (#3070 ) Fixes #3053	2019-11-27 09:45:17 -05:00
greg-lunarg	868ca3954c	Improve RegisterSizePasses (#3059 )	2019-11-27 09:41:50 -05:00
Steven Perron	0391d0823e	Handle OpPhi with no in operands in value numbering (#3056 ) Fixes #3043	2019-11-19 09:45:39 -05:00
Steven Perron	ca703c8877	Kill the id-to-func map after wrap-opkill (#3055 ) Wrap-opkill will create a new function, invalidating the id-to-func map. The preserved analyses for the pass have been updated to reflect that. Also adding consistency check for the id-to-func map. With this new check, old tests identify this problem. No new tests are needed. Fixes #3038	2019-11-19 09:44:53 -05:00
alan-baker	ab3cdcaef5	Fix operand access of composite in upgrade memory model (#3021 ) Fixes #2992 * Accessing aggregate subtype used the wrong operand * Added a test	2019-11-12 13:41:38 -05:00
Ehsan	12e54dae16	Update Offset to ConstOffset bitmask if operand is constant. (#3024 ) Update Offset to ConstOffset bitmask if operand is constant. Fixes #3005	2019-11-11 22:35:14 -05:00
David Neto	618ee50942	Fix some clang-tidy issues in graphics_robust_access_pass (#2998 ) One remains: the fact that the image-texel-pointer modification is mostly dead code. But that's intentional for now.	2019-10-30 14:00:34 -04:00
Jakub Kuderski	f893d4d41d	[opt] Do not compare optimized binary with an invalidated buffer (#2999 )	2019-10-30 10:01:28 -07:00
greg-lunarg	5ea7099374	Add two new simplifications. (#2984 ) Implements the following simplifications: (a - b) + b => a (a * b) + (a * c) => a * (b + c) Also adds logic to simplification to handle rules that create new operations that might need simplification, such as the second rule above. Only perform the second simplification if the multiplies have the add as their only use. Otherwise this is a deoptimization of size and performance.	2019-10-28 08:19:38 -07:00
greg-lunarg	02910ffdff	Instrument: Add missing def-use analysis. (#2985 )	2019-10-22 07:24:54 -07:00
Steven Perron	6a9be627c7	Keep NOPs when comparing with original binary (#2931 ) We have a check that ensures that the optimizer did not change the binary when it says that it did not. However, when the binary is converted back to a binary, we made a decision to remove OpNop instructions. This means that any spv file that contains a NOP originally will fail this check. To get around this, we convert the module to a second binary that keeps the OpNop instructions. That binary is compared against the original. Fixes https://crbug.com/1010191	2019-10-18 09:53:29 -04:00
Jakub Kuderski	e3da3143b2	Disallow use of OpCompositeExtract/OpCompositeInsert with no indices (#2980 )	2019-10-17 13:53:34 -04:00
Aaron Barany	9c0ae6bb8e	Improved CMake install step. (#2963 ) Added exports for libraries. External libraries that themselves use libraries require all dependencies have exports, so not having exports can cause major problems when used within other projects. Install paths for exports are now placed in the proper directories expected by Windows and *nix systems. Config files are generated as well, which should work with CMake's find_package() function once installed.	2019-10-17 11:36:55 -04:00
Jakub Kuderski	e99b918221	Support constant-folding UConvert and SConvert (#2960 )	2019-10-16 16:29:55 -04:00
Steven Perron	32f76efa6c	Link cfg and dominator analysis in the context (#2946 ) Fixes #2889	2019-10-08 10:16:18 -04:00
Jeremy Hayes	3c7ff8d4f0	Enable OpTypeCooperativeMatrix specialization (#2927 )	2019-10-07 09:52:48 -04:00
Steven Perron	c18c9ff6bc	Handle OpKill better (#2933 ) We want to handle OpKill better. The wrap opkill causes lots of extra code to be generated, even when they are not needed to avoid the main problem: OpKill cannot be found directly in a continue construct. This change will be more selective on which functions the OpKill will be wrapped and inlining will avoid inlining. Fixes #2912	2019-10-04 13:05:32 -04:00
greg-lunarg	ad3d23f478	Generate null pointer by converting uint64 zero to pointer. (#2935 ) Fixes #2929.	2019-10-04 12:26:38 -04:00
Steven Perron	9eb1c9a4c4	Add continue construct analysis to struct cfg analysis (#2922 ) * Add continue construct analysis to struct cfg analysis Add the ability to identify which blocks are in the continue construct for a loop, and to get functions that are called from those blocks, directly or indirectly. Part of https://github.com/KhronosGroup/SPIRV-Tools/issues/2912.	2019-10-01 10:27:09 -04:00
Steven Perron	85c67b5e08	Record trailing line dbg instructions (#2926 ) There is nothing in the spir-v spec that says the last instructions in a module cannot be OpLine or OpNoLine. However, the code that parses the module will simply drop these instructions. We add code that will preserve these instructions. Strip-debug-info is updated to remove these instructions. Fixes https://crbug.com/1000689.	2019-09-27 16:03:45 -04:00
Ryan Harrison	4075b921f9	Add removing references to debug instructions when removing them (#2923 ) Fixes #2921	2019-09-27 13:23:06 -05:00
Steven Perron	2a11f365bc	Handle id overflow in wrap-opkill (#2916 ) New code in wrap-opkill does not handle id overflow correctly. We fix that up. Fixes https://crbug.com/1007144	2019-09-25 17:42:58 -04:00
Steven Perron	55ea57a785	Handle extract with no indexes (#2910 ) * Handle extract with no indexes It is possible that OpCompositeExtract instructions will not have any indexes. This is not handled well by scalar replacement and instruction folding. Fixes https://crbug.com/1006435 * Fix typo.	2019-09-24 16:19:31 -04:00
Steven Perron	6f26d9ad81	Handle id overflow in convert local access chains (#2908 ) Fixes https://crbug.com/1004453	2019-09-24 14:04:54 -04:00
David Neto	8d0ca43da5	Add method comment for opt::Function::WhileEachInst (#2867 ) Also, say that ForEachInst and ForEachParam process instructions/parameters in order.	2019-09-23 09:36:48 -04:00
Steven Perron	6b07212659	Use OpReturn* in wrap-opkill (#2886 ) * Use OpReturn* in wrap-opkill The wrap-opkill pass is generating incorrect code. It is placing an OpUnreachable at the end of a basic block, when the block can be reached. We can't reach the end of the block, but we can reach the end. Instead we will add a return instruction. Fixes #2875.	2019-09-20 10:32:27 -04:00
Steven Perron	61edde52a0	Revert "Use OpReturn* in wrap-opkill" This reverts commit `87f0fa432f`.	2019-09-19 22:39:56 -04:00
Steven Perron	87f0fa432f	Use OpReturn* in wrap-opkill The wrap-opkill pass is generating incorrect code. It is placing an OpUnreachable at the end of a basic block, when the block can be reached. We can't reach the end of the block, but we can reach the end. Instead we will add a return instruction. Fixes #2875.	2019-09-19 22:34:57 -04:00
Ehsan	08fcf8a4ab	Fix header include syntax. (#2882 )	2019-09-19 09:26:24 -05:00
Steven Perron	248c80b049	Handle OpConstantNull in copy-prop-arrays. (#2870 ) Many of the places in copy propagate arrays assumes that integer constant will be defined by an OpConstant instruction. That is not always true. We fix these spots by allowing for an OpConstantNull.	2019-09-19 10:24:00 -04:00
Ryan Harrison	67b87f22cf	Handle another case where creating a constant can fail (#2854 ) Fixes #2847	2019-09-11 17:18:05 -04:00
Steven Perron	c7a39bc40f	Don't inline function containing OpKill (#2842 ) If an OpKill instruction is inlined into a continue construct, then the SPIR-V is no longer valid. To avoid this issue, we do not inline a function containing an OpKill at all. This method was chosen because it is difficult to keep track of whether or not you are in a continue construct while changing the function that is being inlined into. This will work well with wrap OpKill because everything will still be inlined except for the OpKill instruction itself. Fixes #2554 Fixes #2433 This reverts commit `aa9e8f5380`.	2019-09-11 13:26:55 -04:00
Steven Perron	4f9256db35	Handle id overflow in wrap op kill. (#2851 ) Fixes https://crbug.com/997729	2019-09-11 13:26:42 -04:00
Ryan Harrison	c0e9807094	Handle creating a new constant failing gracefully (#2848 ) Fixes #2847	2019-09-10 12:51:19 -04:00
Steven Perron	35c9518c4e	Handle id overflow in the ssa rewriter. (#2845 ) * Handle id overflow in the ssa rewriter. Remove LocalSSAElim pass at the same time. It does the same thing as the SSARewrite pass. They even share almost all of the same code. Fixes crbug.com/997246	2019-09-10 09:38:23 -04:00
Steven Perron	7f7236f1eb	Handle id overflow in the constant manager. (#2844 ) Fixes crbug.com/997246	2019-09-09 15:12:26 -04:00
Steven Perron	76261e2a7d	Replace CubeFaceCoord and CubeFaceIndexAMD (#2840 ) Part of #2814.	2019-09-06 17:11:37 -04:00
Steven Perron	b218ad1994	Fold Min, Max, and Clamp instructions. (#2836 ) Fixes #2830.	2019-09-05 13:30:03 -04:00
Steven Perron	a41520eaa4	Replace uses of SPV_AMD_shader_trinary_minmax extension (#2835 ) Part of #2814	2019-09-05 09:29:04 -04:00
rumblehhh	1dfb5fc12e	Export SPIRV-Tools targets on installation (#2785 ) This allows the targets to be used in other cmake projects. See the following for more details: https://cmake.org/cmake/help/latest/manual/cmake-packages.7.html#creating-packages https://foonathan.net/blog/2016/07/07/cmake-dependency-handling.html	2019-09-04 12:45:26 -04:00
greg-lunarg	c77045b4a0	Instrument: Be sure Float16 capability on when generating float16 null (#2831 )	2019-09-03 15:19:36 -04:00
greg-lunarg	d11725b1d4	Add --relax-float-ops and --convert-relaxed-to-half (#2808 ) The first pass applies the RelaxedPrecision decoration to all executable instructions with float32 based type results. The second pass converts all executable instructions with RelaxedPrecision result to the equivalent float16 type, inserting converts where necessary.	2019-09-03 13:22:13 -04:00
Steven Perron	b54d950298	Fold Fmix should accept vector operands. (#2826 ) Fixes #2819	2019-09-03 09:17:18 -04:00
Ben Clayton	65e362b7ae	AggressiveDCEPass: Set modified to true when appending to to_kill_ (#2825 ) Also add an assertion that `modified` is true if to_kill_ has a non-zero size to catch this sort of issue in the pass. Fixes: #2824	2019-08-30 16:27:22 -04:00
Steven Perron	d67130caca	Replace SwizzleInvocationsAMD extended instruction. (#2823 ) Part of #2814	2019-08-30 14:07:24 -04:00
Steven Perron	ad71c057c7	Replace SwizzleInvocationsMaskedAMD extended instruction. (#2822 ) Part of #2814	2019-08-30 10:48:42 -04:00
Steven Perron	35d98be3bc	Amd ext to khr (#2811 ) Add the first steps to removing the AMD extension VK_AMD_shader_ballot. Splitting up to make the PRs smaller. Adding utilities to add capabilities and change the version of the module. Replaces the instructions: OpGroupIAddNonUniformAMD = 5000 OpGroupFAddNonUniformAMD = 5001 OpGroupFMinNonUniformAMD = 5002 OpGroupUMinNonUniformAMD = 5003 OpGroupSMinNonUniformAMD = 5004 OpGroupFMaxNonUniformAMD = 5005 OpGroupUMaxNonUniformAMD = 5006 OpGroupSMaxNonUniformAMD = 5007 and extended instructions WriteInvocationAMD = 3 MbcntAMD = 4 Part of #2814	2019-08-29 12:48:17 -04:00
Ben Clayton	5a581e738c	spvtools::Optimizer - don't assume original_binary and optimized_binary are aliased (#2799 ) If they are not aliased, the function will always print the message: "Binary unexpectedly changed despite optimizer saying there was no change" Which is (usually) totally bogus. Fixes #2798	2019-08-29 10:04:55 -04:00
Steven Perron	73422a0a5e	Check feature mgr in context consistency check (#2818 ) We add a check that the feature manager is correct after each pass. This resulted in a couple of failing test cases. Those are fixed. Part of #2814	2019-08-28 11:49:16 -04:00
Steven Perron	15fc19d091	Refactor instruction folders (#2815 ) * Refactor instruction folders We want to refactor the instruction folder to allow different sets of rules to be added to the instruction folder. We might want different sets of rules in different circumstances. We also need a way to add rules for extended instructions. Changes are made to the FoldingRules class and ConstFoldingRules class to enable that. We added tests to check that we can fold extended instructions using the new framework. At the same time, I noticed that there were two tests that did not test what they were supposed to. They could not be easily salvaged. #2813 was opened to track adding the new tests.	2019-08-26 18:54:11 -04:00
Steven Perron	b00ef0d26e	Handle Id overflow in private-to-local (#2807 ) We need to handle id overflow in the private to local pass. Fixes https://crbug.com/962295	2019-08-22 09:14:48 -04:00
Steven Perron	aef8f92b2b	Even more id overflow in sroa (#2806 ) Now we need to handle id overflow when we overflow while replacing uses of the variable. While looking at this code, I noticed an error in the way we handle access chains that cannot be replaced because of overflow. Namely, it will make some change, and then give up by returning SuccessWithoutChange. But it was changed. This is fixed up by returning Failure if we notice the error at the time of rewriting the users. This is for both id overflow and out-of-bounds accesses. Code is added to "CheckUses" to remove variables that have out-of-bounds accesses from the candidate list, so we don't even try to rewrite their uses. Fixes https://crbug.com/995032	2019-08-21 13:12:42 -04:00
Steven Perron	c5d1dab99e	Add name for variables in desc sroa (#2805 ) Fixes #2802.	2019-08-21 10:55:02 -04:00
David Neto	0cbdc7a2c3	Remove unimplemented method declaration (#2804 )	2019-08-20 08:53:27 -04:00
Steven Perron	bc62722b80	Handle overflow in wrap-opkill (#2801 ) Fixes https://crbug/994203	2019-08-18 19:00:18 -04:00
Steven Perron	9cd07272a6	More handle overflow in sroa (#2800 ) If we run out of ids when creating a new variable, sroa does not recognize the error, and continues doing work. This leads to segmentation faults. Fixes https://crbug/969655	2019-08-16 13:15:17 -04:00
greg-lunarg	06407250a1	Instrument: Add support for Buffer Device Address extension (#2792 )	2019-08-16 09:18:34 -04:00
Jaebaek Seo	ff872dc6bf	Change the way to include header (#2795 ) `#include <source/util/string_utils.h>` works only when we specify `include_directories(${CMAKE_CURRENT_SOURCE_DIR}/)` in cmake. It is hard to set the source directory as a include path in some build systems e.g., bazel. Using the relative path easily solves this issue. This commit uses `#include "source/util/string_utils.h"` instead of `#include <source/util/string_utils.h>`.	2019-08-14 18:09:20 -04:00
Steven Perron	60043edfa1	Replace OpKill With function call. (#2790 ) We are not able to inline OpKill instructions into a continue construct. See #2433. However, we have to be able to inline to correctly do legalization. This commit creates a pass that will wrap OpKill instructions into a function of its own. That way we are able to inline the rest of the code. The follow up to this will be to not inline any function that contains an OpKill. Fixes #2726	2019-08-14 09:27:12 -04:00
greg-lunarg	95386f9e45	Instrument: Fix version 2 output record write for tess eval shaders. (#2782 ) Fix output record write for tess eval shaders. Also change command line for bindless instrumentation to use output record version 2.	2019-08-09 08:22:41 -04:00
Steven Perron	4b64beb1ae	Add descriptor array scalar replacement (#2742 ) Creates a pass that will replace a descriptor array with individual variables. See #2740 for details. Fixes #2740.	2019-08-08 10:53:19 -04:00
greg-lunarg	29af42df12	Add SPV_EXT_physical_storage_buffer to opt whitelists (#2779 ) This also fixes ADCE to not remove possibly needed OpTypeForwardPointer. The bug, its fix and the corresponding test have a circular dependency with the extension, so they are packaged together.	2019-08-08 09:45:59 -04:00
Steven Perron	b029d3697e	Handle RelaxedPrecision in SROA (#2788 ) If a member of a struct has a relaxed precision, sroa will not split the struct. This means we do not get all cases. This commit handles these cases. The other part is that the decoration needs to be passed on to the new variables. Fixes #2786	2019-08-07 12:17:26 -04:00
Geoff Lang	0b70972a29	Remove extra ';' after member function definition. (#2780 ) This fixes a clang compiler warning about extra semicolons.	2019-08-01 19:33:55 -04:00
alan-baker	3726b500b1	Treat access chain indexes as signed in SROA (#2776 ) Fixes #2768 * In scalar replacement, interpret access chain indexes as signed counts * Use Constant::GetSignExtendedValue and Constant::GetZeroExtendedValue where appropriate * new tests	2019-07-31 15:39:33 -04:00
David Neto	31590104ec	Add pass to inject code for robust-buffer-access semantics (#2771 ) spirv-opt: Add --graphics-robust-access Clamps access chain indices so they are always in bounds. Assumes: - Logical addressing mode - No runtime-array-descriptor-indexing - No variable pointers Adds stub code for clamping coordinate and samples for OpImageTexelPointer. Adds SinglePassRunAndFail optimizer test fixture. Android.mk: add source/opt/graphics_robust_access_pass.cpp Adds Constant::GetSignExtendedValue, Constant::GetZeroExtendedValue	2019-07-30 19:52:46 -04:00
David Neto	ac3d131054	Element type is const for analysis::Vector,Matrix,RuntimeArray (#2765 ) This makes it symmetric with the result type of ...->element_type which returns a const Type. So now we can write code like this: analysis::Vector v = ... analysis::Vector(v->element_type(), 2);	2019-07-29 22:55:18 -04:00
Diego Novillo	49797609b7	Protect against out-of-bounds references when folding OpCompositeExtract (#2774 ) This fixes #2608. The original test case had an out-of-bounds reference that ended up folding into OpCompositeExtract that was indexing right outside the constant composite. The returned constant would then cause a segfault during constant propagation.	2019-07-29 13:27:40 -07:00
alan-baker	7fd2365b06	Don't move debug or decorations when folding (#2772 ) Fixes #2764 * Don't replace all uses when simplifying instructions, instead only update non-debug, non-decoration uses * added a test * Add a new version of RAUW that takes a predicate to decide whether to replace the use or not * used in simplification pass	2019-07-29 16:20:43 -04:00
Diego Novillo	9559cdbdf0	Fix #2609 - Handle out-of-bounds scalar replacements. (#2767 ) * Fix #2609 - Handle out-of-bounds scalar replacements. When SROA tries to do a replacement for an OpAccessChain that is exactly one element out of bounds, the code was trying to access its internal array of replacements and segfaulting. This protects the code from doing this, and it additionally fixes the way SROA works by not returning failure when it refuses to do a replacement. Instead of failing the optimization pass, SROA will now simply refuse to do the replacement and keep going. Additionally, this patch fixes the SROA logic to now return a proper status so we can correctly state that the pass made no changes to the IR if it only found invalid references.	2019-07-26 12:33:40 -04:00
Steven Perron	bb0e2f65bb	Fix check for unreachable blocks in merge-return (#2762 ) Merge return expects unreachable merge block to look a certain way, and unreachable continue blocks to look a certain way. What if an unreachable block is both a merge and a continue? The continue is supposed to take precedence, but merge-return implements it with the merge taking precedence. This change flips that around. Fixes #2746	2019-07-25 09:34:18 -04:00
Steven Perron	c7fcb8c3b9	Process OpDecorateId in ADCE (#2761 ) * Process OpDecorateId in ADCE When there is an OpDecorateId instruction that is live, the ids that it references must be kept live. This change adds them to the worklist. I've also updated a validator check to allow OpDecorateId to be able to apply to decoration groups. Fixes #1759. * Remove dead code.	2019-07-24 14:43:49 -04:00
Steven Perron	fb83b6fbb5	Record correct dominators in merge return (#2760 ) In merge return, we need to know the original dominator for a block in order to traverse code from the original dominator to the new dominator and add appropriate Phi nodes. The current code gets this wrong because the dominator tree is built as needed. The first time we get the immediate dominator for a function we just built the dominator tree and it takes into account that a block has been split. The second time it does not. This inconsistency needs to be fixed. We do that by recording the original dominator for all blocks at the start of the pass. If we were to record just the basic block, that could change if the block is split. We want to traverse the code in the body of the original dominator, whatever block it ends up in. To make this easy to track, we now save the terminator instruction to represent the original dominator. Fixes #2745	2019-07-24 13:56:54 -04:00
Steven Perron	c9190a54da	SSA rewriter: Don't use trivial phis (#2757 ) When a phi candidate is marked as trivial, we are supposed to update all of its uses to reference the value that it is being folded to. However, the code that updates the uses misses `defs_at_block_`. So at a later time, the id for the trivial phi can reemerge. Fixes #2744	2019-07-23 17:59:30 -04:00
greg-lunarg	3855447d93	Bindless Instrument: Make init check depend solely on input_init_enabled (#2753 ) * Bindless Instrument: Make init check depend solely on input_init_enabled Previously was dependent on presence of descriptor_indexing extension in SPIR-V, but this missed some cases. Tests updated to reflect this new policy. * Fix format.	2019-07-22 13:51:39 -04:00
David Neto	76b75c40a1	Document opt::Instruction::InsertBefore methods (#2751 )	2019-07-18 11:37:28 -04:00
Steven Perron	aa9e8f5380	Revert "Do not inline OpKill Instructions (#2713 )" (#2749 ) This reverts commit `fe7cc9c612`.	2019-07-17 14:59:05 -04:00
Steven Perron	230c9e4371	Fix bug in merge return (#2734 ) * Fix bug in merge return The merge return pass seems to assume that the only new edges in the cfg are from return block to merge blocks. However, it is possible that a merge block branches to a merge block when it did not before. This change adds a new variable to track all of the new edges. It also renames some other variables and cleans up the code to make it a bit easier to read. Fixes #2702.	2019-07-16 09:11:22 -04:00
Jason Macnak	1fedf72e50	Allow ray tracing shaders in inst bindle check pass. (#2733 ) Adds the ray tracing stages (ray gen, intersection, any hit, closest hit, miss, and callable) to the allowed stages in pass instrumentation and add debug records for these stages to output the global launch id. More information for ray tracing shaders: - https://github.com/KhronosGroup/GLSL/blob/master/extensions/nv/GLSL_NV_ray_tracing.txt	2019-07-15 16:24:42 -04:00
greg-lunarg	92c41ff1e7	Remove Common Uniform Elimination Pass (#2731 ) Remove Common Uniform Elimination Pass Fixes #2520.	2019-07-12 11:02:10 -04:00
Steven Perron	5ce8cf781f	Change the order branches are simplified in dead branch elim (#2728 ) Dead branch elimination needs to know about the constructs that a block is contained in when determining what to do with its merge instruction. We currently fold branches in block as we see them, which is parent constructs before their children. This causes the struct cfg analysis to crash because it tries to get the parent construct for a block after the parent has been folded. This can be fixed by folding the branch of the children before the parents. Fixes #2667.	2019-07-10 14:59:44 -04:00
Thomas Roughton	cd153db8ed	Add --preserve-bindings and --preserve-spec-constants (#2693 ) Add optimizer options for preservation of spec constants and variables with binding decorations. They are to be preserved even if they are unused.	2019-07-10 14:12:19 -04:00
Steven Perron	86e45efe15	Handle decorations better in some optimizations (#2716 ) There are a couple spots where we are not looking at decorations when we should. 1. Value numbering is supposed to assign a different value number to ids if they have different decorations. However that is not being done for OpCopyObject and OpPhi. 1. Instruction simplification is propagating OpCopyObject instruction without checking for decorations. It should only do that if no decorations are being lost. Add a new function to the decoration manager to check if the decorations of one id are a subset of the decorations of another. Fixes #2715.	2019-07-10 11:37:16 -04:00
alan-baker	0c4feb643b	Remove extra semis (#2717 ) * Remove extra semi-colons * Update re2 dep	2019-07-08 15:07:36 -04:00
Steven Perron	37e8f79946	Perform merge return with single return in loop. (#2714 ) Inlining does not inline functions that have a single return that is in a loop. This is because the return cannot be replaced by a branch outside of the loop easily. Merge return knows how to rewrite the function so the return is replaced by a branch. Fixes #2038.	2019-07-04 14:14:49 -04:00
Steven Perron	fe7cc9c612	Do not inline OpKill Instructions (#2713 ) It is illegal to inline an OpKill instruction into a continue construct because the continue header will no longer dominate the backedge. This commit adds a check for this, and does not inline. If we still want to be able to inline a function that contains an OpKill, we can add a new pass that will wrap OpKill instructions into its own function with just the single instruction. I do not believe that this is a common case right now, so I will not do that yet. Fixes #2433.	2019-07-04 12:08:23 -04:00
Jason Macnak	e6e3e2ccc6	Update type for loaded builtin GlobalInvocationID in pass instrumentation (#2705 ) When working on descriptor indexing validation for compute shaders, the gl_GlobalInvocationID builtin was being loaded as uint which would cause compute shaders instrumented by the bindless check pass to have: %83 = OpLoad %uint %gl_GlobalInvocationID %84 = OpCompositeExtract %uint %83 0 %85 = OpCompositeExtract %uint %83 1 %86 = OpCompositeExtract %uint %83 2 which results in validation failures: error: line 127: Reached non-composite type while indexes still remain to be traversed. %84 = OpCompositeExtract %uint %83 0 for trying to extract a uint from a uint.	2019-06-28 09:46:16 -04:00
Ehsan	a132c9b640	Whitelist SPV_GOOGLE_user_type. (#2673 )	2019-06-19 12:18:13 -04:00
alan-baker	2090d7a2d2	Handle volatile memory semantics in upgrade (#2674 ) * If an atomic is decorated with volatile add the volatile bit to its memory semantics	2019-06-17 16:01:37 -04:00
Steven Perron	208d3132e6	Cast __LINE__ to size_t (#2661 ) Fixes #2648	2019-06-07 13:06:42 -04:00
greg-lunarg	43fb2403a6	Instrument: Fix code for version 2 output format. (#2655 ) Correct record size. Also bring version 2 tests up to version 1 equivalence.	2019-06-06 11:35:34 -04:00
David Neto	d01a3c3b4b	Optimizer: Handle array type with OpSpecConstantOp length (#2652 ) When it's an OpConstant or OpSpecConstant, then the literal values are compared. If the OpSpecConstant also has a SpecId decoration, then that's also compared. Otherwise, it's an OpSpecConstantOp and we only compare the ID of the OpSpecConstantOp instruction itself. Fixes #2649	2019-06-05 16:35:50 -04:00
Pierre Moreau	e7866de4b1	Linker: Better type comparison for OpTypeArray and OpTypeForwardPointer (#2580 ) * Types: Avoid comparing IDs in Type::IsSameImpl When linking, we end up with duplicate types for imported and exported types, that need to be removed. The current code would reject valid import/export pairs of symbols due to IDs mismatch, even if the types or constants behind those IDs were the same. Enabled remaining type_match_test Fixes #2442	2019-05-29 16:12:02 -04:00
Ryan Harrison	0125b28ed4	Add compact ids to WebGPU <-> Vulkan transformations (#2639 ) Fixes #2634	2019-05-29 12:58:37 -07:00
greg-lunarg	3d62cb8148	Instrument: Add version 2 of record formats (#2630 ) New version has additional word in stage-specific section. Also some changes in content for tessellation and compute shaders. Either version can be invoked at pass creation. This is done to ease integration and updating of validation layers. Version 1 is deprecated and eventually will go away. Also sneaking in fix to version 1 compute shaders.	2019-05-29 15:08:21 -04:00
Steven Perron	6c7db9c630	Handle nested breaks from switches. (#2624 ) * Handle nested breaks from switches. There was a recent decision made to allow branches to the merge node of a switch even if the switch is not the first enclosing construct. They can be generated by glslang from break statements in switches. Dead branch elimination seems to be the only optimization that will break because of this change, so I will update that optimization. The changes made are: - Track switches in structured cfg analysis. - In Dead branch elimination: - Look for nested breaks that will require a switch instruction. - Rewrite, but don't delete, switches that are required even if it could be replaced by an unconditional branch. - When looking for the first break, consider the merge of a switch as well. See #2612. * Fix variable names and comments. * Add tests for the struct cfg analysis and switches. * Fix typos in comments.	2019-05-27 16:28:14 -04:00
Ryan Harrison	4557d08584	Add in individual flags for Vulkan <-> WebGPU passes (#2615 ) Adds flags and/or documentation for individual transformation passes that had been missed in previous patches. Fixes #2574	2019-05-22 10:06:53 -07:00
Steven Perron	d9c00e1d2d	Add folding rules for OpQuantizeToF16 (#2614 ) Adding the folding rules for OpQuantizeToF16, and fixed some matching tests to check identify new lines.	2019-05-21 23:15:01 -07:00
Steven Perron	0982f0212e	Using the instruction folder to fold OpSpecConstantOp (#2598 ) In order to try to reduce code duplication and to be able to fold more cases, we want to use the instruction folder when folding an OpSpecConstantOp with constant operands. A couple other changes are needed to make this work. First, |GetDefiningInstruction| in the constant manager is able to handle |type_id| being logically equivalent to another type, so we updated the interface, and removed the assert. Some tests were also updated because we now generate better code because constants are not duplicated as much as before. No need for new tests. The functionality of the instruction folder is already tested. There are tests that check that the instruction folder is being used correctly for OpCompositeExtract and OpVectorShuffle in the existing test cases. Fixes #2585.	2019-05-21 12:45:00 -04:00
greg-lunarg	9dfd4b8358	Bindless Validation: Instrument descriptor-based loads and stores (#2583 ) Essentially, support UBOs and SSBOs, scalar and array (sized and unsized).	2019-05-15 19:43:23 -04:00
alan-baker	fc7b5d8c6a	Mem model spv 1.4 (#2565 ) * Update memory model support for SPIR-V 1.4 Fixes #2552 * Upgrade memory model now supports two memory access operands for OpCopyMemory* * in all cases the pass will first generate two operands by either adding them or copying * updates accounts for multiple operands * tests	2019-05-15 19:06:37 -04:00
Steven Perron	84503583c6	Handle id overflow in sroa better. (#2582 ) There is a case where sroa is not handling id overflow gracefully. It is handled and an error message is output when the ids overflow. Fixes https://crbug.com/961030.	2019-05-15 09:29:28 -04:00
alan-baker	2947e88f79	Update instrumentation passes to handle 1.4 interfaces (#2573 ) Fixes #2556 Added variables get added to entry point interfaces Add to input buffer too	2019-05-10 11:08:28 -04:00
greg-lunarg	06ce59b0b0	Instrument: Fix load type of pre-existing builtin (#2575 ) Builtins may be declared int, so load with its pointee type and cast to uint if needed.	2019-05-10 11:06:00 -04:00
alan-baker	87c4ef8a9c	Do not fold floating point if float controls used (#2569 ) Fixes #2558 * Mark floating point instructions as non-foldable if any SPV_KHR_float_controls capabilities are present * tests	2019-05-10 11:03:22 -04:00
Ryan Harrison	f6d9a17843	Add pass to fix some invalid unreachable blocks for WebGPU (#2563 ) Attempts to split up unreachable blocks that are used both as a merge-block and a continue-target. Fixes #2429	2019-05-09 12:56:10 -04:00
Diego Novillo	89fe836fe2	Fix clang-tidy warning about definition/declaration mismatch. (#2571 ) Fix clang-tidy warning about definition/declaration mismatch.	2019-05-09 00:15:08 -04:00
alan-baker	ea5e1b62e1	Update priv-to-local for SPIR-V 1.4 (#2567 ) Fixes #2555 * Fix a bug in validation where interfaces were considered non-unique between different entry points targeting the same function * added a test * Update private to local pass to remove localized private variables from entry point interfaces * added tests	2019-05-08 12:38:49 -04:00
alan-baker	b74d92a8c3	ADCE support for SPIR-V 1.4 entry points (#2561 ) Fixes #2551 * Add support for 1.4 entry point interface lists * only input and output variables are automatically live * can clean up interfaces after DCE * added tests * allow opt tests to specify a target environment	2019-05-07 14:52:22 -04:00
Steven Perron	6d04da22c6	Fix up type mismatches. (#2545 ) Add functionality to fix-storage-class so that it can fix up mismatched data types for pointers as well. Fixes bugs in when fixing up storage class. Move GenerateCopy to the Pass class to be reused. The spirv-opt change for #2535.	2019-05-02 09:31:46 -04:00
Steven Perron	32af42616a	Change implementation of post order CFG traversal (#2543 ) * Change implementation of post order CFG traversal It seems like the recursion is going very deep, and causing some problems in particular situations. I've reimplemented the CFG post order traversal to not use recursion. Fixes #2539.	2019-04-29 17:09:20 -04:00
Steven Perron	64faf6d9cb	Fix undefined bit shift in sroa. (#2532 ) There was a bit shift done on 32-bit values, but they should have been done on 64-bit values. This is fixed. At the same time, uses of size_t are replaced by uint64_t to ensure these values are 64-bit. A test case cannot be created because the code that was changed is not run at the moment since we do not split up vectors or matrices. I do not want to delete the code because I like to experiment with it every once in a while. Fixes #2528.	2019-04-26 12:52:23 -04:00
Ryan Harrison	b68af7ca8e	Add support for Private & Output to initializer decompose flag (#2537 ) Fixes #2388	2019-04-25 16:24:32 -04:00
Ryan Harrison	048dcd38ce	Implement WebGPU->Vulkan initializer conversion for 'Function' variables (#2513 ) WebGPU requires certain variables to be initialized, whereas there are known issues with using initializers in Vulkan. This PR is the first of three implementing a pass to decompose initialized variables into a variable declaration followed by a store. This has been broken up into multiple PRs, because there are 3 distinct cases that need to be handled, which require separate implementations. This first PR implements the basic infrastructure that is needed, and handling of Function storage class variables. Private and Output will be handled in future PRs. This is part of resolving #2388	2019-04-16 14:31:36 -04:00
Ryan Harrison	102e430a88	Add pass to legalize OpVectorShuffle for WebGPU (#2509 ) In WebGPU, the component operand 0xFFFFFFFF is forbidden, but in Vulkan it is used to indicate a value is undefined. When converting to WebGPU, 0xFFFFFFFF needs to be converted to a legal value, though the specific one does not matter, since it was used to indicate an undefined entry in the original code. Choosing to use 0, since the operands are required to be on [0, N-1], so 0 is guaranteed to always be valid. Fixes #2349	2019-04-12 12:14:23 -04:00
Steven Perron	9047de51cb	Accept OpBitCast in fix storage class. (#2505 ) Fixes http://crbug.com/950889.	2019-04-09 14:10:35 -04:00
Steven Perron	7ce37d66a8	Fix use of Logf to avoid format security warning (#2498 ) When -Wformat-security is enabled, we are getting an error. I do not claim to fully understand when the warning is triggered or not, but this one can be avoided by calling "Log" instead of "Logf" because the formatting string is not needed.	2019-04-08 11:06:48 -04:00
Ryan Harrison	0cb2d4079e	Add WebGPU->Vulkan and Vulkan->WebGPU flags in spirv-opt (#2496 ) Renames the existing flag '--webgpu-mode' to '--vulkan-to-webgpu' for the Vulkan->WebGPU operation, and adds a new flag '--webgpu-to-vulkan' for the WebGPU->Vulkan operation. Currently '--webgpu-to-vulkan' doesn't have any passes associated with it yet, but further patches will implement them. Fixes #2495	2019-04-05 15:12:26 -04:00
JasperNV	9766b22b33	spirv-opt: Behave a bit better in the face of unknown instructions (#2487 ) * opt/ir_loader: Don't silently drop unknown instructions on the floor Currently, if spirv-opt sees an instruction it does not know, it will silently ignore it and move to the next one. This changes it to be an error, as dropping it on the floor is likely to generate invalid SPIR-V output. * opt/optimizer: Complain a bit louder for unexpected binary changes If a binary change happens despite a pass saying that the binaries should be identical, this is indicative of a bug in the pass itself. This does not change behavior for it to be an error, but simply emits a warning in this case.	2019-04-05 13:36:42 -04:00
Steven Perron	3a0bc9e724	Add fix storage class code. (#2434 ) This pass tries to fix validation error due to a mismatch of storage classes in instructions. There is no guarantee that all such error will be fixed, and it is possible that in fixing these errors, it could lead to other errors. Fixes #2430.	2019-04-05 13:12:08 -04:00
alan-baker	236bdc0065	Change prioritization of unreachable merge and continue (#2460 ) Fixes #2452 Swaps priority of handling unreachable merge and continues so that the back-edge is retained in the case a block is both a loop continue and loop merge	2019-04-03 12:50:08 -04:00
Steven Perron	12e4a7b649	Handle variable pointer in some optimizations (#2490 ) * Check var pointer capability in ADCE. * Check var ptr capability for common uniform. * Check var ptr capability in access chain convert. Since we want this pass to run even if there are variable pointers on storage buffers, we had to remove asserts that assumed there were no variable pointers. The functions with the asserts will now work, it becomes the responsibility of the callers to deal with the output as appropriate. * Single block elimination and variable pointers. It seems like the code in local single block elimination is able to handle cases with variable pointers already. This is because the function `HasOnlySupportedRefs` ensures that variables that feed a variable pointer are not candidates. * Single store elimination and variable pointers. It seems like the code in local single store elimination is able to handle cases with variable pointers already. This is because the function `FindSingleStoreAndCheckUses` ensures that variables that feed a variable pointer are not candidates. * SSA rewriter and variable pointers. It seems like the code in the two passes that call the SSA rewriter are able to handle cases with variable pointers already. This is because the function `HasOnlySupportedRefs` ensures that variables that feed a variable pointer are not candidates. Fixes #2458.	2019-04-03 12:47:51 -04:00
Ryan Harrison	01964e325f	Add pass to generate needed initializers for WebGPU (#2481 ) Fixes #2387	2019-04-03 11:44:09 -04:00
alan-baker	4bd106b089	Handle dead infinite loops in DCE (#2471 ) Fixes #2456 * When eliminating a structured construct that has an unreachable merge, replace that unreachable terminator with an appropriate return * New tests	2019-04-03 10:30:12 -04:00
alan-baker	c9874e5090	Fix merge return in the face of breaks (#2466 ) Fixes #2453 * Enable addition of OpPhi instructions when the loop has multiple predecessors of the merge due to a break * This can result in some values no longer dominating their uses * Track return blocks in structured flow to produce OpPhis that have multiple undef and non-undef arguments * New tests to catch the bug * When a block is predicated, mark the new body as a return if the old block was already a return	2019-04-02 10:05:28 -04:00
alan-baker	0300a464a4	Maintain inst to block mapping in merge return (#2469 ) Fixes #2455 Properly maintains instruction to block mapping for newly created phi instructions in merge return	2019-04-01 13:14:10 -04:00
Paul Thomson	fcb8453104	reduce: fix loop to selection pass for loops with combined header/continue block (#2480 ) * Fix #2478. The fix is to just not try to simplify such loops. * Also added `BasicBlock::MergeBlockId()` and `BasicBlock::ContinueBlockId()`. * Some minor changes to `structured_loop_to_selection_reduction_opportunity.cpp`. * Added test.	2019-03-29 11:29:24 +00:00
alan-baker	2ff54e34ed	Handle function decls in Structured CFG analysis (#2474 ) Fixes #2451 * Structured cfg analysis now handles functions with no basic blocks * Added a test	2019-03-26 14:39:16 -04:00
alan-baker	42e6f1aa62	Add option to validate after each pass (#2462 ) * New command-line option to opt: --validate-after-all * Pass manager will validate after each pass it runs	2019-03-26 14:38:59 -04:00
greg-lunarg	e1a76269b6	Bindless Validation: Descriptor Initialization Check (#2419 ) If SPV_EXT_descriptor_indexing is enabled, add check that for a descriptor-based reference, the descriptor is initialized. Initialization data is stored in the debug input buffer, added to the length information already there. This feature must be separately enabled on the pass creation routine. NOTE: Currently just supports image references; buffer references are still TODO.	2019-03-19 09:53:43 -04:00
Ryan Harrison	e545522146	Add --strip-atomic-counter-memory (#2413 ) Adds an optimization pass to remove usages of AtomicCounterMemory bit. This bit is ignored in Vulkan environments and outright forbidden in WebGPU ones. Fixes #2242	2019-03-14 13:34:33 -04:00
Steven Perron	5186ffedb3	Remove duplicates from list of interface IDs in OpEntryPoint instruction (#2449 ) * Remove duplicates from list of interface IDs in OpEntryPoint instruction Fixes #2002.	2019-03-13 15:46:31 -04:00
Steven Perron	9d29c37ac5	Removing decorations when doing constant propagation. (#2444 ) In constant propagation, decorations are transferred from the original expression to the constant that will replace it. This can be wrong because there are no decorations that apply to constants. We choose to simply delete the decorations. Fixes #2441	2019-03-13 10:40:49 -04:00
Steven Perron	d800bbbac9	Handle back edges better in dead branch elim. (#2417 ) * Handle back edges better in dead branch elim. Loop header must have exactly one back edge. Sometimes the branch with the back edge can be folded. However, it should not be folded if it removes the back edge. The code to check this simply avoids folding the branch in the continue block. That needs to be changed to not fold the back edge, wherever it is. At the same time, the branch can be folded if it folds to a branch to the header, because the back edge will still exist. Fixes #2391.	2019-02-26 09:06:51 -05:00
Jeff Bolz	002ef361ca	Add validation for SPV_NV_cooperative_matrix (#2404 )	2019-02-25 17:43:11 -05:00
Steven Perron	fde69dcd80	Fix OpDot folding of half float vectors. (#2411 ) * Fix OpDot folding of half float vectors. The code that folds OpDot does not handle half floats correctly. After trying to multiply the first components, we get a nullptr because we don't fold half float values. This nullptr gets passed to the code that does the addition, and causes an assert. Fixes #2405.	2019-02-20 20:05:08 -05:00
Steven Perron	8eddde2e70	Don't change type of input and output var in dead member elim (#2412 ) The types of input and output variables must match for the pipeline. We cannot see the uses in all of the shader, so dead member elimination cannot safely change the type of input and output variables.	2019-02-20 18:59:41 -05:00
greg-lunarg	2f84b5de9a	Bindless: Fix computation of set and binding for runtime bounds check (#2384 ) Also fix test to use non-zero set and binding which will make error more obvious.	2019-02-19 11:43:30 -05:00
dan sinclair	528fea2b1e	Fixup unused variables (#2402 )	2019-02-19 11:11:04 -05:00
Steven Perron	78ac954c41	Mark type id of unknown instructions as fully used. (#2399 )	2019-02-15 10:49:49 -05:00
greg-lunarg	9540f2d981	Instrumentation: Fix instruction index when multiple functions (#2389 )	2019-02-15 09:49:18 -05:00
Steven Perron	1b0047f210	Add pass to remove dead members. (#2379 ) Add a pass that looks for members of structs whose values do not affect the output of the shader. Those members are then removed and just treated like padding in the struct.	2019-02-14 13:42:35 -05:00
alan-baker	354205b3dc	Don't merge unreachable blocks (#2375 ) Fixes #2374 * Block merging no longer merges unreachable blocks into their successors * added a test	2019-02-12 09:24:01 -05:00
Ryan Harrison	12b3d7e9d6	Add strip-debug to webgpu-mode passes (#2368 ) Fixes #2366	2019-02-08 14:26:17 -05:00
Alastair Donaldson	34c5ac614c	Fixes #2358 . Added to the reducer the ability to remove a function t… (#2361 ) * Fixes #2358. Added to the reducer the ability to remove a function that is not directly called. Factored out some code from the optimizer to help with this.	2019-02-08 16:20:29 +00:00
greg-lunarg	cf21146137	Expand bindless bounds checking to runtime-sized descriptor arrays (#2316 )	2019-02-07 14:00:36 -05:00
Ryan Harrison	0f4bf0720a	Add flatten-decorations flag to webgpu-mode flags (#2348 ) Fixes #2272	2019-02-05 14:07:53 -05:00
alan-baker	63e032f910	Remove unused lambda capture (#2350 )	2019-01-31 15:57:45 -05:00
Alastair Donaldson	3b6fee3dae	Fixes #2338 . Added functionality to remove OpPhi instructions (replacing their uses) when merging blocks (#2339 ) * Fixes #2338. Added check for phi node before merging blocks. * Added functionality to merge blocks A and B even when B starts with OpPhi instructions, by replacing uses of the OpPhi results with the definitions coming from A. Added some tests for this. * Fixed assertion.	2019-01-31 09:36:05 -05:00
Steven Perron	9ab1c0ddd0	Remove code sinking for -O. (#2340 ) Community feedback says it is not generally beneficial, so we will remove it from the standard optimization set.	2019-01-28 11:50:50 -05:00
Alastair Donaldson	3345fe6a9d	Extracted block merging functionality into its own utility file (#2325 ) * Extracted useful functionality from block merger and exposed it as stand-alone methods. * Separated these methods into a utility file.	2019-01-25 10:57:13 +00:00
greg-lunarg	a64c651e18	Fix Constants Analyses bug inserted by #2302 (#2306 ) Need to also remove Constants from the valid_analyses set when invalidated, otherwise Constants is not reinitialized before used.	2019-01-21 12:34:12 -05:00
Steven Perron	8df947d2d6	Handle instructions not in blocks in code sinking. (#2308 ) When looking at the uses of the result of an instruction, code sinking assumes that all uses are in a basic block. However, this is not true if there is a decoration or name for the result of that instruction. This commit checks for this. Fixes https://crbug.com/923243.	2019-01-21 12:09:56 -05:00
greg-lunarg	d14db341b8	Invalidate ConstantManager if TypeManager is invalidated... (#2302 ) ...as the ConstantManager contains pointers into the TypeManager.	2019-01-18 15:49:00 -05:00
Steven Perron	d6c067630d	Handle extract with no index in VDCE. (#2305 ) It is legal, but not generated by any SPIR-V producer: an OpCompositeExtract with no indexes. This is essentially just a copy of the object, so we treat them that way. We simply propagate the live variables of the result to the operand. Fixes https://crbug.com/919181.	2019-01-18 15:43:36 -05:00
Steven Perron	81fb2649bf	Handle access chain with no index in SROA. (#2304 ) It is legal, but not generated by any SPIR-V producer: an OpAccessChain with no indexes. This is essentially just a copy of the pointer. I have decided to treat it like an OpCopyObject. In CheckUses, we return that it is not okay. When looking at this I realized that we had code in GetUsedComponents that cannot be reached. If there is a use in an OpCopyObject then it will not call GetUsedComponents. I removed that dead code. Fixes https://crbug.com/918311.	2019-01-18 14:19:43 -05:00
Steven Perron	213e15e100	Fix overflow when negating INT_MIN. (#2293 ) Computing (-INT_MIN) is considered overflow, so we cannot fold it by actually performing the negation. Fixes https://crbug.com/917991	2019-01-17 17:01:55 -05:00
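The overflow described in this entry can be illustrated with a minimal C++ sketch. `FoldNegate` is a hypothetical helper written for illustration, not the function in the SPIRV-Tools folder, and declining to fold is only one possible way to avoid the overflow; the commit itself may handle the case differently.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not the SPIRV-Tools API).
// Writes -value to *result and returns true when the negation is
// representable; returns false for INT32_MIN, whose negation would
// overflow a signed 32-bit integer (undefined behaviour in C++).
bool FoldNegate(int32_t value, int32_t* result) {
  assert(result != nullptr);
  if (value == std::numeric_limits<int32_t>::min()) {
    return false;  // Do not evaluate -INT_MIN on the host.
  }
  *result = -value;
  return true;
}
```

A caller would treat a `false` return as "leave the instruction unfolded".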
Steven Perron	99c2c21cf4	Fix memory leak in unrolling. (#2301 ) During unrolling a new loop is created, but its ownership is not clear as it gets passed through the code. Changed something to unique_ptr to make that clearer. Fixes #2299. Fixing other memory leaks at the same time. Fixes #2296 Fixes #2297	2019-01-17 16:02:43 -05:00
Steven Perron	dd4157dcee	Sink (#2284 ) Add code sinking pass. It will move OpLoad and OpAccessChain instructions as close as possible to their uses. Part of #1611.	2019-01-17 15:56:36 -05:00
greg-lunarg	8d2d66f30c	Fix vertex instrumentation to use VertexIndex and InstanceIndex (#2294 ) ...instead of VertexId and InstanceId	2019-01-16 18:02:07 -05:00
Steven Perron	49b5b0abc6	Fix up bit shifts by 32. (#2292 ) In C++, a bit shift of the same size as the type is undefined, but it is defined in SPIR-V. When folding those cases, we have to be careful. We cannot simply do the shift in C++. Fixes https://crbug.com/917697.	2019-01-16 15:52:23 -05:00
greg-lunarg	83bfdc976a	Instrumentation: Add ArrayStride decoration to debug output buffer array (#2290 )	2019-01-16 10:01:40 -05:00
alan-baker	06c9dc07bd	Upgrade modf and frexp (#2266 ) Fixes #2138 * Modf and frexp are upgraded to use the struct version of the instruction and generate an explicit store whose flags can be upgraded separately * Fixed major bug where availability and visibility were reversed for non-copy memory instructions * Fixed bug where availability and visibility scope operands were reversed for copy memory * Upgraded all opt tests to use SPV_ENV_UNIVERSAL_1_3 * Upgrade tests moved into unified tests and removed standalone test	2019-01-07 12:36:38 -05:00
Steven Perron	241644a5a3	Have replace load size handle extract with no index. (#2261 ) Fixes https://crbug.com/917774	2019-01-03 13:02:10 -05:00
Steven Perron	9f36c8bb72	Handle CompositeInsert with no indices in VDCE (#2258 ) * Handle CompositeInsert with no indices in VDCE In the spec, there is nothing that forces an OpCompositeInsert to have an index, but VDCE assumes there is at least 1 in a couple places. This commit updates VDCE to handle these cases.	2019-01-02 14:00:04 -05:00
Steven Perron	bdc2ab9356	In LICM don't place code between merge instruction and branch. (#2252 ) Fixes #2210.	2018-12-20 18:33:52 -05:00
Steven Perron	c2013e248b	Make the constant and type manager analyses. (#2250 ) Currently it is impossible to invalidate the constant and type manager. However, the compact ids pass changes the ids for the types and constants, which makes them invalid. This change will make them analyses that have to be explicitly marked as preserved by passes. This will allow compact ids to invalidate them. Fixes #2220.	2018-12-20 18:00:05 +00:00
kholtnv	e49bd96f2c	Added additional changes for the new AccelerationStructureNV type. (#2218 ) * Added additional changes for the new AccelerationStructureNV type. * Added additional changes for the new AccelerationStructureNV type. Change tabs to space... * Added additional changes for the new accelerationStructureNV type -- add proper type name. Fix TypeManager.TypeStrings test: [----------] 29 tests from TypeManager [ RUN ] TypeManager.TypeStrings [ OK ] TypeManager.TypeStrings (7 ms)	2018-12-19 21:42:39 +00:00
Steven Perron	68b69e16aa	Update the continue target in merge return. (#2249 ) When we are predicating the continue target for a loop, it can no longer be the continue target because it will have a branch that exits the loop and is not the back edge. The continue target will have to be the target of that branch that is still in the loop. Fixes #2211.	2018-12-19 21:24:49 +00:00
Steven Perron	ac7feace90	Fix missing OpPhi after merge return. (#2248 ) The function `UpdatePhiNodes` was being called inconsistently. In one case, the cfg had already been updated to include the new edge, and in another place the cfg was not updated. This caused the function to miss flagging a block as needing new phi nodes. I picked that the cfg should not be updated before making the call. I documented it, and changed the call sites to match. Fixes #2207.	2018-12-19 18:17:42 +00:00
Steven Perron	9d04f82bef	Ensure SROA gets the correct pointer type. (#2247 ) We initially assumed that if the type manager returned the correct id for the pointee type, that we would get the correct pointer type back, but that is not true. See the unit test added with this commit. We need to fall back to the linear search any time we are looking for a pointer to a type that may not be unique. At the same time, SROA considered an OpName on a variable to be a use of the entire variable. That has been fixed. Fixes #2209.	2018-12-19 17:07:29 +00:00
Steven Perron	9e81c337f9	Place load after OpPhi instructions in block. (#2246 ) We currently place the load instructions at the start of the basic block that dominates all of the loads. If that basic block contains OpPhi instructions, then this will generate invalid code. We just need to search for a location that comes after all of the OpPhi instructions. Fixes #2204.	2018-12-19 15:18:22 +00:00
Steven Perron	5ec2d1a8cd	Don't fold specialized branches in loop unswitch (#2245 ) * Don't fold specialized branches in loop unswitch Folding branches can have a lot of special cases, and can be a little error prone. So I only want it in one place. That will be in dead branch elimination. I will change loop unswitching to set the branches that were being folded to have a constant condition. Then a subsequent pass of dead branch elimination will be able to remove the code. At the same time, I added a check that loop unswitching will not unswitch a branch with a constant condition. It is not useful to do it because dead branch elimination will simply fold the branch anyway. Also it avoids an infinite loop that would otherwise be introduced by my first change. Fixes #2203.	2018-12-19 04:40:30 +00:00
Ryan Harrison	47c08a79c4	Implement initial --webgpu-mode flag (#2217 ) Fixes #2166	2018-12-18 15:10:34 -05:00
Steven Perron	acd2781952	Handle id overflow in inlining. (#2196 ) Have inlining return Failure if the ids overflow. Part of #1841.	2018-12-18 19:34:03 +00:00
Steven Perron	1254335d13	Don't unswitch the latch block. (#2205 ) Loop unswitching is unswitching the conditional branch that creates the back-edge. In the version of the loop where the back-edge is not taken, there is no back-edge. This is what causes the validator to complain. The solution I will go with will be to not unswitch a condition with a back-edge. At this time we do not know if loop unswitching is used. We do not include it in the optimization sets provided, nor is it used in glslang's set. When there are opportunities and no breaks from the loop, the loop will either be a single iteration loop, or an infinite loop. There is no performance advantage to performing loop unswitching in either of those cases. If there is a break, maintaining structured control flow will be tricky. Unless we see a clear advantage to handling these cases, I would go with the safer simpler solution. Fixes #2201.	2018-12-18 18:15:00 +00:00
Steven Perron	ff07c6df83	SSA-rewriter: make sure phi entries are unique. (#2206 ) If there are multiple edges to a basic block, then the ssa rewriter will create OpPhi instructions with duplicate entries. This is invalid, and it is fixed in this commit. Fixes #2202.	2018-12-18 18:14:27 +00:00
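The uniqueness requirement in this entry can be sketched in C++ as follows. This is only an illustration under assumptions: `PhiEntry` and `UniquePhiEntries` are hypothetical names, and the real SSA rewriter works on SPIRV-Tools IR objects rather than plain id pairs. The sketch keeps one entry per predecessor block and drops later duplicates.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical representation: (value id, predecessor block id).
using PhiEntry = std::pair<uint32_t, uint32_t>;

// Returns the entries with at most one entry per predecessor block,
// keeping the first occurrence and preserving the original order.
std::vector<PhiEntry> UniquePhiEntries(const std::vector<PhiEntry>& entries) {
  std::set<uint32_t> seen_preds;
  std::vector<PhiEntry> result;
  for (const PhiEntry& entry : entries) {
    // insert().second is true only the first time a predecessor is seen.
    if (seen_preds.insert(entry.second).second) {
      result.push_back(entry);
    }
  }
  assert(result.size() <= entries.size());
  return result;
}
```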
Ryan Harrison	e0292c269d	Add --target-env flag to spirv-opt (#2216 ) Fixes #2199	2018-12-17 16:54:23 -05:00
Jeff Bolz	24328a0554	Recognize OpTypeAccelerationStructureNV as a type instruction (#2190 )	2018-12-11 19:03:55 -05:00
Steven Perron	e07dabc25f	Invalidate the decoration manager at the start of ADCE. (#2189 ) * Invalidate the decoration manager at the start of ADCE. If the decoration manager is kept live then the context will try to keep it up to date. ADCE deals with group decorations by changing the operands in |OpGroupDecorate| instructions directly without informing the decoration manager. This puts it in an invalid state, which will cause an error when the context tries to update it. To avoid this problem, we will invalidate the decoration manager upfront. At the same time, the decoration manager is now considered when checking the consistency of the decoration manager.	2018-12-10 13:24:33 -05:00
Steven Perron	0bc66a8ba9	Fix invalid OpPhi generated by merge-return. (#2172 ) * Fix invalid OpPhi generated by merge-return. When we create a new phi node for a value, say %10, we have to replace all of the uses of %10 that are no longer dominated by the def of %10 by the result id of the new phi. However, if the use is in a phi node, it is possible that the bb that contains the use is not dominated by either. This case needs to be handled differently. * Split loop headers before adding a new branch to them. In merge return, phi nodes in loop headers that are also merges for loops do not get updated correctly. Those cases do not fit in with our current analysis. Doing this will simplify the code by reducing the number of cases that have to be handled.	2018-12-07 14:10:30 -05:00
Steven Perron	2e4563d94f	Document in the context what happens with id overflow. (#2159 ) Added documentation to the ir context to indicate that TakeNextId() returns 0 when the max id is reached. TODOs were added to each call site so that we know where we have to start to handle this case. Handle id overflow in |SplitLoopHeader|. Handle id overflow in |GetOrCreatePreHeaderBlock|. Handle failure to create preheader in LICM. Part of https://github.com/KhronosGroup/SPIRV-Tools/issues/1841.	2018-12-06 09:07:00 -05:00
Steven Perron	17cba4695c	Remove undefined behaviour when folding shifts. (#2157 ) We currently simulate all shift operations when the two operands are constants. The problem is that if the shift amount is larger than 32, the result is undefined. I'm changing the folder to return 0 if the shift value is too high. That way, we will have defined behaviour. https://crbug.com/910937.	2018-12-04 10:04:02 -05:00
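The "return 0 if the shift value is too high" rule from this entry could look roughly like the C++ sketch below. `FoldShiftLeftLogical` is a hypothetical name, not the folder's actual function. The sketch guards with `>=` because in C++ a shift by exactly the bit width is undefined as well, not only shifts by more than it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not the SPIRV-Tools API).
// Folds a 32-bit logical left shift of two constants. The host shift is
// only evaluated when the amount is below the bit width; otherwise the
// folded result is defined to be 0.
uint32_t FoldShiftLeftLogical(uint32_t base, uint32_t shift) {
  const uint32_t kBitWidth = 32;
  if (shift >= kBitWidth) {
    return 0;  // Avoid undefined behaviour in the host compiler.
  }
  return base << shift;
}
```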
alan-baker	e510b1bac5	Update memory model (#1904 ) Upgrade to VulkanKHR memory model * Converts Logical GLSL450 memory model to Logical VulkanKHR * Adds extension and capability * Removes deprecated decorations and replaces them with appropriate flags on downstream instructions * Support for Workgroup upgrades * Support for copy memory * Adding support for image functions * Adding barrier upgrades and tests * Use QueueFamilyKHR scope instead of device	2018-11-30 14:15:51 -05:00
Steven Perron	2d2a512691	Don't inline recursive functions. (#2130 ) * Move ProcessFunction* function from pass to the context. There are a few functions that are used to traverse the call tree. They currently live in the Pass class, but they have nothing to do with a pass, and may be needed outside of a pass. They would be better in the ir context, or in a specific call tree class if we ever have a need for it. * Don't inline recursive functions. Inlining does not check if a function is recursive or not. This has been fine as long as the shader was a Vulkan shader, which forbids recursive functions. However, not all shaders are Vulkan, so either we limit inlining to Vulkan shaders or we teach it to look for recursive functions. I prefer to keep the passes as general as is reasonable. The change does not require much new code in inlining and gives a reason to refactor some other code. The changes are to add a member function to the Function class that checks if that function is recursive or not. Then this is used in inlining to not inline a function call if it calls a recursive function. * Add id to function analysis There are a few places that build a map from ids to Function whose result is that id. I decided to add an analysis to the context for this to reduce that code, and simplify some of the functions. * Add missing file.	2018-11-29 14:24:58 -05:00
Alastair Donaldson	3b13040cf9	New spirv-reduce reduction pass: operand to dominating id. (#2099 ) * Added a reduction pass to replace ids with ids of the same type that dominate them. * Introduce helper method for querying whether an operand type is an input id.	2018-11-26 17:06:21 -05:00
Daniel Koch	3b210d6a63	Add basic support for EXT_fragment_invocation_density (#2100 ) Whitelisting the extension in optimizations * copying what was done for NV_shading_rate	2018-11-23 10:21:19 -05:00
dan sinclair	15fdcf94d7	Add missing override to ProcessLinesPass	2018-11-19 19:24:48 -05:00
greg-lunarg	c37388f1ad	Add passes to propagate and eliminate redundant line instructions (#2027 ). (#2039 ) These are bookend passes designed to help preserve line information across passes which delete, move and clone instructions. The propagation pass attaches a debug line instruction to every instruction based on SPIR-V line propagation rules. It should be performed before optimization. The redundant line elimination pass eliminates all line instructions which match the previous line instruction. This pass should be performed at the end of optimization to reduce physical SPIR-V file size. Fixes #2027.	2018-11-15 14:06:17 -05:00
Greg Fischer	d4a10590b7	Fix Instruction::IsFloatingPointFoldingAllowed() Was looking for decorations based on opcode. Should use result_id.	2018-11-14 15:25:51 -07:00
Steven Perron	dc9d155d62	Fix folding of volatile store. (#2048 ) When looking for the Volatile mask on a store, the instruction folder accesses an out-of-bounds element. We fix that up. Fixes crbug.com/903530.	2018-11-14 13:52:18 -05:00
Steven Perron	a6150a3fe7	Don't assert on void function parameters. (#2047 ) The type manager in spirv-opt currently asserts if a function parameter has type void. It is not exactly clear from the spec that this is disallowed, even if it probably will be disallowed. In either case, asserts should be used to verify assumptions that will actually make a difference to the code. As far as the optimizer is concerned, a void parameter does not matter. I don't see the point of the assert. I'll just remove it and let the validator decide whether to accept it or not. No test was added because it is not clear that it is legal, and should not force us to accept it in the future unless the spec makes it clear that it is legal. Fixes crbug.com/903088.	2018-11-14 12:43:43 -05:00
Steven Perron	ec5574a9c6	Instruction::GetBaseAddress to handle OpPtrAccessChain (#2050 ) That function currently only handled OpPtrAccessChain if it was in the middle of the chain, but not at the start. Fixing that up. Fixes crbug.com/905271.	2018-11-14 12:42:25 -05:00
dan sinclair	f343a15764	Add missing overrides (#2041 )	2018-11-12 15:11:32 -05:00
greg-lunarg	1e9fc1aac1	Add base and core bindless validation instrumentation classes (#2014 ) * Add base and core bindless validation instrumentation classes * Fix formatting. * Few more formatting fixes * Fix build failure * More build fixes * Need to call non-const functions in order. Specifically, these are functions which call TakeNextId(). These need to be called in a specific order to guarantee that tests which do exact compares will work across all platforms. c++ pretty much does not guarantee order of evaluation of operands, so any such functions need to be called separately in individual statements to guarantee order. * More ordering. * And more ordering. * And more formatting. * Attempt to fix NDK build * Another attempt to address NDK build problem. * One more attempt at NDK build failure * Add instrument.hpp to BUILD.gn * Some name improvement in instrument.hpp * Change all types in instrument.hpp to int. * Improve documentation in instrument.hpp * Format fixes * Comment clean up in instrument.hpp * imageInst -> image_inst * Fix GetLabel() issue.	2018-11-08 13:54:54 -05:00
greg-lunarg	6721478ef1	Don't assume one return means function can be inlined. (#2018 ) (#2025 ) If there is only 1 return and it is in a loop, then the function cannot be inlined. Fix condition when inlined code needs one-trip loop wrapper. The dummy loop is needed when there is a return inside a selection construct. Even if there is only 1 return.	2018-11-08 09:11:20 -05:00
Jeff Bolz	c06a35b902	Rename PCH macro to spvtools_pch to avoid conflicts with other projects. Also add pch to test/opt. (#2034 )	2018-11-07 09:15:04 -05:00
Jeff Bolz	60fac96c6b	Enable precompiled headers for spirv-tools(-shared) and some unit tests (#2026 )	2018-11-06 09:26:23 -05:00
Steven Perron	f2cc71e5cb	Handle OpMemberDecorateStringGOOGLE in ADCE (#2029 ) Add missing case to the switch statement for the annotation instructions. See https://github.com/KhronosGroup/glslang/issues/1561.	2018-11-02 13:42:45 -04:00
Jeff Bolz	fb996dce75	Add /Zm flag as a workaround for VS2013 build (#2023 )	2018-10-31 07:59:43 -04:00
Steven Perron	6647884a13	Remove MemberDecorateStringGOOGLE during strip-reflect. (#2021 ) The strip-reflect pass is not removing the reflection decorations that are decorating members. With this commit, they will now be removed. Fixes #2019.	2018-10-30 16:17:35 -04:00
alelenv	1c1e749f0b	Add support for nv-raytracing-final (#2010 ) Add support for nv-raytracing (non-experimental)	2018-10-25 14:07:46 -04:00
Steven Perron	18fe6d59e5	Fix dead branch elim infinite loop. (#2009 ) When looking for a break from a selection construct, we do not realize that a jump to the continue target of a loop containing the selection is a break. This causes an infinite loop, or possibly other failures. Fixes #2004.	2018-10-24 09:10:30 -04:00
Steven Perron	0ba35798c3	Fix dead branch elim infinite loop. (#1997 ) When looking for a break from a selection construct, we do not need to look inside nested constructs. However, if a loop header has an unconditional branch, then we enter the loop. Entering the loop causes an infinite loop because we keep going through the loop. The solution is to look for a merge block, if one exists, even for blocks terminated by an OpBranch. Fixes #1979.	2018-10-22 13:59:20 -04:00
alan-baker	6e85d1a6fc	Fix restrictions in if conversion (#1998 ) Fixes #1991 * Improved identification of potential conditional branches * Pass changed to only work for shaders * added a test to catch the bug	2018-10-19 15:16:46 -04:00
Jeff Bolz	dd1e837e1c	Use per-configuration location for pch file (#1989 )	2018-10-19 14:58:26 -04:00
Steven Perron	715afb0cea	Add a nullptr check to array copy propagation. (#1987 ) We are missing a check for a nullptr that is causing things to fail. Added an extra test case, and fixed up others. This is the fix for https://github.com/Microsoft/DirectXShaderCompiler/issues/1598.	2018-10-19 12:53:40 -04:00
greg-lunarg	c4687889b7	Fix ADCE to treat OpUnreachable correctly during liveness analysis (#1984 ) ADCE liveness algorithm should treat OpUnreachable at least like other branch instructions. It was being treated as always live which was preventing useless structured constructs from being eliminated. OpUnreachable is generated by dead branch elimination which is now being required by merge return, so this fix should accompany that change.	2018-10-19 10:16:35 -04:00
Steven Perron	0e68bb3632	Only run merge-return on reachable functions. (#1983 ) We currently run merge-return on all functions, but dead-branch-elimination only runs on functions reachable from an entry point or exported function. Since dead-branch-elimination is needed for merge-return, they have to match. Fixes #1976.	2018-10-18 08:48:27 -04:00
greg-lunarg	ab45d69154	Fix ADCE liveness to include all enclosing control structures. (#1975 ) Was removing control structures which didn't have data dependency with enclosed live loop and otherwise did not contain live code. An example is a counting loop around a live loop. Fixes #1967.	2018-10-16 08:00:07 -04:00
Jeff Bolz	339d23275d	Enable precompiled headers for MSVC (#1969 )	2018-10-15 11:12:02 -04:00
greg-lunarg	e545564887	Consider atomics that load when analyzing live stores in ADCE (#1956 ) (#1958 ) Consider atomics that load when analyzing live stores in ADCE. Previously it asserted that the base of an OpImageTexelPointer should be an image. It is actually a pointer to an image, so IsValidBasePointer should suffice.	2018-10-12 08:46:35 -04:00
Steven Perron	82663f34c9	Check for unreachable blocks in merge-return. (#1966 ) Merge return assumes that the only unreachable blocks are those needed to keep the structured cfg valid. Even those must be essentially empty blocks. If this is not the case, we get unpredictable behaviour. This commit adds a check in merge return, and emits an error if it is not the case. Added a pass of dead branch elimination before merge return in both the performance and size passes. It is a precondition of merge return. Fixes #1962.	2018-10-10 15:18:15 -04:00
Steven Perron	4e266f775a	Fold divisions by 0. (#1963 ) The current implementation in the folder when seeing a division by zero is to assert. In the release build, the compiler will attempt to compute the value, which causes its own problems. The solution I will go with is to fold the division, and just give it the value of 0. The same goes for remainder and mod operations. Fixes #1961.	2018-10-10 11:17:26 -04:00
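A minimal C++ sketch of the folding rule described in this entry, restricted to unsigned 32-bit operands. The helper names `FoldUDiv` and `FoldUMod` are hypothetical and are not taken from the SPIRV-Tools source. A zero divisor is never passed to the host `/` or `%` operator; the folded value is 0 instead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (not the SPIRV-Tools API).
// Fold unsigned division: a zero divisor folds to 0 instead of being
// evaluated, which would be undefined behaviour in C++.
uint32_t FoldUDiv(uint32_t dividend, uint32_t divisor) {
  return divisor == 0 ? 0u : dividend / divisor;
}

// Fold unsigned remainder with the same zero-divisor rule.
uint32_t FoldUMod(uint32_t dividend, uint32_t divisor) {
  return divisor == 0 ? 0u : dividend % divisor;
}
```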
Steven Perron	497958d899	Removing HLSLCounterBuffer decorations when not needed. (#1954 ) The HlslCounterBufferGOOGLE that was introduced changed the OpDecorateId so that it can now reference an id other than the target. If that other id is used only in the decoration, then the definition of the id will be removed because decorations do not count as real uses. However, if the target of the decoration is still live the decoration will not be removed. This leaves a reference to an id that is not defined. There are two solutions to consider. The first is that if the decoration is kept, then the definition of the id should be kept live. Implementing this change would be involved because the way ADCE handles decorations will have to be reimplemented. The other solution is to remove the decoration if the id is otherwise dead. This works for this specific case. Also this is the more desirable behaviour in this case. The id will always be the id of a variable that belongs to a descriptor set. If that variable is not bound and we do not remove it, the driver will complain. I chose to implement the second solution. The first will be left to when a case for it comes up. Fixes https://github.com/KhronosGroup/SPIRV-Tools/issues/1885.	2018-10-05 08:23:09 -04:00
Alan Baker	3b5960174f	Don't scalarize spec constant sized arrays Fixes #1952 * Prevent scalarization of arrays that are sized by a specialization constant	2018-10-04 11:58:23 -04:00
Steven Perron	146eb3bdcf	Fix erroneous uses of the type manager in copy-prop-arrays. (#1942 ) There are a few spots where copy propagate arrays is trying to go from a Type to an id, but the type is not unique. When generating code this pass needs specific ids, otherwise we get type mismatches. However, the ambiguous types mean we can sometimes get the wrong type and generate invalid code. That code has been rewritten to not rely on the type manager, and just look at the instructions instead. I have opened https://github.com/KhronosGroup/SPIRV-Tools/issues/1939 to try to get a way to make this more robust.	2018-10-01 14:45:44 -04:00
Jeff Bolz	fe90a1d2dc	Enable /MP4 (parallel build across 4 cores for MSVC) for SPIRV-Tools/source[/opt] (#1930 )	2018-10-01 10:47:39 -04:00
Steven Perron	ddc705933d	Analyze uses for all instructions. (#1937 ) * Analyze uses for all instructions. The def-use manager needs to fill in the `inst_to_used_ids_` field for every instruction. This means we have to analyze the uses for every instruction, even if they do not have any uses. This mistake was not found earlier because there was a typo in the equality check for def-use managers. No new tests are needed. While looking into this I found redundant work in block merge. Cleaning that up at the same time. * Fix other transformations Aggressive dead code elimination did not update the OpGroupDecorate and the OpGroupMemberDecorate instructions properly when they are updated. That is fixed. Dead branch elimination did not analyze the OpUnreachable instructions that is would add. That is taken care of.	2018-09-28 14:39:06 -04:00
Steven Perron	32381e30ef	Handle decoration groups with no decorations. (#1921 ) In DecorationManager::RemoveDecorationsFrom, we do not remove the id from a decoration group if the group has no decorations. This causes problems because KillNamesAndDecorates is suppose to remove all references to the id, but in this case, there is still a reference. This is fixed by adding a special case. Also, there is the possibility of a double free because RemoveDecorationsFrom will delete the instructions defining \|id\| when \|id\| is a decoration group. Later, KillInst would later write to memory that has been deleted when trying to turn it into a Nop. To fix this, we will only remove the decorations that use \|id\| and not its definition in RemoveDecorationsFrom.	2018-09-28 14:16:04 -04:00
Steven Perron	80564a56ec	Keep analyses live in unrolling (#1929 ) Add code to keep the def-use manger and the inst-to-block mapping up-to-date. This means we do not have to rebuild them later. To make this work, we will have to have to find places to update the def-use manager. Updating the def-use manager is not straight forward because we are unrolling loops, and we have circular references. This forces one pass to register all of the definitions. A second one to analyze the uses. Also because there will be references to the new instructions in the old code, we want to register the definitions of the new instructions early, so we can update the uses of the older code as we go along. The inst-to-block mapping is not too difficult. It can be done as instructions are created. Fixes #1928.	2018-09-26 17:36:27 -04:00
Steven Perron	0e5fc7d75e	Allow 0 as argument to scalar replacement. (#1917 ) A limit of 0 for the scalar replacement option is used to indicate that there is no limit. The current implementation does not allow 0. This should be fixed.	2018-09-26 09:58:28 -04:00
Steven Perron	b85fb4a300	Get KillNameAndDecorates to handle group decorations. (#1919 ) It seems like the current implementation of KillNameAndDecorates does not handle group decorations correctly. The id being removed is not removed from the OpGroupDecorate instructions. Even worse, any decorations that apply to that group are removed. The solution is to use the function in the decoration manager that will remove the decorations and update the instructions instead of doing the work itself.	2018-09-25 12:57:44 -04:00
Chao Chen	6e2dab2ffd	Add support for Nvidia Turing extensions	2018-09-19 20:46:14 -04:00
Steven Perron	9fbcce4ca1	Add unrolling to the legalization passes (#1903 ) Adds unrolling to the legalization passes. After enabling unrolling I found a bug when there is a self-referencing phi node. That has been fixed. The test that checks that the order of optimizations is correct also needed to be updated.	2018-09-19 16:40:09 -04:00
Steven Perron	7075c49923	Add dummy loop in merge-return. (#1896 ) The current implementation of merge return can create bad, but correct, code. When it is not in a loop construct, it will insert a lot of extra branches around code. The potentially large number of branches is bad. At the same time, it can separate stores to variables from their uses, hiding the fact that the store dominates the load. This hurts the later analysis because the compiler thinks that multiple values can reach a load, when there is really only 1. This poorer analysis leads to missed optimizations. The solution is to create a dummy loop around the entire body of the function, then we can break from that loop with a single branch. Also, the only new merge nodes would be those at the end of loops, meaning that most analyses will not be hurt. Remove dead code for cases that are no longer possible. It seems like some drivers expect there to be an OpSelectionMerge before conditional branches, even if they are not strictly needed. So we add them.	2018-09-18 08:52:47 -04:00


1015 Commits