llvm-project

Author	SHA1	Message	Date
Valery Dmitriev	c80b503496	[SLP] Improve gather tree nodes matching when users are PHIs. (#69392)	2023-10-18 09:05:11 -07:00
Valery Dmitriev	9aa571f080	[SLP][NFC] Try to cleanup and better document some isGatherShuffledEntry code. (#69384) Outline some often used common code to dedicated variables in order to make code compact. Rename variables to more accurately reflect their purpose. Apply const qualifier where appropriate. Fix and add bit more explanation comment for the existing code.	2023-10-17 14:59:36 -07:00
Florian Hahn	fd31112634	[VPlan] Insert Trunc/Exts for reductions directly in VPlan. Update the code to create Trunc/Ext recipes directly in adjustRecipesForReductions instead of fixing it up later in fixReductions. This explicitly models the required conversions and also makes sure they are generated at the right place (instead of after the exit condition), hence the changes in a few tests.	2023-10-17 19:17:40 +01:00
Alexey Bataev	66775f8ccd	[SLP]Fix PR69196: Instruction does not dominate all uses During emission of the postponed gathers, need to insert them before user instruction to avoid use before definition crash.	2023-10-17 10:43:59 -07:00
Alexey Bataev	119b0f3895	Revert "[SLP]Fix PR69196: Instruction does not dominate all uses" This reverts commit 8e2b2c4181506efc5b9321c203dd107bbd63392b to fix a crash reported in https://lab.llvm.org/buildbot/#/builders/230/builds/19993.	2023-10-16 13:29:17 -07:00
Alexey Bataev	8e2b2c4181	[SLP]Fix PR69196: Instruction does not dominate all uses During emission of the postponed gathers, need to insert them before user instruction to avoid use before definition crash.	2023-10-16 12:57:18 -07:00
Yingwei Zheng	4718b4011f	[LV] Invalidate disposition of SCEV values after loop vectorization (#69230) This PR fixes the assertion failure of `SE.verify()` after loop vectorization.	2023-10-17 03:49:39 +08:00
Florian Hahn	f7a8a78cb7	[VPlan] Also print operands of canonical IV (NFC). Also print the operands of VPCanonicalIVPHIRecipe. That was missed earlier.	2023-10-16 20:28:23 +01:00
Nikita Popov	d4300154b6	Revert "[ValueTracking] Remove by-ref computeKnownBits() overloads (NFC)" This reverts commit b5743d4798b250506965e07ebab806a3c2d767cc. This causes some minor compile-time impact. Revert for now, better to do the change more gradually.	2023-10-16 14:04:09 +02:00
Nikita Popov	b5743d4798	[ValueTracking] Remove by-ref computeKnownBits() overloads (NFC) Remove the old overloads that accept KnownBits by reference, in favor of those that return it by value.	2023-10-16 13:00:31 +02:00
Florian Hahn	b1115f8cce	[LV] Use LatchVPBB directly instead of going through region (NFC). Split off from D158333.	2023-10-13 20:08:31 +01:00
Fangrui Song	2d854dd3e7	Move global namespace cl::opt inside llvm:: or internalize them	2023-10-10 19:58:03 -07:00
Alexey Bataev	c2ae16f6a7	[VectorCombine]Fix a crash during long vector analysis. If the analysis of the single vector requested, need to use original type to avoid crash	2023-10-09 14:22:37 -07:00
Rin	df8e0d057d	[AArch64][LoopVectorize] Use upper bound trip count instead of the constant TC when choosing max VF (#67697) This patch is based off of https://github.com/llvm/llvm-project/pull/67543. We are currently using the exact trip count to make decisions regarding the maximum VF. We can instead use the upper bound TC, which will be the same as the constant trip count when that is known.	2023-10-09 16:26:19 +01:00
Simon Pilgrim	bea3967271	[VectorCombine] Rename foldBitcastShuf -> foldBitcastShuffle. NFC. Consistently use the term "Shuffle" in all vector combiner folds.	2023-10-09 11:28:50 +01:00
Graham Hunter	3273ea40e5	[LV] Cache call vectorization decisions (#66521) LoopVectorize currently queries VFDatabase repeatedly for each CI, and each query to VFDatabase rescans all vector variants. This patch instead makes a decision for each call once per VF based on the cost of scalarization vs. function call to a vector variant of the function vs. a vector intrinsic, then caches the decision along with relevant info for use in planning and plan execution.	2023-10-09 11:23:19 +01:00
Florian Hahn	dae91f5dbc	[VPlan] Avoid VPTransformState::reset in fixReduction (NFCI). There's no need to repeatedly query and reset the state for LoopExitInstDef. This removes one of the last uses of VPTransformState::reset, by use a vector to store and update the results. No other code should try to retrieve the result from State outside the fixReductionCall.	2023-10-07 23:24:24 +01:00
Simon Pilgrim	94795a37e8	[VectorCombine] foldBitcastShuf - add support for length changing shuffles Allow length changing shuffle masks in the "bitcast (shuf V, MaskC) --> shuf (bitcast V), MaskC'" fold. It also exposes some poor shuffle mask detection for extract/insert subvector cases inside improveShuffleKindFromMask First stage towards addressing Issue #67803	2023-10-06 11:59:51 +01:00
Simon Pilgrim	d3e66a88c2	[VectorCombine] foldBitcastShuf - compute scale factors using shuffle type element size instead of element count. NFCI. First step towards supporting length changing shuffles	2023-10-05 18:58:36 +01:00
Alexey Bataev	e22818d5c9	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-05 06:17:07 -07:00
Rin	d3e4702c0f	[AArch64] [LoopVectorize] Use either fixed-width or scalable VF when tail-folding (#67543) Since the getMaximisedVFForTarget function is called twice, once for fixed-width and once for scalable, it adds no value to always return a fixed-width VF. Instead, when we are tail-folding, we can use either fixed-width or scalable vectors.	2023-10-05 10:24:30 +01:00
Arthur Eubanks	07389535a7	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit b186f1f68be11630355afb0c08b80374a6d31782. Causes crashes, see https://reviews.llvm.org/D158449.	2023-10-04 14:37:16 -07:00
Alexey Bataev	b186f1f68b	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-04 07:53:30 -07:00
Alexey Bataev	1129dec778	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit 6f43d28f3452b3ef598bc12b761cfc2dbd0f34c9 to fix a crash reported in https://reviews.llvm.org/D158449.	2023-10-03 13:02:16 -07:00
Florian Hahn	07e715953b	[VPlan] Check users of LoopExitInstDef in VPlan directly. (NFCI) Instead of walking the IR def use chains of the generated code, adjust the generated VPInstruction if needed and check its users in VPlan.	2023-10-03 20:42:15 +01:00
Alexey Bataev	6f43d28f34	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-10-03 10:26:11 -07:00
Alexey Bataev	d0d608383e	[SLP][NFC]Fix assert message, NFC.	2023-10-02 13:38:54 -07:00
Alexey Bataev	ebcb5d59fc	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit 9f5960e004ff54082ccfa9396522e07358f5b66b to fix buildbots reported here https://lab.llvm.org/buildbot/#/builders/230/builds/19412.	2023-09-29 15:03:46 -07:00
Alexey Bataev	9f5960e004	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-09-29 13:16:03 -07:00
Alexey Bataev	019aee8327	[SLP]Improve costs in computeExtractCost() to avoid crash after D158449. Need to consider the length of the original vector for extractelements, not the length, matched number of the scalars. It fixes 2 issues: 1) improves cost estimation; 2) Fixes crashes after D158449.	2023-09-29 07:48:02 -07:00
Hans Wennborg	06f3b0ed43	Revert "[SLP]Improve costs in computeExtractCost() to avoid crash after D158449." This caused asserts: Assertion failed: NumElts > 1 && "Expected at least 2-element fixed length vector(s).", file C:\b\s\w\ir\cache\builder\src\third_party\llvm\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp, line 7096; see comment on `59a67ea35d`. Original commit message: "Need to consider the length of the original vector for extractelements, not the length, matched number of the scalars. It fixes 2 issues: 1) improves cost estimation; 2) Fixes crashes after D158449." This reverts commit 59a67ea35d608480257fc64ec3e5106ef50de740.	2023-09-29 10:42:19 +02:00
Alexey Bataev	3204f88a8b	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit c88c281cf1ac1a01c55231b93826d7c8ae83985b to fix the crash revealed by https://lab.llvm.org/buildbot/#/builders/230/builds/19353.	2023-09-28 11:57:32 -07:00
Alexey Bataev	c88c281cf1	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-09-28 11:03:21 -07:00
Alexey Bataev	59a67ea35d	[SLP]Improve costs in computeExtractCost() to avoid crash after D158449. Need to consider the length of the original vector for extractelements, not the length, matched number of the scalars. It fixes 2 issues: 1) improves cost estimation; 2) Fixes crashes after D158449.	2023-09-28 09:36:08 -07:00
Nikita Popov	3b82397965	[VectorCombine] Check for non-byte-sized element type We should check whether the element type is non-byte-sized, not the vector type. For types like <32 x i1> the whole type is byte-sized, but the individual elements (that we scalarize to) are not. Fixes https://github.com/llvm/llvm-project/issues/67060.	2023-09-28 14:18:30 +02:00
Mikael Holmen	9cecee97a0	[VPlan] Silence gcc Wparentheses warning [NFC] Without the fix gcc warns about ../lib/Transforms/Vectorize/VPlanTransforms.cpp:968:42: warning: suggest parentheses around '&&' within '||' [-Wparentheses] on lines 968-970: UseActiveLaneMaskForControlFlow && "DataAndControlFlowWithoutRuntimeCheck implies " "UseActiveLaneMaskForControlFlow");	2023-09-28 12:04:26 +02:00
Alexey Bataev	9eeb0293e2	[SLP]Cleanup MultiNodeScalars when tree deleted. Need to clear MultiNodeScalars map to avoid compiler crash when tree is deleted.	2023-09-27 07:48:53 -07:00
Alexey Bataev	ea7f43ec14	[SLP]Do not gather node, if the instruction, that does not require scheduling, is previously vectorized. If the main node was vectorized already, but does not require scheduling, we still can try to vectorize it in this new node instead of gathering.	2023-09-26 11:57:35 -07:00
Ben Shi	ea0ee55c02	[VectorCombine] Enable transform 'scalarizeLoadExtract' for non constant indexes (#65445) Enable the transform if a non constant index is guaranteed to be safe via a UREM/AND.	2023-09-26 09:41:53 +08:00
alexfh	5d86176f48	Revert "[SLP]Do not gather node, if the instruction, that does not require" (#67386) This reverts commit 77053421228edd12a3ba73d4eebd970fcdd3b2c0, which introduces a clang crash (test case: https://gcc.godbolt.org/z/zn5n4KWPY).	2023-09-26 02:45:11 +02:00
Florian Hahn	97687b7aea	[VPlan] Add active-lane-mask as VPlan-to-VPlan transformation. This patch updates the mask creation code to always create compares of the form (ICMP_ULE, wide canonical IV, backedge-taken-count) up front when tail folding and introduce active-lane-mask as later transformation. This effectively makes (ICMP_ULE, wide canonical IV, backedge-taken-count) the canonical form for tail-folding early on. Introducing more specific active-lane-mask recipes is treated as a VPlan-to-VPlan optimization. This has the advantage of keeping the logic (and complexity) of introducing active-lane-mask recipes in a single place, instead of spreading the logic out across multiple functions. It also simplifies initial VPlan construction and enables treating introducing EVL as similar optimization. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D158779	2023-09-25 13:34:45 +01:00
Florian Hahn	1a9e45080f	[VPBuilder] Add setInsertPoint version taking a recipe directly (NFC). This helps to slightly simplify code when a recipe can be obtained easily. Suggested in D158779.	2023-09-25 12:17:53 +01:00
Florian Hahn	541e88dbc2	[VPlan] Simplify HCFG construction of region blocks (NFC). Update the logic to update the successors and predecessors of region blocks directly. This adds special handling for header and latch blocks in place, and removes the separate loop to fix up the region blocks. Helps to simplify D158333. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D159136	2023-09-24 21:53:35 +01:00
Kazu Hirata	e7497570d8	[Vectorize] Use range-based for loops (NFC)	2023-09-22 17:43:06 -07:00
Youngsuk Kim	e5026f0179	[llvm] Remove uses of Type::getPointerTo() (NFC) Partial progress towards removing in-tree uses of `getPointerTo()`, by employing the following options: * Drop the call entirely if the sole purpose of it is to support a no-op bitcast (remove the no-op bitcast as well). * Replace with `PointerType::get()`/`PointerType::getUnqual()` This is a NFC cleanup effort. Reviewed By: barannikov88 Differential Revision: https://reviews.llvm.org/D155232	2023-09-22 19:44:38 -04:00
Florian Hahn	d9f83169d1	[VPlan] Ensure start value of phis is the first op at construction (NFC) Header phi recipes have the start value (incoming from outside the loop) as first operand. This wasn't the case for VPWidenPHIRecipes. Instead the start value was picked during execute() by doing extra work. To be in line with other recipes, ensure the operand order is as expected during construction.	2023-09-22 21:24:15 +01:00
Alexey Bataev	7ff83ed6cd	[SLP]Do not try to reorder possible strided nodes. Reordering of possible strided nodes in bottom-to-top order requires top-to-bottom reordering of the operands of such nodes, which is not supported. Need to disable reordering of strided operands to avoid compiler crashes.	2023-09-22 07:55:43 -07:00
David Spickett	8f548610a6	Revert "[SLP]Use source vector type as the original vector type instead of" This reverts commit 9a99944df068b29b905cd8ba9a2132cc6382b6fb. Due to test suite failures on all our SVE buildbots e.g.: https://lab.llvm.org/buildbot/#/builders/184/builds/7375 clang: ../llvm/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:3565: InstructionCost llvm::AArch64TTIImpl::getShuffleCost(TTI::ShuffleKind, VectorType *, ArrayRef<int>, TTI::TargetCostKind, int, VectorType *, ArrayRef<const Value *>): Assertion `Mask.size() == TpNumElts && "Expected Mask and Tp size to match!"' failed.	2023-09-22 07:52:16 +00:00
Alexey Bataev	9a99944df0	[SLP]Use source vector type as the original vector type instead of artificial for better cost estimation. Need to use original source vector type, not the one artificially constructed, based on the number of vectorized scalars. It affect the cost significantly.	2023-09-21 11:34:02 -07:00
Alexey Bataev	3dc28e6c6a	[SLp]Fix a crash because of wrong deps between vectorized nodes. Need to change the order of the nodes vectorization to avoid too early insertion of the first node.	2023-09-21 10:19:11 -07:00


4020 Commits