llvm-project

Author	SHA1	Message	Date
Jie Fu	20fa37bbfa	[Vectorize] Fix -Wunused-variable in SLPVectorizer.cpp (NFC) /llvm-project/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:10310:26: error: unused variable 'isExtractSubvectorMask' [-Werror,-Wunused-variable] bool isExtractSubvectorMask = ^ 1 error generated.	2024-09-03 21:46:42 +08:00
Han-Kuan Chen	ce8ec31298	[SLP][REVEC] Support more mask pattern usage in shufflevector. (#106212)	2024-09-03 21:30:40 +08:00
Alexey Bataev	b74e09cb20	[SLP]Check for the whole vector vectorization in unique scalars analysis Need to check that the whole number of registers is attempted to vectorize before actually trying to build the node to avoid compiler crash.	2024-09-03 06:19:21 -07:00
Florian Hahn	dd94537b40	[LV] Update call widening decision when scalarizing calls. collectInstsToScalarize may decide to scalarize a call. If so, we have to update the widening decision for the call, otherwise the call won't be scalarized as expected during VPlan construction. This issue was uncovered by f82543d509.	2024-09-03 14:12:41 +01:00
Alexey Bataev	f381cd0699	[SLP]Fix PR107036: Check if the type of the user is sizable before requesting its size. Only some instructions should be considered as potentially reducing the size of the operands types, not all instructions should be considered. Fixes https://github.com/llvm/llvm-project/issues/107036	2024-09-03 05:29:59 -07:00
Florian Hahn	954ed05c10	[VPlan] Simplify MUL operands at recipe construction. This moves the logic to create simplified operands using SCEV to MUL recipe creation. This is needed to match the behavior of the legacy's cost model. TODOs are to extend to other opcodes and move to a transform. Note that this also restricts the number of SCEV simplifications we apply to more precisely match the cases handled by the legacy cost model. Fixes https://github.com/llvm/llvm-project/issues/107015.	2024-09-02 21:25:31 +01:00
Florian Hahn	50a02e7c68	[VPlan] Pass intrinsic inst to TTI in VPWidenCallRecipe::computeCost. Follow-up to 9ccf825, adjust computeCost to also pass IntrinsicInst to TTI if available, as there are multiple places in TTI which use the IntrinsicInst. Fixes https://github.com/llvm/llvm-project/issues/107016.	2024-09-02 20:47:37 +01:00
Florian Hahn	b0de7fa466	[VPlan] Use op from underlying call in computeCost if needed. This fixes a divergence between legacy and VPlan-based cost model, e.g. if one of the operands has a first-order recurrence phi as operand.	2024-09-02 14:00:10 +01:00
David Sherwood	dc6c3ba4c4	[NFC][IR] Add CreateCountTrailingZeroElems helper (#106711) The LoopIdiomVectorize pass already creates calls to the intrinsic experimental_cttz_elts, but PR #88385 will start calling this more too so I've created a helper for it.	2024-09-02 13:40:14 +01:00
Florian Hahn	654bb4e9f2	[LV] Don't consider branches leaving loop in collectValuesToIgnore. Branches exiting the loop will remain regardless, so don't consider them in collectValuesToIgnore. This fixes another divergence between legacy and VPlan-based cost model. Fixes https://github.com/llvm/llvm-project/issues/106780.	2024-09-01 20:35:36 +01:00
Florian Hahn	9ccf82543d	[VPlan] Implement VPWidenCallRecipe::computeCost (NFCI). (#106047) Implement cost computation for VPWidenCallRecipe. In some cases, targets use argument info to compute intrinsic costs. If all operands of the call are VPValues with an underlying IR value, use the IR values as arguments. PR: https://github.com/llvm/llvm-project/pull/106731	2024-09-01 16:26:08 +01:00
Alexey Bataev	6e68fa921b	[SLP]Fix PR106909: add a check for unsafe FP operations. NEON has non-IEEE compliant denormal flushing and the compiler should check if it is safe to vectorize instructions for NEON in non-fast math mode. Fixes https://github.com/llvm/llvm-project/issues/106909	2024-09-01 07:10:09 -07:00
tcwzxx	24a043a6ff	[SLP] Fix crash of shuffle poison (#106857) When the shuffle masks are `PoisonMaskElem`, there is no need to check the cost of `SK_ExtractSubvector`. It is free. Otherwise, it will cause the compiler to crash. Assertion `(Idx + EltsPerVector) <= alignTo(NumElts, EltsPerVector) && "SK_ExtractSubvector index out of range"' failed.	2024-09-01 20:24:09 +08:00
Alexey Bataev	a3ea90ffbb	[SLP]Initial support for non-power-of-2 (but still whole register) number of elements in operands. Patch adds basic support for non-power-of-2 number of elements in operands. The patch still requires that this number addresses whole registers. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/106449	2024-08-31 08:14:49 -07:00
Martin Storsjö	9e86d4f2ed	Revert "[SLP]Initial support for non-power-of-2 (but still whole register) number of elements in operands." This reverts commit 6ab07d71174982e5cb95420ee4df01347333c342. This commit caused failed asserts, see https://github.com/llvm/llvm-project/pull/106449.	2024-08-31 14:53:08 +03:00
Philip Reames	c53008de89	[VPlan] Manually jumpthread a bit of reduction code for readability [nfc]	2024-08-30 12:46:49 -07:00
Alexey Bataev	6ab07d7117	[SLP]Initial support for non-power-of-2 (but still whole register) number of elements in operands. Patch adds basic support for non-power-of-2 number of elements in operands. The patch still requires that this number addresses whole registers. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/106449	2024-08-30 14:50:34 -04:00
Alexey Bataev	8a267b7211	[SLP][NFC]Remove unused variable	2024-08-30 11:44:29 -07:00
Alexey Bataev	079746d2c0	[SLP]Better cost estimation for masked gather or "clustered" loads. After landing support for actual vectorization of the "clustered" loads, need to better estimate the cost between the masked gather and clustered loads. This includes estimation of the address calculation and better estimation of the gathered loads. Also, this estimation now relies on SLPCostThreshold option, allowing to modify the behavior of the compiler. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/105858	2024-08-30 14:27:51 -04:00
Alexey Bataev	6023d17e6b	[SLP][NFC]Add a function description, NFC.	2024-08-30 10:35:10 -07:00
Alexey Bataev	a4aa6bc8fc	[SLP]Fix PR106667: carefully look for operand nodes. If the operand node has the same scalars as one of the vectorized nodes, the compiler could miss this and incorrectly request minbitwidth data for the wrong node. It may lead to a compiler crash, because the vectorized node might have different minbw result. Fixes https://github.com/llvm/llvm-project/issues/106667	2024-08-30 10:19:27 -07:00
Simon Pilgrim	b719c92551	[SLP] findBestRootPair - fix incorrect argument name comment. NFC.	2024-08-30 14:45:48 +01:00
Simon Pilgrim	96ad495289	[SLP] vectorizeChainsInBlock - remove superfluous continue at the end of for loop. NFC.	2024-08-30 14:45:48 +01:00
Paul Walker	ce5620ba9a	[LLVM][VPlan] Pick more optimal initial value for VPBlend. (#104019) By choosing an initial value whose mask is only used by the blend we can remove the need for the mask entirely.	2024-08-30 13:30:23 +01:00
Alexey Bataev	87a988e881	[SLP]Fix PR106655: Use FinalShuffle for alternate cast nodes. Need to use FinalShuffle function for all vectorized results to correctly produce vectorized value. Fixes https://github.com/llvm/llvm-project/issues/106655	2024-08-30 05:18:21 -07:00
Florian Hahn	f0e34f3818	[VPlan] Don't skip optimizable truncs in planContainsAdditionalSimps. An optimizable cast can also be removed by VPlan simplifications. Remove the restriction from planContainsAdditionalSimplifications, as this causes it to miss relevant simplifications, triggering false positives for the cost decision verification. Also adds debug output for printing additional cost-precomputations. Fixes https://github.com/llvm/llvm-project/issues/106641.	2024-08-30 11:29:30 +01:00
Alexey Bataev	cc943a67d1	[SLP]Fix PR106626: try several attempts for lookup values, if not found. If the value is used in Scalar several times, the first attempt to find its position in the node (if ReuseShuffleIndices and ReorderIndices not empty) may fail. In this case need to find another copy of the same value and try again. Fixes https://github.com/llvm/llvm-project/issues/106626	2024-08-29 15:07:20 -07:00
Florian Hahn	c4906588ce	[VPlan] Use skipCostComputation when pre-computing induction costs. This ensures we skip any instructions identified to be ignored by the legacy cost model as well. Fixes a divergence between legacy and VPlan-based cost model. Fixes https://github.com/llvm/llvm-project/issues/106417.	2024-08-29 21:20:00 +01:00
Alexey Bataev	aeedab77b5	[SLP]Correctly decide if the non-power-of-2 number of stores can be vectorized. Need to consider the maximum type size in the graph before doing attempt for the vectorization of non-power-of-2 number of elements, which may be less than MinVF.	2024-08-29 12:40:31 -07:00
Philip Reames	4bc7c74240	[SLP] Extract isIdentityOrder to common routine [probably NFC] (#106582) This isn't quite just code motion as the four different versions we had of this routine differed in whether they ignored the "size" marker used to represent undef. I doubt this matters in practice, but it is a functional change. --------- Co-authored-by: Alexey Bataev <a.bataev@gmx.com>	2024-08-29 11:00:31 -07:00
Philip Reames	b5a1b45fe3	[SLP] Early return in getReorderingData [nfc]	2024-08-29 08:58:27 -07:00
Alexey Bataev	50515db57f	[SLP][NFC]Format canVectorizeLoads after previous NFC patches.	2024-08-29 04:31:13 -07:00
Florian Hahn	0a272d3a17	[LV] Use SCEV to analyze second operand for cost query. Improve operand analysis using SCEV for cost purposes. This fixes a divergence between legacy and VPlan-based cost-modeling after 533e6bbd0d34. Fixes https://github.com/llvm/llvm-project/issues/106248.	2024-08-29 12:08:27 +01:00
Alexey Bataev	fdf72c992b	[SLP]Fix a crash when requesting the cost for buildvector cmp nodes types. Need to use original cmp type i1 when estimating the cost for the buildvector node, not its operand types to prevent compiler crash upon TTI cost estimation.	2024-08-29 03:53:28 -07:00
tcwzxx	121fb2c2cc	[SLP] Fix the Vec lane overridden by the shuffle mask (#106341) Currently, SLP uses shuffle for the external user of `InsertElementInst` and iterates through the `InsertElementInst` chain to fill the mask with constant indices. However, it may override the original Vec lane. Using the original Vec lane is sufficient.	2024-08-29 11:18:26 +08:00
Michael Maitland	18c79ca360	[LV][NFC] Remove unnecessary space in comment	2024-08-28 14:23:44 -07:00
Alexey Bataev	ec360d6523	[SLP][NFC]Add getValueType function and use instead of complex scalar type analysis	2024-08-28 13:02:59 -07:00
Florian Hahn	4b84288f00	[VPlan] Pass live-ins used as exit values straight to live-out. Live-ins that are used as exit values don't need to be extracted, they can be passed through directly. This fixes a crash when trying to extract from a live-in. Fixes https://github.com/llvm/llvm-project/issues/106257.	2024-08-28 19:12:05 +01:00
Florian Hahn	16910a21ee	[VPlan] Move logic to create interleave groups to VPlanTransforms (NFC). This is a step towards further breaking up the rather large tryToBuildVPlanWithVPRecipes. It moves the logic to create interleave groups to VPlanTransforms.cpp, where similar replacements for other recipes are defined as well (e.g. EVL-based ones)	2024-08-28 15:56:09 +01:00
Florian Hahn	96e1320a9a	[VPlan] Move properlyDominates to VPDominatorTree (NFCI). This allows for easier re-use in additional places in the future. Also move code to VPlanAnalysis.cpp	2024-08-28 13:58:12 +01:00
Ramkumar Ramachandra	71ede8d831	VPlan: factor out VPlanUtils into its own file (NFC) (#105857)	2024-08-28 13:54:41 +01:00
Philip Reames	ee764a2603	[SLP] Remove -slp-optimize-identity-hor-reduction-ops option (#106238) This code has been unchanged for two years; let's simplify the code and remove configurability which makes the code harder to follow.	2024-08-27 13:21:57 -07:00
Philip Reames	6a74b0ee59	[SLP] Use early-return in canVectorizeLoads [nfc]	2024-08-27 12:30:15 -07:00
Philip Reames	ed03070eb3	[SLP] Support vectorizing 2^N-1 reductions (#106266) Build on the -slp-vectorize-non-power-of-2 experimental option, and support vectorizing reductions with 2^N-1 sized vector. Specifically, two related changes: 1) When searching for a profitable VL, start with the 2^N-1 reduction width. If cost model does not select that VL, return to power of two boundaries when halving the search VL. The latter is mostly for simplicity. 2) Reduce the minimum reduction width from 4 to 3 when supporting non-power of two vectors. This is required to support <3 x Ty> cases. One thing which isn't directly related to this change, but I want to note for clarity is that the non-power-of-two vectorization appears to be sensitive to operand order of reduction. I haven't yet fully figured out why, but I suspect this is non-power-of-two specific.	2024-08-27 12:27:03 -07:00
Alexey Bataev	2dbc6d4d4b	[SLP][NFC]Assert total number of scalar uses not less than number of scalar uses, NFC.	2024-08-27 09:57:08 -07:00
Danial Klimkin	9671ed1afc	Revert "LSV: forbid load-cycles when vectorizing; fix bug (#104815)" (#106245) This reverts commit c46b41aaa6eaa787f808738d14c61a2f8b6d839f. Multiple tests time out, either due to performance hit (see comment) or a cycle.	2024-08-27 18:45:22 +02:00
Philip Reames	d0a6434e86	[SLP] Reduce scope of variable using if clause [NFC] This particular variable name is shadowed by another lower in the function, so reducing its scope to its single use removes the shadowing and makes the code much less error prone.	2024-08-27 09:14:30 -07:00
Alexey Bataev	9b408961eb	[SLP][NFC]Use has_single_bit instead of isPowerOf2 functions, NFC.	2024-08-27 08:21:19 -07:00
Alexey Bataev	9b4a8f44ed	[SLP][NFC]Improve auto types, NFC.	2024-08-27 06:11:08 -07:00
Han-Kuan Chen	3d1c63ee2c	[SLP][REVEC] Expand getelementptr into vector form. (#103704)	2024-08-27 16:11:52 +08:00

4829 Commits