llvm-project

Author	SHA1	Message	Date
Kazu Hirata	53d3d1ab9a	[SLPVectorizer] Avoid two successive hash lookups on the same key (#107143 ) This patch replaces the find-try_emplace sequence with just one call to try_emplace, thereby avoiding two successive hash lookups on the same key. I am not using the "inserted" boolean from try_emplace to preserve the original behavior (that is, before PR 107123) that checks to see if the value is nullptr or not.	2024-09-03 14:51:00 -07:00
Kazu Hirata	126940bde3	[SLPVectorizer] Use DenseMap::{find,try_emplace} (NFC) (#107123 ) I'm planning to deprecate and eventually remove DenseMap::FindAndConstruct in favor of operator[].	2024-09-03 11:25:35 -07:00
Alexey Bataev	571c8c2c88	Revert "[SLP]Initial support for non-power-of-2 (but still whole register) number of elements in operands." This reverts commit a3ea90ffbbe47d9a1b3eab03324f09d7b8e0dcb3 after the post commit review. The number of parts is calculated incorrectly.	2024-09-03 11:02:07 -07:00
Alexey Bataev	884d7c137a	Revert "[SLP]Check for the whole vector vectorization in unique scalars analysis" This reverts commit b74e09cb20e6218320013b54c9ba2f5c069d44b9 after post-commit review. The number of parts is calculated incorrectly.	2024-09-03 11:02:07 -07:00
Jie Fu	20fa37bbfa	[Vectorize] Fix -Wunused-variable in SLPVectorizer.cpp (NFC) /llvm-project/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:10310:26: error: unused variable 'isExtractSubvectorMask' [-Werror,-Wunused-variable] bool isExtractSubvectorMask = ^ 1 error generated.	2024-09-03 21:46:42 +08:00
Han-Kuan Chen	ce8ec31298	[SLP][REVEC] Support more mask pattern usage in shufflevector. (#106212 )	2024-09-03 21:30:40 +08:00
Alexey Bataev	b74e09cb20	[SLP]Check for the whole vector vectorization in unique scalars analysis Need to check that the whole number of registers is attempted to vectorize before actually trying to build the node to avoid compiler crash.	2024-09-03 06:19:21 -07:00
Alexey Bataev	f381cd0699	[SLP]Fix PR107036: Check if the type of the user is sizable before requesting its size. Only some instructions should be considered as potentially reducing the size of the operands types, not all instructions should be considered. Fixes https://github.com/llvm/llvm-project/issues/107036	2024-09-03 05:29:59 -07:00
Alexey Bataev	6e68fa921b	[SLP]Fix PR106909: add a check for unsafe FP operations. NEON has non-IEEE compliant denormal flushing and the compiler should check if it is safe to vectorize instructions for NEON in non-fast math mode. Fixes https://github.com/llvm/llvm-project/issues/106909	2024-09-01 07:10:09 -07:00
tcwzxx	24a043a6ff	[SLP] Fix crash of shuffle poison (#106857 ) When the shuffle masks are `PoisonMaskElem`, there is no need to check the cost of `SK_ExtractSubvector`. It is free. Otherwise, it will cause the compiler to crash. Assertion `(Idx + EltsPerVector) <= alignTo(NumElts, EltsPerVector) && "SK_ExtractSubvector index out of range"' failed.	2024-09-01 20:24:09 +08:00
Alexey Bataev	a3ea90ffbb	[SLP]Initial support for non-power-of-2 (but still whole register) number of elements in operands. Patch adds basic support for non-power-of-2 number of elements in operands. The patch still requires that this number addresses whole registers. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/106449	2024-08-31 08:14:49 -07:00
Martin Storsjö	9e86d4f2ed	Revert "[SLP]Initial support for non-power-of-2 (but still whole register) number of elements in operands." This reverts commit 6ab07d71174982e5cb95420ee4df01347333c342. This commit caused failed asserts, see https://github.com/llvm/llvm-project/pull/106449.	2024-08-31 14:53:08 +03:00
Alexey Bataev	6ab07d7117	[SLP]Initial support for non-power-of-2 (but still whole register) number of elements in operands. Patch adds basic support for non-power-of-2 number of elements in operands. The patch still requires that this number addresses whole registers. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/106449	2024-08-30 14:50:34 -04:00
Alexey Bataev	8a267b7211	[SLP][NFC]Remove unused variable	2024-08-30 11:44:29 -07:00
Alexey Bataev	079746d2c0	[SLP]Better cost estimation for masked gather or "clustered" loads. After landing support for actual vectorization of the "clustered" loads, need better estimate the cost between the masked gather and clustered loads. This includes estimation of the address calculation and better estimation of the gathered loads. Also, this estimation now relies on SLPCostThreshold option, allowing modify the behavior of the compiler. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/105858	2024-08-30 14:27:51 -04:00
Alexey Bataev	6023d17e6b	[SLP][NFC]Add a function description, NFC.	2024-08-30 10:35:10 -07:00
Alexey Bataev	a4aa6bc8fc	[SLP]Fix PR106667: carefully look for operand nodes. If the operand node has the same scalars as one of the vectorized nodes, the compiler could miss this and incorrectly request minbitwidth data for the wrong node. It may lead to a compiler crash, because the vectorized node might have different minbw result. Fixes https://github.com/llvm/llvm-project/issues/106667	2024-08-30 10:19:27 -07:00
Simon Pilgrim	b719c92551	[SLP] findBestRootPair - fix incorrect argument name comment. NFC.	2024-08-30 14:45:48 +01:00
Simon Pilgrim	96ad495289	[SLP] vectorizeChainsInBlock - remove superfluous continue at the end of for loop. NFC.	2024-08-30 14:45:48 +01:00
Alexey Bataev	87a988e881	[SLP]Fix PR106655: Use FinalShuffle for alternate cast nodes. Need to use FinalShuffle function for all vectorized results to correctly produce vectorized value. Fixes https://github.com/llvm/llvm-project/issues/106655	2024-08-30 05:18:21 -07:00
Alexey Bataev	cc943a67d1	[SLP]Fix PR106626: try several attempts for lookup values, if not found. If the value is used in Scalar several times, the first attempt to find its position in the node (if ReuseShuffleIndices and ReorderIndices not empty) may fail. In this case need to find another copy of the same value and try again. Fixes https://github.com/llvm/llvm-project/issues/106626	2024-08-29 15:07:20 -07:00
Alexey Bataev	aeedab77b5	[SLP]Correctly decide if the non-power-of-2 number of stores can be vectorized. Need to consider the maximum type size in the graph before doing attempt for the vectorization of non-power-of-2 number of elements, which may be less than MinVF.	2024-08-29 12:40:31 -07:00
Philip Reames	4bc7c74240	[SLP] Extract isIdentityOrder to common routine [probably NFC] (#106582 ) This isn't quite just code motion as the four different versions we had of this routine differed in whether they ignored the "size" marker used to represent undef. I doubt this matters in practice, but it is a functional change. --------- Co-authored-by: Alexey Bataev <a.bataev@gmx.com>	2024-08-29 11:00:31 -07:00
Philip Reames	b5a1b45fe3	[SLP] Early return in getReorderingData [nfc]	2024-08-29 08:58:27 -07:00
Alexey Bataev	50515db57f	[SLP][NFC]Format canVectorizeLoads after previous NFC patches.	2024-08-29 04:31:13 -07:00
Alexey Bataev	fdf72c992b	[SLP]Fix a crash when requesting the cost for buildvector cmp nodes types. Need to use original cmp type i1 when estimating the cost for the buildvector node, not its operand types to prevent compiler crash upon TTI cost estimation.	2024-08-29 03:53:28 -07:00
tcwzxx	121fb2c2cc	[SLP] Fix the Vec lane overridden by the shuffle mask (#106341 ) Currently, SLP uses shuffle for the external user of `InsertElementInst` and iterates through the `InsertElementInst` chain to fill the mask with constant indices. However, it may override the original Vec lane. Using the original Vec lane is sufficient.	2024-08-29 11:18:26 +08:00
Alexey Bataev	ec360d6523	[SLP][NFC]Add getValueType function and use instead of complex scalar type analysis	2024-08-28 13:02:59 -07:00
Philip Reames	ee764a2603	[SLP] Remove -slp-optimize-identity-hor-reduction-ops option (#106238 ) This code has been unchanged for two years; let's simplify the code and remove configurability which makes the code harder to follow.	2024-08-27 13:21:57 -07:00
Philip Reames	6a74b0ee59	[SLP] Use early-return in canVectorizeLoads [nfc]	2024-08-27 12:30:15 -07:00
Philip Reames	ed03070eb3	[SLP] Support vectorizing 2^N-1 reductions (#106266 ) Build on the -slp-vectorize-non-power-of-2 experimental option, and support vectorizing reductions with 2^N-1 sized vector. Specifically, two related changes: 1) When searching for a profitable VL, start with the 2^N-1 reduction width. If cost model does not select that VL, return to power of two boundaries when halving the search VL. The latter is mostly for simplicity. 2) Reduce the minimum reduction width from 4 to 3 when supporting non-power of two vectors. This is required to support <3 x Ty> cases. One thing which isn't directly related to this change, but I want to note for clarity is that the non-power-of-two vectorization appears to be sensitive to operand order of reduction. I haven't yet fully figured out why, but I suspect this is non-power-of-two specific.	2024-08-27 12:27:03 -07:00
Alexey Bataev	2dbc6d4d4b	[SLP][NFC]Assert total number of scalar uses not less than number of scalar uses, NFC.	2024-08-27 09:57:08 -07:00
Philip Reames	d0a6434e86	[SLP] Reduce scope of variable using if clause [NFC] This particular variable name is shadowed by another lower in the function, so reducing its scope to its single use removes the shadowing and makes the code much less error prone.	2024-08-27 09:14:30 -07:00
Alexey Bataev	9b408961eb	[SLP][NFC]Use has_single_bit instead of isPowerOf2 functions, NFC.	2024-08-27 08:21:19 -07:00
Alexey Bataev	9b4a8f44ed	[SLP][NFC]Improve auto types, NFC.	2024-08-27 06:11:08 -07:00
Han-Kuan Chen	3d1c63ee2c	[SLP][REVEC] Expand getelementptr into vector form. (#103704 )	2024-08-27 16:11:52 +08:00
Alexey Bataev	e1d2251290	[SLP]Fix minbitwidth analysis for gather nodes with icmp users. If the node is not in MinBWs container and the user node is icmp node, the compiler should not check the type size of the user instruction, it is always 1 and is not good for actual bitwidth analysis. Fixes https://github.com/llvm/llvm-project/issues/105988	2024-08-26 11:40:44 -07:00
Alexey Bataev	b9d3da8c8d	[SLP]Fix PR105904: the root node might be a gather node without user for reductions. Before checking the user components of the gather/buildvector nodes, need to check if the node has users at all. Root nodes might not have users, if it is a node for the reduction. Fixes https://github.com/llvm/llvm-project/issues/105904	2024-08-26 07:09:05 -07:00
Alexey Bataev	dab19dac94	[SLP]Fix a crash for the strided nodes with reversed order and externally used pointer. If the strided node is reversed, need to check for the last instruction, not the first one in the list of scalars, when checking if the root pointer must be extracted.	2024-08-23 07:35:48 -07:00
Alexey Bataev	f3d2609af3	[SLP]Improve/fix subvectors in gather/buildvector nodes handling SLP vectorizer has an estimation for gather/buildvector nodes, which contain some scalar loads. SLP vectorizer performs pretty similar (but large in SLOCs) estimation, which is not always correct. Instead, this patch implements clustering analysis and actual node allocation with the full analysis for the vectorized clustered scalars (not only loads, but also some other instructions) with the correct cost estimation and vector insert instructions. Improves overall vectorization quality and simplifies analysis/estimations. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/104144	2024-08-23 06:45:22 -07:00
Vitaly Buka	96b3166602	Revert "[SLP]Improve/fix subvectors in gather/buildvector nodes handling" (#105780 ) with "[Vectorize] Fix warnings" It introduced compiler crashes, see #104144. This reverts commit 69332bb8995aef60d830406de12cb79a50390261 and 351f4a5593f1ef507708ec5eeca165b20add3340.	2024-08-22 22:21:20 -07:00
Vitaly Buka	351f4a5593	Reland "[Vectorize] Fix warnings"" (#105772 ) Revert was wrong, The bot is still broken https://lab.llvm.org/buildbot/#/builders/51/builds/2838 Reverts llvm/llvm-project#105771	2024-08-22 21:14:12 -07:00
Vitaly Buka	151945151c	Revert "[Vectorize] Fix warnings" (#105771 ) Triggers assert in compiler https://lab.llvm.org/buildbot/#/builders/51/builds/2836 ``` Instructions.cpp:1700: llvm::ShuffleVectorInst::ShuffleVectorInst(Value , Value , ArrayRef<int>, const Twine &, InsertPosition): Assertion `isValidOperands(V1, V2, Mask) && "Invalid shuffle vector instruction operands!"' failed. ``` This reverts commit a625435d3ef4c7bbfceb44498b9b5a2cbbed838b.	2024-08-22 20:03:08 -07:00
Kazu Hirata	a625435d3e	[Vectorize] Fix warnings This patch fixes warnings of the form: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:9300:23: error: loop variable '[E, Idx]' creates a copy from type 'const value_type' (aka 'const std::pair<const llvm::slpvectorizer::BoUpSLP::TreeEntry *, unsigned int>') [-Werror,-Wrange-loop-construct]	2024-08-22 08:52:01 -07:00
Alexey Bataev	69332bb899	[SLP]Improve/fix subvectors in gather/buildvector nodes handling SLP vectorizer has an estimation for gather/buildvector nodes, which contain some scalar loads. SLP vectorizer performs pretty similar (but large in SLOCs) estimation, which is not always correct. Instead, this patch implements clustering analysis and actual node allocation with the full analysis for the vectorized clustered scalars (not only loads, but also some other instructions) with the correct cost estimation and vector insert instructions. Improves overall vectorization quality and simplifies analysis/estimations. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/104144	2024-08-22 11:24:08 -04:00
Alexey Bataev	9402bb0908	[SLP]Do not count extractelement costs in unreachable/landing pad blocks. If the external user of the scalar to be extract is in unreachable/landing pad block, we can skip counting their cost. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/105667	2024-08-22 11:03:34 -04:00
Alexey Bataev	b765fdd997	[SLP]Try to keep scalars, used in phi nodes, if phi nodes from same block are vectorized. Before doing the vectorization of the PHI nodes, the compiler sorts them by the opcodes of the operands. If the scalar is replaced during the vectorization by extractelement, it breaks this sorting and prevent some further vectorization attempts. Patch tries to improve this by doing extra analysis of the scalars and tries to keep them, if it is found that this scalar is used in other (external) PHI node in the same block. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/103923	2024-08-21 15:23:47 -04:00
Alexey Bataev	e31252bf54	[SLP]Fix PR105120: fix the order of phi nodes vectorization. The operands of the phi nodes should be vectorized in the same order, in which they were created, otherwise the compiler may crash when trying to correctly build dependency for nodes with non-schedulable instructions for gather/buildvector nodes. Fixes https://github.com/llvm/llvm-project/issues/105120	2024-08-21 12:22:01 -07:00
tcwzxx	816068e462	[NFC][SLP] Remove useless code of the schedule (#104697 ) Currently, the SLP schedule has two containers of `ScheduleData`: `ExtraScheduleDataMap` and `ScheduleDataMap`. However, the `ScheduleData` in `ExtraScheduleDataMap` is only used to indicate whether the instruction is processed or not and does not participate in the schedule, which is useless. `ScheduleDataMap` is sufficient for this purpose. The `OpValue` member is used only in `ExtraScheduleDataMap`, which is also useless.	2024-08-19 20:16:51 +08:00
Alexey Bataev	4a0bbbcbcf	[SLP]Fix PR104637: do not create new nodes for fully overlapped non-schedulable nodes If the scalars do not require scheduling and were already vectorized, but in the different order, compiler still tries to create the new node. It may cause the compiler crash for the gathered operands. Instead need to consider such nodes as full overlap and just reshuffle vectorized node. Fixes https://github.com/llvm/llvm-project/issues/104637	2024-08-16 13:49:44 -07:00
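
Commit 53d3d1ab9a above describes replacing a find-then-try_emplace sequence with a single try_emplace call while still testing the stored value against nullptr instead of using the "inserted" flag. The sketch below illustrates that pattern only; it is not the code from SLPVectorizer.cpp. It uses std::unordered_map as a stand-in for DenseMap, and the Node type, the NodeMap alias, and both function names are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical payload type; stands in for whatever the real map stores.
struct Node {
  int Id;
};

// Assumption: std::unordered_map is used here in place of llvm::DenseMap.
using NodeMap = std::unordered_map<std::string, Node *>;

// Before: find() followed by try_emplace() hashes Key twice on a miss.
Node *lookupOrCreateTwoLookups(NodeMap &Map, const std::string &Key,
                               Node *Fresh) {
  assert(Fresh && "expected a non-null replacement value");
  auto It = Map.find(Key);
  if (It == Map.end())
    It = Map.try_emplace(Key, nullptr).first;
  if (!It->second)
    It->second = Fresh;
  return It->second;
}

// After: one try_emplace() call. The "inserted" bool is deliberately ignored;
// the stored pointer is checked against nullptr, so an existing entry that
// holds nullptr is filled in exactly as in the two-lookup version.
Node *lookupOrCreateOneLookup(NodeMap &Map, const std::string &Key,
                              Node *Fresh) {
  assert(Fresh && "expected a non-null replacement value");
  auto It = Map.try_emplace(Key, nullptr).first;
  if (!It->second)
    It->second = Fresh;
  return It->second;
}
```

Both functions return the same result for the same map state; the second only saves a hash lookup. Checking the pointer rather than the "inserted" flag matters when a key is already present with a null value: the flag would report "not inserted" and the null entry would be left alone.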

1886 Commits