llvm-project

Author	SHA1	Message	Date
Han-Kuan Chen	07d496538f	[SLP] Replace MainOp and AltOp in TreeEntry with InstructionsState. (#122443 ) Add TreeEntry::hasState. Add assert for getTreeEntry. Remove the OpValue parameter from the canReuseExtract function. Remove the Opcode parameter from the ComputeMaxBitWidth lambda function.	2025-01-18 10:23:20 +08:00
George Chaltas	b1bf95c081	ReduxWidth check for 0 (#123257 ) Added assert to check for underflow of ReduxWidth modified: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp Source code analysis flagged the operation (ReduxWidth - 1) as potential underflow, since ReduxWidth is unsigned. Realize that this should never happen if everything is working right, but added an assert to check for it just in case.	2025-01-17 15:56:58 -05:00
Alexey Bataev	fec503d1a3	[SLP][NFC]Add safe createExtractVector and use instead Builder.CreateExtractVector	2025-01-17 11:46:46 -08:00
Ramkumar Ramachandra	0fe8469e08	SLPVectorizer: strip bad FIXME (NFC) (#122888 ) Follow up on 4a0d53a (PatternMatch: migrate to CmpPredicate) to get rid of the FIXME it introduced in SLPVectorizer: the FIXME is bad, and we'd get no testable impact by using CmpPredicate::getMatching here.	2025-01-14 11:27:55 +00:00
Alexey Bataev	066b88879a	[SLP]Correctly set vector operand for extracts with poisons When extracts are vectorized and it has some poison values instead of instructions, need to correctly set the vectorized operand not as poison, but as a main vector operand of the main extract instruction. Fixes #122583	2025-01-13 10:57:07 -08:00
Alexey Bataev	092d628383	[SLP]Check for div/rem instructions before extending with poisons Need to check if the instructions can be safely extended with poison before actually doing this to avoid incorrect transformations. Fixes #122691	2025-01-13 09:28:27 -08:00
Alexey Bataev	af524de1fa	[SLP]Do not include subvectors for fully matched buildvectors If the buildvector node fully matched another node, need to exclude subvectors, when building final shuffle, just a shuffle of the original node must be emitted. Fixes #122584	2025-01-13 07:24:16 -08:00
Mel Chen	56a37a3c76	[SLPVectorizer] Refactor HorizontalReduction::createOp (NFC) (#121549 ) This patch simplifies select-based integer min/max reductions by utilizing `llvm::getMinMaxReductionPredicate`, and generates intrinsic-based min/max reductions by utilizing `llvm::getMinMaxReductionIntrinsicOp`.	2025-01-13 16:11:31 +08:00
Han-Kuan Chen	35e76b6a4f	Revert "[SLP] NFC. Replace MainOp and AltOp in TreeEntry with InstructionsState. (#120198 )" This reverts commit f3d6cdc5aebafac3961d4fccbd2ca0e302c6082c.	2025-01-10 10:09:54 -08:00
Alexey Bataev	681c83a2f9	[SLP]Fix mask generation after cost estimation When estimating the cost of entries shuffles for buildvectors, need to rebuild original mask, not a generated submask, used for subregisters analysis. Fixes #122430	2025-01-10 09:32:35 -08:00
Alex MacLean	986f2ac48f	[SLPVectorizer] minor tweaks around lambdas for compatibility with older compilers (#122348 ) Older versions of msvc do not have great lambda support and are not able to handle uses of class data or lambdas with implicit return types in some cases. These minor changes improve the source's compatibility with older msvc and don't hurt readability either.	2025-01-10 09:18:28 -08:00
Alexey Bataev	3c9c94a24f	Revert "[SLP]Fix mask generation after cost estimation" This reverts commit 547ba9730bf05df3383150f730a689f2c8336206 to fix buildbots reported in https://lab.llvm.org/buildbot/#/builders/123/builds/11370, https://lab.llvm.org/buildbot/#/builders/133/builds/9492	2025-01-10 08:46:42 -08:00
Alexey Bataev	547ba9730b	[SLP]Fix mask generation after cost estimation When estimating the cost of entries shuffles for buildvectors, need to rebuild original mask, not a generated submask, used for subregisters analysis. Fixes #122430	2025-01-10 08:17:56 -08:00
Mel Chen	e0f14e11c7	[SLPVectorizer] Refine the scope of RdxOpcode in HorizontalReduction::createOp (NFC) (#122239 ) This patch is one part of unifying IAnyOf and FAnyOf reduction. #118393 The related patch is #118777.	2025-01-10 16:01:36 +08:00
Han-Kuan Chen	f3d6cdc5ae	[SLP] NFC. Replace MainOp and AltOp in TreeEntry with InstructionsState. (#120198 ) Add TreeEntry::hasState. Add assert for getTreeEntry. Remove the OpValue parameter from the canReuseExtract function. Remove the Opcode parameter from the ComputeMaxBitWidth lambda function.	2025-01-09 23:41:52 -08:00
Han-Kuan Chen	5454ac28b3	Revert "[SLP] NFC. Replace MainOp and AltOp in TreeEntry with InstructionsState. (#120198 )" This reverts commit 760f550de25792db83cd39c88ef57ab6d80a41a0.	2025-01-09 18:41:47 -08:00
Han-Kuan Chen	36b423e0f8	[SLP] NFC. Refactor getSameOpcode and reduce for loop iterations. (#122241 ) Replace Cnt and AltIndex with MainOp and AltOp. Reduce the number of iterations in the for loop.	2025-01-10 09:06:07 +08:00
Han-Kuan Chen	760f550de2	[SLP] NFC. Replace MainOp and AltOp in TreeEntry with InstructionsState. (#120198 ) Add TreeEntry::hasState. Add assert for getTreeEntry. Remove the OpValue parameter from the canReuseExtract function. Remove the Opcode parameter from the ComputeMaxBitWidth lambda function.	2025-01-10 09:05:39 +08:00
Alexey Bataev	5ff36748cf	[SLP]Fix mask processing for reused gathered scalars Need to sync the mask between cost and actual emission to avoid bugs in mask calculation Fixes #122324	2025-01-09 11:24:48 -08:00
Alexey Bataev	5b76a2e51b	[SLP]Correctly calculate mask for the inserted vector	2025-01-08 15:18:06 -08:00
Alexey Bataev	0d921f96d4	[SLP][NFC]Introduce and use createInsertVector helper function, NFC	2025-01-08 14:26:13 -08:00
Alexey Bataev	1160994602	[SLP]Fix a crash for very long GEP chains Need to check if the GEP bases are equal and return false early. Also, need to return false if the lookup is too deep, considering bases equal too. Fixes a crash in the assertion.	2025-01-08 06:47:41 -08:00
Han-Kuan Chen	c50370c67a	[SLP] NFC. Use InstructionsState::valid if users just want to know whether VL has same opcode. (#120217 ) Add assert for InstructionsState::getOpcode. Use InstructionsState::getOpcode only when necessary.	2025-01-04 00:44:57 +08:00
Fangrui Song	edc42b2dc1	[SLP] Migrate away from PointerUnion::get	2024-12-27 21:01:09 -08:00
Alexey Bataev	07ba457525	[SLP][NFC]Add dump of combined entries, where applicable	2024-12-27 07:56:10 -08:00
Alexey Bataev	889215a30e	[SLP]Followup fix for the poisonous logical op in reductions If the VectorizedTree still may generate poisonous value, but it is not the original operand of the reduction op, need to check if Res still the operand, to generate correct code. Fixes #114905	2024-12-26 05:11:26 -08:00
Alexey Bataev	07d284d4eb	[SLP]Add cost estimation for gather node reshuffling Adds cost estimation for the variants of the permutations of the scalar values, used in gather nodes. Currently, SLP just unconditionally emits shuffles for the reused buildvectors, but in some cases better to leave them as buildvectors rather than shuffles, if the cost of such buildvectors is better. X86, AVX512, -O3+LTO. Metric: size..text; Program size..text results results0 diff: test-suite :: External/SPEC/CINT2006/445.gobmk/445.gobmk.test 912998.00 913238.00 0.0%; test-suite :: MultiSource/Benchmarks/MiBench/consumer-lame/consumer-lame.test 203070.00 203102.00 0.0%; test-suite :: External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s.test 1396320.00 1396448.00 0.0%; test-suite :: External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r.test 1396320.00 1396448.00 0.0%; test-suite :: MultiSource/Benchmarks/Bullet/bullet.test 309790.00 309678.00 -0.0%; test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 12477607.00 12470807.00 -0.1%. CINT2006/445.gobmk - extra code vectorized; MiBench/consumer-lame - small variations; CFP2017speed/638.imagick_s, CFP2017rate/538.imagick_r - extra vectorized code; Benchmarks/Bullet - extra code vectorized; CFP2017rate/526.blender_r - extra vector code. RISC-V, sifive-p670, -O3+LTO: CFP2006/433.milc - regressions, should be fixed by https://github.com/llvm/llvm-project/pull/115173; CFP2006/453.povray - extra vectorized code; CFP2017rate/508.namd_r - better vector code; CFP2017rate/510.parest_r - extra vectorized code; SPEC/CFP2017rate - extra/better vector code; CFP2017rate/526.blender_r - extra vectorized code; CFP2017rate/538.imagick_r - extra vectorized code; CINT2006/403.gcc - extra vectorized code; CINT2006/445.gobmk - extra vectorized code; CINT2006/464.h264ref - extra vectorized code; CINT2006/483.xalancbmk - small variations; CINT2017rate/525.x264_r - better vectorization. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/115201	2024-12-24 15:35:29 -05:00
Alexey Bataev	852feea820	[SLP]Propagate AssumptionCache where possible	2024-12-24 09:20:26 -08:00
Alexey Bataev	0d6cb0ae9d	[SLP]Fix strict weak ordering criterion in comparators Fixes #121019	2024-12-24 08:13:57 -08:00
Alexey Bataev	f0f8dab712	[SLP]Check if the first reduced value requires freeze/swap, if it may be too poisonous If several reduced values are combined and the first reduced value is just the original reduced value of the bool logical op, need to freeze it to prevent the propagation of the poison value. Fixes #114905	2024-12-24 07:40:35 -08:00
Alexey Bataev	030829a7e5	[SLP]Drop samesign flag if the vector node has reduced bitwidth If the operands of the icmp instructions have reduced bitwidth after MinBitwidth analysis, need to drop samesign flag to preserve correctness of the transformation. Fixes #120823	2024-12-23 16:55:11 -08:00
Han-Kuan Chen	11676da808	[SLP] Normalize debug messages for newTreeEntry. (#119514 ) A debug message should follow after newTreeEntry. Make ExtractValueInst and ExtractElementInst use setOperand directly.	2024-12-23 21:42:02 +08:00
Finn Plummer	45c01e8a33	[NFC][TargetTransformInfo][VectorUtils] Consolidate `isVectorIntrinsic...` api (#117635 ) - update `VectorUtils:isVectorIntrinsicWithScalarOpAtArg` to use TTI for all uses, to allow specification of target specific intrinsics - add TTI to the `isVectorIntrinsicWithStructReturnOverloadAtField` api - update TTI api to provide `isTargetIntrinsicWith...` functions and consistently name them - move `isTriviallyScalarizable` to VectorUtils - update all uses of the api and provide the TTI parameter Resolves #117030	2024-12-19 11:54:26 -08:00
DianQK	e7a4d78ad3	[SLP] Check if instructions exist after vectorization (#120434 ) Fixes #120433.	2024-12-19 06:21:57 +08:00
Alexey Bataev	d1a7225076	[SLP]Check if the node must keep its original bitwidth Need to check if during previous analysis the node has requested to keep its original bitwidth to avoid incorrect codegen. Fixes #120076	2024-12-16 08:01:22 -08:00
Han-Kuan Chen	da439d3af4	[SLP] NFC. Refactor getEntryCost and isReverseOrder usage. (#119680 ) Users should check whether an input is empty before using isReverseOrder.	2024-12-14 02:01:25 +08:00
Ramkumar Ramachandra	4a0d53a0b0	PatternMatch: migrate to CmpPredicate (#118534 ) With the introduction of CmpPredicate in 51a895a (IR: introduce struct with CmpInst::Predicate and samesign), PatternMatch is one of the first key pieces of infrastructure that must be updated to match a CmpInst respecting samesign information. Implement this change to Cmp-matchers. This is a preparatory step in migrating the codebase over to CmpPredicate. Since no functional changes are desired at this stage, we have chosen not to migrate CmpPredicate::operator==(CmpPredicate) calls to use CmpPredicate::getMatching(), as that would have visible impact on tests that are not yet written: instead, we call CmpPredicate::operator==(Predicate), preserving the old behavior, while also inserting a few FIXME comments for follow-ups.	2024-12-13 14:18:33 +00:00
Han-Kuan Chen	3133acf1fb	Revert "[SLP] Make getSameOpcode support different instructions if they have same semantics. (#112181 )" This reverts commit 82204154b7bd1f8c487c94c7ef00399d776b29f0.	2024-12-12 20:38:31 -08:00
Han-Kuan Chen	82204154b7	[SLP] Make getSameOpcode support different instructions if they have same semantics. (#112181 )	2024-12-13 12:06:10 +08:00
Han-Kuan Chen	2546ae4ed0	[SLP][REVEC] Fix the number of elements in the mask of a ShuffleVectorInst is not a power of 2. (#119689 ) The following shufflevector should not be vectorized when slp-vectorize-non-power-of-2 is enabled. shufflevector <8 x float> %1, <8 x float> poison, <3 x i32> <i32 0, i32 1, i32 2> shufflevector <8 x float> %1, <8 x float> poison, <3 x i32> <i32 4, i32 5, i32 6>	2024-12-13 02:22:41 +08:00
Kazu Hirata	2f8238f849	[llvm] Migrate away from PointerUnion::{is,get} (NFC) (#119679 ) Note that PointerUnion::{is,get} have been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> I'm not touching PointerUnion::dyn_cast for now because it's a bit complicated; we could blindly migrate it to dyn_cast_if_present, but we should probably use dyn_cast when the operand is known to be non-null.	2024-12-12 07:54:48 -08:00
Mel Chen	b3cba9be41	[LoopVectorize] Vectorize select-cmp reduction pattern for increasing integer induction variable (#67812 ) Consider the following loop: ``` int rdx = init; for (int i = 0; i < n; ++i) rdx = (a[i] > b[i]) ? i : rdx; ``` We can vectorize this loop if `i` is an increasing induction variable. The final reduced value will be the maximum of `i` that the condition `a[i] > b[i]` is satisfied, or the start value `init`. This patch added new RecurKind enums - IFindLastIV and FFindLastIV. --------- Co-authored-by: Alexey Bataev <5361294+alexey-bataev@users.noreply.github.com>	2024-12-12 16:48:31 +08:00
Han-Kuan Chen	51a0c1bf25	[SLP] NFC. Replace TreeEntry::setOperandsInOrder with VLOperands. (#118949 ) To reduce repeated code, TreeEntry::setOperandsInOrder will be replaced by VLOperands. Arg_size will be provided to make sure other operands will not be reordered when VL[0] is IntrinsicInst (because APO is a boolean value). In addition, BoUpSLP::reorderInputsAccordingToOpcode will also be removed since it is simple.	2024-12-11 10:09:23 +08:00
Alexey Bataev	a42aa8f265	[SLP]Fix adjusting of the mask for the fully matched nodes. When checking for the poison elements in the matches node, need to consider the register number, when clearing the corresponding mask element. Fixes #119393	2024-12-10 09:47:16 -08:00
Han-Kuan Chen	da421f55a7	[SLP] NFC. Make InstructionsState more constant. (#118609 ) Add getMainOp and getAltOp. Use `InstructionsState &` instead of `const InstructionsState &`. Use `!S.isAltShuffle()` instead of `S.MainOp == S.AltOp`.	2024-12-10 23:28:14 +08:00
Han-Kuan Chen	b97c447dac	[SLP] NFC. Add assert for shouldBroadcast and canBeVectorized. (#119327 )	2024-12-10 19:25:19 +08:00
Alexey Bataev	376dad72ab	[SLP]Move resulting vector before insert point, if the late generated buildvector fully matched If the perfect diamond match was detected for the postponed buildvectors and the vector for the previous node comes after the current node, need to move the vector register before the current inserting point to prevent compiler crash. Fixes #119002	2024-12-06 13:54:48 -08:00
Nikita Popov	6307e4b31e	Revert "[SLP] NFC. Replace TreeEntry::setOperandsInOrder with VLOperands. (#113880 )" This reverts commit 94fbe7e3ae7c0ce4e9a7d801e7700457a36f731d. Causes a crash when linking mafft in ReleaseLTO-g config.	2024-12-06 14:27:03 +01:00
Anutosh Bhat	89e919fb0d	Fix warnings while compiling SLPVectorizer.cpp (#118051 ) Towards #118048 I was building llvm (clang and lld) for webassembly and came across these warnings. Not sure if they are seen in our builds too. ``` /Users/anutosh491/work/llvm-project/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:6924:67: warning: comparison of integers of different signs: 'typename iterator_traits<user_iterator_impl<User>>::difference_type' (aka 'long') and 'unsigned int' [-Wsign-compare] 6924 \| if (std::distance(LI->user_begin(), LI->user_end()) != \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ 6925 \| LI->getNumUses()) \| ~~~~~~~~~~~~~~~~ [ 79%] Building CXX object lib/Transforms/Instrumentation/CMakeFiles/LLVMInstrumentation.dir/PGOInstrumentation.cpp.o /Users/anutosh491/work/llvm-project/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:9754:43: warning: comparison of integers of different signs: 'typename iterator_traits<Value const >::difference_type' (aka 'long') and 'unsigned int' [-Wsign-compare] 9754 \| count(Slice, Slice.front()) == \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ 9755 \| (isa<UndefValue>(Slice.front()) ? VF - 1 : 1)) { ``` This PR tries to address those warnings.	2024-12-06 08:11:39 -05:00
Han-Kuan Chen	94fbe7e3ae	[SLP] NFC. Replace TreeEntry::setOperandsInOrder with VLOperands. (#113880 ) To reduce repeated code, TreeEntry::setOperandsInOrder will be replaced by VLOperands. Arg_size will be provided to make sure other operands will not be reordered when VL[0] is IntrinsicInst (because APO is a boolean value). In addition, BoUpSLP::reorderInputsAccordingToOpcode will also be removed since it is simple.	2024-12-06 12:03:23 +08:00
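
Note on b3cba9be41 (select-cmp reduction for an increasing induction variable): the commit message states that the final reduced value is the maximum `i` for which `a[i] > b[i]` holds, or the start value `init`. The following is a minimal sketch of that equivalence in plain C++. It is not the LoopVectorize implementation. The names `selectCmpScalar`, `selectCmpAsMax`, and `Sentinel` are hypothetical, and the choice of the smallest `int` as a sentinel is an assumption of this sketch that relies on loop indices being non-negative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Scalar form of the loop quoted in the commit message:
//   rdx = (a[i] > b[i]) ? i : rdx;
int selectCmpScalar(const std::vector<int> &a, const std::vector<int> &b,
                    int init) {
  assert(a.size() == b.size() && "inputs must have the same length");
  int rdx = init;
  for (std::size_t i = 0; i < a.size(); ++i)
    rdx = (a[i] > b[i]) ? static_cast<int>(i) : rdx;
  return rdx;
}

// The same result expressed as a max reduction over the induction variable.
// Lanes where the condition is false contribute a sentinel that is smaller
// than any valid index (assumption of this sketch), so they never win.
int selectCmpAsMax(const std::vector<int> &a, const std::vector<int> &b,
                   int init) {
  assert(a.size() == b.size() && "inputs must have the same length");
  constexpr int Sentinel = std::numeric_limits<int>::min();
  int maxIdx = Sentinel;
  for (std::size_t i = 0; i < a.size(); ++i)
    maxIdx = std::max(maxIdx, (a[i] > b[i]) ? static_cast<int>(i) : Sentinel);
  // If no lane matched, fall back to the start value.
  return maxIdx == Sentinel ? init : maxIdx;
}
```

Because `i` only increases, the last index that satisfies the condition is also the largest one, which is why the order-dependent select can be replaced by an order-independent max that a vectorizer can reduce across lanes.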


2065 Commits