llvm-project

Author	SHA1	Message	Date
Sam Tebbs	0bfa1718af	[LV] Create in-loop sub reductions (#147026 ) This PR allows the loop vectorizer to handle in-loop sub reductions by forming a normal in-loop add reduction with a negated input. Stacked PRs: 1. -> https://github.com/llvm/llvm-project/pull/147026 2. https://github.com/llvm/llvm-project/pull/147255 3. https://github.com/llvm/llvm-project/pull/147302 4. https://github.com/llvm/llvm-project/pull/147513	2025-08-12 10:22:41 +01:00
Florian Hahn	004c67ea25	[LV] Vectorize maxnum/minnum w/o fast-math flags. (#148239 ) Update LV to vectorize maxnum/minnum reductions without fast-math flags, by adding an extra check in the loop if any inputs to maxnum/minnum are NaN, due to maxnum/minnum behavior w.r.t to signaling NaNs. Signed-zeros are already handled consistently by maxnum/minnum. If any input is NaN, exit the vector loop, compute the reduction result up to the vector iteration that contained NaN inputs and * resume in the scalar loop New recurrence kinds are added for reductions using maxnum/minnum without fast-math flags. PR: https://github.com/llvm/llvm-project/pull/148239	2025-07-18 21:58:19 +01:00
Ramkumar Ramachandra	62f8377e40	[LV] Extend FindFirstIV to unsigned case (#146386 ) Extend FindFirstIV vectorization to the unsigned case by introducing and handling FindFirstIVUMin. Co-authored-by: Florian Hahn <flo@fhahn.com>	2025-07-09 15:56:40 +01:00
Florian Hahn	20fbbd7675	[LV] Add support for cmp reductions with decreasing IVs. (#140451 ) Similar to FindLastIV, add FindFirstIVSMin to support select (icmp(), x, y) reductions where one of x or y is a decreasing induction, producing a SMin reduction. It uses signed max as sentinel value. PR: https://github.com/llvm/llvm-project/pull/140451	2025-06-29 11:17:03 +01:00
Ramkumar Ramachandra	bb8c42e859	[LV] Extend FindLastIV to unsigned case (#141752 ) Split the FindLastIV RecurKind into SMax and UMax variants, depending on the reduction op produced.	2025-06-23 15:27:49 +01:00
Ramkumar Ramachandra	3f7b8852cd	[IVDesc] Drop unused arg in isConditionalRdxPattern (NFC) (#142942 )	2025-06-06 08:48:23 +01:00
Ramkumar Ramachandra	b40e4ceaa6	[ValueTracking] Make Depth last default arg (NFC) (#142384 ) Having a finite Depth (or recursion limit) for computeKnownBits is very limiting, but is currently a load-bearing necessity, as all KnownBits are recomputed on each call and there is no caching. As a prerequisite for an effort to remove the recursion limit altogether, either using a clever caching technique, or writing a easily-invalidable KnownBits analysis, make the Depth argument in APIs in ValueTracking uniformly the last argument with a default value. This would aid in removing the argument when the time comes, as many callers that currently pass 0 explicitly are now updated to omit the argument altogether.	2025-06-03 17:12:24 +01:00
Ramkumar Ramachandra	0240129218	[IVDesc] Unify RecurKinds [I\|F]AnyOf (#118393 ) Co-authored-by: Mel Chen <mel.chen@sifive.com>	2025-05-23 11:57:30 +01:00
Ramkumar Ramachandra	b81170ecff	[IVDesc] Unify RecurKinds [I\|F]FindLastIV (NFC) (#141082 )	2025-05-22 22:48:01 +01:00
Ramkumar Ramachandra	bec038db5c	[IVDesc] Prefer empty m_Cmp on unused result (NFC) (#141071 )	2025-05-22 17:32:08 +01:00
Mel Chen	f594cd0936	[IVDescriptor][LV] Return Instruction::Or for IAnyOf/FAnyOf in getOpcode(), nfc (#140242 )	2025-05-19 16:17:04 +08:00
Mel Chen	08f0aa4800	[IVDescriptors] Call getOpcode on demand in getReductionOpChain. nfc (#118777 ) Non-arithmetic reductions do not require the binary opcodes. As a first step toward removing the dependency of non-arithmetic reductions on `getOpcode` function, this patch refactors the `getReductionOpChain` function. In the future, once all users of `getOpcode` function are refactored, an assertion can be added to `getOpcode` function to ensure that only arithmetic reductions rely on it.	2025-04-30 17:01:14 +08:00
Florian Hahn	d68b446933	[IR] Add matchers for remaining FP min/max intrinsics (NFC). (#137612 ) Add dedicated matchers for minimum,maximum,minimumnum and maximumnum intrinsics, similar for the existing matchers for maxnum and minnum. As suggested in https://github.com/llvm/llvm-project/pull/137335. PR: https://github.com/llvm/llvm-project/pull/137612	2025-04-29 12:20:00 +01:00
Florian Hahn	ec1016f7ef	[IVDescriptors] Support reductions with minimumnum/maximumnum. (#137335 ) Add a new reduction recurrence kind for reductions with minimumnum/maximumnum. Such reductions can be vectorized without nsz/nnans, same as reductions with maximum/minimum intrinsics. Note that a new reduction kind is needed to make sure partial reductions are also combined with minimumnum/maximumnum. Note that the final reduction to a scalar value is performed with vector.reduce.fmin/fmax. This should be fine, as the results of the partial reductions with maximumnum/minimumnum silences any sNaNs. In-loop and reductions in SLP are not supported yet, as there's no reduction version of maximumnum/minimumnum yet and fmax may be incorrect. PR: https://github.com/llvm/llvm-project/pull/137335	2025-04-28 11:16:36 +01:00
Kazu Hirata	8f5c3deadd	[Analysis] Use llvm::append_range (NFC) (#133602 )	2025-03-29 16:52:36 -07:00
Luke Lau	345748e027	[IVDescriptor] Explicitly check for isMinMaxRecurrenceKind in getReductionOpChain. NFC (#132025 ) There are other types of recurrences with an icmp/fcmp opcode, AnyOf and FindLastIV, so don't rely on the opcode to detect them. This makes adding support for AnyOf in #131830 easier. Note that these currently fail the ExpectedUses/isCorrectOpcode checks anyway, so there shouldn't be any functional change.	2025-03-20 21:22:55 +08:00
Ramkumar Ramachandra	4a0d53a0b0	PatternMatch: migrate to CmpPredicate (#118534 ) With the introduction of CmpPredicate in 51a895a (IR: introduce struct with CmpInst::Predicate and samesign), PatternMatch is one of the first key pieces of infrastructure that must be updated to match a CmpInst respecting samesign information. Implement this change to Cmp-matchers. This is a preparatory step in migrating the codebase over to CmpPredicate. Since we no functional changes are desired at this stage, we have chosen not to migrate CmpPredicate::operator==(CmpPredicate) calls to use CmpPredicate::getMatching(), as that would have visible impact on tests that are not yet written: instead, we call CmpPredicate::operator==(Predicate), preserving the old behavior, while also inserting a few FIXME comments for follow-ups.	2024-12-13 14:18:33 +00:00
Mel Chen	b3cba9be41	[LoopVectorize] Vectorize select-cmp reduction pattern for increasing integer induction variable (#67812 ) Consider the following loop: ``` int rdx = init; for (int i = 0; i < n; ++i) rdx = (a[i] > b[i]) ? i : rdx; ``` We can vectorize this loop if `i` is an increasing induction variable. The final reduced value will be the maximum of `i` that the condition `a[i] > b[i]` is satisfied, or the start value `init`. This patch added new RecurKind enums - IFindLastIV and FFindLastIV. --------- Co-authored-by: Alexey Bataev <5361294+alexey-bataev@users.noreply.github.com>	2024-12-12 16:48:31 +08:00
Ramkumar Ramachandra	2a0ee090db	IVDesc: strip redundant arg in getOpcode call (NFC) (#118476 )	2024-12-03 13:40:51 +00:00
Kazu Hirata	236fda550d	[Analysis] Remove unused includes (NFC) (#114936 ) Identified with misc-include-cleaner.	2024-11-05 19:11:34 -08:00
Alexey Bader	583fa4f5b7	[InstCombine] Extend fcmp+select folding to minnum/maxnum intrinsics (#112088 ) Today, InstCombine can fold fcmp+select patterns to minnum/maxnum intrinsics when the nnan and nsz flags are set. The ordering of the operands in both the fcmp and select instructions is important for the folding to occur. maxnum patterns: 1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ogt, oge} 2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ule, ult} The second pattern is supposed to make the order of the operands in the select instruction irrelevant. However, the pattern matching code uses the CmpInst::getInversePredicate method to invert the comparison predicate. This method doesn't take into account the fast-math flags, which can lead missing the folding opportunity. The patch extends the pattern matching code to handle unordered fcmp instructions. This allows the folding to occur even when the select instruction has the operands in the inverse order. New maxnum patterns: 1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ugt, uge} 2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ole, olt} The same changes are applied to the minnum intrinsic.	2024-10-15 22:05:16 +04:00
Philip Reames	3d9abfc9f8	Consolidate all IR logic for getting the identity value of a reduction [nfc] This change merges the three different places (at the IR layer) for finding the identity value of a reduction into a single copy. This depends on several prior commits which fix ommissions and bugs in the distinct copies, but this patch itself should be fully non-functional. As the new comments and naming try to make clear, the identity value is a property of the @llvm.vector.reduce.* intrinsic, not of e.g. the recurrence descriptor. (We still provide an interface for clients using recurrence descriptors, but the implementation simply translates to the intrinsic which each corresponds to.) As a note, the getIntrinsicIdentity API does not support fminnum/fmaxnum or fminimum/fmaximum which is why we still need manual logic (but at least only one copy of manual logic) for those cases.	2024-09-04 08:23:21 -07:00
Ramkumar Ramachandra	f119151537	IVDescriptors: improve readability of a function (NFC) (#106219 ) Avoid dereferencing operand to llvm::isa.	2024-09-04 14:09:04 +01:00
Philip Reames	1fbb6b4efc	[LV] Prefer FLT_MIN/MAX for fmin/fmax reductions with ninf (#107141 ) Analogous to 2c7786e94a1058bd4f96794a1d4f70dcb86e5cc5, cleanup a case where the vectorizer is emitting a non-canonical identity value given the available flags. We use largest/smallest value during ISEL, and VP expansion, but not during vectorization. Since the fmin/fmax/fminimum/fmaximum intrinsics don't require a start value, this difference is only visible when masking of inactive lanes is required. Primary motivation of this change is simply to remove a difference between version of code which reason about the identity value of a reduction so I can kill all but one off. In review, it was pointed out that this is actually a functional fix as well. The old code used inf on a noinf reduction instruction - whose result is poison! That wasn't the intent of the code.	2024-09-03 12:21:54 -07:00
Philip Reames	0b2f2537a5	[LV] Separate AnyOf recurrence from getRecurrenceIdentity [NFC] These recurrence types don't have a meaningful identity, and the routine was abused to return the start value instead. Out of the three callers to this routine, only one actually wants this behavior. This is a prep change for removing the routine entirely and commoning it with other copies of the same logic.	2024-09-03 09:46:30 -07:00
Philip Reames	68805de902	[IVDesc] Reuse getBinOpIdentity in getRecurrenceIdentity [nfc] Avoid duplication so that we can easily tell these lists are in sync.	2024-08-30 09:10:34 -07:00
Ramkumar Ramachandra	ae58cc0e99	IVDescriptors: clarify getSCEV use in a function (NFC) (#106222 ) getSCEV will assert unless the operand is SCEVable. Replace an instance of the implementation of ScalarEvolution::isSCEVable (which checks that the operand is either integer or pointer type) with a call to the function, to make it clear that the subsequent use of getSCEV will not fail.	2024-08-27 16:44:50 +01:00
Dinar Temirbulatov	31d4c97506	[LoopVectorize] LLVM fails to vectorise loops with multi-bool varables (#89226 ) This change allows to consider compare instructions in the loop with multiple use inside the loop and outside. This change allows to vectorise this loop: int foo(float* a, int n) { _Bool any = 0; _Bool all = 1; for (int i = 0; i < n; i++) { if (a[i] < 0.0f) { any = 1; } else { all = 0; } } return all ? 1 : any ? 2 : 3; }	2024-07-15 20:21:50 +01:00
Nikita Popov	2d209d964a	[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902 ) This is a helper to avoid writing `getModule()->getDataLayout()`. I regularly try to use this method only to remember it doesn't exist... `getModule()->getDataLayout()` is also a common (the most common?) reason why code has to include the Module.h header.	2024-06-27 16:38:15 +02:00
Yingwei Zheng	470c5b8011	[InstSimplify][InstCombine] Remove unnecessary `m_c_` matchers. (#81712 ) This patch removes unnecessary `m_c_` matchers since we always canonicalize `commutive_op Cst, X` into `commutive_op X, Cst`. Compile-time impact: https://llvm-compile-time-tracker.com/compare.php?from=bfc0b7c6891896ee8e9818f22800472510093864&to=d27b058bb9acaa43d3cadbf3cd889e8f79e5c634&stat=instructions:u	2024-02-14 16:40:36 +08:00
Kazu Hirata	6c87a0af95	[Analysis] Remove unnecessary includes (NFC)	2023-12-07 22:15:32 -08:00
Fangrui Song	111fcb0df0	[llvm] Fix duplicate word typos. NFC Those fixes were taken from https://reviews.llvm.org/D137338	2023-09-01 18:25:16 -07:00
Mel Chen	425e9e81a0	[LV] Rename the Select[I\|F]Cmp reduction pattern to [I\|F]AnyOf. (NFC) Regarding this NFC change, please refer to the discussion in this thread. https://reviews.llvm.org/D150851#4467261 Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D155786	2023-08-03 00:37:19 -07:00
Nikita Popov	94abecca6b	[IVDescriptors] Remove typed pointer support (NFC) This also removes the element type from the descriptor, as it is always i8. The meaning of the step is now the same between integers and pointers.	2023-07-12 15:48:29 +02:00
Anna Thomas	ec146cb7c0	[LV] Add support for minimum/maximum intrinsics {mini\|maxi}mum intrinsics are different from {min\|max}num intrinsics in the propagation of NaN and signed zero. Also, the minnum/maxnum intrinsics require the presence of nsz flags to be valid reductions in vectorizer. In this regard, we introduce a new recurrence kind and also add support for identifying reduction patterns using these intrinsics. The reduction intrinsics and lowering was introduced here: 26bfbec5d2. There are tests added which show how this interacts across chains of min/max patterns. Differential Revision: https://reviews.llvm.org/D151482	2023-06-20 13:17:28 -04:00
Vedant Paranjape	cf9b3e55a2	[IVDescriptors] Add assert to isInductionPhi to check for invalid Phis Phis that are present inside loop headers can only be Induction Phis legally. This patch adds an assertion to isInductionPhi which checks for the said legality and it also updates the docs of the said function to reflect the given legality. Differential Revision: https://reviews.llvm.org/D149041	2023-04-28 04:41:47 +00:00
Florian Hahn	6b8d19d2b5	Recommit "[VPlan] Switch to checking sinking legality for recurrences in VPlan." This reverts the revert commit 3d8ed8b5192a59104bfbd5bf7ac84d035ee0a4a5. The new version of the patch adds a set to avoid duplicating work in isFixedOrderRecurrence, which was previously done through the removed SinkAfter map. Original commit message: Building on D142885 and D142589, retire the SinkAfter map from the recurrence handling code. It is replaced by checking whether it is possible to sink all users of a recurrence directly in VPlan. This results in simpler code overall and allows to handle additional cases (see the improvements in @test_crash). Depends on D142885. Depends on D142589. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D142886	2023-04-20 09:31:16 +01:00
Manoj Gupta	3d8ed8b519	Revert "[VPlan] Switch to checking sinking legality for recurrences in VPlan." This reverts commit 7fc0b3049df532fce726d1ff6869a9f6e3183780. Causes a clang hang when building xz utils, github issue #62187.	2023-04-17 12:19:36 -07:00
Florian Hahn	7fc0b3049d	[VPlan] Switch to checking sinking legality for recurrences in VPlan. Building on D142885 and D142589, retire the SinkAfter map from the recurrence handling code. It is replaced by checking whether it is possible to sink all users of a recurrence directly in VPlan. This results in simpler code overall and allows to handle additional cases (see the improvements in @test_crash). Depends on D142885. Depends on D142589. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D142886	2023-04-13 22:00:52 +01:00
Philip Reames	c416f6700f	[IVDescriptors] Add pointer InductionDescriptors with non-constant strides (try 2) (JFYI - This has been heavily reframed since original attempt at landing.) This change updates the InductionDescriptor logic to allow matching a pointer IV with a non-constant stride, but also updates the LoopVectorizer to bailout on such descriptors by default. This preserves the default vectorizer behavior. In review, it was pointed out that there's multiple unfortunate performance implications which need to be addressed before this can be enabled. Having a flag allows us to exercise the behavior, and write test cases for logic which is otherwise unreachable (or hard to reach). This will also enable non-constant stride pointer recurrences for other consumers. I've audited said code, and don't see any obvious issues. Differential Revision: https://reviews.llvm.org/D147336	2023-04-05 09:32:35 -07:00
David Green	965a090f02	Revert "[IVDescriptors] Add pointer InductionDescriptors with non-constant strides" Multiple errors have being reported on https://reviews.llvm.org/rG498aa534f472d28db893aa9a8627d0b46e17f312 Reverting until the correctness issues can be resolved. We are also seeing a lot of performance differences from the patch. Some are looking good, but some are looking pretty bad.	2023-03-31 11:08:50 +01:00
Philip Reames	498aa534f4	[IVDescriptors] Add pointer InductionDescriptors with non-constant strides This matches the handling for integer IVs. I left the non-opaque cases alone, mostly because they're largely irrelevant today. This doesn't actually make much difference in vectorization right now as we immediately fail on aliasing checks (which also bail on non-constant strides). Slightly suprisingly, it's the case which do need runtime checks which work after this patch as they don't use the same dependency analysis path. This will also enable non-constant stride pointer recurrences for other consumers. I've auditted said code, and don't see any obvious issues.	2023-03-30 11:56:00 -07:00
Craig Topper	80910d6ceb	[IVDescriptors] Pass IsSigned when creating an all 1s constant for UMin recurrence. This only matters for types larger than i64, and is consistent with the code for RecurKind::And which also creates all 1s. We don't have any tests for UMin or And with types larger than i64.	2023-03-08 09:52:42 -08:00
Kazu Hirata	f8f3db2756	Use APInt::count{l,r}_{zero,one} (NFC)	2023-02-19 22:04:47 -08:00
Matt Devereau	8ff47f6032	[LoopVectorize] Enable integer Mul and Add as select reduction patterns This patch vectorizes Phi node loop reductions for select's whos condition comes from a floating-point comparison, with its operands being integers for Add, Sub, and Mul reductions. Example: int foo(float *x, int n) { int sum = 0; for (int i=0; i<n; ++i) { float elem = x[i]; if (elem > 0) { sum += 2; } } return sum; } This would previously fail to vectorize due to the integer reduction.	2023-01-30 09:41:40 +00:00
Kazu Hirata	526966d07d	Use llvm::bit_ceil (NFC) Note that: std::has_single_bit(X) ? X : llvm::NextPowerOf2(X); is equivalent to: std::bit_ceil(X) even for input 0.	2023-01-28 16:13:09 -08:00
Matt Devereau	4468e27d9f	Revert "[LoopVectorize] Enable integer Mul and Add as select reduction patterns" This reverts commit f90103851f9a381bbf7ed6da250217577afd00d2.	2023-01-26 12:02:16 +00:00
Matt Devereau	f90103851f	[LoopVectorize] Enable integer Mul and Add as select reduction patterns This patch vectorizes Phi node loop reductions for select's whos condition comes from a floating-point comparison, with its operands being integers for Add, Sub, and Mul reductions. Example: int foo(float *x, int n) { int sum = 0; for (int i=0; i<n; ++i) { float elem = x[i]; if (elem > 0) { sum += 2; } } return sum; } Differential Revision: https://reviews.llvm.org/D141842	2023-01-25 13:25:18 +00:00
Piotr Fusik	898b5c9f5e	[NFC] Fix "form/from" typos Reviewed By: #libc, ldionne Differential Revision: https://reviews.llvm.org/D142007	2023-01-22 20:05:51 +01:00
Guillaume Chatelet	8fd5558b29	[NFC] Use TypeSize::geFixedValue() instead of TypeSize::getFixedSize() This change is one of a series to implement the discussion from https://reviews.llvm.org/D141134.	2023-01-11 16:49:38 +00:00

1 2 3

129 Commits