143 Commits

Author SHA1 Message Date
Benjamin Maxwell
288909883c
[IVDesc] Add [[maybe_unused]] to NumNonPHIUsers (NFC) (#180729) 2026-02-10 12:08:00 +00:00
Benjamin Maxwell
f22a178b13
Reland "[LV] Support conditional scalar assignments of masked operations" (#180708)
This patch extends the support added in #158088 to loops where the
assignment is non-speculatable (e.g. a conditional load or divide).

For example, the following loop can now be vectorized:

```
int simple_csa_int_load(
  int* a, int* b, int default_val, int N, int threshold)
{
  int result = default_val;
  for (int i = 0; i < N; ++i)
    if (a[i] > threshold)
      result = b[i];
  return result;
}
```

It does this by extending the recurrence matching from only looking for
selects, to include phis where all operands are the header phi, except
for one which can be an arbitrary value outside the recurrence.
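
As a rough scalar model of what the vectorized loop has to preserve (a sketch only, not the code LLVM generates): the loop is processed in fixed-size chunks, the condition forms a per-chunk mask, and `b` is read only at the last active lane. The chunk width `VF` and the name `csa_chunked` are hypothetical and used purely for illustration.

```cpp
#include <cassert>

// Hypothetical vectorization factor, chosen only for this illustration.
constexpr int VF = 4;

// Scalar reference: the loop from the commit message.
int csa_scalar(const int *a, const int *b, int default_val, int n,
               int threshold) {
  int result = default_val;
  for (int i = 0; i < n; ++i)
    if (a[i] > threshold)
      result = b[i];
  return result;
}

// Chunked model: per chunk, find the last lane whose condition holds and
// read b only at that lane, so b is never accessed where the mask is false.
int csa_chunked(const int *a, const int *b, int default_val, int n,
                int threshold) {
  assert(n >= 0);
  int result = default_val;
  int i = 0;
  for (; i + VF <= n; i += VF) {
    int last_active = -1; // -1 means no lane in this chunk is active
    for (int lane = 0; lane < VF; ++lane)
      if (a[i + lane] > threshold)
        last_active = lane;
    if (last_active >= 0)
      result = b[i + last_active];
  }
  for (; i < n; ++i) // scalar remainder
    if (a[i] > threshold)
      result = b[i];
  return result;
}
```

The point of the model is that the access to `b` is guarded by the mask, which is what distinguishes this case from the speculatable selects handled in #158088.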

---

Reverts llvm/llvm-project#180275 (original PR: #178862)

Additional type legalization for `ISD::VECTOR_FIND_LAST_ACTIVE` was
added in #180290, which should resolve the backend crashes on x86.
2026-02-10 09:57:48 +00:00
hanbeom
77ccd853d0
[IVDesc] Check loop-preheader for loop-legality when pass-remarks enabled (#166310)
When `-pass-remarks=loop-vectorize` is specified, the subsequent logic
is executed to display detailed debug messages even if no PreHeader
exists in the loop.

Therefore, an assert occurs when the `getLoopPreHeader()` function is
called. This commit resolves that issue.

Fixed: #165377
2026-02-10 00:02:13 +09:00
Kewen Meng
703c2762d3
Revert "[LV] Support conditional scalar assignments of masked operations" (#180275)
Reverts llvm/llvm-project#178862 

revert to unblock bot:
https://lab.llvm.org/buildbot/#/builders/206/builds/13225
2026-02-06 13:24:40 -08:00
Benjamin Maxwell
4f90eb6427
[LV] Support conditional scalar assignments of masked operations (#178862)
This patch extends the support added in #158088 to loops where the
assignment is non-speculatable (e.g. a conditional load or divide).

For example, the following loop can now be vectorized:

```
int simple_csa_int_load(
  int* a, int* b, int default_val, int N, int threshold)
{
  int result = default_val;
  for (int i = 0; i < N; ++i)
    if (a[i] > threshold)
      result = b[i];
  return result;
}
```

It does this by extending the recurrence matching from only looking for
selects, to include phis where all operands are the header phi, except
for one which can be an arbitrary value outside the recurrence.
2026-02-06 11:43:06 +00:00
Florian Hahn
05a2b146fb
[LV] Optimize FindLast recurrences to FindIV (NFCI). (#177870)
This patch restructures Find(First|Last)IV handling. Instead of
differentiating between FindLast, FindFirstIV and FindLastIV up front,
this patch simplifies the logic in IVDescriptor to just identify the
FindLast pattern up-front.

It then adds a new VPlan transformation to optimize FindLast reductions
to FindIV reductions if there is a suitable sentinel value.
It also folds the Find(Last|First)IV recurrence kinds into a single FindIV kind.

This is simpler and more accurate, given selecting the first/last
induction of the final IV reduction is directly controlled by the
corresponding recurrence kind of the ComputeReductionResult.

The new structure also allows further optimizations, like vectorizing
FindLastIV with another boolean reduction that tracks if the condition
in the loop was ever true, if there is no suitable sentinel value.
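
The last paragraph can be illustrated with a scalar sketch, under the assumption that the IV starts at 0 and increases: the reduction over the IV alone cannot tell "index 0 matched" from "nothing matched", so a separate boolean reduction records whether the condition ever held. The function names are hypothetical and this is not the VPlan transformation itself.

```cpp
#include <cassert>

// Scalar reference: keep the last index whose element exceeds threshold.
int find_last_scalar(const int *a, int n, int threshold, int start) {
  int rdx = start;
  for (int i = 0; i < n; ++i)
    rdx = (a[i] > threshold) ? i : rdx;
  return rdx;
}

// Variant without a sentinel: a max-reduction over matching indices plus a
// boolean any-of reduction that decides whether the max is meaningful.
int find_last_with_flag(const int *a, int n, int threshold, int start) {
  assert(n >= 0);
  int max_iv = 0;         // lowest IV value; ambiguous without the flag
  bool any_match = false; // boolean reduction: was the condition ever true?
  for (int i = 0; i < n; ++i) {
    bool c = a[i] > threshold;
    any_match = any_match || c;
    max_iv = (c && i > max_iv) ? i : max_iv;
  }
  return any_match ? max_iv : start;
}
```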

PR: https://github.com/llvm/llvm-project/pull/177870
2026-02-05 13:57:20 +00:00
serge-sans-paille
5a4754d2ce
[perf] Replace copy-assign by move-assign in llvm/lib/Analysis/* (#178169) 2026-02-02 21:26:49 +00:00
Graham Hunter
2abd6d6d7a
[LV] Vectorize conditional scalar assignments (#158088)
Based on Michael Maitland's previous work:
https://github.com/llvm/llvm-project/pull/121222

This PR uses the existing recurrences code instead of introducing a
new pass just for CSA autovec. I've also made recipes that are more
generic.
2026-01-14 14:59:18 +00:00
Ramkumar Ramachandra
e8cceccea1
[IVDesc] Fix off-by-one error in FindFirstIV ranges (#174441)
ConstantRange::getNonEmpty was excluding MAX and MAX - 1 in FindFirstIV
vectorization, and this was discovered in an i1 miscompile, where it
returns the full range: fix it to exclude MAX only. The change has also
necessitated fixing a test that's not supposed to be vectorized.
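
The i1 case can be reproduced with a toy model of a wrapping half-open range. This is a sketch, not LLVM's `ConstantRange`: `ToyRange` and the helper names are hypothetical, and the only rule borrowed from `getNonEmpty` is that equal bounds yield the full range. It assumes the pre-fix upper bound was MAX - 1, as the description above suggests.

```cpp
#include <cassert>
#include <cstdint>

// Toy wrapping half-open range [lower, upper) over `bits`-wide integers.
struct ToyRange {
  unsigned bits;
  uint64_t lower, upper;
  bool full;

  bool contains(uint64_t v) const {
    if (full)
      return true;
    uint64_t mask = (uint64_t{1} << bits) - 1;
    // Wrapping distance from lower must be smaller than the range size.
    return ((v - lower) & mask) < ((upper - lower) & mask);
  }
};

// Mimics the getNonEmpty rule: equal bounds mean the full range.
ToyRange toy_non_empty(unsigned bits, uint64_t lower, uint64_t upper) {
  assert(bits >= 1 && bits <= 32);
  return ToyRange{bits, lower, upper, lower == upper};
}

uint64_t signed_max(unsigned bits) { return (uint64_t{1} << (bits - 1)) - 1; }
uint64_t signed_min(unsigned bits) { return uint64_t{1} << (bits - 1); }

// Upper bound MAX - 1: drops both MAX and MAX - 1 (the off-by-one).
ToyRange range_before_fix(unsigned bits) {
  uint64_t mask = (uint64_t{1} << bits) - 1;
  return toy_non_empty(bits, signed_min(bits), (signed_max(bits) - 1) & mask);
}

// Upper bound MAX: drops only MAX, which stays free as the sentinel.
ToyRange range_after_fix(unsigned bits) {
  return toy_non_empty(bits, signed_min(bits), signed_max(bits));
}
```

For a 1-bit type, MAX - 1 wraps around to MIN, so the pre-fix bounds coincide and the model returns the full range, which includes the sentinel value.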

Fixes #173459.

Co-authored-by: Nikita Popov <npopov@redhat.com>
2026-01-12 18:08:49 +00:00
Florian Hahn
99addbf73d
[LV] Vectorize selecting last IV of min/max element. (#141431)
Add support for vectorizing loops that select the index of the minimum
or maximum element. The patch implements vectorizing those patterns by
combining Min/Max and FindFirstIV reductions.

It extends matching Min/Max reductions to allow in-loop users that are
FindLastIV reductions. It records a flag indicating that the Min/Max
reduction is used by another reduction. The extra user is then checked as
part of the new `handleMultiUseReductions` VPlan transformation.

It processes any reduction that has other reduction users. The reduction
using the min/max reduction currently must be a FindLastIV reduction,
which needs adjusting to compute the correct result:
 1. We need to find the last IV for which the condition based on the
     min/max reduction is true,
 2. Compare the partial min/max reduction result to its final value and,
 3. Select the lanes of the partial FindLastIV reductions which
     correspond to the lanes matching the min/max reduction result.
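
The three steps above can be sketched in scalar C++ by modelling each vector lane as a slot in a small array. This is an illustration under assumptions, not the `handleMultiUseReductions` implementation: `VF`, the function names, and the use of `<=` to pick the last index of the minimum are all choices made for the example.

```cpp
#include <cassert>

// Hypothetical vectorization factor for the illustration.
constexpr int VF = 4;

// Scalar reference: index of the last occurrence of the minimum element.
int argmin_last_scalar(const int *a, int n) {
  int min_val = a[0], idx = 0;
  for (int i = 1; i < n; ++i)
    if (a[i] <= min_val) {
      min_val = a[i];
      idx = i;
    }
  return idx;
}

// Lane model: each lane keeps a partial min and a partial last index.
int argmin_last_lanes(const int *a, int n) {
  assert(n >= VF);
  int lane_min[VF], lane_idx[VF];
  for (int lane = 0; lane < VF; ++lane) {
    lane_min[lane] = a[lane];
    lane_idx[lane] = lane;
  }
  int i = VF;
  // Step 1: per lane, track the last IV for which the min condition holds.
  for (; i + VF <= n; i += VF)
    for (int lane = 0; lane < VF; ++lane)
      if (a[i + lane] <= lane_min[lane]) {
        lane_min[lane] = a[i + lane];
        lane_idx[lane] = i + lane;
      }
  // Step 2: compute the final min and compare each partial min against it.
  int min_val = lane_min[0];
  for (int lane = 1; lane < VF; ++lane)
    min_val = lane_min[lane] < min_val ? lane_min[lane] : min_val;
  // Step 3: among matching lanes, keep the largest partial index.
  int idx = -1;
  for (int lane = 0; lane < VF; ++lane)
    if (lane_min[lane] == min_val && lane_idx[lane] > idx)
      idx = lane_idx[lane];
  // Scalar remainder.
  for (; i < n; ++i)
    if (a[i] <= min_val) {
      min_val = a[i];
      idx = i;
    }
  return idx;
}
```

Lanes whose partial min differs from the final min hold indices of elements that are not the minimum, which is why step 3 has to filter them out before reducing the indices.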

Depends on https://github.com/llvm/llvm-project/pull/140451

PR: https://github.com/llvm/llvm-project/pull/141431
2025-11-28 22:26:19 +00:00
Ramkumar Ramachandra
e06c148af7
[IVDesc] Use SCEVPatternMatch to improve code (NFC) (#168397) 2025-11-25 12:29:56 +00:00
Julian Nagele
c73de9777e
[IVDesciptors] Support detecting reductions with vector instructions. (#166353)
In combination with https://github.com/llvm/llvm-project/pull/149470
this will introduce parallel accumulators when unrolling reductions with
vector instructions. See also
https://github.com/llvm/llvm-project/pull/166630, which aims to
introduce parallel accumulators for FP reductions.
2025-11-24 11:12:06 +00:00
Mel Chen
3277f6caef
[LV] Explicitly disable in-loop reductions for AnyOf and FindIV. nfc (#163541)
Currently, in-loop reductions for AnyOf and FindIV are not supported.
They were implicitly blocked. This happened because
RecurrenceDescriptor::getReductionOpChain could not detect their
recurrence chain. The reason is that RecurrenceDescriptor::getOpcode was
set to Instruction::Or, but the recurrence chains of AnyOf and FindIV do
not actually contain an Instruction::Or.

This patch explicitly disables in-loop reductions for AnyOf and FindIV
instead of relying on getReductionOpChain to implicitly prevent them.
2025-11-14 09:14:07 +00:00
Ramkumar Ramachandra
f345d9b58e
[IVDesc] Improve isConditionalRdxPattern (NFC) (#162818) 2025-10-14 11:41:47 +01:00
Sam Tebbs
0bfa1718af
[LV] Create in-loop sub reductions (#147026)
This PR allows the loop vectorizer to handle in-loop sub reductions by
forming a normal in-loop add reduction with a negated input.
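
A minimal scalar sketch of the idea, assuming a hypothetical chunk width `VF`: each chunk is negated, summed horizontally, and added to the scalar accumulator, which gives the same result as subtracting every element.

```cpp
#include <cassert>

// Hypothetical vectorization factor for the illustration.
constexpr int VF = 4;

// Scalar reference: a plain sub reduction.
int sub_scalar(const int *x, int n, int start) {
  int acc = start;
  for (int i = 0; i < n; ++i)
    acc -= x[i];
  return acc;
}

// In-loop model: add reduction over the negated input, one chunk at a time.
int sub_in_loop(const int *x, int n, int start) {
  assert(n >= 0);
  int acc = start;
  int i = 0;
  for (; i + VF <= n; i += VF) {
    int chunk_sum = 0; // models a horizontal add of the negated chunk
    for (int lane = 0; lane < VF; ++lane)
      chunk_sum += -x[i + lane];
    acc += chunk_sum;
  }
  for (; i < n; ++i) // scalar remainder
    acc += -x[i];
  return acc;
}
```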

Stacked PRs:
1. -> https://github.com/llvm/llvm-project/pull/147026
2. https://github.com/llvm/llvm-project/pull/147255
3. https://github.com/llvm/llvm-project/pull/147302
4. https://github.com/llvm/llvm-project/pull/147513
2025-08-12 10:22:41 +01:00
Florian Hahn
004c67ea25
[LV] Vectorize maxnum/minnum w/o fast-math flags. (#148239)
Update LV to vectorize maxnum/minnum reductions without fast-math flags,
by adding an extra check in the loop if any inputs to maxnum/minnum are
NaN, due to maxnum/minnum behavior w.r.t. signaling NaNs. Signed zeros
are already handled consistently by maxnum/minnum.

If any input is NaN,
 * exit the vector loop,
 * compute the reduction result up to the vector iteration that contained
   NaN inputs, and
 * resume in the scalar loop.


New recurrence kinds are added for reductions using maxnum/minnum
without fast-math flags.
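
The bail-out scheme described above can be modelled in scalar C++. This is a sketch under assumptions: `VF` and the function names are hypothetical, `std::fmax` stands in for maxnum in the scalar loop, the vector body is modelled as a plain compare-select, and the start value is assumed not to be NaN.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical vectorization factor for the illustration.
constexpr int VF = 4;

// Scalar reference: fmax returns the non-NaN operand when one input is NaN.
float fmax_scalar(const float *a, int n, float start) {
  float r = start;
  for (int i = 0; i < n; ++i)
    r = std::fmax(r, a[i]);
  return r;
}

float fmax_with_nan_bailout(const float *a, int n, float start) {
  assert(n >= 0 && !std::isnan(start));
  float lanes[VF];
  for (int lane = 0; lane < VF; ++lane)
    lanes[lane] = start;
  int i = 0;
  for (; i + VF <= n; i += VF) {
    bool any_nan = false;
    for (int lane = 0; lane < VF; ++lane)
      any_nan = any_nan || std::isnan(a[i + lane]);
    if (any_nan)
      break; // exit the vector loop before consuming this chunk
    // Compare-select is only safe here because the chunk has no NaN.
    for (int lane = 0; lane < VF; ++lane)
      lanes[lane] = a[i + lane] > lanes[lane] ? a[i + lane] : lanes[lane];
  }
  // Reduction result up to the chunk that contained NaN inputs.
  float r = lanes[0];
  for (int lane = 1; lane < VF; ++lane)
    r = lanes[lane] > r ? lanes[lane] : r;
  // Resume in the scalar loop, starting at the chunk that was skipped.
  for (; i < n; ++i)
    r = std::fmax(r, a[i]);
  return r;
}
```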

PR: https://github.com/llvm/llvm-project/pull/148239
2025-07-18 21:58:19 +01:00
Ramkumar Ramachandra
62f8377e40
[LV] Extend FindFirstIV to unsigned case (#146386)
Extend FindFirstIV vectorization to the unsigned case by introducing and
handling FindFirstIVUMin.

Co-authored-by: Florian Hahn <flo@fhahn.com>
2025-07-09 15:56:40 +01:00
Florian Hahn
20fbbd7675
[LV] Add support for cmp reductions with decreasing IVs. (#140451)
Similar to FindLastIV, add FindFirstIVSMin to support select (icmp(), x, y)
reductions where one of x or y is a decreasing induction, producing an SMin
reduction. It uses signed max as sentinel value.

PR: https://github.com/llvm/llvm-project/pull/140451
2025-06-29 11:17:03 +01:00
Ramkumar Ramachandra
bb8c42e859
[LV] Extend FindLastIV to unsigned case (#141752)
Split the FindLastIV RecurKind into SMax and UMax variants, depending on
the reduction op produced.
2025-06-23 15:27:49 +01:00
Ramkumar Ramachandra
3f7b8852cd
[IVDesc] Drop unused arg in isConditionalRdxPattern (NFC) (#142942) 2025-06-06 08:48:23 +01:00
Ramkumar Ramachandra
b40e4ceaa6
[ValueTracking] Make Depth last default arg (NFC) (#142384)
Having a finite Depth (or recursion limit) for computeKnownBits is very
limiting, but is currently a load-bearing necessity, as all KnownBits
are recomputed on each call and there is no caching. As a prerequisite
for an effort to remove the recursion limit altogether, either using a
clever caching technique, or writing an easily-invalidable KnownBits
analysis, make the Depth argument in APIs in ValueTracking uniformly the
last argument with a default value. This would aid in removing the
argument when the time comes, as many callers that currently pass 0
explicitly are now updated to omit the argument altogether.
2025-06-03 17:12:24 +01:00
Ramkumar Ramachandra
0240129218
[IVDesc] Unify RecurKinds [I|F]AnyOf (#118393)
Co-authored-by: Mel Chen <mel.chen@sifive.com>
2025-05-23 11:57:30 +01:00
Ramkumar Ramachandra
b81170ecff
[IVDesc] Unify RecurKinds [I|F]FindLastIV (NFC) (#141082) 2025-05-22 22:48:01 +01:00
Ramkumar Ramachandra
bec038db5c
[IVDesc] Prefer empty m_Cmp on unused result (NFC) (#141071) 2025-05-22 17:32:08 +01:00
Mel Chen
f594cd0936
[IVDescriptor][LV] Return Instruction::Or for IAnyOf/FAnyOf in getOpcode(), nfc (#140242) 2025-05-19 16:17:04 +08:00
Mel Chen
08f0aa4800
[IVDescriptors] Call getOpcode on demand in getReductionOpChain. nfc (#118777)
Non-arithmetic reductions do not require the binary opcodes.
As a first step toward removing the dependency of non-arithmetic
reductions on `getOpcode` function, this patch refactors the
`getReductionOpChain` function.

In the future, once all users of `getOpcode` function are refactored, an
assertion can be added to `getOpcode` function to ensure that only
arithmetic reductions rely on it.
2025-04-30 17:01:14 +08:00
Florian Hahn
d68b446933
[IR] Add matchers for remaining FP min/max intrinsics (NFC). (#137612)
Add dedicated matchers for minimum, maximum, minimumnum and maximumnum
intrinsics, similar to the existing matchers for maxnum and minnum.

As suggested in https://github.com/llvm/llvm-project/pull/137335.

PR: https://github.com/llvm/llvm-project/pull/137612
2025-04-29 12:20:00 +01:00
Florian Hahn
ec1016f7ef
[IVDescriptors] Support reductions with minimumnum/maximumnum. (#137335)
Add a new reduction recurrence kind for reductions with
minimumnum/maximumnum. Such reductions can be vectorized without
nsz/nnans, same as reductions with maximum/minimum intrinsics.

Note that a new reduction kind is needed to make sure partial reductions
are also combined with minimumnum/maximumnum.

Note that the final reduction to a scalar value is performed with
vector.reduce.fmin/fmax. This should be fine, as the results of the
partial reductions with maximumnum/minimumnum silence any sNaNs.

In-loop reductions and reductions in SLP are not supported yet, as there's no
reduction version of maximumnum/minimumnum yet and fmax may be
incorrect.

PR: https://github.com/llvm/llvm-project/pull/137335
2025-04-28 11:16:36 +01:00
Kazu Hirata
8f5c3deadd
[Analysis] Use llvm::append_range (NFC) (#133602) 2025-03-29 16:52:36 -07:00
Luke Lau
345748e027
[IVDescriptor] Explicitly check for isMinMaxRecurrenceKind in getReductionOpChain. NFC (#132025)
There are other types of recurrences with an icmp/fcmp opcode, AnyOf and
FindLastIV, so don't rely on the opcode to detect them.
This makes adding support for AnyOf in #131830 easier.

Note that these currently fail the ExpectedUses/isCorrectOpcode checks
anyway, so there shouldn't be any functional change.
2025-03-20 21:22:55 +08:00
Ramkumar Ramachandra
4a0d53a0b0
PatternMatch: migrate to CmpPredicate (#118534)
With the introduction of CmpPredicate in 51a895a (IR: introduce struct
with CmpInst::Predicate and samesign), PatternMatch is one of the first
key pieces of infrastructure that must be updated to match a CmpInst
respecting samesign information. Implement this change to Cmp-matchers.

This is a preparatory step in migrating the codebase over to
CmpPredicate. Since no functional changes are desired at this stage,
we have chosen not to migrate CmpPredicate::operator==(CmpPredicate)
calls to use CmpPredicate::getMatching(), as that would have visible
impact on tests that are not yet written: instead, we call
CmpPredicate::operator==(Predicate), preserving the old behavior, while
also inserting a few FIXME comments for follow-ups.
2024-12-13 14:18:33 +00:00
Mel Chen
b3cba9be41
[LoopVectorize] Vectorize select-cmp reduction pattern for increasing integer induction variable (#67812)
Consider the following loop:
```
  int rdx = init;
  for (int i = 0; i < n; ++i)
    rdx = (a[i] > b[i]) ? i : rdx;
```
We can vectorize this loop if `i` is an increasing induction variable.
The final reduced value will be the maximum `i` for which the condition
`a[i] > b[i]` is satisfied, or the start value `init`.

This patch added new RecurKind enums - IFindLastIV and FFindLastIV.
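
A lane-based scalar sketch of the reduction described above, under assumptions: `VF` and the function names are hypothetical, and `INT_MIN` is used as a "never matched" sentinel, which is valid here only because `i` is non-negative.

```cpp
#include <cassert>
#include <climits>

// Hypothetical vectorization factor for the illustration.
constexpr int VF = 4;

// Scalar reference: the loop from the commit message.
int select_cmp_scalar(const int *a, const int *b, int n, int init) {
  int rdx = init;
  for (int i = 0; i < n; ++i)
    rdx = (a[i] > b[i]) ? i : rdx;
  return rdx;
}

int select_cmp_lanes(const int *a, const int *b, int n, int init) {
  assert(n >= 0);
  int lane_rdx[VF];
  for (int lane = 0; lane < VF; ++lane)
    lane_rdx[lane] = INT_MIN; // sentinel: this lane has not matched yet
  int i = 0;
  for (; i + VF <= n; i += VF)
    for (int lane = 0; lane < VF; ++lane)
      if (a[i + lane] > b[i + lane])
        lane_rdx[lane] = i + lane; // i only grows, so this is a running max
  // Final reduction: maximum over lanes, then map the sentinel back to init.
  int rdx = INT_MIN;
  for (int lane = 0; lane < VF; ++lane)
    rdx = lane_rdx[lane] > rdx ? lane_rdx[lane] : rdx;
  int result = (rdx == INT_MIN) ? init : rdx;
  for (; i < n; ++i) // scalar remainder
    result = (a[i] > b[i]) ? i : result;
  return result;
}
```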

---------

Co-authored-by: Alexey Bataev <5361294+alexey-bataev@users.noreply.github.com>
2024-12-12 16:48:31 +08:00
Ramkumar Ramachandra
2a0ee090db
IVDesc: strip redundant arg in getOpcode call (NFC) (#118476) 2024-12-03 13:40:51 +00:00
Kazu Hirata
236fda550d
[Analysis] Remove unused includes (NFC) (#114936)
Identified with misc-include-cleaner.
2024-11-05 19:11:34 -08:00
Alexey Bader
583fa4f5b7
[InstCombine] Extend fcmp+select folding to minnum/maxnum intrinsics (#112088)
Today, InstCombine can fold fcmp+select patterns to minnum/maxnum
intrinsics when the nnan and nsz flags are set. The ordering of the
operands in both the fcmp and select instructions is important for the
folding to occur.

maxnum patterns:
1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ogt, oge}
2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ule, ult}

The second pattern is supposed to make the order of the operands in the
select instruction irrelevant. However, the pattern matching code uses
the CmpInst::getInversePredicate method to invert the comparison
predicate. This method doesn't take into account the fast-math flags,
which can lead to missing the folding opportunity.

The patch extends the pattern matching code to handle unordered fcmp
instructions. This allows the folding to occur even when the select
instruction has the operands in the inverse order.

New maxnum patterns:
1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ugt, uge}
2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ole, olt}

The same changes are applied to the minnum intrinsic.
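
To see why the four maxnum patterns agree when NaNs are excluded, the predicates can be modelled in plain C++. This is only a sketch: the helper names are hypothetical, ordered predicates are written as direct comparisons, and unordered predicates as the negation of the opposite ordered comparison. Signed zeros are ignored, matching the nsz assumption.

```cpp
#include <cassert>
#include <cmath>

// Ordered predicates are false if either operand is NaN; unordered
// predicates are true in that case.
bool ogt(float a, float b) { return a > b; }
bool ole(float a, float b) { return a <= b; }
bool ugt(float a, float b) { return !(a <= b); }
bool ule(float a, float b) { return !(a > b); }

// Existing patterns.
float max_ogt(float a, float b) { return ogt(a, b) ? a : b; }
float max_ule_swapped(float a, float b) { return ule(a, b) ? b : a; }

// New patterns.
float max_ugt(float a, float b) { return ugt(a, b) ? a : b; }
float max_ole_swapped(float a, float b) { return ole(a, b) ? b : a; }
```

With NaN-free inputs, `ugt` coincides with `ogt` and `ole` with `ule`, so all four selects return the larger operand.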
2024-10-15 22:05:16 +04:00
Philip Reames
3d9abfc9f8 Consolidate all IR logic for getting the identity value of a reduction [nfc]
This change merges the three different places (at the IR layer) for
finding the identity value of a reduction into a single copy.  This
depends on several prior commits which fix ommissions and bugs in
the distinct copies, but this patch itself should be fully
non-functional.

As the new comments and naming try to make clear, the identity value
is a property of the @llvm.vector.reduce.* intrinsic, not of e.g.
the recurrence descriptor.  (We still provide an interface for
clients using recurrence descriptors, but the implementation simply
translates to the intrinsic which each corresponds to.)

As a note, the getIntrinsicIdentity API does not support fminnum/fmaxnum
or fminimum/fmaximum which is why we still need manual logic (but at
least only one copy of manual logic) for those cases.
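
As a toy illustration of why the identity value matters (not the LLVM API; `ToyReduction`, `identity`, and `masked_reduce` are hypothetical names), inactive lanes of a masked reduction can be replaced by the neutral element of the operation so that they cannot affect the result:

```cpp
#include <cassert>
#include <climits>

enum class ToyReduction { Add, Mul, And, Or, Xor, SMax, SMin };

// Neutral element of each operation over int.
int identity(ToyReduction k) {
  switch (k) {
  case ToyReduction::Add:  return 0;
  case ToyReduction::Mul:  return 1;
  case ToyReduction::And:  return -1; // all bits set
  case ToyReduction::Or:   return 0;
  case ToyReduction::Xor:  return 0;
  case ToyReduction::SMax: return INT_MIN;
  case ToyReduction::SMin: return INT_MAX;
  }
  return 0;
}

int apply(ToyReduction k, int x, int y) {
  switch (k) {
  case ToyReduction::Add:  return x + y;
  case ToyReduction::Mul:  return x * y;
  case ToyReduction::And:  return x & y;
  case ToyReduction::Or:   return x | y;
  case ToyReduction::Xor:  return x ^ y;
  case ToyReduction::SMax: return x > y ? x : y;
  case ToyReduction::SMin: return x < y ? x : y;
  }
  return x;
}

// Masked reduction: inactive lanes contribute the identity value.
int masked_reduce(ToyReduction k, const int *v, const bool *mask, int n) {
  assert(n >= 0);
  int acc = identity(k);
  for (int i = 0; i < n; ++i)
    acc = apply(k, acc, mask[i] ? v[i] : identity(k));
  return acc;
}
```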
2024-09-04 08:23:21 -07:00
Ramkumar Ramachandra
f119151537
IVDescriptors: improve readability of a function (NFC) (#106219)
Avoid dereferencing operand to llvm::isa.
2024-09-04 14:09:04 +01:00
Philip Reames
1fbb6b4efc
[LV] Prefer FLT_MIN/MAX for fmin/fmax reductions with ninf (#107141)
Analogous to 2c7786e94a1058bd4f96794a1d4f70dcb86e5cc5, cleanup a case
where the vectorizer is emitting a non-canonical identity value given
the available flags. We use largest/smallest value during ISEL, and VP
expansion, but not during vectorization.

Since the fmin/fmax/fminimum/fmaximum intrinsics don't require a start
value, this difference is only visible when masking of inactive lanes is
required.

Primary motivation of this change is simply to remove a difference
between versions of code which reason about the identity value of a
reduction so I can kill all but one off.

In review, it was pointed out that this is actually a functional fix as well. 
The old code used inf on a noinf reduction instruction - whose
result is poison!  That wasn't the intent of the code.
2024-09-03 12:21:54 -07:00
Philip Reames
0b2f2537a5 [LV] Separate AnyOf recurrence from getRecurrenceIdentity [NFC]
These recurrence types don't have a meaningful identity, and the
routine was abused to return the start value instead.  Out of the
three callers to this routine, only one actually wants this
behavior.  This is a prep change for removing the routine entirely
and commoning it with other copies of the same logic.
2024-09-03 09:46:30 -07:00
Philip Reames
68805de902 [IVDesc] Reuse getBinOpIdentity in getRecurrenceIdentity [nfc]
Avoid duplication so that we can easily tell these lists are in sync.
2024-08-30 09:10:34 -07:00
Ramkumar Ramachandra
ae58cc0e99
IVDescriptors: clarify getSCEV use in a function (NFC) (#106222)
getSCEV will assert unless the operand is SCEVable. Replace an instance
of the implementation of ScalarEvolution::isSCEVable (which checks that
the operand is either integer or pointer type) with a call to the
function, to make it clear that the subsequent use of getSCEV will not
fail.
2024-08-27 16:44:50 +01:00
Dinar Temirbulatov
31d4c97506
[LoopVectorize] LLVM fails to vectorise loops with multi-bool varables (#89226)
This change allows compare instructions in the loop that have multiple
uses, both inside and outside the loop, to be considered.

This change allows the following loop to be vectorised:
```
int foo(float* a, int n) {
  _Bool any = 0;
  _Bool all = 1;
  for (int i = 0; i < n; i++) {
    if (a[i] < 0.0f) {
      any = 1;
    } else {
      all = 0;
    }
  }
  return all ? 1 : any ? 2 : 3;
}
```
2024-07-15 20:21:50 +01:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Yingwei Zheng
470c5b8011
[InstSimplify][InstCombine] Remove unnecessary m_c_* matchers. (#81712)
This patch removes unnecessary `m_c_*` matchers since we always
canonicalize `commutive_op Cst, X` into `commutive_op X, Cst`.

Compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=bfc0b7c6891896ee8e9818f22800472510093864&to=d27b058bb9acaa43d3cadbf3cd889e8f79e5c634&stat=instructions:u
2024-02-14 16:40:36 +08:00
Kazu Hirata
6c87a0af95 [Analysis] Remove unnecessary includes (NFC) 2023-12-07 22:15:32 -08:00
Fangrui Song
111fcb0df0 [llvm] Fix duplicate word typos. NFC
Those fixes were taken from https://reviews.llvm.org/D137338
2023-09-01 18:25:16 -07:00
Mel Chen
425e9e81a0 [LV] Rename the Select[I|F]Cmp reduction pattern to [I|F]AnyOf. (NFC)
Regarding this NFC change, please refer to the discussion in this thread. https://reviews.llvm.org/D150851#4467261

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D155786
2023-08-03 00:37:19 -07:00
Nikita Popov
94abecca6b [IVDescriptors] Remove typed pointer support (NFC)
This also removes the element type from the descriptor, as it is
always i8. The meaning of the step is now the same between
integers and pointers.
2023-07-12 15:48:29 +02:00
Anna Thomas
ec146cb7c0 [LV] Add support for minimum/maximum intrinsics
{mini|maxi}mum intrinsics are different from {min|max}num intrinsics in
the propagation of NaN and signed zero. Also, the minnum/maxnum
intrinsics require the presence of nsz flags to be valid reductions in
the vectorizer. In this regard, we introduce a new recurrence kind and also
add support for identifying reduction patterns using these intrinsics.

The reduction intrinsics and lowering were introduced here: 26bfbec5d2.

There are tests added which show how this interacts across chains of
min/max patterns.

Differential Revision: https://reviews.llvm.org/D151482
2023-06-20 13:17:28 -04:00
Vedant Paranjape
cf9b3e55a2 [IVDescriptors] Add assert to isInductionPhi to check for invalid Phis
Phis that are present inside loop headers can only be Induction Phis
legally. This patch adds an assertion to isInductionPhi which checks for
the said legality and it also updates the docs of the said function to
reflect the given legality.

Differential Revision: https://reviews.llvm.org/D149041
2023-04-28 04:41:47 +00:00