117 Commits

Author SHA1 Message Date
Florian Hahn
d68b446933
[IR] Add matchers for remaining FP min/max intrinsics (NFC). (#137612)
Add dedicated matchers for minimum,maximum,minimumnum and maximumnum
intrinsics, similar for the existing matchers for maxnum and minnum.

As suggested in https://github.com/llvm/llvm-project/pull/137335.

PR: https://github.com/llvm/llvm-project/pull/137612
2025-04-29 12:20:00 +01:00
Florian Hahn
ec1016f7ef
[IVDescriptors] Support reductions with minimumnum/maximumnum. (#137335)
Add a new reduction recurrence kind for reductions with
minimumnum/maximumnum. Such reductions can be vectorized without
nsz/nnans, same as reductions with maximum/minimum intrinsics.

Note that a new reduction kind is needed to make sure partial reductions
are also combined with minimumnum/maximumnum.

Note that the final reduction to a scalar value is performed with
vector.reduce.fmin/fmax. This should be fine, as the results of the
partial reductions with maximumnum/minimumnum silences any sNaNs.

In-loop and reductions in SLP are not supported yet, as there's no
reduction version of maximumnum/minimumnum yet and fmax may be
incorrect.

PR: https://github.com/llvm/llvm-project/pull/137335
2025-04-28 11:16:36 +01:00
Kazu Hirata
8f5c3deadd
[Analysis] Use llvm::append_range (NFC) (#133602) 2025-03-29 16:52:36 -07:00
Luke Lau
345748e027
[IVDescriptor] Explicitly check for isMinMaxRecurrenceKind in getReductionOpChain. NFC (#132025)
There are other types of recurrences with an icmp/fcmp opcode, AnyOf and
FindLastIV, so don't rely on the opcode to detect them.
This makes adding support for AnyOf in #131830 easier.

Note that these currently fail the ExpectedUses/isCorrectOpcode checks
anyway, so there shouldn't be any functional change.
2025-03-20 21:22:55 +08:00
Ramkumar Ramachandra
4a0d53a0b0
PatternMatch: migrate to CmpPredicate (#118534)
With the introduction of CmpPredicate in 51a895a (IR: introduce struct
with CmpInst::Predicate and samesign), PatternMatch is one of the first
key pieces of infrastructure that must be updated to match a CmpInst
respecting samesign information. Implement this change to Cmp-matchers.

This is a preparatory step in migrating the codebase over to
CmpPredicate. Since we no functional changes are desired at this stage,
we have chosen not to migrate CmpPredicate::operator==(CmpPredicate)
calls to use CmpPredicate::getMatching(), as that would have visible
impact on tests that are not yet written: instead, we call
CmpPredicate::operator==(Predicate), preserving the old behavior, while
also inserting a few FIXME comments for follow-ups.
2024-12-13 14:18:33 +00:00
Mel Chen
b3cba9be41
[LoopVectorize] Vectorize select-cmp reduction pattern for increasing integer induction variable (#67812)
Consider the following loop:
```
  int rdx = init;
  for (int i = 0; i < n; ++i)
    rdx = (a[i] > b[i]) ? i : rdx;
```
We can vectorize this loop if `i` is an increasing induction variable.
The final reduced value will be the maximum of `i` that the condition
`a[i] > b[i]` is satisfied, or the start value `init`.

This patch added new RecurKind enums - IFindLastIV and FFindLastIV.

---------

Co-authored-by: Alexey Bataev <5361294+alexey-bataev@users.noreply.github.com>
2024-12-12 16:48:31 +08:00
Ramkumar Ramachandra
2a0ee090db
IVDesc: strip redundant arg in getOpcode call (NFC) (#118476) 2024-12-03 13:40:51 +00:00
Kazu Hirata
236fda550d
[Analysis] Remove unused includes (NFC) (#114936)
Identified with misc-include-cleaner.
2024-11-05 19:11:34 -08:00
Alexey Bader
583fa4f5b7
[InstCombine] Extend fcmp+select folding to minnum/maxnum intrinsics (#112088)
Today, InstCombine can fold fcmp+select patterns to minnum/maxnum
intrinsics when the nnan and nsz flags are set. The ordering of the
operands in both the fcmp and select instructions is important for the
folding to occur.

maxnum patterns:
1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ogt, oge}
2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ule, ult}

The second pattern is supposed to make the order of the operands in the
select instruction irrelevant. However, the pattern matching code uses
the CmpInst::getInversePredicate method to invert the comparison
predicate. This method doesn't take into account the fast-math flags,
which can lead missing the folding opportunity.

The patch extends the pattern matching code to handle unordered fcmp
instructions. This allows the folding to occur even when the select
instruction has the operands in the inverse order.

New maxnum patterns:
1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ugt, uge}
2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ole, olt}

The same changes are applied to the minnum intrinsic.
2024-10-15 22:05:16 +04:00
Philip Reames
3d9abfc9f8 Consolidate all IR logic for getting the identity value of a reduction [nfc]
This change merges the three different places (at the IR layer) for
finding the identity value of a reduction into a single copy.  This
depends on several prior commits which fix ommissions and bugs in
the distinct copies, but this patch itself should be fully
non-functional.

As the new comments and naming try to make clear, the identity value
is a property of the @llvm.vector.reduce.* intrinsic, not of e.g.
the recurrence descriptor.  (We still provide an interface for
clients using recurrence descriptors, but the implementation simply
translates to the intrinsic which each corresponds to.)

As a note, the getIntrinsicIdentity API does not support fminnum/fmaxnum
or fminimum/fmaximum which is why we still need manual logic (but at
least only one copy of manual logic) for those cases.
2024-09-04 08:23:21 -07:00
Ramkumar Ramachandra
f119151537
IVDescriptors: improve readability of a function (NFC) (#106219)
Avoid dereferencing operand to llvm::isa.
2024-09-04 14:09:04 +01:00
Philip Reames
1fbb6b4efc
[LV] Prefer FLT_MIN/MAX for fmin/fmax reductions with ninf (#107141)
Analogous to 2c7786e94a1058bd4f96794a1d4f70dcb86e5cc5, cleanup a case
where the vectorizer is emitting a non-canonical identity value given
the available flags. We use largest/smallest value during ISEL, and VP
expansion, but not during vectorization.

Since the fmin/fmax/fminimum/fmaximum intrinsics don't require a start
value, this difference is only visible when masking of inactive lanes is
required.

Primary motivation of this change is simply to remove a difference
between version of code which reason about the identity value of a
reduction so I can kill all but one off.

In review, it was pointed out that this is actually a functional fix as well. 
The old code used inf on a noinf reduction instruction - whose
result is poison!  That wasn't the intent of the code.
2024-09-03 12:21:54 -07:00
Philip Reames
0b2f2537a5 [LV] Separate AnyOf recurrence from getRecurrenceIdentity [NFC]
These recurrence types don't have a meaningful identity, and the
routine was abused to return the start value instead.  Out of the
three callers to this routine, only one actually wants this
behavior.  This is a prep change for removing the routine entirely
and commoning it with other copies of the same logic.
2024-09-03 09:46:30 -07:00
Philip Reames
68805de902 [IVDesc] Reuse getBinOpIdentity in getRecurrenceIdentity [nfc]
Avoid duplication so that we can easily tell these lists are in sync.
2024-08-30 09:10:34 -07:00
Ramkumar Ramachandra
ae58cc0e99
IVDescriptors: clarify getSCEV use in a function (NFC) (#106222)
getSCEV will assert unless the operand is SCEVable. Replace an instance
of the implementation of ScalarEvolution::isSCEVable (which checks that
the operand is either integer or pointer type) with a call to the
function, to make it clear that the subsequent use of getSCEV will not
fail.
2024-08-27 16:44:50 +01:00
Dinar Temirbulatov
31d4c97506
[LoopVectorize] LLVM fails to vectorise loops with multi-bool varables (#89226)
This change allows to consider compare instructions in the loop with
multiple use inside the loop and outside.

This change allows to vectorise this loop:
int foo(float* a, int n) {
  _Bool any = 0;
  _Bool all = 1;
  for (int i = 0; i < n; i++) {
    if (a[i] < 0.0f) {
      any = 1;
    } else {
      all = 0;
    }
  }
  return all ? 1 : any ? 2 : 3;
}
2024-07-15 20:21:50 +01:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Yingwei Zheng
470c5b8011
[InstSimplify][InstCombine] Remove unnecessary m_c_* matchers. (#81712)
This patch removes unnecessary `m_c_*` matchers since we always
canonicalize `commutive_op Cst, X` into `commutive_op X, Cst`.

Compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=bfc0b7c6891896ee8e9818f22800472510093864&to=d27b058bb9acaa43d3cadbf3cd889e8f79e5c634&stat=instructions:u
2024-02-14 16:40:36 +08:00
Kazu Hirata
6c87a0af95 [Analysis] Remove unnecessary includes (NFC) 2023-12-07 22:15:32 -08:00
Fangrui Song
111fcb0df0 [llvm] Fix duplicate word typos. NFC
Those fixes were taken from https://reviews.llvm.org/D137338
2023-09-01 18:25:16 -07:00
Mel Chen
425e9e81a0 [LV] Rename the Select[I|F]Cmp reduction pattern to [I|F]AnyOf. (NFC)
Regarding this NFC change, please refer to the discussion in this thread. https://reviews.llvm.org/D150851#4467261

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D155786
2023-08-03 00:37:19 -07:00
Nikita Popov
94abecca6b [IVDescriptors] Remove typed pointer support (NFC)
This also removes the element type from the descriptor, as it is
always i8. The meaning of the step is now the same between
integers and pointers.
2023-07-12 15:48:29 +02:00
Anna Thomas
ec146cb7c0 [LV] Add support for minimum/maximum intrinsics
{mini|maxi}mum intrinsics are different from {min|max}num intrinsics in
the propagation of NaN and signed zero. Also, the minnum/maxnum
intrinsics require the presence of nsz flags to be valid reductions in
vectorizer. In this regard, we introduce a new recurrence kind and also
add support for identifying reduction patterns using these intrinsics.

The reduction intrinsics and lowering was introduced here: 26bfbec5d2.

There are tests added which show how this interacts across chains of
min/max patterns.

Differential Revision: https://reviews.llvm.org/D151482
2023-06-20 13:17:28 -04:00
Vedant Paranjape
cf9b3e55a2 [IVDescriptors] Add assert to isInductionPhi to check for invalid Phis
Phis that are present inside loop headers can only be Induction Phis
legally. This patch adds an assertion to isInductionPhi which checks for
the said legality and it also updates the docs of the said function to
reflect the given legality.

Differential Revision: https://reviews.llvm.org/D149041
2023-04-28 04:41:47 +00:00
Florian Hahn
6b8d19d2b5
Recommit "[VPlan] Switch to checking sinking legality for recurrences in VPlan."
This reverts the revert commit 3d8ed8b5192a59104bfbd5bf7ac84d035ee0a4a5.

The new version of the patch adds a set to avoid duplicating work in
isFixedOrderRecurrence, which was previously done through the removed
SinkAfter map.

Original commit message:
    Building on D142885 and D142589, retire the SinkAfter map from the
    recurrence handling code. It is replaced by checking whether it is
    possible to sink all users of a recurrence directly in VPlan. This
    results in simpler code overall and allows to handle additional cases
    (see the improvements in @test_crash).

    Depends on D142885.
    Depends on D142589.

    Reviewed By: Ayal

    Differential Revision: https://reviews.llvm.org/D142886
2023-04-20 09:31:16 +01:00
Manoj Gupta
3d8ed8b519 Revert "[VPlan] Switch to checking sinking legality for recurrences in VPlan."
This reverts commit 7fc0b3049df532fce726d1ff6869a9f6e3183780.

Causes a clang hang when building xz utils, github issue #62187.
2023-04-17 12:19:36 -07:00
Florian Hahn
7fc0b3049d
[VPlan] Switch to checking sinking legality for recurrences in VPlan.
Building on D142885 and D142589, retire the SinkAfter map from the
recurrence handling code. It is replaced by checking whether it is
possible to sink all users of a recurrence directly in VPlan. This
results in simpler code overall and allows to handle additional cases
(see the improvements in @test_crash).

Depends on D142885.
Depends on D142589.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D142886
2023-04-13 22:00:52 +01:00
Philip Reames
c416f6700f [IVDescriptors] Add pointer InductionDescriptors with non-constant strides (try 2)
(JFYI - This has been heavily reframed since original attempt at landing.)

This change updates the InductionDescriptor logic to allow matching a pointer IV with a non-constant stride, but also updates the LoopVectorizer to bailout on such descriptors by default. This preserves the default vectorizer behavior.

In review, it was pointed out that there's multiple unfortunate performance implications which need to be addressed before this can be enabled. Having a flag allows us to exercise the behavior, and write test cases for logic which is otherwise unreachable (or hard to reach).

This will also enable non-constant stride pointer recurrences for other consumers. I've audited said code, and don't see any obvious issues.

Differential Revision: https://reviews.llvm.org/D147336
2023-04-05 09:32:35 -07:00
David Green
965a090f02 Revert "[IVDescriptors] Add pointer InductionDescriptors with non-constant strides"
Multiple errors have being reported on
https://reviews.llvm.org/rG498aa534f472d28db893aa9a8627d0b46e17f312

Reverting until the correctness issues can be resolved.

We are also seeing a lot of performance differences from the patch.  Some are
looking good, but some are looking pretty bad.
2023-03-31 11:08:50 +01:00
Philip Reames
498aa534f4 [IVDescriptors] Add pointer InductionDescriptors with non-constant strides
This matches the handling for integer IVs.  I left the non-opaque cases alone, mostly because they're largely irrelevant today.

This doesn't actually make much difference in vectorization right now as we immediately fail on aliasing checks (which also bail on non-constant strides).  Slightly suprisingly, it's the case which *do* need runtime checks which work after this patch as they don't use the same dependency analysis path.

This will also enable non-constant stride pointer recurrences for other consumers.  I've auditted said code, and don't see any obvious issues.
2023-03-30 11:56:00 -07:00
Craig Topper
80910d6ceb [IVDescriptors] Pass IsSigned when creating an all 1s constant for UMin recurrence.
This only matters for types larger than i64, and is consistent with
the code for RecurKind::And which also creates all 1s.

We don't have any tests for UMin or And with types larger than i64.
2023-03-08 09:52:42 -08:00
Kazu Hirata
f8f3db2756 Use APInt::count{l,r}_{zero,one} (NFC) 2023-02-19 22:04:47 -08:00
Matt Devereau
8ff47f6032 [LoopVectorize] Enable integer Mul and Add as select reduction patterns
This patch vectorizes Phi node loop reductions for select's whos condition
comes from a floating-point comparison, with its operands being integers
for Add, Sub, and Mul reductions.

Example:

int foo(float *x, int n) {
    int sum = 0;
    for (int i=0; i<n; ++i) {
        float elem = x[i];
        if (elem > 0) {
            sum += 2;
        }
    }
    return sum;
}

This would previously fail to vectorize due to the integer reduction.
2023-01-30 09:41:40 +00:00
Kazu Hirata
526966d07d Use llvm::bit_ceil (NFC)
Note that:

  std::has_single_bit(X) ? X : llvm::NextPowerOf2(X);

is equivalent to:

  std::bit_ceil(X)

even for input 0.
2023-01-28 16:13:09 -08:00
Matt Devereau
4468e27d9f Revert "[LoopVectorize] Enable integer Mul and Add as select reduction patterns"
This reverts commit f90103851f9a381bbf7ed6da250217577afd00d2.
2023-01-26 12:02:16 +00:00
Matt Devereau
f90103851f [LoopVectorize] Enable integer Mul and Add as select reduction patterns
This patch vectorizes Phi node loop reductions for select's whos condition
comes from a floating-point comparison, with its operands being integers
for Add, Sub, and Mul reductions.

Example:

int foo(float *x, int n) {
    int sum = 0;
    for (int i=0; i<n; ++i) {
        float elem = x[i];
        if (elem > 0) {
            sum += 2;
        }
    }
    return sum;
}

Differential Revision: https://reviews.llvm.org/D141842
2023-01-25 13:25:18 +00:00
Piotr Fusik
898b5c9f5e [NFC] Fix "form/from" typos
Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D142007
2023-01-22 20:05:51 +01:00
Guillaume Chatelet
8fd5558b29 [NFC] Use TypeSize::geFixedValue() instead of TypeSize::getFixedSize()
This change is one of a series to implement the discussion from
https://reviews.llvm.org/D141134.
2023-01-11 16:49:38 +00:00
Karthik Senthil
d9c52c31a0 [LV][IVDescriptors] Fix recurrence identity element for FMin and FMax reductions
For a min and max reduction idioms, the identity (i.e. neutral) element
should be datatype's highest and lowest possible values respectively.
Current implementation in IVDescriptors incorrectly returns -Inf for FMin
reduction and +Inf for FMax reduction. This patch fixes this bug which
was causing incorrect reduction computation results in loops vectorized
by LV.

Differential Revision: https://reviews.llvm.org/D137220
2022-11-04 10:39:37 -04:00
Guozhi Wei
ded26bf6b9 [IVDescriptors] Before moving an instruction in SinkAfter checking if it is target of other instructions
The attached test case can cause LLVM crash in buildVPlanWithVPRecipes because
invalid VPlan is generated.

  FIRST-ORDER-RECURRENCE-PHI ir<%792> = phi ir<%501>, ir<%806>
  CLONE ir<%804> = fdiv ir<1.000000e+00>, vp<%17>      // use of %17
  CLONE ir<%806> = load ir<%805>
  EMIT vp<%17> = first-order splice ir<%792> ir<%806>   // def of %17
  ...

There is a use before def error on %17.

When vectorizer generates a VPlan, it generates a "first-order splice"
instruction for a loop carried variable after its definition. All related PHI
users are changed to use this "first-order splice" result, and are moved after
it. The move is guided by a MapVector SinkAfter. And the content of SinkAfter is
filled by RecurrenceDescriptor::isFixedOrderRecurrence.

Let's look at the first PHI and related instructions

  %v792 = phi double [ %v806, %Loop ], [ %d1, %Entry ]
  %v802 = fdiv double %v794, %v792
  %v804 = fdiv double 1.000000e+00, %v792
  %v806 = load double, ptr %v805, align 8

%v806 is a loop carried variable, %v792 is related PHI instruction. Vectorizer
will generated a new "first-order splice" instruction for %v806, and it will be
used by %v802 and %v804. So %v802 and %v804 will be moved after %v806 and its
"first-order splice" instruction. So SinkAfter contains

   %v802   ->  %v806
   %v804   ->  %v802

It means %v802 should be moved after %v806 and %v804 will be moved after %v802.
Please pay attention that the order is important.

When isFixedOrderRecurrence processing PHI instruction %v794, related
instructions are

  %v793 = phi double [ %v813, %Loop ], [ %d1, %Entry ]
  %v794 = phi double [ %v793, %Loop ], [ %d2, %Entry ]
  %v802 = fdiv double %v794, %v792
  %v813 = load double, ptr %v812, align 8

This time its related loop carried variable is %v813, its user is %v802. So
%v802 should also be moved after %v813. But %v802 is already in SinkAfter,
because %v813 is later than %v806, so the original %v802 entry in SinkAfter is
deleted, a new %v802 entry is added. Now SinkAfter contains

  %v804   ->  %v802
  %v802   ->  %v813

With these data, %v802 can still be moved after all its operands, but %v804
can't be moved after %v806 and its "first-order splice" instruction. And causes
use before def error.

So when remove/re-insert an instruction I in SinkAfter, we should also
recursively remove instructions targeting I and re-insert them into SinkAfter.
But for simplicity I just bail out in this case.

Differential Revision: https://reviews.llvm.org/D134083
2022-10-03 18:47:51 +00:00
Florian Hahn
b8709a9d03
[LV] Support fixed order recurrences.
If the incoming previous value of a fixed-order recurrence is a phi in
the header, go through incoming values from the latch until we find a
non-phi value. Use this as the new Previous, all uses in the header
will be dominated by the original phi, but need to be moved after
the non-phi previous value.

At the moment, fixed-order recurrences are modeled as a chain of
first-order recurrences.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D119661
2022-08-18 19:15:52 +01:00
Fangrui Song
de9d80c1c5 [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
2022-08-08 11:24:15 -07:00
Kazu Hirata
601b3a13de [Analysis] Qualify auto variables in for loops (NFC) 2022-07-16 23:26:34 -07:00
Kazu Hirata
92a1b2afc8 [Analysis] Remove isArithmeticRecurrenceKind
The last use was removed on Jul 30, 2021 in commit
9d355949937038c32c7608ebb558bbc3984f6340.
2022-07-16 13:23:32 -07:00
Igor Kirillov
4e5e042d9a [LoopVectorize] Support reductions that store intermediary result
Adds ability to vectorize loops containing a store to a loop-invariant
address as part of a reduction that isn't converted to SSA form due to
lack of aliasing info. Runtime checks are generated to ensure the store
does not alias any other accesses in the loop.

Ordered fadd reductions are not yet supported.

Differential Revision: https://reviews.llvm.org/D110235
2022-05-03 10:12:30 +01:00
David Green
9727c77d58 [NFC] Rename Instrinsic to Intrinsic 2022-04-25 18:13:23 +01:00
Florian Hahn
a2979c8399
[IVDescriptors] Bail out instead of asserting that order is expected.
When dealing with multiple phis that depend on each other, the order
might have been changed and may not match the expectation. If that
happens, bail out, rather than asserting.

Fixes https://github.com/llvm/llvm-project/issues/54218
Fixes https://github.com/llvm/llvm-project/issues/54233
Fixes https://github.com/llvm/llvm-project/issues/54254
2022-03-07 19:57:26 +00:00
Florian Hahn
de8ac485e5
[IVDescriptor] Remove SinkCandidate from SinkAfter before re-sinking.
This ensures the right order in the sink-after map is maintained. If we
re-sink an instruction, it must be sunk after all earlier instructions
have been sunk.

Fixes https://github.com/llvm/llvm-project/issues/54223
2022-03-05 19:48:26 +00:00
Florian Hahn
5a60260efe
[IVDescriptor] Use DT to check order of Previous, OtherPrev.
Previous and OhterPrev may not be in the same block. Use DT::dominates
instead of local comesBefore. DT::dominates is already used earlier to
check the order of Previous and SinkCandidate.

Fixes https://github.com/llvm/llvm-project/issues/54195
2022-03-04 11:07:42 +00:00
Florian Hahn
139215af8e
[IVDescriptor] Find original 'Previous' for first-order recurrences.
This patch extends first-order recurrence handling to support cases
where we already sunk an instruction for a different recurrence, but
LastPrev comes before Previous.

To handle those cases correctly, we need to find the earliest entry for
the sink-after chain, because this is references the Previous from the
original recurrence. This is needed to ensure we use the correct
instruction as sink point.

Depends on D118558.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D118642
2022-03-03 16:41:26 +00:00