1067 Commits

Author SHA1 Message Date
Nikita Popov
2dc44b3a7b
[InstCombine] Fix multi-use handling for multi-GEP rewrite (#146689)
If we're expanding offsets for a chain of GEPs in RewriteGEPs mode, we
should also rewrite GEPs that have one-use themselves, but are kept
alive by a multi-use GEP later in the chain.

For the sake of simplicity, I've changed this to just skip the one-use
condition entirely (which will perform an unnecessary rewrite of a no
longer used GEP, but shouldn't otherwise matter).
2025-07-02 16:45:27 +02:00
Ricardo Jesus
84c849e85b
[InstCombine] Combine interleaved recurrences. (#143878)
Combine sequences such as:
```llvm
  %pn1 = phi [init1, %BB1], [%op1, %BB2]
  %pn2 = phi [init2, %BB1], [%op2, %BB2]
  %op1 = binop %pn1, constant1
  %op2 = binop %pn2, constant2
  %rdx = binop %op1, %op2
```
Into:
```llvm
  %phi_combined = phi [init_combined, %BB1], [%op_combined, %BB2]
  %rdx_combined = binop %phi_combined, constant_combined
```

This allows us to simplify interleaved reductions, for example as
introduced by the loop vectorizer.

The anecdotal example for this is the loop below:
```c
float foo() {
  float q = 1.f;
  for (int i = 0; i < 1000; ++i)
    q *= .99f;
  return q;
}
```
Which currently gets lowered explicitly such as (on AArch64,
interleaved by four):
```gas
.LBB0_1:
  fmul    v0.4s, v0.4s, v1.4s
  fmul    v2.4s, v2.4s, v1.4s
  fmul    v3.4s, v3.4s, v1.4s
  fmul    v4.4s, v4.4s, v1.4s
  subs    w8, w8, #32
  b.ne    .LBB0_1
```
But with this patch lowers trivially:
```gas
foo:
  mov     w8, #5028
  movk    w8, #14389, lsl #16
  fmov    s0, w8
  ret
```
2025-07-01 09:54:38 +01:00
Kazu Hirata
2a35414e98
[Transforms] Use range-based for loops (NFC) (#145252)
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-06-25 10:08:26 -07:00
Philip Reames
a84891698a
[instcombine] Scalarize operands of vector geps if possible (#145402)
If we have a gep with vector indices which were splats (either constants
or shuffles), prefer the scalar form of the index. If all operands are
scalarizable, then prefer a scalar gep with splat following.

This does loose some information about undef/poison lanes, but I'm not
sure that's significant versus the number of downstream transformations
which get confused by having to manual scalarize operands.
2025-06-24 09:48:00 -07:00
Jameson Nash
940ff110d7
[InstCombine] fix hwasan mistake in "remove dead loads" (#145057)
Detected by CI after #143958.
2025-06-20 12:22:59 -04:00
Jameson Nash
96ab74bf17
[InstCombine] remove undef loads, such as memcpy from undef (#143958)
Extend `isAllocSiteRemovable` to be able to check if the ModRef info
indicates the alloca is only Ref or only Mod, and be able to remove it
accordingly. It seemed that there were a surprising number of
benchmarks with this pattern which weren't getting optimized previously
(due to MemorySSA walk limits). There were somewhat more existing tests
than I'd like to have modified which were simply doing exactly this
pattern (and thus relying on undef memory). Claude code contributed the
new tests (and found an important typo that I'd made).

This implements the discussion in
https://github.com/llvm/llvm-project/pull/143782#discussion_r2142720376.
2025-06-20 10:32:31 -04:00
Philip Reames
b5aaf9d988
[InstCombine] Implement vp.reverse reordering/elimination through binop/unop (#143963)
This simply copies the structure of the vector.reverse patterns from
just above, and reimplements them for the vp.reverse intrinsics when the
mask is all ones and the EVLs exactly match.

Its unfortunate that we have three different ways to represent a reverse
(shuffle, vector.reverse, and vp.reverse) but I don't see an obvious way
to remove any them because the semantics are slightly different.

This significantly improves vectorization in TSVC_2's s112 and s1112
loops when using EVL tail folding.
2025-06-18 08:53:45 -07:00
Jeremy Morse
9eb0020555
[DebugInfo][RemoveDIs] Remove a swathe of debug-intrinsic code (#144389)
Seeing how we can't generate any debug intrinsics any more: delete a
variety of codepaths where they're handled. For the most part these are
plain deletions, in others I've tweaked comments to remain coherent, or
added a type to (what was) type-generic-lambdas.

This isn't all the DbgInfoIntrinsic call sites but it's most of the
simple scenarios.

Co-authored-by: Nikita Popov <github@npopov.com>
2025-06-17 15:55:14 +01:00
Konstantin Bogdanov
07fa6d1d90
[InstCombine] Avoid folding select(umin(X, Y), X) with min/max values in false arm (#143020)
Fixes https://github.com/llvm/llvm-project/issues/139050.

This patch adds a check to avoid folding min/max reduction into select, which may block loop vectorization.

The issue is that the following snippet:
```
declare i8 @llvm.umin.i8(i8, i8)

define i8 @masked_min_fold_bug(i8 %acc, i8 %val, i8 %mask) {
; CHECK-LABEL: @masked_min_fold_bug(
; CHECK:       %cond = icmp eq i8 %mask, 0
; CHECK:       %masked_val = select i1 %cond, i8 %val, i8 255
; CHECK:       call i8 @llvm.umin.i8(i8 %acc, i8 %masked_val)
;
  %cond = icmp eq i8 %mask, 0
  %masked_val = select i1 %cond, i8 %val, i8 255
  %res = call i8 @llvm.umin.i8(i8 %acc, i8 %masked_val)
  ret i8 %res
}
```

is being optimized to the following code, which can not be vectorized
later.
```
declare i8 @llvm.umin.i8(i8, i8) #0

define i8 @masked_min_fold_bug(i8 %acc, i8 %val, i8 %mask) {
  %cond = icmp eq i8 %mask, 0
  %1 = call i8 @llvm.umin.i8(i8 %acc, i8 %val)
  %res = select i1 %cond, i8 %1, i8 %acc
  ret i8 %res
}

attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
```

Expected:
```
declare i8 @llvm.umin.i8(i8, i8) #0

define i8 @masked_min_fold_bug(i8 %acc, i8 %val, i8 %mask) {
  %cond = icmp eq i8 %mask, 0
  %masked_val = select i1 %cond, i8 %val, i8 -1
  %res = call i8 @llvm.umin.i8(i8 %acc, i8 %masked_val)
  ret i8 %res
}

attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
```

https://godbolt.org/z/cYMheKE5r
2025-06-14 14:32:54 +08:00
Nikita Popov
2019553a0b [InstCombine] Extract EmitGEPOffsets() helper (NFC)
Extract a reusable helper for emitting a sum of multiple GEP
offsets.
2025-06-13 12:34:18 +02:00
Stephen Tozer
a08a831515
[DLCov][NFC] Propagate annotated DebugLocs through transformations (#138047)
Part of the coverage-tracking feature, following #107279.

In order for DebugLoc coverage testing to work, we firstly have to set
annotations for intentionally-empty DebugLocs, and secondly we have to
ensure that we do not drop these annotations as we propagate DebugLocs
throughout compilation. As the annotations exist as part of the DebugLoc
class, and not the underlying DILocation, they will not survive a
DebugLoc->DILocation->DebugLoc roundtrip. Therefore this patch modifies
a number of places in the compiler to propagate DebugLocs directly
rather than via the underlying DILocation. This has no effect on the
output of normal builds; it only ensures that during coverage builds, we
do not drop incorrectly annotations and therefore create false
positives.

The bulk of these changes are in replacing
DILocation::getMergedLocation(s) with a DebugLoc equivalent, and in
changing the IRBuilder to store a DebugLoc directly rather than storing
DILocations in its general Metadata array. We also use a new function,
`DebugLoc::orElse`, which selects the "best" DebugLoc out of a pair
(valid location > annotated > empty), preferring the current DebugLoc on
a tie - this encapsulates the existing behaviour at a few sites where we
_may_ assign a DebugLoc to an existing instruction, while extending the
logic to handle annotation DebugLocs at the same time.
2025-06-12 14:06:27 +01:00
Acthinks Yang
ca4dfca5c7
[InstCombine] Relax guard against FP min/max in select fold (#143144)
FCmp's commutativity predicates do not work with min/max semantics

Closes #142711
2025-06-07 14:26:43 +08:00
Ramkumar Ramachandra
b40e4ceaa6
[ValueTracking] Make Depth last default arg (NFC) (#142384)
Having a finite Depth (or recursion limit) for computeKnownBits is very
limiting, but is currently a load-bearing necessity, as all KnownBits
are recomputed on each call and there is no caching. As a prerequisite
for an effort to remove the recursion limit altogether, either using a
clever caching technique, or writing a easily-invalidable KnownBits
analysis, make the Depth argument in APIs in ValueTracking uniformly the
last argument with a default value. This would aid in removing the
argument when the time comes, as many callers that currently pass 0
explicitly are now updated to omit the argument altogether.
2025-06-03 17:12:24 +01:00
Luke Lau
9262e37d8c
[InstCombine] Fold shuffled intrinsic operands with constant operands (#141300)
We currently pull shuffles through binops and intrinsics, which is an
important canonical form for VectorCombine to be able to scalarize
vector sequences. But while binops can be folded with a constant
operand, intrinsics currently require all operands to be shufflevectors.

This extends intrinsic folding to be in line with regular binops by
reusing the constant "unshuffling" logic.

As far as I can tell the list of currently folded intrinsics don't
require any special UB handling.

This change in combination with #138095 and #137823 fixes the following
C:

```c
void max(int *x, int *y, int n) {
  for (int i = 0; i < n; i++)
    x[i] += *y > 42 ? *y : 42;
}
```

Not using the splatted vector form on RISC-V with `-O3 -march=rva23u64`:

```asm
	vmv.s.x	v8, a4
	li	a4, 42
	vmax.vx	v10, v8, a4
	vrgather.vi	v8, v10, 0
.LBB0_9:                                # %vector.body
                                        # =>This Inner Loop Header: Depth=1
	vl2re32.v	v10, (a5)
	vadd.vv	v10, v10, v8
	vs2r.v	v10, (a5)
```

i.e., it now generates

```asm
        li	a6, 42
        max	a6, a4, a6
.LBB0_9:                                # %vector.body
                                        # =>This Inner Loop Header: Depth=1
	vl2re32.v	v8, (a5)
	vadd.vx	v8, v8, a6
	vs2r.v	v8, (a5)
```
2025-05-28 10:57:08 +01:00
Luke Lau
8f9549c414
[InstCombine] Refactor fixed and scalable binop shuffle combine. NFCI (#141287)
This extracts the logic that works out the "unshuffled" constant when
pulling shuffle vectors out of binary ops, so the same combine can be
generic over fixed and scalable vectors.

The plan is to reuse this helper to do the same canonicalization on
intrinsics too.
2025-05-24 15:16:16 +01:00
Luke Lau
4b4699a13c
[InstCombine] Don't cover up poison elements for shifts when folding shuffles thru binops (#141303)
As noted in the TODO, we don't need to cover up the poison elements
placed in the unused lanes for shifts, since it's not UB unlike div/rem.

New poison elements are only introduced in cases like

ShMask = <1,1,2,2> and C = <5,5,6,6> --> NewC = <poison,5,6,poison>

And the resulting shuffle won't use the poison lanes.
2025-05-24 13:47:18 +01:00
Luke Lau
7b2fc48c27
[InstCombine] Remove dead poison check. NFCI (#141264)
As far as I understand any binary op with poison as either operand will
constant fold to poison, so this check will never trigger.
`llvm::ConstantFoldBinaryInstruction` seems to confirm this?

I think this ended up getting left behind because originally
shufflevectors with undef indices produced undef elements, and we
couldn't pull the shuffle across some binops like `or undef, -1 --> -1`.

This code was added in 8c655150827b5d56772e628994db08441c554097 to
partially fix it and further extended in
f7499011ca29bebeda7c9d79d79b290cf0b8b46d, originally checking for undef
but changed to check for poison in cd54c47424456

But nowadays shufflevectors with undef indices are treated as poison
indices as of 575fdea70a86f68b0d303a9a3273fc47f810628a, and so produce
poison elements, so this is no longer an issue
2025-05-23 20:21:12 +01:00
Yingwei Zheng
6ee30e8dd8
[InstCombine] Fix incorrect number of iterations (#140004) 2025-05-15 21:47:18 +08:00
Yingwei Zheng
6f1f6d184f
[InstCombine][DebugInfo] Update debug value uses in freelyInvertAllUsersOf (#137013)
This patch updates all debug value uses in `freelyInvertAllUsersOf` by
inserting `DW_OP_not` at the front of the DIExpression.

Related issue: https://github.com/llvm/llvm-project/issues/71065
2025-05-13 08:56:22 +08:00
Luke Lau
cf3242f3b0
[InstCombine] Pull shuffles out of binops with splatted ops (#137948)
Given a binary op on splatted vector and a splatted constant,
InstCombine will normally pull the shuffle out in
`InstCombinerImpl::foldVectorBinop`:

```llvm
define <4 x i32> @f(i32 %x) {
  %x.insert = insertelement <4 x i32> poison, i32 %x, i64 0
  %x.splat = shufflevector <4 x i32> %x.insert, <4 x i32> poison, <4 x i32> zeroinitializer
  %res = add <4 x i32> %x.splat, splat (i32 42)
  ret <4 x i32> %res
}
```

```llvm
define <4 x i32> @f(i32 %x) {
  %x.insert = insertelement <4 x i32> poison, i32 %x, i64 0
  %1 = add <4 x i32> %x.insert, <i32 42, i32 poison, i32 poison, i32 poison>
  %res = shufflevector <4 x i32> %1, <4 x i32> poison, <4 x i32> zeroinitializer
  ret <4 x i32> %res
}
```

However, this currently only operates on fixed length vectors. Splats of
scalable vectors don't currently have their shuffle pulled out, e.g:

```llvm
define <vscale x 4 x i32> @f(i32 %x) {
  %x.insert = insertelement <vscale x 4 x i32> poison, i32 %x, i64 0
  %x.splat = shufflevector <vscale x 4 x i32> %x.insert, <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer
  %res = add <vscale x 4 x i32> %x.splat, splat (i32 42)
  ret <vscale x 4 x i32> %res
}
```

Having this canonical form with the shuffle pulled out is important as
VectorCombine relies on it in order to scalarize binary ops in
`scalarizeBinopOrCmp`, which would prevent the need for #137786. This
also brings it in line for scalable binary ops with two non-constant
operands: https://godbolt.org/z/M9f7ebzca

This adds a combine just after the fixed-length version, but restricted
to splats at index 0 so that it also handles the scalable case:

So the whilst the existing combine looks like: `Op(shuffle(V1, Mask), C)
-> shuffle(Op(V1, NewC), Mask)`

This patch adds: `Op(shuffle(V1, 0), (splat C)) -> shuffle(Op(V1, (splat
C)), 0)`

I think this could be generalized to other splat indexes that aren't
zero, but I think it would be dead code since only fixed-length vectors
can have non-zero shuffle indices, which would be covered by the
existing combine.
2025-05-13 00:11:55 +01:00
Matt Arsenault
9383fb23e1
Reapply "IR: Remove uselist for constantdata (#137313)" (#138961)
Reapply "IR: Remove uselist for constantdata (#137313)"

This reverts commit 5936c02c8b9c6d1476f7830517781ce8b6e26e75.

Fix checking uselists of constants in assume bundle queries
2025-05-08 08:00:09 +02:00
Kirill Stoimenov
5936c02c8b Revert "IR: Remove uselist for constantdata (#137313)"
Possibly breaks the build: https://lab.llvm.org/buildbot/#/builders/24/builds/8119

This reverts commit 87f312aad6ede636cd2de5d18f3058bf2caf5651.
2025-05-07 00:07:55 +00:00
Matt Arsenault
87f312aad6
IR: Remove uselist for constantdata (#137313)
This is a resurrected version of the patch attached to this RFC:

https://discourse.llvm.org/t/rfc-constantdata-should-not-have-use-lists/42606

In this adaptation, there are a few differences. In the original patch, the Use's
use list was replaced with an unsigned* to the reference count in the value. This
version leaves them as null and leaves the ref counting only in Value.

Remove use-lists from instances of ConstantData (which are shared
across modules and have no operands).

To continue supporting most of the use-list API, store a ref-count in
place of the use-list; this is for API like Value::use_empty and
Value::hasNUses.  Operations that actually need the use-list -- like
Value::use_begin -- will assert.

This change has three benefits:

 1. The compiler output cannot in any way depend on the use-list order
    of instances of ConstantData.

 2. There's no use-list traffic when adding and removing simple
    constants from operand lists (although there is ref-count traffic;
    YMMV).

 3. It's cheaper to serialize use-lists (since we're no longer
    serializing the use-list order of things like i32 0).

The downside is that you can't look at all the users of ConstantData,
but traversals of users of i32 0 are already ill-advised.

Possible follow-ups:
  - Track if an instance of a ConstantVector/ConstantArray/etc. is known
    to have all ConstantData arguments, and drop the use-lists to
    ref-counts in those cases.  Callers need to check Value::hasUseList
    before iterating through the use-list.
  - Remove even the ref-counts.  I'm not sure they have any benefit
    besides minimizing the scope of this commit, and maintaining the
    counts is not free.

Fixes #58629

Co-authored-by: Duncan P. N. Exon Smith <dexonsmith@apple.com>
2025-05-06 17:20:37 +02:00
Nikita Popov
b492ec5899
[ErrorHandling] Add reportFatalInternalError + reportFatalUsageError (NFC) (#138251)
This implements the result of the discussion at:

https://discourse.llvm.org/t/rfc-report-fatal-error-and-the-default-value-of-gencrashdialog/73587

There are two different use cases for report_fatal_error, so replace it
with two functions reportFatalInternalError() and
reportFatalUsageError(). The former indicates a bug in LLVM and
generates a crash dialog. The latter does not. The names have been
suggested by rnk and people seemed to like them.

This replaces a lot of the usages that passed an explicit value for
GenCrashDiag. I did not bulk replace remaining report_fatal_error usage
-- they probably require case by case review for which function to use.
2025-05-05 12:10:03 +02:00
clubby789
b006756d44
[InstCombine] Fix crash when alloc functions are missing alloc-family (#138310)
Fixes #63477 by bailing out instead of crashing.

Co-authored-by: Jamie <jamie@osec.io>
2025-05-03 09:53:57 +02:00
Kazu Hirata
799916ae10
[llvm] Construct SmallVector with iterator ranges (NFC) (#136064) 2025-04-16 19:29:47 -07:00
Björn Pettersson
092b6e73e6
[InstCombine] Handle "add like" in ADD+GEP->GEP+GEP rewrites (#135156)
Considering that "or disjoint" is the canonical for certain add
operations, then I think we want to support such "add like" operations
when doing ADD+GEP->GEP+GEP rewrites to make things more consistent.

Problem was found when improving ValueTracking, which turned an ADD into
OR, and then suddenly optimizations got worse due to these rewrites no
longer triggering.
2025-04-14 17:11:13 +02:00
Björn Pettersson
29555ad5ef
[InstCombine] Improve inbounds preservation for ADD+GEP -> GEP+GEP (#135155)
Given that we have a "add nuw" and a "getelementptr inbounds nuw" like
this:
   %idx = add nuw i64 %idx1, %idx2
   %gep = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx

Then we can preserve the "inbounds nuw" flag when transforming that into
two getelementptr instructions:
   %gep1 = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx1
   %gep = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx2

Similarly for just having "nuw", and "nusw nuw" instead of "inbounds nuw"
on the getelementptr.

Proof: https://alive2.llvm.org/ce/z/QSweWW
2025-04-14 11:03:06 +02:00
Stephen Tozer
93505f8e0e
[DebugInfo][InstCombine] Propagate DILocation when noop-ing invoke (#134678)
In InstCombine we may decide that an alloc is removable, and the alloc
fn is called by an InvokeInst, we replace that InvokeInst with a invoke
of a noop intrinsic; this patch has us also copy the original invoke's
DILocation to the new noop invoke.

Found using https://github.com/llvm/llvm-project/pull/107279.
2025-04-08 09:56:26 +01:00
Tim Gymnich
049f179606
[Analysis][NFC] Extract KnownFPClass (#133457)
- extract KnownFPClass for future use inside of GISelKnownBits

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-03-28 18:10:02 +01:00
John McIver
9cab82fde8
[InstCombine] Enable select freeze poison folding when storing value (#129776)
The non-freeze poison argument to select can be one of the following: global,
constant, and noundef arguments.

Alive2 test validation: https://alive2.llvm.org/ce/z/jbtCS6
2025-03-19 21:48:58 -06:00
Matt Arsenault
0f34b656f0
InstCombine: Remove a check for pointer bitcasts (#128491) 2025-02-24 19:52:44 +07:00
Yingwei Zheng
2ebc69a521
[InstCombine] Add support for GEPs in simplifyNonNullOperand (#128365)
Alive2: https://alive2.llvm.org/ce/z/2KE8zG
2025-02-23 17:19:31 +08:00
Yingwei Zheng
126016b662
[InstCombine] Simplify nonnull pointers (#128111)
This patch is the follow-up of
https://github.com/llvm/llvm-project/pull/127979. It introduces a helper
`simplifyNonNullOperand` to avoid duplicate logic. It also addresses the
one-use issue in `visitLoadInst`, as discussed in
https://github.com/llvm/llvm-project/pull/127979#issuecomment-2671013972.
The `nonnull` attribute is also supported. Proof:
https://alive2.llvm.org/ce/z/MCKgT9
2025-02-22 15:30:04 +08:00
Narayan
55be370f37
[InstCombine] Fold frexp of select to select of frexp (#121227)
This patch implements an optimization to push select operations through
frexp when one of the select operands is a constant. When we encounter:
```
define float @src(float %x, i1 %bool) {
  %select = select i1 %bool, float 1.000000e+00, float %x
  %frexp = tail call { float, i32 } @llvm.frexp.f32.i32(float %select)
  %frexp.0 = extractvalue { float, i32 } %frexp, 0
  ret float %frexp.0
}
```
We transform it to:
```
define float @tgt(float %x, i1 %bool) {
  %frexp = tail call { float, i32 } @llvm.frexp.f32.i32(float %x)
  %frexp.0 = extractvalue { float, i32 } %frexp, 0
  %select = select i1 %bool, float 5.000000e-01, float %frexp.0
  ret float %select
}
```

Fixes #92542
2025-01-31 23:17:58 +08:00
Nikita Popov
212f344b84 [InstCombine] Handle constant expression result in tryFactorization()
If IRBuilder folds the result to a constant expression, don't try
to set nowrap flags on it.

Fixes https://github.com/llvm/llvm-project/issues/124526.
2025-01-27 16:25:37 +01:00
Jeremy Morse
8e70273509
[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to moveBefore use iterators.

This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.

We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
2025-01-24 10:53:11 +00:00
Yingwei Zheng
0e13ce770b
[InstCombine] Handle mul in maintainNoSignedWrap (#123299)
Alive2: https://alive2.llvm.org/ce/z/Kgamks
Closes https://github.com/llvm/llvm-project/issues/123175.

For `@foo1`, the nsw flag is propagated because we first convert it into
`mul nsw nuw (shl nsw nuw X, 1), 3`.
2025-01-17 16:59:04 +08:00
Yingwei Zheng
b8337dc4b2
[InstCombine] Handle commuted patterns in foldBinOpShiftWithShift (#122126)
Closes https://github.com/llvm/llvm-project/issues/121775.
2025-01-09 14:36:17 +08:00
Nikita Popov
a8072a0b4e
[InstCombine] Eliminate icmp+zext pairs over phis more aggressively (#121767)
When folding icmp over phi, add a special case for `icmp eq (zext(bool),
0)`, which is known to fold to `!bool` and thus won't increase the
instruction count. This helps convert more phis to i1, esp. in loops.

This is based on existing logic we have to support this for icmp of
ucmp/scmp.
2025-01-07 09:29:36 +01:00
Yingwei Zheng
a4d92400a6
[InstCombine] Fix GEPNoWrapFlags propagation in foldGEPOfPhi (#121572)
Closes https://github.com/llvm/llvm-project/issues/121459.
2025-01-03 23:19:57 +08:00
Alexander Kornienko
23a239267e
Revert "[InstCombine] Infer nuw for gep inbounds from base of object" (#120460)
Reverts llvm/llvm-project#119225 due to the lack of sanitizer support,
large potential of breaking code containing latent UB, non-trivial
localization and investigation, and what seems to be a bad interaction
with msan (a test is in the works).

Related discussions:
https://github.com/llvm/llvm-project/pull/119225#issuecomment-2551904822
https://github.com/llvm/llvm-project/pull/118472#issuecomment-2549986255
2024-12-18 19:06:34 +01:00
Ramkumar Ramachandra
4a0d53a0b0
PatternMatch: migrate to CmpPredicate (#118534)
With the introduction of CmpPredicate in 51a895a (IR: introduce struct
with CmpInst::Predicate and samesign), PatternMatch is one of the first
key pieces of infrastructure that must be updated to match a CmpInst
respecting samesign information. Implement this change to Cmp-matchers.

This is a preparatory step in migrating the codebase over to
CmpPredicate. Since we no functional changes are desired at this stage,
we have chosen not to migrate CmpPredicate::operator==(CmpPredicate)
calls to use CmpPredicate::getMatching(), as that would have visible
impact on tests that are not yet written: instead, we call
CmpPredicate::operator==(Predicate), preserving the old behavior, while
also inserting a few FIXME comments for follow-ups.
2024-12-13 14:18:33 +00:00
Matthias Braun
768754807f
[InstCombine] Optimistically allow multiple shufflevector uses in foldOpPhi (#114278)
We would like to optimize situations of the form that happen after loop
vectorization+SROA:
```
loop:
    %phi = phi zeroinitializer, %interleaved

    %deinterleave_a = shufflevector %phi, poison ; pick half of the lanes
    %deinterleave_b = shufflevector %phi, posion ; pick remaining lanes

    ... %a = ... %b = ...

    %interleaved = shufflevector %a, %b ; interleave lanes of a+b
```
where the interleave and de-interleave shuffle operations cancel each
other out.
This could be handled by `foldOpPhi` but does not currently work because
it does
not proceed when there are multiple uses of the `Phi` operation.

This extends `foldOpPhi` to allow multiple `shufflevector` uses when
they are
shown to simplify for all `Phi` input values.
2024-12-12 17:20:48 -08:00
Nikita Popov
e21ab4d16b
[InstCombine] Infer nuw for gep inbounds from base of object (#119225)
When we have a gep inbounds from the base of an object (e.g. alloca or
global), we know that the index cannot be negative, as this would go out
of bounds. As such, we can infer nuw as well.

The implementation is a bit stricter than necessary, we could also
accept one unknown index followed by known-non-negative indices.

Proof: https://alive2.llvm.org/ce/z/Hp7-6w (Note that alive2 currently
incorrectly doesn't require the inbounds for the alloca case, see
https://github.com/AliveToolkit/alive2/issues/1138).
2024-12-10 10:00:50 +01:00
Nikita Popov
f7685af4a5 [InstCombine] Move gep of phi fold into separate function
This makes sure that an early return during this fold doesn't end
up skipping later gep folds.
2024-12-05 15:20:56 +01:00
Nikita Popov
462cb3cd6c
[InstCombine] Infer nusw + nneg -> nuw for getelementptr (#111144)
If the gep is nusw (usually via inbounds) and the offset is
non-negative, we can infer nuw.

Proof: https://alive2.llvm.org/ce/z/ihztLy
2024-12-05 14:36:40 +01:00
Ramkumar Ramachandra
51a895aded
IR: introduce struct with CmpInst::Predicate and samesign (#116867)
Introduce llvm::CmpPredicate, an abstraction over a floating-point
predicate, and a pack of an integer predicate with samesign information,
in order to ease extending large portions of the codebase that take a
CmpInst::Predicate to respect the samesign flag.

We have chosen to demonstrate the utility of this new abstraction by
migrating parts of ValueTracking, InstructionSimplify, and InstCombine
from CmpInst::Predicate to llvm::CmpPredicate. There should be no
functional changes, as we don't perform any extra optimizations with
samesign in this patch, or use CmpPredicate::getMatching.

The design approach taken by this patch allows for unaudited callers of
APIs that take a llvm::CmpPredicate to silently drop the samesign
information; it does not pose a correctness issue, and allows us to
migrate the codebase piece-wise.
2024-12-03 13:31:04 +00:00
Nikita Popov
9a844a36eb
[InstCombine] Use InstSimplify in FoldOpIntoSelect (#116073)
Instead of only trying to constant fold the select arms, try to simplify
them. This subsumes https://github.com/llvm/llvm-project/pull/115969
which implements this for extractvalue only.

This is still fairly limited in that we will usually only call
FoldOpIntoSelect in the first place if we have a constant operand. This
can be relaxed in the future if worthwhile.
2024-11-18 10:07:31 +01:00
serge-sans-paille
f5e4ffaa49
Revert "[llvm] Use computeConstantRange to improve llvm.objectsize computation (#114673)"
This reverts commit 5f342816efe1854333f2be41a03fdd25fa0db433.

This seems to break various builders, such as

https://lab.llvm.org/buildbot/#/builders/41/builds/3259
https://lab.llvm.org/buildbot/#/builders/76/builds/4298
2024-11-07 13:40:50 +01:00