1069 Commits

Author SHA1 Message Date
Yingwei Zheng
2f1f6b704d
[LLVM] Use std::move for APInt. NFC. (#86257)
This patch adjusts argument passing for `APInt` to improve the
compile-time.
Compile-time improvement:
https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=ba3e326def3a6e5cd6d72ff5a49c74fba18de1df&stat=instructions:u
2024-03-23 14:58:25 +08:00
Andreas Jonson
e66cfebb04
[ValueTracking] Handle range attributes (#85143)
Handle the range attribute in ValueTracking.
2024-03-20 12:43:00 +01:00
Noah Goldstein
5265be11b1 [InstSimply] Simplify (fmul -x, +/-0) -> -/+0
We already handle the `+x` case; the `-x` case was noticed to be missing
in the bug affecting #82555

Proofs: https://alive2.llvm.org/ce/z/WUSvmV

Closes #85345
2024-03-18 15:11:55 -05:00
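A quick numeric check of the fold in 5265be11b1, reading `-x` as a value known to be negative and using Python floats (IEEE-754 doubles) as a stand-in for LLVM floating-point values. The helper names are made up for this sketch, and the exact preconditions the patch requires are not reproduced here:

```python
import math

def sign_of_zero(z: float) -> str:
    """Return '+0' or '-0' for a floating-point zero."""
    assert z == 0.0
    return "-0" if math.copysign(1.0, z) < 0 else "+0"

def check_fmul_negative_by_zero(samples) -> bool:
    """For each negative finite x: x * +0.0 is -0.0 and x * -0.0 is +0.0."""
    for x in samples:
        assert x < 0 and math.isfinite(x)
        if sign_of_zero(x * 0.0) != "-0":
            return False
        if sign_of_zero(x * -0.0) != "+0":
            return False
    return True

# Hypothetical sample set: negative, finite, including a subnormal.
NEGATIVE_SAMPLES = [-1.0, -0.5, -3.75, -1e300, -5e-324]
```

Infinite inputs are excluded on purpose: `-inf * 0.0` is NaN, so the sign-only reasoning applies only when NaN results are ruled out.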
Artem Tyurin
141145232f
[IRBuilder] Fold binary intrinsics (#80743)
Fixes https://github.com/llvm/llvm-project/issues/61240.
2024-03-15 09:58:25 +01:00
Andreas Jonson
a3b52509d5
[InstSimpliy] Use range attribute to simplify comparisons (#84627)
Use the new range attribute from https://github.com/llvm/llvm-project/pull/84617
to simplify comparisons where both sides have range information.
2024-03-12 10:39:37 +01:00
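The idea behind a3b52509d5 can be sketched as interval reasoning: if both operands carry a value range, a comparison folds whenever the ranges alone decide it. The snippet below is an illustration only. It assumes simple non-wrapping half-open unsigned ranges and a hypothetical `fold_ult` helper, whereas LLVM's own range handling is more general:

```python
from typing import Optional, Tuple

Range = Tuple[int, int]  # half-open interval [lo, hi) of unsigned values

def fold_ult(lhs: Range, rhs: Range) -> Optional[bool]:
    """Fold `lhs <u rhs` to a constant if the ranges alone decide it."""
    lhs_lo, lhs_hi = lhs
    rhs_lo, rhs_hi = rhs
    assert lhs_lo < lhs_hi and rhs_lo < rhs_hi, "ranges must be non-empty"
    if lhs_hi - 1 < rhs_lo:
        return True    # largest lhs is below the smallest rhs
    if lhs_lo >= rhs_hi - 1:
        return False   # smallest lhs is at least the largest rhs
    return None        # ranges overlap: nothing can be concluded

def agrees_with_brute_force(lhs: Range, rhs: Range) -> bool:
    """Cross-check fold_ult against every concrete pair of values."""
    outcomes = {x < y for x in range(*lhs) for y in range(*rhs)}
    folded = fold_ult(lhs, rhs)
    return folded is None or outcomes == {folded}
```

When the ranges overlap the helper returns `None`, mirroring the fact that a simplification pass may only fold when the answer is the same for every possible pair of values.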
Andreas Jonson
54bb4be018
[InstSimplify] Handle vec values when simplifying comparisons using range metadata (#84673)
Found that this failed with an assertion when vec was used in this
optimization while working on https://github.com/llvm/llvm-project/pull/84627.
2024-03-10 12:54:37 +01:00
Björn Pettersson
7677453886
[ConstantFolding] Do not consider padded-in-memory types as uniform (#81854)
Teaching ConstantFoldLoadFromUniformValue that types that are padded in
memory can't be considered as uniform.

Using the big hammer to prevent optimizations when loading from a
constant for which DataLayout::typeSizeEqualsStoreSize would return
false.

Main problem solved would be something like this:
  store i17 -1, ptr %p, align 4
  %v = load i8, ptr %p, align 1
If for example the i17 occupies 32 bits in memory, then LLVM IR doesn't
really tell where the padding goes. And even if we assume that the 15
most significant bits are padding, then they should be considered as
undefined (even if LLVM backend typically would pad with zeroes).
Anyway, for a big-endian target the load would read those most
significant bits, which aren't guaranteed to be ones. So it would be
wrong to constant fold the load as returning -1.

If LLVM IR had been more explicit about the placement of padding, then
we could allow the constant fold of the load in the example, but only
for little-endian.

Fixes: https://github.com/llvm/llvm-project/issues/81793
2024-02-15 15:40:21 +01:00
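The i17 example from 7677453886 can be modelled in a few lines. This sketch assumes, as the message does for the sake of argument, that the i17 occupies 32 bits in memory with the 15 most significant bits as padding; LLVM IR itself does not specify that layout, and the function names are hypothetical:

```python
I17_ALL_ONES = (1 << 17) - 1  # bit pattern of `i17 -1`

def store_i17(value: int, padding: int, byteorder: str) -> bytes:
    """Model a 4-byte store of an i17 whose upper 15 bits hold `padding`."""
    assert 0 <= value < (1 << 17) and 0 <= padding < (1 << 15)
    word = (padding << 17) | value
    return word.to_bytes(4, byteorder)

def load_i8_at_offset_zero(memory: bytes) -> int:
    """Model `load i8, ptr %p`: read the byte at the lowest address."""
    return memory[0]

def possible_loads(byteorder: str) -> set:
    """All i8 values the load can observe across every padding pattern."""
    return {
        load_i8_at_offset_zero(store_i17(I17_ALL_ONES, pad, byteorder))
        for pad in range(1 << 15)
    }
```

Under this assumed layout the little-endian load always sees `0xFF`, while the big-endian load reads only padding bits and can observe any byte value, which is why folding the load to -1 would be wrong there.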
Yingwei Zheng
470c5b8011
[InstSimplify][InstCombine] Remove unnecessary m_c_* matchers. (#81712)
This patch removes unnecessary `m_c_*` matchers since we always
canonicalize `commutive_op Cst, X` into `commutive_op X, Cst`.

Compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=bfc0b7c6891896ee8e9818f22800472510093864&to=d27b058bb9acaa43d3cadbf3cd889e8f79e5c634&stat=instructions:u
2024-02-14 16:40:36 +08:00
Yingwei Zheng
dc866ae49e
[ValueTracking] Move the isSignBitCheck helper into ValueTracking. NFC. (#81704)
This patch moves the `isSignBitCheck` helper into ValueTracking to reuse
the logic in ValueTracking/InstSimplify.

Addresses the comment
https://github.com/llvm/llvm-project/pull/80740#discussion_r1488440050.
2024-02-14 15:33:08 +08:00
Danila Malyutin
cb1a9f70ec
[InstSimplify] Add trivial simplifications for gc.relocate intrinsic (#81639)
Fold gc.relocate of undef and null to undef and null respectively.

A similar transform is currently done by instcombine, but there is no
reason not to include it here as well.
2024-02-14 02:16:32 +03:00
Yingwei Zheng
e17dded8d7
[InstSimplify] Generalize simplifyAndOrOfFCmps (#81027)
This patch generalizes `simplifyAndOrOfFCmps` to simplify patterns like:
```
define i1 @src(float %x, float %y) {
  %or.cond.i = fcmp ord float %x, 0.000000e+00
  %cmp.i.i34 = fcmp olt float %x, %y
  %cmp.i2.sink.i = and i1 %or.cond.i, %cmp.i.i34
  ret i1 %cmp.i2.sink.i
}

define i1 @tgt(float %x, float %y) {
  %cmp.i.i34 = fcmp olt float %x, %y
  ret i1 %cmp.i.i34
}
```
Alive2: https://alive2.llvm.org/ce/z/9rydcx

This patch and #80986 will fix the regression introduced by #80941.
See also the IR diff
https://github.com/dtcxzyw/llvm-opt-benchmark/pull/199#discussion_r1480974120.
2024-02-08 15:07:35 +08:00
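The `@src`/`@tgt` pair in e17dded8d7 can be sanity-checked outside Alive2 with ordinary floats. This is a sketch over a small hand-picked sample set, with Python comparisons standing in for `fcmp ord` and `fcmp olt`; it is not a proof:

```python
import math
from itertools import product

def fcmp_ord(a: float, b: float) -> bool:
    """Ordered: neither operand is NaN."""
    return not (math.isnan(a) or math.isnan(b))

def fcmp_olt(a: float, b: float) -> bool:
    """Ordered less-than: Python's < is already False when a NaN is involved."""
    return a < b

def src(x: float, y: float) -> bool:
    return fcmp_ord(x, 0.0) and fcmp_olt(x, y)

def tgt(x: float, y: float) -> bool:
    return fcmp_olt(x, y)

SAMPLES = [float("nan"), float("-inf"), -1.0, -0.0, 0.0, 1.0, float("inf")]

def fold_is_sound_on_samples() -> bool:
    """Compare src and tgt on every pair drawn from SAMPLES."""
    return all(src(x, y) == tgt(x, y) for x, y in product(SAMPLES, repeat=2))
```

The check passes because `olt` can only be true when `x` is not NaN, so the extra `ord` test never changes the result.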
Yingwei Zheng
f37d81f8a3
[PatternMatch] Add a matching helper m_ElementWiseBitCast. NFC. (#80764)
This patch introduces a matching helper `m_ElementWiseBitCast`, which is
used for matching element-wise int <-> fp casts.
The motivation of this patch is to avoid duplicating checks in
https://github.com/llvm/llvm-project/pull/80740 and
https://github.com/llvm/llvm-project/pull/80414.
2024-02-07 21:02:13 +08:00
Yingwei Zheng
930996e9e4
[ValueTracking][NFC] Pass SimplifyQuery to computeKnownFPClass family (#80657)
This patch refactors the interface of the `computeKnownFPClass` family
to pass `SimplifyQuery` directly.
The motivation of this patch is to compute known fpclass with
`DomConditionCache`, which was introduced by
https://github.com/llvm/llvm-project/pull/73662. With
`DomConditionCache`, we can do more optimization with context-sensitive
information.

Example (extracted from
[fmt/format.h](e17bc67547/include/fmt/format.h (L3555-L3566))):
```
define float @test(float %x, i1 %cond) {
  %i32 = bitcast float %x to i32
  %cmp = icmp slt i32 %i32, 0
  br i1 %cmp, label %if.then1, label %if.else

if.then1:
  %fneg = fneg float %x
  br label %if.end

if.else:
  br i1 %cond, label %if.then2, label %if.end

if.then2:
  br label %if.end

if.end:
  %value = phi float [ %fneg, %if.then1 ], [ %x, %if.then2 ], [ %x, %if.else ]
  %ret = call float @llvm.fabs.f32(float %value)
  ret float %ret
}
```
We can prove the signbit of `%value` is always zero. Then the fabs can
be eliminated.
2024-02-06 02:30:12 +08:00
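The control flow of the `@test` example in 930996e9e4 can be mirrored directly to see why the `fabs` is redundant. The sketch below uses single-precision bit patterns via `struct`, made-up helper names, and a NaN-free sample set chosen for illustration:

```python
import math
import struct

def sign_bit_set(x: float) -> bool:
    """Model `icmp slt (bitcast float %x to i32), 0`."""
    (bits,) = struct.unpack("<i", struct.pack("<f", x))
    return bits < 0

def value_reaching_fabs(x: float, cond: bool) -> float:
    """Mirror the phi: negate on if.then1, otherwise forward x unchanged."""
    if sign_bit_set(x):
        return -x   # if.then1
    return x        # if.then2 and if.else both forward %x, whatever cond is

SAMPLES = [float("-inf"), -2.5, -0.0, 0.0, 1.0, float("inf")]

def fabs_is_redundant_on_samples() -> bool:
    """The value feeding fabs never has its sign bit set."""
    for x in SAMPLES:
        for cond in (False, True):
            v = value_reaching_fabs(x, cond)
            if math.copysign(1.0, v) != 1.0 or abs(v) != v:
                return False
    return True
```

Note that `-0.0` takes the `if.then1` path because its sign bit is set, which is exactly the case a plain `x < 0.0` test would miss.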
Yingwei Zheng
50e80e06d1
[ValueTracking] Merge cannotBeOrderedLessThanZeroImpl into computeKnownFPClass (#76360)
This patch merges the logic of `cannotBeOrderedLessThanZeroImpl` into
`computeKnownFPClass` to improve the signbit inference.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2024-01-31 18:26:50 +08:00
Nikita Popov
97e3220d63 [InstSimplify] Consider bitcast as potential cross-lane operation
The bitcast might change the number of vector lanes, in which case
it will be a cross-lane operation.

Fixes https://github.com/llvm/llvm-project/issues/77320.
2024-01-08 15:52:58 +01:00
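To see why a bitcast can be a cross-lane operation, as 97e3220d63 points out, consider reinterpreting two 32-bit lanes as four 16-bit lanes. This sketch assumes a little-endian layout and uses `struct` to model the reinterpretation; the function name is hypothetical:

```python
import struct

def bitcast_2xi32_to_4xi16(lanes):
    """Reinterpret two 32-bit lanes as four 16-bit lanes (little-endian)."""
    assert len(lanes) == 2
    raw = struct.pack("<2I", *lanes)
    return list(struct.unpack("<4H", raw))

# Each input lane feeds two output lanes, so output lane 1 depends on input lane 0.
RESULT = bitcast_2xi32_to_4xi16([0x11112222, 0x33334444])
```

Because output lane `i` is no longer a function of input lane `i` alone, reasoning that treats each lane independently does not carry over across such a bitcast.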
ChipsSpectre
4444a7e89a
[InstSimplify] Simplify the expression (a^c)&(a^~c) to zero and (a^c) | (a^~c) to minus one (#76637)
Changes the InstSimplify pass of the LLVM optimizer so that
(a^c1)&(a^c2) is reduced to zero, and (a^c1)|(a^c2) to minus one, if
c2==~c1.
Alive2: https://alive2.llvm.org/ce/z/xkQiid
Fixes https://github.com/llvm/llvm-project/issues/75692.
2024-01-03 12:01:02 +01:00
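The two identities in 4444a7e89a are easy to confirm exhaustively at a small bit width. A minimal sketch, assuming 8-bit values and hypothetical helper names:

```python
BITS = 8
MASK = (1 << BITS) - 1  # all-ones pattern, i.e. -1 in two's complement

def and_of_xors(a: int, c: int) -> int:
    """(a ^ c) & (a ^ ~c) on BITS-wide values."""
    return (a ^ c) & (a ^ (~c & MASK))

def or_of_xors(a: int, c: int) -> int:
    """(a ^ c) | (a ^ ~c) on BITS-wide values."""
    return (a ^ c) | (a ^ (~c & MASK))

def xor_folds_hold() -> bool:
    """Check both folds for every 8-bit a and c."""
    return all(
        and_of_xors(a, c) == 0 and or_of_xors(a, c) == MASK
        for a in range(1 << BITS)
        for c in range(1 << BITS)
    )
```

Both hold because `a ^ ~c` is the bitwise complement of `a ^ c`, so the two operands never share a set bit and together cover every bit.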
Yingwei Zheng
554feb0058
[InstSimplify] Simplify select cond, undef, val to val if val = poison implies cond = poison (#76465)
This patch folds:
```
select cond, undef, val -> val
select cond, val, undef -> val
```
iff `impliesPoison(val, cond)` returns true.

Example:
```
define i32 @src1(i32 %retval.0.i.i) {
  %cmp.i = icmp sgt i32 %retval.0.i.i, -1
  %spec.select.i = select i1 %cmp.i, i32 %retval.0.i.i, i32 undef
  ret i32 %spec.select.i
}

define i32 @tgt1(i32 %retval.0.i.i) {
  ret i32 %retval.0.i.i
}
```
Alive2: https://alive2.llvm.org/ce/z/okJW3G

Compile-time impact:
http://llvm-compile-time-tracker.com/compare.php?from=38c9390b59c4d2b9181614d6a909887497d3692f&to=e146f51ba278aa3bb6879a9ec651831ac8938e91&stat=instructions%3Au
2023-12-28 23:37:19 +08:00
Yingwei Zheng
8a4266a626
[InstSimplify] Fold u/sdiv exact (mul nsw/nuw X, C), C --> X when C is not a power of 2 (#76445)
Alive2: https://alive2.llvm.org/ce/z/3D9R7d
2023-12-28 17:36:25 +08:00
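The arithmetic behind 8a4266a626 can be checked by brute force: whenever the multiplication does not overflow (the `nuw`/`nsw` condition), dividing by the same constant recovers `X`. The sketch assumes 8-bit values and skips overflowing products, which would be poison in IR. It runs over every nonzero `C`, so it says nothing about why the patch restricts itself to non-powers of 2:

```python
BITS = 8
UMAX = (1 << BITS) - 1
SMIN, SMAX = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1

def udiv_fold_holds() -> bool:
    """udiv (mul nuw X, C), C == X for every non-overflowing unsigned product."""
    for c in range(1, UMAX + 1):
        for x in range(UMAX + 1):
            product = x * c
            if product > UMAX:
                continue  # mul nuw would be poison
            if product // c != x:
                return False
    return True

def sdiv_fold_holds() -> bool:
    """sdiv (mul nsw X, C), C == X for every non-overflowing signed product."""
    for c in range(SMIN, SMAX + 1):
        if c == 0:
            continue
        for x in range(SMIN, SMAX + 1):
            product = x * c
            if not SMIN <= product <= SMAX:
                continue  # mul nsw would be poison
            # product is an exact multiple of c, so floor and truncating
            # division give the same quotient here.
            if product // c != x:
                return False
    return True
```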
Paul Walker
dea16ebd26
[LLVM][IR] Replace ConstantInt's specialisation of getType() with getIntegerType(). (#75217)
The specialisation will not be valid when ConstantInt gains native
support for vector types.

This is largely a mechanical change but with extra attention paid to constant
folding, InstCombineVectorOps.cpp, LoopFlatten.cpp and Verifier.cpp to
remove the need to call `getIntegerType()`.

Co-authored-by: Nikita Popov <github@npopov.com>
2023-12-18 11:58:42 +00:00
Yingwei Zheng
741975df92
[InstCombine][InstSimplify] Pass SimplifyQuery to computeKnownBits directly. NFC. (#74246)
This patch passes `SimplifyQuery` to `computeKnownBits` directly in
`InstSimplify` and `InstCombine`.
As the `DomConditionCache` in #73662 is only used in `InstCombine`, it
is inconvenient to introduce a new argument `DC` to `computeKnownBits`.
2023-12-04 02:26:39 +08:00
Nikita Popov
cd31cf5989 [InstSimplify] Fix or disjoint miscompile with op replacement
Make sure %x does not get folded to "or disjoint %x, %x" without
dropping the flag, as this would be a derefinement.
2023-12-01 11:45:09 +01:00
Nikita Popov
07c18a05e2 [InstSimplify] Fix select bit test miscompile with disjoint
The select condition ensures the disjointness here. The transform
is not valid without dropping the flag, which InstSimplify can't
do.
2023-11-30 16:55:32 +01:00
Nikita Popov
d9e8ae7d2f [ValueTracking] Convert MaskedValueIsZero() to use SimplifyQuery (NFC) 2023-11-29 11:18:42 +01:00
Nikita Popov
9ca9c2cf7e [InstSimplify] Remove redundant gep zero fold (NFC)
We already handle the all zero indices case above, no need to
also handle the special case of a single zero index.
2023-11-20 16:25:48 +01:00
Graham Hunter
4028dd2e93
[InstSimplify] Fold converted urem to 0 if there's no overlapping bits (#71528)
When folding urem instructions we can end up not recognizing that
the output will always be 0 due to Value*s being different, despite
generating the same data (in this case, 2 different calls to vscale).

This patch recognizes the (x << N) & (add (x << M), -1) pattern that
instcombine replaces urem with after the two vscale calls have been
reduced to one via CSE, then replaces with 0 when x is a power of 2
and N >= M.
2023-11-20 10:27:16 +00:00
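The pattern recognized in 4028dd2e93, `(x << N) & (add (x << M), -1)`, can be verified exhaustively for small widths. A sketch assuming 8-bit wrapping arithmetic and shift amounts below the bit width, with a hypothetical helper name:

```python
BITS = 8
MASK = (1 << BITS) - 1

def urem_pattern(x: int, n: int, m: int) -> int:
    """(x << N) & ((x << M) - 1) with BITS-wide wrapping arithmetic."""
    shl_n = (x << n) & MASK
    shl_m = (x << m) & MASK
    return shl_n & ((shl_m - 1) & MASK)

def fold_to_zero_holds() -> bool:
    """The pattern is 0 for every power-of-2 x and every N >= M."""
    powers_of_two = [1 << k for k in range(BITS)]
    return all(
        urem_pattern(x, n, m) == 0
        for x in powers_of_two
        for m in range(BITS)
        for n in range(m, BITS)
    )
```

Both preconditions matter: with `N < M` (for example `x = 1, N = 0, M = 1`) or with a non-power-of-2 `x` (for example `x = 3, N = M = 0`) the result is nonzero.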
Nikita Popov
2310066faa [InstSimplify] Simplify calculation of GEP result pointer type (NFC)
The result type is the same as the input pointer type, except for
splat geps.
2023-11-17 17:14:07 +01:00
Nikita Popov
ebb8ffde94 [InstSimplify] Extract commutative and folds into helper (NFCI)
There are a number of and folds that are repeated for both
operand orders. Move these into a helper that is invoked with
both orders.

This is conceptually NFC, but may not be entirely so, as the order
of folds may change.
2023-11-15 16:31:55 +01:00
annamthomas
98d8b688bd
[InstSimplify] Check call for FMF instead of CtxI (#71585)
This code was incorrectly checking that the CtxI has required FMF, but
the context instruction need not always be the intrinsic call.

Check that the intrinsic call has the required FMF.

Fixes PR71548.
2023-11-08 10:25:11 -05:00
Anna Thomas
f0cdf4b468 [InstCombine] Check FPMathOperator for Ctx before FMF check
We need to check FPMathOperator for Ctx instruction before checking fast
math flag on this Ctx.

Ctx is not always an FPMathOperator, so explicitly check for it.

Fixes #71548.
2023-11-07 10:50:19 -05:00
Nikita Popov
0c6a77baa6 [InstSimplify] Remove redundant simplifyAndOrOfICmpsWithZero() fold (NFCI)
This has been subsumed by simplifyAndOrWithICmpEq().
2023-11-07 14:53:32 +01:00
Nikita Popov
fb01f683af [InstSimplify] Remove redundant simplifyAndOrOfICmpsWithLimitConst() fold (NFCI)
This fold has been subsumed by simplifyAndOrWithICmpEq().
2023-11-07 14:35:03 +01:00
Nikita Popov
060de415af Reapply [InstCombine] Simplify and/or of icmp eq with op replacement (#70335)
Relative to the first attempt, this contains two changes:

First, we only handle the case where one side simplifies to true or
false, instead of calling simplification recursively. The previous
approach would return poison if one operand simplified to poison
(under the equality assumption), which is incorrect.

Second, we do not fold llvm.is.constant in simplifyWithOpReplaced().
We may be assuming that a value is constant, if the equality holds,
but it may not actually be constant. This is nominally just a QoI
issue, but the std::list implementation in libstdc++ relies on the
precise behavior in a way that causes miscompiles.

-----

and/or in logical (select) form benefit from generic simplifications via
simplifyWithOpReplaced(). However, the corresponding fold for plain
and/or currently does not exist.

Similar to selects, there are two general cases for this fold
(illustrated with `and`, but there are `or` conjugates).

The basic case is something like `(a == b) & c`, where the replacement
of a with b or b with a inside c allows it to fold to true or false.
Then the whole operation will fold to either false or `a == b`.

The second case is something like `(a != b) & c`, where the replacement
inside c allows it to fold to false. In that case, the operand can be
replaced with c, because in the case where a == b (and thus the icmp is
false), c itself will already be false.

As the test diffs show, this catches quite a lot of patterns in existing
test coverage. This also obsoletes quite a few existing special-case
and/or of icmp folds we have (e.g. simplifyAndOrOfICmpsWithLimitConst),
but I haven't removed anything as part of this patch in the interest of
risk mitigation.

Fixes #69050.
Fixes #69091.
2023-11-03 10:16:15 +01:00
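The two cases described in 060de415af can be illustrated with concrete choices for `c`. The expressions below are examples picked for this sketch, not patterns taken from the patch or its tests:

```python
from itertools import product

VALUES = range(-4, 5)

def basic_case_true(a: int, b: int) -> bool:
    """(a == b) & (a <= b): replacing a with b turns c into true."""
    return (a == b) and (a <= b)

def basic_case_false(a: int, b: int) -> bool:
    """(a == b) & (a < b): replacing a with b turns c into false."""
    return (a == b) and (a < b)

def second_case(a: int, b: int) -> bool:
    """(a != b) & (a < b): replacing a with b turns c into false."""
    return (a != b) and (a < b)

def folds_hold() -> bool:
    """Compare each expression with its simplified form on all sample pairs."""
    for a, b in product(VALUES, repeat=2):
        if basic_case_true(a, b) != (a == b):   # folds to `a == b`
            return False
        if basic_case_false(a, b) is not False:  # folds to false
            return False
        if second_case(a, b) != (a < b):         # folds to c itself
            return False
    return True
```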
Noah Goldstein
8c2fcf5b77 [InstSimplify] Add some basic simplifications for llvm.ptrmask
Mostly the same as `and`. We also have a check for a useless
`llvm.ptrmask` if the ptr is already known aligned.

Differential Revision: https://reviews.llvm.org/D156633
2023-11-01 23:50:35 -05:00
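The "useless `llvm.ptrmask`" check in 8c2fcf5b77 can be pictured as follows: if the mask can only clear bits that alignment already guarantees to be zero, the intrinsic returns its pointer unchanged. The predicate below is a sketch of that idea on 16-bit addresses with hypothetical names, not the condition the patch actually implements:

```python
BITS = 16
FULL = (1 << BITS) - 1

def ptrmask(addr: int, mask: int) -> int:
    """Model llvm.ptrmask on a BITS-wide address: a plain bitwise and."""
    return addr & mask & FULL

def ptrmask_is_redundant(ptr_log2_align: int, mask: int) -> bool:
    """True if the mask can only clear bits already known to be zero."""
    known_zero = (1 << ptr_log2_align) - 1
    return ((mask | known_zero) & FULL) == FULL

def unchanged_for_all_aligned(ptr_log2_align: int, mask: int) -> bool:
    """Brute-force cross-check over every suitably aligned address."""
    step = 1 << ptr_log2_align
    return all(ptrmask(a, mask) == a for a in range(0, 1 << BITS, step))
```

For instance, masking a 16-byte-aligned pointer down to 8-byte alignment is a no-op, while masking an 8-byte-aligned pointer down to 16-byte alignment is not.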
Nikita Popov
e91812792a [InstSimplify] Avoid ConstantExpr::getIntegerCast() (NFCI)
This always works on a constant integer or integer splat, so the
constant fold here should always succeed.
2023-11-01 11:15:18 +01:00
Nikita Popov
e46dd6fbc0 Revert "[InstCombine] Simplify and/or of icmp eq with op replacement (#70335)"
This reverts commit 1770a2e325192f1665018e21200596da1904a330.

Stage 2 llvm-tblgen crashes when generating X86GenAsmWriter.inc and
other files.
2023-10-30 18:33:03 +01:00
Nikita Popov
1770a2e325
[InstCombine] Simplify and/or of icmp eq with op replacement (#70335)
and/or in logical (select) form benefit from generic simplifications via
simplifyWithOpReplaced(). However, the corresponding fold for plain
and/or currently does not exist.

Similar to selects, there are two general cases for this fold
(illustrated with `and`, but there are `or` conjugates).

The basic case is something like `(a == b) & c`, where the replacement
of a with b or b with a inside c allows it to fold to true or false.
Then the whole operation will fold to either false or `a == b`.

The second case is something like `(a != b) & c`, where the replacement
inside c allows it to fold to false. In that case, the operand can be
replaced with c, because in the case where a == b (and thus the icmp is
false), c itself will already be false.

As the test diffs show, this catches quite a lot of patterns in existing
test coverage. This also obsoletes quite a few existing special-case
and/or of icmp folds we have (e.g. simplifyAndOrOfICmpsWithLimitConst),
but I haven't removed anything as part of this patch in the interest of
risk mitigation.

Fixes #69050.
Fixes #69091.
2023-10-30 10:05:39 +01:00
Pierre van Houtryve
4fc1e7db27
[InstSimplify] Fold (a != 0) ? abs(a) : 0 (#70305)
Solves #70204
2023-10-27 14:52:09 +02:00
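The fold in 4fc1e7db27 rests on `abs(0) == 0`, so the select never changes the result. A brute-force sketch over 8-bit signed values, assuming a wrapping `abs` (so `abs(-128)` stays `-128`) and made-up helper names:

```python
def abs_i8(a: int) -> int:
    """Two's-complement abs on i8; abs(-128) wraps back to -128."""
    r = -a if a < 0 else a
    return ((r + 128) % 256) - 128

def select_form(a: int) -> int:
    """(a != 0) ? abs(a) : 0"""
    return abs_i8(a) if a != 0 else 0

def select_fold_holds() -> bool:
    """The select form equals plain abs(a) for every i8 value."""
    return all(select_form(a) == abs_i8(a) for a in range(-128, 128))
```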
Nikita Popov
4638c29c3d [InstSimplify] Remove redundant pointer icmp fold (NFCI)
This fold is already performed as part of simplifyICmpWithZero().
2023-10-26 14:30:33 +02:00
Craig Topper
a5686c2b55
[InstSimplify] Avoid use of ConstantExpr::getICmp. NFC (#67873) 2023-09-30 13:01:29 -07:00
Craig Topper
abcaebfe3a [InstSimplify] Use cast instead of dyn_cast+assert. NFC 2023-09-29 22:08:53 -07:00
Nikita Popov
b35f2940e9 [InstSimplify] Avoid use of ConstantExpr::getCast()
Use the constant folding API instead.

One of these uses actually improves results, because the bitcast
expression gets folded away.
2023-09-29 10:23:40 +02:00
Nikita Popov
a09e32e5fe [InstSimplify] Respect UseInstrInfo in more folds
Some folds using m_NUW, m_NSW style matchers were missed, make
sure they respect UseInstrInfo.

This is part of #53218, but not a complete fix for the issue.
2023-09-26 13:54:03 +02:00
Yingwei Zheng
0d821b22e0
[InstSimplify] Generalize fold for icmp ugt/ule (pow2 << X), signmask
Alive2: https://alive2.llvm.org/ce/z/wZ41t7
2023-09-25 00:07:20 +08:00
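The reasoning behind 0d821b22e0 is that a power of 2 shifted left has at most one bit set, so it can never exceed the sign mask as an unsigned value. A sketch at 8 bits, with hypothetical helper names and shift amounts kept below the bit width:

```python
BITS = 8
MASK = (1 << BITS) - 1
SIGNMASK = 1 << (BITS - 1)

def shl(value: int, amount: int) -> int:
    """BITS-wide left shift that discards bits shifted out."""
    return (value << amount) & MASK

def ugt_signmask_is_always_false() -> bool:
    """icmp ugt (pow2 << X), signmask never holds (so ule always does)."""
    powers_of_two = [1 << k for k in range(BITS)]
    return all(
        not (shl(p, x) > SIGNMASK)
        for p in powers_of_two
        for x in range(BITS)
    )
```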
Nikita Popov
c41b4b6397 [InstCombine] Make flag drop during select equiv fold more generic
Instead of unsetting flags on the instruction, attempting the
fold, and then resetting the flags if it failed, add support to
simplifyWithOpReplaced() to ignore poison-generating flags/metadata
and collect all instructions where they may need to be dropped.

This allows us to perform the fold a) with poison-generating
metadata, which was previously not handled and b) with poison-generating
flags/metadata that are not on the root instruction.

Proof for the ctpop case: https://alive2.llvm.org/ce/z/3H3HFs

Fixes https://github.com/llvm/llvm-project/issues/62450.
2023-09-19 14:54:25 +02:00
Yingwei Zheng
be2723da5c
[InstSimplify] Fold icmp of X and/or C1 and X and/or C2 into constant (#65905)
This patch simplifies the pattern `icmp X and/or C1, X and/or C2` when
one constant mask is a subset of the other.
If `C1 & C2 == C1`, `A = X and/or C1`, `B = X and/or C2`, we can do the
following folds:
`icmp ule A, B -> true`
`icmp ugt A, B -> false`
We can apply similar folds for signed predicates when `C1` and `C2` have
the same sign:
`icmp sle A, B -> true`
`icmp sgt A, B -> false`

Alive2: https://alive2.llvm.org/ce/z/Q4ekP5
Fixes #65833.
2023-09-18 21:32:48 +08:00
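The subset-mask folds in be2723da5c can be confirmed exhaustively at a small width. This sketch uses 5-bit values to keep the triple loop cheap; the helper names are made up for illustration:

```python
from itertools import product

BITS = 5
SIGN = 1 << (BITS - 1)

def to_signed(v: int) -> int:
    """Interpret a BITS-wide pattern as a two's-complement integer."""
    return v - (1 << BITS) if v & SIGN else v

def unsigned_folds_hold() -> bool:
    """If C1 & C2 == C1, then A ule B for both the `and` and the `or` form."""
    for c1, c2, x in product(range(1 << BITS), repeat=3):
        if c1 & c2 != c1:
            continue
        if not ((x & c1) <= (x & c2) and (x | c1) <= (x | c2)):
            return False
    return True

def signed_folds_hold() -> bool:
    """Same folds with sle, additionally requiring C1 and C2 to share a sign."""
    for c1, c2, x in product(range(1 << BITS), repeat=3):
        if c1 & c2 != c1 or (c1 & SIGN) != (c2 & SIGN):
            continue
        if to_signed(x & c1) > to_signed(x & c2):
            return False
        if to_signed(x | c1) > to_signed(x | c2):
            return False
    return True
```

The same-sign requirement is not optional: with `C1 = 0`, `C2 = -1` and `X = -1`, `X & C1` is 0 while `X & C2` is -1, so `sle` fails.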
Paul Walker
c7d65e4466 [IR] Enable load/store/alloca for arrays of scalable vectors.
Differential Revision: https://reviews.llvm.org/D158517
2023-09-14 13:49:01 +00:00
Matt Arsenault
00061843bd InstSimplify: Simplifications for ldexp
Ported from old amdgcn intrinsic which will soon be deleted.

https://reviews.llvm.org/D149587
2023-09-13 08:38:48 +03:00
Matt Arsenault
6f2e943de6 InstSimplify: Handle folding fcmp with literal nans without a context instruction
Fixes reported assert after ddb3f12c428bc4bd5a98913d74dfd7f2402bdfd8
2023-09-02 10:22:09 -04:00
Matt Arsenault
5dcd6669ff InstSimplify: Handle exp10(log10(x)) -> x
Copy from exp/exp2 case.

https://reviews.llvm.org/D157894
2023-09-02 09:21:47 -04:00
Matt Arsenault
da077a52c4 InstSimplify: Handle log10(exp10(x))
Copied from the exp/exp2 cases

https://reviews.llvm.org/D157894
2023-09-02 08:57:54 -04:00