1421 Commits

Author SHA1 Message Date
Noah Goldstein
81cdd35c0c [ValueTracking] Add support for xor/disjoint or in isKnownNonZero
Handles cases like `X ^ Y == X` / `X disjoint| Y == X`.

Both of these cases have identical logic to the existing `add` case,
so just converting the `add` code to a more general helper.

Proofs: https://alive2.llvm.org/ce/z/Htm7pe

Closes #87706
2024-04-10 13:13:43 -05:00
Noah Goldstein
2646790155 [ValueTracking] Add tests for xor/disjoint or in isKnownNonZero; NFC 2024-04-10 13:13:43 -05:00
Noah Goldstein
0c57a2e4b4 [ValueTracking] Add support for xor/disjoint or in getInvertibleOperands
This strengthens our `isKnownNonEqual` logic with some fairly
trivial cases.

Proofs: https://alive2.llvm.org/ce/z/4pxRTj

Closes #87705
2024-04-10 13:13:43 -05:00
Noah Goldstein
195d278d50 [ValueTracking] Add tests for xor/disjoint or in getInvertibleOperands; NFC 2024-04-10 13:13:43 -05:00
Noah Goldstein
9c545a14c0 [ValueTracking] Add support for insertelement in isKnownNonZero
Inserts don't modify the data, so if all elements that end up in the
destination are non-zero the result is non-zero.

Closes #87703
2024-04-10 13:13:43 -05:00
Noah Goldstein
8a28b9b8ec [ValueTracking] Add tests for insertelement in isKnownNonZero; NFC 2024-04-10 13:13:43 -05:00
Noah Goldstein
87528bfefb [ValueTracking] Add support for shufflevector in isKnownNonZero
Shuffles don't modify the data, so if all elements that end up in the
destination are non-zero the result is non-zero.

Closes #87702
2024-04-10 13:13:42 -05:00
Noah Goldstein
c1d3f39ae9 [ValueTracking] Add tests for shufflevector in isKnownNonZero 2024-04-10 13:13:42 -05:00
Noah Goldstein
f1ee458ddb [ValueTracking] improve isKnownNonZero precision for smax
Instead of relying on known-bits for strictly positive, use the
`isKnownPositive` API. This will use `isKnownNonZero` which is more
accurate.

Closes #88170
2024-04-10 10:40:49 -05:00
Noah Goldstein
2ff82c2c64 [ValueTracking] Add tests for improving isKnownNonZero of smax; NFC 2024-04-10 10:40:49 -05:00
Noah Goldstein
678f32ab66 [ValueTracking] Add more conditions in to isTruePredicate
There is one notable "regression". This patch replaces the bespoke `or
disjoint` logic we a direct match. This means we fail some
simplification during `instsimplify`.
All the cases we fail in `instsimplify` we do handle in `instcombine`
as we add `disjoint` flags.

Other than that, just some basic cases.

See proofs: https://alive2.llvm.org/ce/z/_-g7C8

Closes #86083
2024-04-04 12:42:58 -05:00
Noah Goldstein
74447cf46f [ValueTracking] Add tests for deducing more conditions in isTruePredicate; NFC 2024-04-04 12:42:58 -05:00
Andreas Jonson
e66cfebb04
[ValueTracking] Handle range attributes (#85143)
Handle the range attribute in ValueTracking.
2024-03-20 12:43:00 +01:00
Nikita Popov
0f46e31cfb
[IR] Change representation of getelementptr inrange (#84341)
As part of the migration to ptradd
(https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699),
we need to change the representation of the `inrange` attribute, which
is used for vtable splitting.

Currently, inrange is specified as follows:

```
getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, inrange i32 1, i64 2)
```

The `inrange` is placed on a GEP index, and all accesses must be "in
range" of that index. The new representation is as follows:

```
getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, i32 1, i64 2)
```

This specifies which offsets are "in range" of the GEP result. The new
representation will continue working when canonicalizing to ptradd
representation:

```
getelementptr inbounds inrange(-16, 16) (i8, ptr @vt, i64 48)
```

The inrange offsets are relative to the return value of the GEP. An
alternative design could make them relative to the source pointer
instead. The result-relative format was chosen on the off-chance that we
want to extend support to non-constant GEPs in the future, in which case
this variant is more expressive.

This implementation "upgrades" the old inrange representation in bitcode
by simply dropping it. This is a very niche feature, and I don't think
trying to upgrade it is worthwhile. Let me know if you disagree.
2024-03-20 10:59:45 +01:00
Nikita Popov
00ca80938b [ConstantFold] Fix comparison between special pointer constants
This code was assuming that the LHS would always be one of
GlobalVariable, BlockAddress or ConstantExpr. However, it can
also be a special constant like dso_local_equivalent or no_cfi.
Make sure this is handled gracefully.
2024-03-19 12:24:11 +01:00
Noah Goldstein
5265be11b1 [InstSimply] Simplify (fmul -x, +/-0) -> -/+0
We already handle the `+x` case, and noticed it was missing in the bug
affecting #82555

Proofs: https://alive2.llvm.org/ce/z/WUSvmV

Closes #85345
2024-03-18 15:11:55 -05:00
Noah Goldstein
6984ba7b94 [InstSimply] Add tests for simplify (fmul -x, +/-0); NFC 2024-03-18 15:11:55 -05:00
Matt Arsenault
6cfd3439d4
APFloat: Fix signed zero handling in minnum/maxnum (#83376)
Follow the 2019 rules and order -0 as less than +0 and +0 as greater
than -0. As currently defined this isn't required for the intrinsics,
but is a better QoI.

This will avoid the workaround in libc added by #83158
2024-02-29 16:51:33 +05:30
Paul Walker
6a17929e9f [LLVM][tests/Transforms/InstSimplify] Convert instances of ConstantExpr based splats to use splat().
This is mostly NFC but some output does change due to consistently
inserting into poison rather than undef and using i64 as the index
type for inserts.
2024-02-27 13:37:23 +00:00
Björn Pettersson
7677453886
[ConstantFolding] Do not consider padded-in-memory types as uniform (#81854)
Teaching ConstantFoldLoadFromUniformValue that types that are padded in
memory can't be considered as uniform.

Using the big hammer to prevent optimizations when loading from a
constant for which DataLayout::typeSizeEqualsStoreSize would return
false.

Main problem solved would be something like this:
  store i17 -1, ptr %p, align 4
  %v = load i8, ptr %p, align 1
If for example the i17 occupies 32 bits in memory, then LLVM IR doesn't
really tell where the padding goes. And even if we assume that the 15
most significant bits are padding, then they should be considered as
undefined (even if LLVM backend typically would pad with zeroes).
Anyway, for a big-endian target the load would read those most
significant bits, which aren't guaranteed to be one's. So it would be
wrong to constant fold the load as returning -1.

If LLVM IR had been more explicit about the placement of padding, then
we could allow the constant fold of the load in the example, but only
for little-endian.

Fixes: https://github.com/llvm/llvm-project/issues/81793
2024-02-15 15:40:21 +01:00
Yingwei Zheng
470c5b8011
[InstSimplify][InstCombine] Remove unnecessary m_c_* matchers. (#81712)
This patch removes unnecessary `m_c_*` matchers since we always
canonicalize `commutive_op Cst, X` into `commutive_op X, Cst`.

Compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=bfc0b7c6891896ee8e9818f22800472510093864&to=d27b058bb9acaa43d3cadbf3cd889e8f79e5c634&stat=instructions:u
2024-02-14 16:40:36 +08:00
Danila Malyutin
cb1a9f70ec
[InstSimplify] Add trivial simplifications for gc.relocate intrinsic (#81639)
Fold gc.relocate of undef and null to undef and null respectively.

Similar transform is currently done by instcombine, but there is no
reason to not include it here as well.
2024-02-14 02:16:32 +03:00
Yingwei Zheng
e17dded8d7
[InstSimplify] Generalize simplifyAndOrOfFCmps (#81027)
This patch generalizes `simplifyAndOrOfFCmps` to simplify patterns like:
```
define i1 @src(float %x, float %y) {
  %or.cond.i = fcmp ord float %x, 0.000000e+00
  %cmp.i.i34 = fcmp olt float %x, %y
  %cmp.i2.sink.i = and i1 %or.cond.i, %cmp.i.i34
  ret i1 %cmp.i2.sink.i
}

define i1 @tgt(float %x, float %y) {
  %cmp.i.i34 = fcmp olt float %x, %y
  ret i1 %cmp.i.i34
}
```
Alive2: https://alive2.llvm.org/ce/z/9rydcx

This patch and #80986 will fix the regression introduced by #80941.
See also the IR diff
https://github.com/dtcxzyw/llvm-opt-benchmark/pull/199#discussion_r1480974120.
2024-02-08 15:07:35 +08:00
Yingwei Zheng
f37d81f8a3
[PatternMatch] Add a matching helper m_ElementWiseBitCast. NFC. (#80764)
This patch introduces a matching helper `m_ElementWiseBitCast`, which is
used for matching element-wise int <-> fp casts.
The motivation of this patch is to avoid duplicating checks in
https://github.com/llvm/llvm-project/pull/80740 and
https://github.com/llvm/llvm-project/pull/80414.
2024-02-07 21:02:13 +08:00
Yingwei Zheng
50e80e06d1
[ValueTracking] Merge cannotBeOrderedLessThanZeroImpl into computeKnownFPClass (#76360)
This patch merges the logic of `cannotBeOrderedLessThanZeroImpl` into
`computeKnownFPClass` to improve the signbit inference.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2024-01-31 18:26:50 +08:00
Matt Arsenault
a46422a776 Reapply "ValueTracking: Identify implied fp classes by general fcmp (#66505)"
This reverts commit 0d0c2298552222b049fa3b8db5efef4b161e51e9.

Includes a bug fix for fcmp one handling, as well as for positive constants.
2024-01-25 13:38:23 +05:30
Nikita Popov
49e3e75143 [ConstantFold] Clean up binop identity folding
Resolve the two FIXMEs: Perform the binop identitiy fold with
AllowRHSConstant, and remove redundant folds later in the code.
2024-01-18 10:37:48 +01:00
Nikita Popov
97e3220d63 [InstSimplify] Consider bitcast as potential cross-lane operation
The bitcast might change the number of vector lanes, in which case
it will be a cross-lane operation.

Fixes https://github.com/llvm/llvm-project/issues/77320.
2024-01-08 15:52:58 +01:00
Nikita Popov
ade7ae4760 [InstSimplify] Add test for #77320 (NFC) 2024-01-08 15:52:58 +01:00
Yingwei Zheng
554feb0058
[InstSimplify] Simplify select cond, undef, val to val if val = poison implies cond = poison (#76465)
This patch folds:
```
select cond, undef, val -> val
select cond, val, undef -> val
```
iff `impliesPoison(val, cond)` returns true.

Example:
```
define i32 @src1(i32 %retval.0.i.i) {
  %cmp.i = icmp sgt i32 %retval.0.i.i, -1
  %spec.select.i = select i1 %cmp.i, i32 %retval.0.i.i, i32 undef
  ret i32 %spec.select.i
}

define i32 @tgt1(i32 %retval.0.i.i) {
  ret i32 %retval.0.i.i
}
```
Alive2: https://alive2.llvm.org/ce/z/okJW3G

Compile-time impact:
http://llvm-compile-time-tracker.com/compare.php?from=38c9390b59c4d2b9181614d6a909887497d3692f&to=e146f51ba278aa3bb6879a9ec651831ac8938e91&stat=instructions%3Au
2023-12-28 23:37:19 +08:00
Yingwei Zheng
8a4266a626
[InstSimplify] Fold u/sdiv exact (mul nsw/nuw X, C), C --> X when C is not a power of 2 (#76445)
Alive2: https://alive2.llvm.org/ce/z/3D9R7d
2023-12-28 17:36:25 +08:00
Yingwei Zheng
1fea712cd1
[ValueTracking] Infer X u<= X +nuw Y for any Y (#75524)
Alive2: https://alive2.llvm.org/ce/z/kiGxCf
Fixes #70374.
2023-12-15 16:33:39 +08:00
Nikita Popov
7686d49517 [ValueTracking] Handle returned attribute with mismatched type
The returned attribute can be used when it is possible to
"losslessly bitcast" between the argument and return type,
including between two vector types.

computeKnownBits() would crash in this case, isKnownNonZero()
would potentially produce a miscompile.

Fixes https://github.com/llvm/llvm-project/issues/74722.
2023-12-08 17:05:13 +01:00
Mikhail Goncharov
0d0c229855 Revert "Reapply "ValueTracking: Identify implied fp classes by general fcmp (#66505)""
This reverts commit d55692d60d218f402ce107520daabed15f2d9ef6.

See discussion in #66505: assertion fires in OSS build of TensorFlow.
2023-12-05 11:10:24 +01:00
Nikita Popov
460faa0c87 [InstSimplify] Check common operand with constant earlier
If both icmps have the same operands and the RHS is constant, we
would currently go into the isImpliedCondMatchingOperands() code
path, instead of the isImpliedCondCommonOperandWithConstants()
path. Both are correct, but the latter can produce more accurate
results if the implication is dependent on the sign.
2023-12-01 12:18:59 +01:00
Nikita Popov
89b0044ca9 [InstSimplify] Add test for implied cond with equal ops and constant (NFC) 2023-12-01 12:18:27 +01:00
Nikita Popov
cd31cf5989 [InstSimplify] Fix or disjoint miscompile with op replacement
Make sure %x does not get folded to "or disjoint %x, %x" without
dropping the flag, as this would be a derefinement.
2023-12-01 11:45:09 +01:00
Nikita Popov
5a1020bb00 [InstSimplify] Add test for disjoint or miscompile (NFC)
The absorption case is already handled correctly, but the
idempentence case is not.
2023-12-01 11:45:09 +01:00
Matt Arsenault
d55692d60d Reapply "ValueTracking: Identify implied fp classes by general fcmp (#66505)"
This reverts commit 96a0d714d58e48c363ee6abbbcdfd7a6ce646ac1.

Avoid assert with dynamic denormal-fp-math We don't recognize compares
with 0 as an exact class test if we don't know the denormal mode. We could
try to do better here, but it's probably not worth it.

Fixes asserts reported after 1adce7d8e47e2438f99f91607760b825e5e3cc37
2023-12-01 17:51:46 +09:00
Nikita Popov
ea602cb806 [IR] Support or disjoint in hasPoisonGeneratingFlags()
This fixed incorrect removal of freeze instructions.
2023-11-30 17:26:23 +01:00
Nikita Popov
ca5a01d8e4 [InstSimplify] Add test for incorrect freeze of or disjoint (NFC) 2023-11-30 17:26:23 +01:00
Nikita Popov
07c18a05e2 [InstSimplify] Fix select bit test miscompile with disjoint
The select condition ensures the disjointness here. The transform
is not valid without dropping the flag, which InstSimplify can't
do.
2023-11-30 16:55:32 +01:00
Nikita Popov
c89553ae82 [InstSimplify] Add test for or disjoint miscompile (NFC) 2023-11-30 16:55:32 +01:00
Graham Hunter
4028dd2e93
[InstSimplify] Fold converted urem to 0 if there's no overlapping bits (#71528)
When folding urem instructions we can end up not recognizing that
the output will always be 0 due to Value*s being different, despite
generating the same data (in this case, 2 different calls to vscale).

This patch recognizes the (x << N) & (add (x << M), -1) pattern that
instcombine replaces urem with after the two vscale calls have been
reduced to one via CSE, then replaces with 0 when x is a power of 2
and N >= M.
2023-11-20 10:27:16 +00:00
Nikita Popov
56c1d30183
[IR] Remove support for lshr/ashr constant expressions (#71955)
Remove support for the lshr and ashr constant expressions. All places
creating them have been removed beforehand, so this just removes the
APIs and uses of these constant expressions in tests.

This is part of
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.
2023-11-14 09:25:14 +01:00
Matt Arsenault
0e1a52f556
ValueTracking: Handle compare gt to -inf in class identification (#72086)
This apparently shows up somewhere in chromium. We also are missing a
canonicalization to an equality compare with inf.
2023-11-14 10:05:38 +09:00
Matt Arsenault
d4912e8050 ValueTracking: Add some tests to cover asserts in fcmpImpliesClass
Catch asserts hit after 1adce7d8e47e2438f99f91607760b825e5e3cc37
2023-11-13 15:05:52 +09:00
Hans Wennborg
96a0d714d5 Revert "ValueTracking: Identify implied fp classes by general fcmp (#66505)"
This causes asserts to fire:

  llvm/lib/Analysis/ValueTracking.cpp:4262:
  std::tuple<Value *, FPClassTest, FPClassTest> llvm::fcmpImpliesClass(CmpInst::Predicate, const Function &, Value *, const APFloat *, bool):
  Assertion `(RHSClass == fcPosNormal || RHSClass == fcNegNormal || RHSClass == fcPosSubnormal || RHSClass == fcNegSubnormal) && "should have been recognized as an exact class test"' failed.

See comments on the PR.

> Previously we could recognize exact class tests performed by
> an fcmp with special values (0s, infs and smallest normal).
> Expand this to recognize the implied classes by a compare with a general
> constant. e.g. fcmp ogt x, 1 implies positive and non-0.
>
> The API should be better merged with fcmpToClassTest but that
> made the diff way bigger, will try to do that in a future
> patch.

This reverts commit dc3faf0ed0e3f1ea9e435a006167d9649f865da1.
2023-11-10 14:45:52 +01:00
Matt Arsenault
dc3faf0ed0
ValueTracking: Identify implied fp classes by general fcmp (#66505)
Previously we could recognize exact class tests performed by
an fcmp with special values (0s, infs and smallest normal).
Expand this to recognize the implied classes by a compare with a general
constant. e.g. fcmp ogt x, 1 implies positive and non-0.
    
The API should be better merged with fcmpToClassTest but that
made the diff way bigger, will try to do that in a future
patch.
2023-11-10 11:39:19 +09:00
Graham Hunter
34f83e86b4 [InstSimplify] Precommit extra tests for PR71528 2023-11-08 17:02:10 +00:00