1109 Commits

Author SHA1 Message Date
Nikita Popov
8201926ec0
[InstSimplify] Generalize simplification of icmps with monotonic operands (#69471)
InstSimplify currently folds patterns like `(x | y) uge x` and `(x & y)
ule x` to true. However, it cannot handle combinations of such
situations, such as `(x | y) uge (x & z)` etc.

To support this, recursively collect operands of monotonic instructions
(that preserve either a greater-or-equal or less-or-equal relationship)
and then check whether any of them match.

Fixes https://github.com/llvm/llvm-project/issues/69333.
2024-12-02 09:53:10 +01:00
Yingwei Zheng
42ed775783
[InstSimplify] Generalize simplifyAndOrOfFCmps to handle fabs (#116590)
This patch generalizes https://github.com/llvm/llvm-project/issues/81027
to handle pattern `and/or (fcmp ord/uno X, 0), (fcmp pred fabs(X), Y)`.
Alive2: https://alive2.llvm.org/ce/z/tsgUrz
The correctness is straightforward because `fcmp ord/uno X, 0.0` is
equivalent to `fcmp ord/uno fabs(X), 0.0`. We may generalize it to
handle fneg as well.

Address comment
https://github.com/llvm/llvm-project/pull/116065#pullrequestreview-2434796846
2024-11-19 20:10:40 +08:00
Ramkumar Ramachandra
94eebf721a
InstSimplify: support floating-point equivalences (#115152)
Since cd16b07 (IR: introduce CmpInst::isEquivalence), there is now an
isEquivalence routine in CmpInst that we can use to determine
equivalence in simplifySelectWithICmpEq. Implement this, extending the
code from integer-equalities to integer and floating-point equivalences.
2024-11-15 20:06:11 +00:00
Nikita Popov
dd9f1a572b
[InstSimplify] Correctly handle comparison with zero-size allocs (#115728)
InstSimplify currently folds alloc1 == alloc2 to false, even if one of
them is a zero-size allocation. A zero-size allocation may have the same
address as another allocation.

This also disables the fold for the case where we're comparing a
zero-size alloc with the middle of another allocation. It's possible
that this case is legal to fold depending on our precise zero-size
allocation semantics, but LangRef currently doesn't specify this either
way, so we shouldn't make assumptions here.
2024-11-14 11:55:19 +01:00
Nikita Popov
dd116369f6
[InstSimplify] Fix incorrect poison propagation when folding phi (#96631)
We can only replace phi(X, undef) with X, if X is known not to be
poison. Otherwise, the result may be more poisonous on the undef branch.

Fixes https://github.com/llvm/llvm-project/issues/68683.
2024-11-07 14:09:45 +01:00
Yingwei Zheng
a77dedcacb
[InstSimplify][InstCombine][ConstantFold] Move vector div/rem by zero fold to InstCombine (#114280)
Previously we fold `div/rem X, C` into `poison` if any element of the
constant divisor `C` is zero or undef. However, it is incorrect when
threading udiv over an vector select:
https://alive2.llvm.org/ce/z/3Ninx5
```
define <2 x i32> @vec_select_udiv_poison(<2 x i1> %x) {
  %sel = select <2 x i1> %x, <2 x i32> <i32 -1, i32 -1>, <2 x i32> <i32 0, i32 1>
  %div = udiv <2 x i32> <i32 42, i32 -7>, %sel
  ret <2 x i32> %div
}
```
In this case, `threadBinOpOverSelect` folds `udiv <i32 42, i32 -7>, <i32
-1, i32 -1>` and `udiv <i32 42, i32 -7>, <i32 0, i32 1>` into
`zeroinitializer` and `poison`, respectively. One solution is to
introduce a new flag indicating that we are threading over a vector
select. But it requires to modify both `InstSimplify` and
`ConstantFold`.

However, this optimization doesn't provide benefits to real-world
programs:

https://dtcxzyw.github.io/llvm-opt-benchmark/coverage/data/zyw/opt-ci/actions-runner/_work/llvm-opt-benchmark/llvm-opt-benchmark/llvm/llvm-project/llvm/lib/IR/ConstantFold.cpp.html#L908

https://dtcxzyw.github.io/llvm-opt-benchmark/coverage/data/zyw/opt-ci/actions-runner/_work/llvm-opt-benchmark/llvm-opt-benchmark/llvm/llvm-project/llvm/lib/Analysis/InstructionSimplify.cpp.html#L1107

This patch moves the fold into InstCombine to avoid breaking numerous
existing tests.

Fixes #114191 and #113866 (only poison-safety issue).
2024-11-01 22:56:22 +08:00
Ramkumar Ramachandra
c5f82f7893
ValueTracking: introduce llvm::isNotCrossLaneOperation (#112011)
Factor out and unify common code from InstSimplify and InstCombine that
partially guard against cross-lane vector operations into
llvm::isNotCrossLaneOperation in ValueTracking.

Alive2 proofs for changed tests: https://alive2.llvm.org/ce/z/68H4ka
2024-10-14 11:37:30 +01:00
Nikita Popov
b8d1bae648
[CmpInstAnalysis] Return decomposed bit test as struct (NFC) (#109819)
decomposeBitTestICmp() currently returns the result via two out
parameters plus an in-place modification of Pred. This changes it to
return an optional struct instead.

The motivation here is twofold. First, I'd like to extend this code to
handle cases where the comparison is against a value other than zero,
which would mean yet another out parameter. Second, while doing that I
was badly bitten by the in-place modification, so I'd like to get rid of
it.
2024-09-25 10:14:15 +02:00
Yingwei Zheng
ff80e1ffe7
[InstSimplify] Simplify uadd.sat(X, Y) u>= X + Y and usub.sat(X, Y) u<= X, Y (#104698)
These patterns are found in harfbuzz/typst.

Alive2: https://alive2.llvm.org/ce/z/cxyjYV
2024-08-18 20:55:05 +08:00
Daniil Fukalov
0da2ba811a
[NFC] Cleanup in ADT and Analysis headers. (#104484)
Remove unused directly includes and forward declarations in ADT and
Analysis headers.
2024-08-17 13:11:18 +02:00
Benjamin Kramer
c31ac81091
[InstSimplify] Fold (insertelement Splat(C), C, X) -> Splat(C) (#102315)
The index doesn't matter here.
2024-08-07 16:39:08 +02:00
Yingwei Zheng
62e9f40949
[PatternMatch] Use m_SpecificCmp matchers. NFC. (#100878)
Compile-time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=13996378d81c8fa9a364aeaafd7382abbc1db83a&to=861ffa4ec5f7bde5a194a7715593a1b5359eb581&stat=instructions:u
baseline: 803eaf29267c6aae9162d1a83a4a2ae508b440d3
```
Top 5 improvements:
  stockfish/movegen.ll 2541620819 2538599412 -0.12%
  minetest/profiler.cpp.ll 431724935 431246500 -0.11%
  abc/luckySwap.c.ll 581173720 580581935 -0.10%
  abc/kitTruth.c.ll 2521936288 2519445570 -0.10%
  abc/extraUtilTruth.c.ll 1216674614 1215495502 -0.10%
Top 5 regressions:
  openssl/libcrypto-shlib-sm4.ll 1155054721 1155943201 +0.08%
  openssl/libcrypto-lib-sm4.ll 1155054838 1155943063 +0.08%
  spike/vsm4r_vv.ll 1296430080 1297039258 +0.05%
  spike/vsm4r_vs.ll 1312496906 1313093460 +0.05%
  nuttx/lib_rand48.c.ll 126201233 126246692 +0.04%
Overall: -0.02112308%
```
2024-07-29 10:04:06 +08:00
Nikita Popov
de29b850f0
[InstSimplify] Fix simplifyAndOrWithICmpEq with undef refinement (#98898)
The final case in Simplify (where Res == Absorber and the predicate is
inverted) is not generally safe when the simplification is a refinement.
In particular, we may simplify assuming a specific value for undef, but
then chose a different one later.

However, it *is* safe to refine poison in this context, unlike in the
equivalent select folds. This is the reason why this fold did not use
AllowRefinement=false in the first place, and using that option would
introduce a lot of test regressions.

This patch takes the middle path of disabling undef refinements in
particular using the getWithoutUndef() SimplifyQuery option. However,
this option doesn't actually work in this case, because the problematic
fold is inside constant folding, and we currently don't propagate this
option all the way from InstSimplify over ConstantFolding to
ConstantFold. Work around this by explicitly checking for undef operands
in simplifyWithOpReplaced().

Finally, make sure that places where AllowRefinement=false also use
Q.getWithoutUndef(). I don't have a specific test case for this (the
original one does not work because we don't simplify selects with
constant condition in this mode in the first place) but this seems like
the correct thing to do to be conservative.

Fixes https://github.com/llvm/llvm-project/issues/98753.
2024-07-16 11:40:04 +02:00
Nikita Popov
9d34b673c0
[InstSimplify] Fold ptrtoint(ptradd(P,X-ptrtoint(P))) to X (#98649)
This is a special case of the general ptrtoint(gep) to add(ptrtoint)
transform that is particularly profitable, as everything folds away.

Proof: https://alive2.llvm.org/ce/z/fwv8_L

Fixes https://github.com/llvm/llvm-project/issues/86417.
2024-07-15 09:26:03 +02:00
Yingwei Zheng
6f619c98ae
[InstSimplify] Only handle canonical forms in simplifyAndOrOfFCmps. NFC. (#98136)
This patch avoids calling `isKnownNeverNaN` in `simplifyAndOrOfFCmps`
since `fcmp ord/uno X, NNAN` will be canonicalized into `fcmp ord/uno X,
0.0` in InstCombine.
2024-07-10 12:59:27 +08:00
Alex MacLean
8e9d50cdd1
[InstSimplify] fold uno/ord comparison if fpclass is always NaN (#97763)
In InstSimplify we already fold `fcmp ord/uno` to a constant when both
operands are known to be non-NaN. This change slightly generalizes this
to also handle the case where either of the operands is known to always
be NaN.

Proof: https://alive2.llvm.org/ce/z/AhCmJN
2024-07-09 14:44:48 -07:00
Noah Goldstein
aef44e49a7 [InstSimplify] Add simplification for ({u,s}rem (mul {nuw,nsw} X, C1), C0)
We can simplify these to `0` if `C1 % C0 == 0`

Proofs: https://alive2.llvm.org/ce/z/EejAdk

Closes #97037
2024-07-01 22:22:36 +08:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Nikita Popov
c5aa983f91 [InstSimplify] Fold all poison phi to poison instead of undef 2024-06-25 14:28:13 +02:00
Nikita Popov
9b8c3c6871 [InstSimplify] Use poison instead of undef for unreachable inst 2024-06-24 16:25:40 +02:00
Poseydon42
b7b3d1798d
[InstSimplify] Implement simple folds for ucmp/scmp intrinsics (#95601)
This patch adds folds for the cases where both operands are the same or
where it can be established that the first operand is less than, equal
to, or greater than the second operand.
2024-06-17 13:10:57 +08:00
AtariDreams
fbac697782
[Transforms] Replace incorrect uses of m_Deferred with m_Specific (#95719)
The values have been bound already, so use m_Specific.
2024-06-17 05:50:39 +08:00
Nikita Popov
cc2dc0916a Reapply [ConstantFold] Drop gep of gep fold entirely (#95126)
Reapplying without changes. The flang+openmp buildbot failure
should be addressed by https://github.com/llvm/llvm-project/pull/94541.

-----

This is a followup to https://github.com/llvm/llvm-project/pull/93823
and drops the DataLayout-unaware GEP of GEP fold entirely. All cases are
now left to the DataLayout-aware constant folder, which will fold
everything to a single i8 GEP.

We didn't have any test coverage for this fold in LLVM, but some Clang
tests change.
2024-06-13 17:03:35 +02:00
Nikita Popov
cece0a105b Revert "[ConstantFold] Drop gep of gep fold entirely (#95126)"
This reverts commit 3b3b839c66dc49674fd6646650525a2173030690.

This broke the flang+openmp+offload buildbot, as reported in
https://github.com/llvm/llvm-project/pull/95126#issuecomment-2162424019.
2024-06-12 11:52:12 +02:00
Nikita Popov
3b3b839c66
[ConstantFold] Drop gep of gep fold entirely (#95126)
This is a followup to https://github.com/llvm/llvm-project/pull/93823
and drops the DataLayout-unaware GEP of GEP fold entirely. All cases are
now left to the DataLayout-aware constant folder, which will fold
everything to a single i8 GEP.

We didn't have any test coverage for this fold in LLVM, but some Clang
tests change.
2024-06-12 09:50:14 +02:00
Nikita Popov
f98be870e4 [InstSimplify] Accept GEPNoWrapFlags instead of only InBounds flag
This preserves the flags if a constexpr GEP is created (at least
as long as they don't get dropped later -- the test cases uses a
constexpr index to avoid that).
2024-06-04 11:15:06 +02:00
Nikita Popov
d094bb660d [InstSimplify] Avoid use of ConstantExpr::getICmp() (NFC)
Use ConstantFoldCompareInstOperands() instead.
2024-05-21 09:26:51 +02:00
Ramkumar Ramachandra
c4d6867b78
InstSimplify: strip bad TODO (NFC) (#92754)
foldIdentityShuffles requires two sets of canceling shuffles. If there
are any intervening instructions, they are feeding in the result of the
first set of shuffles. To eliminate the two sets of shuffles, you'd have
to rewrite the head of the intervening instructions to feed in the
operand of the first set of shuffles. Since modifying the IR in any way
is disallowed by an analysis, strip this bad TODO.
2024-05-21 08:02:10 +01:00
Yingwei Zheng
d085b42cbb
[InstSimplify] Do not simplify freeze in simplifyWithOpReplaced (#91215)
See the LangRef:
> All uses of a value returned by the same ‘freeze’ instruction are
guaranteed to always observe the same value, while different ‘freeze’
instructions may yield different values.

It is incorrect to replace freezes with the simplified value.

Proof:
https://alive2.llvm.org/ce/z/3Dn9Cd
https://alive2.llvm.org/ce/z/Qyh5h6

Fixes https://github.com/llvm/llvm-project/issues/91178
2024-05-08 10:04:09 +08:00
Maciej Gabka
bfc0317153
Move several vector intrinsics out of experimental namespace (#88748)
This patch is moving out following intrinsics:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice

from the experimental namespace.

All these intrinsics exist in LLVM for more than a year now, and are
widely used, so should not be considered as experimental.
2024-04-29 10:16:45 +01:00
Nikita Popov
f8a19a8f74 [SimplifyQuery] Avoid PatternMatch.h include (NFC)
Move the one method that uses it out of line. This is primarily to
reduce the number of files to rebuild when changing PatternMatch.h.
2024-04-23 12:18:07 +09:00
Nikita Popov
1baa385065
[IR][PatternMatch] Only accept poison in getSplatValue() (#89159)
In #88217 a large set of matchers was changed to only accept poison
values in splats, but not undef values. This is because we now use
poison for non-demanded vector elements, and allowing undef can cause
correctness issues.

This patch covers the remaining matchers by changing the AllowUndef
parameter of getSplatValue() to AllowPoison instead. We also carry out
corresponding renames in matchers.

As a followup, we may want to change the default for things like m_APInt
to m_APIntAllowPoison (as this is much less risky when only allowing
poison), but this change doesn't do that.

There is one caveat here: We have a single place
(X86FixupVectorConstants) which does require handling of vector splats
with undefs. This is because this works on backend constant pool
entries, which currently still use undef instead of poison for
non-demanded elements (because SDAG as a whole does not have an explicit
poison representation). As it's just the single use, I've open-coded a
getSplatValueAllowUndef() helper there, to discourage use in any other
places.
2024-04-18 15:44:12 +09:00
Andreas Jonson
ff3523f67b
[IR] Drop poison-generating return attributes when necessary (#89138)
Rename has/dropPoisonGeneratingFlagsOrMetadata to
has/dropPoisonGeneratingAnnotations and make it also handle
nonnull, align and range return attributes on calls, similar
to the existing handling for !nonnull, !align and !range metadata.
2024-04-18 15:27:36 +09:00
Nikita Popov
d9a5aa8e2d
[PatternMatch] Do not accept undef elements in m_AllOnes() and friends (#88217)
Change all the cstval_pred_ty based PatternMatch helpers (things like
m_AllOnes and m_Zero) to only allow poison elements inside vector
splats, not undef elements.

Historically, we used to represent non-demanded elements in vectors
using undef. Nowadays, we use poison instead. As such, I believe that
support for undef in vector splats is no longer useful.

At the same time, while poison splat elements are pretty much always
safe to ignore, this is not generally the case for undef elements. We
have existing miscompiles in our tests due to this (see the
masked-merge-*.ll tests changed here) and it's easy to miss such cases
in the future, now that we write tests using poison instead of undef
elements.

I think overall, keeping support for undef elements no longer makes
sense, and we should drop it. Once this is done consistently, I think we
may also consider allowing poison in m_APInt by default, as doing that
change is much less risky than doing the same with undef.

This change involves a substantial amount of test changes. For most
tests, I've just replaced undef with poison, as I don't think there is
value in retaining both. For some tests (where the distinction between
undef and poison is important), I've duplicated tests.
2024-04-17 18:22:05 +09:00
Harald van Dijk
60de56c743
[ValueTracking] Restore isKnownNonZero parameter order. (#88873)
Prior to #85863, the required parameters of llvm::isKnownNonZero were
Value and DataLayout. After, they are Value, Depth, and SimplifyQuery,
where SimplifyQuery is implicitly constructible from DataLayout. The
change to move Depth before SimplifyQuery needed callers to be updated
unnecessarily, and as commented in #85863, we actually want Depth to be
after SimplifyQuery anyway so that it can be defaulted and the caller
does not need to specify it.
2024-04-16 15:21:09 +01:00
Jordan Rupprecht
37575f5262 [NFC][ValueTracking] Fix Wunused-variable
For e0a628715a8464e220c8ba9e9aaaf2561139198a
2024-04-12 16:27:53 +00:00
Yingwei Zheng
e0a628715a
[ValueTracking] Convert isKnownNonZero to use SimplifyQuery (#85863)
This patch converts `isKnownNonZero` to use SimplifyQuery. Then we can
use the context information from `DomCondCache`.

Fixes https://github.com/llvm/llvm-project/issues/85823.
Alive2: https://alive2.llvm.org/ce/z/QUvHVj
2024-04-12 23:47:20 +08:00
AtariDreams
5d6b00929b
[NFC] Replace m_Sub(m_Zero(), X) with m_Neg(X) (#88461) 2024-04-12 18:24:03 +09:00
Yingwei Zheng
3197f9d8b0
[InstSimplify] Make sure the simplified value doesn't generate poison in threadBinOpOverSelect (#87075)
Alive2: https://alive2.llvm.org/ce/z/y_Jmdn
Fix https://github.com/llvm/llvm-project/issues/87042.
2024-04-11 12:48:52 +08:00
Yingwei Zheng
2f1f6b704d
[LLVM] Use std::move for APInt. NFC. (#86257)
This patch adjusts argument passing for `APInt` to improve the
compile-time.
Compile-time improvement:
https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=ba3e326def3a6e5cd6d72ff5a49c74fba18de1df&stat=instructions:u
2024-03-23 14:58:25 +08:00
Andreas Jonson
e66cfebb04
[ValueTracking] Handle range attributes (#85143)
Handle the range attribute in ValueTracking.
2024-03-20 12:43:00 +01:00
Noah Goldstein
5265be11b1 [InstSimply] Simplify (fmul -x, +/-0) -> -/+0
We already handle the `+x` case, and noticed it was missing in the bug
affecting #82555

Proofs: https://alive2.llvm.org/ce/z/WUSvmV

Closes #85345
2024-03-18 15:11:55 -05:00
Artem Tyurin
141145232f
[IRBuilder] Fold binary intrinsics (#80743)
Fixes https://github.com/llvm/llvm-project/issues/61240.
2024-03-15 09:58:25 +01:00
Andreas Jonson
a3b52509d5
[InstSimpliy] Use range attribute to simplify comparisons (#84627)
Use the new range attribute from https://github.com/llvm/llvm-project/pull/84617
to simplify comparisons where both sides have range information.
2024-03-12 10:39:37 +01:00
Andreas Jonson
54bb4be018
[InstSimplify] Handle vec values when simplifying comparisons using range metadata (#84673)
Found that this failed with an assertion when vec was used in this
optimization while working on https://github.com/llvm/llvm-project/pull/84627.
2024-03-10 12:54:37 +01:00
Björn Pettersson
7677453886
[ConstantFolding] Do not consider padded-in-memory types as uniform (#81854)
Teaching ConstantFoldLoadFromUniformValue that types that are padded in
memory can't be considered as uniform.

Using the big hammer to prevent optimizations when loading from a
constant for which DataLayout::typeSizeEqualsStoreSize would return
false.

Main problem solved would be something like this:
  store i17 -1, ptr %p, align 4
  %v = load i8, ptr %p, align 1
If for example the i17 occupies 32 bits in memory, then LLVM IR doesn't
really tell where the padding goes. And even if we assume that the 15
most significant bits are padding, then they should be considered as
undefined (even if LLVM backend typically would pad with zeroes).
Anyway, for a big-endian target the load would read those most
significant bits, which aren't guaranteed to be one's. So it would be
wrong to constant fold the load as returning -1.

If LLVM IR had been more explicit about the placement of padding, then
we could allow the constant fold of the load in the example, but only
for little-endian.

Fixes: https://github.com/llvm/llvm-project/issues/81793
2024-02-15 15:40:21 +01:00
Yingwei Zheng
470c5b8011
[InstSimplify][InstCombine] Remove unnecessary m_c_* matchers. (#81712)
This patch removes unnecessary `m_c_*` matchers since we always
canonicalize `commutive_op Cst, X` into `commutive_op X, Cst`.

Compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=bfc0b7c6891896ee8e9818f22800472510093864&to=d27b058bb9acaa43d3cadbf3cd889e8f79e5c634&stat=instructions:u
2024-02-14 16:40:36 +08:00
Yingwei Zheng
dc866ae49e
[ValueTracking] Move the isSignBitCheck helper into ValueTracking. NFC. (#81704)
This patch moves the `isSignBitCheck` helper into ValueTracking to reuse
the logic in ValueTracking/InstSimplify.

Addresses the comment
https://github.com/llvm/llvm-project/pull/80740#discussion_r1488440050.
2024-02-14 15:33:08 +08:00
Danila Malyutin
cb1a9f70ec
[InstSimplify] Add trivial simplifications for gc.relocate intrinsic (#81639)
Fold gc.relocate of undef and null to undef and null respectively.

Similar transform is currently done by instcombine, but there is no
reason to not include it here as well.
2024-02-14 02:16:32 +03:00