1005 Commits

Author SHA1 Message Date
Yingwei Zheng
345d7b1618
[InstCombine] Fold minmax intrinsic using KnownBits information (#76242)
This patch tries to fold minmax intrinsic by using
`computeConstantRangeIncludingKnownBits`.
Fixes regression in
[_karatsuba_rec:cpython/Modules/_decimal/libmpdec/mpdecimal.c](c31943af16/Modules/_decimal/libmpdec/mpdecimal.c (L5460-L5462)),
which was introduced by #71396.
See also
https://github.com/dtcxzyw/llvm-opt-benchmark/issues/16#issuecomment-1865875756.

Alive2 for splat vectors with undef: https://alive2.llvm.org/ce/z/J8hKWd
2023-12-23 04:41:32 +08:00
Chia
8674a023bc
[InstCombine] fold (Binop phi(a, b) phi(b, a)) -> (Binop a, b) while Binop is commutative. (#75765)
Alive2 proof: https://alive2.llvm.org/ce/z/2P8gq-
This patch closes #73905
2023-12-21 22:47:21 +08:00
Nikita Popov
465ecf872e [InstCombine] Rename UndefElts -> PoisonElts (NFC)
In line with updated shufflevector semantics, this represents the
poison elements rather than undef elements now. This commit is a
pure rename, without any logic changes.
2023-12-18 12:36:19 +01:00
Benjamin Kramer
60aeea21fd [InstCombine] Fix uninitialized variable usage
m_Specific can only be used if the previous check succeeded. Found by
msan.
2023-12-13 16:31:19 +01:00
Sizov Nikita
88cc35b27e
[InstCombine] Fold binop (select cond, a, b), (select cond, b, a) to binop a, b (#74953)
```
CommutativeBinOp(select(V, A, B), select(V, B, A)) --> CommutativeBinOp(A, B)
CommutativeIntrinsicCall(select(V, A, B), select(V, B, A), ...) --> CommutativeIntrinsicCall(A, B, ...)
```

https://alive2.llvm.org/ce/z/8CDUZ4

Closes #73904
2023-12-13 14:09:27 +08:00
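The select fold in 88cc35b27e can be sanity-checked outside LLVM. The sketch below is not the InstCombine code: it brute-forces the identity over small Python integers, using Python operators as stand-ins for commutative binops. `select`, `fold_holds` and `check_all` are hypothetical helper names.

```python
import itertools
import operator


def select(cond, a, b):
    """Model of an IR select: yield a when cond is true, otherwise b."""
    return a if cond else b


# Stand-ins for commutative binary operations (assumption: plain Python ints).
COMMUTATIVE_OPS = [operator.add, operator.mul, operator.and_,
                   operator.or_, operator.xor, min, max]


def fold_holds(op, cond, a, b):
    """Check op(select(c, a, b), select(c, b, a)) == op(a, b)."""
    return op(select(cond, a, b), select(cond, b, a)) == op(a, b)


def check_all(values=range(-4, 5)):
    """Exhaustively test every op, condition and operand pair."""
    return all(
        fold_holds(op, cond, a, b)
        for op in COMMUTATIVE_OPS
        for cond in (False, True)
        for a, b in itertools.product(values, repeat=2)
    )
```

A non-commutative operator such as subtraction fails the same check, which is why the fold is restricted to commutative operations.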
Sizov Nikita
827f8a7ef6
Add opt with ctlz and shifts of power of 2 constants (#74175)
This patch does the following simplifications:
```
cttz(shl(C, X), 1) -> add(cttz(C, 1), X)
cttz(lshr exact(C, X), 1) -> sub(cttz(C, 1), X)
ctlz(lshr(C, X), 1) --> add(ctlz(C, 1), X)
ctlz(shl nuw (C, X), 1) --> sub(ctlz(C, 1), X)
```
Alive2: https://alive2.llvm.org/ce/z/9KHlKc
Closes #41333
2023-12-08 15:06:23 +08:00
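The four simplifications listed in 827f8a7ef6 can be checked exhaustively at a small bit width. This is a minimal sketch under stated assumptions, not the patch itself: the width is fixed at 8 bits, `C` ranges over powers of 2, a zero input to `cttz`/`ctlz` is treated as poison (the second argument is 1) and skipped, and `exact`/`nuw` are modelled as "no set bits are shifted out".

```python
BITS = 8
MASK = (1 << BITS) - 1


def cttz(v):
    """Count trailing zeros of a nonzero BITS-wide value."""
    assert v != 0
    return (v & -v).bit_length() - 1


def ctlz(v):
    """Count leading zeros of a nonzero BITS-wide value."""
    assert v != 0
    return BITS - v.bit_length()


def check_folds():
    """Return True if all four folds hold for every power-of-2 C and shift X."""
    for c in (1 << k for k in range(BITS)):
        for x in range(BITS):
            shl = (c << x) & MASK
            lshr = c >> x
            # cttz(shl(C, X), 1) -> add(cttz(C, 1), X); a zero result is poison.
            if shl != 0 and cttz(shl) != cttz(c) + x:
                return False
            # cttz(lshr exact(C, X), 1) -> sub(cttz(C, 1), X)
            if lshr != 0 and (lshr << x) == c and cttz(lshr) != cttz(c) - x:
                return False
            # ctlz(lshr(C, X), 1) -> add(ctlz(C, 1), X)
            if lshr != 0 and ctlz(lshr) != ctlz(c) + x:
                return False
            # ctlz(shl nuw(C, X), 1) -> sub(ctlz(C, 1), X)
            if (c << x) <= MASK and ctlz(shl) != ctlz(c) - x:
                return False
    return True
```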
Jeremy Morse
2425e2940e
[DebugInfo][RemoveDIs] Have getInsertionPtAfterDef return an iterator (#73149)
Part of the "RemoveDIs" project to remove debug intrinsics requires
passing block-positions around in iterators rather than as instruction
pointers, allowing some debug-info to reside in BasicBlock::iterator.
This means getInsertionPointAfterDef has to return an iterator, and as
it can return no-instruction that means returning an optional iterator.

This patch changes the signature for getInsertionPtAfterDef and then
patches up the various places that use it to handle the different type.
This would overall be an NFC patch, however in
InstCombinerImpl::freezeOtherUses I've started skipping any debug
intrinsics at the returned insert-position. This should not have any
_meaningful_ effect on the compiler output: at worst it means variable
assignments that are skipped will now cover the freeze instruction and
anything inserted before it, which should be inconsequential.

Sadly: this makes the function signature ugly. This is probably the
ugliest piece of fallout for the "RemoveDIs" work, but it serves the
overall purpose of improving compile times and not allowing `-g` to
affect compiler output, so should be worthwhile in the end.
2023-11-30 12:19:57 +00:00
Noah Goldstein
b7c0f79926 [InstCombine] Replace isFreeToInvert + CreateNot with getFreelyInverted
This is nearly an NFC; the only potential change is the order in which
values are created/named.

Otherwise it is a slight speed boost/simplification to avoid having to
go through the `getFreelyInverted` recursive logic twice to simplify
the extra `not` op.
2023-11-20 17:59:27 -06:00
Tom Stellard
2750a22745
Passes: Consolidate EnableKnowledgeRetention declarations into a header file (#71695)
2023-11-13 11:03:49 -08:00
Noah Goldstein
cc8341872d [InstCombine] Preserve return attributes when merging llvm.ptrmask
If we have associated attributes, i.e. `([ret_attrs] (ptrmask (ptrmask
p0, m0), m1))` we should preserve `[ret_attrs]` when combining the two
`llvm.ptrmask`s.

Differential Revision: https://reviews.llvm.org/D156638
2023-11-01 23:50:36 -05:00
Noah Goldstein
51abbf98d1 [InstCombine] Deduce align and nonnull return attributes for llvm.ptrmask
We can deduce the former based on the mask / incoming pointer
alignment. We can set the latter if we know the result is non-zero
(this is essentially just caching our analysis result).

Differential Revision: https://reviews.llvm.org/D156636
2023-11-01 23:50:35 -05:00
Noah Goldstein
edb9e9a5fb [InstCombine] Implement SimplifyDemandedBits for llvm.ptrmask
Logic basically copies 'and' but we can't return a constant if the
result == rhs (mask) so that case is skipped.
2023-11-01 23:50:35 -05:00
Nikita Popov
0b5e0fb62d [InstCombine] Avoid some uses of ConstantExpr::getIntegerCast() (NFC)
Use IRBuilder or ConstantFolding instead.
2023-11-01 11:41:50 +01:00
Nikita Popov
eb86de63d9
[IR] Require that ptrmask mask matches pointer index size (#69343)
Currently, we specify that the ptrmask intrinsic allows the mask to have
any size, which will be zero-extended or truncated to the pointer size.

However, what the semantics of the specified GEP expansion actually imply is
that the mask is only meaningful up to the pointer type *index* size --
any higher bits of the pointer will always be preserved. In other words,
the mask gets 1-extended from the index size to the pointer size. This
is also the behavior we want for CHERI architectures.

This PR makes two changes:
* It spells out the interaction with the pointer type index size more
explicitly.
* It requires that the mask matches the pointer type index size. The
intention here is to make handling of this intrinsic more robust, to
avoid accidental mix-ups of pointer size and index size in code
generating this intrinsic. If a zero-extend or truncate of the mask is
desired, it should just be done explicitly in IR. This also cuts down on
the amount of testing we have to do, and the things transforms need to
check for.

As far as I can tell, we don't actually support pointers with different
index type size at the SDAG level, so I'm just asserting the sizes match
there for now. Out-of-tree targets using different index sizes may need
to adjust that code.
2023-10-24 09:54:29 +02:00
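The "1-extended" behavior described in eb86de63d9 can be modelled on integers. The sketch below is an illustration only: the 64-bit pointer width and 32-bit index width are hypothetical example values, and `ptrmask` here is a Python stand-in for the intrinsic rather than LLVM code.

```python
def ptrmask(ptr, mask, ptr_bits=64, index_bits=32):
    """Model ptrmask with a mask that has the pointer index width.

    Bits above index_bits are always preserved, which is the same as
    extending the mask with ones up to the full pointer width.
    """
    assert 0 <= ptr < (1 << ptr_bits)
    assert 0 <= mask < (1 << index_bits)
    # All-ones in the bits between index_bits and ptr_bits.
    high_ones = ((1 << ptr_bits) - 1) & ~((1 << index_bits) - 1)
    return ptr & (mask | high_ones)
```

With these example widths, `ptrmask(0xDEADBEEF0000FFFF, 0)` clears only the low 32 bits and leaves the upper half of the pointer untouched.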
Kerry McLaughlin
b0cc47c959
[InstCombine] Remove scalable vector extracts to and from the same type (#69702)
visitCallInst already looks for fixed width vector extracts where the number of
elements in the source and destination types are equal. This patch modifies
the function to also identify scalable extracts which can be removed.
2023-10-23 11:21:49 +01:00
Nikita Popov
d4300154b6 Revert "[ValueTracking] Remove by-ref computeKnownBits() overloads (NFC)"
This reverts commit b5743d4798b250506965e07ebab806a3c2d767cc.

This causes some minor compile-time impact. Revert for now, better
to do the change more gradually.
2023-10-16 14:04:09 +02:00
Nikita Popov
b5743d4798 [ValueTracking] Remove by-ref computeKnownBits() overloads (NFC)
Remove the old overloads that accept KnownBits by reference, in
favor of those that return it by value.
2023-10-16 13:00:31 +02:00
Nikita Popov
6cd5eb1f54 [InstCombine] Avoid some uses of ConstantExpr::getZExt() (NFC)
Add helpers getLosslessUnsignedTrunc/getLosslessSignedTrunc for
this common pattern.
2023-09-28 17:02:33 +02:00
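For 6cd5eb1f54, the idea of a "lossless" truncation can be sketched on bit patterns. This assumes the helpers mean "truncate the constant, and succeed only if zero- or sign-extending the result gives back the original"; the Python function names and the `None` return for failure are hypothetical, not the LLVM signatures.

```python
def lossless_unsigned_trunc(value, src_bits, dst_bits):
    """Return value truncated to dst_bits if zext restores it, else None."""
    assert 0 <= value < (1 << src_bits) and dst_bits <= src_bits
    truncated = value & ((1 << dst_bits) - 1)
    return truncated if truncated == value else None


def lossless_signed_trunc(value, src_bits, dst_bits):
    """Return value truncated to dst_bits if sext restores it, else None."""
    assert 0 <= value < (1 << src_bits) and dst_bits <= src_bits
    dst_mask = (1 << dst_bits) - 1
    src_mask = (1 << src_bits) - 1
    truncated = value & dst_mask
    # Sign-extend the truncated pattern back to src_bits.
    if truncated >> (dst_bits - 1):
        extended = truncated | (src_mask & ~dst_mask)
    else:
        extended = truncated
    return truncated if extended == value else None
```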
Matt Arsenault
07acfe3a4d
ADT: Replace FPClassTest fabs with inverse_fabs and unknown_sign (#66390)
2023-09-14 19:46:53 +03:00
Jeremy Morse
d529943a27 [NFC][RemoveDIs] Prefer iterators over inst-pointers in InstCombine
As per my proposal for how to eliminate debug intrinsics [0], for various
places in InstCombine prefer to insert using an instruction iterator rather
than an instruction pointer. This is so that we can eventually pass more
information in the iterator class. The call-sites where I've changed the
spelling are those that are necessary to build a stage2 clang to produce an
identical binary in the coming no-debug-intrinsics mode.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152543
2023-09-11 15:04:51 +01:00
Qi Hu
1a65cd3fcf [InstCombine] Optimize implementations of min/max for bool
umin.i1 -> and : https://alive2.llvm.org/ce/z/6FNH6k
smin.i1 -> or : https://alive2.llvm.org/ce/z/h96S6o
umax.i1 -> or : https://alive2.llvm.org/ce/z/XHdeVk
smax.i1 -> and : https://alive2.llvm.org/ce/z/fkxKJx
umin.v4i1 -> and : https://alive2.llvm.org/ce/z/yV4VgP
smin.v4i1 -> or : https://alive2.llvm.org/ce/z/e9TF68
umax.v4i1 -> or : https://alive2.llvm.org/ce/z/tfNyfK
smax.v4i1 -> and : https://alive2.llvm.org/ce/z/0__Af2

Reviewed By: goldstein.w.n, bryanpkc

Differential Revision: https://reviews.llvm.org/D158915
2023-09-07 10:28:54 -04:00
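The scalar mappings in 1a65cd3fcf follow from how an i1 is interpreted: unsigned, the bit 1 is the value 1; signed, the single bit is the sign bit, so 1 means -1. A minimal truth-table check, with hypothetical helper names:

```python
def as_signed_i1(bit):
    """Interpret an i1 as signed: the only bit is the sign bit, so 1 -> -1."""
    return -bit


def from_signed_i1(value):
    """Convert a signed i1 value (0 or -1) back to its bit."""
    return -value


def umin(a, b):
    return min(a, b)


def umax(a, b):
    return max(a, b)


def smin(a, b):
    return from_signed_i1(min(as_signed_i1(a), as_signed_i1(b)))


def smax(a, b):
    return from_signed_i1(max(as_signed_i1(a), as_signed_i1(b)))


def check_i1_minmax():
    """Verify umin/smax == and, smin/umax == or for all i1 inputs."""
    for a in (0, 1):
        for b in (0, 1):
            if umin(a, b) != (a & b) or smax(a, b) != (a & b):
                return False
            if smin(a, b) != (a | b) or umax(a, b) != (a | b):
                return False
    return True
```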
Kazu Hirata
83e6931827 [llvm] Use llvm::is_contained (NFC)
2023-09-02 09:32:46 -07:00
Fangrui Song
111fcb0df0 [llvm] Fix duplicate word typos. NFC
Those fixes were taken from https://reviews.llvm.org/D137338
2023-09-01 18:25:16 -07:00
Matt Arsenault
5ae881ff0a InstCombine: Fold out scale-if-denormal pattern
Fold select (fcmp oeq x, 0), (fmul x, y), x => x

This cleans up a pattern left behind by denormal range checks under
denormals-are-zero (DAZ).

The pattern starts out as something like:
  x = x < smallest_normal ? x * K : x;

The comparison folds to an == 0 when the denormal mode treats input
denormals as zero. This makes library denormal checks free after
being linked into DAZ-enabled code.

alive2 is mostly happy with this, but there are some issues. First,
there are many reported failures in some of the negative tests that
happen to trigger some preexisting canonicalize introducing
combine. Second, alive2 is incorrectly asserting that denormals must
be flushed with the DAZ modes. It's allowed to drop a canonicalize.

https://reviews.llvm.org/D157030
2023-09-01 07:47:12 -04:00
Matt Arsenault
2b582440c1 InstCombine: Fold is.fpclass(x, fcInf) to fabs+fcmp
This is a better canonical form. fcmp and fabs are more widely
understood and fabs can fold for free into some sources.

Addresses todo from D146170

https://reviews.llvm.org/D159084
2023-08-29 17:58:15 -04:00
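The equivalence behind 2b582440c1 is that a value is an infinity exactly when its absolute value compares equal to +inf. The sketch below checks this with Python floats (IEEE doubles); `is_inf_via_fabs` is a hypothetical name and this is not the LLVM transform itself.

```python
import math


def is_inf_via_fabs(x):
    """Test for +/-infinity using fabs followed by an ordered equality compare."""
    return abs(x) == math.inf


# A few representative inputs, including NaN, signed zeros and a denormal.
SAMPLES = [math.inf, -math.inf, math.nan, 0.0, -0.0, 1.0, -1.0, 1e308, 5e-324]


def check_is_inf():
    """Compare the fabs+compare form against the library infinity test."""
    return all(is_inf_via_fabs(x) == math.isinf(x) for x in SAMPLES)
```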
Zhongyunde
4225f54bf5 [InstCombine] Fold abs of known sign operand when source is sub
abs(x-y) --> x-y where x >= y, done in D122013
abs(x-y) --> y-x where x <= y

proofs: https://alive2.llvm.org/ce/z/KkeEsd

Reviewed By: goldstein.w.n, nikic
Differential Revision: https://reviews.llvm.org/D156499
2023-08-07 11:55:11 +08:00
Matt Arsenault
d74c89fdb4 InstCombine: Drop some typed pointer bitcasts
2023-07-31 08:05:58 -04:00
Matt Arsenault
d388222be2 InstCombine: Drop some typed pointer bitcast handling
2023-07-31 08:05:12 -04:00
Noah Goldstein
edf2e0e075 [InstCombine] Folding @llvm.ptrmask with itself
`@llvm.ptrmask` is basically just `and` with a `ptr` operand. This is
a trivial combine to do with `and` (many others could also be added).

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D154006
2023-07-27 17:43:08 -05:00
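Since edf2e0e075 describes `@llvm.ptrmask` as essentially an `and` on a pointer, the nested-mask combine reduces to associativity of bitwise AND. A small sketch, treating pointers as plain integers (an assumption made only for illustration):

```python
def ptrmask(ptr, mask):
    """Model ptrmask as a bitwise AND on the pointer's integer value."""
    return ptr & mask


def fold_holds(ptr, m0, m1):
    """Check ptrmask(ptrmask(p, m0), m1) == ptrmask(p, m0 & m1)."""
    return ptrmask(ptrmask(ptr, m0), m1) == ptrmask(ptr, m0 & m1)


def check_all(bits=4):
    """Exhaustively test every pointer and mask pair at a small width."""
    values = range(1 << bits)
    return all(fold_holds(p, m0, m1)
               for p in values for m0 in values for m1 in values)
```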
Nikita Popov
dc2b2ae7dc [InstCombine] Fold cttz of lowest set bit
cttz(-a & a) is the same as cttz(a). -a & a is an idiom to extract
the lowest set bit, which naturally does not affect the number of
trailing zeroes.

Proof: https://alive2.llvm.org/ce/z/Yp26x7
2023-07-14 14:31:35 +02:00
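The identity in dc2b2ae7dc is easy to verify exhaustively at a small width. The sketch below assumes 8-bit values and defines `cttz(0)` as the bit width; the helper names are hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1


def cttz(v):
    """Count trailing zeros of a BITS-wide value; returns BITS for zero."""
    if v == 0:
        return BITS
    return (v & -v).bit_length() - 1


def lowest_set_bit(a):
    """Compute -a & a in BITS-wide two's complement arithmetic."""
    return (-a & MASK) & a


def check_cttz_lowbit():
    """Verify cttz(-a & a) == cttz(a) for every BITS-wide a."""
    return all(cttz(lowest_set_bit(a)) == cttz(a) for a in range(1 << BITS))
```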
Matt Arsenault
4f9aad964f InstCombine: Fold ldexp(ldexp(x, a), b) -> ldexp(x, a + b)
The problem here is overflow or underflow which would have occurred in
the inner operation, which the exponent offsetting avoids. We can do
this if we know the two exponents are in the same direction, or
reassoc flags allow unsafe reassociates.
2023-07-07 08:15:09 -04:00
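The overflow/underflow concern in 4f9aad964f can be reproduced with Python's `math.ldexp`. This is a sketch on IEEE doubles, not the InstCombine logic; `ldexp_or_inf` is a hypothetical wrapper that turns Python's `OverflowError` into an infinity so both forms can be compared.

```python
import math


def ldexp_or_inf(x, e):
    """math.ldexp, but return a signed infinity instead of raising on overflow."""
    try:
        return math.ldexp(x, e)
    except OverflowError:
        return math.copysign(math.inf, x)


def fold_matches(x, a, b):
    """Compare ldexp(ldexp(x, a), b) with the folded ldexp(x, a + b)."""
    nested = ldexp_or_inf(ldexp_or_inf(x, a), b)
    folded = ldexp_or_inf(x, a + b)
    return nested == folded
```

With exponents in the same direction, such as `fold_matches(1.5, 3, 4)`, the two forms agree. With opposite directions, such as `fold_matches(1.0, -2000, 2000)`, the inner operation underflows to zero while the folded form returns `1.0`, which is the case the commit message guards against.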
Elliot Goodrich
f0fa2d7c29 [llvm] Move AttributeMask to a separate header
Move `AttributeMask` out of `llvm/IR/Attributes.h` to a new file
`llvm/IR/AttributeMask.h`.  After doing this we can remove the
`#include <bitset>` and `#include <set>` directives from `Attributes.h`.
Since there are many headers including `Attributes.h`, but not needing
the definition of `AttributeMask`, this causes unnecessary bloating of
the translation units and slows down compilation.

This commit adds in the include directive for `llvm/IR/AttributeMask.h`
to the handful of source files that need to see the definition.

This reduces the total number of preprocessing tokens across the LLVM
source files in lib from (roughly) 1,917,509,187 to 1,902,982,273 - a
reduction of ~0.76%. This should result in a small improvement in
compilation time.

Differential Revision: https://reviews.llvm.org/D153728
2023-06-27 15:26:17 +01:00
Nikita Popov
8762f4c748 [InstCombine] Track inserted instructions when lowering objectsize
The inserted instructions can usually be simplified. Make sure this
happens in the same InstCombine iteration by adding them to the
worklist.

We happen to get some better optimization in two cases, but this is
just a lucky accident. https://github.com/llvm/llvm-project/issues/63472
tracks implementing a fold for that case.

This doesn't track all inserted instructions yet, for that we would
also have to include those created by ObjectSizeOffsetEvaluator.
2023-06-23 15:36:23 +02:00
Nikita Popov
7b356769fc [InstCombine] Fold assume(false) to non-terminator unreachable
assume(false) is immediate UB, so fold it to (non-terminator)
unreachable.
2023-06-22 16:44:05 +02:00
luxufan
7fc0efd0dc [InstCombine] Add !noundef to match behavior of violating assume
The behaviors of violating an assume instruction and !nonnull metadata are
different. The former is immediate undefined behavior, but the latter
returns a poison value. This patch adds !noundef to trigger immediate
undefined behavior if !nonnull is violated.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D153400
2023-06-21 23:17:57 +08:00
Nikita Popov
97b5cc214a [ValueTracking] Remove ORE argument (NFC-ish)
The ORE argument threaded through ValueTracking is used only in a
single, untested place. It is also essentially never passed: The
only places that do so have been added very recently as part of the
KnownFPClass migration, which is vanishingly unlikely to hit this
code path. Remove this effectively dead argument.

Differential Revision: https://reviews.llvm.org/D151562
2023-06-02 09:11:53 +02:00
Austin Chang
68c5d46b6e [InstCombine] Improve bitreverse optimization
This patch utilizes the helper function implemented in D149699 and thus folds the following cases:

```
bitreverse(logic_op(x, bitreverse(y))) -> logic_op(bitreverse(x), y)
bitreverse(logic_op(bitreverse(x), y)) -> logic_op(x, bitreverse(y))
bitreverse(logic_op(bitreverse(x), bitreverse(y))) -> logic_op(x, y) in multiuse case
```

Reviewed By: goldstein.w.n, RKSimon

Differential Revision: https://reviews.llvm.org/D151246
2023-05-25 13:41:32 -05:00
Matt Arsenault
591ba11b93 Reapply "SimplifyLibCalls: Pass AssumptionCache to isKnownNeverInfinity"
This reverts commit b357f379c81811409348dd0e0273a248b055bb7a.
2023-05-23 08:48:25 +01:00
Alina Sbirlea
b357f379c8 Revert "SimplifyLibCalls: Pass AssumptionCache to isKnownNeverInfinity"
This reverts commit faa32734bf9a55fa3f91d91f6fdf0f8a951a9c0e.
Revert due to test failures introduced by 73925ef8b0eacc6792f0e3ea21a3e6d51f5ee8b0
2023-05-18 23:31:51 -07:00
Matt Arsenault
faa32734bf SimplifyLibCalls: Pass AssumptionCache to isKnownNeverInfinity
Lets assumes work for determining no infinities.
2023-05-18 19:44:56 +01:00
Austin Chang
2b346a138d Recommit "[InstCombine] Improve bswap optimization" (2nd try)
The issue was an assertion failure due to an unchecked `cast`. The fix is to
check that the operator is a `BinaryOperator` before the cast so that we won't
match a `ConstantExpr`.

Reviewed By: goldstein.w.n, RKSimon

Differential Revision: https://reviews.llvm.org/D149699
2023-05-16 18:58:09 -05:00
Matt Arsenault
86d0b524f3 ValueTracking: Expand signature of isKnownNeverInfinity/NaN
This is in preparation for replacing the implementation
with a wrapper around computeKnownFPClass.
2023-05-16 20:42:58 +01:00
Matt Arsenault
e09115bcfd InstCombine: Try to turn is.fpclass sign checks to fcmp with 0
Try to use gt/lt compares with 0 instead of class.
2023-05-16 20:42:58 +01:00
Noah Goldstein
8606e91f2b Revert "[InstCombine] Improve bswap + logic_op optimization"
The generic cast to `BinaryOperator` can break if `V` is not a
`BinaryOperator` (e.g. a `ConstantExpr`). This occurs in builds such as the
PPC Linux build.

This reverts commit fe733f54da6faca95070b36b1640dbca3e43d396.
2023-05-08 00:55:43 -05:00
Austin Chang
fe733f54da [InstCombine] Improve bswap + logic_op optimization
The patch implements a helper function that matches and folds the following cases in the InstCombine pass:

    bswap(logic_op(x, bswap(y))) -> logic_op(bswap(x), y)
    bswap(logic_op(bswap(x), y)) -> logic_op(x, bswap(y))
    bswap(logic_op(bswap(x), bswap(y))) -> logic_op(x, y) in multiuse case, which still reduces the number of instructions.

The helper function accepts bswap and bitreverse intrinsics. This patch folds the bswap cases and leaves the bitreverse optimization for the future.

Differential Revision: https://reviews.llvm.org/D149699
2023-05-07 14:28:07 +01:00
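The three folds in fe733f54da rely on bswap being its own inverse and distributing over bitwise logic ops; bitreverse has the same two properties, which is consistent with the helper accepting both intrinsics. The sketch below checks the folds on sampled 16-bit values. It is a sanity check with hypothetical helper names, not the LLVM helper.

```python
import operator


def bswap(v):
    """Swap the two bytes of a 16-bit value."""
    return ((v & 0xFF) << 8) | (v >> 8)


def bitreverse(v):
    """Reverse the bit order of a 16-bit value."""
    return int(format(v, "016b")[::-1], 2)


LOGIC_OPS = (operator.and_, operator.or_, operator.xor)

# A spread of 16-bit sample values plus two edge cases.
SAMPLES = list(range(0, 1 << 16, 1021)) + [0x00FF, 0xFFFF]


def folds_hold(perm, samples=SAMPLES):
    """Check the three folds for a given permutation (bswap or bitreverse)."""
    for op in LOGIC_OPS:
        for x in samples:
            for y in samples:
                if perm(op(x, perm(y))) != op(perm(x), y):
                    return False
                if perm(op(perm(x), y)) != op(x, perm(y)):
                    return False
                if perm(op(perm(x), perm(y))) != op(x, y):
                    return False
    return True
```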
ManuelJBrito
d22edb9794 [IR][NFC] Change UndefMaskElem to PoisonMaskElem
Following the change in shufflevector semantics,
poison will be used to represent undefined elements in shufflevector masks.

Differential Revision: https://reviews.llvm.org/D149256
2023-04-27 18:01:54 +01:00
Matt Arsenault
9156559254 InstCombine: Use computeKnownFPClass in is.fpclass combines and pass AC
The various isKnownNever* calls can be merged into one. This also introduces
the new ability to remove zero/sub/normal checks. Also start passing the
AssumptionCache arguments.
2023-04-26 13:36:48 -04:00
Kazu Hirata
804467de94 Use isNegative (NFC)
2023-04-15 14:26:24 -07:00
Nikita Popov
a162ddf7f2 [InstCombine] Remove various checks for opaque pointers (NFC)
All pointers are opaque now, so these are no longer necessary.
2023-04-06 09:45:51 +02:00
Nikita Popov
238a59c3f1 [InstCombine] Remove varargs cast transform (NFC)
This is no longer relevant with opaque pointers.

Also drop the CastInst::isLosslessCast() method, which was only
used here.
2023-04-05 16:36:21 +02:00