537 Commits

Nikita Popov
d873630fda Revert "[InstCombine] Generalize ptrtoint(gep) fold (NFC)"
This reverts commit c45f939e34dafaf0f57fd1d93df7df5cc89f1dec.

This refactoring turned out to not be useful for the case I had
originally in mind, so revert it for now.
2024-07-12 16:54:24 +02:00
Nikita Popov
c45f939e34 [InstCombine] Generalize ptrtoint(gep) fold (NFC)
We currently handle a special case of the ptrtoint(gep) -> add(ptrtoint)
fold. Reframe the code to make it easier to add more patterns for this
transform.
2024-07-12 16:34:26 +02:00
Nikita Popov
4502ea89b9 [InstCombine] More precise nuw preservation in ptrtoint of gep fold
We can transfer a nuw flag from the gep to the add. Additionally,
the inbounds + nneg case can be relaxed to nusw + nneg. Finally,
don't forget to pass the correct context instruction to
SimplifyQuery.
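
A minimal sketch of the flag transfer (illustrative; not taken from the
commit's tests):
``` llvm
; The nuw on the gep guarantees the pointer addition does not wrap
; unsigned, so the equivalent integer add may carry nuw as well.
define i64 @src(ptr %p, i64 %off) {
  %gep = getelementptr nuw i8, ptr %p, i64 %off
  %r = ptrtoint ptr %gep to i64
  ret i64 %r
}

define i64 @tgt(ptr %p, i64 %off) {
  %base = ptrtoint ptr %p to i64
  %r = add nuw i64 %base, %off
  ret i64 %r
}
```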
2024-07-11 16:55:16 +02:00
Nikita Popov
440af98a04 [InstCombine] Avoid use of ConstantExpr::getShl()
Use IRBuilder instead. Also use ImmConstant to guarantee that this
will fold.
2024-06-18 16:27:17 +02:00
Nikita Popov
534f8569a3 [InstCombine] Don't preserve context across div
We can't preserve the context across a non-speculatable instruction,
as this might introduce a trap. Alternatively, we could insert all
the replacement instructions at the use site, but that would be a
more intrusive change for the sake of this edge case.

Fixes https://github.com/llvm/llvm-project/issues/95547.
2024-06-17 15:38:59 +02:00
Monad
6bf1601a0d
[InstCombine] Fold pointer addition in integer form to arithmetic add (#91596)
Fold
``` llvm
define i32 @src(i32 %x, i32 %y) {
  %base = inttoptr i32 %x to ptr
  %ptr = getelementptr inbounds i8, ptr %base, i32 %y
  %r = ptrtoint ptr %ptr to i32
  ret i32 %r
}
```
where both `%base` and `%ptr` have only one use, to
``` llvm
define i32 @tgt(i32 %x, i32 %y) {
  %r = add i32 %x, %y
  ret i32 %r
}
```

The `add` can be `nuw` if the GEP is `inbounds` and the offset is
non-negative. The relevant Alive2 proof is
https://alive2.llvm.org/ce/z/nP3RWy.

### Motivation

It seems unnecessary to convert an `int` to a `ptr` just to compute an
offset. In most cases both forms generate the same assembly, but sometimes
optimizations are missed because the analysis of `GEP` is not as mature as
that of arithmetic operations. One example is


e3c822bf41/bench/protobuf/optimized/generated_message_reflection.cc.ll (L39860-L39873)

``` llvm
  %conv.i188 = zext i32 %145 to i64
  %add.i189 = add i64 %conv.i188, %125
  %146 = load i16, ptr %num_aux_entries10.i, align 2
  %conv2.i191 = zext i16 %146 to i64
  %mul.i192 = shl nuw nsw i64 %conv2.i191, 3
  %add3.i193 = add i64 %add.i189, %mul.i192
  %147 = inttoptr i64 %add3.i193 to ptr
  %sub.ptr.lhs.cast.i195 = ptrtoint ptr %144 to i64
  %sub.ptr.rhs.cast.i196 = ptrtoint ptr %143 to i64
  %sub.ptr.sub.i197 = sub i64 %sub.ptr.lhs.cast.i195, %sub.ptr.rhs.cast.i196
  %add.ptr = getelementptr inbounds i8, ptr %147, i64 %sub.ptr.sub.i197
  %sub.ptr.lhs.cast = ptrtoint ptr %add.ptr to i64
  %sub.ptr.sub = sub i64 %sub.ptr.lhs.cast, %125
```

where `%125` is first added (producing `%add.i189`) and later subtracted
again (producing `%sub.ptr.sub`), a round trip that can be optimized away.
2024-05-20 12:20:47 +08:00
Monad
34c89eff64
[InstCombine] Fold trunc nuw/nsw (x xor y) to i1 to x != y (#90408)
Fold:
``` llvm
define i1 @src(i8 %x, i8 %y) {
  %xor = xor i8 %x, %y
  %r = trunc nuw/nsw i8 %xor to i1
  ret i1 %r
}

define i1 @tgt(i8 %x, i8 %y) {
  %r = icmp ne i8 %x, %y
  ret i1 %r
}
```

Proof: https://alive2.llvm.org/ce/z/dcuHmn
2024-04-30 18:07:40 +08:00
Nikita Popov
873889b7fa [InstCombine] Extract logic for "emit offset and rewrite gep" (NFC) 2024-04-25 14:18:11 +09:00
Nikita Popov
1baa385065
[IR][PatternMatch] Only accept poison in getSplatValue() (#89159)
In #88217 a large set of matchers was changed to only accept poison
values in splats, but not undef values. This is because we now use
poison for non-demanded vector elements, and allowing undef can cause
correctness issues.

This patch covers the remaining matchers by changing the AllowUndef
parameter of getSplatValue() to AllowPoison instead. We also carry out
corresponding renames in matchers.

As a followup, we may want to change the default for things like m_APInt
to m_APIntAllowPoison (as this is much less risky when only allowing
poison), but this change doesn't do that.

There is one caveat here: We have a single place
(X86FixupVectorConstants) which does require handling of vector splats
with undefs. This is because this works on backend constant pool
entries, which currently still use undef instead of poison for
non-demanded elements (because SDAG as a whole does not have an explicit
poison representation). As it's just the single use, I've open-coded a
getSplatValueAllowUndef() helper there, to discourage use in any other
places.
2024-04-18 15:44:12 +09:00
Noah Goldstein
da04e4afd3 [InstCombine] Use auto * instead of auto in visitSIToFP; NFC 2024-04-17 12:36:25 -05:00
Noah Goldstein
b6bd41db31 [InstCombine] Add canonicalization of sitofp -> uitofp nneg
This is essentially the same as #82404 but has the `nneg` flag which
allows the backend to reliably undo the transform.

Closes #88299
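
A sketch of the canonicalization, assuming the operand is provably
non-negative (the mask below is just one way to establish that):
``` llvm
define float @src(i32 %x) {
  %nonneg = and i32 %x, 127        ; sign bit known zero
  %r = sitofp i32 %nonneg to float
  ret float %r
}

define float @tgt(i32 %x) {
  %nonneg = and i32 %x, 127
  %r = uitofp nneg i32 %nonneg to float
  ret float %r
}
```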
2024-04-16 15:26:25 -05:00
Yingwei Zheng
b109477615
[InstCombine] Infer nsw/nuw for trunc (#87910)
This patch adds support for inferring trunc's nsw/nuw flags.
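
A minimal sketch of the inference, assuming known bits make both flags
provable:
``` llvm
; %masked is in [0, 127], so the trunc loses no unsigned bits (nuw) and
; preserves the sign (nsw).
define i8 @src(i32 %x) {
  %masked = and i32 %x, 127
  %r = trunc i32 %masked to i8
  ret i8 %r
}

define i8 @tgt(i32 %x) {
  %masked = and i32 %x, 127
  %r = trunc nuw nsw i32 %masked to i8
  ret i8 %r
}
```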
2024-04-11 19:10:53 +08:00
Monad
56b3222b79
[InstCombine] Remove the canonicalization of trunc to i1 (#84628)
Remove the canonicalization of `trunc` to `i1` according to the
suggestion of
https://github.com/llvm/llvm-project/pull/83829#issuecomment-1986801166

a84e66a92d/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp (L737-L745)

Alive2: https://alive2.llvm.org/ce/z/cacYVA
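
For reference, a sketch of the canonicalization being removed; the plain
`trunc` form is now kept instead of being rewritten into a masked compare:
``` llvm
define i1 @kept(i8 %x) {
  %r = trunc i8 %x to i1
  ret i1 %r
}

; previously canonicalized to:
define i1 @old_form(i8 %x) {
  %masked = and i8 %x, 1
  %r = icmp ne i8 %masked, 0
  ret i1 %r
}
```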
2024-03-29 21:47:35 +08:00
Noah Goldstein
6960ace534 Revert "[InstCombine] Canonicalize (sitofp x) -> (uitofp x) if x >= 0"
This reverts commit d80d5b923c6f611590a12543bdb33e0c16044d44.

It wasn't a particularly important transform to begin with and caused
some codegen regressions on targets that prefer `sitofp`, so drop it.

Might revisit along with adding the `nneg` flag to `uitofp` so it's
easily reversible for the backend.
2024-03-20 00:50:45 -05:00
Yingwei Zheng
a747e86caa
[InstCombine] Fold fpto{s|u}i non-norm to zero (#85569)
This patch enables more optimization after `fmul X, 0.0` is canonicalized
into a copysign.
I decided to implement this fold in InstCombine because
`computeKnownFPClass` may be expensive.

Alive2: https://alive2.llvm.org/ce/z/ASM8tQ
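
One concrete instance (sketch), assuming `fmul X, 0.0` has already been
canonicalized to a copysign:
``` llvm
declare float @llvm.copysign.f32(float, float)

; copysign(0.0, %x) is +/-0.0, so the conversion always yields 0.
define i32 @src(float %x) {
  %z = call float @llvm.copysign.f32(float 0.0, float %x)
  %r = fptosi float %z to i32
  ret i32 %r
}

define i32 @tgt(float %x) {
  ret i32 0
}
```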
2024-03-19 17:16:48 +08:00
Noah Goldstein
d80d5b923c [InstCombine] Canonicalize (sitofp x) -> (uitofp x) if x >= 0
Just a standard canonicalization.

Proofs: https://alive2.llvm.org/ce/z/9W4VFm

Closes #82404
2024-03-13 18:26:21 -05:00
Yingwei Zheng
c18e1215c4
[InstCombine] Simplify zext nneg i1 X to zero (#85043)
Alive2: https://alive2.llvm.org/ce/z/Wm6kCk
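
The fold in a nutshell (sketch):
``` llvm
; As a signed value, i1 1 is -1, so a "non-negative" i1 must be 0 (and
; the zext is poison otherwise), allowing the fold to a constant zero.
define i8 @src(i1 %x) {
  %r = zext nneg i1 %x to i8
  ret i8 %r
}

define i8 @tgt(i1 %x) {
  ret i8 0
}
```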
2024-03-13 20:15:29 +08:00
Quentin Dian
e96c0c1d5e
[InstCombine] Fix shift calculation in InstCombineCasts (#84027)
Fixes #84025.
2024-03-06 06:16:28 +08:00
Benjamin Kramer
d3f6dd6585
[InstCombine] Pick bfloat over half when shrinking ops that started with an fpext from bfloat (#82493)
This fixes the case where we would shrink an frem to half and then
bitcast to bfloat, producing invalid results. The transformation was
written under the assumption that there is only one type with a given
bit width.

Also add a strategic assert to CastInst::CreateFPCast to turn this
miscompilation into a crash.
2024-02-22 15:25:17 +01:00
Yingwei Zheng
f37d81f8a3
[PatternMatch] Add a matching helper m_ElementWiseBitCast. NFC. (#80764)
This patch introduces a matching helper `m_ElementWiseBitCast`, which is
used for matching element-wise int <-> fp casts.
The motivation of this patch is to avoid duplicating checks in
https://github.com/llvm/llvm-project/pull/80740 and
https://github.com/llvm/llvm-project/pull/80414.
2024-02-07 21:02:13 +08:00
Alexey Bataev
5a667bee9c [InstCombine] Try to fold trunc(shuffle(zext)) to just a shuffle (#78636)
Tries to remove extra trunc/ext instructions around shufflevector
instructions.

Pull request: https://github.com/llvm/llvm-project/pull/78636
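
A sketch of the intended fold (illustrative types and shuffle mask):
``` llvm
define <4 x i8> @src(<4 x i8> %v) {
  %ext = zext <4 x i8> %v to <4 x i32>
  %shuf = shufflevector <4 x i32> %ext, <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
  %r = trunc <4 x i32> %shuf to <4 x i8>
  ret <4 x i8> %r
}

define <4 x i8> @tgt(<4 x i8> %v) {
  %r = shufflevector <4 x i8> %v, <4 x i8> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
  ret <4 x i8> %r
}
```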
2024-01-22 05:50:20 -08:00
Pranav Kant
4482fd846a Revert "[InstCombine] Try to fold trunc(shuffle(zext)) to just a shuffle (#78636)"
This reverts commit 4d11f04b20f0bd7488e19e8f178ba028412fa519.

This breaks some programs as mentioned in #78636
2024-01-19 21:02:20 +00:00
Alexey Bataev
4d11f04b20
[InstCombine] Try to fold trunc(shuffle(zext)) to just a shuffle (#78636)
Tries to remove extra trunc/ext instructions around shufflevector
instructions.
2024-01-19 09:29:01 -05:00
Nikita Popov
4b3ea337ad [ValueTracking] Convert isKnownNonNegative() to use SimplifyQuery (NFC) 2023-11-29 10:52:52 +01:00
Nikita Popov
ac75171d41 [InstCombine] Fix incorrect nneg inference on shift amount
Whether this is valid depends on the bit widths of the involved
integers.

Fixes https://github.com/llvm/llvm-project/issues/72927.
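
A sketch of the failure mode (illustrative widths):
``` llvm
; For an i256 shift, amounts up to 255 are valid, and i8 255 is -1 when
; interpreted as signed, so marking this zext nneg would be incorrect.
define i256 @example(i256 %v, i8 %amt) {
  %wide = zext i8 %amt to i256
  %r = shl i256 %v, %wide
  ret i256 %r
}
```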
2023-11-21 15:47:55 +01:00
Yingwei Zheng
44cdbef715
[InstCombine] Infer nneg flag from shift users (#71947)
This patch sets the `nneg` flag when the zext is only used by a shift.

Alive2: https://alive2.llvm.org/ce/z/h3xKjP
Compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=bd611264993f64decbce178d460caf1d1cb05f59&to=26bc473b239010bb24ff1bc39d58b42ecbbc4730&stat=instructions:u

This is an alternative to #71906.
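
A sketch of the inference (illustrative widths):
``` llvm
; Any non-poison i32 shift amount is < 32, so %amt's sign bit must be
; clear and the zext can safely carry nneg.
define i32 @example(i32 %v, i8 %amt) {
  %wide = zext nneg i8 %amt to i32
  %r = shl i32 %v, %wide
  ret i32 %r
}
```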
2023-11-13 21:05:05 +08:00
Nikita Popov
8391f405cb [InstCombine] Avoid uses of ConstantExpr::getLShr()
Use the constant folding API instead.
2023-11-10 15:50:42 +01:00
Nikita Popov
5918f62301
[InstCombine] Infer zext nneg flag (#71534)
Use KnownBits to infer the nneg flag on zext instructions.

Currently we only set nneg when converting sext -> zext, but don't set
it when we have a zext in the first place. If we want to use it in
optimizations, we should make sure the flag inference is consistent.
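
A minimal sketch (the `lshr` is just one way to make the sign bit
provably zero):
``` llvm
define i64 @src(i32 %x) {
  %half = lshr i32 %x, 1
  %r = zext i32 %half to i64
  ret i64 %r
}

define i64 @tgt(i32 %x) {
  %half = lshr i32 %x, 1
  %r = zext nneg i32 %half to i64
  ret i64 %r
}
```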
2023-11-08 09:34:40 +01:00
Antonio Frighetto
7d39838948 [InstCombine] Favour CreateZExtOrTrunc in narrowFunnelShift (NFC)
Use `CreateZExtOrTrunc`, reduce test and regenerate checks.
2023-11-07 22:48:14 +01:00
Antonio Frighetto
caa124b58d [InstCombine] Zero-extend shift amounts in narrow funnel shift ops
An issue arose when handling shift amounts while performing the
narrowed funnel shift simplification. Specifically, shift
amounts were incorrectly truncated when their type was
narrower than the target bit width. This has been addressed
by zero-extending `ShAmt` in such cases.

Fixes: https://github.com/llvm/llvm-project/issues/71463.

Proof: https://alive2.llvm.org/ce/z/5draKz.
2023-11-07 14:15:32 +01:00
Noah Goldstein
bd29197fd8 [InstCombine] Fold (ptrtoint (ptrmask p0, m0)) -> (and (ptrtoint p0), m0)
`and` is generally better supported, so if we have a `ptrmask` anyway,
we might as well use `and`.

Differential Revision: https://reviews.llvm.org/D156640

Closes #67166
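
A sketch of the fold, assuming 64-bit pointers in address space 0:
``` llvm
declare ptr @llvm.ptrmask.p0.i64(ptr, i64)

define i64 @src(ptr %p, i64 %m) {
  %masked = call ptr @llvm.ptrmask.p0.i64(ptr %p, i64 %m)
  %r = ptrtoint ptr %masked to i64
  ret i64 %r
}

define i64 @tgt(ptr %p, i64 %m) {
  %pi = ptrtoint ptr %p to i64
  %r = and i64 %pi, %m
  ret i64 %r
}
```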
2023-11-01 23:50:36 -05:00
Nikita Popov
c7f0e49915 [InstCombine] Fix canAlwaysEvaluateInTy() with constant exprs
The m_ZExtOrSExt / m_Trunc in the following code can match constant
expressions, which we don't want here. Make sure we bail out early
for non-immediate constants.
2023-11-01 14:52:50 +01:00
Nikita Popov
aca9c891a2 [InstCombine] Avoid use of ConstantExpr::getIntegerCast()
Require that constants are ImmConstant for this transform, as we
may otherwise generate constant expressions, which are not
necessarily free.
2023-11-01 12:56:30 +01:00
Nikita Popov
0b5e0fb62d [InstCombine] Avoid some uses of ConstantExpr::getIntegerCast() (NFC)
Use IRBuilder or ConstantFolding instead.
2023-11-01 11:41:50 +01:00
Philip Reames
3f2ed812f0
[InstCombine] Infer nneg on zext when forming from non-negative sext (#70706)
Builds on #67982, which recently introduced the nneg flag on zext
instructions. InstCombine is one of our largest canonicalizers of zext
from non-negative sext instructions, so set the flag there.
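
A minimal sketch (the mask is just an illustrative way to prove
non-negativity):
``` llvm
define i64 @src(i32 %x) {
  %pos = and i32 %x, 65535
  %r = sext i32 %pos to i64
  ret i64 %r
}

define i64 @tgt(i32 %x) {
  %pos = and i32 %x, 65535
  %r = zext nneg i32 %pos to i64
  ret i64 %r
}
```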
2023-10-30 12:09:43 -07:00
Nikita Popov
e4dc7d492c [InstCombine] Remove redundant cast of GEP fold (NFC)
With opaque pointers, zero-index GEPs will be eliminated in
general.
2023-10-27 11:47:38 +02:00
Allen
ea86fb8caf
[InstCombine] Fold zext-of-icmp with no shift (#68503)
This regression appeared after commit f400daa, which fixed an infinite
loop issue.

In this case, we know the shift count is 0, so the fold does not hit
the (iN (~X) u>> (N - 1)) form guarded against in commit 21d3871, where
N is the bit width of X.

Fixes https://github.com/llvm/llvm-project/issues/68465.
2023-10-09 23:46:09 +08:00
Nikita Popov
9ace23c9a2 [InstCombine] Avoid use of ConstantExpr::getSExt() (NFC)
Use the constant folding API instead.
2023-10-02 11:30:15 +02:00
Nikita Popov
1b8fb1a664 [InstCombine] Avoid some uses of ConstantExpr::getZExt() (NFC)
Let the IRBuilder constant fold instead.
2023-09-28 15:31:42 +02:00
Jeremy Morse
d529943a27 [NFC][RemoveDIs] Prefer iterators over inst-pointers in InstCombine
As per my proposal for how to eliminate debug intrinsics [0], prefer to
insert using an instruction iterator rather than an instruction pointer
in various places in InstCombine. This is so that we can eventually pass
more information in the iterator class. The call sites where I've changed
the spelling are those necessary to build a stage-2 clang that produces
an identical binary in the coming no-debug-intrinsics mode.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152543
2023-09-11 15:04:51 +01:00
Nikita Popov
8249d6724c [InstCombine] Avoid uses of ConstantExpr::getOr()
Replace these with IRBuilder uses, as we don't (from a type
perspective) care about Constant results.

Switch the predicate to m_ImmConstant() instead of isa<Constant>
to guarantee that these do get folded away and our assumptions
about simplifications hold true.
2023-07-24 16:50:45 +02:00
Nikita Popov
503ef0a8e7 [InstCombine] Remove addrspacecast bitcast extraction fold (NFC)
This is not relevant for opaque pointers, and as such no longer
necessary.
2023-04-06 09:53:32 +02:00
Nikita Popov
032e5d403e [InstCombine] Remove convertBitCastToGEP() fold (NFC)
This only applies to typed pointers, so the fold is no longer
necessary.
2023-04-05 16:20:14 +02:00
Jie Fu
d1dd995196 [InstCombine] Remove unneeded internal function 'decomposeSimpleLinearExpr' in InstCombineCasts.cpp (NFC)
/data/llvm-project/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp:32:15: error: function 'decomposeSimpleLinearExpr' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
static Value *decomposeSimpleLinearExpr(Value *Val, unsigned &Scale,
              ^
1 error generated.
2023-04-05 22:18:39 +08:00
Nikita Popov
3cbdcd6ebf [InstCombine] Remove PromoteCastOfAllocation() fold (NFC)
This fold does not apply to opaque pointers, and as such is no
longer needed.
2023-04-05 15:55:43 +02:00
Nikita Popov
aff1863859 [IR] Remove ConstantExpr::getUMin() (NFC)
This is part of select constant expression removal. As there is
only a single place where this is used, just expand it to explicit
constant folding calls.

(Normally we'd just use the IRBuilder here, but that isn't possible
due to the use of mergeUndefsWith.)
2023-03-06 13:16:27 +01:00
Nikita Popov
ee2f9d6dfb Reapply [InstCombine] Remove early constant fold
The reported compile-time regression has been addressed in
47f9109dff80a1abbe2705ee71dc0882b1d62274.

Additionally, this contains a change to immediately fold a zext
with a constant operand, even if it is used in a trunc. I'm not sure
whether this is relevant for anything, but I noticed it as a behavioral
discrepancy while investigating this issue.

-----

InstCombine currently performs a constant folding attempt as part
of the main InstCombine loop, before visiting the instruction.
However, each visit method will also attempt to simplify the
instruction, which will in turn constant fold it. (Additionally,
we also constant fold instructions before the main InstCombine loop
and use a constant folding IR builder, so this is doubly redundant.)

There is one place where InstCombine visit methods currently don't
call into simplification, and that's casts. To be conservative,
I've added an explicit constant folding call there (though it has
no impact on tests).

This makes for a mild compile-time improvement and in particular
mitigates the compile-time regression from enabling load
simplification in be88b5814d9efce131dbc0c8e288907e2e6c89be.

Differential Revision: https://reviews.llvm.org/D144369
2023-02-27 12:23:06 +01:00
Vitaly Buka
779679284e Revert "[InstCombine] Remove early constant fold"
This increases compile time with ubsan on ARM from 3 to 14 minutes for
a single file. I have uploaded a reproducer to D144369.

We also see random timeouts on internal x86_64 builds. Both issues
bisected to this commit.

This reverts commit 45a0b812fa13ec255cae91f974540a4d805a8d79.
2023-02-24 10:21:32 -08:00
Nikita Popov
8347ca7dc8 [PatternMatch] Don't require DataLayout for m_VScale()
The m_VScale() matcher is unusual in that it requires a DataLayout.
It is currently used to determine the size of the GEP type. However,
I believe it is sufficient to check for the canonical
<vscale x 1 x i8> form here -- I don't think there's a need to
recognize exotic variations like <vscale x 1 x i4> as a vscale
constant representation as well.

Differential Revision: https://reviews.llvm.org/D144566
2023-02-23 15:30:29 +01:00
Nikita Popov
45a0b812fa [InstCombine] Remove early constant fold
InstCombine currently performs a constant folding attempt as part
of the main InstCombine loop, before visiting the instruction.
However, each visit method will also attempt to simplify the
instruction, which will in turn constant fold it. (Additionally,
we also constant fold instructions before the main InstCombine loop
and use a constant folding IR builder, so this is doubly redundant.)

There is one place where InstCombine visit methods currently don't
call into simplification, and that's casts. To be conservative,
I've added an explicit constant folding call there (though it has
no impact on tests).

This makes for a mild compile-time improvement and in particular
mitigates the compile-time regression from enabling load
simplification in be88b5814d9efce131dbc0c8e288907e2e6c89be.

Differential Revision: https://reviews.llvm.org/D144369
2023-02-20 16:48:39 +01:00