1070 Commits

Author SHA1 Message Date
goldsteinn
c85611e858
[SimplifyLibCall][Attribute] Fix bug where we may keep range attr with incompatible type (#112649)
In a variety of places we change the bitwidth of a parameter but don't
update the attributes.

The issue in this case is from the `range` attribute when inlining
`__memset_chk`. `optimizeMemSetChk` will replace an `i32` with an
`i8`, and if the `i32` had a `range` attr associated it will cause an
error.

Fixes #112633
2024-10-17 10:32:55 -05:00
Jay Foad
85c17e4092
[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706)
Convert many instances of:
  Fn = Intrinsic::getOrInsertDeclaration(...);
  CreateCall(Fn, ...)
to the equivalent CreateIntrinsic call.
2024-10-17 16:20:43 +01:00
Ramkumar Ramachandra
c5f82f7893
ValueTracking: introduce llvm::isNotCrossLaneOperation (#112011)
Factor out and unify common code from InstSimplify and InstCombine that
partially guard against cross-lane vector operations into
llvm::isNotCrossLaneOperation in ValueTracking.

Alive2 proofs for changed tests: https://alive2.llvm.org/ce/z/68H4ka
2024-10-14 11:37:30 +01:00
Rahul Joshi
fa789dffb1
[NFC] Rename Intrinsic::getDeclaration to getOrInsertDeclaration (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e., just lookup, no creation).
2024-10-11 05:26:03 -07:00
Jay Foad
e03f427196
[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
2024-09-19 16:16:38 +01:00
David Green
c0e308ba3d
[InstCombine] Pass DomTree and DomTreeCache to LibCallSimplifier (#108446)
This allows any combines to pick up Known states from dominating
conditions.
2024-09-13 08:36:48 +01:00
Volodymyr Vasylkun
d163935585
[InstCombine] Fold scmp(x -nsw y, 0) to scmp(x, y) (#105583)
Proof: https://alive2.llvm.org/ce/z/v6VtXz
2024-08-22 14:18:48 +01:00
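The `scmp` fold in #105583 can be sanity-checked outside LLVM. The following is a minimal sketch, not the InstCombine implementation: it models `i8` values with plain `int`, and the helper names `scmp` and `scmpFoldHolds` are hypothetical. The `nsw` flag is modelled by skipping every pair whose difference does not fit in `i8`.

```cpp
#include <cassert>

// Three-way signed comparison, modelling llvm.scmp: returns -1, 0, or 1.
int scmp(int a, int b) { return (a > b) - (a < b); }

// Checks scmp(x - y, 0) == scmp(x, y) for every i8 pair whose
// difference fits in i8, which is what `sub nsw` guarantees.
bool scmpFoldHolds() {
  for (int x = -128; x <= 127; ++x) {
    for (int y = -128; y <= 127; ++y) {
      int diff = x - y; // computed in int, so it never wraps here
      if (diff < -128 || diff > 127)
        continue; // this pair would violate nsw on i8
      if (scmp(diff, 0) != scmp(x, y))
        return false;
    }
  }
  return true;
}
```

The restriction matters: with wrapping `i8` arithmetic, `x = -128` and `y = 1` give a difference of 127, so the comparison against zero would flip sign.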
Snehasish Kumar
95daf1aedf
Allow optimization of __size_returning_new variants. (#102258)
https://github.com/llvm/llvm-project/pull/101564 added support to TLI to
detect variants of operator new which provide feedback on the actual
size of memory allocated (http://wg21.link/P0901R5). This patch extends
SimplifyLibCalls to handle hot cold hinting of these variants.
2024-08-15 08:06:41 -07:00
Yingwei Zheng
f364b2ee22
[LLVM] Don't peek through bitcast on pointers and gep with zero indices. NFC. (#102889)
Since we are using opaque pointers now, we don't need to peek through
bitcast on pointers and gep with zero indices.
2024-08-13 22:38:50 +08:00
Nikita Popov
4d97ad59f9
[InstCombine] Use APInt::getSplat()
Split off from https://github.com/llvm/llvm-project/pull/80309.
2024-08-13 15:04:23 +02:00
Nikita Popov
cc14ecc281
[InstCombine] Don't change fn signature for calls to declarations (#102596)
transformConstExprCastCall() implements a number of highly dubious
transforms attempting to make a call function type line up with the
function type of the called function. Historically, the main value this
had was to avoid function type mismatches due to pointer type
differences, which is no longer relevant with opaque pointers.

This patch is a step towards reducing the scope of the transform, by
applying it only to definitions, not declarations. For declarations, the
declared signature might not match the actual function signature, e.g.
`void @fn()` is sometimes used as a placeholder for functions with
unknown signature. The implementation already bailed out in some cases
for declarations, but I think it would be safer to disable the transform
entirely.

For the test cases, I've updated some of them to use definitions
instead, so that the test coverage is preserved.
2024-08-12 10:12:00 +02:00
YongKang Zhu
6c367168d6
[InstCombine] Remove transformation on call instruction where return value need void to non-void conversion (#98536)
Skip simplification on call instruction where a non-void return value is
expected but the callee returns void, which is undefined behavior and
could lead to non-determinism or crashes.
2024-08-02 09:07:48 -07:00
Yingwei Zheng
4e89d1199c
[InstCombine] Convert mem intrinsic with null into a noop (#100388)
When src/dest passed into memset/memcpy is null: 
```
len == 0: this call is a noop.
len != 0: the behavior is undefined.
```
See also https://llvm.org/docs/LangRef.html#llvm-memset-intrinsics
Alive2: https://alive2.llvm.org/ce/z/tJeRNL

This patch converts these mem intrinsic calls into an assumption `len ==
0` to mitigate code-size bloat caused by JumpThreading.
2024-08-01 22:46:07 +08:00
Andreas Jonson
f7491f53cb
[InstCombine] Reduce range of ctpop for non zero argument (#100899)
2024-07-29 14:05:50 +02:00
Yingwei Zheng
62e9f40949
[PatternMatch] Use m_SpecificCmp matchers. NFC. (#100878)
Compile-time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=13996378d81c8fa9a364aeaafd7382abbc1db83a&to=861ffa4ec5f7bde5a194a7715593a1b5359eb581&stat=instructions:u
baseline: 803eaf29267c6aae9162d1a83a4a2ae508b440d3
```
Top 5 improvements:
  stockfish/movegen.ll 2541620819 2538599412 -0.12%
  minetest/profiler.cpp.ll 431724935 431246500 -0.11%
  abc/luckySwap.c.ll 581173720 580581935 -0.10%
  abc/kitTruth.c.ll 2521936288 2519445570 -0.10%
  abc/extraUtilTruth.c.ll 1216674614 1215495502 -0.10%
Top 5 regressions:
  openssl/libcrypto-shlib-sm4.ll 1155054721 1155943201 +0.08%
  openssl/libcrypto-lib-sm4.ll 1155054838 1155943063 +0.08%
  spike/vsm4r_vv.ll 1296430080 1297039258 +0.05%
  spike/vsm4r_vs.ll 1312496906 1313093460 +0.05%
  nuttx/lib_rand48.c.ll 126201233 126246692 +0.04%
Overall: -0.02112308%
```
2024-07-29 10:04:06 +08:00
Kazu Hirata
0fb9f898e2
[InstCombine] Initialize a SmallVector with a range (NFC) (#100947)
2024-07-28 14:31:51 -07:00
Kazu Hirata
a17f8fe7d4
[InstCombine] Use more inline elements in a SmallVector (#100942)
The 4 inline elements only cover 58% of cases encountered here during
the compilation of X86ISelLowering.cpp.ll, a .ll version of
X86ISelLowering.cpp.

The 8 inline elements cover 96% and save 0.27% of heap allocations.
2024-07-28 13:19:44 -07:00
Antonio Frighetto
6ce7b1f861
[TBAA] Do not rewrite TBAA if exists, always null out !tbaa.struct
Retrieve `!tbaa` metadata via `!tbaa.struct` in `adjustForAccess`
unless it already exists, as struct-path aware `MDNodes` emitted
via `new-struct-path-tbaa` may be leveraged. As `!tbaa.struct`
carries memcpy padding semantics among struct fields and `!tbaa`
is already meant to aid alias semantics, it should be possible
to zero out `!tbaa.struct` once the memcpy has been simplified.
`SROA/tbaa-struct.ll` test has gone out of scope, as `!tbaa` has
already replaced `!tbaa.struct` in SROA.

Fixes: https://github.com/llvm/llvm-project/issues/95661.
2024-07-25 09:24:56 +02:00
Yingwei Zheng
16f22c0fe6
Fold fma x, -1.0, y into fsub x, y (#100106)
Alive2 proof (Please run alive-tv locally with larger `smt-to`):
https://alive2.llvm.org/ce/z/YvUVg-
2024-07-23 20:13:23 +08:00
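A small numeric check of the identity behind #100106. This is a sketch under the assumption that `double` and `std::fma` model the IR semantics closely enough; `fmaMatchesSub` is a hypothetical helper. Since `x * -1.0` is exact, `fma(x, -1.0, y)` rounds once, exactly like the subtraction `y - x`.

```cpp
#include <cassert>
#include <cmath>

// Returns true when fma(x, -1.0, y) and y - x produce the same value,
// including the sign of zero; two NaN results are treated as matching.
bool fmaMatchesSub(double x, double y) {
  double fused = std::fma(x, -1.0, y);
  double sub = y - x;
  if (std::isnan(fused) || std::isnan(sub))
    return std::isnan(fused) && std::isnan(sub);
  return fused == sub && std::signbit(fused) == std::signbit(sub);
}
```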
Philip Reames
f6add66b72
[instcombine] Extend logical reduction canonicalization to scalable vectors (#99366)
These transformations do not depend on the type being fixed in size, so
enable them for scalable vectors too. Unlike for fixed vectors, these
are only a canonicalization - the bitcast lowering for and/or/add is not
legal on a scalable vector type.
2024-07-17 14:36:11 -07:00
mskamp
949bbdc923
[InstCombine] Fold Minimum over Trailing/Leading Bits Counts (#90402)
The new transformation folds `umin(cttz(x), c)` to `cttz(x | (1 << c))`
and `umin(ctlz(x), c)` to `ctlz(x | ((1 << (bitwidth - 1)) >> c))`. The
transformation is only implemented for constant `c` to not increase the
number of instructions.
    
The idea of the transformation is to set the c-th lowest (for `cttz`) or
highest (for `ctlz`) bit in the operand. In this way, the `cttz` or
`ctlz` instruction always returns at most `c`.
    
Alive2 proofs: https://alive2.llvm.org/ce/z/y8Hdb8

Fixes #90000
2024-07-13 16:55:11 +02:00
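The two folds from #90402 can be verified exhaustively on a narrow type. The sketch below assumes an 8-bit width and a constant `c` smaller than the width; `cttz8`, `ctlz8`, and `uminFoldsHold` are hypothetical helpers, not LLVM APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr unsigned Width = 8;

// Count trailing zeros of an 8-bit value; returns Width for zero.
unsigned cttz8(uint8_t v) {
  unsigned n = 0;
  while (n < Width && !((v >> n) & 1u))
    ++n;
  return n;
}

// Count leading zeros of an 8-bit value; returns Width for zero.
unsigned ctlz8(uint8_t v) {
  unsigned n = 0;
  while (n < Width && !((v >> (Width - 1 - n)) & 1u))
    ++n;
  return n;
}

// Checks umin(cttz(x), c) == cttz(x | (1 << c)) and
// umin(ctlz(x), c) == ctlz(x | ((1 << (Width - 1)) >> c))
// for every 8-bit x and every c in [0, Width).
bool uminFoldsHold() {
  for (unsigned c = 0; c < Width; ++c) {
    uint8_t lowBit = uint8_t(1u << c);
    uint8_t highBit = uint8_t((1u << (Width - 1)) >> c);
    for (unsigned i = 0; i < 256; ++i) {
      uint8_t x = uint8_t(i);
      if (std::min(cttz8(x), c) != cttz8(uint8_t(x | lowBit)))
        return false;
      if (std::min(ctlz8(x), c) != ctlz8(uint8_t(x | highBit)))
        return false;
    }
  }
  return true;
}
```

Setting the extra bit caps the count at `c` without disturbing any lower (or higher) set bit, which is why no additional instruction is needed when `c` is constant.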
Matt Arsenault
5b77ed4d94
InstCombine: Try to fold ldexp with select of power operand (#97354)
This makes it more likely a constant value can fold into the source
operand.
2024-07-02 15:11:14 +02:00
AtariDreams
2399d87768
[Transforms] Let amdgcn take advantage of sin(-x) --> -sin(x) (#79700)
We do it for amdgcn_cos, and we should do it for amdgcn_sin as well.
2024-06-30 09:09:36 +02:00
Vaibhav
ea68668647
[InstCombine] Add fold for fabs(-x) -> fabs(x) (#95627)
This patch folds `fabs(-x) -> fabs(x)`

Closes #94170

Proofs: https://alive2.llvm.org/ce/z/gjzmgf
2024-06-27 20:10:33 +02:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Ahmed Bougacha
34e5a71b32
[InstCombine] Combine ptrauth constants into ptrauth intrinsics. (#94705)
When we encounter two consecutive ptrauth intrinsics, we can already
combine the inner matching sign + auth pair, e.g.:
  resign(sign(p,ks,ds),ks,ds,kr,dr) -> sign(p,kr,dr)

We can generalize that to ptrauth constants, which are effectively
constant equivalents to ptrauth.sign, i.e.:
  resign(ptrauth(p,ks,ds),ks,ds,kr,dr) -> ptrauth(p,kr,dr)
  auth(ptrauth(p,k,d),k,d) -> p

While there, clean up a redundant return after eraseInstFromFunction in
the shared (intrinsic|constant)->intrinsic folding code.
2024-06-26 18:54:40 -07:00
c8ef
e5bdb7af86
[InstCombine] fold ldexp(x, sext(i1 y)) to fmul x, (select y, 0.5, 1.0) (#95073)
Follow up of #94887.

Context:
https://github.com/llvm/llvm-project/pull/94887#pullrequestreview-2106213891
2024-06-11 11:42:19 +02:00
c8ef
f26bc5f0c4
[InstCombine] fold ldexp(x, zext(i1 y)) to fmul x, (select y, 2.0, 1.0) (#94887)
close: #92538
2024-06-10 16:40:37 +02:00
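Both `ldexp` folds (#94887 and its follow-up #95073) rest on the fact that scaling by 2 or 0.5 is the same as adjusting the exponent by one. A minimal sketch, assuming `double` and `std::ldexp` as stand-ins for the IR operations; all function names here are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// ldexp(x, zext(i1 y)): the exponent is 1 when y is true, otherwise 0.
double ldexpZext(double x, bool y) { return std::ldexp(x, y ? 1 : 0); }
// Folded form: fmul x, (select y, 2.0, 1.0).
double fmulZext(double x, bool y) { return x * (y ? 2.0 : 1.0); }

// ldexp(x, sext(i1 y)): the exponent is -1 when y is true, otherwise 0.
double ldexpSext(double x, bool y) { return std::ldexp(x, y ? -1 : 0); }
// Folded form: fmul x, (select y, 0.5, 1.0).
double fmulSext(double x, bool y) { return x * (y ? 0.5 : 1.0); }

// Compares the original and folded forms for both values of y.
bool ldexpFoldsAgree(double x) {
  for (int i = 0; i < 2; ++i) {
    bool y = (i == 1);
    if (ldexpZext(x, y) != fmulZext(x, y))
      return false;
    if (ldexpSext(x, y) != fmulSext(x, y))
      return false;
  }
  return true;
}
```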
Nikita Popov
3387e55844
[InstCombine] Use SimplifyQuery in isKnownSign()
This enabled the use of DomConditionCache. As such, remove the
explicit isImpliedByDomCondition() call. This is probably not
entirely NFC because these APIs don't support exactly the same
cases.
2024-06-05 15:22:50 +02:00
zhongyunde 00443407
2631531764
[InstCombine] Removing the combine of fmuladd with fast flag
We should treat fmuladd like an fma intrinsic, and any regressions need to be
addressed by dealing with fma/fmuladd in other contexts.
2024-05-21 14:33:39 +08:00
David Sherwood
0ad275c158
[InstCombine] Fold vector.reduce.op(vector.reverse(X)) -> vector.reduce.op(X) (#91743)
For all of the following reductions:

vector.reduce.or
vector.reduce.and
vector.reduce.xor
vector.reduce.add
vector.reduce.mul
vector.reduce.umin
vector.reduce.umax
vector.reduce.smin
vector.reduce.smax
vector.reduce.fmin
vector.reduce.fmax

if the input operand is the result of a vector.reverse then we can
perform a reduction on the vector.reverse input instead since the answer
is the same. If the reassociation is permitted we can also do the same
folds for these:

vector.reduce.fadd
vector.reduce.fmul
2024-05-17 12:58:14 +01:00
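To illustrate why #91743 separates the integer and min/max reductions from `fadd`/`fmul`, here is a sketch that models the reductions with standard algorithms. The helpers (`reversed`, `reduceAdd`, `reduceUMax`, `reduceFAdd`) are hypothetical, and the floating-point reduction is assumed to be a strict left-to-right sum.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Models vector.reverse on a copy of the input.
template <typename T> std::vector<T> reversed(std::vector<T> v) {
  std::reverse(v.begin(), v.end());
  return v;
}

// Models vector.reduce.add with wrapping 32-bit arithmetic.
uint32_t reduceAdd(const std::vector<uint32_t> &v) {
  return std::accumulate(v.begin(), v.end(), uint32_t{0});
}

// Models vector.reduce.umax; assumes a non-empty vector.
uint32_t reduceUMax(const std::vector<uint32_t> &v) {
  return *std::max_element(v.begin(), v.end());
}

// Models an ordered vector.reduce.fadd (no reassociation allowed).
double reduceFAdd(const std::vector<double> &v) {
  return std::accumulate(v.begin(), v.end(), 0.0);
}
```

For `{7, 0xFFFFFFF0, 42, 3}` the integer reductions are unchanged by the reversal. For `{1.0, 1e100, -1e100}` the ordered floating-point sum is 0 in the original order and 1 in the reversed order, which is why the `fadd`/`fmul` folds need reassociation to be permitted.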
Eli Friedman
f893dccbba
Replace uses of ConstantExpr::getCompare. (#91558)
Use ICmpInst::compare() where possible, ConstantFoldCompareInstOperands
in other places. This only changes places where either the fold is
guaranteed to succeed, or the code doesn't use the resulting compare if
we fail to fold.
2024-05-09 16:50:01 -07:00
Monad
0ab4458df0
[InstCombine] Fold cttz(lshr(-1, x) + 1) to width - x (#91244)
Fold
``` llvm
define i64 @src(i64 %50) {
  %52 = lshr i64 -1, %50
  %53 = add i64 %52, 1
  %54 = call i64 @llvm.cttz.i64(i64 %53, i1 false)
  ret i64 %54
}
```
to
``` llvm
define i64 @tgt(i64 %50) {
  %52 = sub i64 64, %50
  ret i64 %52
}
```

as
https://github.com/llvm/llvm-project/pull/91171#pullrequestreview-2040663002
pointed out.

Alive2 proof: https://alive2.llvm.org/ce/z/2aHfYa

Note: the `ctlz` version of this pattern does not seem to exist in dtcxzyw's
benchmark, so it is set aside for now.
2024-05-07 11:06:52 +09:00
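The fold in #91244 can be checked on a narrower type. This sketch assumes an 8-bit width instead of `i64`, restricts `x` to valid shift amounts (a shift by the full width or more is poison in LLVM IR), and uses hypothetical helpers `cttz8` and `cttzLshrFoldHolds`.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned Width = 8;

// Count trailing zeros of an 8-bit value; returns Width for zero,
// matching llvm.cttz with the is-zero-poison flag set to false.
unsigned cttz8(uint8_t v) {
  unsigned n = 0;
  while (n < Width && !((v >> n) & 1u))
    ++n;
  return n;
}

// Checks cttz(lshr(-1, x) + 1) == Width - x for every valid shift amount.
bool cttzLshrFoldHolds() {
  for (unsigned x = 0; x < Width; ++x) {
    uint8_t shifted = uint8_t(0xFFu >> x);   // lshr i8 -1, x
    uint8_t plusOne = uint8_t(shifted + 1u); // wraps to 0 when x == 0
    if (cttz8(plusOne) != Width - x)
      return false;
  }
  return true;
}
```

The `x == 0` case relies on the wrapping add producing zero and on `cttz` of zero being defined as the bit width.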
Maciej Gabka
bfc0317153
Move several vector intrinsics out of experimental namespace (#88748)
This patch moves the following intrinsics:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice

from the experimental namespace.

All these intrinsics have existed in LLVM for more than a year now and are
widely used, so they should not be considered experimental.
2024-04-29 10:16:45 +01:00
Andreas Jonson
b8f3024a31
[InstCombine] Swap out range metadata to range attribute for cttz/ctlz/ctpop (#88776)
Since all optimizations that use range metadata now also handle range attribute, this patch replaces writes of
range metadata for call instructions to range attributes.
2024-04-25 01:45:50 +08:00
Yingwei Zheng
d18ab0e1bd
[InstCombine] Fold fabs over selects (#86390)
This patch folds fabs over select if it is beneficial. I also tried
other integer/fp intrinsics. Only handling fabs shows benefit to some
real-world applications.
2024-04-22 15:37:42 +08:00
Nikita Popov
46957a138d
[InstCombine] Fix incorrect fshr to fshl transform
This transform is only valid if the (modular) shift amount is not
zero.

Proof: https://alive2.llvm.org/ce/z/WBxn-x

Fixes https://github.com/llvm/llvm-project/issues/89338.
2024-04-19 14:02:22 +09:00
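The zero-shift restriction can be seen with a small model of the funnel shifts. This is a sketch only: it assumes the rewrite has the shape `fshr(a, b, s) -> fshl(a, b, width - s)`, which the commit message does not spell out, and it uses 8-bit values with hypothetical helpers `fshl8`, `fshr8`, and `rewriteValidFor`.

```cpp
#include <cassert>
#include <cstdint>

// 8-bit funnel shifts following the LangRef description: concatenate a:b
// (a in the high half), shift by s modulo 8, and extract one half.
uint8_t fshl8(uint8_t a, uint8_t b, unsigned s) {
  unsigned wide = (unsigned(a) << 8) | b;
  return uint8_t((wide << (s % 8)) >> 8); // high 8 bits of the 16-bit value
}

uint8_t fshr8(uint8_t a, uint8_t b, unsigned s) {
  unsigned wide = (unsigned(a) << 8) | b;
  return uint8_t(wide >> (s % 8)); // low 8 bits of the 16-bit value
}

// True when fshr(a, b, s) == fshl(a, b, 8 - s) for all 8-bit a and b.
// Intended for s in [0, 8].
bool rewriteValidFor(unsigned s) {
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b)
      if (fshr8(uint8_t(a), uint8_t(b), s) !=
          fshl8(uint8_t(a), uint8_t(b), 8 - s))
        return false;
  return true;
}
```

With a shift amount of zero modulo the width, `fshr` returns its second operand while the rewritten `fshl` returns its first, so the two only agree when the operands happen to be equal.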
Nikita Popov
1baa385065
[IR][PatternMatch] Only accept poison in getSplatValue() (#89159)
In #88217 a large set of matchers was changed to only accept poison
values in splats, but not undef values. This is because we now use
poison for non-demanded vector elements, and allowing undef can cause
correctness issues.

This patch covers the remaining matchers by changing the AllowUndef
parameter of getSplatValue() to AllowPoison instead. We also carry out
corresponding renames in matchers.

As a followup, we may want to change the default for things like m_APInt
to m_APIntAllowPoison (as this is much less risky when only allowing
poison), but this change doesn't do that.

There is one caveat here: We have a single place
(X86FixupVectorConstants) which does require handling of vector splats
with undefs. This is because this works on backend constant pool
entries, which currently still use undef instead of poison for
non-demanded elements (because SDAG as a whole does not have an explicit
poison representation). As it's just the single use, I've open-coded a
getSplatValueAllowUndef() helper there, to discourage use in any other
places.
2024-04-18 15:44:12 +09:00
Harald van Dijk
60de56c743
[ValueTracking] Restore isKnownNonZero parameter order. (#88873)
Prior to #85863, the required parameters of llvm::isKnownNonZero were
Value and DataLayout. After, they are Value, Depth, and SimplifyQuery,
where SimplifyQuery is implicitly constructible from DataLayout. The
change to move Depth before SimplifyQuery needed callers to be updated
unnecessarily, and as commented in #85863, we actually want Depth to be
after SimplifyQuery anyway so that it can be defaulted and the caller
does not need to specify it.
2024-04-16 15:21:09 +01:00
Yingwei Zheng
80fce05f21
[InstCombine] Fold minmax (X & NegPow2C, Y & NegPow2C) -> minmax(X, Y) & NegPow2C (#88859)
Alive2: https://alive2.llvm.org/ce/z/NFtkSX

This optimization will be beneficial to jemalloc users.
2024-04-16 17:16:35 +08:00
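An exhaustive 8-bit check of the fold in #88859. The sketch assumes that `NegPow2C` means a negated power of two, i.e. a mask of the form `0xFF << k`; `minMaxMaskFoldHolds` is a hypothetical helper. It covers the unsigned and signed min and max variants.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Checks minmax(X & C, Y & C) == minmax(X, Y) & C for all 8-bit X and Y
// and every negated power-of-two mask C (0xFF, 0xFE, 0xFC, ..., 0x80).
bool minMaxMaskFoldHolds() {
  for (unsigned k = 0; k < 8; ++k) {
    uint8_t mask = uint8_t(0xFFu << k);
    for (unsigned xi = 0; xi < 256; ++xi) {
      for (unsigned yi = 0; yi < 256; ++yi) {
        uint8_t x = uint8_t(xi), y = uint8_t(yi);
        uint8_t mx = uint8_t(x & mask), my = uint8_t(y & mask);
        // Unsigned variants: umin and umax.
        if (std::min(mx, my) != uint8_t(std::min(x, y) & mask))
          return false;
        if (std::max(mx, my) != uint8_t(std::max(x, y) & mask))
          return false;
        // Signed variants: smin and smax on the same bit patterns.
        int8_t sx = int8_t(x), sy = int8_t(y);
        int8_t smx = int8_t(mx), smy = int8_t(my);
        if (uint8_t(std::min(smx, smy)) != uint8_t(uint8_t(std::min(sx, sy)) & mask))
          return false;
        if (uint8_t(std::max(smx, smy)) != uint8_t(uint8_t(std::max(sx, sy)) & mask))
          return false;
      }
    }
  }
  return true;
}
```

The fold works because clearing the low `k` bits rounds down to a multiple of `2^k`, which is a monotonic operation under both signed and unsigned ordering.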
Matthias Braun
d23a85066b
InstCombine: Increase threadlocal.address alignment if pointee is more aligned (#88435)
Increase alignment of `llvm.threadlocal.address` if the pointed to
global has higher alignment.
2024-04-15 18:19:06 -07:00
Nikita Popov
a9d7ad23fa
[InstCombine] Relax shamt assertion in fsh fold
Allow the result of the comparison to contain poison elements,
which happens if one of the elements in the input vector is
poison.
2024-04-15 10:30:05 +09:00
Yingwei Zheng
e0a628715a
[ValueTracking] Convert isKnownNonZero to use SimplifyQuery (#85863)
This patch converts `isKnownNonZero` to use SimplifyQuery. Then we can
use the context information from `DomCondCache`.

Fixes https://github.com/llvm/llvm-project/issues/85823.
Alive2: https://alive2.llvm.org/ce/z/QUvHVj
2024-04-12 23:47:20 +08:00
Yingwei Zheng
caa2258250
[LLVM] Remove nuw neg (#86295)
This patch removes APIs that create NUW neg. It is a trivial case
because `sub nuw 0, X` always gets simplified into zero.
I believe there are no optimization opportunities in real-world
applications where we can take advantage of the nuw flag.

Motivated by
https://github.com/llvm/llvm-project/pull/84792#discussion_r1524891134.

Compile-time improvement:
https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=da7b7478b7cbb32c09d760f6b8d0e67901e0d533&stat=instructions:u
2024-03-26 20:56:16 +08:00
Noah Goldstein
b3ee127e7d
[InstCombine] integrate N{U,S}WAddLike into existing folds
Just did a quick replacement of `N{U,S}WAdd` with the `Like` variant
that also matches `or disjoint`

Closes #86082
2024-03-21 13:03:38 -05:00
Stephen Tozer
ffd08c7759
[RemoveDIs][NFC] Rename DPValue -> DbgVariableRecord (#85216)
This is the major rename patch that prior patches have built towards.
The DPValue class is being renamed to DbgVariableRecord, which reflects
the updated terminology for the "final" implementation of the RemoveDI
feature. This is a pure string substitution + clang-format patch. The
only manual component of this patch was determining where to perform
these string substitutions: `DPValue` and `DPV` are almost exclusively
used for DbgRecords, *except* for:

- llvm/lib/target, where 'DP' is used to mean double-precision, and so
appears as part of .td files and in variable names. NB: There is a
single existing use of `DPValue` here that refers to debug info, which
I've manually updated.
- llvm/tools/gold, where 'LDPV' is used as a prefix for symbol
visibility enums.

Outside of these places, I've applied several basic string
substitutions, with the intent that they only affect DbgRecord-related
identifiers; I've checked them as I went through to verify this, with
reasonable confidence that there are no unintended changes that slipped
through the cracks. The substitutions applied are all case-sensitive,
and are applied in the order shown:

```
  DPValue -> DbgVariableRecord
  DPVal -> DbgVarRec
  DPV -> DVR
```

Following the previous rename patches, it should be the case that there
are no instances of any of these strings that are meant to refer to the
general case of DbgRecords, or anything other than the DPValue class.
The idea behind this patch is therefore that pure string substitution is
correct in all cases as long as these assumptions hold.
2024-03-19 20:07:07 +00:00
Yingwei Zheng
0b59af4d86
[InstCombine] Clear sign-bit of the constant magnitude in copysign (#85787)
Alive2: https://alive2.llvm.org/ce/z/vFykcZ
Address the comment
https://github.com/llvm/llvm-project/pull/85772#discussion_r1530179048.

Unfortunately, non-splat vector constants are not supported because we
haven't implemented constant folding of fabs with vector operands.
2024-03-20 03:28:19 +08:00
Artem Tyurin
141145232f
[IRBuilder] Fold binary intrinsics (#80743)
Fixes https://github.com/llvm/llvm-project/issues/61240.
2024-03-15 09:58:25 +01:00
Yingwei Zheng
83d178843f
[InstCombine] Set zero_is_poison for ctlz/cttz if they are only used as shift amounts (#85035)
Alive2: https://alive2.llvm.org/ce/z/r-67t9

It would improve the codegen if the target doesn't provide a defined
value for ctlz/cttz with zero.
2024-03-13 21:52:40 +08:00
elhewaty
3f302eaca4
[InstCombine] Fold usub_sat((sub nuw C1, A), C2) to usub_sat(C1 - C2, A) or 0 (#82280)
- Fixes: https://github.com/llvm/llvm-project/issues/82177
- Alive2: https://alive2.llvm.org/ce/z/Q7mMC3
2024-03-11 15:10:40 +01:00
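The fold in #82280 can be checked exhaustively on 8-bit values. In this sketch the `nuw` flag is modelled by only considering `A <= C1`, and the "or 0" case is read as applying when `C1 < C2`; that reading is an assumption, and `usubSat` and `usubSatFoldHolds` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>

// Saturating unsigned subtraction on 8-bit values (models llvm.usub.sat).
uint8_t usubSat(uint8_t a, uint8_t b) {
  return a > b ? uint8_t(a - b) : uint8_t(0);
}

// Checks usub_sat(C1 - A, C2) against usub_sat(C1 - C2, A) when C1 >= C2
// and against 0 otherwise, for every A <= C1 (the nuw precondition).
bool usubSatFoldHolds() {
  for (unsigned c1 = 0; c1 < 256; ++c1) {
    for (unsigned c2 = 0; c2 < 256; ++c2) {
      for (unsigned a = 0; a <= c1; ++a) {
        uint8_t before = usubSat(uint8_t(c1 - a), uint8_t(c2));
        uint8_t after =
            c1 >= c2 ? usubSat(uint8_t(c1 - c2), uint8_t(a)) : uint8_t(0);
        if (before != after)
          return false;
      }
    }
  }
  return true;
}
```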