940 Commits

Author SHA1 Message Date
Nikita Popov
90ba33099c
[InstCombine] Canonicalize constant GEPs to i8 source element type (#68882)
This patch canonicalizes getelementptr instructions with constant
indices to use the `i8` source element type. This makes it easier for
optimizations to recognize that two GEPs are identical, because they
don't need to see past many different ways to express the same offset.

This is a first step towards
https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699.
This is limited to constant GEPs only for now, as they have a clear
canonical form, while we're not yet sure how exactly to deal with
variable indices.

The test llvm/test/Transforms/PhaseOrdering/switch_with_geps.ll gives
two representative examples of the kind of optimization improvement we
expect from this change. In the first test SimplifyCFG can now realize
that all switch branches are actually the same. In the second test it
can convert it into simple arithmetic. These are representative of
common optimization failures we see in Rust.

Fixes https://github.com/llvm/llvm-project/issues/69841.
2024-01-24 15:25:29 +01:00
Jeremy Morse
be0c809836 [NFC][Debuginfo][RemoveDIs] Switch an insertion to use iterators
With the soon-to-land new-debug-info storage model, it's going to be
important to use iterators for instruction insertion rather than
instruction pointers. This (single line in instcombine) is the last place
that trips up our internal testing for debug-info, where we insert a PHI
and it should be using an iterator.
2024-01-22 23:12:01 +00:00
Noah Goldstein
60e8915d22 [InstCombine] Add folds for (add/sub/disjoint_or/icmp C, (ctpop (not x)))
`(ctpop (not x))` <-> `(sub nuw nsw BitWidth(x), (ctpop x))`. The
`sub` expression can sometimes be constant folded depending on the use
case of `(ctpop (not x))`.

This patch adds fold for the following cases:

`(add/sub/disjoint_or C, (ctpop (not x))`
    -> `(add/sub/disjoint_or C', (ctpop x))`
`(cmp pred C, (ctpop (not x))`
    -> `(cmp swapped_pred C', (ctpop x))`

Where `C'` depends on how we constant fold `C` with `BitWidth(x)` for
the given opcode.

Proofs: https://alive2.llvm.org/ce/z/qUgfF3

Closes #77859
2024-01-15 12:05:38 -08:00
Yingwei Zheng
7c3bcc307a
[InstCombine] Fold switch(zext/sext(X)) into switch(X) (#76988)
This patch folds `switch(zext/sext(X))` into `switch(X)`.
The original motivation of this patch is to optimize a pattern found in
cvc5. For example:
```
  %bf.load.i = load i16, ptr %d_kind.i, align 8
  %bf.clear.i = and i16 %bf.load.i, 1023
  %bf.cast.i = zext nneg i16 %bf.clear.i to i32
  switch i32 %bf.cast.i, label %if.else [
    i32 335, label %if.then
    i32 303, label %if.then
  ]

if.then:                                          ; preds = %entry, %entry
  %d_children.i.i = getelementptr inbounds %"class.cvc5::internal::expr::NodeValue", ptr %0, i64 0, i32 3
  %cmp.i.i.i.i.i = icmp eq i16 %bf.clear.i, 1023
  %cond.i.i.i.i.i = select i1 %cmp.i.i.i.i.i, i32 -1, i32 %bf.cast.i
```
`%cmp.i.i.i.i.i` always evaluates to false because `%bf.clear.i` can
only be 335 or 303.
Folding `switch i32 %bf.cast.i` to `switch i16 %bf.clear.i` will help
`CVP` to handle this case.
See also
https://github.com/llvm/llvm-project/pull/76928#issuecomment-1877055722.

Compile-time impact:
http://llvm-compile-time-tracker.com/compare.php?from=7954c57124b495fbdc73674d71f2e366e4afe522&to=502b13ed34e561d995ae1f724cf06d20008bd86f&stat=instructions:u

|stage1-O3|stage1-ReleaseThinLTO|stage1-ReleaseLTO-g|stage1-O0-g|stage2-O3|stage2-O0-g|stage2-clang|
|--|--|--|--|--|--|--|
|+0.03%|+0.06%|+0.07%|+0.00%|-0.02%|-0.03%|+0.02%|
2024-01-06 04:30:07 +08:00
Yingwei Zheng
1259c05122
[InstCombine] Canonicalize switch(X << C) into switch(X) (#77068)
This patch canonicalizes `switch(X << C)` to `switch(X)`. If the shift
may wrap, an and instruction will be created to mask out all of the
shifted bits.
Alive2: https://alive2.llvm.org/ce/z/wSsL5y

NOTE: We can relax the one-use constraint. But I don't see any benefit
in my benchmark.

Compile-time impact:
http://llvm-compile-time-tracker.com/compare.php?from=a776740d6296520b8bde156aa3f8d9ecb32cddd9&to=6dd783b9f90ae5f258102d732953567d7e317c02&stat=instructions%3Au

|stage1-O3|stage1-ReleaseThinLTO|stage1-ReleaseLTO-g|stage1-O0-g|stage2-O3|stage2-O0-g|stage2-clang|
|--|--|--|--|--|--|--|
|-0.00%|+0.01%|-0.02%|-0.01%|+0.02%|-0.00%|+0.01%|
2024-01-06 01:43:21 +08:00
Yingwei Zheng
f7f7574afe
[InstCombine] Canonicalize switch(C-X) to switch(X) (#77051)
This patch canonicalizes `switch(C-X)` to `switch(X)`.

Compile-time impact:
http://llvm-compile-time-tracker.com/compare.php?from=7954c57124b495fbdc73674d71f2e366e4afe522&to=31a9adff1e633f0f3c423fb8487fc15d17e171f2&stat=instructions:u

|stage1-O3|stage1-ReleaseThinLTO|stage1-ReleaseLTO-g|stage1-O0-g|stage2-O3|stage2-O0-g|stage2-clang|
|--|--|--|--|--|--|--|
|-0.01%|+0.02%|+0.02%|+0.05%|-0.07%|-0.02%|-0.02%|
2024-01-05 21:03:24 +08:00
Yingwei Zheng
0ce193708c
[InstCombine] Refactor folding of commutative binops over select/phi/minmax (#76692)
This patch cleans up the duplicate code for folding commutative binops
over `select/phi/minmax`.

Related commits:
+ select support:
88cc35b27e
+ phi support:
8674a023bc
+ minmax support:
624973806c
2024-01-04 15:11:28 +08:00
Yingwei Zheng
949ec83eaf
[InstCombine] Relax the same-underlying-object constraint for the GEP canonicalization (#76583)
7d7001b2cb
canonicalizes `(gep i8, X, (ptrtoint Y) - (ptrtoint X))` into `bitcast
Y` iff `X` and `Y` have the same underlying object.

I find that the result of this pattern is usually used as an operand of
an icmp in some real-world applications. I think we can do the
canonicalization if the result is only used by icmps/ptrtoints.

Alive2: https://alive2.llvm.org/ce/z/j4-HJZ
2024-01-01 00:35:42 +08:00
Yingwei Zheng
2128fca6c1
[InstCombine] Canonicalize gep T* X, V / sizeof(T) to gep i8* X, V (#76458)
This patch canonicalize `gep T* X, V / sizeof(T)` to `gep i8* X, V`.
Alive2: https://alive2.llvm.org/ce/z/7XGjiB

As this pattern has been handled by the backends, the motivation of this
patch is to reduce the ref count of sdiv, which will enable more
optimizations.
2023-12-29 11:30:00 +08:00
Nikita Popov
b8df88b41c [InstCombine] Support zext nneg in gep of sext add fold
Add m_NNegZext() and m_SExtLike() matchers to make doing these kinds
of changes simpler in the future.
2023-12-21 16:38:09 +01:00
Chia
8674a023bc
[InstCombine] fold (Binop phi(a, b) phi(b, a)) -> (Binop a, b) while Binop is commutative. (#75765)
Alive2 proof: https://alive2.llvm.org/ce/z/2P8gq-
This patch closes #73905
2023-12-21 22:47:21 +08:00
Nikita Popov
cd54c47424 [InstCombine] Match poison instead of undef in foldVectorBinop()
Some negative tests turn into positive tests, as the differences
between undef and poison propagation allow additional transforms.
2023-12-18 17:01:59 +01:00
Nikita Popov
ddd11537e2 [InstCombine] Match poison instead of undef in binop of same-mask shuffle fold 2023-12-18 16:41:38 +01:00
Nikita Popov
465ecf872e [InstCombine] Rename UndefElts -> PoisonElts (NFC)
In line with updated shufflevector semantics, this represents the
poison elements rather than undef elements now. This commit is a
pure rename, without any logic changes.
2023-12-18 12:36:19 +01:00
Yingwei Zheng
af2d740d2d
[InstCombine] Treat lshr nneg as ashr in getBinOpsForFactorization (#75521)
This patch reinterprets `lshr nneg C, X` as `ashr nneg C, X` to allow
more factorization opportunities.
Fixes #70582.
2023-12-15 16:32:18 +08:00
Yingwei Zheng
9cf3e31172
[InstCombine] Explicitly fold ~(~X >>u Y) into X >>s Y (#75473)
Fixes #75369.

This patch explicitly folds `~(~X >>u Y)` into `X >>s Y` to fix assertion failure in #75369.
2023-12-14 23:06:38 +08:00
Sizov Nikita
88cc35b27e
[InstCombine] Fold binop (select cond, a, b), (select cond, b, a) to binop a, b (#74953)
```
CommutativeBinOp(select(V, A, B), select(V, B, A) --> CommutativeBinOp(A, B)
CommutativeIntrinsicCall(select(V, A, B), select(V, B, A), ...) --> CommutativeIntrinsicCall(A, B, ...)
```

https://alive2.llvm.org/ce/z/8CDUZ4

Closes #73904
2023-12-13 14:09:27 +08:00
Craig Topper
09a05f5dcb [InstCombine] Drop poison generating flags on Or in simplifyAssocCastAssoc.
Fixes #74739.
2023-12-07 13:35:28 -08:00
Nikita Popov
d77067d08a
[ValueTracking] Add dominating condition support in computeKnownBits() (#73662)
This adds support for using dominating conditions in computeKnownBits()
when called from InstCombine. The implementation uses a
DomConditionCache, which stores which branches may provide information
that is relevant for a given value.

DomConditionCache is similar to AssumptionCache, but does not try to do
any kind of automatic tracking. Relevant branches have to be explicitly
registered and invalidated values explicitly removed. The necessary
tracking is done inside InstCombine.

The reason why this doesn't just do exactly the same thing as
AssumptionCache is that a lot more transforms touch branches and branch
conditions than assumptions. AssumptionCache is an immutable analysis
and mostly gets away with this because only a handful of places have to
register additional assumptions (mostly as a result of cloning). This is
very much not the case for branches.

This change regresses compile-time by about ~0.2%. It also improves
stage2-O0-g builds by about ~0.2%, which indicates that this change results
in additional optimizations inside clang itself.

Fixes https://github.com/llvm/llvm-project/issues/74242.
2023-12-06 14:17:18 +01:00
Nikita Popov
faebb1b2e6 Reapply [InstCombine] Support inverting lshr with non-negative operand
My initial patch contained a typo, resulting in the wrong value
being checked for non-negativeness.

-----

If the lshr operand is non-negative, we can treat it the same
way as an ashr. Ideally we would represent this as "lshr nneg",
but for now just perform the necessary ValueTracking query.

Proof: https://alive2.llvm.org/ce/z/Ahg4ri
2023-12-01 16:09:54 +01:00
Nikita Popov
8c130996c0 Revert "[InstCombine] Support inverting lshr with non-negative operand"
This reverts commit b92693ac6afc522ea56bede0b9805ca7c138754c.

I've made a silly typo in the condition. Will reapply the corrected
version.
2023-12-01 16:05:17 +01:00
Nikita Popov
b92693ac6a [InstCombine] Support inverting lshr with non-negative operand
If the lshr operand is non-negative, we can treat it the same
way as an ashr. Ideally we would represent this as "lshr nneg",
but for now just perform the necessary ValueTracking query.

Proof: https://alive2.llvm.org/ce/z/Ahg4ri
2023-12-01 15:55:27 +01:00
Jeremy Morse
4424903156
[DebugInfo][RemoveDIs] Handle DPValues at remaining dbg.value using sites (#73788)
This patch updates the last few places in LLVM using findDbgValues that
don't also collect and handle DPValue objects. This largely involves
instcombine and mem2reg changes, and are largely mechanical, calling
existing utilities on collections of DPValues instead of just
DbgValuesInsts.

A variety of tests have had RemoveDIs RUN lines added to them to cover
these behaviours. We have some technical debt of the instcombine sinking
code for DPValues not being implemented yet, so I've left FIXME stubs
indicating that we intend to cover tests with RemoveDIs but haven't yet.
2023-11-30 16:30:32 +00:00
Jeremy Morse
2425e2940e
[DebugInfo][RemoveDIs] Have getInsertionPtAfterDef return an iterator (#73149)
Part of the "RemoveDIs" project to remove debug intrinsics requires
passing block-positions around in iterators rather than as instruction
pointers, allowing some debug-info to reside in BasicBlock::iterator.
This means getInsertionPointAfterDef has to return an iterator, and as
it can return no-instruction that means returning an optional iterator.

This patch changes the signature for getInsertionPtAfterDef and then
patches up the various places that use it to handle the different type.
This would overall be an NFC patch, however in
InstCombinerImpl::freezeOtherUses I've started skipping any debug
intrinsics at the returned insert-position. This should not have any
_meaningful_ effect on the compiler output: at worst it means variable
assignments that are skipped will now cover the freeze instruction and
anything inserted before it, which should be inconsequential.

Sadly: this makes the function signature ugly. This is probably the
ugliest piece of fallout for the "RemoveDIs" work, but it serves the
overall purpose of improving compile times and not allowing `-g` to
affect compiler output, so should be worthwhile in the end.
2023-11-30 12:19:57 +00:00
LiqinWeng
f7247d5041
[InstCombine] Canonicalise SextADD + GEP (#69581) 2023-11-29 09:50:58 +08:00
Noah Goldstein
3039691f53 [InstCombine] add getFreeInverted to perform folds for free inversion of op
With the current logic of `if(isFreeToInvert(Op)) return Not(Op)` its
fairly easy to either 1) cause regressions or 2) infinite loops
if the folds we have for `Not(Op)` ever de-sync with the cases we
know are freely invertible.

This patch adds `getFreeInverted` which is able to build the free
inverted op along with check for free inversion to alleviate this
problem.
2023-11-20 17:59:27 -06:00
Jeremy Morse
f42482def2
[DebugInfo][RemoveDIs] Don't convert debug-intrinsics to Unreachable (#72380)
It might seem obvious, but it's not a good idea to convert a
debug-intrinsic instruction into an UnreachableInst, as this means
things operate differently with and without the -g option. However this
can happen due to the "mutate the next instruction" API calls we make.
With RemoveDIs eliminating debug intrinsics, this behaviour is at risk
of changing, hence this patch ensures we only ever mutate the next _non_
debuginfo instruction into an Unreachable.

The tests instrumented with the --try... flag all exercise this, I've
added some metadata to a SCCP test to ensure it's exercised.
2023-11-20 20:53:24 +00:00
Jeremy Morse
80d3a4c39f
[DebugInfo][RemoveDIs] Add local-utility plumbing for DPValues (#72276)
This patch re-implements a variety of debug-info maintenence functions
to use DPValues instead of DbgValueInst's: supporting the "new"
non-intrinsic representation of debug-info. As per [0], we need to have
parallel implementations of various utilities for a time, and these are
the most fundamental utilities used throughout the compiler.

I've added --try-experimental-debuginfo-iterators to a variety of RUN
lines: this is a flag that turns on "new debug-info" if it's built into
LLVM, and not otherwise. This should ensure that we have the same
behaviour for the same IR inputs, but using a different internal
representation. For the most part these changes affect SROA/Mem2Reg
promotion of dbg.declares into dbg.value intrinsics (now DPValues),
we're leaving dbg.declares as instructions until later in the day.
There's also some salvaging changes made.

I believe the tests that I've added cover almost all the code being
updated here. The only thing I'm not confident about is SimplifyCFG,
which calls rewriteDebugUsers down a variety of code paths. Those
changes can't immediately get full coverage as an additional patch is
needed that updates handling of Unreachable instructions, will upload
that shortly.

[0]
https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939/9
2023-11-20 16:56:31 +00:00
Florian Hahn
1c05fe3500
[InstCombine] Pass InstCombineOptions instead of separate flags (NFC). (#72566)
This makes it simpler to pass additional flags/options in the future.
2023-11-17 09:57:05 +00:00
Craig Topper
2fbd088524
[InstCombine] Queue Xor for deletion after replacing its uses in freelyInvertAllUsersOf. (#72445)
Fixes #72433
2023-11-16 09:46:34 -08:00
Nikita Popov
b43b2a64b5 [InstCombine] Avoid use of shift constant expressions (NFCI)
Use the constant folding API instead. As we're working on
ImmConstants, these folds are guaranteed to succeed.
2023-11-10 16:58:10 +01:00
Nikita Popov
1b1c81772f [InstCombine] Drop poison flags in simplifyAssocCastAssoc()
The nneg flag on zext may no longer hold after the reassociation.
2023-11-09 11:58:02 +01:00
Nikita Popov
03110ddeb2 [IR] Remove ZExtOperator (NFC)
Now that zext constant expressions are no longer supported,
ZExtInst should be used instead.
2023-11-03 14:52:59 +01:00
Florian Hahn
0af5c0668a
[InstCombine] Don't consider aligned_alloc removable if icmp uses result (#69474)
At the moment, all alloc-like functions are assumed to return non-null
pointers, if their return value is only used in a compare. This is based
on being allowed to substitute the allocation function with one that
doesn't fail to allocate the required memory.

aligned_alloc however must also return null if the required alignment
cannot be satisfied, so I don't think the same reasoning as above can be
applied to it.

This patch adds a bail-out for aligned_alloc calls to
isAllocSiteRemovable.
2023-10-19 18:35:27 +01:00
Leonard Chan
ef388334ee Revert "Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass""
This reverts commit 5a36904c515b.

Reverted because this breaks some floating point operations. See the
comment on https://github.com/llvm/llvm-project/commit/5a36904c515b.
2023-10-12 20:23:39 +00:00
Dmitriy Smirnov
e13bed4c5f [PATCH] [llvm] [InstCombine] Canonicalise ADD+GEP
This patch tries to canonicalise add + gep to gep + gep.

Co-authored-by: Paul Walker <paul.walker@arm.com>

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D155688
2023-10-06 12:29:06 +01:00
Matt Arsenault
5a36904c51 Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"
This reverts commit 26bb22b0c89e9b27576fd1f5683e0bc9ac3b4ec9.
2023-10-05 07:49:38 -07:00
Jonas Hahnfeld
26bb22b0c8 Revert "InstCombine: Introduce SimplifyDemandedUseFPClass"
It causes a test failure of clang/test/Headers/__clang_hip_math.hip:
https://lab.llvm.org/buildbot/#/builders/109/builds/75022

This reverts commit 59c6e2e9c1beee0bc73922fe44c9fd4462289847.
2023-10-05 10:26:10 +02:00
Matt Arsenault
59c6e2e9c1 InstCombine: Introduce SimplifyDemandedUseFPClass
This is the floating-point analog of SimplifyDemandedBits. If we know
the edge cases are assumed impossible in uses, it's possible to prune
upstream edge case handling.

Start by only using this on returns in functions with nofpclass
returns (where I'm surprised there are no other combines), but this
can be extended to include any other nofpclass use or FPMathOperator
with flags.

Partially addresses issue #64870

https://reviews.llvm.org/D158648
2023-10-04 21:06:24 -07:00
Nikita Popov
6ce7461eea [InstCombine] Avoid uses of ConstantExpr::getCast()
Add a generalized getLosslessTrunc() helper to simplify this.
2023-09-29 11:32:41 +02:00
Nikita Popov
c00f49cf12 [InstCombine] Remove instcombine-infinite-loop-threshold option
This option has been superseded by the fixpoint verification
functionality.
2023-09-21 15:30:05 +02:00
Paul Walker
c7d65e4466 [IR] Enable load/store/alloca for arrays of scalable vectors.
Differential Revision: https://reviews.llvm.org/D158517
2023-09-14 13:49:01 +00:00
Jeremy Morse
d529943a27 [NFC][RemoveDIs] Prefer iterators over inst-pointers in InstCombine
As per my proposal for how to eliminate debug intrinsics [0], for various
places in InstCombine prefer to insert using an instruction iterator rather
than an instruction pointer. This is so that we can eventually pass more
information in the iterator class. These call-sites where I've changed the
spelling are those that necessary to build a stage2clang to produce an
identical binary in the coming no-debug-intrinsics mode.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152543
2023-09-11 15:04:51 +01:00
Jeremy Morse
6942c64e81 [NFC][RemoveDIs] Prefer iterator-insertion over instructions
Continuing the patch series to get rid of debug intrinsics [0], instruction
insertion needs to be done with iterators rather than instruction pointers,
so that we can communicate information in the iterator class. This patch
adds an iterator-taking insertBefore method and converts various call sites
to take iterators. These are all sites where such debug-info needs to be
preserved so that a stage2 clang can be built identically; it's likely that
many more will need to be changed in the future.

At this stage, this is just changing the spelling of a few operations,
which will eventually become signifiant once the debug-info bearing
iterator is used.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D152537
2023-09-11 11:48:45 +01:00
Jeremy Morse
4427407a29 [NFC][RemoveDIs] Create a new spelling of the moveBefore method
As outlined in my proposal of how to get rid of debug intrinsics, this
patch adds a moveBefore method that signals the caller /intends/ the order
of moved instructions is to stay the same. This semantic difference has an
effect on debug-info, as it signals whether debug-info needs to move with
instructions or not.

The patch just replaces a few calls to moveBefore with calls to
moveBeforePreserving -- and the latter just calls the former, so it's all
NFC right now. A future patch will add an implementation of
moveBeforePreserving that takes action to correctly preserve debug-info,
but that's tightly coupled with our non-instruction debug-info
representation that's still being reviewed.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D156369
2023-09-07 18:37:57 +01:00
Fangrui Song
111fcb0df0 [llvm] Fix duplicate word typos. NFC
Those fixes were taken from https://reviews.llvm.org/D137338
2023-09-01 18:25:16 -07:00
Matt Arsenault
033d6ffb53 IR: Add operator | and & for FastMathFlags
We only had |= and &= which was annoying.
2023-08-28 19:25:54 -04:00
Nikita Popov
167db7ce55 [InstCombine] Guard against FP min/max in select fold (PR64937)
This is partial revert of cbca9ce91c6440f8815742b8a73a27aa81e806e6.
That commit removed the code guarding against min/max SPF patterns,
because those are now canonicalized to min/max intrinsics. However,
this is only true for integer min/max, while FP min/max can not
always be canonicalized to an intrinsic. As such, restore a
simplified version of the guard that handles only the FP case.

Fixes https://github.com/llvm/llvm-project/issues/64937.
2023-08-24 15:29:59 +02:00
Maksim Kita
341443d731 [InstCombine] Fold (-a >> b) and/or/xor (~a >> b) into (-a and/or/xor ~a) >> b
Fold (-a >> b) and/or/xor (~a >> b) into (-a and/or/xor ~a) >> b.
Depends on D157289.

Differential Revision: https://reviews.llvm.org/D157290
2023-08-21 12:49:20 +03:00
Carlos Alberto Enciso
bf69217bae [instcombine] Sunk instructions with invalid source location.
When the 'int Four = Two;' is sunk into the 'case 0:' block,
the debug value for 'Three' is set incorrectly to 'poison'.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D158171
2023-08-21 06:24:21 +01:00