1029 Commits

Author SHA1 Message Date
Yingwei Zheng
b8337dc4b2
[InstCombine] Handle commuted patterns in foldBinOpShiftWithShift (#122126)
Closes https://github.com/llvm/llvm-project/issues/121775.
2025-01-09 14:36:17 +08:00
Nikita Popov
a8072a0b4e
[InstCombine] Eliminate icmp+zext pairs over phis more aggressively (#121767)
When folding icmp over phi, add a special case for `icmp eq (zext(bool),
0)`, which is known to fold to `!bool` and thus won't increase the
instruction count. This helps convert more phis to i1, esp. in loops.

This is based on existing logic we have to support this for icmp of
ucmp/scmp.
2025-01-07 09:29:36 +01:00
Yingwei Zheng
a4d92400a6
[InstCombine] Fix GEPNoWrapFlags propagation in foldGEPOfPhi (#121572)
Closes https://github.com/llvm/llvm-project/issues/121459.
2025-01-03 23:19:57 +08:00
Alexander Kornienko
23a239267e
Revert "[InstCombine] Infer nuw for gep inbounds from base of object" (#120460)
Reverts llvm/llvm-project#119225 due to the lack of sanitizer support,
large potential of breaking code containing latent UB, non-trivial
localization and investigation, and what seems to be a bad interaction
with msan (a test is in the works).

Related discussions:
https://github.com/llvm/llvm-project/pull/119225#issuecomment-2551904822
https://github.com/llvm/llvm-project/pull/118472#issuecomment-2549986255
2024-12-18 19:06:34 +01:00
Ramkumar Ramachandra
4a0d53a0b0
PatternMatch: migrate to CmpPredicate (#118534)
With the introduction of CmpPredicate in 51a895a (IR: introduce struct
with CmpInst::Predicate and samesign), PatternMatch is one of the first
key pieces of infrastructure that must be updated to match a CmpInst
respecting samesign information. Implement this change to Cmp-matchers.

This is a preparatory step in migrating the codebase over to
CmpPredicate. Since we no functional changes are desired at this stage,
we have chosen not to migrate CmpPredicate::operator==(CmpPredicate)
calls to use CmpPredicate::getMatching(), as that would have visible
impact on tests that are not yet written: instead, we call
CmpPredicate::operator==(Predicate), preserving the old behavior, while
also inserting a few FIXME comments for follow-ups.
2024-12-13 14:18:33 +00:00
Matthias Braun
768754807f
[InstCombine] Optimistically allow multiple shufflevector uses in foldOpPhi (#114278)
We would like to optimize situations of the form that happen after loop
vectorization+SROA:
```
loop:
    %phi = phi zeroinitializer, %interleaved

    %deinterleave_a = shufflevector %phi, poison ; pick half of the lanes
    %deinterleave_b = shufflevector %phi, posion ; pick remaining lanes

    ... %a = ... %b = ...

    %interleaved = shufflevector %a, %b ; interleave lanes of a+b
```
where the interleave and de-interleave shuffle operations cancel each
other out.
This could be handled by `foldOpPhi` but does not currently work because
it does
not proceed when there are multiple uses of the `Phi` operation.

This extends `foldOpPhi` to allow multiple `shufflevector` uses when
they are
shown to simplify for all `Phi` input values.
2024-12-12 17:20:48 -08:00
Nikita Popov
e21ab4d16b
[InstCombine] Infer nuw for gep inbounds from base of object (#119225)
When we have a gep inbounds from the base of an object (e.g. alloca or
global), we know that the index cannot be negative, as this would go out
of bounds. As such, we can infer nuw as well.

The implementation is a bit stricter than necessary, we could also
accept one unknown index followed by known-non-negative indices.

Proof: https://alive2.llvm.org/ce/z/Hp7-6w (Note that alive2 currently
incorrectly doesn't require the inbounds for the alloca case, see
https://github.com/AliveToolkit/alive2/issues/1138).
2024-12-10 10:00:50 +01:00
Nikita Popov
f7685af4a5 [InstCombine] Move gep of phi fold into separate function
This makes sure that an early return during this fold doesn't end
up skipping later gep folds.
2024-12-05 15:20:56 +01:00
Nikita Popov
462cb3cd6c
[InstCombine] Infer nusw + nneg -> nuw for getelementptr (#111144)
If the gep is nusw (usually via inbounds) and the offset is
non-negative, we can infer nuw.

Proof: https://alive2.llvm.org/ce/z/ihztLy
2024-12-05 14:36:40 +01:00
Ramkumar Ramachandra
51a895aded
IR: introduce struct with CmpInst::Predicate and samesign (#116867)
Introduce llvm::CmpPredicate, an abstraction over a floating-point
predicate, and a pack of an integer predicate with samesign information,
in order to ease extending large portions of the codebase that take a
CmpInst::Predicate to respect the samesign flag.

We have chosen to demonstrate the utility of this new abstraction by
migrating parts of ValueTracking, InstructionSimplify, and InstCombine
from CmpInst::Predicate to llvm::CmpPredicate. There should be no
functional changes, as we don't perform any extra optimizations with
samesign in this patch, or use CmpPredicate::getMatching.

The design approach taken by this patch allows for unaudited callers of
APIs that take a llvm::CmpPredicate to silently drop the samesign
information; it does not pose a correctness issue, and allows us to
migrate the codebase piece-wise.
2024-12-03 13:31:04 +00:00
Nikita Popov
9a844a36eb
[InstCombine] Use InstSimplify in FoldOpIntoSelect (#116073)
Instead of only trying to constant fold the select arms, try to simplify
them. This subsumes https://github.com/llvm/llvm-project/pull/115969
which implements this for extractvalue only.

This is still fairly limited in that we will usually only call
FoldOpIntoSelect in the first place if we have a constant operand. This
can be relaxed in the future if worthwhile.
2024-11-18 10:07:31 +01:00
serge-sans-paille
f5e4ffaa49
Revert "[llvm] Use computeConstantRange to improve llvm.objectsize computation (#114673)"
This reverts commit 5f342816efe1854333f2be41a03fdd25fa0db433.

This seems to break various builders, such as

https://lab.llvm.org/buildbot/#/builders/41/builds/3259
https://lab.llvm.org/buildbot/#/builders/76/builds/4298
2024-11-07 13:40:50 +01:00
serge-sans-paille
5f342816ef
[llvm] Use computeConstantRange to improve llvm.objectsize computation (#114673)
Using LazyValueInfo, it is possible to compute valuable information for
allocation functions, GEP and alloca, even in the presence of dynamic
information.

llvm.objectsize plays an important role in _FORTIFY_SOURCE definitions,
so improving its diagnostic in turns improves the security of compiled
application.

As a side note, as a result of recent optimization improvements, clang
no longer passes
https://github.com/serge-sans-paille/builtin_object_size-test-suite This
commit restores the situation and greatly improves the scope of code
handled by the static version of __builtin_object_size.
2024-11-07 09:01:14 +00:00
Yingwei Zheng
cacbe71af7
[Analysis] Avoid running transform passes that have just been run (#112092)
This patch adds a new analysis pass to track a set of passes and their
parameters to see if we can avoid running transform passes that have
just been run. The current implementation only skips redundant
InstCombine runs. I will add support for other passes in follow-up
patches.

RFC link:
https://discourse.llvm.org/t/rfc-pipeline-avoid-running-transform-passes-that-have-just-been-run/82467

Compile time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=76007138f4ffd4e0f510d12b5e8cad529c21f24d&to=64134cf07ea7eb39c60320087c0c5afdc16c3a2b&stat=instructions%3Au
2024-11-07 07:52:14 +08:00
Yingwei Zheng
f78610af3f
[InstCombine] Add function attribute instcombine-no-verify-fixpoint (#113822)
This patch introduces a function attribute
`instcombine-no-verify-fixpoint` to avoids disabling fix-point
verification for unrelated tests in the same file.
Address comment
https://github.com/llvm/llvm-project/pull/112642#discussion_r1804714387.
2024-10-28 17:45:08 +08:00
Yingwei Zheng
5155c38cee
[InstCombine] Don't check uses of constant exprs (#113684)
This patch skips constant expressions to avoid iterating over uses on
other functions.

Fix crash reported in
https://github.com/llvm/llvm-project/pull/105510#issuecomment-2437521147.
2024-10-28 15:09:20 +08:00
Jay Foad
90cdc03e7f
[IR] Fix undiagnosed cases of structs containing scalable vectors (#113455)
Type::isScalableTy and StructType::containsScalableVectorType failed to
detect some cases of structs containing scalable vectors because
containsScalableVectorType did not call back into isScalableTy to check
the element types. Fix this, which requires sharing the same Visited set
in both functions. Also change the external API so that callers are
never required to pass in a Visited set, and normalize the naming to
isScalableTy.
2024-10-25 12:56:10 +01:00
Ramkumar Ramachandra
7b65971e1f
InstCombine: sink loads with invariant.load metadata (#112692) 2024-10-18 10:35:56 +01:00
Rahul Joshi
fa789dffb1
[NFC] Rename Intrinsic::getDeclaration to getOrInsertDeclaration (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
2024-10-11 05:26:03 -07:00
David Green
d2408c417c
[InstCombine] Canonicalize more geps with constant gep bases and constant offsets. (#110033)
This is another small but hopefully not performance negative step to
canonicalizing towards i8 geps. We looks for geps with a constant offset
base pointer of the form `gep (gep @glob, C1), x, C2` and expand the gep
instruction, so that the constant can hopefully be combined together (or
the x offset can be computed in common).
2024-10-06 10:44:21 +01:00
Stephen Tozer
caa265e01c
[DebugInfo][InstCombine] Do not overwrite prior DILocation for new Insts (#108565)
When InstCombine replaces an old instruction with a new instruction, it
copies !dbg and !annotation metadata from old to new. For some
InstCombine patterns we set a specific DILocation on the new instruction
prior to insertion, however, which more accurately reflects the new
instruction. This more specific DILocation may be overwritten on
insertion by a less appropriate one, resulting in a less correct line
mapping. This patch changes this behaviour to only copy the DILocation
from old to new if the new instruction has no existing DILocation (which
will always be the case for a new instruction unless InstCombine has
specifically set one).
2024-10-03 17:08:45 +01:00
Nikita Popov
e565a4fa0b [IR] Extract helper for GEPNoWrapFlags intersection (NFC)
When combining two geps into one by adding the offsets, we have
to take some care when intersecting the flags, because nusw flags
cannot be straightforwardly preserved.

Add a helper for this on GEPNoWrapFlags so we won't have to repeat
this logic in various places.
2024-10-01 16:58:23 +02:00
Volodymyr Vasylkun
b189b89bde
[InstCombine] Relax the conditons of fold of ucmp/scmp into phi by allowing the phi node to use the result of ucmp/scmp more than once (#109593)
This extends the optimisation implemented in #107769 by relaxing the
condtions to make it happen. Now, the value produced by `ucmp`/`scmp`
doesn't need to be one-use, but only one-user, meaning it can be present
in a single phi node more than once.
2024-09-23 15:39:11 +01:00
Jay Foad
e03f427196
[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
2024-09-19 16:16:38 +01:00
Nikita Popov
f1ff3a279f [InstCombine] Rename TTI member for clarity (NFC)
There is already a comment on the member and documentation in the
InstCombine contributor guide, but also rename it to make add
an additional speed bump.
2024-09-19 12:31:11 +02:00
Nikita Popov
dc6876fc98
[ValueTracking] Use isSafeToSpeculativelyExecuteWithVariableReplaced() in more places (#109149)
This replaces some uses of isSafeToSpeculativelyExecute() with
isSafeToSpeculativelyExecuteWithVariableReplaced(), in cases where we
are guarding against operand changes rather plain speculation.

I believe that this is NFC with the current implementation of the
function (as it only does something different from loads), but this
makes us more defensive against future generalizations.
2024-09-19 09:38:20 +02:00
Volodymyr Vasylkun
21e3a212c5
[InstCombine] Replace an integer comparison of a phi node with multiple ucmp/scmp operands and a constant with phi of individual comparisons of original intrinsic's arguments (#107769)
When we have a `phi` instruction with more than one of its incoming
values being a call to `ucmp` or `scmp`, which is then compared with an
integer constant, we can move the comparison through the `phi` into the
incoming basic blocks because we know that a comparison of `ucmp`/`scmp`
with a constant will be simplified by the next iteration of InstCombine.

There's a high chance that other similar patterns can be identified, in
which case they can be easily handled by the same code by moving the
check for "simplifiable" instructions into a lambda.
2024-09-13 19:50:27 +01:00
Nikita Popov
1c298c9274 [InstCombine] Preserve nuw flags when merging geps
These transforms all perform a variant of (gep (gep p, x), y)
to (gep p, (x + y)). We can preserve both inbounds and nuw
during such transforms (https://alive2.llvm.org/ce/z/Stu4cN), but
not nusw, which would require proving that the new add is nsw.

For the constant offset case, I've conservatively retained the
logic that checks for negative intermediate offsets, though I'm
not sure it's still reachable nowadays.
2024-09-13 11:15:22 +02:00
Nikita Popov
cd39242032 [InstCombine] Remove no longer needed constant offset case (NFCI)
Now that we canonicalize constant geps to i8 type, this special
handling should no longer be needed.
2024-09-13 10:15:54 +02:00
Nikita Popov
940f89255e [InstCombine] Do not modify GEP in place
This was modifying the GEP in place, with code to adjust the
inbounds flag. This was correct at the time, but now fails to
account for other GEP flags like nuw, leading to miscompilations.

Remove the special case, and always create a new GEP instruction.
Logic for preserving nuw in the cases where it is valid will be
added in a followup patch.
2024-09-13 10:04:39 +02:00
Nikita Popov
3bc38fb27a
[InstCombine] Generalize and consolidate phi translation check (#106051)
The foldOpIntoPhi() transforms requires all operands to be
phi-translatable. This can be the case either because they are phi nodes
in the same block, or because the operand dominates the block.

Currently, most callers of foldOpIntoPhi() satisfy this pre-condition by
requiring a constant operand, which trivially dominates everything. Only
selects had handling for variable operands.

Move this logic into foldOpIntoPhi(), so things are handled correctly if
other callers are generalized. Also make the implementation a bit more
general by querying the dominator tree.
2024-09-04 16:22:43 +02:00
Nikita Popov
34b10e165d [InstCombine] Remove optional LoopInfo dependency
https://github.com/llvm/llvm-project/pull/106075 has removed the
last dependency on LoopInfo in InstCombine, so don't fetch the
analysis anymore and remove the use-loop-info pass option.
2024-09-02 10:25:45 +02:00
Nikita Popov
f044564db1
[InstCombine] Make backedge check in op of phi transform more precise (#106075)
The op of phi transform wants to prevent moving an operation across a
backedge, as this may lead to an infinite combine loop.

Currently, this is done using isPotentiallyReachable(). The problem with
that is that all blocks inside a loop are reachable from each other.
This means that the op of phi transform is effectively completely
disabled for code inside loops, even when it's not actually operating on
a loop phi (just a phi that happens to be in a loop).

Fix this by explicitly computing the backedges inside the function
instead. Do this via RPOT, which is a bit more efficient than using
FindFunctionBackedges() (which does it without any pre-computed
analyses).

For irreducible cycles, there may be multiple possible choices of
backedge, and this just picks one of them. This is still sufficient to
prevent combine loops.

This also removes the last use of LoopInfo in InstCombine -- I'll drop
the analysis in a followup.
2024-09-02 09:09:21 +02:00
Yingwei Zheng
380fa875ab
[InstCombine] Replace all dominated uses of condition with constants (#105510)
This patch replaces all dominated uses of condition with true/false to
improve context-sensitive optimizations. It eliminates a bunch of
branches in llvm-opt-benchmark.

As a side effect, it may introduce new phi nodes in some corner cases.
See the following case:
```
define i1 @test(i1 %cmp, i1 %cond) {
entry:
   br i1 %cond, label %bb1, label %bb2
bb1:
   br i1 %cmp, label %if.then, label %if.else
if.then:
   br %bb2
if.else:
   br %bb2
bb2:
  %res = phi i1 [%cmp, %entry], [%cmp, %if.then], [%cmp, %if.else]
  ret i1 %res
}
```
It will be simplified into:
```
define i1 @test(i1 %cmp, i1 %cond) {
entry:
   br i1 %cond, label %bb1, label %bb2
bb1:
   br i1 %cmp, label %if.then, label %if.else
if.then:
   br %bb2
if.else:
   br %bb2
bb2:
  %res = phi i1 [%cmp, %entry], [true, %if.then], [false, %if.else]
  ret i1 %res
}
```

I am planning to fix this in late pipeline/CGP since this problem exists
before the patch.
2024-09-01 09:49:23 +08:00
Nikita Popov
b74248dae8 [InstCombine] Pass RPOT to InstCombiner (NFC)
To make use of it in a followup change.
2024-08-26 15:17:38 +02:00
Nikita Popov
2511cdb078 [InstCombine] Adjust fixpoint error message (NFC)
Add a hint to use the no-verify-fixpoint option.
2024-08-20 14:30:09 +02:00
Yingwei Zheng
f364b2ee22
[LLVM] Don't peek through bitcast on pointers and gep with zero indices. NFC. (#102889)
Since we are using opaque pointers now, we don't need to peek through
bitcast on pointers and gep with zero indices.
2024-08-13 22:38:50 +08:00
Yingwei Zheng
62e9f40949
[PatternMatch] Use m_SpecificCmp matchers. NFC. (#100878)
Compile-time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=13996378d81c8fa9a364aeaafd7382abbc1db83a&to=861ffa4ec5f7bde5a194a7715593a1b5359eb581&stat=instructions:u
baseline: 803eaf29267c6aae9162d1a83a4a2ae508b440d3
```
Top 5 improvements:
  stockfish/movegen.ll 2541620819 2538599412 -0.12%
  minetest/profiler.cpp.ll 431724935 431246500 -0.11%
  abc/luckySwap.c.ll 581173720 580581935 -0.10%
  abc/kitTruth.c.ll 2521936288 2519445570 -0.10%
  abc/extraUtilTruth.c.ll 1216674614 1215495502 -0.10%
Top 5 regressions:
  openssl/libcrypto-shlib-sm4.ll 1155054721 1155943201 +0.08%
  openssl/libcrypto-lib-sm4.ll 1155054838 1155943063 +0.08%
  spike/vsm4r_vv.ll 1296430080 1297039258 +0.05%
  spike/vsm4r_vs.ll 1312496906 1313093460 +0.05%
  nuttx/lib_rand48.c.ll 126201233 126246692 +0.04%
Overall: -0.02112308%
```
2024-07-29 10:04:06 +08:00
Kazu Hirata
92f4001906
[Transforms] Use range-based for loops (NFC) (#97576) 2024-07-03 12:53:19 -07:00
Nikita Popov
99d8bc9e76 [InstCombine] Preserve all gep nowrap flags in ptradd canonicalization 2024-07-01 16:46:00 +02:00
Noah Goldstein
2632680006 [InstCombine] Canonicalize (gep <not i8> p, (div exact X, C))
If C % sizeof(gep_element_type) is zero, we can canonicalize to `i8` via:
    `(gep i8 p, (div exact X, C / (sizeof(gep_element_type))))`

Closes #96898
2024-07-01 22:22:35 +08:00
Youngsuk Kim
2051736f7b [llvm][Transforms] Avoid 'raw_string_ostream::str' (NFC)
Since `raw_string_ostream` doesn't own the string buffer, it is
desirable (in terms of memory safety) for users to directly reference
the string buffer rather than use `raw_string_ostream::str()`.

Work towards TODO comment to remove `raw_string_ostream::str()`.
2024-06-30 09:03:29 -05:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
David Green
352a836176
[InstCombine] Canonicalize non-i8 gep of mul to i8 (#96606)
This is a small canonicalization for `gep i32, p, (mul x, C)` -> `gep
i8, p, (mul x, C*4)`, so that the mul can combine both of the constant
multiplications, and we take a small step towards canonicalizing more
geps to i8.

It currently doesn't attempt to check for multiple uses on the mul, but
that should be possible if it sounds better. Let me know what you think
of the idea in general.
2024-06-26 14:25:54 +01:00
Nikita Popov
60984f5be9 [InstCombine] Preserve all gep flags in gep of exact div fold 2024-06-19 14:33:43 +02:00
Nikita Popov
2a1169088c [InstCombine] Preserve all gep flags when emitting offset 2024-06-19 12:36:36 +02:00
Nikita Popov
807222245e [InstCombine] Preserve all gep flags in gep of select fold 2024-06-19 12:31:53 +02:00
Nikita Popov
534e3ad08b [InstCombine] Avoid use of ConstantExpr::getShl()
Use the constant folding API instead. Use ImmConstant to ensure
folding succeeds.
2024-06-18 16:58:11 +02:00
Nikita Popov
9a86d0a6b5 [InstCombine] Prefer source over result element type (NFC)
For single-index GEPs the source and result element types are the
same, but using the source type is semantically more correct.
2024-06-17 11:03:00 +02:00