38597 Commits

Author SHA1 Message Date
Sam Tebbs
795e35a653
Reland "[LoopVectorizer] Add support for partial reductions" with non-phi operand fix. (#121744)
This relands the reverted #120721 with a fix for cases where neither
reduction operand is the reduction phi. Only
63114239cc8d26225a0ef9920baacfc7cc00fc58 and
63114239cc8d26225a0ef9920baacfc7cc00fc58 are new on top of the reverted
PR.

---------

Co-authored-by: Nicholas Guy <nicholas.guy@arm.com>
2025-01-13 11:20:35 +00:00
Mel Chen
56a37a3c76
[SLPVectorizer] Refactor HorizontalReduction::createOp (NFC) (#121549)
This patch simplifies select-based integer min/max reductions by
utilizing `llvm::getMinMaxReductionPredicate`, and generates
intrinsic-based min/max reductions by utilizing
`llvm::getMinMaxReductionIntrinsicOp`.
2025-01-13 16:11:31 +08:00
Florian Hahn
8df64ed777 [LV] Don't consider IV increments uniform if exit value is used outside.
In some cases, there might be a chain of uniform instructions producing
the exit value. To generate correct code in all cases, consider the IV
increment not uniform, if there are users outside the loop.

Instead, let VPlan narrow the IV, if possible using the logic from
3ff1d01985752.

Test case from #122602 verified with Alive2:
    https://alive2.llvm.org/ce/z/bA4EGj

Fixes https://github.com/llvm/llvm-project/issues/122496.
Fixes https://github.com/llvm/llvm-project/issues/122602.
2025-01-12 22:03:21 +00:00
Florian Hahn
3ff1d01985 Recommit "[VPlan] Try to narrow wide and replicating recipes to uniform recipes."
This reverts commit 0ebb3ac7c92c4c1c44e7f3d17832d75ec5a42a67.

Re-applies commit with typos fixed.
2025-01-12 20:10:28 +00:00
Florian Hahn
0ebb3ac7c9 Revert "[VPlan] Try to narrow wide and replicating recipes to uniform recipes."
This reverts commit 1afba19913253dda865a8e57b37b9f4dabead1ac.

Typo breaking the build
2025-01-12 19:37:45 +00:00
Florian Hahn
1afba19913 [VPlan] Try to narrow wide and replicating recipes to uniform recipes.
Use the existing VPlan-based analysis to identify recipes that only have
their first lane demanded and transform them to uniform replicate
recipes. This simplifies the generated code in some places and prepares
for fixing https://github.com/llvm/llvm-project/issues/122496.
2025-01-12 19:32:01 +00:00
Kazu Hirata
43fdd6e81d
[memprof] Migrate away from PointerUnion::is (NFC) (#122622)
Note that PointerUnion::is has been soft deprecated in
PointerUnion.h:

  // FIXME: Replace the uses of is(), get() and dyn_cast() with
  //        isa<T>, cast<T> and the llvm::dyn_cast<T>

In this patch, I'm calling call().getBase() for an instance of
PointerUnion.  call() alone would return an instance of IndexCall,
which wraps PointerUnion.  Note that isa<> cannot directly accept an
instance of IndexCall, at least without defining CastInfo.

I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.
2025-01-12 11:06:42 -08:00
Ruhung
4f7dc1b55a
[InstCombine] Fold (add (add A, 1), (sext (icmp ne A, 0))) to call umax(A, 1) (#122491)
Transform (add (add A, 1), (sext (icmp ne A, 0))) into call umax(A, 1).

Fixes #121853.

Alive2: https://alive2.llvm.org/ce/z/TweTan
2025-01-12 16:51:58 +01:00
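The arithmetic behind the 4f7dc1b55a fold can be checked by brute force. The sketch below is only an illustration on 8-bit values, not the InstCombine code; `original`, `folded` and `fold_holds` are hypothetical names, and the Alive2 link in the commit remains the actual proof.

```python
BITS = 8
MASK = (1 << BITS) - 1

def original(a: int) -> int:
    # (add (add A, 1), (sext (icmp ne A, 0))) with wrapping 8-bit arithmetic
    inc = (a + 1) & MASK
    sext = MASK if a != 0 else 0  # sext of an i1 'true' is all-ones, i.e. -1
    return (inc + sext) & MASK

def folded(a: int) -> int:
    # umax(A, 1) on unsigned values
    return max(a, 1)

def fold_holds() -> bool:
    # Exhaustively compare both forms over every 8-bit input
    return all(original(a) == folded(a) for a in range(1 << BITS))
```

When A is non-zero, the +1 and the -1 cancel and the result is A. When A is zero, only the +1 remains, which is exactly `umax(0, 1)`.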
Florian Hahn
7f59b4e998
[VPlan] Skip non-induction phi recipes in legalizeAndOptimizeInductions.
The body of the loop only applies to wide induction recipes; skip any other
header phi recipes up-front.
2025-01-11 20:33:02 +00:00
Mingjie Xu
876fa60f08
[TySan] Skip instrumentation for function declarations (#122488)
Skip function declarations for instrumentation.

Fixes https://github.com/llvm/llvm-project/issues/122467
2025-01-11 20:15:21 +08:00
Amr Hesham
642e493d4d
[InstCombine] Convert fshl(x, 0, y) to shl(x, and(y, BitWidth - 1)) when BitWidth is pow2 (#122362)
Convert `fshl(x, 0, y)` to `shl(x, and(y, BitWidth - 1))` when BitWidth
is pow2

Alive2 proof: https://alive2.llvm.org/ce/z/3oTEop
Fixes: #122235
2025-01-11 11:48:05 +01:00
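A small model shows why the 642e493d4d fold needs a power-of-two bit width. This is a sketch under the usual funnel-shift semantics (shift amount taken modulo the bit width), with hypothetical helper names; it is not the InstCombine implementation.

```python
def fshl(hi: int, lo: int, amt: int, bits: int) -> int:
    # Funnel shift left: concatenate hi:lo, shift left by amt modulo bits,
    # and keep the upper `bits` bits of the double-width result.
    amt %= bits
    wide = (hi << bits) | lo
    return ((wide << amt) >> bits) & ((1 << bits) - 1)

def folded(x: int, y: int, bits: int) -> int:
    # shl(x, and(y, BitWidth - 1)), truncated to `bits` bits
    return (x << (y & (bits - 1))) & ((1 << bits) - 1)

def fold_holds(bits: int) -> bool:
    # Exhaustive comparison of fshl(x, 0, y) against the folded form
    n = 1 << bits
    return all(fshl(x, 0, y, bits) == folded(x, y, bits)
               for x in range(n) for y in range(n))
```

For a power-of-two width, `y & (bits - 1)` equals `y % bits`, so both forms shift by the same amount. For a width such as 6 the mask no longer computes the remainder, and `fold_holds(6)` is false.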
Ramkumar Ramachandra
f38c40bff3
VT: teach isImpliedCondMatchingOperands about samesign (#122474)
Move isImplied{True,False}ByMatchingCmp from CmpInst to ICmpInst, so
that it can operate on CmpPredicate instead of CmpInst::Predicate, and
teach it about samesign. There are two callers of this function, and we
choose to migrate the one in ValueTracking, namely
isImpliedCondMatchingOperands to CmpPredicate, hence teaching it about
samesign, with visible test impact.
2025-01-11 09:08:57 +00:00
Veera
2d5f07c828
[InstCombine] Fold X udiv Y to X lshr cttz(Y) if Y is a power of 2 (#121386)
Fixes #115767

This PR folds `X udiv Y` to `X lshr cttz(Y)` if Y is a power of two
since bitwise operations are faster than division.

Proof: https://alive2.llvm.org/ce/z/qHmLta
2025-01-11 13:56:13 +08:00
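The equivalence used by 2d5f07c828 is easy to sanity-check exhaustively on a small width. The following sketch assumes 8-bit unsigned values and uses hypothetical helper names; it models the arithmetic only, not the pass.

```python
def cttz(y: int) -> int:
    # Count trailing zeros of a non-zero integer: isolate the lowest set bit,
    # then take its position.
    return (y & -y).bit_length() - 1

def fold_holds(bits: int = 8) -> bool:
    # For every power-of-two divisor Y, X udiv Y must equal X lshr cttz(Y)
    powers_of_two = [1 << k for k in range(bits)]
    return all(x // y == x >> cttz(y)
               for x in range(1 << bits) for y in powers_of_two)
```

Dividing by `2**k` discards the low `k` bits, which is what a logical right shift by `k` does; `cttz` recovers `k` from a power-of-two divisor.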
Vitaly Buka
8af4d206e0
[NFCI][BoundsChecking] Apply nosanitize on local-bounds instrumentation (#122416)
Should be NFCI as we run sanitizer, like msan, before local-bounds.
2025-01-10 18:11:19 -08:00
Vasileios Porpodas
25b90c4ef6 [SandboxVec][SeedCollector][NFC] Remove redundant 'else' and move the assertion within the 'if' 2025-01-10 14:54:44 -08:00
Noah Goldstein
0d9c027ad7 [InstCombine] Make takeLog2 visible in all of InstCombine; NFC
Also add `tryGetLog2` helper that encapsulates the common pattern:

```
if (takeLog2(..., /*DoFold=*/false)) {
    Value * Log2 = takeLog2(..., /*DoFold=*/true);
    ...
}
```

Closes #122498
2025-01-10 16:21:35 -06:00
vporpo
9248428db7
[SandboxVec][DAG][NFC] Refactor setNextNode() and setPrevNode() (#122363)
This patch updates DAG's `setNextNode()` and `setPrevNode()` to update
both nodes of the link.
2025-01-10 13:32:33 -08:00
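The idea of updating both ends of a link in 9248428db7 can be sketched with a toy doubly linked node. The `Node` class and its method names below are hypothetical and much simpler than the SandboxVec DAG nodes; the sketch does not detach previous neighbours.

```python
class Node:
    """Toy doubly linked node; not the real DAG node type."""

    def __init__(self, name: str):
        self.name = name
        self.prev = None
        self.next = None

    def set_next(self, other: "Node | None") -> None:
        # Update this node's forward link and the back link of the new successor
        self.next = other
        if other is not None:
            other.prev = self

    def set_prev(self, other: "Node | None") -> None:
        # Update this node's back link and the forward link of the new predecessor
        self.prev = other
        if other is not None:
            other.next = self
```

With setters shaped like this, a caller makes one call per link instead of two, so the pair of pointers cannot be left half-updated.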
Han-Kuan Chen
35e76b6a4f Revert "[SLP] NFC. Replace MainOp and AltOp in TreeEntry with InstructionsState. (#120198)"
This reverts commit f3d6cdc5aebafac3961d4fccbd2ca0e302c6082c.
2025-01-10 10:09:54 -08:00
Alexey Bataev
681c83a2f9 [SLP]Fix mask generation after cost estimation
When estimating the cost of entries shuffles for buildvectors, need to
rebuild original mask, not a generated submask, used for subregisters
analysis.

Fixes #122430
2025-01-10 09:32:35 -08:00
Alex MacLean
986f2ac48f
[SLPVectorizer] minor tweaks around lambdas for compatibility with older compilers (#122348)
Older versions of MSVC do not have great lambda support and are not able
to handle uses of class data or lambdas with implicit return types in
some cases. These minor changes improve the source's compatibility with
older MSVC and don't hurt readability either.
2025-01-10 09:18:28 -08:00
Alexey Bataev
3c9c94a24f Revert "[SLP]Fix mask generation after cost estimation"
This reverts commit 547ba9730bf05df3383150f730a689f2c8336206 to fix
buildbots reported in
https://lab.llvm.org/buildbot/#/builders/123/builds/11370, https://lab.llvm.org/buildbot/#/builders/133/builds/9492
2025-01-10 08:46:42 -08:00
Alexey Bataev
547ba9730b [SLP]Fix mask generation after cost estimation
When estimating the cost of entries shuffles for buildvectors, need to
rebuild original mask, not a generated submask, used for subregisters
analysis.

Fixes #122430
2025-01-10 08:17:56 -08:00
Nikita Popov
c39500f88c Revert "[GVN] MemorySSA for GVN: add optional AllowMemorySSA"
This reverts commit eb63cd62a4a1907dbd58f12660efd8244e7d81e9.

This changes the preservation behavior for MSSA when the new flag
is not enabled.
2025-01-10 12:57:00 +01:00
Momchil Velikov
eb63cd62a4 [GVN] MemorySSA for GVN: add optional AllowMemorySSA
Preparatory work to migrate from MemoryDependenceAnalysis
towards MemorySSA in GVN.

Co-authored-by: Antonio Frighetto <me@antoniofrighetto.com>
2025-01-10 10:43:12 +01:00
Mel Chen
e0f14e11c7
[SLPVectorizer] Refine the scope of RdxOpcode in HorizontalReduction::createOp (NFC) (#122239)
This patch is one part of unifying IAnyOf and FAnyOf reduction. #118393
The related patch is #118777.
2025-01-10 16:01:36 +08:00
Han-Kuan Chen
f3d6cdc5ae [SLP] NFC. Replace MainOp and AltOp in TreeEntry with InstructionsState. (#120198)
Add TreeEntry::hasState.
Add assert for getTreeEntry.
Remove the OpValue parameter from the canReuseExtract function.
Remove the Opcode parameter from the ComputeMaxBitWidth lambda function.
2025-01-09 23:41:52 -08:00
Vitaly Buka
4c8fdc2954
[nfc][BoundsChecking] Rename BoundsCheckingOptions into Options (#122359) 2025-01-09 20:38:13 -08:00
Vitaly Buka
9c2de994a1
[nfc][BoundsChecking] Refactor BoundsCheckingOptions (#122346)
Remove ReportingMode and ReportingOpts.
2025-01-09 20:19:01 -08:00
Han-Kuan Chen
5454ac28b3 Revert "[SLP] NFC. Replace MainOp and AltOp in TreeEntry with InstructionsState. (#120198)"
This reverts commit 760f550de25792db83cd39c88ef57ab6d80a41a0.
2025-01-09 18:41:47 -08:00
Han-Kuan Chen
36b423e0f8
[SLP] NFC. Refactor getSameOpcode and reduce for loop iterations. (#122241)
Replace Cnt and AltIndex with MainOp and AltOp.
Reduce the number of iterations in the for loop.
2025-01-10 09:06:07 +08:00
Han-Kuan Chen
760f550de2
[SLP] NFC. Replace MainOp and AltOp in TreeEntry with InstructionsState. (#120198)
Add TreeEntry::hasState.
Add assert for getTreeEntry.
Remove the OpValue parameter from the canReuseExtract function.
Remove the Opcode parameter from the ComputeMaxBitWidth lambda function.
2025-01-10 09:05:39 +08:00
Florian Hahn
7ffb691595
[VPlan] Remove dead ToRemove (NFC). 2025-01-09 22:02:32 +00:00
Thurston Dang
4f42e16516
[hwasan] Omit tag check for null pointers (#122206)
If the pointer to be checked is statically known to be zero, the tag
check will always pass since:
1) the tag is zero
2) shadow memory for address 0 is initialized to 0 and never updated.
We can therefore elide the tag check.

We perform the elision in two places:
1) the HWASan pass
2) when lowering the CHECK_MEMACCESS intrinsic. Conceivably, the HWASan
pass may encounter a "cannot currently statically prove to be null"
pointer (and is therefore unable to omit the intrinsic) that later
optimization passes convert into a statically known-null pointer. As a
last line of defense, we perform elision here too.

This also updates the tests from
https://github.com/llvm/llvm-project/pull/122186
2025-01-09 13:48:26 -08:00
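The reasoning in 4f42e16516 can be illustrated with a heavily simplified model of a tag check. The sketch assumes the tag sits in the top byte of a 64-bit pointer and that one shadow byte covers a 16-byte granule; the dictionary-based shadow and the function name are hypothetical and do not reflect the HWASan runtime.

```python
TAG_SHIFT = 56      # assumption: tag stored in the top byte of the pointer
GRANULE_SHIFT = 4   # assumption: one shadow byte per 16-byte granule

def tag_check_passes(ptr: int, shadow: dict) -> bool:
    # Compare the pointer's tag against the shadow tag of the addressed granule.
    ptr_tag = (ptr >> TAG_SHIFT) & 0xFF
    addr = ptr & ((1 << TAG_SHIFT) - 1)
    # Untouched shadow is modelled as 0, matching "initialized to 0"
    mem_tag = shadow.get(addr >> GRANULE_SHIFT, 0)
    return ptr_tag == mem_tag
```

For a null pointer both the pointer tag and the shadow tag for address 0 are zero in this model, so the check can never fail and is safe to omit.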
Teresa Johnson
3055e86c71
[MemProf] Disable cloning of callsites in recursive cycles by default (#122354)
This disables the support added in PR121985 by default while we
investigate a compile time crash.
2025-01-09 12:01:43 -08:00
vporpo
6312beef78
[SandboxVec][BottomUpVec] Use SeedCollector and slice seeds (#120826)
With this patch we switch from the temporary dummy seeds to actual seeds
provided by the seed collector.
The seeds get sliced and each slice is used as the starting point for
vectorization.
2025-01-09 11:53:48 -08:00
Alexey Bataev
5ff36748cf [SLP]Fix mask processing for reused gathered scalars
Need to sync the mask between cost and actual emission to avoid bugs in
mask calculation

Fixes #122324
2025-01-09 11:24:48 -08:00
Florian Hahn
b0697dc1de
[LV] Only check isVectorizableEarlyExitLoop with multiple exits. (#121994)
Currently we emit early-exit related debug messages/remarks even when
there is a single exit. Update to only check isVectorizableEarlyExitLoop
if there isn't a single exit block.

PR: https://github.com/llvm/llvm-project/pull/121994
2025-01-09 12:05:19 +00:00
Nikita Popov
dcdf44aca7
[InstCombine] Remove foldSelectICmpEq() fold (#122098)
This fold matches complex patterns, for which we have no proof of
real-world relevance, and which does not actually handle the originally
motivating cases from https://github.com/llvm/llvm-project/issues/71792
either.

In https://github.com/llvm/llvm-project/pull/121708 and
https://github.com/llvm/llvm-project/pull/121753 we have handled some
simpler variants by extending existing folds.

I propose to remove this code until we have evidence that it is useful
for something.
2025-01-09 12:33:01 +01:00
Sergio Afonso
b79ed8729b
[OpenMP][OMPIRBuilder] Handle non-failing calls properly (#115863)
The preprocessor definition used to enable asserts and the one that
`llvm::Error` and `llvm::Expected` use to ensure all created instances are
checked are not the same. By making these checks inside of an `assert` in cases
where errors are not expected, certain build configurations would trigger
runtime failures (e.g. `-DLLVM_ENABLE_ASSERTIONS=OFF
-DLLVM_UNREACHABLE_OPTIMIZE=ON`).

The `llvm::cantFail()` function, which was intended for this use case, is used
by this patch in place of `assert` to prevent these runtime failures. In tests,
new preprocessor definitions based on `ASSERT_THAT_EXPECTED` and
`EXPECT_THAT_EXPECTED` are used instead, to avoid silent failures in release
builds.
2025-01-09 10:28:16 +00:00
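A loose analogy for the hazard described in b79ed8729b exists in Python, where `assert` statements are stripped entirely under `python -O`, much like `assert` under `NDEBUG`. The sketch below is not LLVM's API: `Expected`, `cant_fail` and `build_region` are hypothetical stand-ins that only illustrate why the check should live outside the assertion.

```python
class Expected:
    """Hypothetical result wrapper that carries either a value or an error."""

    def __init__(self, value=None, error=None):
        self.value = value
        self.error = error

def cant_fail(result: Expected):
    # Runs in every build mode, unlike an assert, and unwraps the value
    if result.error is not None:
        raise RuntimeError(f"unexpected failure: {result.error}")
    return result.value

def build_region() -> Expected:
    # Hypothetical operation that is known not to fail at this call site
    return Expected(value="region")

# Fragile form: under `python -O` the whole statement, call included, vanishes.
#   assert build_region().error is None
# Robust form: the call and the check always execute.
region = cant_fail(build_region())
```

The helper keeps the "this cannot fail here" intent explicit while ensuring the result is always inspected, whatever the assertion settings are.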
Benjamin Maxwell
f88ef1bd1b
[LV] Teach LoopVectorizationLegality about struct vector calls (#119221)
This is a split-off from #109833 and only adds code relating to checking
if a struct-returning call can be vectorized.

This initial patch only allows the case where all users of the struct
return are `extractvalue` operations that can be widened.

```
%call = tail call { float, float } @foo(float %in_val)
%extract_a = extractvalue { float, float } %call, 0
%extract_b = extractvalue { float, float } %call, 1
```

Note: The tests require the VFABI changes from #119000 to pass.
2025-01-09 09:27:29 +00:00
Nikita Popov
71f7b972c3
[Local] Make combineAAMetadata() more principled (#122091)
This moves combineAAMetadata() into Local and implements it via a new
AAOnly flag, which will intersect only AA metadata and keep other known
metadata.

The existing KnownIDs list is dropped, because it is redundant with the
switch in combineMetadata(), which already drops unknown metadata.

I tried a few variants of this, and ultimately went with the AAOnly flag
because this way we make an explicit choice for each metadata kind
supported by combineMetadata(), and ignoring the flag gives you
conservatively correct behavior.

I checked that the memcpy tests still pass if we adjust the logic for
MD_memprof/MD_callsite to drop the metadata instead of arbitrarily
picking one.

Fixes https://github.com/llvm/llvm-project/issues/121495.
2025-01-09 09:34:46 +01:00
Yingwei Zheng
d80bdf7261
[IRBuilder] Add a helper function to intersect FMFs from two instructions (#122059)
Address review comment in
https://github.com/llvm/llvm-project/pull/121899#discussion_r1905765776
2025-01-09 14:36:42 +08:00
Yingwei Zheng
b8337dc4b2
[InstCombine] Handle commuted patterns in foldBinOpShiftWithShift (#122126)
Closes https://github.com/llvm/llvm-project/issues/121775.
2025-01-09 14:36:17 +08:00
Akshat Oke
f6c76d5180
[PM] Remove is_analysis label for LoopSimplify (#121433)
This reverts part of the changes in #118779
2025-01-09 10:11:14 +05:30
Alexey Bataev
5b76a2e51b [SLP]Correctly calculate mask for the inserted vector 2025-01-08 15:18:06 -08:00
Alexey Bataev
0d921f96d4 [SLP][NFC]Introduce and use createInsertVector helper function, NFC 2025-01-08 14:26:13 -08:00
David Green
676c641718
[VectorCombine] Use getInstructionCost to cost Shuffle. (#122068)
This allows it to produce a more accurate cost for the shuffle, using
the more accurate calls to getShuffleCost in getInstructionCost. It
helps fix some of the regressions from vector combine a little while
ago, now that we have better subvector extract costs.
2025-01-08 20:48:40 +00:00
Andreas Jonson
d4182f1b56
[InstCombine] move foldAndOrOfICmpsOfAndWithPow2 into foldLogOpOfMaskedICmps (#121970) 2025-01-08 18:04:38 +01:00
Alexey Bataev
1160994602 [SLP]Fix a crash for very long GEP chains
Need to check if the GEP bases are equal and return false early. Also,
need to return false if the lookup is too deep, considering bases equal
too. Fixes a crash in the assertion.
2025-01-08 06:47:41 -08:00
Yingwei Zheng
03e7862962
[ValueTracking] Move getFlippedStrictnessPredicateAndConstant into ValueTracking. NFC. (#122064)
Needed by https://github.com/llvm/llvm-project/pull/121958.
2025-01-08 20:02:49 +08:00