37616 Commits

Author SHA1 Message Date
Antonio Frighetto
942e872d5b [Instrumentation] Do not request sanitizers for naked functions
Sanitizers instrumentation may be incompatible with naked functions,
which lack of standard prologue/epilogue.
2024-09-17 09:23:39 +02:00
Alexey Bataev
18ef467d73 [SLP]Fix PR108709: postpone buildvector clustered nodes, if required
The "clustered" nodes for buildvector nodes must be postponed in
accordance with the global flag, otherwise it may cause crash because of
the dependency between phi nodes.
2024-09-16 09:53:46 -07:00
Alexey Bataev
f564a48f0e [SLP]Fix PR108700: correctly identify id of the operand node
If the operand node for truncs is not created during construction, but
one of the previous ones is reused instead, need to correctly identify
its index, to correctly emit the code.

Fixes https://github.com/llvm/llvm-project/issues/108700
2024-09-16 09:44:47 -07:00
Kolya Panchenko
b592917eec
[LV] Added verification of EVL recipes (#107630) 2024-09-16 11:58:29 -04:00
Kazu Hirata
f4a3309c9a
[IPO] Avoid repeated hash lookups (NFC) (#108796) 2024-09-16 06:44:34 -07:00
Phoebe Wang
af5a45b34b
[X86,SimplifyCFG] Use passthru to reduce select (#108754) 2024-09-16 20:20:36 +08:00
Nikita Popov
b7e51b4f13
[IPSCCP] Infer attributes on arguments (#107114)
During inter-procedural SCCP, also infer attributes on arguments, not
just return values. This allows other non-interprocedural passes to make
use of the information later.
2024-09-16 10:23:41 +02:00
David Sherwood
b29c5b66fd
[NFC][LoopVectorize] Dont pass LLVMContext to VPTypeAnalysis constructor (#108540)
We already pass a Type object into the VPTypeAnalysis constructor, which
can be used to obtain the context. While in the same area it also made
sense to avoid passing the context into the VPTransformState and
VPCostContext constructors.
2024-09-16 09:12:11 +01:00
Antonio Frighetto
2ae968a0d9
[Instrumentation] Move out to Utils (NFC) (#108532)
Utility functions have been moved out to Utils. Minor opportunity to
drop the header where not needed.
2024-09-15 21:07:40 -07:00
Yingwei Zheng
87663fdab9
[VectorCombine] Don't shrink lshr if the shamt is not less than bitwidth (#108705)
Consider the following case:
```
define <2 x i32> @test(<2 x i64> %vec.ind16, <2 x i32> %broadcast.splat20) {
  %19 = icmp eq <2 x i64> %vec.ind16, zeroinitializer
  %20 = zext <2 x i1> %19 to <2 x i32>
  %21 = lshr <2 x i32> %20, %broadcast.splat20
  ret <2 x i32> %21
}
```
After https://github.com/llvm/llvm-project/pull/104606, we shrink the
lshr into:
```
define <2 x i32> @test(<2 x i64> %vec.ind16, <2 x i32> %broadcast.splat20) {
  %1 = icmp eq <2 x i64> %vec.ind16, zeroinitializer
  %2 = trunc <2 x i32> %broadcast.splat20 to <2 x i1>
  %3 = lshr <2 x i1> %1, %2
  %4 = zext <2 x i1> %3 to <2 x i32>
  ret <2 x i32> %4
}
```
It is incorrect since `lshr i1 X, 1` returns `poison`.
This patch adds additional check on the shamt operand. The lshr will get
shrunk iff we ensure that the shamt is less than bitwidth of the smaller
type. As `computeKnownBits(&I, *DL).countMaxActiveBits() > BW` always
evaluates to true for `lshr(zext(X), Y)`, this check will only apply to
bitwise logical instructions.

Alive2: https://alive2.llvm.org/ce/z/j_RmTa
Fixes https://github.com/llvm/llvm-project/issues/108698.
2024-09-15 18:38:06 +08:00
c8ef
86f0399c1f
[InstCombine] Fold expression using basic properties of floor and ceiling function (#107107)
alive2: ~~https://alive2.llvm.org/ce/z/Ag3Ki7~~
https://alive2.llvm.org/ce/z/ywP5t2
related: #76438

This patch adds the following foldings: `floor(x) <= x --> true` and `x
<= ceil(x) --> true`. We leverage the properties of these math functions
and ensure there is no floating point input of `nan`.

---------

Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>
2024-09-15 14:25:00 +04:00
Florian Hahn
012dbec604
[VPlan] Handle ForceTargetInstructionCost in during precomputeCosts.
Make sure ForceTargetInstruction is respected in precomputeCosts.
2024-09-15 10:53:43 +01:00
Florian Hahn
f66509bf52
[VPlan] Clarify comment for replaceVPBBWithIRVPBB and add assert (NFCI).
Follow-up to suggestion during
https://github.com/llvm/llvm-project/pull/100735.

More specifically
9a40ed0919 (diff-6d0b73adfa9f8465923d2225ab6674ddcdeab71666f7a73dfaec7fa1246b3a1f)
2024-09-14 21:51:19 +01:00
Florian Hahn
cfe3f5fa61
[VPlan] Remove unneeded ExitBB variable after f0c5caa814.
Fix buildbot failures due to an unused variable, e.g.
https://lab.llvm.org/buildbot/#/builders/186/builds/2329
2024-09-14 21:35:45 +01:00
Florian Hahn
f0c5caa814
[VPlan] Add VPIRInstruction, use for exit block live-outs. (#100735)
Add a new VPIRInstruction recipe to wrap existing IR instructions not to
be modified during execution, execept for PHIs. For PHIs, a single
VPValue
operand is allowed, and it is used to add a new incoming value for the
single predecessor VPBB. Expect PHIs, VPIRInstructions cannot have any
operands.

Depends on https://github.com/llvm/llvm-project/pull/100658.

PR: https://github.com/llvm/llvm-project/pull/100735
2024-09-14 21:21:55 +01:00
Mircea Trofin
82266d3a2b
[nfc][ctx_prof] Factor the callsite instrumentation exclusion criteria (#108471)
Reusing this in the logic fetching the instrumentation in `CtxProfAnalysis`.
2024-09-13 21:25:47 -07:00
Teresa Johnson
12d4769cb8
Revert "[MemProf] Streamline and avoid unnecessary context id duplication (#107918)" (#108652)
This reverts commit 524a028f69cdf25503912c396ebda7ebf0065ed2, but
manually so that follow on PR108086 /
ae5f1a78d3a930466f927989faac8e0b9d820a7b
is retained (NFC patch to convert tuple to a struct).
2024-09-13 16:20:43 -07:00
Alexey Bataev
1e3536ef31 [SLP]Fix PR108620: Need to check, if the reduced value was transformed
Before trying to include the scalar into the list of
ExternallyUsedValues, need to check, if it was transformed in previous
iteration and use the transformed value, not the original one, to avoid
compiler crash when building external uses.

Fixes https://github.com/llvm/llvm-project/issues/108620
2024-09-13 15:43:06 -07:00
Felipe de Azevedo Piovezan
ddcc601353
[CoroSplit][DebugInfo] Adjust heuristic for moving DIScope of funclets (#108611)
CoroSplit has a heuristic where the scope line for funclets is adjusted
to match the line of the suspend intrinsic that caused the split. This
is useful as it avoids a jump on the line table from the original
function declaration to the line where the split happens.

However, very often using the line of the split is not ideal: if we can
avoid it, we should not have a line entry for the split location, as
this would cause breakpoints by line to match against two functions: the
funclet before and the funclet after the split.

This patch adjusts the heuristics to look for the first instruction with
a non-zero line number after the split. In other words, this patch makes
breakpoints on `await foo()` lines behave much more like a regular
function call.
2024-09-13 15:25:11 -07:00
Florian Hahn
c3fda44147
[VPlan] Use VPBuilder to create scalar IV steps and derived IV (NFCI).
Extend VPBuilder to allow creating VPDerivedIVRecipe, VPScalarCastRecipe
and VPScalarIVStepsRecipe.

Use them to simplify the code to create scalar IV steps slightly.
2024-09-13 22:19:36 +01:00
vporpo
5130f3236f
[SandboxVec] User-defined pass pipeline (#108625)
This patch adds support for a user-defined pass-pipeline that overrides
the default pipeline of the vectorizer.
This will commonly be used by lit tests.
2024-09-13 13:14:06 -07:00
Ramkumar Ramachandra
75a57edadc
VPlan/Builder: inline VPBuilder::createICmp (NFC) (#105650)
Inline VPBuilder::createICmp in the header, in line with the other
VPBuilder functions.
2024-09-13 20:08:11 +01:00
Volodymyr Vasylkun
21e3a212c5
[InstCombine] Replace an integer comparison of a phi node with multiple ucmp/scmp operands and a constant with phi of individual comparisons of original intrinsic's arguments (#107769)
When we have a `phi` instruction with more than one of its incoming
values being a call to `ucmp` or `scmp`, which is then compared with an
integer constant, we can move the comparison through the `phi` into the
incoming basic blocks because we know that a comparison of `ucmp`/`scmp`
with a constant will be simplified by the next iteration of InstCombine.

There's a high chance that other similar patterns can be identified, in
which case they can be easily handled by the same code by moving the
check for "simplifiable" instructions into a lambda.
2024-09-13 19:50:27 +01:00
Alexey Bataev
c13bf6d4a8 [SLP]Return proper value for phi vectorized node
Should not return the original phi vector instruction, need to return
actual vectorized value as a result.
2024-09-13 11:30:29 -07:00
vporpo
39f2d2f156
[SandboxVec] Boilerplate for vectorization passes (#108603)
This patch implements a new empty pass for the Bottom-up vectorizer and
creates a pass pipeline that includes it.
The SandboxVectorizer LLVM pass runs the Sandbox IR pass pipeline.
2024-09-13 11:22:24 -07:00
Tyler Nowicki
4c040c0275
[Coroutines] Move Shape to its own header (#108242)
* To create custom ABIs plugin libraries need access to CoroShape.
* As a step in enabling plugin libraries, move Shape into its own header
* The header will eventually be moved into include/llvm/Transforms/Coroutines

See RFC for more info:
https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057
2024-09-13 14:11:30 -04:00
Florian Hahn
76fd69be74
[VPlan] Simplify VPBuilder insert point when live outs for FORs.
Simplifies setting the insert point, addressing a TODO.
2024-09-13 13:21:23 +01:00
David Sherwood
f3029b330a
[NFC][LoopVectorize] Avoid passing ScalarEvolution to VPlanTransforms::optimize (#108380)
Whilst trying to write some VPlan unit tests I realised
that we don't need to pass a ScalarEvolution object into
VPlanTransforms::optimize because the only thing we
actually need is a LLVMContext.
2024-09-13 12:09:00 +01:00
Nikita Popov
1c298c9274 [InstCombine] Preserve nuw flags when merging geps
These transforms all perform a variant of (gep (gep p, x), y)
to (gep p, (x + y)). We can preserve both inbounds and nuw
during such transforms (https://alive2.llvm.org/ce/z/Stu4cN), but
not nusw, which would require proving that the new add is nsw.

For the constant offset case, I've conservatively retained the
logic that checks for negative intermediate offsets, though I'm
not sure it's still reachable nowadays.
2024-09-13 11:15:22 +02:00
Igor Kirillov
1b57cbcf25
[VectorCombine] Refactor Insertion Point setting in shrinkType (#108398) 2024-09-13 10:03:31 +01:00
Nikita Popov
cd39242032 [InstCombine] Remove no longer needed constant offset case (NFCI)
Now that we canonicalize constant geps to i8 type, this special
handling should no longer be needed.
2024-09-13 10:15:54 +02:00
Nikita Popov
940f89255e [InstCombine] Do not modify GEP in place
This was modifying the GEP in place, with code to adjust the
inbounds flag. This was correct at the time, but now fails to
account for other GEP flags like nuw, leading to miscompilations.

Remove the special case, and always create a new GEP instruction.
Logic for preserving nuw in the cases where it is valid will be
added in a followup patch.
2024-09-13 10:04:39 +02:00
David Green
c0e308ba3d
[InstCombine] Pass DomTree and DomTreeCacheto LibCallSimplifier (#108446)
This allows any combines to pick up Known states from dominating
conditions.
2024-09-13 08:36:48 +01:00
Florian Hahn
08d294df55
[VPlan] Simplify VPBuilder insert point when adding users in exit block.
Simplifies setting the insert point, addressing a TODO.
2024-09-12 22:47:03 +01:00
Shilei Tian
4808842771
[NFC][Attributor] Use unsigned integer for address space tracking (#108447) 2024-09-12 13:56:21 -07:00
Florian Hahn
71cb7811bb
[LV] Remove stale completeLoopSkeleton (NFCI).
The function has been removed a while ago, also remove the stable
declaration.
2024-09-12 21:55:43 +01:00
Alexey Bataev
5d7cf504ce [SLP]Fix PR108421: Correctly deduce VF from the masks
Need to select the max of CommonMask and V1 Mask size to correctly
perform reshuffling of the vectors, otherwise incorrect result is
generated.

Fixes https://github.com/llvm/llvm-project/issues/108421
2024-09-12 13:43:44 -07:00
Ramkumar Ramachandra
7e9bd12cd9
MemCpyOpt: clarify logic in processStoreOfLoad (NFC) (#108400) 2024-09-12 21:16:43 +01:00
Farzon Lotfi
c05e29bff0
[LegacyPM][DirectX] Add legacy scalarizer back for use in the DirectX backend (#107427)
As discussed in this
[proposal](https://github.com/llvm/wg-hlsl/pull/62/files?short_path=ac6e592#diff-ac6e59276afe8016e307eedc5c835f534c0cb353707760b44df0fa9d905a5cf8).
We had to bring back the legacy pass manager interface for the
scalarizer pass. Two reasons for this:
1. The DirectX backend is still using the legacy pass manager
2. The new PM isn't hooked up in clang yet via `BackendUtil.cpp`'s
`AddEmitPasses` That means even if we add a `buildCodeGenPipeline` we
won't be able to benefit from the new pass manager's scalarizer pass
interface.

The remaining changes are hooking up the scalarizer pass to the DirectX
backend, updating the DirectX test cases,
and allowing the `optdriver` to not block the legacy invocation of the
scalarizer pass.

Future work still needs to be done to allow the scalarizer pass to
handle target specific intrinsics.

closes #105178
2024-09-12 15:53:50 -04:00
Ramkumar Ramachandra
159e5b3fdf
MemCpyOpt: avoid unnecessary getMemorySSA (NFC) (#108405) 2024-09-12 20:35:01 +01:00
Tyler Nowicki
2670565afc
[Coroutines] Move materialization code into its own utils (#108240)
* Move materialization out of CoroFrame to MaterializationUtils.h
* Move spill related utilities that were used by materialization to
SpillUtils
* Move isSuspendBlock (needed by materialization) to CoroInternal

See RFC for more info:
https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057
2024-09-12 14:01:23 -04:00
Tyler Nowicki
0989a775ae
[Coroutines] Verify normalization was not missed (#108096)
* Add asserts to verify normalization of the coroutine happened.
* This will be important when normalization becomes part of an ABI
object.

--- From a previous discussion here
https://github.com/llvm/llvm-project/pull/108076

Normalization performs these important steps:

split around each suspend, this adds BBs before/after each suspend so
each suspend now exists in its own block.
split around coro.end (similar to above)
break critical edges and add single-edge phis in the new blocks (also
removing other single-edge phis).
Each of these things can individually be tested
A) Check that each suspend is the only inst in its BB
B) Check that coro.end is the only inst in its BB
C) Check that each edge of a multi-edge phis is preceded by single-edge
phi in an immediate pred

For 1) and 2) I believe the purpose of the transform is in part for
suspend crossing info's analysis so it can specifically 'mark' the
suspend blocks and identify the end of the coroutine. There are some
existing places within suspend crossing info that visit the CoroSuspends
and CoroEnds so we could check A) and B) there.

For 3) I believe the purpose of this transform is for insertSpills to
work properly. Infact there is already a check for the result of this
transform!

assert(PN->getNumIncomingValues() == 1 &&
           "unexpected number of incoming "
           "values in the PHINode");

I think to verify the result of normalization we just need to add checks
A) and B) to suspend crossing info.
2024-09-12 14:00:49 -04:00
Yuxuan Chen
853bff2122
[Coroutines] properly update CallGraph in CoroSplit (#107935)
Fixes https://github.com/llvm/llvm-project/issues/107139.

We weren't updating the call graph properly in CoroSplit. This crash is
due to the await_suspend() function calling the coroutine, forming a
multi-node SCC. The issue bisected to
https://github.com/llvm/llvm-project/pull/79712 but I think this is red
herring. We haven't been properly updating the call graph.

Added an example of such code as a test case.
2024-09-12 10:45:20 -07:00
Mircea Trofin
885ac29910 [nfc][ctx_prof] Change some internal "set" types
- the set used for targets under a callsite is simpler to use if iterators
  are stable (it gets manipulated during updates)
- the set used to fetch the transitive closure of GUIDs under a node can
  be left as a choice to the user.
2024-09-12 10:34:53 -07:00
Amr Hesham
ef7a847be2
[LoopUnswitch] Remove redundant condition. (NFC) (#107893)
Remove redundant condition from  '!A || (A && B)' to '!A || B' 

Fixes: #99799
2024-09-12 16:40:16 +02:00
Igor Kirillov
958a337132
[VectorCombine] Fix trunc generated between PHINodes (#108228) 2024-09-12 10:20:56 +01:00
Philip Reames
54c6e1c3f5 [SLP] Move a non-power-of-two bailout down slightly
The first part of CheckForShuffledLoads isn't doing any subvector
analysis, so it's perfectly safe for arbitrary VL.
2024-09-11 14:33:45 -07:00
Florian Hahn
ea83e1c05a
[LV] Assign cost to all interleave members when not interleaving.
At the moment, the full cost of all interleave group members is assigned
to the instruction at the group's insert position, even if the decision
was to not form an interleave group.

This can lead to inaccurate cost estimates, e.g. if the instruction at
the insert position is dead. If the decision is to not vectorize but
scalarize or scather/gather, then the cost will be to total cost for all
members. In those cases, assign individual the cost per member, to more
closely reflect to choice per instruction.

This fixes a divergence between legacy and VPlan-based cost model.

Fixes https://github.com/llvm/llvm-project/issues/108098.
2024-09-11 21:04:34 +01:00
Vasileios Porpodas
d5bc1f4a16 [SandboxVec][NFC] Rename a variable 2024-09-11 09:04:59 -07:00
Hari Limaye
7858e14547
[LV] Amend check for IV increments in collectUsersInEntryBlock (#108020)
The check for IV increments in collectUsersInEntryBlock currently
triggers for exit-block PHIs which use the IV start value, resulting in
us failing to add the input value for the middle block to these PHIs.

Fix this by amending the check for IV increments to only include
incoming values that are instructions inside the loop.

Fixes #108004
2024-09-11 16:43:34 +01:00