40621 Commits

Author SHA1 Message Date
Yussur Mustafa Oraji
ded1f3ec96
[TSan] Add option to ignore capturing behavior when instrumenting (#148156)
While not needed for most applications, some tools such as
[MUST](https://www.i12.rwth-aachen.de/cms/i12/forschung/forschungsschwerpunkte/lehrstuhl-fuer-hochleistungsrechnen/~nrbe/must/)
depend on the instrumentation being present.
MUST uses the ThreadSanitizer annotation interface to detect data races
in MPI programs, where the capture tracking is detrimental as it has no
bearing on MPI data races, leading to missed races.
2025-08-06 15:47:33 +02:00
Florian Hahn
e80e7e717e
[VPlan] Use scalar VPPhi instead of VPWidenPHIRecipe in createPlainCFG. (#150847)
The initial VPlan closely reflects the original scalar loop, so using
VPWidenPHIRecipe here is premature. Widened phi recipes should only be
introduced together with other widened recipes.

PR: https://github.com/llvm/llvm-project/pull/150847
2025-08-06 14:43:03 +01:00
Florian Hahn
777c320e6c
[VPlan] Address comments missed in #142309.
Address additional comments from
https://github.com/llvm/llvm-project/pull/142309.
2025-08-06 11:52:08 +01:00
Andrew Rogers
a3c386d241
[llvm] annotate recently added interfaces for DLL export (#152179)
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates symbols that were recently
added to LLVM and fixes incorrectly annotated symbols.

## Background
This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).

## Overview

The bulk of these changes were generated automatically using the
[Interface Definition Scanner (IDS)](https://github.com/compnerd/ids)
tool, followed by formatting with `git clang-format`.

The following manual adjustments were also applied after running IDS:
- Add `LLVM_EXPORT_TEMPLATE` and `LLVM_TEMPLATE_ABI` annotations to
explicitly instantiated instances of `llvm::object::SFrameParser`.

## Validation

On Windows 11:
```
cmake -B build -S llvm -G Ninja -DLLVM_ENABLE_PROJECTS="llvm;clang;clang-tools-extra;lldb;lld" -DLLVM_OPTIMIZED_TABLEGEN=ON -DLLVM_BUILD_LLVM_DYLIB=ON -DLLVM_BUILD_LLVM_DYLIB_VIS=ON -DLLVM_LINK_LLVM_DYLIB=ON -DLLVM_BUILD_TESTS=ON -DCLANG_LINK_CLANG_DYLIB=OFF -DCMAKE_BUILD_TYPE=Release
ninja -C build
```
2025-08-05 23:12:07 -07:00
Mircea Trofin
f675483905
[profcheck] Annotate select instructions (#152171)
For `select`, we don't have the equivalent of the branch probability analysis to offer defaults, so we make up our own and allow their overriding with flags.

Issue #147390
2025-08-06 02:48:50 +02:00
Florian Hahn
d478502a42
[VPlan] Ensure that IV resume phi for epilogue is always first. (NFCI)
Update handling of canonical IV resume phi for the epilogue loop to make
sure the resume phi for the canonical IV is always the first phi in the
scalar preheader.

This makes it easier to retrieve it in preparePlanForEpilogueVectorLoop.

For now, we keep an assert to make sure we use the same resume phi as
before. This will be removed in the future.
2025-08-05 21:06:41 +01:00
Florian Hahn
47258ca470
[VPlan] Use VPPhi instead of dyn_cast + opcode check in isPhi (NFC). 2025-08-05 19:20:12 +01:00
Nikita Popov
0a8ebdb2f0 [MemCpyOpt] Remove handling for lifetime sizes
Split out from #150248:

Since #150944 the size passed to lifetime.start/end is considered
meaningless. The lifetime always applies to the whole alloca.

Accordingly, remove checks of the lifetime size from MemCpyOpt.
2025-08-05 17:22:12 +02:00
Kazu Hirata
908ef45606 [Utils] Fix a warning
This patch fixes:

  llvm/lib/Transforms/Utils/SplitModuleByCategory.cpp:321:14: error:
  moving a temporary object prevents copy elision
  [-Werror,-Wpessimizing-move]
2025-08-05 07:24:10 -07:00
Maksim Sabianin
3f59a22711
[offload][SYCL] Add Module splitting by categories. (#131347)
This patch adds Module splitting by categories. The splitting algorithm
is a necessary step in the SYCL compilation pipeline. It could also be
reused for other heterogeneous targets.

The previous attempt was at #119713. In this patch, `TransformUtils` has
no dependency on "IPO" or on "Printing Passes". The module splitting is
self-contained and does not introduce linking issues.
2025-08-05 14:04:59 +00:00
Luke Lau
94a6cd464e
[VPlan] Expand VPWidenPointerInductionRecipe into separate recipes (#148274)
This is the VPWidenPointerInductionRecipe equivalent of #118638, with
the motivation of allowing us to use the EVL as the induction step.

There is a new VPInstruction added, WidePtrAdd to allow adding the step
vector to the induction phi, since VPInstruction::PtrAdd only handles
scalars or multiple scalar lanes.

Originally this transformation was copied from the original recipe's
execute code, but it's since been simplified by teaching
`unrollWidenInductionByUF` to unroll the recipe, which brings it in line
with VPWidenIntOrFpInductionRecipe.
2025-08-05 16:54:02 +08:00
Florian Hahn
c9dd14d1d4
[VPlan] Compute interleave count for VPlan. (#149702)
Move selectInterleaveCount to LoopVectorizationPlanner and retrieve some
information directly from VPlan. Register pressure was already computed
for a VPlan, and with this patch we now also check for reductions
directly on VPlan, as well as checking how many load and store
operations remain in the loop.

This should be mostly NFC, but we may compute slightly different
interleave counts in some edge cases, e.g. where dead loads
have been removed. This shouldn't happen in practice, and the patch
doesn't cause changes across a large test corpus on AArch64.

Computing the interleave count based on VPlan allows for making better
decisions in presence of VPlan optimizations, for example when
operations on interleave groups are narrowed.

Note that there are a few test changes for tests that were still
checking the legacy cost-model output when it was computed in
selectInterleaveCount.

PR: https://github.com/llvm/llvm-project/pull/149702
2025-08-05 09:42:55 +01:00
Tommy MᶜMichen
155359c1f2
[llvm][sroa] Disable support for invariant.group (#151743)
Resolves #151574.

> SROA pass does not perform aggregate load/store rewriting on a pointer
whose source is a `launder.invariant.group`.
> 
> This causes failed assertion in `AllocaSlices`.
> 
> ```
> void (anonymous namespace)::AllocaSlices::SliceBuilder::visitStoreInst(StoreInst &):
> Assertion `(!SI.isSimple() || ValOp->getType()->isSingleValueType()) &&
>  "All simple FCA stores should have been pre-split"' failed.
> ```

Disables support for `{launder,strip}.invariant.group` intrinsics in
SROA.

Updates SROA test for `invariant.group` support.
2025-08-05 09:59:07 +02:00
Nikita Popov
fb632ed237
[GVN] Handle provenance when propagating assume equality (#151953)
If we have a known `p == p2` equality, we cannot replace `p2` with `p`
unless they are known to have the same provenance. GVN handles this when
propagating equalities from conditions, but not for assumes, as these go
through a different code path for uses in the same block.

Call canReplacePointersInUseIfEqual() before performing the replacement.
This is subject to the usual approximations (e.g. that we always allow
replacement with a dereferenceable constant and null).

This restriction does not appear to have any impact in practice.
2025-08-05 09:18:43 +02:00
Nikita Popov
e044cc50ee
[GVN] Handle not in equality propagation (#151942)
Look through `not` when propagating equalities in GVN. Usually these
will be canonicalized away, but they may be retained due to multi-use or
involvement in logical expressions.

Fixes https://github.com/llvm/llvm-project/issues/143529.
2025-08-05 09:11:24 +02:00
Mel Chen
8761b6cf8f
[VPlan] Use VPTypeAnalysis to get the step type of widen pointer induction (#147925)
This patch uses VPTypeAnalysis to determine its type since the induction
step is not always a live-in value in the VPlan and may be defined by a
recipe.
2025-08-05 09:13:44 +08:00
Florian Hahn
0433e1e15f
[VPlan] Add VPlan::getTrue/getFalse convenience helpers (NFC).
Makes it slightly more convenient to create true/false constants.
2025-08-04 21:04:55 +01:00
Florian Hahn
215e6beae0
[LV] Use MapVector for ScalarCostsTy for deterministic iter order (NFC)
We iterate over the scalar costs of instructions when printing costs, and
currently the iteration order is not deterministic. Currently no tests
check the output with multiple instructions in the map, but those will
come soon.
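
As a rough illustration of why an insertion-ordered map makes the printed output deterministic, here is a minimal sketch of the idea behind `llvm::MapVector`: a key-to-index map paired with a vector that records insertion order. `OrderedMap` and `keysInOrder` are hypothetical names for this sketch and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for an insertion-ordered map: a hash map
// from key to slot, plus a vector that remembers insertion order.
template <typename K, typename V> class OrderedMap {
  std::unordered_map<K, std::size_t> Index; // key -> slot in Entries
  std::vector<std::pair<K, V>> Entries;     // entries in insertion order

public:
  // Returns the value for Key, default-constructing it on first use.
  V &operator[](const K &Key) {
    auto Res = Index.insert({Key, Entries.size()});
    if (Res.second) // newly inserted key: append a slot for it
      Entries.emplace_back(Key, V());
    return Entries[Res.first->second].second;
  }

  // Iteration walks the vector, so the order is the insertion order and does
  // not depend on hashing or pointer values.
  typename std::vector<std::pair<K, V>>::const_iterator begin() const {
    return Entries.begin();
  }
  typename std::vector<std::pair<K, V>>::const_iterator end() const {
    return Entries.end();
  }
};

// Collects the keys in iteration order.
inline std::vector<std::string>
keysInOrder(const OrderedMap<std::string, int> &M) {
  std::vector<std::string> Keys;
  for (const auto &KV : M)
    Keys.push_back(KV.first);
  return Keys;
}
```

Updating an existing key keeps its original position, so repeated runs that insert in the same order print in the same order.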
2025-08-04 19:31:07 +01:00
Alexey Bataev
e27831ff9b [SLP] Fix a check for main/alternate interchanged instruction
If the instruction is checked for matching the main instruction, we need
to check whether the opcode of the main instruction is compatible with
the operands of the instruction. If it is not, we need to check the
alternate instruction and its operands for compatibility and return the
alternate instruction as a match.

Fixes #151699

Fixed check for non-supported binary operations.
2025-08-04 11:20:54 -07:00
Michael Halkenhäuser
70af09e3a1
Revert "[SLP] Fix a check for main/alternate interchanged instruction" (#151997)
This reverts commit 3ee8d047109ea4bb479095f4b153c2120a8d726c.

Revert reason: FAILED build for openmp-offload-amdgpu-runtime-2 
https://lab.llvm.org/buildbot/#/builders/10/builds/10827
2025-08-04 12:57:20 -04:00
Alexey Bataev
3ee8d04710 [SLP] Fix a check for main/alternate interchanged instruction
If the instruction is checked for matching the main instruction, we need
to check whether the opcode of the main instruction is compatible with
the operands of the instruction. If it is not, we need to check the
alternate instruction and its operands for compatibility and return the
alternate instruction as a match.

Fixes #151699
2025-08-04 08:31:35 -07:00
Kazu Hirata
35dd88918f
[llvm] Use llvm::iterator_range::empty (NFC) (#151905) 2025-08-04 07:40:46 -07:00
Andreas Jonson
c6fd3d32c3
[SimplifyCfg] Add nneg to zext for switch to table conversion (#147180) 2025-08-04 16:18:05 +02:00
Simon Pilgrim
88c6448fa2
Revert "[VectorCombine] Shrink loads used in shufflevector rebroadcasts" (#151960)
Reverts llvm/llvm-project#128938 while a crash regression is investigated
2025-08-04 15:03:53 +01:00
Alexey Bataev
7cd1ce3aa0 [SLP]Check vector-like instruction for dominance in copyables
Need to check whether the vector-like instruction is dominated by the
main operation in the copyables, to prevent a broken def-use chain.

Fixes #151456
2025-08-04 06:14:19 -07:00
Paul Walker
1406058cba
[LLVM][InstCombine] Extend masked_gather's demanded elt analysis. (#151732)
Add support for other Constant types for the mask operand.
2025-08-04 14:05:04 +01:00
Paul Walker
04f98889ae
[LLVM][NumericalStabilitySanitizer] Add support for vector ConstantFPs. (#151739) 2025-08-04 13:58:32 +01:00
Paul Walker
fb4a8f67b9
[LLVM][InstCombine] foldICmpEquality: Compare APInt values rather than addresses. (#151726) 2025-08-04 13:54:44 +01:00
Nikita Popov
e833bb0991 [Local] Do not pass Root to replaceDominatedUsesWith (NFC)
Capture it in the lambdas instead.
2025-08-04 14:22:17 +02:00
Florian Hahn
66a8341f6d
[VPlan] Skip disconnected exit blocks in hasEarlyExit. (#151718)
Currently hasEarlyExit returns true if there are multiple exit blocks.
ExitBlocks contains the wrapped original IR exit blocks. Without
checking the predecessors we incorrectly return true for loops with
multiple countable exits that have been vectorized by requiring a
scalar epilogue. In that case, the exit blocks will get disconnected.

Fix this by filtering out disconnected exit blocks.
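
A minimal sketch of the filtering idea, under the assumption that a disconnected exit block is one with no predecessors. `ExitBlock` and `hasEarlyExitSketch` are hypothetical names; the real function may check additional conditions that are not modeled here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a wrapped IR exit block; only the number of
// predecessors matters for this sketch.
struct ExitBlock {
  unsigned NumPredecessors;
};

// Counts only exit blocks that are still connected, then applies the
// "multiple exit blocks" test described in the message.
inline bool hasEarlyExitSketch(const std::vector<ExitBlock> &ExitBlocks) {
  unsigned Connected = 0;
  for (const ExitBlock &EB : ExitBlocks)
    if (EB.NumPredecessors != 0)
      ++Connected;
  return Connected > 1;
}
```

With two exit blocks of which one has lost its predecessors, the sketch returns false, whereas counting all exit blocks would have returned true.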

Currently this should only impact the 'early-exit vectorized' statistic.

PR: https://github.com/llvm/llvm-project/pull/151718
2025-08-04 11:31:00 +01:00
Nikita Popov
4b5b36e5c4 [GVN] Avoid creating lifetime of non-alloca
There is a larger problem here in that we should not be performing
arbitrary pointer replacements for assumes. This is handled for
branches, but assume goes through a different code path.

Fixes https://github.com/llvm/llvm-project/issues/151785.
2025-08-04 12:06:40 +02:00
Leon Clark
1feed444aa
[VectorCombine] Shrink loads used in shufflevector rebroadcasts (#128938)
Attempt to shrink the size of vector loads where only some of the incoming lanes are used for rebroadcasts in shufflevector instructions.

---------

Co-authored-by: Leon Clark <leoclark@amd.com>
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-08-04 10:49:27 +01:00
Nikita Popov
86727fe9a1
[IR] Allow poison argument to lifetime markers (#151148)
This slightly relaxes the invariant established in #149310, by also
allowing the lifetime argument to be poison. This is to support the
typical pattern of RAUWing with poison when removing an instruction.

It's worth noting that this does not require any conservative
assumptions: lifetimes with poison arguments can simply be skipped.

Fixes https://github.com/llvm/llvm-project/issues/151119.
2025-08-04 10:02:04 +02:00
Mircea Trofin
9a60841dc4
[PGO][profcheck] ignore explicitly cold functions (#151778)
There is a case when branch profile metadata is OK to miss, namely, cold functions. The goal of the RFC (see the referenced issue) is to avoid accidental omission (and, at a later date, corruption) of profile metadata. However, asking cold functions to have all their conditional branches marked with "0" probabilities would be overdoing it. We can just ask cold functions to have an explicit 0 entry count.

This patch:
- injects an entry count for functions, unless they have one (synthetic or not)
- if the entry count is 0, doesn't inject, nor does it verify the rest of the metadata
- at verification, if the entry count is missing, it reports an error
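
The three rules above can be sketched roughly as follows. This is an illustration only: `FunctionProfile`, `injectSketch`, `verifySketch`, and the placeholder entry count are all hypothetical and do not reflect the pass's actual data structures or values.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical, simplified view of a function's profile metadata.
struct FunctionProfile {
  std::optional<uint64_t> EntryCount; // empty means "no entry count"
  bool HasBranchWeights = false;      // stands in for the rest of the metadata
};

// Placeholder value for this sketch only; not the value the pass uses.
constexpr uint64_t HypotheticalDefaultEntryCount = 1000;

// Injection: add an entry count unless one exists, and leave explicitly cold
// functions (entry count 0) untouched otherwise.
inline void injectSketch(FunctionProfile &F) {
  if (!F.EntryCount)
    F.EntryCount = HypotheticalDefaultEntryCount;
  if (*F.EntryCount == 0)
    return; // explicitly cold: nothing else is injected
  F.HasBranchWeights = true;
}

// Verification: a missing entry count is an error; cold functions skip the
// remaining checks.
inline bool verifySketch(const FunctionProfile &F) {
  if (!F.EntryCount)
    return false; // reported as an error
  if (*F.EntryCount == 0)
    return true; // rest of the metadata is not verified
  return F.HasBranchWeights;
}
```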

Issue #147390
2025-08-04 03:53:49 +02:00
Austin
c7bacc9f26
[llvm] using wrapper llvm::sort(nfc) (#151000)
using wrapper llvm::sort(nfc)
2025-08-04 09:27:01 +08:00
Kazu Hirata
3549134836
[Vectorize] Remove an unnecessary cast (NFC) (#151850)
getNumElements() already returns unsigned.
2025-08-03 08:44:50 -07:00
Kazu Hirata
c068f8b408
[Scalar] Remove an unnecessary cast (NFC) (#151849)
LoadType is already of Type *.
2025-08-03 08:44:43 -07:00
Florian Hahn
559d1dff89
[VPlan] Materialize BackedgeTakenCount using VPInstructions.
Explicitly compute the backedge-taken count using VPInstruction. This is
needed to model the full skeleton in VPlan.

NFC modulo some instruction re-ordering.
2025-08-03 12:21:28 +01:00
Simon Pilgrim
b983ce8145 [VPlan] handleMaxMinNumReductions - fix gcc Wparentheses warning. NFC. 2025-08-03 11:50:31 +01:00
David Green
d9971be83e
[InstCombine] Make foldCmpLoadFromIndexedGlobal more resilient to non-array geps. (#150639)
My understanding is that gep [n x i8] and gep i8 can be treated
equivalently - the array type conveys no extra information and could be
removed. This goes through foldCmpLoadFromIndexedGlobal and tries to
make it work for non-array gep types, so long as the index type still
matches the array being loaded.
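
To make the claimed equivalence concrete, the byte offsets of the two GEP forms can be compared directly. The helper names below are hypothetical, and the sketch assumes the outer index of the array form is 0.

```cpp
#include <cassert>
#include <cstdint>

// Byte offset of `getelementptr [N x i8], ptr %p, i64 Outer, i64 Inner`:
// the outer index steps over whole arrays of N bytes, the inner index over
// single i8 elements.
inline int64_t arrayGepOffset(int64_t N, int64_t Outer, int64_t Inner) {
  return Outer * N + Inner;
}

// Byte offset of `getelementptr i8, ptr %p, i64 Idx`.
inline int64_t byteGepOffset(int64_t Idx) { return Idx; }
```

For `Outer == 0`, `arrayGepOffset(N, 0, I)` equals `byteGepOffset(I)` for any `N`, which is why the array type carries no extra addressing information in that case.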
2025-08-03 10:19:42 +01:00
Florian Hahn
39c30665e9
[VPlan] Update type of cloned instruction in scalarizeInstruction.
The operands of the replicate recipe may have been narrowed, resulting
in a narrower result type. Update the type of the cloned instruction to
the correct type.

Fixes https://github.com/llvm/llvm-project/issues/151392.
2025-08-02 19:49:59 +01:00
Florian Hahn
08f50e9665
[VPlan] Use vector tripcount if computable when simplifying conds. (#151034)
Update isConditionTrueViaVFAndUF to use the vector trip count if
computable. This is the case when it has been materialized to a
constant. Otherwise fall back to the trip count.

PR: https://github.com/llvm/llvm-project/pull/151034
2025-08-02 16:31:31 +01:00
Ramkumar Ramachandra
af0be76a35
[VPlan] Replace reverse RPOT with PO traversal (NFC) (#151757) 2025-08-02 08:46:27 +01:00
Florian Hahn
eee9755881
[LV] Refine check to find epilogue IV resume value.
Make sure to check that the vector trip count is contained in the list of
incoming values to serve as tie-breaker with phis with all-zero incoming
values.

Fixes https://github.com/llvm/llvm-project/issues/151686.
2025-08-01 20:54:39 +01:00
Teresa Johnson
dc90472532
[MemProf] Ensure node merging happens for newly created nodes (#151593)
We weren't performing node merging on newly created nodes in some cases.
Use a simple iteration over the node and its callers until no more
opportunities are found. I confirmed that for several large codes the
max iterations is 3 (meaning we only needed to do any work on the first
2, as expected). This can potentially be made more elegant in the
future, but it is a simple and effective solution.

Also fix a bug exposed by the test case, getting the function for a call
instruction in the FullLTO handling, using an existing method to look
through aliases if needed.
2025-08-01 12:51:12 -07:00
Florian Hahn
c300a99ea8
[LV] Use MapVector for InstsToScalarize for deterministic iter order (NFC)
We iterate over InstsToScalarize when printing costs, and currently the
iteration order is not deterministic. Currently no tests check the
output with multiple instructions in InstsToScalarize, but those will
come soon.
2025-08-01 14:29:53 +01:00
Florian Hahn
2ae996cbbe
[LAA] Support assumptions in evaluatePtrAddRecAtMaxBTCWillNotWrap (#147047)
This patch extends the logic added in
https://github.com/llvm/llvm-project/pull/128061 to support
dereferenceability information from assumptions as well.

Unfortunately both assumption cache and the dominator tree need to be
threaded through multiple layers to make them available where needed.

PR: https://github.com/llvm/llvm-project/pull/147047
2025-08-01 14:18:07 +01:00
Florian Hahn
7d815c7642
[LV] Remove unused variables after 965231ca0a9a. (NFC)
Clean up unused/dead variables after 965231ca0a9a
(https://github.com/llvm/llvm-project/pull/151311)
2025-08-01 10:12:20 +01:00
Nikita Popov
09dc08b707 [InstCombine] Handle repeated users in foldOpIntoPhi()
If the phi is used multiple times in the same user, it will appear
multiple times in users(), in which case make_early_inc_range()
is insufficient to prevent iterator invalidation.
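
One general way to sidestep this kind of problem is to snapshot the distinct users before mutating anything, so that each user is visited once and no live iterator points into the use list. The sketch below models users as integer ids with one entry per use. `uniqueUsersInOrder` is a hypothetical helper, and this is not necessarily how the commit fixes it.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Input: one entry per use, so a user that uses the value twice appears
// twice. Output: each user once, in order of first appearance.
inline std::vector<int> uniqueUsersInOrder(const std::vector<int> &UsersPerUse) {
  std::vector<int> Unique;
  std::unordered_set<int> Seen;
  for (int User : UsersPerUse)
    if (Seen.insert(User).second) // true only the first time a user is seen
      Unique.push_back(User);
  return Unique;
}
```

Iterating over the returned snapshot is safe even if processing one user removes all of its uses from the original list.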

Fixes the issue reported at:
https://github.com/llvm/llvm-project/pull/151115#issuecomment-3141542852
2025-08-01 11:07:06 +02:00
Kerry McLaughlin
e170676351
[Instcombine] Combine extractelement from a vector_extract at index 0 (#151491)
Extracting any element from a subvector starting at index 0 is
equivalent to extracting from the original vector, i.e.
  extract_elt(vector_extract(x, 0), y) -> extract_elt(x, y)
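
The identity can be checked on a small model where vectors are `std::vector<int>` and the subvector is a prefix. The helper names are hypothetical, and only in-range indices are modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Models a subvector extraction starting at index 0: the first Len elements
// of X. Assumes Len <= X.size().
inline std::vector<int> vectorExtractAtZero(const std::vector<int> &X,
                                            std::size_t Len) {
  return std::vector<int>(X.begin(),
                          X.begin() + static_cast<std::ptrdiff_t>(Len));
}

// Models extractelement for an in-range index.
inline int extractElt(const std::vector<int> &V, std::size_t Idx) {
  return V.at(Idx);
}
```

For every index `y` below the subvector length, `extractElt(vectorExtractAtZero(x, len), y)` and `extractElt(x, y)` read the same element, since a prefix keeps element positions unchanged.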
2025-08-01 09:54:43 +01:00