7973 Commits

Author SHA1 Message Date
Andreas Jonson
330a589450
[PredicateInfo] Handle trunc nuw i1 condition. (#152988)
proof: https://alive2.llvm.org/ce/z/mxtn4L
2025-08-11 13:00:54 +02:00
hanbeom
a750fcb52b
[GVN] Check IndirectBr in Predecessor Terminators (#151188)
Critical edges with an IndirectBr terminator cannot be split. 
Add a check it to prevent assertion failures.

Fixes: #150229
2025-08-11 09:25:52 +02:00
Nikita Popov
35bad229c1
[PredicateInfo] Use bitcast instead of ssa.copy (#151174)
PredicateInfo needs some no-op to which the predicate can be attached.
Currently this is an ssa.copy intrinsic. This PR replaces it with a
no-op bitcast.
    
Using a bitcast is more efficient because we don't have the overhead of
an overloaded intrinsic. It also makes things slightly simpler overall.
2025-08-11 09:25:01 +02:00
Nikita Popov
c23b4fbdbb
[IR] Remove size argument from lifetime intrinsics (#150248)
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.

This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
2025-08-08 11:09:34 +02:00
Matt Arsenault
1110e2ff9f
InlineFunction: Split inlining into predicate and apply functions (#134213)
This is to support a new inline function reduction in llvm-reduce,
which should pre-filter callsites that are not eligible for inlining.

This code was mostly structured as a match and apply, with a few
exceptions. The ugliest piece is for propagating and verifying
compatible
getGC and personalities. Also collection of EHPad and the convergence
token
to use are now cached in InlineFunctionInfo.

I was initially confused by the split between the checks performed here
and isInlineViable, so better document how this system is supposed to
work.
It turns out this split does make sense, in that isInlineViable checks
if it's possible based on the callee content and the ultimate inline
depended on the callsite context. I think more renames of these
functions
would help, and isInlineViable should probably move out of InlineCost to
be
with these transfoms.
2025-08-07 16:13:36 +09:00
Mircea Trofin
f675483905
[profcheck] Annotate select instructions (#152171)
For `select`, we don't have the equivalent of the branch probability analysis to offer defaults, so we make up our own and allow their overriding with flags.

Issue #147390
2025-08-06 02:48:50 +02:00
Kazu Hirata
908ef45606 [Utils] Fix a warning
This patch fixes:

  llvm/lib/Transforms/Utils/SplitModuleByCategory.cpp:321:14: error:
  moving a temporary object prevents copy elision
  [-Werror,-Wpessimizing-move]
2025-08-05 07:24:10 -07:00
Maksim Sabianin
3f59a22711
[offload][SYCL] Add Module splitting by categories. (#131347)
This patch adds Module splitting by categories. The splitting algorithm
is the necessary step in the SYCL compilation pipeline. Also it could be
reused for other heterogenous targets.

The previous attempt was at #119713. In this patch there is no
dependency in `TransformUtils` on "IPO" and on "Printing Passes". In
this patch a module splitting is self-contained and it doesn't introduce
linking issues.
2025-08-05 14:04:59 +00:00
Kazu Hirata
35dd88918f
[llvm] Use llvm::iterator_range::empty (NFC) (#151905) 2025-08-04 07:40:46 -07:00
Andreas Jonson
c6fd3d32c3
[SimplifyCfg] Add nneg to zext for switch to table conversion (#147180) 2025-08-04 16:18:05 +02:00
Nikita Popov
e833bb0991 [Local] Do not pass Root to replaceDominatedUsesWith (NFC)
Capture it in the lambdas instead.
2025-08-04 14:22:17 +02:00
Nikita Popov
86727fe9a1
[IR] Allow poison argument to lifetime markers (#151148)
This slightly relaxes the invariant established in #149310, by also
allowing the lifetime argument to be poison. This is to support the
typical pattern of RAUWing with poison when removing an instruction.

It's worth noting that this does not require any conservative
assumptions, lifetimes with poison arguments can simply be skipped.

Fixes https://github.com/llvm/llvm-project/issues/151119.
2025-08-04 10:02:04 +02:00
Mircea Trofin
9a60841dc4
[PGO][profcheck] ignore explicitly cold functions (#151778)
There is a case when branch profile metadata is OK to miss, namely, cold functions. The goal of the RFC (see the referenced issue) is to avoid accidental omission (and, at a later date, corruption) of profile metadata. However, asking cold functions to have all their conditional branches marked with "0" probabilities would be overdoing it. We can just ask cold functions to have an explicit 0 entry count.

This patch:
- injects an entry count for functions, unless they have one (synthetic or not)
- if the entry count is 0, doesn't inject, nor does it verify the rest of the metadata
- at verification, if the entry count is missing, it reports an error

Issue #147390
2025-08-04 03:53:49 +02:00
Joel E. Denny
37e03b56b8
Revert "[PGO] Add llvm.loop.estimated_trip_count metadata" (#151585)
Reverts llvm/llvm-project#148758

[As
requested.](https://github.com/llvm/llvm-project/pull/148758#pullrequestreview-3076627201)
2025-07-31 15:56:31 -04:00
Joel E. Denny
a85c725952 Revert "[Utils] Fix a warning"
This reverts commit 3a18fe33f0763cd9276c99c276448412100f6270.

So that we can revert PR #148758.
2025-07-31 15:54:01 -04:00
Kazu Hirata
3a18fe33f0 [Utils] Fix a warning
This patch fixes:

  llvm/lib/Transforms/Utils/LoopUtils.cpp:818:28: error: unused
  function 'operator<<' [-Werror,-Wunused-function]
2025-07-31 11:24:33 -07:00
Joel E. Denny
f7b65011de
[PGO] Add llvm.loop.estimated_trip_count metadata (#148758)
This patch implements the `llvm.loop.estimated_trip_count` metadata
discussed in [[RFC] Fix Loop Transformations to Preserve Block
Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785).
As [suggested in the RFC
comments](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785/4),
it adds the new metadata to all loops at the time of profile ingestion
and estimates each trip count from the loop's `branch_weights` metadata.
As [suggested in the PR #128785
review](https://github.com/llvm/llvm-project/pull/128785#discussion_r2151091036),
it does so via a new `PGOEstimateTripCountsPass` pass, which creates the
new metadata for each loop but omits the value if it cannot estimate a
trip count due to the loop's form.

An important observation not previously discussed is that
`PGOEstimateTripCountsPass` *often* cannot estimate a loop's trip count,
but later passes can sometimes transform the loop in a way that makes it
possible. Currently, such passes do not necessarily update the metadata,
but eventually that should be fixed. Until then, if the new metadata has
no value, `llvm::getLoopEstimatedTripCount` disregards it and tries
again to estimate the trip count from the loop's current
`branch_weights` metadata.
2025-07-31 12:28:25 -04:00
Florian Hahn
99d70e09a9
[SCEV] Allow adds of constants in tryToReuseLCSSAPhi. (#150693)
Update the logic added in
https://github.com/llvm/llvm-project/pull/147824 to also allow adds of
constants. There are a number of cases where this can help remove
redundant phis and replace some computation with a ptrtoint (which
likely is free in the backend).

PR: https://github.com/llvm/llvm-project/pull/150693
2025-07-31 16:33:25 +01:00
LU-JOHN
a757f23404
[SimplifyCFG] Extend jump-threading to allow live local defs (#135079)
Extend jump-threading to allow local defs that are live outside of the
threaded block. Allow threading to destinations where the local defs are
not live.

---------

Signed-off-by: John Lu <John.Lu@amd.com>
2025-07-31 09:44:14 -04:00
Nikita Popov
fa6965f722 [SCCP] Extract PredicateInfo handling into separate method (NFC) 2025-07-29 16:36:33 +02:00
Ellis Hoag
819f020b28
Use F.hasOptSize() instead of checking optsize directly (#147348) 2025-07-28 08:38:52 -07:00
Florian Hahn
f9f68af4b8
[SCEV] Make sure LCSSA is preserved when re-using phi if needed.
If we insert a new add instruction, it may introduce a new use outside
the loop that contains the phi node we re-use. Use fixupLCSSAFormFor to
fix LCSSA form, if needed.

This fixes a crash reported in
https://github.com/llvm/llvm-project/pull/147824#issuecomment-3124670997.
2025-07-28 16:24:46 +01:00
Florian Hahn
e21ee41be4
[SCEV] Try to re-use pointer LCSSA phis when expanding SCEVs. (#147824)
Generalize the code added in
https://github.com/llvm/llvm-project/pull/147214 to also support
re-using pointer LCSSA phis when expanding SCEVs with AddRecs.

A common source of integer AddRecs with pointer bases are runtime checks
emitted by LV based on the distance between 2 pointer AddRecs.

This improves codegen in some cases when vectorizing and prevents
regressions with https://github.com/llvm/llvm-project/pull/142309, which
turns some phis into single-entry ones, which SCEV will look through
now (and expand the whole AddRec), whereas before it would have to treat
the LCSSA phi as SCEVUnknown.

Compile-time impact neutral:
https://llvm-compile-time-tracker.com/compare.php?from=fd5fc76c91538871771be2c3be2ca3a5f2dcac31&to=ca5fc2b3d8e6efc09f1624a17fdbfbe909f14eb4&stat=instructions:u

PR: https://github.com/llvm/llvm-project/pull/147824
2025-07-25 15:29:40 +01:00
Kazu Hirata
3e53d4d386
[llvm] Remove unused includes (NFC) (#150265)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-23 15:18:46 -07:00
Mircea Trofin
df2d2d125b
[PGO] Add ProfileInjector and ProfileVerifier passes (#147388)
Adding 2 passes, one to inject `MD_prof` and one to check its presence. A subsequent patch will add these (similar to debugify) to `opt` (and, eventually, a variant of this, to `llc`)

Tracking issue: #147390
2025-07-23 21:34:58 +02:00
Nikita Popov
bdd638a897 [Local] Remove handling for lifetime intrinsic on non-alloca (NFC)
After #149310 this is guaranteed to be an alloca.
2025-07-23 14:21:22 +02:00
Nikita Popov
b59aaf7da7
[Sanitizers] Remove handling for lifetimes on non-alloca insts (NFC) (#149994)
After #149310 the pointer argument of lifetime.start/lifetime.end is
guaranteed to be an alloca, so we don't need to go through
findAllocaForValue() anymore, and don't have to have special handling
for the case where it fails.
2025-07-23 09:48:32 +02:00
Nikita Popov
307256ecbd
[GVNSink] Do not sink lifetimes of different allocas (#149818)
This was always undesirable, and after #149310 it is illegal and will
result in a verifier error.

Fix this by moving SimplifyCFG's check for this into
canReplaceOperandWithVariable(), so it's shared with GVNSink.
2025-07-22 09:44:03 +02:00
Jeremy Morse
c9ceb9b75f
[DebugInfo] Remove intrinsic-flavours of findDbgUsers (#149816)
This is one of the final remaining debug-intrinsic specific codepaths
out there, and pieces of cross-LLVM infrastructure to do with debug
intrinsics.
2025-07-21 17:49:25 +01:00
Yingwei Zheng
9e587ce6f0
[SCCP] Simplify [us]cmp(X, Y) into X - Y (#144717)
If the difference between [us]cmp's operands is not greater than 1, we
can simplify it into `X - Y`.
Alive2: https://alive2.llvm.org/ce/z/JS55so
llvm-opt-benchmark diff:
https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2464/files
2025-07-20 15:01:44 +08:00
Prabhu Rajasekaran
921c6dbeca
[llvm] Introduce callee_type metadata
Introduce `callee_type` metadata which will be attached to the indirect
call instructions.

The `callee_type` metadata will be used to generate `.callgraph` section
described in this RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Reviewers: morehouse, petrhosek, nikic, ilovepi

Reviewed By: nikic, ilovepi

Pull Request: https://github.com/llvm/llvm-project/pull/87573
2025-07-18 14:40:54 -07:00
Florian Hahn
004c67ea25
[LV] Vectorize maxnum/minnum w/o fast-math flags. (#148239)
Update LV to vectorize maxnum/minnum reductions without fast-math flags,
by adding an extra check in the loop if any inputs to maxnum/minnum are
NaN, due to maxnum/minnum behavior w.r.t to signaling NaNs. Signed-zeros 
are already handled consistently by maxnum/minnum.

If any input is NaN,
 *exit the vector loop,
 *compute the reduction result up to the vector iteration that contained
   NaN inputs and
 * resume in the scalar loop


New recurrence kinds are added for reductions using maxnum/minnum
without fast-math flags.

PR: https://github.com/llvm/llvm-project/pull/148239
2025-07-18 21:58:19 +01:00
Nikita Popov
0121314135 [MemoryTaggingSupport] Remove unnecessary bitcast (NFC)
As the comment indicates, this is no longer necessary with
opaque pointers.
2025-07-18 18:49:33 +02:00
Jeremy Morse
c9d8b68676
[DebugInfo] Suppress lots of users of DbgValueInst (#149476)
This is another prune of dead code -- we never generate debug intrinsics
nowadays, therefore there's no need for these codepaths to run.

---------

Co-authored-by: Nikita Popov <github@npopov.com>
2025-07-18 11:31:52 +01:00
Jeremy Morse
2a1869b981
[DebugInfo] Shave even more users of DbgVariableIntrinsic from LLVM (#149136)
At this stage I'm just opportunistically deleting any code using
debug-intrinsic types, largely adjacent to calls to findDbgUsers. I'll
get to deleting that in probably one or more two commits.
2025-07-18 08:25:10 +01:00
Antonio Frighetto
c435cd1730 [SimplifyCFG] Cache unique predecessors in simplifyDuplicateSwitchArms
Avoid repeatedly querying `getUniquePredecessor` for already-visited
switch successors so as not to incur quadratic runtime.

Fixes: https://github.com/llvm/llvm-project/issues/147239.
2025-07-18 08:33:42 +02:00
Florian Hahn
46357438ba
[SCEV] Try to re-use existing LCSSA phis when expanding SCEVAddRecExpr. (#147214)
If an AddRec is expanded outside a loop with a single exit block, check
if any of the (lcssa) phi nodes in the exit block match the AddRec. If
that's the case, simply use the existing lcssa phi.

This can reduce the number of instruction created for SCEV expansions,
mainly for runtime checks generated by the loop vectorizer.

Compile-time impact should be mostly neutral

https://llvm-compile-time-tracker.com/compare.php?from=48c7a3187f9831304a38df9bdb3b4d5bf6b6b1a2&to=cf9d039a7b0db5d0d912e0e2c01b19c2a653273a&stat=instructions:u

PR: https://github.com/llvm/llvm-project/pull/147214
2025-07-17 15:47:54 +01:00
Jeremy Morse
7eb65f470c [DebugInfo] Delete a now-unused function after 5328c732a4770 2025-07-16 15:45:36 +01:00
Jeremy Morse
5328c732a4
[DebugInfo] Strip more debug-intrinsic code from local utils (#149037)
SROA and a few other facilities use generic-lambdas and some overloaded
functions to deal with both intrinsics and debug-records at the same time.
As part of stripping out intrinsic support, delete a swathe of this code
from things in the Utils directory.

This is a large diff, but is mostly about removing functions that were
duplicated during the migration to debug records. I've taken a few
opportunities to replace comments about "intrinsics" with "records",
and replace generic lambdas with plain lambdas (I believe this makes
it more readable).

All of this is chipping away at intrinsic-specific code until we get to
removing parts of findDbgUsers, which is the final boss -- we can't
remove that until almost everything else is gone.
2025-07-16 14:13:53 +01:00
Jeremy Morse
57a5f9c47e
[DebugInfo][RemoveDIs] Suppress getNextNonDebugInfoInstruction (#144383)
There are no longer debug-info instructions, thus we don't need this
skipping. Horray!
2025-07-15 15:34:10 +01:00
Kunqiu Chen
a6e1700fa6
[Utils][Local] Preserve !nosanitize in combineMetadata when merging instructions (#148376)
`combineMetadata` helper currently drops `!nosanitize` metadata when
merging two instructions, even if both originally carried `!nosanitize`.

This is problematic because `!nosanitize` is a key mechanism used by
sanitizer (e.g., ASan) to suppress instrumentation. Removing it can lead
to unintended sanitizer behavior.

This patch adds `nosanitize` to the whitelist in combineMetadata,
preserving it only if both instructions carry `!nosanitize`; otherwise,
it is dropped. This patch also adds corresponding tests in a test file
and regenerates it.

---
### Details

**Example (see [Godbolt](https://godbolt.org/z/83P5eWczx) for
details**):

```llvm
%v1 = load i32, ptr %p, !nosanitize
%v2 = load i32, ptr %p, !nosanitize
```
When merged via `combineMetadata(%v1, %v2, ...)`, the resulting
instruction loses its `!nosanitize` metadata.

Tools such as UBSan and AFL rely on `nosanitize` to prevent unwanted
transformations or checks. However, the current implementation of
combineMetadata mistakenly drops !nosanitize. This may lead to
unintended behavior during optimization.

For example, under `-fsanitize=address,undefined -O2`, IR emitted by
UBSan may lose its `!nosanitize` metadata due to the incorrect metadata
merging in optimization. As a result, ASan could unexpectedly instrument
those instructions.
> Note: due to the current UBSan handlers having relatively
coarse-grained attributes, this specific case is difficult to reproduce
end-to-end from source code—UBSan currently inhibits such optimizations
(refer to #135135 for details).

Still, I believe it's necessary to fix this now, to support future
versions of UBSan that might allow such optimizations, and to support
third-party tools (such as AFL-based fuzzers) that rely on the presence
of !nosanitize.
2025-07-14 15:45:08 +08:00
Nikita Popov
8a63133417 [MetaRenamer] Use isIntrinsic() helper (NFC) 2025-07-10 17:34:06 +02:00
Matt Arsenault
1915fa15c3
Utils: Add pass to declare runtime libcalls (#147534)
This will be useful for testing the set of calls for different systems,
and eventually the product of context specific modifiers applied. In
the future we should also know the type signatures, and be able to
emit the correct one.
2025-07-09 00:52:22 +09:00
Andrew Rogers
ff1b37b87a
[llvm] annotate LLVMCloneModule for export (#145570)
## Purpose
This patch is one in a series of code-mods that annotate LLVM’s public
interface for export. This patch annotates the implementation of
`LLVMCloneModule ` with `LLVM_ABI` to match its declaration in
`llvm-c/Core.h`. The annotation currently has no meaningful impact on
the LLVM build; however, it is a prerequisite to support an LLVM Windows
DLL (shared library) build.

## Background
This effort is tracked in #109483. Additional context is provided in
[this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307),
and documentation for `LLVM_ABI` and related annotations is found in the
LLVM repo
[here](https://github.com/llvm/llvm-project/blob/main/llvm/docs/InterfaceExportAnnotations.rst).

## Validation
Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:

- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
2025-07-07 13:46:47 -07:00
Stephen Tozer
a34b1755e2
[DLCov] Origin-Tracking: Add debugify support (#143594)
This patch is part of a series that adds origin-tracking to the debugify
source location coverage checks, allowing us to report symbolized stack
traces of the point where missing source locations appear.

This patch completes the feature, having debugify handle origin stack
traces by symbolizing them when an associated bug is found and printing
them into the JSON report file as part of the bug entry. This patch also
updates the script that parses the JSON report and creates a
human-readable HTML report, adding an "Origin" entry to the table that
contains an expandable textbox containing the symbolized stack trace.
2025-07-04 09:52:12 +01:00
Adrian Vogelsgesang
de3c8410d8
[debuginfo][coro] Emit debug info labels for coroutine resume points (#141937)
RFC on discourse:
https://discourse.llvm.org/t/rfc-debug-info-for-coroutine-suspension-locations-take-2/86606

With this commit, we add `DILabel` debug infos to the resume points of a
coroutine. Those labels can be used by debugging scripts to figure out
the exact line and column at which a coroutine was suspended by looking
up current `__coro_index` value inside the coroutines frame, and then
searching for the corresponding label inside the coroutine's resume
function.

The DWARF information generated for such a label looks like:

```
0x00000f71:     DW_TAG_label
                  DW_AT_name    ("__coro_resume_1")
                  DW_AT_decl_file       ("generator-example.cpp")
                  DW_AT_decl_line       (5)
                  DW_AT_decl_column     (3)
                  DW_AT_artificial      (true)
                  DW_AT_LLVM_coro_suspend_idx   (0x01)
                  DW_AT_low_pc  (0x00000000000019be)
```

The labels can be mapped to their corresponding `__coro_idx` values
either via their naming convention `__coro_resume_<N>` or using the new
`DW_AT_LLVM_coro_suspend_idx` attribute. In gdb, those line numebrs can
be looked up using `info line -function my_coroutine -label
__coro_resume_1`. LLDB unfortunately does not understand DW_TAG_label
debug information, yet.

Given this is an artificial compiler-generated label, I did apply the
DW_AT_artificial tag to it. The DWARFv5 standard only allows that tag on
type and variable definitions, but this is a natural extension and was
also blessed in the RFC on discourse.

Also, this commit adds `DW_AT_decl_column` to labels, not only for
coroutines but also for normal C and C++ labels. While not strictly
necessary, I am doing so now because it would be harder to do so later
without breaking the binary LLVM-IR format

Drive-by fixes: While reading the existing test cases to understand how
to write my own test case, I did a couple of small typo fixes and
comment improvements
2025-07-04 10:44:35 +02:00
Austin
a550fef906
[llvm] Use llvm::fill instead of std::fill(NFC) (#146911)
Use llvm::fill instead of std::fill
2025-07-04 14:10:28 +08:00
Gábor Spaits
338fd8b12c
[SimplifyCFG] Transform switch to select when common bits uniquely identify one case (#145233)
Fix #141753 .

This patch introduces a new check, that tries to decide if the
conjunction of all the values uniquely identify the accepted values by
the switch.
2025-07-02 18:16:12 +02:00
Antonio Frighetto
f1cc0b607b [IR] Introduce dead_on_return attribute
Add `dead_on_return` attribute, which is meant to be taken advantage
by the frontend, and states that the memory pointed to by the argument
is dead upon function return. As with `byval`, it is supposed to be
used for passing aggregates by value. The difference lies in the ABI:
`byval` implies that the pointer is explicitly passed as argument to
the callee (during codegen the copy is emitted as per byval contract),
whereas a `dead_on_return`-marked argument implies that the copy
already exists in the IR, is located at a specific stack offset within
the caller, and this memory will not be read further by the caller upon
callee return – or otherwise poison, if read before being written.

RFC: https://discourse.llvm.org/t/rfc-add-dead-on-return-attribute/86871.
2025-07-02 09:29:36 +02:00
Nikita Popov
545cdca488
[SCCP] Improve worklist management (#146321)
SCCP currently stores instructions whose lattice value has changed in a
worklist, and then updates their users in the main loop. This may result
in instructions unnecessarily being visited multiple times (as an
instruction will often use multiple other instructions). Additionally,
we'd often redundantly visit instructions that were already visited when
the containing block first became executable.

Instead, change the worklist to directly store the instructions that
need to be revisited. Additionally, do not add instructions to the
worklist that will already be covered by the main basic block walk.

This change is conceptually NFC, but is expected to produce minor
differences in practice, because the visitation order interacts with the
range widening limit.
2025-06-30 17:17:30 +02:00