660 Commits

Author SHA1 Message Date
Nikita Popov
c23b4fbdbb
[IR] Remove size argument from lifetime intrinsics (#150248)
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.

This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
2025-08-08 11:09:34 +02:00
Matt Arsenault
1110e2ff9f
InlineFunction: Split inlining into predicate and apply functions (#134213)
This is to support a new inline function reduction in llvm-reduce,
which should pre-filter callsites that are not eligible for inlining.

This code was mostly structured as a match and apply, with a few
exceptions. The ugliest piece is for propagating and verifying compatible
getGC and personalities. Also, the collection of EHPad and the convergence
token to use are now cached in InlineFunctionInfo.

I was initially confused by the split between the checks performed here
and isInlineViable, so better document how this system is supposed to
work. It turns out this split does make sense, in that isInlineViable
checks whether inlining is possible based on the callee content, while the
ultimate inline decision depends on the callsite context. I think more
renames of these functions would help, and isInlineViable should probably
move out of InlineCost to be with these transforms.
2025-08-07 16:13:36 +09:00
Jeremy Morse
2a1869b981
[DebugInfo] Shave even more users of DbgVariableIntrinsic from LLVM (#149136)
At this stage I'm just opportunistically deleting any code using
debug-intrinsic types, largely adjacent to calls to findDbgUsers. I'll
get to deleting that in probably one or two more commits.
2025-07-18 08:25:10 +01:00
Kazu Hirata
3d5903c4d8
[llvm] Use llvm::is_contained (NFC) (#145844)
llvm::is_contained is shorter than llvm::any_of plus a lambda.
2025-06-26 08:41:18 -07:00
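A membership test like the one this commit touches can be spelled either as a predicate algorithm with an equality lambda or as a single call. The sketch below illustrates the two spellings using only the standard library; `is_contained` here is a hypothetical stand-in modelled on the LLVM helper of the same name, and `containsViaAnyOf` is an illustrative name, not something from the patch.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical stand-in for llvm::is_contained, written against the standard
// library so the sketch compiles without LLVM headers.
template <typename Range, typename T>
bool is_contained(const Range &R, const T &Value) {
  return std::find(std::begin(R), std::end(R), Value) != std::end(R);
}

// The longer spelling: any_of plus an equality lambda.
template <typename Range, typename T>
bool containsViaAnyOf(const Range &R, const T &Value) {
  return std::any_of(std::begin(R), std::end(R),
                     [&](const auto &Elem) { return Elem == Value; });
}
```

Both functions answer the same question for any range and value; the first simply leaves less room for a mistake in the lambda.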
Kazu Hirata
21def215b5
[Utils] Drop const from a return type (NFC) (#145838)
We don't need const on the return type.
2025-06-26 18:57:34 +08:00
Jeremy Morse
9eb0020555
[DebugInfo][RemoveDIs] Remove a swathe of debug-intrinsic code (#144389)
Seeing how we can't generate any debug intrinsics any more: delete a
variety of codepaths where they're handled. For the most part these are
plain deletions, in others I've tweaked comments to remain coherent, or
added a type to (what was) type-generic-lambdas.

This isn't all the DbgInfoIntrinsic call sites but it's most of the
simple scenarios.

Co-authored-by: Nikita Popov <github@npopov.com>
2025-06-17 15:55:14 +01:00
Stephen Tozer
aa8a1fa6f5
[DLCov][NFC] Annotate intentionally-blank DebugLocs in existing code (#136192)
Following the work in PR #107279, this patch applies the annotative
DebugLocs, which indicate that a particular instruction is intentionally
missing a location for a given reason, to existing sites in the compiler
where their conditions apply. This is NFC in ordinary LLVM builds (each
function `DebugLoc::getFoo()` is inlined as `DebugLoc()`), but marks the
instruction in coverage-tracking builds so that it will be ignored by
Debugify, allowing only real errors to be reported. From a developer
standpoint, it also communicates the intentionality and reason for a
missing DebugLoc.

Some notes for reviewers:

- The difference between `I->dropLocation()` and
`I->setDebugLoc(DebugLoc::getDropped())` is that the former _may_ decide
to keep some debug info alive, while the latter will always be empty; in
this patch, I always used the latter (even if the former could
technically be correct), because the former could result in some
(barely) different output, and I'd prefer to keep this patch purely NFC.
- I've generally documented the uses of `DebugLoc::getUnknown()`, with
the exception of the vectorizers - in summary, they are a huge cause of
dropped source locations, and I don't have the time or the domain
knowledge currently to solve that, so I've plastered it all over them as
a form of "fixme".
2025-06-11 17:42:10 +01:00
Teresa Johnson
49d48c32e0
[MemProf] Emit remarks when hinting allocations not needing cloning (#141859)
The context disambiguation code already emits remarks when hinting
allocations (by adding hotness attributes) during cloning. However,
we did not yet emit hints when applying the hotness attributes during
building of the metadata (during matching and again after inlining).
Add remarks when we apply the hint attributes for these
non-context-sensitive allocations.
2025-05-28 16:44:44 -07:00
Nikita Popov
904d0c293e
[Inline] Only consider provenance captures for scoped alias metadata (#138540)
When determining whether an escape source may alias with a noalias
argument, only take provenance captures into account. If only the
address of the argument was captured, an access through the escape
source is not legal.
2025-05-27 15:15:57 +02:00
Kazu Hirata
6e4f501b1b
[Utils] Remove redundant calls to std::unique_ptr<T>::get (NFC) (#139352)
2025-05-10 07:27:29 -07:00
Orlando Cazalet-Hyams
5be080edf7
[KeyInstr][Inline] Don't propagate atoms to inlined nodebug instructions (#133485)
RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
2025-05-07 11:54:57 +01:00
Orlando Cazalet-Hyams
73a7a3dc00
[KeyInstr] Inline atom info (#133481)
Source atom groups are identified by an atom group number and inlined-at pair,
so we simply can copy the atom numbers into the caller when inlining.

RFC:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
2025-05-06 14:38:41 +01:00
sallto
419a2cb218
[Inliner] Preserve alignment of byval arguments (#137455)
Previously the inliner always produced a memcpy with alignment 1 for src
and destination, leading to potentially suboptimal Codegen.

Since the Src ptr alignment is only available through the CallBase it
has to be passed to HandleByValArgumentInit. Dst Alignment is already
known so it doesn't have to be passed along.

If there is no specified Src Alignment my changes cause the ptr to have
no align data attached instead of align 1 as before (see
inline-tail.ll), I believe this is fine but since I'm a first time
contributor, please confirm.

My changes are already covered by 4 existing regression tests, so I did
not add any additional ones.

The example from #45778 now results in:
```C
opt -S -passes=inline,instcombine,sroa,instcombine test.ll

define dso_local i32 @test(ptr %t) {
entry:
  %.sroa.0.0.copyload = load ptr, ptr %t, align 8       ; this used to be align 1 in the original issue
  %arrayidx.i = getelementptr inbounds nuw i8, ptr %.sroa.0.0.copyload, i64 24
  %0 = load i32, ptr %arrayidx.i, align 4
  ret i32 %0
}
```

Fixes #45778.
2025-04-26 21:38:58 +02:00
Matt Arsenault
cf766f5210
InlineFunction: Use use_empty instead of hasNUses(0) (#137347)
2025-04-25 19:01:20 +02:00
Matt Arsenault
f819f46284
Reapply "Inline: Propagate callsite nofpclass attribute" (#135018)
This reverts commit 3f38cd07d820248fd2043efb1341fabaac2d84a6.

Fix case where inner callsite has nofpclass but callsite does not.
2025-04-10 07:15:58 +02:00
Stephen Tozer
9344b2196c
[DebugInfo][Inlining] Propagate inlined resume source loc to new br (#134826)
As part of inlining an invoke instruction, we may replace an inlined
resume instruction with a simple branch to the landing pad block. When
this happens, we should also propagate the resume's DILocation to this
branch, which this patch enables.

Found using https://github.com/llvm/llvm-project/pull/107279.
2025-04-09 16:42:06 +01:00
Matt Arsenault
3f38cd07d8
Revert "Inline: Propagate callsite nofpclass attribute"
This reverts commit b0cb672b9968eeee6eb022e98476957dbdf8e6e2.

Breaks bot
2025-04-08 23:15:00 +07:00
Matt Arsenault
b0cb672b99
Inline: Propagate callsite nofpclass attribute (#134800)

Fixes #134070
2025-04-08 22:53:17 +07:00
Mircea Trofin
f1bb2fe356
[ctxprof] Use isInSpecializedModule as criteria for using contextual profile (#134468)
After #134340, the availability of contextual profile isn't in itself an indication of compiling the module containing all the functions covered by that profile.
2025-04-07 19:55:00 -07:00
Rahul Joshi
74b7abf154
[IRBuilder] Add new overload for CreateIntrinsic (#131942)
Add a new `CreateIntrinsic` overload with no `Types`, useful for
creating calls to non-overloaded intrinsics that don't need additional
mangling.
2025-03-31 08:10:34 -07:00
Kazu Hirata
73dc2afd2c
[Transforms] Use *Set::insert_range (NFC) (#132652)
We can use *Set::insert_range to collapse:

  for (auto Elem : Range)
    Set.insert(Elem);

down to:

  Set.insert_range(Range);

In some cases, we can further fold that into the set declaration.
2025-03-23 19:42:53 -07:00
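The collapse described in this commit can be illustrated without LLVM headers. The sketch below is only an approximation: it uses `std::set` and its iterator-pair `insert` as a stand-in for the `insert_range` member of LLVM's set types, and the three function names are made up for the example.

```cpp
#include <set>
#include <vector>

// Element-by-element insertion, the form being collapsed.
std::set<int> insertWithLoop(const std::vector<int> &Range) {
  std::set<int> Set;
  for (int Elem : Range)
    Set.insert(Elem);
  return Set;
}

// Whole-range insertion. std::set's iterator-pair insert stands in for
// LLVM's insert_range here.
std::set<int> insertWholeRange(const std::vector<int> &Range) {
  std::set<int> Set;
  Set.insert(Range.begin(), Range.end());
  return Set;
}

// Folding the insertion into the set declaration itself.
std::set<int> constructFromRange(const std::vector<int> &Range) {
  return std::set<int>(Range.begin(), Range.end());
}
```

All three produce the same set for the same input, so the choice is purely about how much code the reader has to scan.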
Mircea Trofin
2068a18c86
[ctxprof][nfc] Prepare CtxProfAnalysis for flat profiles (#129623)
Mostly remove the equivalence "no contexts == no CtxProfAnalysis result", and instead check explicitly there are no contextual profiles.
2025-03-04 16:42:47 -08:00
Nikita Popov
9cbdcfcafd
[CaptureTracking] Remove StoreCaptures parameter (NFC)
The implementation doesn't use it, and is unlikely to use it in
the future.

The places that do set StoreCaptures=false, do so incorrectly and
would be broken if the parameter actually did anything.
2025-02-24 12:00:57 +01:00
Yingwei Zheng
9fbd5fbcc6
[IR][NFC] Switch to use LifetimeIntrinsic (#125528)
2025-02-04 02:18:33 +08:00
Jeremy Morse
e14962a39c
[NFC][DebugInfo] Use iterators for instruction insertion in more places (#124291)
As part of the "RemoveDIs" work to eliminate debug intrinsics, we're
replacing methods that use Instruction*'s as positions with iterators.
This patch changes some more complex call-sites, those crossing file
boundaries and where I've had to perform some minor rewrites.
2025-01-27 15:25:17 +00:00
Jeremy Morse
6292a808b3
[NFC][DebugInfo] Use iterator-flavour getFirstNonPHI at many call-sites (#123737)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; however, to ensure some type safety, we'd like to
have all calls to getFirstNonPHI use the iterator-returning version.

This patch changes a bunch of call-sites calling getFirstNonPHI to use
getFirstNonPHIIt, which returns an iterator. All these call sites are
where it's obviously safe to fetch the iterator then dereference it. A
follow-up patch will contain less-obviously-safe changes.

We'll eventually deprecate and remove the instruction-pointer
getFirstNonPHI, but not before adding concise documentation of what
considerations are needed (very few).

---------

Co-authored-by: Stephen Tozer <Melamoto@gmail.com>
2025-01-24 13:27:56 +00:00
Harald van Dijk
ccaded2b1d
[Inliner] Prevent adding pointer attributes to non-pointer arguments (#115569)
Fixes a crash seen after #114311
2024-11-09 16:17:16 +00:00
Steven Perron
f405c683ba
[OPT] Search whole BB for convergence token. (#112728)
The spec for llvm.experimental.convergence.entry says that it must be in
the entry block for a function, and must precede any other convergent
operation. It does not have to be the first instruction in the entry
block.

Inlining assumes that the call to llvm.experimental.convergence.entry
will be the first instruction after any phi instructions. This commit
modifies inlining to search the entire block for the call.
2024-10-30 11:19:23 -04:00
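The difference between the old positional assumption and a whole-block search can be sketched on a toy model. Everything below is hypothetical: a block is modelled as a list of opcode names, and `findTokenAfterPhis` / `findTokenAnywhere` are illustrative names rather than functions from the inliner.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Toy model: a basic block is a list of opcode names. Not LLVM API.
using Block = std::vector<std::string>;

const std::string EntryToken = "convergence.entry";

// Old assumption: the token call is the first instruction after the phis.
// Returns the index of the token, or -1 if it is not at that position.
int findTokenAfterPhis(const Block &BB) {
  auto It = std::find_if(BB.begin(), BB.end(),
                         [](const std::string &Op) { return Op != "phi"; });
  if (It != BB.end() && *It == EntryToken)
    return static_cast<int>(It - BB.begin());
  return -1;
}

// Whole-block search: accept the token wherever it appears in the block.
int findTokenAnywhere(const Block &BB) {
  auto It = std::find(BB.begin(), BB.end(), EntryToken);
  return It == BB.end() ? -1 : static_cast<int>(It - BB.begin());
}
```

With a block such as `{"alloca", "store", "convergence.entry", "call"}`, the positional lookup misses the token while the whole-block search finds it at index 2.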
goldsteinn
69a798a996
Reapply "[Inliner] Propagate more attributes to params when inlining (#91101)" (2nd Attempt) (#112749)
Root cause of the bug was code hanging onto `range` attr after
changing BitWidth. This was fixed in PR #112633.
2024-10-17 20:28:47 -05:00
goldsteinn
c85611e858
[SimplifyLibCall][Attribute] Fix bug where we may keep range attr with incompatible type (#112649)
In a variety of places we change the bitwidth of a parameter but don't
update the attributes.

The issue in this case is from the `range` attribute when inlining
`__memset_chk`. `optimizeMemSetChk` will replace an `i32` with an
`i8`, and if the `i32` had a `range` attr associated it will cause an
error.

Fixes #112633
2024-10-17 10:32:55 -05:00
Jay Foad
85c17e4092
[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706)
Convert many instances of:
  Fn = Intrinsic::getOrInsertDeclaration(...);
  CreateCall(Fn, ...)
to the equivalent CreateIntrinsic call.
2024-10-17 16:20:43 +01:00
Arthur Eubanks
9e6d24f61f
Revert "[Inliner] Propagate more attributes to params when inlining (#91101)"
This reverts commit ae778ae7ce72219270c30d5c8b3d88c9a4803f81.

Creates broken IR, see comments in #91101.
2024-10-16 21:21:34 +00:00
goldsteinn
ae778ae7ce
[Inliner] Propagate more attributes to params when inlining (#91101)
- **[Inliner] Add tests for propagating more parameter attributes; NFC**
- **[Inliner] Propagate more attributes to params when inlining**

Add support for propagating:
- `dereferenceable`
- `dereferenceable_or_null`
- `align`
- `nonnull`
- `range`

These are only propagated if the parameter to the to-be-inlined callsite
matches the exact parameter used in the to-be-inlined function.
2024-10-16 11:53:21 -05:00
goldsteinn
3c777f04f0
[Inliner] Don't propagate access attr to byval params (#112256)
- **[Inliner] Add tests for bad propagation of access attr for `byval`
param; NFC**
- **[Inliner] Don't propagate access attr to `byval` params**

We previously only handled the case where the `byval` attr was in the
callbase's param attr list. This PR also handles the case if the
`ByVal` was a param attr on the function's param attr list.
2024-10-15 09:25:16 -05:00
Rahul Joshi
fa789dffb1
[NFC] Rename Intrinsic::getDeclaration to getOrInsertDeclaration (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
2024-10-11 05:26:03 -07:00
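The naming distinction behind this rename (plain lookup versus lookup-or-create) can be shown on a toy container. The sketch below is an assumption-laden illustration: `ToyModule` and its integer "declaration ids" are invented for the example and do not reflect the real `Intrinsic` or `Module` signatures.

```cpp
#include <map>
#include <string>

// Toy module: maps a function name to a declaration id. Not LLVM API.
struct ToyModule {
  std::map<std::string, int> Decls;
  int NextId = 0;

  // Lookup only, in the spirit of Module::getFunction: never creates.
  const int *getDeclaration(const std::string &Name) const {
    auto It = Decls.find(Name);
    return It == Decls.end() ? nullptr : &It->second;
  }

  // Lookup that creates the declaration on a miss, in the spirit of
  // Module::getOrInsertFunction.
  int getOrInsertDeclaration(const std::string &Name) {
    auto Res = Decls.emplace(Name, NextId);
    if (Res.second)
      ++NextId;
    return Res.first->second;
  }
};
```

A name that says "OrInsert" tells the caller that the module may be mutated, which is exactly what a lookup-only name would hide.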
Mircea Trofin
c4952e513f
[nfc][ctx_prof] Efficient profile traversal and update (#110052)
This optimizes profile updates and visits, where we want to access contexts for a specific function. These are all the current update cases. We do so by maintaining a list of contexts for each function, preserving preorder traversal. The list is updated whenever contexts are `std::move`-d or deleted.
2024-09-27 08:09:10 -07:00
Mircea Trofin
783bac7ffb
[ctx_prof] Handle select and its step instrumentation (#109185)
The `step` instrumentation shouldn't be treated, during use, like an `increment`. The latter is treated as a BB ID. The step isn't that, it's more of a type of value profiling. We need to distinguish between the 2 when really looking for BB IDs (==increments), and handle appropriately `step`s. In particular, we need to know when to elide them because `select`s may get elided by function cloning, if the condition of the select is statically known.
2024-09-23 15:21:25 -07:00
goldsteinn
a9352a0d31
[Inliner] Fix bug where attributes are propagated incorrectly (#109347)
- **[Inliner] Add tests for incorrect propagation of return attrs; NFC**
- **[Inliner] Fix bug where attributes are propagated incorrectly**

The bug stems from the fact that we assume the new (inlined) callsite
is calling the same function as the original (callee) callsite. While
this is typically the case, since `VMap` simplifies the new
instructions, callee intrinsics callsites can end up not corresponding
with the same function.

This can lead to buggy propagation.
2024-09-20 19:57:35 -05:00
Jay Foad
e03f427196
[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
2024-09-19 16:16:38 +01:00
Nikita Popov
55a2473830
[CtxProf] Replace include with forward declaration (NFC)
This header is fairly expensive. Forward declare
PGOContextualProfile instead.
2024-09-04 13:05:09 +02:00
Mircea Trofin
3209766608
[ctx_prof] Add Inlining support (#106154)
Add an overload of `InlineFunction` that updates the contextual profile. If there is no contextual profile, this overload is equivalent to the non-contextual profile variant.

Post-inlining, the update mainly consists of:
- making the PGO instrumentation of the callee "the caller's": the owner function (the "name" parameter of the instrumentation instructions) becomes the caller, and new index values are allocated for each of the callee's indices (this happens for both increment and callsite instrumentation instructions)
- in the contextual profile:
   - each context corresponding to the caller has its counters updated to incorporate the counters inherited from the callee at the inlined callsite. Counter values are copied as-is because no scaling is required since the profile is contextual.
   - the contexts of the callee (at the inlined callsite) are moved to the caller.
   - the callee context at the inlined callsite is deleted.
2024-09-03 16:14:05 -07:00
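The post-inlining profile update listed in this commit can be approximated on a toy tree. This is a loose sketch under several assumptions, not the actual `PGOContextualProfile` data structure: each callsite holds a single callee context, counters live in a flat vector, and "allocating new indices" is modelled as appending to the caller's vector. The names `Context`, `Callsite` and `inlineCallsite` are invented for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct Callsite;

// Toy contextual profile node: one function in one calling context.
struct Context {
  std::vector<std::uint64_t> Counters; // indexed by counter id
  std::vector<Callsite> Callsites;     // callee contexts, tagged by callsite id
};

struct Callsite {
  unsigned Id;
  Context Callee;
};

// Merge the callee context found at InlinedId into Caller. Returns the index
// in Caller.Counters where the callee's counters now start, or -1 if there is
// no callee context at that callsite.
long inlineCallsite(Context &Caller, unsigned InlinedId) {
  unsigned NextId = 0;
  std::size_t Pos = Caller.Callsites.size();
  for (std::size_t I = 0; I < Caller.Callsites.size(); ++I) {
    if (Caller.Callsites[I].Id >= NextId)
      NextId = Caller.Callsites[I].Id + 1;
    if (Caller.Callsites[I].Id == InlinedId)
      Pos = I;
  }
  if (Pos == Caller.Callsites.size())
    return -1;

  // Take the callee context out of the caller and delete its callsite entry.
  Context Callee = std::move(Caller.Callsites[Pos].Callee);
  Caller.Callsites.erase(Caller.Callsites.begin() +
                         static_cast<std::ptrdiff_t>(Pos));

  // New counter indices: callee counters are appended, values copied as-is.
  long Base = static_cast<long>(Caller.Counters.size());
  Caller.Counters.insert(Caller.Counters.end(), Callee.Counters.begin(),
                         Callee.Counters.end());

  // The callee's own callsites move to the caller under fresh ids.
  for (Callsite &CS : Callee.Callsites)
    Caller.Callsites.push_back({NextId++, std::move(CS.Callee)});
  return Base;
}
```

The counters are copied without scaling because, in a contextual profile, the callee context already describes exactly the calls made through that one callsite.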
Sergei Barannikov
75c7bca740
[DataLayout] Remove constructor accepting a pointer to Module (#102841)
The constructor initializes `*this` with `M->getDataLayout()`, which
is effectively the same as calling the copy constructor.
There does not seem to be a case where a copy would be necessary.

Pull Request: https://github.com/llvm/llvm-project/pull/102841
2024-08-13 04:00:19 +03:00
Kazu Hirata
2f55e55101
[Transforms] Use range-based for loops (NFC) (#98725)
2024-07-14 13:44:50 -07:00
Kazu Hirata
4b28b3fae4
[Transforms] Use range-based for loops (NFC) (#97195)
2024-07-02 16:20:44 -07:00
Matt Arsenault
e47359a925
Inline: Fix handling of byval using non-alloca addrspace (#97306)
Use the address space of the original pointer argument instead
of querying the datalayout. This avoids producing a verifier error
since this was changing the address space for the user instructions.

Fixes #97086
2024-07-01 21:09:41 +02:00
Mingming Liu
1518b260ce
[TypeProf][InstrFDO]Implement more efficient comparison sequence for indirect-call-promotion with vtable profiles. (#81442)
Clang's `-fwhole-program-vtables` is required for this optimization to
take place. If `-fwhole-program-vtables` is not enabled, this change is
a no-op.

* Function-comparison (before):

```
%vtable = load ptr, ptr %obj
%vfn = getelementptr inbounds ptr, ptr %vtable, i64 1
%func = load ptr, ptr %vfn
%cond = icmp eq ptr %func, @callee
br i1 %cond, label bb1, label bb2:

bb1:
   call @callee

bb2:
   call %func
```

* VTable-comparison (after):

```
%vtable = load ptr, ptr %obj
%cond = icmp eq ptr %vtable, @vtable-address-point
br i1 %cond, label bb1, label bb2:

bb1:
   call @callee

bb2:
  %vfn = getelementptr inbounds ptr, ptr %vtable, i64 1
  %func = load ptr, ptr %vfn
  call %func
```
    
Key changes:
1. Find out virtual calls and the vtables they come from.
   - The ICP relies on the type intrinsic `llvm.type.test` to find out
     virtual calls and the compatible vtables, and relies on type metadata
     to find the address point for comparison.
2. The ICP pass does cost-benefit analysis and compares vtables only when
   the number of vtables for a function candidate is within the
   (option-specified) threshold.
3. Sink the function addressing and vtable load instructions to the
   indirect fallback.
   - The sink helper functions are simplified versions of
     `InstCombinerImpl::tryToSinkInstruction`. Currently debug intrinsics
     are not handled. Ideally
     `InstCombinerImpl::tryToSinkInstructionDbgValues` and
     `InstCombinerImpl::tryToSinkInstructionDbgVariableRecords` could be
     moved into Transforms/Utils/Local.cpp (or another util cpp file) to
     handle debug intrinsics when moving instructions across basic blocks.
4. Keep value profiles updated.
   1. Update vtable value profiles after inlining.
   2. For either function-based comparison or vtable-based comparison,
      update both vtable and indirect call value profiles.
2024-06-29 23:21:33 -07:00
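The two comparison sequences from this commit can be mirrored in ordinary C++ to make the difference in loads visible. The sketch below is an illustration only: the object model (a struct holding a pointer to an array of function pointers) and all names are hypothetical, and it says nothing about how the ICP pass itself is implemented.

```cpp
// Toy object model: an object holds a pointer to a table of function
// pointers. All names are illustrative.
using Fn = int (*)();
struct Object {
  const Fn *VTable;
};

int callee() { return 1; }
int other() { return 2; }

const Fn VTableA[] = {nullptr, callee}; // slot 1 holds the promoted target
const Fn VTableB[] = {nullptr, other};

// Function comparison: the vtable load and the function load both happen
// before the compare.
int callWithFunctionCompare(const Object &Obj) {
  const Fn *VTable = Obj.VTable;
  Fn Func = VTable[1];
  if (Func == callee)
    return callee(); // direct call
  return Func();     // indirect fallback
}

// VTable comparison: only the vtable is loaded before the compare; the
// function load is sunk into the indirect fallback.
int callWithVTableCompare(const Object &Obj) {
  const Fn *VTable = Obj.VTable;
  if (VTable == VTableA)
    return callee(); // direct call
  Fn Func = VTable[1];
  return Func(); // indirect fallback
}
```

On the hot path the second form performs one load instead of two before the branch, which is the saving the commit message describes.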
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
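The shape of such a forwarding helper can be sketched with toy types. The structs below only borrow the names `DataLayout`, `Module` and `Function` for readability; they are not the LLVM classes, and the `Desc` field is an assumption made for the example.

```cpp
#include <string>

// Toy stand-ins; none of these are the LLVM classes.
struct DataLayout {
  std::string Desc;
};

struct Module {
  DataLayout DL;
  const DataLayout &getDataLayout() const { return DL; }
};

struct Function {
  const Module *Parent;
  const Module *getParent() const { return Parent; }
  // Convenience helper: forwards to the parent module, so callers no longer
  // need to spell out getParent()->getDataLayout().
  const DataLayout &getDataLayout() const { return Parent->getDataLayout(); }
};
```

The helper returns the very same object as the longer chain, so call sites can be shortened mechanically.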
Stephen Tozer
d75f9dd1d2
Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"
Reverts the above commit, as it updates a common header function and
did not update all callsites:

  https://lab.llvm.org/buildbot/#/builders/29/builds/382

This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.
2024-06-24 18:00:22 +01:00
Stephen Tozer
6481dc5761
[IR][NFC] Update IRBuilder to use InsertPosition (#96497)
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.

This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
2024-06-24 17:27:43 +01:00
Noah Goldstein
db03d9d33a
Recommit "[Inliner] Propagate callee argument memory access attributes before inlining" (2nd Try)
In the re-commit, just dropping the propagation of `writeonly` as that
is the only attribute that can play poorly with call slot optimization
(see issue: #95152 for more details).

Closes #95888
2024-06-21 16:14:28 +08:00