7159 Commits

Author SHA1 Message Date
Orlando Cazalet-Hyams
da45b6c71d
[RemoveDIs][NFC] Remove dbg intrinsic version of calculateFragmentIntersect (#153378) 2025-08-19 13:44:25 +01:00
Matt Arsenault
19ebfa6d0b
RuntimeLibcalls: Move exception call config to tablegen (#151948)
Also starts pruning out these calls if the exception model is
forced to none.

I worked backwards from the logic in addPassesToHandleExceptions
and the pass content. There appears to be some tolerance
for mixing and matching exception modes inside of a single module.
As far as I can tell _Unwind_CallPersonality is only relevant for
wasm, so just add it there.

As usual, the arm64ec case makes things difficult and is
missing test coverage. The set of calls in list form is necessary
to use foreach for the duplication, but in every other context a
dag is more convenient. You cannot use foreach over a dag, and I
haven't found a way to flatten a dag into a list.

This removes the last manual setLibcallImpl call in generic code.
2025-08-19 10:35:59 +09:00
Matt Arsenault
fe67267d19
MSP430: Move __mspabi_mpyll calling conv config to tablegen (#153988)
There are several libcall choices for MUL_I64 which depend on the
subtarget, but this is the base case. The manual custom ISelLowering
is still overriding the decision until we have a way to control
lowering choices, but we can still get the calling convention
set for now.
2025-08-19 10:25:10 +09:00
Kazu Hirata
07eb7b7692
[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068)
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>.  Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:

  template <typename PointeeType, unsigned N>
  class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N> {};

We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.
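
As an illustration of the redirection pattern only, here is a minimal sketch built on standard containers. `SetModel` and `PtrSetModel` are hypothetical stand-ins, not the real ADT classes; the only thing taken from SmallSet.h is the shape of the partial specialization.

```cpp
#include <set>
#include <type_traits>
#include <unordered_set>

// Hypothetical stand-in for SmallPtrSet: a set dedicated to pointer elements.
template <typename PtrType, unsigned N>
class PtrSetModel {
  std::unordered_set<PtrType> Storage;

public:
  bool insert(PtrType P) { return Storage.insert(P).second; }
  bool contains(PtrType P) const { return Storage.count(P) != 0; }
};

// Hypothetical stand-in for the generic SmallSet.
template <typename T, unsigned N>
class SetModel {
  std::set<T> Storage;

public:
  bool insert(const T &V) { return Storage.insert(V).second; }
  bool contains(const T &V) const { return Storage.count(V) != 0; }
};

// The "redirection": for pointer element types, the generic name is just an
// empty class derived from the pointer-specific set.
template <typename PointeeType, unsigned N>
class SetModel<PointeeType *, N> : public PtrSetModel<PointeeType *, N> {};

static_assert(std::is_base_of_v<PtrSetModel<int *, 4>, SetModel<int *, 4>>,
              "pointer element types are redirected to the pointer set");
```

With this shape, spelling the pointer-specific type directly at the use site says the same thing with one less layer of indirection.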
2025-08-18 07:01:29 -07:00
Mircea Trofin
c971c25544
[licm] don't drop MD_prof when dropping other metadata (#152420)
Part of Issue #147390
2025-08-16 07:26:13 -07:00
Matt Arsenault
3e5d8a1439 Reapply "RuntimeLibcalls: Generate table of libcall name lengths (#153… (#153864)
This reverts commit 334e9bf2dd01fbbfe785624c0de477b725cde6f2.

Check if llvm-nm exists before building the benchmark.
2025-08-16 09:53:50 +09:00
gulfemsavrun
334e9bf2dd
Revert "RuntimeLibcalls: Generate table of libcall name lengths (#153… (#153864)
…210)"

This reverts commit 9a14b1d254a43dc0d4445c3ffa3d393bca007ba3.

Revert "RuntimeLibcalls: Return StringRef for libcall names (#153209)"

This reverts commit cb1228fbd535b8f9fe78505a15292b0ba23b17de.

Revert "TableGen: Emit statically generated hash table for runtime
libcalls (#150192)"

This reverts commit 769a9058c8d04fc920994f6a5bbb03c8a4fbcd05.

Reverted three changes because of a CMake error while building llvm-nm
as reported in the following PR:
https://github.com/llvm/llvm-project/pull/150192#issuecomment-3192223073
2025-08-15 13:32:27 -07:00
Diana Picus
ac005e16f6
Reapply "[AMDGPU] Intrinsic for launching whole wave functions" (#153584)
This reverts commit 14cd1339318b16e08c1363ec6896bd7d1e4ae281. The
buildbot failure seems to have been a cmake issue which has been
discussed in more detail in this Discourse post:

https://discourse.llvm.org/t/cmake-doesnt-regenerate-all-tablegen-target-files/87901

If any buildbots fail to select arbitrary intrinsics with this patch,
it's worth considering using clean builds with ccache instead of
incremental builds, as recommended here:

https://llvm.org/docs/HowToAddABuilder.html#:~:text=Use%20CCache%20and%20NOT%20incremental%20builds

The original commit message for this patch:
Add the llvm.amdgcn.call.whole.wave intrinsic for calling whole wave
functions. This will take as its first argument the callee with the
amdgpu_gfx_whole_wave calling convention, followed by the call
parameters which must match the signature of the callee except for the
first function argument (the i1 original EXEC mask, which doesn't need
to be passed in). Indirect calls are not allowed.

Make direct calls to amdgpu_gfx_whole_wave functions a verifier error.

Tail calls are handled in a future patch.
2025-08-15 10:12:47 +02:00
Mircea Trofin
45e6951ba7
Use uint32_t rather than unsigned in downscaleWeights (#153750) 2025-08-14 23:22:45 -07:00
Mircea Trofin
8da1ce559e
Fix after #153735 (#153749)
Example failure
<https://lab.llvm.org/buildbot/#/builders/105/builds/11073>

Seems compiler-dependent.
2025-08-14 23:14:53 -07:00
Mircea Trofin
3b4775d31d
[NFC][PGO] Factor downscaling of branch weights out of Instrumentation into ProfileData (#153735)
The logic isn’t instrumentation-specific, and the refactoring lets users avoid a dependency on `Instrumentation` and depend only on `ProfileData` (which is a fairly low-level dependency).
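
For readers unfamiliar with the idea, a rough sketch of what downscaling branch weights means follows. The function name `downscaleToUint32` and the exact scaling rule are assumptions made for illustration; they are not copied from the code this patch moves.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: divide 64-bit profile counts by a common factor so
// that every result fits in 32 bits while ratios are roughly preserved.
std::vector<uint32_t> downscaleToUint32(const std::vector<uint64_t> &Weights) {
  constexpr uint64_t Max32 = std::numeric_limits<uint32_t>::max();

  uint64_t MaxCount = 0;
  for (uint64_t W : Weights)
    MaxCount = std::max(MaxCount, W);

  // A divisor large enough that the largest weight fits in 32 bits.
  uint64_t Scale = MaxCount <= Max32 ? 1 : MaxCount / Max32 + 1;

  std::vector<uint32_t> Out;
  Out.reserve(Weights.size());
  for (uint64_t W : Weights) {
    uint64_t Scaled = W / Scale;
    assert(Scaled <= Max32 && "scaled weight must fit in 32 bits");
    Out.push_back(static_cast<uint32_t>(Scaled));
  }
  return Out;
}
```

Weights that already fit are returned unchanged, since the divisor is 1 in that case.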
2025-08-14 22:44:36 -07:00
Matt Arsenault
769a9058c8
TableGen: Emit statically generated hash table for runtime libcalls (#150192)
a96121089b9c94e08c6632f91f2dffc73c0ffa28 reverted a change
to use a binary search on the string name table because it
was too slow. This replaces it with a static string hash
table based on the known set of libcall names. Microbenchmarking
shows this is similarly fast to using DenseMap. It's possibly
slightly slower than using StringSet, though these aren't an
exact comparison. This also saves on the one time use construction
of the map, so it could be better in practice.

This search isn't a simple set check, since it does find the
range of possible matches with the same name. There's also
an additional check for whether the current target supports
the name. The runtime constructed set doesn't require this,
since it only adds the symbols live for the target.

Followed algorithm from this post
http://0x80.pl/notesen/2023-04-30-lookup-in-strings.html

I'm also thinking the 2 special case global symbols should
just be added to RuntimeLibcalls. There are also other global
references emitted in the backend that aren't tracked; we probably
should just use this as a centralized database for all compiler
selected symbols.
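
To make the idea of a statically generated string hash table concrete, here is a deliberately simplified sketch. It is not the TableGen output and not the algorithm from the linked post: the name list, the FNV-1a hash, the table size, and linear probing are all assumptions chosen for brevity, and it omits both the range-of-matches handling and the per-target availability check described above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical, tiny name list standing in for the generated libcall names.
constexpr std::array<std::string_view, 4> KnownNames = {
    "memcpy", "memset", "__stack_chk_fail", "sqrtf"};

// More slots than names, so probing always reaches an empty slot.
constexpr std::size_t TableSize = 8;

// FNV-1a, used here only because it is short.
constexpr uint32_t hashName(std::string_view S) {
  uint32_t H = 2166136261u;
  for (char C : S) {
    H ^= static_cast<unsigned char>(C);
    H *= 16777619u;
  }
  return H;
}

// Build an open-addressed table of indices into KnownNames at compile time.
// A slot holding -1 is empty.
constexpr std::array<int, TableSize> buildTable() {
  std::array<int, TableSize> T{};
  for (auto &Slot : T)
    Slot = -1;
  for (std::size_t I = 0; I < KnownNames.size(); ++I) {
    std::size_t Pos = hashName(KnownNames[I]) % TableSize;
    while (T[Pos] != -1)
      Pos = (Pos + 1) % TableSize;
    T[Pos] = static_cast<int>(I);
  }
  return T;
}

constexpr std::array<int, TableSize> NameTable = buildTable();

// Returns the index of Name in KnownNames, or -1 if it is not a known name.
constexpr int lookupName(std::string_view Name) {
  std::size_t Pos = hashName(Name) % TableSize;
  while (NameTable[Pos] != -1) {
    std::size_t Idx = static_cast<std::size_t>(NameTable[Pos]);
    if (KnownNames[Idx] == Name)
      return NameTable[Pos];
    Pos = (Pos + 1) % TableSize;
  }
  return -1;
}
```

Because the table is a compile-time constant, there is no map to construct at startup, which is the saving the message refers to.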
2025-08-15 09:02:56 +09:00
peter mckinna
002362bbd8
Add LLVMGlobalAddDebugInfo to Core.cpp (#148747)
This change allows globals to have multiple metadata attachments. The
GlobalSetMetadata function allows only one attachment, which is clobbered
if another is added. The addDebugInfo
function calls addMetadata. This is needed because some languages have
global structs containing lots of compiler-generated globals.
2025-08-14 14:59:39 +02:00
Matt Arsenault
ddb2dc50af
ARM: Move gnu half convert calling conv config into tablegen (#153394) 2025-08-14 17:36:29 +09:00
Matt Arsenault
bbcac029db
ARM: Move more aeabi libcall config into tablegen (#152109) 2025-08-14 15:43:15 +09:00
Matt Arsenault
32f1fe3770
ARM: Move calling conv config to RuntimeLibcalls (#152065)
Consolidate module level ABI into RuntimeLibcalls
2025-08-14 08:36:03 +09:00
Orlando Cazalet-Hyams
f316009997
[RemoveDIs][NFC] Remove more dbg.assign intrinsics code paths (#153371) 2025-08-13 16:37:04 +01:00
Orlando Cazalet-Hyams
d13341db26
[RemoveDIs][NFC] Remove getAssignmentMarkers (#153214)
getAssignmentMarkers was for debug intrinsics. getDVRAssignmentMarkers
is used for DbgRecords.
2025-08-13 10:56:19 +01:00
Mircea Trofin
374cbfd327
[licm] clone MD_prof when hoisting conditional branch (#152232)
The profiling-related metadata for the hoisted conditional branch should be copied from the original branch, not from the current terminator of the block it's hoisted to.

The patch adds a way to disable the fix just so we can do an ablation test, after which the flag will be removed. The same flag will be reused for other similar fixes.

(This was identified through `profcheck` (see Issue #147390), and this PR addresses most of the test failures (when running under profcheck) under `Transforms/LICM`.)
2025-08-13 02:01:00 +02:00
Andreas Jonson
ca7ffaaeeb [ConstantRange] add nuw support to truncate (NFC) (#152990) 2025-08-12 12:26:35 +02:00
Nikita Popov
f35e9fa478 Revert "[IR] Optimize stripAndAccumulateConstantOffsets() for common case (NFC)"
This reverts commit a7edc95c799c46665ecf4465a4dc7ff4bee3ced0.

An issue has been reported at: a7edc95c79 (commitcomment-163691175)
2025-08-08 20:44:40 +02:00
Alexander Richardson
3a4b351ba1
[IR] Introduce the ptrtoaddr instruction
This introduces a new `ptrtoaddr` instruction which is similar to
`ptrtoint` but has two differences:

1) Unlike `ptrtoint`, `ptrtoaddr` does not capture provenance
2) `ptrtoaddr` only extracts (and then extends/truncates) the low
   index-width bits of the pointer

For most architectures, difference 2) does not matter since index (address)
width and pointer representation width are the same, but this does make a
difference for architectures that have pointers that aren't just plain
integer addresses such as AMDGPU fat pointers or CHERI capabilities.

This commit introduces textual and bitcode IR support as well as basic code
generation, but optimization passes do not handle the new instruction yet
so it may result in worse code than using ptrtoint. Follow-up changes will
update capture tracking, etc. for the new instruction.

RFC: https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54

Reviewed By: nikic

Pull Request: https://github.com/llvm/llvm-project/pull/139357
2025-08-08 10:12:39 -07:00
Orlando Cazalet-Hyams
1778669739
[KeyInstr] Remove LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS CMake flag (#152735)
The CMake flag has been on by default for a month without any issues.

This makes the feature support in LLVM unconditional (but does not
enable the feature by default).
2025-08-08 17:03:28 +01:00
Nikita Popov
02f3e95a42 [AutoUpgrade] Fix use after free
Determine the intrinsic ID before the name is freed during renaming.
2025-08-08 11:54:09 +02:00
Nikita Popov
c23b4fbdbb
[IR] Remove size argument from lifetime intrinsics (#150248)
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.

This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
2025-08-08 11:09:34 +02:00
Diana Picus
14cd133931
Revert "[AMDGPU] Intrinsic for launching whole wave functions" (#152286)
Reverts llvm/llvm-project#145859 because it broke a HIP test:
```
[34/59] Building CXX object External/HIP/CMakeFiles/TheNextWeek-hip-6.3.0.dir/workload/ray-tracing/TheNextWeek/main.cc.o
FAILED: External/HIP/CMakeFiles/TheNextWeek-hip-6.3.0.dir/workload/ray-tracing/TheNextWeek/main.cc.o 
/home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/llvm/bin/clang++ -DNDEBUG  -O3 -DNDEBUG   -w -Werror=date-time --rocm-path=/opt/botworker/llvm/External/hip/rocm-6.3.0 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx1030 --offload-arch=gfx1100 -xhip -mfma -MD -MT External/HIP/CMakeFiles/TheNextWeek-hip-6.3.0.dir/workload/ray-tracing/TheNextWeek/main.cc.o -MF External/HIP/CMakeFiles/TheNextWeek-hip-6.3.0.dir/workload/ray-tracing/TheNextWeek/main.cc.o.d -o External/HIP/CMakeFiles/TheNextWeek-hip-6.3.0.dir/workload/ray-tracing/TheNextWeek/main.cc.o -c /home/botworker/bbot/clang-hip-vega20/llvm-test-suite/External/HIP/workload/ray-tracing/TheNextWeek/main.cc
fatal error: error in backend: Cannot select: intrinsic %llvm.amdgcn.readfirstlane
```
2025-08-06 12:24:52 +02:00
Diana Picus
0461cd3d1d
[AMDGPU] Intrinsic for launching whole wave functions (#145859)
Add the llvm.amdgcn.call.whole.wave intrinsic for calling whole wave
functions. This will take as its first argument the callee with the
amdgpu_gfx_whole_wave calling convention, followed by the call
parameters which must match the signature of the callee except for the
first function argument (the i1 original EXEC mask, which doesn't need
to be passed in). Indirect calls are not allowed.

Make direct calls to amdgpu_gfx_whole_wave functions a verifier error.

Unspeakable horrors happen around calls from whole wave functions; the
plan is to improve the handling of caller/callee-saved registers in
a future patch.

Tail calls are also handled in a future patch.
2025-08-06 10:25:53 +02:00
Craig Topper
73685583c8
[VP][RISCV] Add a vp.load.ff intrinsic for fault only first load. (#128593)
There's been some interest in supporting early-exit loops recently.
https://discourse.llvm.org/t/rfc-supporting-more-early-exit-loops/84690

This patch was extracted from our downstream where we've been using it
in our vectorizer.
2025-08-05 16:12:42 -07:00
Matt Arsenault
1392edcc07
ARM: Remove idiv runtime call aliases (#152098)
Really only the i32 variants exist. We don't need synthetic
aliases for illegal types which will be promoted.
2025-08-05 17:49:22 +09:00
Daniel Paoliello
717e753d1e
[win][arm64ec] Handle empty function names (#151609)
While testing Arm64EC, I observed that LLVM crashes when an empty
function name is used. My original fix in #151409 was to raise an error,
but this change now handles the empty name via
`Mangler::getNameWithPrefix` (which assigns a name to the function).

To get this working, I had to create the `Mangler` in
`TargetLoweringObjectFile` early so it would be available to Arm64EC's
lowering. There's no reason why `Mangler` is only created when
`Initialize` is called (or re-created if it exists), and so I moved
creation to the constructor and switched the raw pointer for a
`unique_ptr` to avoid the explicit `delete` in the destructor.
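
The ownership change can be pictured with the following sketch. `ManglerModel` and `LoweringModel` are hypothetical stand-ins rather than the real classes, and the placeholder name returned for an empty input is an assumption for illustration.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for Mangler.
struct ManglerModel {
  std::string getNameWithPrefix(const std::string &Name) const {
    // Assumption for illustration: empty names receive a placeholder.
    if (Name.empty())
      return std::string("__unnamed_1");
    return "_" + Name;
  }
};

// Hypothetical stand-in for TargetLoweringObjectFile.
class LoweringModel {
  std::unique_ptr<ManglerModel> Mang;

public:
  // The mangler is created in the constructor, so it is available to
  // lowering code that runs before initialize().
  LoweringModel() : Mang(std::make_unique<ManglerModel>()) {}

  // No longer responsible for creating (or re-creating) the mangler.
  void initialize() {}

  ManglerModel &getMangler() const { return *Mang; }

  // No user-written destructor: unique_ptr releases the object.
};
```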
2025-08-04 16:20:55 -07:00
Ramkumar Ramachandra
c467946c3a
[IR] Improve code in isIdenticalToWhenDefined (NFC) (#151881)
Prefer llvm::equal() over std::equal().
2025-08-04 09:44:46 +01:00
Matt Arsenault
1862e3c56c
RuntimeLibcalls: Move __stack_smash_handler config to tablegen (#150870) 2025-08-04 17:27:44 +09:00
Nikita Popov
86727fe9a1
[IR] Allow poison argument to lifetime markers (#151148)
This slightly relaxes the invariant established in #149310, by also
allowing the lifetime argument to be poison. This is to support the
typical pattern of RAUWing with poison when removing an instruction.

It's worth noting that this does not require any conservative
assumptions: lifetimes with poison arguments can simply be skipped.

Fixes https://github.com/llvm/llvm-project/issues/151119.
2025-08-04 10:02:04 +02:00
Matt Arsenault
144cd87088
RuntimeLibcalls: Remove target check for sjlj config (#148792)
I'm assuming this was the set of targets that were relevant
for sjlj handling. Just take the raw exception setting instead,
and assume it makes sense for the target.
2025-08-04 14:15:53 +09:00
Matt Arsenault
5478da99a1
RuntimeLibcalls: Move __stack_chk_fail config to tablegen (#148789) 2025-08-04 13:02:57 +09:00
Matt Arsenault
5b528a1041
RuntimeLibcalls: Remove darwin override of half convert libcalls (#148782)
These are already the default calls set for these conversions, so
they should not require explicit setting. The non-default cases are
currently overridden in ARMISelLowering. Just delete this until
the list of calls and lowering decisions are separated.

This was added back in 6402ad27c01c9503a12d41d7e40646cf0d1f919f. It
appears to not be relevant for AArch64, where calls appear to never
be used for these. It also appears to not be relevant for x86, where
the default calls seem to always end up used anyway.
2025-08-04 11:06:03 +09:00
Austin
c7bacc9f26
[llvm] using wrapper llvm::sort(nfc) (#151000)
Use the llvm::sort wrapper (NFC).
2025-08-04 09:27:01 +08:00
Matt Arsenault
b2f0ffd659
RuntimeLibcalls: Really move default libcall handling to tablegen (#148780)
Hack in the default setting so it's consistently generated like
the other cases. Maintain a list of targets where this applies.
The alternative would require new infrastructure to sort the system
library initialization in some way.

I wanted the unhandled target case to be treated as a fatal
error, but it turns out there's a hack in IRSymtab using
RuntimeLibcalls, which will fail out in many tests that
do not have a triple set. Many of the failures are simply
running llvm-as with no triple, which probably should not
depend on knowing an accurate set of calls.
2025-08-04 08:32:00 +09:00
Matt Arsenault
d0d3f15c38
RuntimeLibcalls: Stop opting out of exp10 (#148604) 2025-08-04 00:08:46 +09:00
Kazu Hirata
228e96b28a
[llvm] Use std::make_optional (NFC) (#151627)
std::make_optional<T> is a lot like std::make_unique<T> in that it
performs perfect forwarding of arguments for T's constructor.  As a
result, we don't have to repeat type names twice.
2025-08-01 00:24:40 -07:00
Joel E. Denny
37e03b56b8
Revert "[PGO] Add llvm.loop.estimated_trip_count metadata" (#151585)
Reverts llvm/llvm-project#148758

[As requested.](https://github.com/llvm/llvm-project/pull/148758#pullrequestreview-3076627201)
2025-07-31 15:56:31 -04:00
Joel E. Denny
f7b65011de
[PGO] Add llvm.loop.estimated_trip_count metadata (#148758)
This patch implements the `llvm.loop.estimated_trip_count` metadata
discussed in [[RFC] Fix Loop Transformations to Preserve Block
Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785).
As [suggested in the RFC
comments](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785/4),
it adds the new metadata to all loops at the time of profile ingestion
and estimates each trip count from the loop's `branch_weights` metadata.
As [suggested in the PR #128785
review](https://github.com/llvm/llvm-project/pull/128785#discussion_r2151091036),
it does so via a new `PGOEstimateTripCountsPass` pass, which creates the
new metadata for each loop but omits the value if it cannot estimate a
trip count due to the loop's form.

An important observation not previously discussed is that
`PGOEstimateTripCountsPass` *often* cannot estimate a loop's trip count,
but later passes can sometimes transform the loop in a way that makes it
possible. Currently, such passes do not necessarily update the metadata,
but eventually that should be fixed. Until then, if the new metadata has
no value, `llvm::getLoopEstimatedTripCount` disregards it and tries
again to estimate the trip count from the loop's current
`branch_weights` metadata.
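
As a rough illustration of estimating a trip count from branch weights, consider the sketch below. `estimateTripCount` is a hypothetical helper, and the formula (backedges per exit, rounded to nearest, plus one) is an assumption about one plausible estimate; it does not claim to reproduce what `PGOEstimateTripCountsPass` or `llvm::getLoopEstimatedTripCount` compute.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper: given the weight of the latch's backedge and the
// weight of its exit edge, estimate how many times the loop body runs per
// entry. Returns std::nullopt when no estimate is possible.
std::optional<uint64_t> estimateTripCount(uint64_t BackedgeWeight,
                                          uint64_t ExitWeight) {
  if (ExitWeight == 0)
    return std::nullopt; // The loop never exits according to the profile.

  // Backedges taken per exit, rounded to nearest. Overflow of the addition
  // is ignored in this sketch.
  uint64_t BackedgesPerExit = (BackedgeWeight + ExitWeight / 2) / ExitWeight;

  // The body runs once more than the backedge is taken.
  return BackedgesPerExit + 1;
}
```

Returning an empty optional mirrors the situation described above, where the metadata is present but carries no value.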
2025-07-31 12:28:25 -04:00
David Sherwood
6fbc397964
[IR] Add new CreateVectorInterleave interface (#150931)
This PR adds a new interface to IRBuilder called CreateVectorInterleave,
which can be used to create vector.interleave intrinsics of factors 2-8.

For convenience I have also moved getInterleaveIntrinsicID and
getDeinterleaveIntrinsicID from VectorUtils.cpp to Intrinsics.cpp where
they can be used by IRBuilder.
2025-07-29 08:47:07 +01:00
Shoreshen
a5deb59dfe
[AMDGPU] Add NoaliasAddrSpace to AAMDnodes (#149247)
This is a follow-up to
https://github.com/llvm/llvm-project/pull/136553, which calculates
NoaliasAddrSpace.

This PR carries the calculated info into MIR by adding it to AAMDnodes.
Matt Arsenault
1461a1c3b8
DAG: Emit an error if trying to legalize read/write register with illegal types (#145197)
This is a starting point to have better legalization failure diagnostics
2025-07-26 10:54:59 +09:00
Meredith Julian
be58069515
[LLVM][NVPTX] Upstream tanh intrinsic for libdevice (#149596)
Currently __nv_fast_tanhf() in libdevice maps to an nvvm intrinsic that
has not been upstreamed, which is causing issues when using the NVPTX
backend from upstream. Instead of upstreaming the intrinsic, we can
use the existing Intrinsic::tanh with the afn flag. This change
adds NVPTX backend support for ISD::TANH, adds auto-upgrade for the old
tanh_approx intrinsic to @llvm.tanh.f32 with afn flag so that libdevice
works properly upstream, and adds a basic codegen test and a case to the
auto-upgrade test.
2025-07-24 14:32:59 -07:00
Craig Topper
7b66629497
[IR] Remove static variables from Type::getWasm_ExternrefTy/getWasm_FuncrefTy. (#150323)
These were caching pointers to memory owned by LLVMContext and can
outlive the LLVMContext. The LLVMContext already caches pointer types so
we shouldn't need any caching here.
2025-07-23 15:56:55 -07:00
Craig Topper
71c06d7a5f
[IR] Remove unnecessary casts from IntegerType::get. NFC (#150299) 2025-07-23 14:34:26 -07:00
Nikita Popov
a7edc95c79 [IR] Optimize stripAndAccumulateConstantOffsets() for common case (NFC)
For the common case where we don't have bit width changing address
space casts, we can directly call accumulateConstantOffset() on the
original Offset. Skip the bit width reconciliation logic in that
case.
2025-07-23 12:19:50 +02:00
Danila Malyutin
b3e720b4de
[PassInstrumentation] Don't insert extra entries in getPassNameForClassName (#150029)
Don't modify ClassToPassName map unless ClassName is found. Instead,
just return empty StringRef if there is no matching entry. This will
prevent possible dangling references in ClassToPassName map in case of
ClassName being freed.
See https://github.com/llvm/llvm-project/pull/145059/files#r2219763671
for more context.
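
The difference between an inserting and a non-inserting lookup can be sketched as follows. `PassNameMapModel` is a hypothetical stand-in built on `std::unordered_map` with owning keys; the real map and its key type differ, so this shows only the lookup shape.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the class-name to pass-name map.
class PassNameMapModel {
  std::unordered_map<std::string, std::string> ClassToPassName;

public:
  void add(std::string ClassName, std::string PassName) {
    ClassToPassName.emplace(std::move(ClassName), std::move(PassName));
  }

  // Uses find() rather than operator[], so a miss does not create an entry.
  std::string_view lookup(const std::string &ClassName) const {
    auto It = ClassToPassName.find(ClassName);
    if (It == ClassToPassName.end())
      return std::string_view();
    return std::string_view(It->second);
  }

  std::size_t size() const { return ClassToPassName.size(); }
};
```

A miss returns an empty view and leaves the map untouched, so no entry keyed on a caller-owned string is ever created by a query.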
2025-07-22 20:51:49 +04:00