999 Commits

Author SHA1 Message Date
Stanislav Mekhanoshin
5f99854d01
[AMDGPU] Drop A and B neg modifier from amdgcn_wmma_bf16_16x16x32_bf16 (#189468)
Fixes: LCOMPILER-1673
2026-03-30 14:14:22 -07:00
Owen Anderson
3f2e24726a
[CHERI] Allow @llvm.clear_cache to accept pointers in address spaces other than 0. (#189283)
Co-Authored-by: Jessica Clarke <jrtc27@jrtc27.com>
2026-03-30 09:20:49 +02:00
Stanislav Mekhanoshin
a2d84b5d8d
[AMDGPU] Remove neg support from 4 more gfx1250 WMMA (#189115)
These were previously covered by AMDGPUWmmaIntrinsicModsAllReuse.
2026-03-27 15:20:14 -07:00
Stanislav Mekhanoshin
e69c7312f3
[AMDGPU] Disable neg_lo[0:1] and neg_hi[0:1] on wmma_f32_16x16x32_bf16 (#188649)
This is the pilot change, the rest will follow the same idea.
2026-03-26 00:37:05 -07:00
Owen Anderson
ca9ac0e24a
[CHERI] Allow @llvm.returnaddress to return a pointer in any address space. (#188464)
Clang now constructs calls to it using the default program address space from the DataLayout.

Co-authored-by: Alex Richardson <alexrichardson@google.com>
2026-03-25 13:59:38 +00:00
Grigory Pastukhov
f66bd8e81a
[LLVM] Add flatten function attribute to LLVM IR and implement recursive inlining in AlwaysInliner (#174899)
This adds a new function-level `flatten` LLVM IR attribute and
implements support for it in the AlwaysInliner pass, bringing LLVM's
behavior in line with GCC.

Previously, the `flatten` attribute only existed as a Clang attribute,
which was lowered to `alwaysinline` on individual call sites. Per the
RFC discussion [1], the consensus was to match GCC semantics:
recursively inline the entire call tree into the
flattened function, rather than just immediate call sites.

This PR:
- Adds the `flatten` function attribute to LLVM IR
- Implements recursive inlining of all viable callees in AlwaysInliner
- Uses inline history tracking to detect and stop at recursive call
cycles
- Emits optimization remarks when inlining is skipped due to recursion

A follow-up patch will update Clang to emit the LLVM `flatten` attribute
on functions instead of marking individual call sites with `alwaysinline`.

[1]
https://discourse.llvm.org/t/rfc-function-level-flatten-depth-attribute-for-depth-limited-inlining
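
As a rough illustration of the recursive inlining with cycle detection described in this message, the Python sketch below models a call graph as a dictionary and expands every callee into the flattened function. The names `flatten` and `graph` and the remark text are hypothetical; this is a toy model under those assumptions, not the AlwaysInliner implementation.

```python
def flatten(root, call_graph):
    """Return (body, remarks) for `root` with its whole call tree inlined.

    call_graph maps a function name to a list of items. An item that is
    itself a key of call_graph is treated as a call; anything else is an
    ordinary instruction.
    """
    remarks = []

    def expand(func, history):
        body = []
        for item in call_graph[func]:
            if item not in call_graph:
                body.append(item)  # ordinary instruction, copied as-is
            elif item in history:
                # Already being inlined along this path: inlining again
                # would never terminate, so keep the call and emit a remark.
                body.append(f"call {item}")
                remarks.append(f"not inlining {item} into {root}: recursion")
            else:
                body.extend(expand(item, history | {item}))
        return body

    return expand(root, frozenset({root})), remarks


graph = {
    "main": ["a", "helper"],
    "helper": ["b", "leaf", "helper"],  # helper calls itself
    "leaf": ["c"],
}
body, remarks = flatten("main", graph)
print(body)     # ['a', 'b', 'c', 'call helper']
print(remarks)  # ['not inlining helper into main: recursion']
```

The history set plays the role of the inline history tracking mentioned above: each expansion path carries the functions already inlined along it.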
2026-03-19 11:25:46 -07:00
CarolineConcatto
d96722b660
[LLVM] Improve IR parsing and printing for target memory locations (#176968)
This patch adds support for specifying all target memory locations using
a single IR spelling such as:
```
memory(target_mem: read)
```

This form is not supported in TableGen, but it is now accepted by the IR
parser. When the parser encounters target_mem, it expands it to all
target-memory locations (e.g., target_mem0, target_mem1, …).

Printing behavior

When all target-memory locations share the same ModRef value, the
printer now collapses them into a single entry:
```
memory(target_mem: read)
```
Otherwise, each target memory location is printed separately.

Rejected IR:
```
memory(target_mem0: write, target_mem: read)
```
This is invalid because the default access kind for the target memory
group must appear first.
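
The expand, collapse and ordering rules above can be modelled in a few lines of Python. This is only a sketch: the function names are hypothetical, the set of locations is assumed to be just target_mem0 and target_mem1, and non-target locations are ignored.

```python
TARGET_LOCS = ("target_mem0", "target_mem1")  # assumed set of locations


def parse_target_effects(entries):
    """entries: (location, access) pairs in the order they were written."""
    effects = {}
    for loc, access in entries:
        if loc == "target_mem":
            if effects:
                # The group default has to come before any target_memN entry.
                raise ValueError("target_mem must appear before target_memN")
            effects = {t: access for t in TARGET_LOCS}
        elif loc in TARGET_LOCS:
            effects[loc] = access  # overrides the group default
        else:
            raise ValueError(f"unknown location {loc}")
    return effects


def print_target_effects(effects):
    values = [effects.get(t, "none") for t in TARGET_LOCS]
    if len(set(values)) == 1:
        # Every location agrees: collapse into a single entry.
        parts = [f"target_mem: {values[0]}"]
    else:
        parts = [f"{t}: {v}" for t, v in zip(TARGET_LOCS, values)]
    return "memory(" + ", ".join(parts) + ")"


print(print_target_effects(parse_target_effects([("target_mem", "read")])))
# memory(target_mem: read)
```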
2026-03-19 17:29:54 +00:00
Vladislav Dzhidzhoev
cf92512e09
[DebugInfo] Add Verifier check for local imports in CU's imports field (#187118)
Since https://reviews.llvm.org/D144004, DwarfDebug asserts if
function-local imported entities are present in the imports field of
DICompileUnit.
This patch adds a Verifier check to detect such invalid IR earlier.

Incorrect occurrences of imported entities in DICompileUnit's imports
field in llvm/test/Bitcode/DIImportedEntity_elements.ll,
llvm/test/Bitcode/DIModule-fortran-external-module.ll are fixed.

This change is extracted from https://reviews.llvm.org/D144008.
2026-03-19 15:44:03 +01:00
gonzalobg
ea8fb06f24
[atomicrmw] fminimumnum/fmaximumnum support (#187030)
Adds support for `atomicrmw` `fminimumnum`/`fmaximumnum` operations.
These were added to C++ in P3008, and are exposed in libc++ in #186716.
Adding LLVM IR support for these unblocks work in both backends with HW
support, and frontends.
2026-03-18 09:35:49 +01:00
Pedro Lobo
57568c288d
[Reland][IR] Add initial support for the byte type (#186888)
This patch relands https://github.com/llvm/llvm-project/pull/178666. The
original version caused CI failures due to the missing target triple in
`llvm/test/CodeGen/X86/byte-constants.ll`. CI should be green now.
2026-03-16 23:32:24 +00:00
Pedro Lobo
70cd2acbd3
Revert "[IR] Add initial support for the byte type" (#186713)
Reverts llvm/llvm-project#178666 to unblock CI.
`CodeGen/X86/byte-constants.ll` is at fault. 
Will look into it and hopefully fix it by tomorrow.
2026-03-15 23:29:21 +00:00
Pedro Lobo
80f2ef70f5
[IR] Add initial support for the byte type (#178666)
Following the [byte type RFC](https://discourse.llvm.org/t/rfc-add-a-new-byte-type-to-llvm-ir/89522)
and the discussions within the [LLVM IR Formal Specification WG](https://discourse.llvm.org/t/rfc-forming-a-working-group-on-formal-specification-for-llvm/89056), this PR introduces initial support for the byte type in LLVM. This PR:
- Adds the byte type to LLVM's type system
- Extends the `bitcast` instruction to accept byte operands
- Adds parsing tests for all new functionality
- Fixes failing regression tests (IR2Vec and IRNormalizer)

---------

Co-authored-by: George Mitenkov <georgemitenk0v@gmail.com>
2026-03-15 21:56:06 +00:00
Teresa Johnson
cfa039e96e
[MemProf] Skip handling of memprof records for non-prevailing functions (#185963)
When building the combined summary index during a thin link, we already
performed a memory optimization for non-prevailing copies of a function
by not recording their allocation and callsite info in the associated
function summary. We can save on the thin link time as well by avoiding
building the memprof summary structures just to throw them away later
in the non-prevailing case.

The reason we were eagerly building these structures is that the memprof
summaries *precede* the corresponding function summary record, and we
don't know whether this is the prevailing copy of the function until we
parse the function summary record. To facilitate the new handling, we
emit the memprof summary records *after* the corresponding function
summary record. The bitcode summary version is bumped, and the reader is
changed to support both versions, for backwards compatibility. Note that
there is already a memprof test that tests an older record type and will
also test reading of the legacy version of the ordering:
llvm/test/ThinLTO/X86/memprof-old-alloc-context-summary.ll.

To make the new handling even more efficient, the lookup/insertion of
stack IDs in the combined summary index and the caching of their
corresponding stack index in the StackIdToIndex map is made lazy.

This resulted in a 27% reduction in thin link time for a large target
(21% without the lazy insertion change).
2026-03-12 11:00:25 -07:00
Shilei Tian
f05d2e8a39
[AMDGPU] Make uniform-work-group-size a valueless attribute (#183925)
The "uniform-work-group-size" function attribute previously took a
string value of "true" or "false". Since presence alone can convey the
"true" semantics and absence can convey "false", the value is
unnecessary.

This patch converts it to a valueless string attribute: presence
indicates true, absence indicates false. For backward compatibility,
auto-upgrade logic is added in both UpgradeAttributes (bitcode) and
UpgradeFunctionAttributes: if the old value is "true", the attribute is
kept without a value; if "false", the attribute is removed.
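
The upgrade rule can be pictured with a small Python model in which function attributes are a dictionary and a valueless attribute maps to `None`. The helper name is hypothetical and this is not the AutoUpgrade code itself.

```python
def upgrade_uniform_work_group_size(attrs):
    """attrs maps string attribute names to values (None means valueless)."""
    key = "uniform-work-group-size"
    upgraded = dict(attrs)
    value = upgraded.get(key)
    if value == "true":
        upgraded[key] = None  # keep the attribute, drop the value
    elif value == "false":
        del upgraded[key]     # absence now conveys "false"
    return upgraded


print(upgrade_uniform_work_group_size({"uniform-work-group-size": "true"}))
# {'uniform-work-group-size': None}
```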
2026-03-01 21:29:55 +00:00
yonghong-song
3e05ab6322
[ThinLTO] Reduce the number of renaming due to promotions (#183793)
Currently for thin-lto, the imported static global values (functions,
variables, etc) will be promoted/renamed from e.g., foo() to
foo.llvm.<hash>(). Such a renaming caused difficulties in live patching
since function name is changed ([1]).

It is possible that some global value names have to be promoted to avoid
name collision and linker failure. But in practice, majority of name
promotions can be avoided.

In [2], the suggestion is that thin-lto pre-link decides whether
a particular global value needs name promotion or not. If yes, later on
in thinBackend() the name will be promoted.

I compiled a particular linux kernel version (latest bpf-next tree)
and found 1216 global values with suffix .llvm.<hash>. With this patch,
the number of promoted functions is 2, 98% reduction from the
original kernel build.

If some native objects are not participating with LTO, name promotions
have to be done to avoid potential linker issues. So the current
implementation cannot be on by default. But in certain cases, e.g., linux kernel
build, people can enable lld flag --lto-whole-program-visibility to reduce the
number of functions like foo.llvm.<hash>().

For ThinLTOCodeGenerator.cpp which is used by llvm-lto tool and a
few other rare cases, reducing the number of renaming due to promotion,
is not implemented as lld flag '-lto-whole-program-visibility' is not
supported in ThinLTOCodeGenerator.cpp for now. In summary, this pull
request only supports llvm-lto2 style workflow.

The feature is off by default. To enable the feature, the lld flag
'-lto-whole-program-visibility' and the llvm flag
'-always-rename-promoted-locals=false' are needed.

The link [3] has more context for the pull request discussions.

[1] https://lpc.events/event/19/contributions/2212
[2] https://discourse.llvm.org/t/rfc-avoid-functions-like-foo-llvm-for-kernel-live-patch/89400
[3] https://github.com/llvm/llvm-project/pull/178587
2026-02-28 12:44:25 -08:00
yonghong-song
cd50a3074b
Revert "[ThinLTO] Reduce the number of renaming due to promotions (#178587)" (#183782)
There is a conflict with existing code. See
  https://github.com/llvm/llvm-project/pull/178587
Revert and resolve the conflict and then will submit later.
2026-02-27 10:04:30 -08:00
yonghong-song
975dba2863
[ThinLTO] Reduce the number of renaming due to promotions (#178587)
Currently for thin-lto, the imported static global values (functions,
variables, etc) will be promoted/renamed from e.g., foo() to
foo.llvm.<hash>(). Such a renaming caused difficulties in live patching
since function name is changed ([1]).

It is possible that some global value names have to be promoted to avoid
name collision and linker failure. But in practice, majority of name
promotions can be avoided.

In [2], the suggestion is that thin-lto pre-link decides whether
a particular global value needs name promotion or not. If yes, later on
in thinBackend() the name will be promoted.

I compiled a particular linux kernel version (latest bpf-next tree)
and found 1216 global values with suffix .llvm.<hash>. With this patch,
the number of promoted functions is 2, 98% reduction from the
original kernel build.

If some native objects are not participating with LTO, name promotions
have to be done to avoid potential linker issues. So the current
implementation cannot be on by default. But in certain cases, e.g., linux kernel
build, people can enable lld flag --lto-whole-program-visibility to reduce the
number of functions like foo.llvm.<hash>().

For ThinLTOCodeGenerator.cpp which is used by llvm-lto tool and a
few other rare cases, reducing the number of renaming due to promotion,
is not implemented as lld flag '-lto-whole-program-visibility' is not supported
in ThinLTOCodeGenerator.cpp for now. In summary, this pull request
only supports llvm-lto2 style workflow.

  [1] https://lpc.events/event/19/contributions/2212
  [2] https://discourse.llvm.org/t/rfc-avoid-functions-like-foo-llvm-for-kernel-live-patch/89400
2026-02-27 09:09:54 -08:00
Peter Collingbourne
943504eb08
IR: Add prefalign attribute for function definitions.
The prefalign attribute determines the function's preferred alignment.
By default, the function's preferred alignment is set in a target-specific
way, but it may be overridden with this attribute.

The backend logic will be added in followup patches.

Part of this RFC:
https://discourse.llvm.org/t/rfc-enhancing-function-alignment-attributes/88019

Reviewers: efriedma-quic, nikic, arsenm

Pull Request: https://github.com/llvm/llvm-project/pull/155527
2026-02-20 10:54:01 -08:00
Sam Elliott
0d08cb0e70
[outliners] Turn nooutline into an Enum Attribute (#163665)
This change turns the `"nooutline"` attribute into an enum attribute
called `nooutline`, and adds an auto-upgrader for bitcode to make the
same change to existing IR.

This IR attribute disables both the Machine Outliner (enabled at Oz for
some targets), and the IR Outliner (disabled by default).
2026-02-10 21:44:17 -08:00
Matt Arsenault
2502e3b7ba
IR: Promote "denormal-fp-math" to a first class attribute (#174293)
Convert "denormal-fp-math" and "denormal-fp-math-f32" into a first
class denormal_fpenv attribute. Previously the query for the effective
denormal mode involved two string attribute queries with parsing. I'm
introducing more uses of this, so it makes sense to convert this
to a more efficient encoding. The old representation was also awkward
since it was split across two separate attributes. The new encoding
just stores the default and float modes as bitfields, largely avoiding
the need to consider if the other mode is set.

The syntax in the common cases looks like this:
  `denormal_fpenv(preservesign,preservesign)`
  `denormal_fpenv(float: preservesign,preservesign)`
  `denormal_fpenv(dynamic,dynamic float: preservesign,preservesign)`
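
To make the bitfield idea concrete, here is a Python sketch that packs the default and float mode pairs into one integer. The real bit layout is not given in this message: 2 bits per mode with the default pair in the low bits is an assumption, as are the helper names and the reading of each pair as (output, input).

```python
MODES = ("ieee", "preservesign", "positivezero", "dynamic")


def encode(default, f32):
    """Pack two (output, input) mode pairs into a single 8-bit value."""
    bits = 0
    for shift, mode in zip((0, 2, 4, 6), (*default, *f32)):
        bits |= MODES.index(mode) << shift
    return bits


def decode(bits):
    """Inverse of encode: recover the default and float mode pairs."""
    fields = [MODES[(bits >> shift) & 0b11] for shift in (0, 2, 4, 6)]
    return (fields[0], fields[1]), (fields[2], fields[3])


packed = encode(("dynamic", "dynamic"), ("preservesign", "preservesign"))
print(packed)          # 95
print(decode(packed))
```

With such an encoding, the effective mode for either component is a shift and a mask on one value rather than two string attribute lookups with parsing.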

I wasn't sure about reusing the float type name instead of adding a
new keyword. It's parsed as a type but only accepts float. I'm also
debating switching the name to subnormal to match the current
preferred IEEE terminology (also used by nofpclass and other
contexts).

This has a behavior change when using the command flag debug
options to set the denormal mode. The behavior of the flag
ignored functions with an explicit attribute set, per
the default and f32 version. Now that these are one attribute,
the flag logic can't distinguish which of the two components
were explicitly set on the function. Only one test appeared to
rely on this behavior, so I just avoided using the flags in it.

This also does not perform all the code cleanups this enables.
In particular the attributor handling could be cleaned up.

I also guessed at how to support this in MLIR. I followed
MemoryEffects as a reference; it appears bitfields are expanded
into arguments to attributes, so the representation there is
a bit uglier with the 2 2-element fields flattened into 4 arguments.
2026-02-05 13:31:26 +00:00
Vladislav Dzhidzhoev
b9cecee3fb
Reland "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)" (#165032)
This is an attempt to merge https://reviews.llvm.org/D144006 with LTO
fix.

The last merge attempt was
https://github.com/llvm/llvm-project/pull/75385.
The issue with it was investigated in
https://github.com/llvm/llvm-project/pull/75385#issuecomment-2386684121.
The problem happens when 
1. Several modules are being linked.
2. There are several DISubprograms that initially belong to different
modules but represent the same source code function (for example, a
function included from the same source code file).
3. Some of such DISubprograms survive IR linking. It may happen if one
of them is inlined somewhere or if the functions that have these
DISubprograms attached have internal linkage.
4. Each of these DISubprograms has a local type that corresponds to the
same source code type. These types are initially from different modules,
but have the same ODR identifier.

If the same (in the sense of ODR identifier/ODR uniquing rules) local
type is present in two modules, and these modules are linked together,
the type gets uniqued. A DIType, that happens to be loaded first,
survives linking, and the references on other types with the same ODR
identifier from the modules loaded later are replaced with the
references on the DIType loaded first. Since definition subprograms, in
scope of which these types are located, are not deduplicated, the linker
output may contain multiple DISubprograms having the same (uniqued)
type in their retainedNodes lists.
Further compilation of such modules causes crashes.

To tackle that,
* previous solution to handle LTO linking with local types in
retainedNodes is removed (cloneLocalTypes() function),
* for each loaded distinct (definition) DISubprogram, its retainedNodes
list is scanned after loading, and DITypes with a scope of another
subprogram are removed. If something from a Function corresponding to
the DISubprogram references uniqued type, we rely on cross-CU links.
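
The retainedNodes scan in the second bullet can be sketched in Python as a filter over each subprogram's list. The data model is hypothetical: each node is a dictionary whose `scope` is already resolved to the name of its enclosing subprogram, which glosses over lexical block scopes.

```python
def prune_retained_nodes(subprograms):
    """subprograms maps a subprogram name to its list of retained nodes.

    Types whose scope is a different subprogram are dropped from the list;
    all other nodes (e.g. local variables) are left alone.
    """
    for sp_name, nodes in subprograms.items():
        nodes[:] = [
            n for n in nodes
            if not (n["kind"] == "type" and n["scope"] != sp_name)
        ]
    return subprograms


# After ODR uniquing, both subprograms point at the type scoped in "f.1".
uniqued = {"kind": "type", "name": "Local", "scope": "f.1"}
sps = {
    "f.1": [uniqued, {"kind": "variable", "name": "x", "scope": "f.1"}],
    "f.2": [uniqued],
}
prune_retained_nodes(sps)
print([n["name"] for n in sps["f.1"]])  # ['Local', 'x']
print(sps["f.2"])                       # []
```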

Additionally:
* a check is added to Verifier to report local types located in the
wrong retainedNodes list.

Original commit message follows.
---------

RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544

Similar to imported declarations, the patch tracks function-local types in
DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with
the aforementioned metadata change and gains support for function-local
types scoped within a lexical block.

The patch assumes that DICompileUnit's 'enums field' no longer tracks local
types and DwarfDebug would assert if any locally-scoped types get placed there.

Authored-by: Kristina Bessonova <kbessonova@accesssoftek.com>
Co-authored-by: Jeremy Morse <jeremy.morse@sony.com>
2026-02-04 00:34:52 +01:00
Teresa Johnson
b30971c4bb
[ThinLTO] Remove unused relative block frequency support (#177215)
This removes most of the handling of the relative block frequency
support added in 2018 in c73cec84c99e5a63dca961fef67998a677c53a3c, which
was disabled by default and never utilized in the thin link as expected.

Support for reading old Bitcode containing the record is maintained as
required for backwards compatibility requirements, as is the support for
parsing old LLVM assembly containing that information. Tests ensure that
this backwards compatibility is maintained.

This came up in the context of redundant BFI/DT computations which
existed largely for the purpose of computing this information
and are being addressed in PR176646.
2026-01-21 11:39:57 -08:00
Aiden Grossman
e2d7cd685d
[IR] Make dead_on_return attribute optionally sized
This patch makes the dead_on_return parameter attribute optionally require a number
of bytes to be passed in to specify the number of bytes known to be dead
upon function return/unwind. This is aimed at enabling annotating the
this pointer in C++ destructors with dead_on_return in clang. We need
this to handle cases like the following:

```
struct X {
  int n;
  ~X() {
    this[n].n = 0;
  }
};
void f() {
  X xs[] = {42, -1};
}
```

Where we are only certain that sizeof(X) bytes are dead upon return of ~X.
Otherwise DSE would be able to eliminate the store in ~X which would not
be correct.

This patch only does the wiring within IR. Future patches will make
clang emit correct sizing information and update DSE to only delete
stores to objects marked dead_on_return that are provably in bounds of
the number of bytes specified to be dead_on_return.
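
The in-bounds condition that DSE is expected to check can be written down as a small predicate. This is a sketch of the rule as described here, not of any DSE code; the function name is hypothetical and an unknown offset is modelled as `None`.

```python
def store_is_provably_dead(store_offset, store_size, dead_bytes):
    """True only if the store lies entirely within the first `dead_bytes`
    bytes of the dead_on_return pointer. An offset that is not a known
    constant (None) is never provably in bounds."""
    if store_offset is None:
        return False
    return store_offset >= 0 and store_offset + store_size <= dead_bytes


SIZEOF_X = 4  # assuming a 4-byte int, so sizeof(X) == 4

print(store_is_provably_dead(0, 4, SIZEOF_X))       # this->n        -> True
print(store_is_provably_dead(42 * 4, 4, SIZEOF_X))  # this[42].n     -> False
print(store_is_provably_dead(None, 4, SIZEOF_X))    # this[n].n      -> False
```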

Reviewers: nikic, alinas, antoniofrighetto

Pull Request: https://github.com/llvm/llvm-project/pull/171712
2026-01-21 08:22:05 -08:00
Luke Lau
cee36b23cc
[IR] Allow non-constant offsets in @llvm.vector.splice.{left,right} (#174693)
Following on from #170796, this PR implements the second part of
https://discourse.llvm.org/t/rfc-allow-non-constant-offsets-in-llvm-vector-splice/88974
by allowing non-constant offsets in the vector splice intrinsics.

Previously @llvm.vector.splice had a restriction enforced by the
verifier that the offset had to be known to be within the range of the
vector at compile time. Because we can't enforce this with non-constant
offsets, it's been relaxed so that offsets that would slide the vector
out of bounds return a poison value, similar to
insertelement/extractelement.

@llvm.vector.splice.left also previously only allowed offsets within the
range 0 <= Offset < N, but this has been relaxed to 0 <= Offset <= N so
that it's consistent with @llvm.vector.splice.right.

In lieu of the verifier checks that were removed, InstSimplify has been
taught to fold splices to poison when the offset is out of bounds.
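
A Python model of the relaxed semantics, as I read them from this description: both intrinsics conceptually concatenate the two operands, slide by the offset and keep N elements, and any offset outside 0 <= Offset <= N yields poison. The slicing below is an assumption derived from that reading, not a quotation of the LangRef.

```python
POISON = "poison"


def splice_left(a, b, offset):
    """Slide concat(a, b) left by `offset` and keep the first N elements."""
    n = len(a)
    if not 0 <= offset <= n:
        return POISON
    return (a + b)[offset:offset + n]


def splice_right(a, b, offset):
    """Slide concat(a, b) right by `offset` and keep the last N elements."""
    n = len(a)
    if not 0 <= offset <= n:
        return POISON
    return (a + b)[n - offset:2 * n - offset]


a, b = list("ABCD"), list("EFGH")
print(splice_left(a, b, 1))   # ['B', 'C', 'D', 'E']
print(splice_right(a, b, 3))  # ['B', 'C', 'D', 'E']
print(splice_left(a, b, 5))   # poison
```

In this model an offset of N is valid for both forms: the left form returns the second operand and the right form returns the first.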

The cost model isn't implemented in this PR, and just returns invalid
for any non-constant offsets for now. I think the correct way to cost
these non-constant offsets isn't through getShuffleCost, because it
can't handle variable masks, but instead just through
getIntrinsicInstCost.
2026-01-21 10:58:40 +00:00
Matt Arsenault
0d4a35d560
IR: Remove llvm.convert.to.fp16 and llvm.convert.from.fp16 intrinsics (#174484)
These are long overdue for removal. These were originally a hack
to support loading half values before there was any / decent support
for the half type through the backend. There's no reason to continue
supporting these, they're equivalent to fpext/fptrunc with a bitcast.

SelectionDAG stopped translating these directly, and used the
bitcast + fp cast since f7a02c17628e825, so there's been no reason
to use these since 2014.
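
The stated equivalence (a floating-point cast combined with a bitcast to or from i16) can be mimicked in Python with the `struct` module's IEEE half format. The function names are hypothetical, and the model only covers finite values that fit in half precision, since `struct.pack` raises on overflow where `fptrunc` would not.

```python
import struct


def convert_to_fp16(x):
    """Model of: fptrunc to half, then bitcast half to i16."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def convert_from_fp16(bits):
    """Model of: bitcast i16 to half, then fpext."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


print(hex(convert_to_fp16(1.0)))   # 0x3c00
print(convert_from_fp16(0xC000))   # -2.0
```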
2026-01-21 09:50:28 +00:00
Nikita Popov
af7c10618b
[BitcodeReader] Improve error messages
Avoid using "Invalid record" for all errors. At least mention
what kind of record it is.
2026-01-19 14:28:40 +01:00
Shilei Tian
5a63367b15
Reapply "[AMDGPU] Rework the clamp support for WMMA instructions" (#174674) (#174697)
This reverts commit 0b2f3cfb72a76fa90f3ec2a234caabe0d0712590.
2026-01-07 06:12:19 +00:00
dyung
0b2f3cfb72
Revert "[AMDGPU] Rework the clamp support for WMMA instructions" (#174674)
Reverts llvm/llvm-project#174310

This change is causing 2 cross-project-test failures on
https://lab.llvm.org/buildbot/#/builders/174/builds/29695
2026-01-07 01:18:23 +00:00
Shilei Tian
ccca3b8c67
[AMDGPU] Rework the clamp support for WMMA instructions (#174310)
Fixes #166989.
2026-01-06 15:46:40 -05:00
Luke Lau
ad4bfac732
[IR] Split vector.splice into vector.splice.left and vector.splice.right (#170796)
This PR implements the first change outlined in
https://discourse.llvm.org/t/rfc-allow-non-constant-offsets-in-llvm-vector-splice/88974?u=lukel

In order to allow non-immediate offsets in the llvm.vector.splice
intrinsic, we need to separate out the "shift left" and "shift right"
modes into two separate intrinsics, which were previously determined by
whether or not the offset is positive or negative.

The description in the LangRef has also been reworded in terms of
sliding elements left or right and extracting either the upper or lower
half as opposed to extracting from a certain index, which brings it
in line with the definition of `llvm.fshr.*`/`llvm.fshl.*`.

This patch teaches AutoUpgrade.cpp to upgrade the old intrinsics into
their new equivalent one based on their offset, so existing uses of
vector.splice should still work.
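
The upgrade decision can be sketched as a mapping from the old signed immediate to one of the two new intrinsics. The helper is hypothetical, type-mangling suffixes are omitted, and it assumes that non-negative offsets select the left form while negative ones select the right form with the magnitude as the new offset.

```python
def upgrade_splice(imm):
    """Map an old llvm.vector.splice immediate to (new intrinsic, offset)."""
    if imm >= 0:
        return ("llvm.vector.splice.left", imm)
    return ("llvm.vector.splice.right", -imm)


print(upgrade_splice(1))   # ('llvm.vector.splice.left', 1)
print(upgrade_splice(-3))  # ('llvm.vector.splice.right', 3)
```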

Uses of llvm.vector.splice in `llvm/test/CodeGen` haven't been replaced
in this PR to keep the diff small and kick the tyres on the AutoUpgrader
a bit. I planned to do this in a follow up NFC but can include it in
this PR if reviewers prefer.

Similarly the shuffle costing kind `SK_Splice` has just been kept the
same for now, to be split into `SK_SpliceLeft` and `SK_SpliceRight`
later.
2026-01-06 15:41:26 +08:00
Shilei Tian
c97de4387b
Revert "[AMDGPU] add clamp immediate operand to WMMA iu8 intrinsic (#171069)" (#174303)
This reverts commit 2c376ffeca490a5732e4fd6e98e5351fcf6d692a because it
breaks assembler.

```
$ llvm-mc -triple=amdgcn -mcpu=gfx1250 -show-encoding <<< "v_wmma_i32_16x16x64_iu8 v[16:23], v[0:7], v[8:15], v[16:23] matrix_b_reuse"
  v_wmma_i32_16x16x64_iu8 v[16:23], v[0:7], v[8:15], v[16:23] clamp ; encoding: [0x10,0x80,0x72,0xcc,0x00,0x11,0x42,0x1c]
```

We have a fundamental issue in the clamp support in VOP3P instructions,
which will need more changes.
2026-01-04 02:13:21 +00:00
Muhammad Abdul
2c376ffeca
[AMDGPU] add clamp immediate operand to WMMA iu8 intrinsic (#171069)
Fixes #166989 

- Adds a clamp immediate operand to the AMDGPU WMMA iu8 intrinsic and
threads it through LLVM IR, MIR lowering, Clang builtins/tests, and MLIR
ROCDL dialect so all layers agree on the new operand
- Updates AMDGPUWmmaIntrinsicModsAB so the clamp attribute is emitted,
teaches VOP3P encoding to accept the immediate, and adjusts Clang
codegen/builtin headers plus MLIR op definitions and tests to match
- Documents what the WMMA clamp operand does
- Implements bitcode AutoUpgrade for source compatibility on WMMA IU8
Intrinsic op

Possible future enhancements:
- infer clamping as an optimization fold based on the use context

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-12-27 12:51:29 -05:00
Teresa Johnson
e3c621c50b
[ThinLTO][MemProf] Add option to override max ICP with larger number (#171652)
Adds an option -module-summary-max-indirect-edges, and wiring into the
ICP logic that collects promotion candidates from VP metadata, to
support a larger number of promotion candidates for use in building the
ThinLTO summary. Also use this in the MemProf ThinLTO backend handling
where we perform memprof ICP during cloning.

The new option, essentially off by default, can be used to override the
value of -icp-max-prom, which is checked internally in ICP, with a
larger max value when collecting candidates from the VP metadata.

For MemProf in particular, where we synthesize new VP metadata targets
from allocation contexts, which may not be all that frequent, we need to
be able to include a larger set of these targets in the summary in order
to correctly handle indirect calls in the contexts. Otherwise we will
not set up the callsite graph edges correctly.
2025-12-15 10:16:06 -08:00
anjenner
27651133e2
AMDGPU: Drop and upgrade llvm.amdgcn.atomic.csub/cond.sub to atomicrmw (#105553)
These both perform conditional subtraction, returning the minuend and
zero respectively, if the difference is negative.
2025-12-09 23:13:33 +00:00
Paul Walker
b5a3b8b704
[LLVM][SVE] Remove aarch64.sve.rev intrinsic, using vector.reverse instead. (#169654)
2025-11-28 11:59:34 +00:00
Paul Walker
8401a8d0be
[NFC][LLVM] Add bitcode tests for llvm.aarch64.sve.rev
2025-11-27 10:42:29 +00:00
Peter Collingbourne
d2379effe9
Add deactivation symbol operand to ConstantPtrAuth.
Deactivation symbol operands are supported in the code generator by
building on the previously added support for IRELATIVE relocations.

Reviewers: ojhunt, fmayer, ahmedbougacha, nikic, efriedma-quic

Reviewed By: fmayer

Pull Request: https://github.com/llvm/llvm-project/pull/133537
2025-11-26 12:39:40 -08:00
Peter Collingbourne
6227eb90da
Add IR and codegen support for deactivation symbols.
Deactivation symbols are a mechanism for allowing object files to disable
specific instructions in other object files at link time. The initial use
case is for pointer field protection.

For more information, see the RFC:
https://discourse.llvm.org/t/rfc-deactivation-symbols/85556

Reviewers: ojhunt, nikic, fmayer, arsenm, ahmedbougacha

Reviewed By: fmayer

Pull Request: https://github.com/llvm/llvm-project/pull/133536
2025-11-26 12:37:09 -08:00
CarolineConcatto
200793ac21
Extend MemoryEffects to Support Target-Specific Memory Locations (#148650)
This patch introduces preliminary support for additional memory
locations.
They are: target_mem0 and target_mem1 and they model memory locations
that cannot be represented with existing memory locations.

It was a solution suggested in:
https://discourse.llvm.org/t/rfc-improving-fpmr-handling-for-fp8-intrinsics-in-llvm/86868/6

Currently, these locations are not yet target-specific. The goal is to
enable the compiler to express read/write effects on these resources.
2025-11-18 11:10:58 +00:00
Jay Foad
f037f41350
[IR] Add new function attribute nocreateundeforpoison (#164809)
Also add a corresponding intrinsic property that can be used to mark
intrinsics that do not introduce poison, for example simple arithmetic
intrinsics that propagate poison just like a simple arithmetic
instruction.

As a smoke test this patch adds the new property to
llvm.amdgcn.fmul.legacy.
2025-11-04 12:00:44 +00:00
Orlando Cazalet-Hyams
aa5fe56db4
[DebugInfo] Add dataSize to DIBasicType to add DW_AT_bit_size to _BitInt types (#164372)
DW_TAG_base_type DIEs are permitted to have both byte_size and bit_size
attributes "If the value of an object of the given type does not fully
occupy the storage described by a byte size attribute"

* Add DataSizeInBits to DIBasicType (`DIBasicType(... dataSize: n ...)` in IR).
* Change Clang to add DataSizeInBits to _BitInt type metadata.
* Change LLVM to add DW_AT_bit_size to base_type DIEs that have non-zero
  DataSizeInBits.
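
As an illustration of the third bullet, the Python sketch below builds the attribute set of a base type DIE and only adds DW_AT_bit_size when the data size is non-zero. The helper and its dictionary representation are hypothetical, and the 32-bit storage size used for `_BitInt(17)` is an assumed example value.

```python
def base_type_die(name, size_in_bits, data_size_in_bits=0):
    """Return the attributes of a DW_TAG_base_type DIE as a dictionary."""
    die = {"DW_AT_name": name, "DW_AT_byte_size": size_in_bits // 8}
    if data_size_in_bits != 0:
        # The value does not fill its storage: record the real width too.
        die["DW_AT_bit_size"] = data_size_in_bits
    return die


print(base_type_die("int", 32))
print(base_type_die("_BitInt(17)", 32, data_size_in_bits=17))
```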

TODO: Do we need to emit DW_AT_data_bit_offset for big endian targets?
See discussion on the PR.

Fixes [#61952](https://github.com/llvm/llvm-project/issues/61952)

---------

Co-authored-by: David Stenberg <david.stenberg@ericsson.com>
2025-10-29 15:23:46 +00:00
Michael Buch
49f918d4c3
[llvm][Bitcode][ObjC] Fix order of setter/getter argument to DIObjCProperty constructor (#165421)
Depends on:
* https://github.com/llvm/llvm-project/pull/165401

We weren't testing `DIObjCProperty` roundtripping. So this was never
caught.

The consequence of this is that the `setter:` would have the getter name
and `getter:` would have the setter name.
2025-10-29 12:14:56 +00:00
paperchalice
4a95cd14b3
[test][Bitcode] Remove unsafe-fp-math uses (NFC) (#164743)
Post cleanup for #164534.
2025-10-23 16:38:22 +08:00
Teresa Johnson
eb74d8e03c
[ThinLTO] Add index flag for internalization/promotion status (#164530)
Add an index-wide flag indicating whether index-based internalization
and promotion have completed. This will be used in a follow on change.
2025-10-22 07:30:43 -07:00
Daniel Kiss
048070ba6f
[ARM][AArch64] BTI,GCS,PAC Module flag update. (#86212)
Module flag is used to indicate the feature to be propagated to the
function. Now that the frontend emits all attributes accordingly, let's
make the auto upgrade only do work when old and new bitcodes are
merged.

Depends on #82819 and #86031
2025-10-22 09:29:06 +02:00
Nikita Popov
573ca36753
[IR] Replace alignment argument with attribute on masked intrinsics (#163802)
The `masked.load`, `masked.store`, `masked.gather` and `masked.scatter`
intrinsics currently accept a separate alignment immarg. Replace this
with an `align` attribute on the pointer / vector of pointers argument.

This is the standard representation for alignment information on
intrinsics, and is already used by all other memory intrinsics. This
means the signatures now match llvm.expandload, llvm.vp.load, etc.
(Things like llvm.memcpy used to have a separate alignment argument as
well, but were already migrated a long time ago.)

It's worth noting that the masked.gather and masked.scatter intrinsics
previously accepted a zero alignment to indicate the ABI type alignment
of the element type. This special case is gone now: If the align
attribute is omitted, the implied alignment is 1, as usual. If ABI
alignment is desired, it needs to be explicitly emitted (which the
IRBuilder API already requires anyway).
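
The change in the zero/omitted case can be summarised with two tiny Python helpers. Both names are hypothetical; they only restate the rule in the paragraph above and do not describe how existing IR is upgraded.

```python
def old_gather_scatter_alignment(align_arg, abi_align):
    """Old form: an alignment argument of 0 meant the ABI type alignment."""
    return abi_align if align_arg == 0 else align_arg


def new_alignment(align_attr=None):
    """New form: a missing `align` attribute implies alignment 1."""
    return 1 if align_attr is None else align_attr


print(old_gather_scatter_alignment(0, abi_align=4))  # 4
print(new_alignment())                               # 1
print(new_alignment(4))                              # 4
```

So a frontend that relied on the old zero special case has to spell out the ABI alignment explicitly to keep the same behaviour.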
2025-10-20 08:50:09 +00:00
Michael Buch
cf1cdde24e
[llvm][DebugInfo] Add 'sourceLanguageVersion' field support to DICompileUnit (#162632)
Depends on:
* https://github.com/llvm/llvm-project/pull/162445

In preparation to emit DWARFv6's `DW_AT_language_version`.
2025-10-15 16:52:45 +01:00
Michael Buch
c32753a77a
[llvm][DebugInfo] Add 'sourceLanguageName' field support to DICompileUnit (#162445)
Depends on:
* https://github.com/llvm/llvm-project/pull/162255
* https://github.com/llvm/llvm-project/pull/162434

Part of a patch series to support the DWARFv6
`DW_AT_language_name`/`DW_AT_language_version` attributes.
2025-10-10 09:54:04 +01:00
Marco Elver
224873d7ac
[AllocToken] Introduce sanitize_alloc_token attribute and alloc_token metadata (#160131)
In preparation of adding the "AllocToken" pass, add the pre-requisite
`sanitize_alloc_token` function attribute and `alloc_token` metadata.

---

This change is part of the following series:
  1. https://github.com/llvm/llvm-project/pull/160131
  2. https://github.com/llvm/llvm-project/pull/156838
  3. https://github.com/llvm/llvm-project/pull/162098
  4. https://github.com/llvm/llvm-project/pull/162099
  5. https://github.com/llvm/llvm-project/pull/156839
  6. https://github.com/llvm/llvm-project/pull/156840
  7. https://github.com/llvm/llvm-project/pull/156841
  8. https://github.com/llvm/llvm-project/pull/156842
2025-10-07 12:51:42 +02:00
Nikita Popov
f31bc666f4
[IR] Handle addrspacecast in findBaseObject() (#162076)
Make findBaseObject() look through addrspacecast, so that
getAliaseeObject() works with an aliasee that uses an addrspacecast.
This fixes a crash during module summary index emission.
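
A toy Python version of "looking through" casts, for illustration only. Constant expressions are modelled as nested `(opcode, operand)` tuples and a global object as a string; the real findBaseObject() handles more expression kinds than the two opcodes assumed here.

```python
def find_base_object(expr):
    """Strip casts from an aliasee expression and return the underlying
    global object, or None if an unsupported expression is encountered."""
    while isinstance(expr, tuple):
        opcode, operand = expr
        if opcode not in ("bitcast", "addrspacecast"):
            return None
        expr = operand  # look through the cast
    return expr


print(find_base_object(("addrspacecast", "@g")))  # @g
```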

Fixes https://github.com/llvm/llvm-project/issues/161646.
2025-10-06 16:18:12 +02:00