16127 Commits

Author SHA1 Message Date
David Tenty
63195d3d7a
[NFC][CMake] quote ${CMAKE_SYSTEM_NAME} consistently (#154537)
A CMake change included in CMake 4.0 makes `AIX` into a variable
(similar to `APPLE`, etc.)
ff03db6657

However, `${CMAKE_SYSTEM_NAME}` unfortunately also expands exactly to
`AIX` and `if` auto-expands variable names in CMake. That means you get
a double expansion if you write:

`if (${CMAKE_SYSTEM_NAME}  MATCHES "AIX")`
which becomes:
`if (AIX  MATCHES "AIX")`
which is as if you wrote:
`if (ON MATCHES "AIX")`

You can prevent this by quoting the expansion of "${CMAKE_SYSTEM_NAME}",
due to policy
[CMP0054](https://cmake.org/cmake/help/latest/policy/CMP0054.html#policy:CMP0054)
which is on by default in 4.0+. Most of the LLVM CMake already does
this, but this PR fixes the remaining cases where we do not.
2025-08-20 12:45:41 -04:00
Zhaoxuan Jiang
2738828c0e
[Reland] [CGData] Lazy loading support for stable function map (#154491)
This is an attempt to reland #151660 by including a missing STL header
found by a buildbot failure.

The stable function map could be huge for a large application. Fully
loading it is slow and consumes a significant amount of memory, which is
unnecessary and drastically slows down compilation especially for
non-LTO and distributed-ThinLTO setups. This patch introduces opt-in
lazy loading support for the stable function map. The detailed changes
are:

- `StableFunctionMap`
  - The map now stores entries in an `EntryStorage` struct, which includes
    offsets for serialized entries and a `std::once_flag` for thread-safe
    lazy loading.
  - The underlying map type is changed from `DenseMap` to
    `std::unordered_map` for compatibility with `std::once_flag`.
  - `contains()`, `size()` and `at()` are implemented to only load
    requested entries on demand.

- Lazy Loading Mechanism
  - When reading indexed codegen data, if the newly-introduced
    `-indexed-codegen-data-lazy-loading` flag is set, the stable function
    map is not fully deserialized up front. The binary format for the stable
    function map now includes offsets and sizes to support lazy loading.
  - The safety of lazy loading is guarded by the once flag per function
    hash. This guarantees that even in a multi-threaded environment, the
    deserialization for a given function hash will happen exactly once. The
    first thread to request it performs the load, and subsequent threads
    will wait for it to complete before using the data. For single-threaded
    builds, the overhead is negligible (a single check on the once flag).
    For multi-threaded scenarios, users can omit the flag to retain the
    previous eager-loading behavior.
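
The per-hash once-flag scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: `LazyEntry`, `LazyMap`, `addOffset()` and `loads()` are hypothetical names, and the "deserialization" is a placeholder string. Since `std::once_flag` is neither copyable nor movable, a node-based container such as `std::unordered_map` is a natural fit, which is presumably the compatibility concern mentioned above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a map entry; not the real `EntryStorage`.
struct LazyEntry {
  std::uint64_t Offset = 0; // position of the serialized entry
  std::once_flag Loaded;    // guards the one-time deserialization
  std::string Value;        // decoded payload, valid once loaded
};

class LazyMap {
public:
  // Record where an entry lives without decoding it. All insertions must
  // finish before any concurrent call to at().
  void addOffset(std::uint64_t Hash, std::uint64_t Offset) {
    Entries[Hash].Offset = Offset; // constructs the entry in place
  }

  bool contains(std::uint64_t Hash) const { return Entries.count(Hash) != 0; }

  // Decode on first access; later callers reuse the cached value.
  // Throws std::out_of_range for an unknown hash.
  const std::string &at(std::uint64_t Hash) {
    LazyEntry &E = Entries.at(Hash);
    std::call_once(E.Loaded, [&E, this] {
      // Placeholder for real deserialization starting at E.Offset.
      E.Value = "entry@" + std::to_string(E.Offset);
      ++Loads;
    });
    return E.Value;
  }

  // Number of deserializations performed so far.
  int loads() const { return Loads.load(); }

private:
  std::unordered_map<std::uint64_t, LazyEntry> Entries;
  std::atomic<int> Loads{0};
};
```

In this sketch, calling `at()` twice for the same hash runs the load once; a second thread arriving during the load blocks inside `std::call_once` until the first one finishes.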
2025-08-20 06:15:04 -07:00
Joseph Huber
e2777af84b [LLVM] Add missing dependency for offload-wrapper tool 2025-08-19 11:19:35 -05:00
Joseph Huber
4c9b7ff04c
[LLVM] Introduce 'llvm-offload-wrapper' tool (#153504)
Summary:
This is a standalone tool that does the wrapper stage of the
`clang-linker-wrapper`. We want this to be an external tool because
currently there's no easy way to split apart what the
clang-linker-wrapper is doing under the hood. With this tool, users can
manually extract files with `clang-offload-packager`, feed them through
`clang --target=<triple>` and then use this tool to generate a `.bc`
file they can give to the linker. The goal here is to make reproducing
the linker wrapper steps easier.
2025-08-19 11:05:48 -05:00
Matthias Braun
48232594a0
llvm-profgen: Options cleanup / fixes (#147632)
- Add `cl::cat(ProfGenCategory)` to non-hidden options so they show up
  in `--help` output.
- Introduce `Options.h` for options referenced in multiple files.
2025-08-18 21:42:55 +00:00
Matthias Braun
43df97a909
llvm-profgen: Avoid "using namespace" in headers (#147631)
Avoid global `using namespace` directives in headers as they are bad
style.
2025-08-18 18:55:23 +00:00
Chris B
be0135538a
[DirectX][objdump] Add support for printing signatures (#153320)
This adds support for printing the signature sections as part of the
`-p` flag for printing private headers.

The formatting aims to roughly match the formatting used by DXC's
`/dumpbin` flag.

The original version's printed output left some trailing whitespace on
lines, which caused the tests to fail with the strict whitespace
matching.

Re-lands #152531.
Resolves #152380.
2025-08-15 18:10:49 -05:00
Sterling-Augustine
5b0619e79b
Move function info word into its own data structure (#153627)
The sframe generator needs to construct this word separately from FDEs
themselves, so split them into a separate data structure.
2025-08-15 13:16:34 -07:00
Pavel Labath
dab971ed23
[llvm-readobj] Dump SFrame relocations as well (#153161)
If there is a relocation for a particular FDE, print it as well. This is
mainly meant for human consumption (otherwise, there's no way to tell
which function a given (relocatable) FDE refers to). For testing of
relocation generation, I'd still recommend using the regular relocation
dumper, as this code will not detect (e.g.) any superfluous relocations.

I've considered handling relocations inside the SFrameParser class, but
I couldn't find an elegant way to do that. Right now, I don't have a use
case for resolving relocations there as lldb (my other use case for
SFrameParser) will always operate on linked objects.
2025-08-15 10:30:41 +00:00
Nikita Popov
598562077a [llvm-c] Fix memory leak in test 2025-08-15 10:33:08 +02:00
Kazu Hirata
0923aafcf9 [llvm-c-test] Fix a warning
This patch fixes:

  llvm/tools/llvm-c-test/debuginfo.c:447:27: error: unused variable
  'ME' [-Werror,-Wunused-variable]
2025-08-14 22:29:05 -07:00
Shoreshen
f2a6fcd311
[AMDGPU] Delete amdgpu-unify-metadata in optdriver.cpp (#153717)
Fix up for https://github.com/llvm/llvm-project/pull/153548, which is
from https://github.com/llvm/llvm-project/issues/153150.
2025-08-15 09:07:25 +08:00
Kyungwoo Lee
07d3a73d70 Revert "[CGData] Lazy loading support for stable function map (#151660)"
This reverts commit 76dd742f7b32e4d3acf50fab1dbbd897f215837e.
2025-08-14 16:56:54 -07:00
Zhaoxuan Jiang
76dd742f7b
[CGData] Lazy loading support for stable function map (#151660)
The stable function map could be huge for a large application. Fully
loading it is slow and consumes a significant amount of memory, which is
unnecessary and drastically slows down compilation especially for
non-LTO and distributed-ThinLTO setups. This patch introduces opt-in
lazy loading support for the stable function map. The detailed changes
are:

- `StableFunctionMap`
  - The map now stores entries in an `EntryStorage` struct, which includes
    offsets for serialized entries and a `std::once_flag` for thread-safe
    lazy loading.
  - The underlying map type is changed from `DenseMap` to
    `std::unordered_map` for compatibility with `std::once_flag`.
  - `contains()`, `size()` and `at()` are implemented to only load
    requested entries on demand.

- Lazy Loading Mechanism
  - When reading indexed codegen data, if the newly-introduced
    `-indexed-codegen-data-lazy-loading` flag is set, the stable function
    map is not fully deserialized up front. The binary format for the stable
    function map now includes offsets and sizes to support lazy loading.
  - The safety of lazy loading is guarded by the once flag per function
    hash. This guarantees that even in a multi-threaded environment, the
    deserialization for a given function hash will happen exactly once. The
    first thread to request it performs the load, and subsequent threads
    will wait for it to complete before using the data. For single-threaded
    builds, the overhead is negligible (a single check on the once flag).
    For multi-threaded scenarios, users can omit the flag to retain the
    previous eager-loading behavior.
2025-08-14 13:49:09 -07:00
peter mckinna
002362bbd8
Add LLVMGlobalAddDebugInfo to Core.cpp (#148747)
This change allows a global to have multiple metadata nodes attached. The
GlobalSetMetadata function allows only one, which is clobbered if an
attempt is made to add more. The addDebugInfo
function calls addMetadata. This is needed because some languages have
global structs containing lots of compiler-generated globals.
2025-08-14 14:59:39 +02:00
Kazu Hirata
0f77887108 [llvm-exegesis] Fix a warning
This patch fixes:

  llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp:602:6: error:
  unused function 'printInstructions' [-Werror,-Wunused-function]
2025-08-13 09:41:40 -07:00
Lakshay Kumar
d35686b25c
[llvm-exegesis] Print generated assembly snippet (#142540)
Debug the generated disassembly by passing
`--debug-only="print-gen-assembly"` or `--debug-only=preview-gen-assembly`
to the exegesis invocation.
`--debug-only="print-gen-assembly"` prints the whole generated assembly
snippet.
`--debug-only=preview-gen-assembly` prints the first 10 instructions
and the last 3 lines.
This way we can see at a glance the initial setup code, such as register
setup, and the instructions that follow, then a truncated middle, and
finally the last 3 instructions.

This helps us inspect the assembly that exegesis executes on hardware.
It is functionally equivalent to running a separate objdump command on
the dumped object file.
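
The preview behavior described above (first 10 lines, truncated middle, last 3 lines) can be sketched roughly as follows. This is only an illustration under assumptions: `previewLines` is a hypothetical helper, and the `"..."` marker for the truncated middle is made up rather than taken from the actual output format.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Keep the first Head and last Tail lines, replacing the middle with a
// single marker line. Short inputs are returned unchanged.
std::vector<std::string> previewLines(const std::vector<std::string> &Lines,
                                      std::size_t Head = 10,
                                      std::size_t Tail = 3) {
  if (Lines.size() <= Head + Tail)
    return Lines;
  const auto HeadEnd = Lines.begin() + static_cast<std::ptrdiff_t>(Head);
  const auto TailBegin = Lines.end() - static_cast<std::ptrdiff_t>(Tail);
  std::vector<std::string> Out(Lines.begin(), HeadEnd);
  Out.push_back("..."); // hypothetical truncation marker
  Out.insert(Out.end(), TailBegin, Lines.end());
  return Out;
}
```

For a 20-line snippet this yields 14 lines: lines 0 to 9, the marker, and lines 17 to 19.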
2025-08-13 10:37:24 +01:00
Chris B
6e59d1da08
Revert "[DirectX][objdump] Add support for printing signatures" (#153313)
Reverts llvm/llvm-project#152531
2025-08-12 17:33:56 -05:00
Chris B
9526d3b0b9
[DirectX][objdump] Add support for printing signatures (#152531)
This adds support for printing the signature sections as part of the
`-p` flag for printing private headers.

The formatting aims to roughly match the formatting used by DXC's
`/dumpbin` flag.

Resolves #152380.
2025-08-12 17:00:14 -05:00
Philip Reames
49b17a0c1c
[MIR] Further cleanup on multiple save/restore point support [nfc] (#153250)
Remove the type alias now that the std::variant aspect is gone, directly
using std::vector in the few places that need it is more idiomatic.

Move a routine from a core header to single user.
2025-08-12 14:16:41 -07:00
Elizaveta Noskova
bbde6be841
[llvm] Support multiple save/restore points in mir (#119357)
Currently mir supports only one save and one restore point
specification:

```
  savePoint:       '%bb.1'
  restorePoint:    '%bb.2'
```

This patch makes it possible to have multiple save and multiple
restore points in mir:

```
  savePoints:
    - point:           '%bb.1'
  restorePoints:
    - point:           '%bb.2'
```

Shrink-Wrap points split Part 3.
RFC:
https://discourse.llvm.org/t/shrink-wrap-save-restore-points-splitting/83581

Part 1: https://github.com/llvm/llvm-project/pull/117862
Part 2: https://github.com/llvm/llvm-project/pull/119355
Part 4: https://github.com/llvm/llvm-project/pull/119358
Part 5: https://github.com/llvm/llvm-project/pull/119359
2025-08-12 16:34:29 +03:00
Pavel Labath
66aa46da6c
Reapply "[Object] Parsing and dumping of SFrame Frame Row Entries" (#152650) (#152695)
This reapplies #152650 with a build fix for clang-11 (need explicit
template parameters for ArrayRef construction) and avoiding the
default-in-a-switch-covering-enum warning. It also adds two new tests.

The original commit message was:

The trickiest part here is that the FREs have a variable size, in two
(or three?) dimensions:
- the size of the StartAddress field. This is determined by the FDE they
are in, so it is uniform across all FREs in one FDE.
- the number and sizes of offsets following the FRE. This can be
different for each FRE.
    
While vending this information through a template API would be possible,
I believe such an approach would be very unwieldy, and it would still
require a sequential scan through the FRE list. This is why I'm
implementing this by reading the data into a common data structure using
the fallible iterator pattern.
    
For more information about the SFrame unwind format, see the
[specification](https://sourceware.org/binutils/wiki/sframe) and the
related
[RFC](https://discourse.llvm.org/t/rfc-adding-sframe-support-to-llvm/86900).
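
The approach described above (a sequential scan that decodes variable-size entries into one common structure and reports errors as it goes) can be sketched roughly as follows. The byte layout here is made up for illustration and is not the SFrame encoding: each row is a little-endian start address of `AddrSize` bytes, then an info byte whose low 4 bits give the offset count and whose high 4 bits give the offset size in bytes, then the offsets. `Row` and `RowReader` are hypothetical names, and the reader only loosely mirrors a fallible iterator by returning `std::nullopt` and recording an error string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Common decoded form, independent of the on-disk field sizes.
struct Row {
  std::uint32_t StartAddress = 0;
  std::vector<std::int32_t> Offsets;
};

class RowReader {
public:
  // AddrSize must be 1, 2 or 4 and is uniform for the whole list.
  // Data must outlive the reader.
  RowReader(const std::vector<std::uint8_t> &Data, unsigned AddrSize)
      : Data(Data), AddrSize(AddrSize) {}

  // Returns the next row, or std::nullopt at end of data or on error.
  // After std::nullopt, error() tells the two cases apart.
  std::optional<Row> next() {
    if (!Error.empty() || Pos == Data.size())
      return std::nullopt;
    Row R;
    std::uint32_t Info = 0;
    if (!readLE(AddrSize, R.StartAddress) || !readLE(1, Info))
      return fail("truncated row header");
    const unsigned Count = Info & 0x0F;
    const unsigned Size = Info >> 4;
    if (Size != 1 && Size != 2 && Size != 4)
      return fail("unsupported offset size");
    for (unsigned I = 0; I < Count; ++I) {
      std::uint32_t Raw = 0;
      if (!readLE(Size, Raw))
        return fail("truncated offsets");
      R.Offsets.push_back(signExtend(Raw, Size));
    }
    return R;
  }

  const std::string &error() const { return Error; }

private:
  std::optional<Row> fail(const char *Msg) {
    Error = Msg;
    return std::nullopt;
  }

  // Read a little-endian unsigned value of Size bytes, if available.
  bool readLE(unsigned Size, std::uint32_t &Out) {
    if (Data.size() - Pos < Size)
      return false;
    Out = 0;
    for (unsigned I = 0; I < Size; ++I)
      Out |= static_cast<std::uint32_t>(Data[Pos + I]) << (8 * I);
    Pos += Size;
    return true;
  }

  // Interpret the low Size bytes of V as a two's-complement value.
  static std::int32_t signExtend(std::uint32_t V, unsigned Size) {
    const unsigned Bits = 8 * Size;
    std::int64_t Wide = V;
    if ((V >> (Bits - 1)) & 1)
      Wide -= std::int64_t(1) << Bits;
    return static_cast<std::int32_t>(Wide);
  }

  const std::vector<std::uint8_t> &Data;
  unsigned AddrSize;
  std::size_t Pos = 0;
  std::string Error;
};
```

Because each row's length depends on its own info byte, the start of row N is only known after rows 0 to N-1 have been decoded, which is why the scan is sequential in this sketch.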
2025-08-12 10:10:45 +02:00
Kazu Hirata
c2bf1ca400
[llvm-objdump] Remove unnecessary casts (NFC) (#153128)
Ptr is already of const char *.
2025-08-11 22:51:48 -07:00
Pavel Labath
7e8a251f75
Revert "[Object] Parsing and dumping of SFrame Frame Row Entries" (#152650)
Reverts llvm/llvm-project#151301 - build breakage on multiple bots.
2025-08-08 08:29:58 +02:00
Pavel Labath
a82ca1b560
[Object] Parsing and dumping of SFrame Frame Row Entries (#151301)
The trickiest part here is that the FREs have a variable size, in two
(or three?) dimensions:
- the size of the StartAddress field. This is determined by the FDE they
are in, so it is uniform across all FREs in one FDE.
- the number and sizes of offsets following the FRE. This can be
different for each FRE.

While vending this information through a template API would be possible,
I believe such an approach would be very unwieldy, and it would still
require a sequential scan through the FRE list. This is why I'm
implementing this by reading the data into a common data structure using
the fallible iterator pattern.

For more information about the SFrame unwind format, see the
[specification](https://sourceware.org/binutils/wiki/sframe) and the
related
[RFC](https://discourse.llvm.org/t/rfc-adding-sframe-support-to-llvm/86900).
2025-08-08 08:22:08 +02:00
Sam Elliott
4e11f89904
[RISCV] Basic Objdump Mapping Symbol Support (#151452)
This implements very basic support for RISC-V mapping symbols in
llvm-objdump, sharing the implementation with how Arm/AArch64/CSKY
implement this feature.

This only supports the `$x` (instruction) and `$d` (data) mapping
symbols for RISC-V, and not the version of `$x` which includes an
architecture string suffix.
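
As a rough illustration of how mapping symbols partition a section into code and data, consider the sketch below. It is not the llvm-objdump implementation: `MappingKind`, `classifyMappingSymbol` and `kindAt` are hypothetical names, only the bare `$x` and `$d` names are matched, and treating bytes before the first mapping symbol as instructions is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>
#include <vector>

enum class MappingKind { None, Instruction, Data };

// Recognize only the bare mapping symbols. A name such as "$xrv64gc"
// (with an architecture string suffix) is deliberately not matched.
MappingKind classifyMappingSymbol(std::string_view Name) {
  if (Name == "$x")
    return MappingKind::Instruction;
  if (Name == "$d")
    return MappingKind::Data;
  return MappingKind::None;
}

struct Symbol {
  std::uint64_t Address;
  std::string_view Name;
};

// The kind in effect at Address is set by the last recognized mapping
// symbol at or below it. Symbols must be sorted by address.
MappingKind kindAt(const std::vector<Symbol> &Sorted, std::uint64_t Address,
                   MappingKind Default = MappingKind::Instruction) {
  MappingKind Kind = Default;
  for (const Symbol &S : Sorted) {
    if (S.Address > Address)
      break;
    const MappingKind K = classifyMappingSymbol(S.Name);
    if (K != MappingKind::None)
      Kind = K;
  }
  return Kind;
}
```

In this sketch an unrecognized `$x<arch>` symbol leaves the previous kind in effect, which shows the kind of gap the limitation above implies.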
2025-08-07 11:28:07 -07:00
Kazu Hirata
ebaaf4d2fb
[llvm-objdump] Remove unnecessary casts (NFC) (#152443)
data() already returns const char *.
2025-08-07 07:22:58 -07:00
Maksim Sabianin
3f59a22711
[offload][SYCL] Add Module splitting by categories. (#131347)
This patch adds Module splitting by categories. The splitting algorithm
is a necessary step in the SYCL compilation pipeline. It could also be
reused for other heterogeneous targets.

The previous attempt was at #119713. In this patch, `TransformUtils` has
no dependency on "IPO" or on "Printing Passes". Module splitting is
self-contained here and does not introduce
linking issues.
2025-08-05 14:04:59 +00:00
Chris B
2fe96439fb
[DirectX] Add ObjectFile boilerplate for objdump (#151434)
This change adds boilerplate code to implement the object::ObjectFile
interface for the DXContainer object file and an empty implementation of
the objdump Dumper object.

Adding an ObjectFile implementation for DXContainer is a bit odd because
the DXContainer format doesn't have a symbol table, so there isn't a
reasonable implementation for the SymbolicFile interfaces. That said, it
does have sections, and it will be useful for objdump to be able to
inspect some of the structured data stored in some of the special named
sections.

At this point in the implementation it can't do much other than dump the
part names, offsets, and sizes. Dumping of detailed structured section
contents will be added in subsequent PRs.

Fixes #151433
2025-08-04 10:57:25 -05:00
Mircea Trofin
32efbb707e
[nfc][profcheck] fix cl::desc typo (#151979)
The pass is `prof-verify`, not `prof-check`. (Meanwhile, `profcheck` is an informal handle we could use for the whole feature in the [RFC](https://discourse.llvm.org/t/rfc-profile-information-propagation-unittesting/73595))
2025-08-04 07:54:51 -07:00
Kazu Hirata
35dd88918f
[llvm] Use llvm::iterator_range::empty (NFC) (#151905) 2025-08-04 07:40:46 -07:00
Nikita Popov
549990124d
[llvm-reduce] Do not replace lifetime pointer arg with zero/one/poison (#151697)
The lifetime argument is now required to be an alloca, so we should not
try to replace it with a constant.

We also shouldn't try to change the size argument to zero/one, similar
to how we avoid zero-size allocas.
2025-08-04 09:06:38 +02:00
Kazu Hirata
3412735d29
[llvm-readobj] Remove an unnecessary cast (NFC) (#151851)
Addr is already of const uint8_t *.
2025-08-03 08:44:57 -07:00
S. VenkataKeerthy
21f1f9558d
[IR2Vec][llvm-ir2vec] Changing clEnumValN to cl::SubCommand (#151384)
Refactor llvm-ir2vec to use subcommands instead of a mode flag for better CLI usability.

- Converted the `--mode` flag to three distinct subcommands: `triplets`, `entities`, and `embeddings`
- Updated documentation, tests, and python script
2025-08-02 13:44:55 -07:00
Kazu Hirata
8bc2c7ceb3
[llvm-objdump] Remove an unnecessary cast (NFC) (#151799)
Size is already of uint32_t.
2025-08-02 08:09:26 -07:00
jeremyd2019
65990d6148
[lli] Fix crash with --no-process-syms on MinGW (#151386)
In this case, `J->getProcessSymbolsJITDylib()` returns a NULL pointer.
In order to make sure `__main` is still defined, add the symbol to
`J->getMainJITDylib()` instead in that case. This returns a reference
and thus cannot be NULL.

Fixes #143080
2025-08-01 18:16:07 -07:00
Nikita Popov
c4c0a59741
[llvm-reduce] Do not convert lifetime operand to argument (#151694)
The lifetime argument is now required to be an alloca, so do not try to
convert it to a function argument.

The reduction is now going to leave behind an unused alloca with
lifetime markers, which should be cleaned up separately.

I'd say this fixes https://github.com/llvm/llvm-project/issues/93713. It
doesn't remove the lifetime markers as the issue suggests, but at least
they're now not going to be on the argument.
2025-08-01 15:34:52 +02:00
Artem Belevich
4e596fc285
[ELF] handle new NVIDIA GPU variants. (#151604) 2025-07-31 17:21:40 -07:00
Aakanksha Patil
d6c85fc9ab
Reapply "Allow "[[FLAGS=<none>]]" value in the ELF Fileheader Flags field (#143845)" (#151094)
This fixes the issues with 0b054e2

This reverts commit b80ce054206db223ec8c3cd55fad510c97afbc9f.
2025-07-31 14:11:06 -07:00
Daniel Paoliello
4adce336f4
[win][arm64ec] Fixes to unblock building LLVM and Clang as Arm64EC (#150068)
These changes allow LLVM and Clang to be built with Clang targeting
Arm64EC using the MSVC linker.

Built with these options:
```
-DLLVM_ENABLE_PROJECTS="clang"
-DLLVM_HOST_TRIPLE=arm64ec-pc-windows-msvc
-DCMAKE_C_COMPILER=clang-cl.exe
-DCMAKE_C_COMPILER_TARGET=arm64ec-pc-windows-msvc
-DCMAKE_CXX_COMPILER=clang-cl.exe
-DCMAKE_CXX_COMPILER_TARGET=arm64ec-pc-windows-msvc
-DCMAKE_LINKER_TYPE=MSVC
```
2025-07-31 09:30:05 -07:00
Rahul Joshi
4f39139df3
[llvm-mc] Add --runs option for benchmarking (#151149)
Add support for measuring decode times in llvm-mc tool. Add command line
options to enable time-trace profiling (similar to llc or opt) and add
option `runs` to run the decoder several times on each instruction.
2025-07-30 09:05:46 -07:00
Pavel Labath
ded255e56e
[Object] Parsing and dumping of SFrame FDEs (#149828)
Also known as Function Descriptor Entries. The entries occupy a
contiguous piece of the section, so the code is mostly straightforward.

For more information about the SFrame unwind format, see the
[specification](https://sourceware.org/binutils/wiki/sframe) and the
related [RFC](https://discourse.llvm.org/t/rfc-adding-sframe-support-to-llvm/86900).
2025-07-30 11:23:35 +02:00
kkent030315
32127045c8
[llvm-readobj][COFF] Add support for more CET and hotpatch flags (#150967)
- Added `IMAGE_DLL_CHARACTERISTICS_EX_CET_COMPAT_STRICT_MODE`
- Added
`IMAGE_DLL_CHARACTERISTICS_EX_CET_SET_CONTEXT_IP_VALIDATION_RELAXED_MODE`
- Added
`IMAGE_DLL_CHARACTERISTICS_EX_CET_DYNAMIC_APIS_ALLOW_IN_PROC_ONLY`
- Added `IMAGE_DLL_CHARACTERISTICS_EX_CET_RESERVED_1`
- Added `IMAGE_DLL_CHARACTERISTICS_EX_CET_RESERVED_2`
- Added `IMAGE_DLL_CHARACTERISTICS_EX_FORWARD_CFI_COMPAT`
- Added `IMAGE_DLL_CHARACTERISTICS_EX_HOTPATCH_COMPATIBLE`
2025-07-30 00:51:57 +03:00
S. VenkataKeerthy
130f24b28d
[IR2Vec][llvm-ir2vec] Revamp triplet generation and add entity mapping mode (#149214)
Add entity mapping mode to llvm-ir2vec and improve triplet generation format for knowledge graph embedding training.

This change streamlines the workflow for training the vocabulary embeddings with IR2Vec by:
1. Directly generating numeric IDs instead of requiring string-to-ID preprocessing
2. Providing entity mappings in standard knowledge graph embedding format
3. Structuring triplet output in train2id format compatible with knowledge graph embedding frameworks
4. Adding metadata headers to simplify post-processing and training setup

These improvements make IR2Vec more compatible with standard knowledge graph embedding training pipelines and reduce the preprocessing steps needed before training.

See #149215 for more details on how it is used.

(Tracking issues - #141817, #141834)
2025-07-29 11:56:52 -07:00
jeremyd2019
28b3190053
[LLVM][Cygwin] Enable conditions that are shared with MinGW (#149638)
Cygwin and MinGW share the auto import behavior that could result in
__stack_check_guard being non-dso-local. Allow windres to assume a
Cygwin target as well as a MinGW one, so defines like _WIN32 would not
be present on Cygwin.
2025-07-29 10:01:04 -07:00
Kazu Hirata
e874615a62
[llc] Remove an unnecessary cast (NFC) (#151085)
getObjFileLowering() already returns TargetLoweringObjectFile *.
2025-07-29 08:19:09 -07:00
Davide Grohmann
0121a8e431
Reland "[mlir][spirv] Fix int type declaration duplication when serializing" (#145687)
This relands PRs #143108 and #144538.

The original PR was reverted due to a mistake that made all the mlir
tests run only if the SPIRV target was enabled. This is now resolved since
enabling spirv-tools no longer requires the SPIRV target.

spirv-tools are not required by default to run SPIRV mlir tests, but
they can be optionally enabled in some SPIRV mlir tests to verify that
the produced SPIRV assembly passes validation.

The other reverted PR #144685 is no longer needed and is not part of this
relanding.

Original commit message:

> At the MLIR level, unsigned integers and signless integers are different
types. Indeed, when the two types are looked up in the type definition
cache, they do not match.
> Hence a translated SPIR-V module which contains both unsigned and
signless integers will contain the same type declaration twice
(something like OpTypeInt 32 0), which is not permitted in SPIR-V, and
such generated modules fail validation.
> This patch solves the problem by mapping unsigned integer types to
signless integer types before looking up in the type definition cache.
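
The fix described in the quoted message can be sketched as normalizing the cache key before lookup. This is a simplified model, not the MLIR serializer: `IntType`, `Signedness` and `TypeIdCache` are hypothetical names, and the cache maps a (width, signedness) pair to a numeric declaration ID.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>

enum class Signedness { Signless, Signed, Unsigned };

struct IntType {
  unsigned Width;
  Signedness Sign;
};

class TypeIdCache {
public:
  // Return the ID of the declaration for T, creating one only when no
  // equivalent declaration exists yet.
  std::uint32_t getOrCreate(IntType T) {
    // Unsigned and signless types emit the same declaration, so fold
    // unsigned into signless before consulting the cache.
    if (T.Sign == Signedness::Unsigned)
      T.Sign = Signedness::Signless;
    auto [It, Inserted] =
        Ids.try_emplace(std::make_pair(T.Width, T.Sign), NextId);
    if (Inserted)
      ++NextId;
    return It->second;
  }

  std::size_t numDeclarations() const { return Ids.size(); }

private:
  std::map<std::pair<unsigned, Signedness>, std::uint32_t> Ids;
  std::uint32_t NextId = 1;
};
```

Without the normalization step, a 32-bit unsigned type and a 32-bit signless type would get two cache entries and therefore two identical declarations.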

---------

Signed-off-by: Davide Grohmann <davide.grohmann@arm.com>
2025-07-28 12:34:30 -04:00
Fangrui Song
f517ac2083 MCSectionCOFF: Avoid cast
The object file format specific derived classes are used in contexts like
MCStreamer and MCObjectTargetWriter where the type is statically known.
We don't use isa/dyn_cast and we want to eliminate
MCSection::SectionVariant in the base class.
2025-07-26 10:04:04 -07:00
Mircea Trofin
931228e28f
[PGO] Drive profile validator from opt (#147418)
Add option to `opt` to run the `ProfileInjectorPass` before the passes opt would run, and then `ProfileVerifierPass` after. This will then be a mode in which we run tests on a specialized buildbot, with the goal of finding passes that drop (and, later, corrupt) profile information.
2025-07-26 16:14:00 +02:00
Jim Lin
fd86b2e26c
[RISCV][llvm-exegesis] Add missing operand frm for FCVT_D_W (#149989)
We encountered an operand-index-out-of-bounds crash because FCVT_D_W
lacks the frm operand.
2025-07-24 08:53:09 +08:00