This patch adds a Clang-compatible -mtune option to llc, to enable
decoupled ISA and microarchitecture targeting, which is especially
important for backend development. For example, it makes it easy to test
the effects of a subtarget feature or scheduling model on codegen across
a variety of workloads on the IR corpus benchmark:
https://github.com/dtcxzyw/llvm-codegen-benchmark.
The implementation adds an isolated generic codegen flag, to establish a
base for wider usage - the plan is to add it to `opt` as well in a
followup patch. Then `llc` consumes it, and sets `tune-cpu` attributes
for functions, which are further consumed by the backend.
While the functionality of this flag is obvious in the implementation,
tool users may not know what it does with the short description
provided. Notably, it is not obvious from the short description that:
* Functions provided will be converted to internal linkage (and thus
discarded if unused) even if unreferenced.
* Functions in the first file will not be internalized, even if
referenced by a later one.
The Rust for Linux project has [found use for this
flag](https://lore.kernel.org/all/20251202-inline-helpers-v1-0-879dae33a66a@google.com/)
to support inlining `static inline` functions in C into code compiled by
Rust when `rustc` and `clang` share a LLVM.
Reverts llvm/llvm-project#182532 to unblock CI.
The original patch causes some test failures related to undef bits, as
it incorrectly assumes `std::uniform_int_distribution` returns the same
result with different C++ stdlib vendors.
BPF users expect to see basic block labels (e.g. <L0>, <L1>) in
disassembly output
(https://github.com/llvm/llvm-project/pull/95103#issuecomment-3771234810).
Default --symbolize-operands to on for BPF targets when neither
--symbolize-operands nor --no-symbolize-operands is explicitly
specified.
Add --no-symbolize-operands to allow users to opt out.
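The defaulting rule above amounts to a tri-state flag: explicitly on, explicitly off, or unset, with the unset case resolved per target. A minimal Python sketch of that logic follows. It is not the tool's actual implementation; `resolve_symbolize_operands`, `effective`, and the `is_bpf` parameter are hypothetical names used only for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # BooleanOptionalAction (Python 3.9+) registers both --symbolize-operands
    # and --no-symbolize-operands; default=None means "not specified".
    parser.add_argument(
        "--symbolize-operands",
        action=argparse.BooleanOptionalAction,
        default=None,
    )
    return parser


def resolve_symbolize_operands(explicit, is_bpf):
    """Return the effective setting (hypothetical helper)."""
    if explicit is not None:
        return explicit  # An explicit flag always wins.
    return is_bpf  # Unset: on for BPF targets, off otherwise.


def effective(argv, is_bpf):
    args = build_parser().parse_args(argv)
    return resolve_symbolize_operands(args.symbolize_operands, is_bpf)
```

Keeping "unset" distinct from "off" is what lets the target-specific default apply only when the user expressed no preference.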
# Motivation
When using `dsymutil` to generate dSYM for large binaries, sometimes we
want to keep only some of the object files. This has the benefits of
reduced dSYM size and improved performance of LLDB and other tools which
consume the dSYM.
The current way to achieve this is to use YAML input (the `dsymutil -y`
option). However, it doesn't really solve the problem:
1. The whole debug map has to be parsed somewhere else first, before
being filtered down to the wanted parts.
2. Said info is written then read by `dsymutil`. The I/O is redundant.
# Change
This patch proposes a new way: it adds new options (`dsymutil --allow
<path>` and `--disallow <path>`) so that only object files that pass the
allow/disallow list are processed.
Important details:
* The input file is YAML. See test files for format.
* Currently, only object files are filtered. The design allows extending
the filtering to other aspects of the debug map in the future (e.g.
functions).
* If `--oso-prepend-path` is specified, it applies to the entries in the
input file. That is, entries in the input file should exactly match the
N_OSO entries.
For crash reduction, I don't think it does anything that llvm-reduce
can't. Pass pipeline reduction also has a separate reduction script.
The main thing for which there isn't a replacement tool is the
miscompilation reducer, but I'm not sure that has actually functioned for
years.
There are still some references to bugpoint in various comments
and pieces of documentation that don't all necessarily make sense
to replace or remove. In particular there are a few passes documented
as "only for bugpoint", but I've left those alone in case they are
useful for manual reductions.
This patch is a reland of #157499
Introduce a new flag --call-graph-info which outputs call graph
information in the ELF call graph section in JSON or LLVM style.
As far as I can tell there are 2 parallel plugin mechanisms.
opt -load=plugin does not work, and is ignored. opt -load-pass-plugin
does work. PluginLoader.h forces a static definition of the "load"
cl::opt into included TUs. Delete the cases with no tests.
This patch implements the initial support for upstreaming
[llubi](https://github.com/dtcxzyw/llvm-ub-aware-interpreter). It only
provides the minimal functionality to run a simple main function. I hope
we can focus on the interface design in this PR, rather than trivial
implementations for each instruction.
RFC link:
https://discourse.llvm.org/t/rfc-upstreaming-llvm-ub-aware-interpreter/89645
Excluding the driver `llubi.cpp`, this patch contains three components
for better decoupling:
+ `Value.h/cpp`: Value representation
+ `Context.h/cpp`: Global state management (e.g., memory) and
interpreter configuration
+ `Interpreter.cpp`: The main interpreter loop
Compared to the out-of-tree version, the major differences are listed
below:
+ The interpreter logic always returns the control to its caller, i.e.,
it never calls `exit/abort` when immediate UBs are triggered.
+ `EventHandler` provides an interface to dump the trace. It also allows
callers to inspect the actual value and verify the correctness of
analysis passes (e.g., KnownBits/SCEV).
+ The context is designed to be reentrant. That is, you can call
`runFunction` multiple times. But its usefulness remains in doubt due to
side effects made by previous calls.
+ `runFunction` handles function calls with a loop, instead of calling
itself recursively. This makes it no longer bounded by the stack depth.
+ Uninitialized memory is planned to be approximated by returning random
values each time an uninitialized byte is loaded.
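To illustrate the non-recursive call handling mentioned in the list above, here is a minimal Python sketch of an interpreter loop that keeps callee frames on an explicit stack. It is not llubi's code: the `Frame` class, the toy `sum_to` function being interpreted, and the two-state program counter are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    arg: int      # Argument of this activation of the toy function.
    pc: int = 0   # 0: entry, 1: resumed after the callee returned.


def run_function(n: int) -> int:
    """Interpret sum_to(n) = 0 if n == 0 else n + sum_to(n - 1)."""
    stack = [Frame(arg=n)]
    ret = 0  # Return value of the most recently finished frame.
    while stack:
        frame = stack[-1]
        if frame.pc == 0:
            if frame.arg == 0:
                ret = 0
                stack.pop()      # Base case: return to the caller frame.
            else:
                frame.pc = 1     # Remember where to resume.
                stack.append(Frame(arg=frame.arg - 1))  # "Call" the callee.
        else:
            ret = frame.arg + ret  # Combine with the callee's result.
            stack.pop()
    return ret
```

Because the frames live in a heap-allocated list, `run_function(100000)` completes, whereas a directly recursive Python version would exceed the default recursion limit.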
Currently the default strip-all behavior is to remove sections known
to LLVM but leave others. Now that the standard specifies the section
name "metadata.code.*" as used for compiler annotations interpreted by
Wasm engines, we can more confidently give strip its more conventional
behavior of removing everything that won't be used by the engine.
There was no real benefit to disallowing this, and it sometimes caused
unnecessary churn in the RUN lines of tests which were updated from
single-to-multiple or multiple-to-single prefixes.
This effectively makes -check-prefixes the primary option and
-check-prefix just an alias of it. The documentation is updated
accordingly.
This adds support for `--symbolize-operands`, so that local references
are turned back into labels by objdump, which makes it easier to tell
what is going on with a linked object.
When using `--symbolize-operands`, branch target addresses are elided
and only the referenced label is printed:
```
# Without --symbolize-operands
0: 04a05263 blez a0, 0x44 <.text+0x44>
...
40: fd1ff06f j 0x10 <.text+0x10>
44: 00000613 li a2, 0x0
# With --symbolize-operands
0: 04a05263 blez a0, <L3>
...
40: fd1ff06f j <L0>
<L3>:
44: 00000613 li a2, 0x0
```
Patch 1 of 3 to add to llvm-dwarfdump the ability to measure DWARF
coverage of local variables in terms of source lines, as discussed in
[this
RFC](https://discourse.llvm.org/t/rfc-debug-info-coverage-tool-v2/83266).
This patch adds the basic variable coverage implementation. By default,
inlined instances are shown separately (displaying the full inlining
chain). Alternatively, a combined view that averages across all inlined
instances can be returned using `--combine-instances`.
In this patch, we simply print a count of source lines over which each
variable is covered. Later patches in the series will add the comparison
against a baseline.
Example output:
```
$ llvm-dwarfdump --show-variable-coverage somefile
Variable coverage statistics:
Function InlChain Variable Decl LinesCovered
foo bar path/to/somefile.h:54 3
foo path/to/someotherfile.c:32 bar path/to/somefile.h:54 2
foo baz main.c:76 9
```
```
$ llvm-dwarfdump --show-variable-coverage somefile --combine-instances
Variable coverage statistics:
Function InstanceCount Variable Decl LinesCovered
foo 2 bar path/to/somefile.h:54 2.5
foo 1 baz main.c:76 9
```
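The `2.5` in the combined view is the mean of the per-instance counts for `bar` (3 and 2). Below is a minimal Python sketch of that aggregation, using the rows from the example output above. The `combine_instances` function and the tuple layout are hypothetical and are not taken from llvm-dwarfdump.

```python
from collections import defaultdict

# (function, inline chain, variable, declaration, lines covered)
rows = [
    ("foo", "", "bar", "path/to/somefile.h:54", 3),
    ("foo", "path/to/someotherfile.c:32", "bar", "path/to/somefile.h:54", 2),
    ("foo", "", "baz", "main.c:76", 9),
]


def combine_instances(rows):
    """Group inlined instances of a variable and average their coverage."""
    groups = defaultdict(list)
    for function, _chain, variable, decl, covered in rows:
        groups[(function, variable, decl)].append(covered)
    return [
        (function, len(counts), variable, decl, sum(counts) / len(counts))
        for (function, variable, decl), counts in groups.items()
    ]


for row in combine_instances(rows):
    print(*row)
```

Grouping on the declaration as well as the name is a design choice of this sketch: it keeps two unrelated variables that happen to share a name from being merged.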
This reapplies #169646, fixing some ambiguous overloads that caused
several bots to fail.
Patch 1 of 3 to add to llvm-dwarfdump the ability to measure DWARF
coverage of local variables in terms of source lines, as discussed in
this RFC:
https://discourse.llvm.org/t/rfc-debug-info-coverage-tool-v2/83266
This patch adds the basic variable coverage implementation. By default,
inlined instances are shown separately (displaying the full inlining
chain). Alternatively, a combined view that averages across all inlined
instances can be returned using `--combine-instances`.
In this patch, we simply print a count of source lines over which each
variable is covered. Later patches in the series will add the comparison
against a baseline.
This PR adds support for selecting specific archive members in
llvm-symbolizer using the `archive.a(member.o)` syntax, with
architecture-aware member selection.
**Key features:**
1. **Archive member selection syntax**: Specify archive members using
`archive.a(member.o)` format
2. **Architecture selection via `--default-arch` flag**: Select the
appropriate member when multiple members have the same name but
different architectures
3. **Architecture selection via `:arch` suffix**: Alternative syntax
`archive.a(member.o):arch` for specifying architecture
This functionality is primarily designed for AIX big archives, which can
contain multiple members with the same name but different architectures
(32-bit and 64-bit). However, the implementation works with all archive
formats (GNU, BSD, Darwin, big archive) and handles same-named members
created with llvm-ar q.
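As an illustration of the `archive.a(member.o)` and `archive.a(member.o):arch` forms described above, here is a minimal Python sketch that splits such a string into its parts. This is not llvm-symbolizer's parser; the `ArchiveSpec` type, the `parse_archive_spec` function, and the regular expression are assumptions made for the example, and the architecture name in the usage is only a placeholder.

```python
import re
from typing import NamedTuple, Optional


class ArchiveSpec(NamedTuple):
    archive: str
    member: str
    arch: Optional[str]


# archive.a(member.o) with an optional :arch suffix.
_SPEC_RE = re.compile(
    r"^(?P<archive>.+?)\((?P<member>[^()]+)\)(?::(?P<arch>[^:()]+))?$"
)


def parse_archive_spec(text: str) -> Optional[ArchiveSpec]:
    match = _SPEC_RE.match(text)
    if match is None:
        return None  # Not in archive(member) form; treat as a plain path.
    return ArchiveSpec(match["archive"], match["member"], match["arch"])


print(parse_archive_spec("libfoo.a(bar.o)"))
print(parse_archive_spec("libfoo.a(bar.o):ppc64"))
```

When no `:arch` suffix is present, the `arch` field is `None`, which is where a default such as the one supplied by `--default-arch` would be applied.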
---------
Co-authored-by: Midhunesh <midhuensh.p@ibm.com>
This reverts commit 3847648e84d2ff5194f605a8a9a5c0a5e5174939.
Relands https://github.com/llvm/llvm-project/pull/158043 which got
auto-merged on a revision which wasn't approved.
The only addition to the approved version was that we adjust how we set
the time for failed tests. We used to just assign it the negative value
of the elapsed time. But if the test failed with `0` seconds (which some
of the new tests do), we would mark it `-0`. But the check for whether
something failed checks for `time < 0`. That messed with the new
`--filter-failed` option of this PR. This was only an issue on Windows
CI, but presumably can happen on any platform. Happy to do this in a
separate PR.
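The pitfall described above is easy to reproduce: negating an elapsed time of zero yields a negative zero, which does not compare less than zero. The sketch below is a standalone Python illustration, not lit's actual code; `encode_failure_naive`, `encode_failure_safe`, `is_failure`, and the choice of the smallest positive normalized float as a floor are assumptions.

```python
import sys


def encode_failure_naive(elapsed: float) -> float:
    # Marks a failure by negating the elapsed time.
    return -elapsed


def encode_failure_safe(elapsed: float) -> float:
    # Clamp to a tiny positive value first so the result is strictly negative.
    return -max(elapsed, sys.float_info.min)


def is_failure(recorded: float) -> bool:
    return recorded < 0


print(is_failure(encode_failure_naive(0.0)))  # the zero-second case is missed
print(is_failure(encode_failure_safe(0.0)))   # the clamped value is detected
```

An alternative would be to store the failure state separately from the time, which avoids overloading the sign altogether.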
---- Original PR
This patch adds a new --filter-failed option to llvm-lit, which when
set, will only run the tests that have previously failed.
Reverts llvm/llvm-project#158043
This was approved for earlier revisions but the tests were failing on
Windows. I pushed a speculative fix and that fixed the CI, which caused
auto-merge to merge the PR. But I'd like to have approval for the latest
revision. So reverting for now and resubmitting a new PR.
This patch adds a Clang-compatible --save-stats option to opt, to
provide an easy to use way to save LLVM statistics files when working
with opt on the middle end.
This is a follow up on the addition to `llc`:
https://github.com/llvm/llvm-project/pull/163967
Like on Clang, one can specify --save-stats, --save-stats=cwd, and
--save-stats=obj with the same semantics and JSON format. The
pre-existing --stats option is not affected.
The implementation extracts the flag and its methods into the common
`CodeGen/CommandFlags` as `LLVM_ABI`, using a new registration class to
conservatively enable opt-in rather than let all tools take it. It is only
needed for llc and opt for now. Then it refactors llc and adds support
for opt.
This patch adds a Clang-compatible `--save-stats` option, to provide an
easy to use way to save LLVM statistics files when working with llc on
the backend.
Like on Clang, one can specify `--save-stats`, `--save-stats=cwd`, and
`--save-stats=obj` with the same semantics and JSON format.
The implementation uses 2 methods `MaybeEnableStats` and
`MaybeSaveStats` called before and after `compileModule` respectively
that externally own the statistics related logic, while `compileModule`
is now required to return the resolved output filename via an output
param.
Note: like on Clang, the pre-existing `--stats` option is not affected.
This patch adds a new option `--child-tags` (`-t` for short), which
makes dwarfdump only dump children whose DWARF tag is in the list of
tags specified by the user.
Motivating examples are:
* dumping all global variables in a CU
* dumping all non-static data members of a structure
* dumping all module import declarations of a CU
* etc.
For tags not known to dwarfdump, we pretend that the tag wasn't
specified.
Note, this flag only takes effect when `--show-children` is set (either
explicitly or implicitly). We error out when trying to use the flag
without dumping children.
Example:
```
$ builds/release/bin/llvm-dwarfdump -t DW_TAG_structure_type a.out.dSYM
...
0x0000000c: DW_TAG_compile_unit
DW_AT_producer ("clang version 22.0.0git (git@github.com:Michael137/llvm-project.git 737da3347c2fb01dd403420cf83e9b8fbea32618)")
DW_AT_language (DW_LANG_C11)
...
0x0000002a: DW_TAG_structure_type
DW_AT_APPLE_block (true)
DW_AT_byte_size (0x20)
0x00000067: DW_TAG_structure_type
DW_AT_APPLE_block (true)
DW_AT_name ("__block_descriptor")
DW_AT_byte_size (0x10)
...
```
```
$ builds/release/bin/llvm-dwarfdump -t DW_TAG_structure_type -t DW_TAG_member a.out.dSYM
...
0x0000000c: DW_TAG_compile_unit
DW_AT_producer ("clang version 22.0.0git (git@github.com:Michael137/llvm-project.git 737da3347c2fb01dd403420cf83e9b8fbea32618)")
DW_AT_language (DW_LANG_C11)
DW_AT_name ("macro.c")
...
0x0000002a: DW_TAG_structure_type
DW_AT_APPLE_block (true)
DW_AT_byte_size (0x20)
0x0000002c: DW_TAG_member
DW_AT_name ("__isa")
DW_AT_type (0x00000051 "void *")
DW_AT_data_member_location (0x00)
0x00000033: DW_TAG_member
DW_AT_name ("__flags")
DW_AT_type (0x00000052 "int")
DW_AT_data_member_location (0x08)
0x0000003a: DW_TAG_member
DW_AT_name ("__reserved")
DW_AT_type (0x00000052 "int")
DW_AT_data_member_location (0x0c)
0x00000041: DW_TAG_member
DW_AT_name ("__FuncPtr")
DW_AT_type (0x00000056 "void (*)(int)")
DW_AT_data_member_location (0x10)
0x00000048: DW_TAG_member
DW_AT_name ("__descriptor")
DW_AT_type (0x00000062 "__block_descriptor *")
DW_AT_alignment (8)
DW_AT_data_member_location (0x18)
0x00000067: DW_TAG_structure_type
DW_AT_APPLE_block (true)
DW_AT_name ("__block_descriptor")
DW_AT_byte_size (0x10)
0x0000006a: DW_TAG_member
DW_AT_name ("reserved")
DW_AT_type (0x00000079 "unsigned long")
DW_AT_data_member_location (0x00)
0x00000071: DW_TAG_member
DW_AT_name ("Size")
DW_AT_type (0x00000079 "unsigned long")
DW_AT_data_member_location (0x08)
...
```
The default behavior is to _not_ copy such swiftmodules into the dSYM,
as previously implemented in 96f95c9d89d8a1784d3865fa941fb1c510f4e2d7.
This patch adds the option to override the behavior, so that such
swiftmodules can be copied into the dSYM.
This is useful when the dSYM will be used on a machine which has a
different Xcode/SDK than where the swiftmodules were built. Without
this, when LLDB is asked to "p/po" a Swift variable, the underlying
Swift compiler code would rebuild the dependent `.swiftmodule` files of
the Swift stdlibs, which takes ~1 minute in some cases.
See PR for tests.
It looks like the documentation for `llvm-cxxfilt`'s
`--[no-]strip-underscore` options wasn't updated when
https://github.com/llvm/llvm-project/pull/106233 was made.
CC @Michael137 (I don't have merge rights myself).
%t is currently documented as:
temporary file name unique to the test
https://llvm.org/docs/CommandGuide/lit.html#substitutions
I take that to mean that if the path is a/b/c/tempfile, then %t would be
tempfile. It is not; it is the whole path (which is hinted at by
%basename_t, but why would you read that if you didn't need to use it?).
As seen in #164396 this can create confusion when people use it as if it
were just the file name.
Make it clear in the docs that this is a unique path, which can be used
to make files or folders.
Add support for Machine IR (MIR) triplet and entity generation in llvm-ir2vec.
This change extends llvm-ir2vec to support Machine IR (MIR) in addition to LLVM IR, enabling the generation of training data for MIR2Vec embeddings. MIR2Vec provides machine-level code embeddings that capture target-specific instruction semantics, complementing the target-independent IR2Vec embeddings.
- Extended llvm-ir2vec to support triplet and entity generation for Machine IR (MIR)
- Added `--mode=mir` option to specify MIR mode (vs LLVM IR mode)
- Implemented MIR triplet generation with Next and Arg relationships
- Added entity mapping generation for MIR vocabulary
- Updated documentation to explain MIR-specific features and usage
(Partially addresses #162200 ; Tracking issue - #141817)
Add MIR2Vec support to the llvm-ir2vec tool, enabling embedding generation for Machine IR alongside the existing LLVM IR functionality.
(This is an initial integration; Other entity/triplet gen for vocab generation would follow as separate patches)
Summary:
This tool is pretty much a generic interface into creating and managing
the offloading binary format. The binary format itself is just a fat
binary block used to create heterogeneous objects. This should be made
more general than just `clang` since it's likely going to be used for
larger offloading projects and is the expected way to extract
heterogeneous objects from offloading code.
Relatively straightforward rename, a few tweaks and documentation
changes. Kept in `clang-offload-packager` for legacy compatibility as we
looked this tool up by name in places, will probably delete it next
release.
Utilize new extensions to LLVM Offloading API to
handle offloading fatbin Bundles.
The tool will output a list of available offload bundles
using URI syntax.
---------
Co-authored-by: dsalinas_amdeng <david.salinas@amd.com>
Do not include the ``__PAGEZERO`` segment when calculating size information
for Mach-O files when `--exclude-pagezero` is used. The ``__PAGEZERO``
segment is a virtual memory region used for memory protection that does not
contribute to actual size, and excluding it can provide a better
representation of actual size.
Fixes #86644
This patch removes support for %T from llvm-lit. For now we mark the
test unresolved and add an error message noting the substitution is
deprecated. This is exactly the same as the error handling for other
substitution failures. We intend to remove support for the nice error
message once 22 branches, as users should have moved over by the time
they are upgrading to v23.
Reviewers: petrhosek, jh7370, ilovepi, pogo59, cmtice
Reviewed By: cmtice, jh7370, ilovepi
Pull Request: https://github.com/llvm/llvm-project/pull/160028
Currently MCA takes instruction properties from the scheduling model.
However, some instructions may execute differently depending on external
factors - for example, the latency of memory instructions may vary
depending on whether the load comes from L1 cache, L2 or
DRAM. While MCA as a static analysis tool cannot model such differences
(and currently makes some static decisions, e.g. all memory ops are
treated as L1 accesses), it makes sense to allow manual modification of
instruction properties to model different behavior (e.g. sensitivity of
code performance to cache misses in a particular load instruction). This
patch addresses this need.
The library modification is intentionally generic - arbitrary
modifications to InstrDesc are allowed. The tool support is currently
limited to changing instruction latencies (a single number applies to all
output arguments and MaxLatency) via comments in the input assembler
code; the format is like this:
add (%eax), eax // LLVM-MCA-LATENCY:100
Users of the MCA library can already make additional customizations; the
command line tool can be extended in the future.
Note that InstructionView currently shows per-instruction information
according to the scheduling model and is not affected by this change.
See https://github.com/llvm/llvm-project/issues/133429 for additional
clarifications (including explanation why existing customization
mechanisms do not provide required functionality)
---------
Co-authored-by: Min-Yih Hsu <min@myhsu.dev>
This patch adds a new %{readfile:<file name>} substitution to lit. This
is needed for porting a couple of tests to lit's internal shell. These
tests all use subshells to pass some option to a command and are not
feasible to run within the internal shell without this functionality.
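To make the idea concrete, the following Python sketch expands a `%{readfile:<file name>}` pattern inside a command string by splicing in the file's contents, which is roughly what a `$(cat file)` subshell would have provided. It is not lit's implementation: the `expand_readfile` function, the regular expression, and the stripping of trailing newlines are assumptions, and `tool` and `input.ll` are placeholder names.

```python
import re
import tempfile
from pathlib import Path

_READFILE_RE = re.compile(r"%\{readfile:([^}]+)\}")


def expand_readfile(command: str) -> str:
    def replace(match):
        # Assumption: trailing newlines are dropped, as `$(cat file)` would do.
        return Path(match.group(1)).read_text().rstrip("\n")

    return _READFILE_RE.sub(replace, command)


with tempfile.TemporaryDirectory() as tmp:
    option_file = Path(tmp) / "opt.txt"
    option_file.write_text("-O2\n")
    expanded = expand_readfile(f"tool %{{readfile:{option_file}}} input.ll")

print(expanded)
```

Using a function as the replacement argument means the file contents are inserted literally, without any backslash or group-reference processing.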
Reviewers: petrhosek, jh7370, ilovepi, cmtice
Reviewed By: jh7370, cmtice
Pull Request: https://github.com/llvm/llvm-project/pull/158441
- The output for `--output-sort=id` matches `--output-sort=offset` for
the available readers. Tests were updated accordingly.
- For `--output-sort=none`, and per `LVReader::sortScopes()`,
`LVScope::sort()` is called on the root scope.
`LVScope::sort()` has no effect if `getSortFunction() == nullptr`, and
thus the elements are currently traversed in the order in which they
were initially added. This should change, however, after
`LVScope::Children` is removed.
This PR adds the `extract-section` option to `llvm-objcopy` as a common
option. It differs from `dump-section` in that it produces a standalone
object with just one section, as opposed to just the section contents.
For more context as to other options considered, see
https://github.com/llvm/llvm-project/pull/153265#issuecomment-3195696003.
This difference in behaviour is used for DXC compatibility with
`extract-rootsignature` and `/Frs`.
This PR then implements this functionality for `DXContainer` objects.
This is the second step of
https://github.com/llvm/llvm-project/issues/150277 to implement this as a
compiler action that invokes `llvm-objcopy` for the functionality.
This also completes the implementation of `extract-rootsignature` as
described in https://github.com/llvm/llvm-project/issues/149560.