750 Commits

Author SHA1 Message Date
Fangrui Song
6ef632ad36 [BOLT,test] Fix lsda.ldscript when MAXPAGESIZE>=0x10000
The intention is to check a section name different from
.gcc_except_table . Rather than using a linker script, use llvm-objcopy
--rename-section instead.
2024-06-03 13:18:58 -07:00
Sayhaan Siddiqui
598f37bb27
[BOLT][DWARF][NFC] Add split-dwarf4 test with multiple CUs (#93741)
Adds a split-dwarf test for DWARF4 with multiple CUs.
2024-06-01 08:14:41 -07:00
Fangrui Song
0353f6abdd [BOLT][test] Use correct normalized triple
bolt/test/lit.local.cfg wants to use the system GCC installation but it
specifies a wrong triple ("linux" instead of "linux-gnu") and relies on
clangDriver's loose GCC installation detection to pick up "*-linux-gnu".

This loose behavior may not work. Use "linux-gnu" instead.

Note: neither "linux" nor "linux-gnu" detects "linux-musl" triples, so
these tests currently fail on musl based systems.

Other files changes are cosmetic.
2024-05-31 17:50:58 -07:00
Sayhaan Siddiqui
a585446110
[BOLT][DWARF][NFC] Fix formatting issue in DWARF4 split-dwarf test (#93747)
Remove double escape characters before a RUN in a test.
2024-05-31 15:16:41 -07:00
Sayhaan Siddiqui
8d239d7fdf
[BOLT][DWARF][NFC] Fix formatting issue in DWARF5 split-dwarf test (#93746)
Remove double escape characters before a RUN in a test.
2024-05-31 15:16:21 -07:00
Sayhaan Siddiqui
278b396465
[BOLT][DWARF][NFC] Add tests with multiple CUs (#93615)
Adds DWARF4 and DWARF5 tests with multiple CUs.
2024-05-31 15:15:40 -07:00
Sayhaan Siddiqui
11791ae7b0
[BOLT][DWARF][NFC] Added double escape characters (#93348)
Added double escape characters to lines that describe a test.
2024-05-31 15:14:37 -07:00
Amir Ayupov
e9954ec087 [BOLT] Detect .warm split functions as cold fragments (#93759)
CDSplit splits functions up to three ways: main fragment with no suffix,
and fragments with .cold and .warm suffixes.

Add .warm suffix to the regex used to recognize split fragments.

Test Plan: updated register-fragments-bolt-symbols.s
2024-05-30 17:48:12 -07:00
Fangrui Song
ce5b371606
[BOLT,test] Make linker scripts less sensitive to lld's orphan placement (#93763)
Then two tests rely on .interp being the first section.
llvm-bolt would crash if lld places .interp after .got
(f639b57f7993cadb82ee9c36f04703ae4430ed85).

For best portability, when a linker scripts specifies a SECTIONS
command, the first section for each PT_LOAD segment should be specified
with a MAXPAGESIZE alignment. Otherwise, linkers have freedom to decide
how to place orphan sections, which might break intention.
2024-05-30 10:12:41 -07:00
Michael Kruse
c5a3f664fe
[BOLT] Revise IDE folder structure (#89742)
Update the folder titles for targets in the monorepository that have not
seen taken care of for some time. These are the folders that targets are
organized in Visual Studio and XCode (`set_property(TARGET <target>
PROPERTY FOLDER "<title>")`) when using the respective CMake's IDE
generator.

 * Ensure that every target is in a folder
 * Use a folder hierarchy with each LLVM subproject as a top-level folder
 * Use consistent folder names between subprojects
 * When using target-creating functions from AddLLVM.cmake, automatically
deduce the folder. This reduces the number of
`set_property`/`set_target_property`, but are still necessary when
`add_custom_target`, `add_executable`, `add_library`, etc. are used. A
LLVM_SUBPROJECT_TITLE definition is used for that in each subproject's
root CMakeLists.txt.
2024-05-25 17:15:37 +02:00
Alexander Yermolovich
8c2da89ec4
[BOLT] Do not emit debug_names entry for DIEs with DW_AT_declaration (#93347)
Previously BOLT was only doing it for DW_TAG_variables. It looks like
other type of DIEs can have this. So making it global.
2024-05-25 07:48:57 -07:00
Amir Ayupov
d1d9545ed3
[BOLT][BAT] Add entries for deleted basic blocks
Deleted basic blocks are required for correct mapping of branches
modified by SCTC.

Increases BAT size, bytes:
- large binary: 8622496 -> 8703244.
- small binary (X86/bolt-address-translation.test): 928 -> 940.

Test Plan: updated bb-with-two-tail-calls.s

Reviewers: ayermolo, dcci, maksfb, rafaelauler

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/91906
2024-05-23 19:19:07 -07:00
Amir Ayupov
97025bd9d5
[BOLT] Use getLocationName in YAMLProfileWriter (#92493)
Disambiguate local functions using the containing file symbol in BAT
mode. Make local function naming consistent across BAT fdata and YAML
profiles.

Test Plan: updated register-fragments-bolt-symbols.s
2024-05-21 20:24:46 -07:00
Amir Ayupov
935b946b1f
[BOLT] Process cross references between ignored functions in BAT mode (#92484)
To align YAML and fdata profiles produced in BAT mode, lift two
restrictions applied in non-relocation mode when BAT is present:
1) register secondary entry points from ignored functions,
2) treat functions with secondary entry points as simple.

This allows constructing CFG for non-simple functions in non-relocation
mode and emitting YAML profile for them, which can then be used for
optimizations in relocation mode.

Test Plan: added test ignored-interprocedural-reference.s
2024-05-21 20:22:12 -07:00
Amir Ayupov
a9b67490b2
[BOLT] Report adjusted program stats from perf2bolt in BAT mode (#91683) 2024-05-21 18:54:15 -07:00
Amir Ayupov
bb627b0a0c
[BOLT] Ignore special symbols as function aliases in updateELFSymbolTable
Exempt special symbols (hot text/data and _end symbol) from normal
handling. We only need to set their value and make them absolute.

If these symbols are handled as normal symbols and if they alias
functions we may create non-sensical symbols, e.g. __hot_start.cold.

Test Plan: updated hot-end-symbol.s

Reviewers: maksfb, rafaelauler, ayermolo, dcci

Reviewed By: dcci, maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/92713
2024-05-20 16:55:11 -07:00
Amir Ayupov
91423d7193
[BOLT][NFC] Don't assign YAML profile to functions with no CFG (#92487)
YAML profile for non-simple functions without CFG is
  1) useless for optimizations,
  2) can't be attached, similar to fdata profile,
  3) would be reported as invalid/stale even if the profile is valid.

Don't attempt to attach the profile in this case, aligning the behavior
to DataReader.

Test Plan: added yaml-non-simple.test
2024-05-19 20:15:31 -07:00
Amir Ayupov
878642954f
[BOLT] Fix preserved offset in fixDoubleJumps (#92485) 2024-05-19 13:23:04 -07:00
Alexander Yermolovich
99fad7ebd8
[BOLT][DWARF] Update DW_AT_comp_dir/DW_AT_dwo_name for DWO TUs (#91486)
Type unit DIE generated by clang contains DW_AT_comp_dir/DW_AT_dwo_name.
This was added to clang to help LLDB to figure out where type unit come
from when accessing an entry in a .debug_names accelerator table and
type units in .dwp file.

When BOLT writes out .dwo files it changes the name of them. User can
also specify directory of where they can be written out. Added support
to BOLT to update those attributes.
2024-05-14 15:08:45 -07:00
Amir Ayupov
b06f97b039
[BOLT] Allow pass-through blocks in YAMLProfileReader (#91828) 2024-05-13 18:02:38 -07:00
Amir Ayupov
4ecf2caf68
[BOLT] Use aggregated FuncBranchData in writeBATYAML
Switch from FuncBranchData intermediate maps (Intra/InterIndex)
to aggregated Data, same as one used by DataReader:
e62ce1f884/bolt/lib/Profile/DataReader.cpp (L385-L389)
This aligns the order of the output between YAMLProfileWriter and
writeBATYAML.

Test Plan: updated bolt-address-translation-yaml.test

Reviewers: rafaelauler, dcci, ayermolo, maksfb

Reviewed By: ayermolo, maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/91289
2024-05-13 14:23:32 -07:00
Amir Ayupov
54b17fa4ee [BOLT] Preserve Offset annotation in fixDoubleJumps (#91898)
Offset annotation was missed when optimizing an unconditional branch to
a tail call.

Test Plan: update bb-with-two-tail-calls.s
2024-05-13 12:31:48 -07:00
Amir Ayupov
b5af667b01
[BOLT] Map branch source address to the containing basic block in BAT YAML
Fix an issue where the profile for all branches that have a BRANCHENTRY
is dropped. If the branch has an entry in BAT, it will be translated to
its input offset. We used to only permit the basic block offset as a
branch source. Perform a lookup of containing basic block instead.

Test Plan: Updated bolt-address-translation-yaml.test

Reviewers: maksfb, dcci, rafaelauler, ayermolo

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/91273
2024-05-12 17:11:09 -07:00
Maksim Panchenko
c8864bceeb
[BOLT] Fix race condition in a test (#91866)
Fix race condition in internal NFC test.
2024-05-11 23:59:57 -07:00
Amir Ayupov
4f127667ca
[BOLT] Set entry counts in BAT YAML profile (#91775)
Align with DataReader::readProfile that sets entry block counts from
FuncBranchData->EntryData.

Test Plan: updated bolt-address-translation-yaml.test
2024-05-10 22:23:45 -07:00
Amir Ayupov
bbcdd4f4b2
[BOLT] Use disambiguated local names in BAT YAML
Align BAT YAML to fdata profile.

Test Plan: updated register-fragments-bolt-symbols.s

Reviewers: dcci, rafaelauler, ayermolo, maksfb

Reviewed By: dcci

Pull Request: https://github.com/llvm/llvm-project/pull/91773
2024-05-10 22:18:50 -07:00
Amir Ayupov
6b9bca8faa
[BOLT] Preserve Offset annotation in SCTC (#91693)
Offset annotation is used in writing BAT tables.

Test Plan: updated sctc-bug4.test
2024-05-10 13:20:51 -07:00
Maksim Panchenko
73a01448c7
[BOLT] Add test case for PIC fixed indirect jump (#91547)
A compiler can generate a redundant indirection for a jump via a fixed
jump table target. Add a test case that covers such pattern that covers
PIC case. We already have non-PIC case detection.

Currently XFAIL.
2024-05-08 17:56:44 -07:00
Amir Ayupov
db29f20fdd
[BOLT] Ignore returns in DataAggregator
Returns are ignored in perf/pre-aggregated/fdata profile reader (see
DataReader::convertBranchData). They are also omitted in
YAMLProfileWriter by virtue of not having the profile attached to them
in the reader, and YAMLProfileWriter converting the profile attached to
BinaryFunctions. Thus, return profile is universally ignored across all
profile types except BAT YAML.

To make returns ignored for YAML produced in BAT mode, we can:
1) ignore them in YAMLProfileReader,
2) omit them from YAML profile in profile conversion/writing.

The first option is prone to profile staleness issue, where the profiled
binary doesn't match the one to be optimized, and thus returns in the
profile can no longer be reliably detected (as we don't distinguish them
from calls in the profile).

The second option is robust to staleness but requires disassembling the
branch source instruction.

Test Plan: Updated bolt-address-translation-yaml.test

Reviewers: rafaelauler, dcci, ayermolo, maksfb

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/90807
2024-05-08 12:02:18 -07:00
Maksim Panchenko
ff0c5ccbe8
[BOLT] Add a test for BOLT-reserved space in a binary (#91399)
Test case for #90300.
2024-05-07 16:05:10 -07:00
Maksim Panchenko
99b4532b8b
[BOLT] Add support for Linux kernel .smp_locks section (#90798)
Parse .smp_locks section entries and create fixups that are going to be
used to update the section before the binary emission.
2024-05-02 13:08:37 -07:00
Maksim Panchenko
59ab29213d
[BOLT] Register Linux kernel dynamic branch offsets (#90677)
To match profile data to code we need to know branch instruction offsets
within a function. For this reason, we mark branches with the "Offset"
annotation while disassembling the code. However, _dynamic_ branches in
the Linux kernel could be NOPs in disassembled code, and we ignore them
while adding annotations. We need to explicitly add the "Offset"
annotation while creating dynamic branches.

Note that without this change, `getInstructionAtOffset()` would still
return a branch instruction if the offset matched the last instruction
in a basic block (and the profile data was matched correctly). However,
the function failed for cases when the searched instruction was followed
by an unconditional jump. "Offset" annotation solves this case.
2024-05-01 21:56:55 -07:00
Amir Ayupov
721c31e3bd Revert "[BOLT] Avoid reference updates for non-JT symbol operands (#88838)"
This reverts commit 9d5411ffba0d94b60050cc873773935addca9533.

Breaks aarch64 buildbot:
https://lab.llvm.org/buildbot/#/builders/221/builds/22130
2024-04-30 09:03:53 -07:00
sinan
9d5411ffba
[BOLT] Avoid reference updates for non-JT symbol operands (#88838)
Skip updating references for operands that do not directly
refer to jump table symbols but fall within a jump table's
address range to prevent unintended modifications.
2024-04-30 11:27:18 +08:00
Amir Ayupov
c4c4e17c99
[BOLT] Use heuristic for matching split local functions (#90424)
Use known order of BOLT split function symbols: fragment symbols
immediately precede the parent fragment symbol.

Depends On: https://github.com/llvm/llvm-project/pull/89648

Test Plan: Added register-fragments-bolt-symbols.s
2024-04-29 16:18:13 -07:00
Amir Ayupov
a1e9608b0f
[BOLT] Use symbol table info in registerFragment (#89648)
Fragment matching relies on symbol names to identify and register split
function fragments. However, as split fragments are often local symbols,
name aliasing is possible. For such cases, use symbol table to resolve
ambiguities.

This requires the presence of FILE symbols in the input binary. As BOLT
requires non-stripped binary, this is a reasonable assumption. Note that
`strip -g` removes FILE symbols by default, but `--keep-file-symbols`
can be used to preserve them.

Depends on: https://github.com/llvm/llvm-project/pull/89861

Test Plan:
Updated X86/fragment-lite.s
2024-04-29 11:14:31 -07:00
Fangrui Song
e982032199
[BOLT,RISCV] Remove empty name special case from #68977
The special case is unneeded after #89693.

Pull Request: https://github.com/llvm/llvm-project/pull/90004
2024-04-25 20:42:40 -07:00
Amir Ayupov
5fb59e7447
[BOLT] Print program stats in perf2bolt/aggregate-only mode (#89763) 2024-04-25 19:08:51 +02:00
Amir Ayupov
090c92e015
[BOLT] Emit synthetic FILE symbol for local cold fragments of global symbols (#89794) 2024-04-25 04:53:15 +02:00
Fangrui Song
b9f2c16b50 [MC] Rename temporary symbols of empty name to ".L0 " (#89693)
Temporary symbols generated for .eh_frame and .debug_line have an empty
name, which appear in .symtab in the presence of RISC-V style linker
relaxation and will not be discarded by ld/objcopy --discard-locals
(-X).

In contrast, GNU assembler's riscv port assigns a fake name ".L0 " (with
a trailing space) to these symbols so that will be discarded by
ld/objcopy --discard-locals.

This patch matches the GNU behavior. Since Clang's RISC-V targets pass
-X to ld, and GNU ld defaults to -X for RISC-V targets, these ".L0 "
symbols will be discarded after linking by default, as expected by
users.

The llvm-symbolizer special case for RISC-V `SF_FormatSpecific` symbols
https://reviews.llvm.org/D98669 needs to be adjusted.

Note: `"":` in assembly currently crashes.
2024-04-24 16:25:45 -07:00
Maksim Panchenko
418e4b0c4f
[BOLT] Detect incorrect update of dynamic relocations (#89681)
When we rewrite dynamic relocations, there could be cases where they
reference code locations inside functions that were rewritten. When this
happens, we need to precisely map old address to a new one. Until we can
reliably perform the mapping, detect such condition and issue an error
refusing to write a broken binary.
2024-04-24 14:03:33 -07:00
Fangrui Song
d9169ffaf7 [BOLT,test] Update AArch64/constant_island_pie_update.s after llvm-readelf -r RELR change 2024-04-19 15:29:14 -07:00
Amir Ayupov
b79b6f9cf0
[BOLT] Use offset deduplication for cold fragments
Apply deduplication for uniformity and BAT section size reduction.

Changes BAT section size to:
- large binary: 39541552 bytes (1.02x original),
- medium binary: 3828996 bytes (0.64x),
- small binary: 928 bytes (0.65x).

Test Plan: Updated bolt-address-translation.test

Reviewers: rafaelauler, dcci, ayermolo, JDevlieghere, maksfb

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/87853
2024-04-15 09:50:12 +02:00
Maksim Panchenko
43d0891d3b
[BOLT] Fix handling of trailing entries in jump tables (#88444)
If a jump table has entries at the end that are a result of
__builtin_unreachable() targets, BOLT can confuse them with function
pointers. In such case, we should exclude these targets from the table
as we risk incorrectly updating the function pointers. It is safe to
exclude them as branching on such targets is considered an undefined
behavior.
2024-04-11 16:11:00 -07:00
Amir Ayupov
3997f0eb81
[BOLT] Cover all call sites in writeBATYAML
Call site information setting was conditioned on branch information
presence for a given block. However, it's possible to have sampled
profile lacking one or the other for a given basic block.

Iterate over branch profiles and call profiles independently to cover
all recorded profile data.

Depends on https://github.com/llvm/llvm-project/pull/87569

Test Plan: Updated bolt/test/X86/yaml-secondary-entry-discriminator.s

Reviewers: ayermolo, dcci, maksfb, rafaelauler

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/87743
2024-04-11 21:15:04 +02:00
Amir Ayupov
8840992667
[BOLT][BAT] Fix handling of split functions
Move BAT parent function lookup outside `getLocationName`, to the
scope where we retrieve `FuncBranchData` linked with the function.

Previously DataAggregator would store branch profile recorded in the
split fragment in `FuncBranchData` associated with the fragment, and
perform name translation in `getLocationName` for symbol name only.
This works for fdata profile which is printed out as-is, but doesn't
work with BAT YAML profile writer which requires a combined profile.

The issue necessitated `fixupBATProfile` which partially addressed the
issue (reassigned inter-fragment calls back into intra-function
branches). However, `fixupBATProfile` fails to address disjoint
profiles (i.e. doesn't merge `FuncBranchData` for fragments back
into parent). This diff eliminates the need for `fixupBATProfile` by
removing the root cause of the issue.

Test Plan: NFC for existing tests

Reviewers: ayermolo, dcci, rafaelauler, maksfb

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/87569
2024-04-11 21:07:36 +02:00
Amir Ayupov
def6174d2a
[BOLT] Emit empty FDE for injected functions
This fixes an issue where `PatchEntries` overwrites function body but
keeps CFI untouched. Existing FDEs thus become invalid. This doesn't
affect unwinding because patched functions are transparent from
EH/unwinding perspective, but it breaks BOLT during disassembling those
functions.

Emit empty FDE for injected functions (emitted to the same address as
.org functions) that take precedence over the original FDE.

This adds eh_frame overhead, but restores the ability to disassemble
.org functions. Note that the overhead is avoided in `-use-old-text`
mode.

Test Plan: updated bolt/test/X86/patch-entries.test

Reviewers: rafaelauler, maksfb, dcci, ayermolo

Reviewed By: maksfb, dcci

Pull Request: https://github.com/llvm/llvm-project/pull/87967
2024-04-11 10:37:57 +02:00
Amir Ayupov
0227623915
[BOLT][BAT] Support multi-way split functions
BAT writeMaps encoded the assumption that functions are only split into
two fragments (hot and cold). However, BOLT supports splitting into
arbitrary number of fragments. Relax that assumption and look up primary
(hot) fragment explicitly.

Depends on: https://github.com/llvm/llvm-project/pull/86219

Test Plan: Updated bolt/test/X86/yaml-secondary-entry-discriminator.s

Reviewers: ayermolo, rafaelauler, maksfb, dcci

Reviewed By: maksfb, dcci

Pull Request: https://github.com/llvm/llvm-project/pull/87123
2024-04-05 17:50:50 -07:00
Amir Ayupov
2d3c827c05
[BOLT] Use BAT for YAML profile call target information
Provide a mechanism to resolve call target information for calls from non-BAT
functions to BAT functions (`YAMLProfileWriter::convert`). Make it generic for
future use in BAT-to-BAT calls.

Test Plan: Updated bolt/test/X86/bolt-address-translation-yaml.test

Reviewers: ayermolo, maksfb, rafaelauler, dcci

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/86219
2024-04-05 16:08:59 -07:00
Maksim Panchenko
7de82ca369
[BOLT] Don't terminate on trap instruction for Linux kernel (#87021)
Under normal circumstances, we terminate basic blocks on a trap
instruction. However, Linux kernel may resume execution after hitting a
trap (ud2 on x86). Thus, we introduce "--terminal-trap" option that will
specify if the trap instruction should terminate the control flow. The
option is on by default except for the Linux kernel mode when it's off.
2024-03-29 16:41:15 -07:00