763 Commits

Author SHA1 Message Date
shaw young
32e4906c28
Revert "[BOLT] Hash-based function matching" (#96568)
Reverts llvm/llvm-project#95821
2024-06-24 18:44:24 -04:00
shaw young
5e097c79d8
[BOLT] Hash-based function matching (#95821)
Using the hashes of binary and profiled functions
to recover functions with changed names.

Test Plan: added hashing-based-function-matching.test.
2024-06-24 15:29:44 -07:00
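A minimal sketch of the idea behind 5e097c79d8, assuming a function can be summarized by hashing its opcode sequence. It is written in Python for brevity and is not BOLT's C++ implementation; `function_hash`, `match_profiles`, and the opcode-list representation are hypothetical names chosen for illustration.

```python
import hashlib


def function_hash(opcodes):
    """Hash a function body given as a list of opcode names (a stand-in for a real CFG hash)."""
    return hashlib.sha1(" ".join(opcodes).encode()).hexdigest()


def match_profiles(binary_funcs, profiled_funcs):
    """Map profiled function names to binary function names.

    Names are matched first; profiled functions whose name is missing from the
    binary are then matched to a not-yet-matched binary function with the same hash.
    """
    matches = {name: name for name in profiled_funcs if name in binary_funcs}

    # Index the binary functions that were not matched by name, keyed by hash.
    by_hash = {}
    for name, body in binary_funcs.items():
        if name not in matches:
            by_hash.setdefault(function_hash(body), name)

    for name, body in profiled_funcs.items():
        if name in matches:
            continue
        candidate = by_hash.pop(function_hash(body), None)
        if candidate is not None:
            matches[name] = candidate
    return matches
```

In this sketch a renamed function is recovered only when its body hashes identically; a function that changed both name and body stays unmatched.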
Maksim Panchenko
ad2905e52c
[BOLT] Skip optimization of functions with alt instructions (#95172)
Alternative instructions in the Linux kernel may modify control flow in
a function. As such, it is unsafe to optimize functions with alternative
instructions until we properly support CFG alternatives.

Previously, we marked functions with alt instructions before the
emission, but that could be too late if we remove or replace
instructions with alternatives. We could have marked functions as
non-simple immediately after reading .altinstructions, but it's nice to
be able to view functions after CFG is built. Thus assign the non-simple
status after building CFG.
2024-06-18 12:33:37 -07:00
Hans Wennborg
69753aa43b
[bolt] stale-matching-min-matched-block.test requires asserts
Because of the --debug-only flag.
2024-06-18 13:41:57 +02:00
shaw young
753498eed1
[BOLT] Add sink block to flow CFG in profile inference (#95047)
Summary: Construct an artificial sink block for the
flow CFG in stale profile inference so that profile
inference can run on CFGs with blocks that terminate
and have successors.

Testing Plan: Added infer_no_exits.test to verify that 
functions with exit blocks with a landing pad are 
covered by stale profile inference.

---------

Co-authored-by: Amir Ayupov <fads93@gmail.com>
2024-06-17 16:58:26 -07:00
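The sink-block construction in 753498eed1 can be pictured with a small sketch, assuming the CFG is a mapping from block name to successor list. This is Python for illustration only, not BOLT's code; `add_sink`, `exit_blocks`, and the `__sink__` label are hypothetical.

```python
def add_sink(successors, exit_blocks, sink="__sink__"):
    """Return a copy of the CFG in which every exit block also flows into one sink.

    successors:  dict mapping block name -> list of successor block names
    exit_blocks: blocks that terminate the function; they may still have
                 successors (for example a landing pad)
    """
    flow = {block: list(succs) for block, succs in successors.items()}
    for block in exit_blocks:
        # Keep existing successors and add the artificial edge to the sink.
        flow.setdefault(block, []).append(sink)
    flow[sink] = []
    return flow
```

With a single sink, every path through the flow graph has somewhere to end even when no block is a pure exit.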
Maksim Panchenko
c67ecf3853
[BOLT][tests] Fix jrcxz instruction test (#95861)
Rewrite the test case intended to check that BOLT does not separate
jrcxz instruction from its destination by more than a one-byte offset.
2024-06-17 16:45:34 -07:00
shaw young
68fc8dffe4
[BOLT] Drop high discrepancy profiles in matching (#95156)
Summary: Functions with high discrepancy (measured by matched
function blocks) can be ignored with an added command line
argument for better performance.

Test Plan: Added 
stale-matching-min-matched-block.test

---------

Co-authored-by: Amir Ayupov <aaupov@fb.com>
2024-06-17 15:14:35 -07:00
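A sketch of the filtering described in 68fc8dffe4, assuming the threshold is expressed as a minimum percentage of matched blocks. The commit text does not spell out the exact semantics, so the percentage interpretation and the names `keep_profile` and `min_matched_percent` are assumptions.

```python
def keep_profile(matched_blocks, total_blocks, min_matched_percent):
    """Return True if enough of the function's blocks were matched to keep its profile."""
    if total_blocks == 0:
        # Nothing to match against; treat the profile as unusable.
        return False
    # Integer comparison avoids floating-point rounding at the boundary.
    return 100 * matched_blocks >= min_matched_percent * total_blocks
```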
Paschalis Mpeis
a13bc9714a
[BOLT][AArch64] Implement PLTCall optimization (#93584)
`convertCallToIndirectCall` applies the PLTCall optimization and returns
an (updated if needed) iterator to the converted call instruction. Since
AArch64 needs to inject additional instructions to implement this
pass, the relevant BasicBlock and an iterator are passed to
`convertCallToIndirectCall`.

`NumCallsOptimized` is updated only on successful application of the
pass.

Tests:
- Inputs/plt-tailcall.c: an example of a tail call optimized PLT call.
- AArch64/plt-call.test: the actual A64 test, which runs the
PLTCall optimization on the above input file and verifies that the
pass is applied to the calls to 'printf' and 'puts'.
2024-06-11 19:21:11 +01:00
Maksim Panchenko
540893e43f
[BOLT] Add auto parsing for Linux kernel .altinstructions (#95068)
The .altinstructions section contains a list of structures in which some
fields can have different sizes and other fields may or may not be
present, depending on the kernel version. Add automatic detection of such
variations and use it by default. The user can still override the
automatic detection with the `--alt-inst-has-padlen` and
`--alt-inst-feature-size` options.
2024-06-11 10:52:51 -07:00
Alexander Yermolovich
61589b8599
[BOLT][DWARF] Fix parent chain in debug_names entries with forward declaration. (#93865)
Previously, when an entry was skipped in the parent chain, a child would
point to the next valid entry in the chain. After discussion in
https://github.com/llvm/llvm-project/pull/91808 this was found to be not
very useful. Changed the implementation so that none of the children of a
skipped entry have DW_IDX_parent.
2024-06-05 09:57:11 -07:00
Maksim Panchenko
c923d39509
[BOLT] Fix ValidateMemRefs pass (#94406)
In ValidateMemRefs pass, when we validate references in the form of
`Symbol + Addend`, we should check `Symbol` not `Symbol + Addend`
against aliasing a jump table.

Recommitting with a modified test case:
https://github.com/llvm/llvm-project/pull/88838

Co-authored-by: sinan <sinan.lin@linux.alibaba.com>
2024-06-04 16:12:29 -07:00
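The check described in c923d39509 can be illustrated with a short sketch, assuming jump tables are given as half-open address ranges. This is Python for illustration, not the pass itself; `aliases_jump_table` and `reference_aliases_jump_table` are hypothetical names.

```python
def aliases_jump_table(address, jump_tables):
    """Return True if address falls inside any [start, end) jump-table range."""
    return any(start <= address < end for start, end in jump_tables)


def reference_aliases_jump_table(symbol_address, addend, jump_tables):
    """Validate a `Symbol + Addend` reference by testing the symbol only.

    The addend is deliberately ignored: symbol_address + addend may land
    inside a jump table that the symbol itself has nothing to do with.
    """
    return aliases_jump_table(symbol_address, jump_tables)
```

For a table at `[0x1000, 0x1040)`, a symbol at `0x0ff0` with addend `0x20` targets `0x1010`, which is inside the table, yet the symbol itself does not alias it.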
Sayhaan Siddiqui
7103e60f65
[BOLT][DWARF][NFC] Add split-dwarf5 test with multiple CUs (#93744)
Adds a split-dwarf test for DWARF5 with multiple CUs.
2024-06-04 10:34:29 -07:00
Mehdi Amini
c7b7875e1e
Fix lsda-section-name adding back RUN line incorrectly removed in 6ef632ad36c522b0 (#94301)
2024-06-03 18:43:49 -07:00
Fangrui Song
6ef632ad36
[BOLT,test] Fix lsda.ldscript when MAXPAGESIZE>=0x10000
The intention is to check a section name different from
.gcc_except_table . Rather than using a linker script, use llvm-objcopy
--rename-section instead.
2024-06-03 13:18:58 -07:00
Sayhaan Siddiqui
598f37bb27
[BOLT][DWARF][NFC] Add split-dwarf4 test with multiple CUs (#93741)
Adds a split-dwarf test for DWARF4 with multiple CUs.
2024-06-01 08:14:41 -07:00
Fangrui Song
0353f6abdd
[BOLT][test] Use correct normalized triple
bolt/test/lit.local.cfg wants to use the system GCC installation but it
specifies a wrong triple ("linux" instead of "linux-gnu") and relies on
clangDriver's loose GCC installation detection to pick up "*-linux-gnu".

This loose behavior may not work. Use "linux-gnu" instead.

Note: neither "linux" nor "linux-gnu" detects "linux-musl" triples, so
these tests currently fail on musl based systems.

Other file changes are cosmetic.
2024-05-31 17:50:58 -07:00
Sayhaan Siddiqui
a585446110
[BOLT][DWARF][NFC] Fix formatting issue in DWARF4 split-dwarf test (#93747)
Remove double escape characters before a RUN in a test.
2024-05-31 15:16:41 -07:00
Sayhaan Siddiqui
8d239d7fdf
[BOLT][DWARF][NFC] Fix formatting issue in DWARF5 split-dwarf test (#93746)
Remove double escape characters before a RUN in a test.
2024-05-31 15:16:21 -07:00
Sayhaan Siddiqui
278b396465
[BOLT][DWARF][NFC] Add tests with multiple CUs (#93615)
Adds DWARF4 and DWARF5 tests with multiple CUs.
2024-05-31 15:15:40 -07:00
Sayhaan Siddiqui
11791ae7b0
[BOLT][DWARF][NFC] Added double escape characters (#93348)
Added double escape characters to lines that describe a test.
2024-05-31 15:14:37 -07:00
Amir Ayupov
e9954ec087
[BOLT] Detect .warm split functions as cold fragments (#93759)
CDSplit splits functions up to three ways: main fragment with no suffix,
and fragments with .cold and .warm suffixes.

Add .warm suffix to the regex used to recognize split fragments.

Test Plan: updated register-fragments-bolt-symbols.s
2024-05-30 17:48:12 -07:00
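A sketch of suffix-based fragment recognition as described in e9954ec087. The pattern below is a hypothetical illustration that accepts `.cold` and `.warm` suffixes with an optional numeric index; it is not claimed to be the exact regex BOLT uses, and `parent_of` is an invented helper.

```python
import re

# Hypothetical pattern: <parent>.cold or <parent>.warm, optionally followed by .<N>
FRAGMENT_RE = re.compile(r"^(?P<parent>.+)\.(?P<kind>cold|warm)(\.\d+)?$")


def parent_of(symbol_name):
    """Return the parent function name for a split-fragment symbol, or None."""
    match = FRAGMENT_RE.match(symbol_name)
    return match.group("parent") if match else None
```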
Fangrui Song
ce5b371606
[BOLT,test] Make linker scripts less sensitive to lld's orphan placement (#93763)
The two tests rely on .interp being the first section.
llvm-bolt would crash if lld places .interp after .got
(f639b57f7993cadb82ee9c36f04703ae4430ed85).

For best portability, when a linker script specifies a SECTIONS
command, the first section for each PT_LOAD segment should be specified
with a MAXPAGESIZE alignment. Otherwise, linkers have freedom to decide
how to place orphan sections, which might break the intent.
2024-05-30 10:12:41 -07:00
Michael Kruse
c5a3f664fe
[BOLT] Revise IDE folder structure (#89742)
Update the folder titles for targets in the monorepository that have not
been taken care of for some time. These are the folders that targets are
organized into in Visual Studio and XCode (`set_property(TARGET <target>
PROPERTY FOLDER "<title>")`) when using the respective CMake IDE
generator.

 * Ensure that every target is in a folder
 * Use a folder hierarchy with each LLVM subproject as a top-level folder
 * Use consistent folder names between subprojects
 * When using target-creating functions from AddLLVM.cmake, automatically
deduce the folder. This reduces the number of
`set_property`/`set_target_property` calls, but they are still necessary when
`add_custom_target`, `add_executable`, `add_library`, etc. are used. A
LLVM_SUBPROJECT_TITLE definition is used for that in each subproject's
root CMakeLists.txt.
2024-05-25 17:15:37 +02:00
Alexander Yermolovich
8c2da89ec4
[BOLT] Do not emit debug_names entry for DIEs with DW_AT_declaration (#93347)
Previously BOLT only did this for DW_TAG_variable DIEs. Other types of
DIEs can also have DW_AT_declaration, so apply the check to all DIEs.
2024-05-25 07:48:57 -07:00
Amir Ayupov
d1d9545ed3
[BOLT][BAT] Add entries for deleted basic blocks
Deleted basic blocks are required for correct mapping of branches
modified by SCTC.

Increases BAT size, bytes:
- large binary: 8622496 -> 8703244.
- small binary (X86/bolt-address-translation.test): 928 -> 940.

Test Plan: updated bb-with-two-tail-calls.s

Reviewers: ayermolo, dcci, maksfb, rafaelauler

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/91906
2024-05-23 19:19:07 -07:00
Amir Ayupov
97025bd9d5
[BOLT] Use getLocationName in YAMLProfileWriter (#92493)
Disambiguate local functions using the containing file symbol in BAT
mode. Make local function naming consistent across BAT fdata and YAML
profiles.

Test Plan: updated register-fragments-bolt-symbols.s
2024-05-21 20:24:46 -07:00
Amir Ayupov
935b946b1f
[BOLT] Process cross references between ignored functions in BAT mode (#92484)
To align YAML and fdata profiles produced in BAT mode, lift two
restrictions applied in non-relocation mode when BAT is present:
1) register secondary entry points from ignored functions,
2) treat functions with secondary entry points as simple.

This allows constructing CFG for non-simple functions in non-relocation
mode and emitting YAML profile for them, which can then be used for
optimizations in relocation mode.

Test Plan: added test ignored-interprocedural-reference.s
2024-05-21 20:22:12 -07:00
Amir Ayupov
a9b67490b2
[BOLT] Report adjusted program stats from perf2bolt in BAT mode (#91683)
2024-05-21 18:54:15 -07:00
Amir Ayupov
bb627b0a0c
[BOLT] Ignore special symbols as function aliases in updateELFSymbolTable
Exempt special symbols (hot text/data and _end symbol) from normal
handling. We only need to set their value and make them absolute.

If these symbols are handled as normal symbols and they alias
functions, we may create nonsensical symbols, e.g. __hot_start.cold.

Test Plan: updated hot-end-symbol.s

Reviewers: maksfb, rafaelauler, ayermolo, dcci

Reviewed By: dcci, maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/92713
2024-05-20 16:55:11 -07:00
Amir Ayupov
91423d7193
[BOLT][NFC] Don't assign YAML profile to functions with no CFG (#92487)
YAML profile for non-simple functions without CFG
  1) is useless for optimizations,
  2) can't be attached, similar to fdata profile,
  3) would be reported as invalid/stale even if the profile is valid.

Don't attempt to attach the profile in this case, aligning the behavior
to DataReader.

Test Plan: added yaml-non-simple.test
2024-05-19 20:15:31 -07:00
Amir Ayupov
878642954f
[BOLT] Fix preserved offset in fixDoubleJumps (#92485)
2024-05-19 13:23:04 -07:00
Alexander Yermolovich
99fad7ebd8
[BOLT][DWARF] Update DW_AT_comp_dir/DW_AT_dwo_name for DWO TUs (#91486)
A type unit DIE generated by clang contains DW_AT_comp_dir/DW_AT_dwo_name.
This was added to clang to help LLDB figure out where type units come
from when accessing an entry in a .debug_names accelerator table and
type units in a .dwp file.

When BOLT writes out .dwo files it changes their names. The user can
also specify the directory they are written to. Added support
to BOLT to update those attributes.
2024-05-14 15:08:45 -07:00
Amir Ayupov
b06f97b039
[BOLT] Allow pass-through blocks in YAMLProfileReader (#91828)
2024-05-13 18:02:38 -07:00
Amir Ayupov
4ecf2caf68
[BOLT] Use aggregated FuncBranchData in writeBATYAML
Switch from FuncBranchData intermediate maps (Intra/InterIndex)
to aggregated Data, same as one used by DataReader:
e62ce1f884/bolt/lib/Profile/DataReader.cpp (L385-L389)
This aligns the order of the output between YAMLProfileWriter and
writeBATYAML.

Test Plan: updated bolt-address-translation-yaml.test

Reviewers: rafaelauler, dcci, ayermolo, maksfb

Reviewed By: ayermolo, maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/91289
2024-05-13 14:23:32 -07:00
Amir Ayupov
54b17fa4ee
[BOLT] Preserve Offset annotation in fixDoubleJumps (#91898)
Offset annotation was missed when optimizing an unconditional branch to
a tail call.

Test Plan: update bb-with-two-tail-calls.s
2024-05-13 12:31:48 -07:00
Amir Ayupov
b5af667b01
[BOLT] Map branch source address to the containing basic block in BAT YAML
Fix an issue where the profile for all branches that have a BRANCHENTRY
is dropped. If the branch has an entry in BAT, it will be translated to
its input offset. We used to only permit the basic block offset as a
branch source. Perform a lookup of containing basic block instead.

Test Plan: Updated bolt-address-translation-yaml.test

Reviewers: maksfb, dcci, rafaelauler, ayermolo

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/91273
2024-05-12 17:11:09 -07:00
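The containing-block lookup mentioned in b5af667b01 can be sketched as a binary search over sorted block start offsets. This is a Python illustration under that assumption, not the BAT code itself; `containing_block` is a hypothetical name.

```python
from bisect import bisect_right


def containing_block(block_offsets, offset):
    """Return the start offset of the basic block containing `offset`.

    block_offsets must be sorted in ascending order. Returns None when
    `offset` precedes the first block.
    """
    index = bisect_right(block_offsets, offset) - 1
    return block_offsets[index] if index >= 0 else None
```

Accepting any offset inside a block, rather than only the block's start offset, is what lets a translated branch address in the middle of a block still be attributed to that block.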
Maksim Panchenko
c8864bceeb
[BOLT] Fix race condition in a test (#91866)
Fix race condition in internal NFC test.
2024-05-11 23:59:57 -07:00
Amir Ayupov
4f127667ca
[BOLT] Set entry counts in BAT YAML profile (#91775)
Align with DataReader::readProfile that sets entry block counts from
FuncBranchData->EntryData.

Test Plan: updated bolt-address-translation-yaml.test
2024-05-10 22:23:45 -07:00
Amir Ayupov
bbcdd4f4b2
[BOLT] Use disambiguated local names in BAT YAML
Align BAT YAML to fdata profile.

Test Plan: updated register-fragments-bolt-symbols.s

Reviewers: dcci, rafaelauler, ayermolo, maksfb

Reviewed By: dcci

Pull Request: https://github.com/llvm/llvm-project/pull/91773
2024-05-10 22:18:50 -07:00
Amir Ayupov
6b9bca8faa
[BOLT] Preserve Offset annotation in SCTC (#91693)
Offset annotation is used in writing BAT tables.

Test Plan: updated sctc-bug4.test
2024-05-10 13:20:51 -07:00
Maksim Panchenko
73a01448c7
[BOLT] Add test case for PIC fixed indirect jump (#91547)
A compiler can generate a redundant indirection for a jump via a fixed
jump table target. Add a test case that covers this pattern in the
PIC case. We already detect the non-PIC case.

Currently XFAIL.
2024-05-08 17:56:44 -07:00
Amir Ayupov
db29f20fdd
[BOLT] Ignore returns in DataAggregator
Returns are ignored in perf/pre-aggregated/fdata profile reader (see
DataReader::convertBranchData). They are also omitted in
YAMLProfileWriter by virtue of not having the profile attached to them
in the reader, and YAMLProfileWriter converting the profile attached to
BinaryFunctions. Thus, return profile is universally ignored across all
profile types except BAT YAML.

To make returns ignored for YAML produced in BAT mode, we can:
1) ignore them in YAMLProfileReader,
2) omit them from YAML profile in profile conversion/writing.

The first option is prone to profile staleness issues, where the profiled
binary doesn't match the one to be optimized, and thus returns in the
profile can no longer be reliably detected (as we don't distinguish them
from calls in the profile).

The second option is robust to staleness but requires disassembling the
branch source instruction.

Test Plan: Updated bolt-address-translation-yaml.test

Reviewers: rafaelauler, dcci, ayermolo, maksfb

Reviewed By: maksfb

Pull Request: https://github.com/llvm/llvm-project/pull/90807
2024-05-08 12:02:18 -07:00
Maksim Panchenko
ff0c5ccbe8
[BOLT] Add a test for BOLT-reserved space in a binary (#91399)
Test case for #90300.
2024-05-07 16:05:10 -07:00
Maksim Panchenko
99b4532b8b
[BOLT] Add support for Linux kernel .smp_locks section (#90798)
Parse .smp_locks section entries and create fixups that are going to be
used to update the section before the binary emission.
2024-05-02 13:08:37 -07:00
Maksim Panchenko
59ab29213d
[BOLT] Register Linux kernel dynamic branch offsets (#90677)
To match profile data to code we need to know branch instruction offsets
within a function. For this reason, we mark branches with the "Offset"
annotation while disassembling the code. However, _dynamic_ branches in
the Linux kernel could be NOPs in disassembled code, and we ignore them
while adding annotations. We need to explicitly add the "Offset"
annotation while creating dynamic branches.

Note that without this change, `getInstructionAtOffset()` would still
return a branch instruction if the offset matched the last instruction
in a basic block (and the profile data was matched correctly). However,
the function failed for cases when the searched instruction was followed
by an unconditional jump. "Offset" annotation solves this case.
2024-05-01 21:56:55 -07:00
Amir Ayupov
721c31e3bd
Revert "[BOLT] Avoid reference updates for non-JT symbol operands (#88838)"
This reverts commit 9d5411ffba0d94b60050cc873773935addca9533.

Breaks aarch64 buildbot:
https://lab.llvm.org/buildbot/#/builders/221/builds/22130
2024-04-30 09:03:53 -07:00
sinan
9d5411ffba
[BOLT] Avoid reference updates for non-JT symbol operands (#88838)
Skip updating references for operands that do not directly
refer to jump table symbols but fall within a jump table's
address range to prevent unintended modifications.
2024-04-30 11:27:18 +08:00
Amir Ayupov
c4c4e17c99
[BOLT] Use heuristic for matching split local functions (#90424)
Use known order of BOLT split function symbols: fragment symbols
immediately precede the parent fragment symbol.

Depends On: https://github.com/llvm/llvm-project/pull/89648

Test Plan: Added register-fragments-bolt-symbols.s
2024-04-29 16:18:13 -07:00
Amir Ayupov
a1e9608b0f
[BOLT] Use symbol table info in registerFragment (#89648)
Fragment matching relies on symbol names to identify and register split
function fragments. However, as split fragments are often local symbols,
name aliasing is possible. For such cases, use symbol table to resolve
ambiguities.

This requires the presence of FILE symbols in the input binary. As BOLT
requires a non-stripped binary, this is a reasonable assumption. Note that
`strip -g` removes FILE symbols by default, but `--keep-file-symbols`
can be used to preserve them.

Depends on: https://github.com/llvm/llvm-project/pull/89861

Test Plan:
Updated X86/fragment-lite.s
2024-04-29 11:14:31 -07:00
Fangrui Song
e982032199
[BOLT,RISCV] Remove empty name special case from #68977
The special case is unneeded after #89693.

Pull Request: https://github.com/llvm/llvm-project/pull/90004
2024-04-25 20:42:40 -07:00