Use SymbolStringPtr for Symbol names in LinkGraph. This reduces string interning
on the boundary between JITLink and ORC, and allows pointer comparisons (rather
than string comparisons) between Symbol names. This should improve the
performance and readability of code that bridges between JITLink and ORC (e.g.
ObjectLinkingLayer and ObjectLinkingLayer::Plugins).
To enable use of SymbolStringPtr a std::shared_ptr<SymbolStringPool> is added to
LinkGraph and threaded through to its construction sites in LLVM and Bolt. All
LinkGraphs that are to have symbol names compared by pointer equality must point
to the same SymbolStringPool instance, which in ORC sessions should be the pool
attached to the ExecutionSession.
---------
Co-authored-by: Lang Hames <lhames@gmail.com>
This reapplies aba6bb0820b, which was reverted in 28e2a891210 due to bot
failures. It contains fixes to silence warnings for uncovered switches,
and for incorrect initializer-symbol handling on ELF and COFF.
SideEffectsOnly is a new jitlink::Scope value that corresponds to the
JITSymbolFlags::MaterializationSideEffectsOnly flag: Symbols with this scope
can be looked up (and form part of the initial interface of a LinkGraph) but
never actually resolve to an address (so can only be looked up with a
WeaklyReferencedSymbol lookup).
Previously ObjectLinkingLayer implicitly treated JITLink symbols as having this
scope, regardless of a Symbol's actual scope, if the
MaterializationSideEffectsOnly flag was set on the corresponding symbol in the
MaterializationResponsibility object. Using an explicit scope in JITLink for
this (1) allows JITLink plugins to identify and correctly handle
side-effects-only symbols, and (2) allows raw LinkGraphs to define
side-effects-only symbols without clients having to manually modify their
`MaterializationUnit::Interface`.
When transferring resources, the destination tracker key may not be in
the internal map, invalidating iterators and value references. The added
test creates such situation and would fail before with "Finalized
allocation was not deallocated."
For good measure, fix the same pattern in RTDyldObjectLinkingLayer
which is harder to test because it "only" results in memory managers
being deleted in the wrong order.
Instead, when a MaterializationResponsibility contains an initializer symbol,
the Platform classes (MachO, COFF, ELFNix) will now add a defined symbol with
the same name to an arbitary block within the initializer sections, and then
add keep-alive edges from that symbol to all other init section blocks.
ObjectLinkingLayer is updated to automatically discard symbols where the
corresponding MaterializationResponsibility entry has the
MaterializationSideEffecstsOnly flag. This change simplifies both the
ObjectLinkingLayer::Plugin interface and the dependence tracking algorithm,
which no longer needs a special case for "synthetic"
(MaterializationSideEffectsOnly) symbols.
ObjectLinkingLayer::registerDependencies used to propagate external symbol
dependencies (dependencies on symbols outside the current graph) to all nodes.
Since ebe8733a11e, which merged addDependencies into notifyEmitted, the
notifyEmitted function will propagate intra-graph dependencies, so
registerDependencies no longer needs to do this.
This patch updates ObjectLinkingLayer::registerDependencies to just propagate
named dependencies (on both internal and external symbols) through anonymous
blocks, leaving the rest of the work to ExecutionSession::notifyEmitted.
It also choses a key symbol to use for blocks containing multiple symbols. The
result is both easier to read and faster.
In certain pathological object files we were getting extremely slow
linking because we were repeatedly propogating dependencies to the same
blocks instead of accumulating as many changes as possible. Change the
order of iteration so that we go through every node in the worklist
before returning to any previous node, reducing the number of expensive
dependency iterations.
In practice, this took one case from 60 seconds to 2 seconds. Note: the
performance is still non-deterministic, because the block order itself
is non-deterministic.
rdar://133734391
This fixes a bug in ObjectLinkingLayer::computeBlockNonLocalDeps: The worklist
needs to be built *after* all immediate dependencies / dependants are recorded,
rather than trying to populate it as part of the same loop. (Trying to do the
latter causes us to miss some blocks that should have been included in the
worklist).
This fixes a bug discovered by @Sahil123 on discord during work on
out-of-process execution support in the clang-repl.
No testcase yet. This *might* be testable with a unit test and a custom
JITLinkContext but I believe some aspects of the algorithm depend on memory
layout. I'll need to investigate that. Alternatively we could add llvm-jitlink
testcases that exercise concurrent linking (and should probably do that anyway).
Either option will require some investment and I don't want to hold this fix up
in the mean time.
Previously ObjectLinkingLayer held unique ownership of Plugins, and links
always used the Layer's plugin list at each step. This can cause problems if
plugins are added while links are in progress however, as the newly added
plugin may receive only some of the callbacks for links that are already
running.
In this patch each link gets its own copy of the pipeline that remains
consistent throughout the link's lifetime, and it is guaranteed that Plugin
objects (now with shared ownership) will remain valid until the link completes.
Coding my way home: 9.80469S, 139.03167W
If notifyEmitted encounters a failure (either because some plugin returned one,
or because the ResourceTracker was defunct) then we need to deallocate the
FinalizedAlloc manually.
No testcase yet: This requires a concurrent setup -- we'll need to build some
infrastructure to coordinate links and deliberately injected failures in order
to reliably test this.
Remove an overly aggressive cantFail: This call to defineMaterializing should
never fail with a duplicate symbols error (since all new symbols shoul be
weak), but may fail if the tracker has become defunct in the mean time. In that
case we need to propagate the error.
Removes the MaterializationResponsibility::addDependencies and
addDependenciesForAll methods, and transfers dependency registration to
the notifyEmitted operation. The new dependency registration allows
dependencies to be specified for arbitrary subsets of the
MaterializationResponsibility's symbols (rather than just single symbols
or all symbols) via an array of SymbolDependenceGroups (pairs of symbol
sets and corresponding dependencies for that set).
This patch aims to both improve emission performance and simplify
dependence tracking. By eliminating some states (e.g. symbols having
registered dependencies but not yet being resolved or emitted) we make
some errors impossible by construction, and reduce the number of error
cases that we need to check. NonOwningSymbolStringPtrs are used for
dependence tracking under the session lock, which should reduce
ref-counting operations, and intra-emit dependencies are resolved
outside the session lock, which should provide better performance when
JITing concurrently (since some dependence tracking can happen in
parallel).
The Orc C API is updated to account for this change, with the
LLVMOrcMaterializationResponsibilityNotifyEmitted API being modified and
the LLVMOrcMaterializationResponsibilityAddDependencies and
LLVMOrcMaterializationResponsibilityAddDependenciesForAll operations
being removed.
Adds a function to create a LinkGraph of absolute symbols, and a
callback in dynamic library search generators to enable using it to
expose its symbols to the platform/orc runtime. This allows e.g. using
__orc_rt_run_program to run a precompiled function that was found via
dlsym. Ideally we would use this in llvm-jitlink's own search generator,
but it will require more work to align with the Process/Platform
JITDylib split, so not handled here.
As part of this change we need to handle LinkGraphs that only have
absolute symbols.
This re-applies 4b17c81d5a5, "[jitlink/rtdydl][checker] Add TargetFlag
dependent disassembler switching support", which was reverted in
4871a9ca546 due to bot failures.
The patch has been updated to add missing plumbing for Subtarget Features and
a CPU string, which should fix the failing tests.
https://reviews.llvm.org/D158280
Some targets such as AArch32 make use of TargetFlags to indicate ISA mode. Depending
on the TargetFlag, MCDisassembler and similar target specific objects should be
reinitialized with the correct Target Triple. Backends with similar needs can
easily extend this implementation for their usecase.
The drivers llvm-rtdyld and llvm-jitlink have their SymbolInfo's extended to take
TargetFlag into account. RuntimeDyldChecker can now create necessary TargetInfo
to reinitialize MCDisassembler and MCInstPrinter. The required triple is obtained
from the new getTripleFromTargetFlag function by checking the TargetFlag.
In addition, breaking changes for RuntimeDyld COFF Thumb tests are fixed by making
the backend emit a TargetFlag.
Reviewed By: lhames, sgraenitz
Differential Revision: https://reviews.llvm.org/D158280
ExecutorAddr was introduced in b8e5f918166 as an eventual replacement for
JITTargetAddress. ExecutorSymbolDef is introduced in this patch as a
replacement for JITEvaluatedSymbol: ExecutorSymbolDef is an (ExecutorAddr,
JITSymbolFlags) pair, where JITEvaluatedSymbol was a (JITTargetAddress,
JITSymbolFlags) pair.
A number of APIs had already migrated from JITTargetAddress to ExecutorAddr,
but many of ORC's internals were still using the older type. This patch aims
to address that.
Some public APIs are affected as well. If you need to migrate your APIs you can
use the following operations:
* ExecutorAddr::toPtr replaces jitTargetAddressToPointer and
jitTargetAddressToFunction.
* ExecutorAddr::fromPtr replace pointerToJITTargetAddress.
* ExecutorAddr(JITTargetAddress) creates an ExecutorAddr value from a
JITTargetAddress.
* ExecutorAddr::getValue() creates a JITTargetAddress value from an
ExecutorAddr.
JITTargetAddress and JITEvaluatedSymbol will remain in JITSymbol.h for now, but
the aim will be to eventually deprecate and remove these types (probably when
MCJIT and RuntimeDyld are deprecated).
This first version lays the foundations for AArch32 support in JITLink. ELFLinkGraphBuilder_aarch32 processes REL-type relocations and populates LinkGraphs from ELF object files for both big- and little-endian systems. The ArmCfg member controls subarchitecture-specific details throughout the linking process (i.e. it's passed to ELFJITLinker_aarch32).
Relocation types follow the ABI documentation's division into classes: Data (endian-sensitive), Arm (32-bit little-endian) and Thumb (2x 16-bit little-endian, "Thumb32" in the docs). The implementation of instruction encoding/decoding for relocation resolution is implemented symmetrically and is testable in isolation (see AArch32 category in JITLinkTests).
Callable Thumb functions are marked with a ThumbSymbol target-flag and stored in the LinkGraph with their real addresses. The thumb-bit is added back in when the owning JITDylib requests the address for such a symbol.
The StubsManager can generate (absolute) Thumb-state stubs for branch range extensions on v7+ targets. Proper GOT/PLT handling is not yet implemented.
This patch is based on the backend implementation in ez-clang and has just enough functionality to model the infrastructure and link a Thumb function `main()` that calls `printf()` to dump "Hello Arm!" on Armv7a. It was tested on Raspberry Pi with 32-bit Raspbian OS.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D144083
This first version lays the foundations for AArch32 support in JITLink. ELFLinkGraphBuilder_aarch32 processes REL-type relocations and populates LinkGraphs from ELF object files for both big- and little-endian systems. The ArmCfg member controls subarchitecture-specific details throughout the linking process (i.e. it's passed to ELFJITLinker_aarch32).
Relocation types follow the ABI documentation's division into classes: Data (endian-sensitive), Arm (32-bit little-endian) and Thumb (2x 16-bit little-endian, "Thumb32" in the docs). The implementation of instruction encoding/decoding for relocation resolution is implemented symmetrically and is testable in isolation (see AArch32 category in JITLinkTests).
Callable Thumb functions are marked with a ThumbSymbol target-flag and stored in the LinkGraph with their real addresses. The thumb-bit is added back in when the owning JITDylib requests the address for such a symbol.
The StubsManager can generate (absolute) Thumb-state stubs for branch range extensions on v7+ targets. Proper GOT/PLT handling is not yet implemented.
This patch is based on the backend implementation in ez-clang and has just enough functionality to model the infrastructure and link a Thumb function `main()` that calls `printf()` to dump "Hello Arm!" on Armv7a. It was tested on Raspberry Pi with 32-bit Raspbian OS.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D144083
AArch32 branch offsets explicitly encode the target instruction subset (Arm/Thumb) in their least significant bit. We want this bit set (or clear) in addreses we hand out, but the addresses in the LinkGraph should be the real/physical addresses.
This patch allows ELFLinkGraphBuilder's to set target-specific flags in jitlink::Symbol and prepares ObjectLinkingLayer to account for them.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D146641
Adds a getJITSymbolFlagsForSymbol function that returns the JITSymbolFlags
for a given jitlink::Symbol, and replaces severalredundant copies of that
mapping with calls to the new function. This fixes a bug in
LinkGraphMaterializationUnit::scanLinkGraph where we were failing to set the
JITSymbolFlags::Weak flag for weak symbols, and a bug in
ObjectLinkingLayer::claimOrExternalizeWeakAndCommonSymbols where we were
failing to set the JITSymbolFlags::Callable flag for callable symbols.
In some cases it's helpful to group trackers by JITDylib. E.g. Platform classes
may want to track initializer symbols with a `JITDylib -> Tracker -> [ Symbol ]`
map. This makes it easy to collect all symbols for the JITDylib, while still
allowing efficient removal of a single tracker. Passing the JITDylib as an
argument to ResourceManager::notifyRemovingResources and
ResourceManager::notifyTransferringResources supports such use-cases.
ObjectLinkingLayer attempts to claim responsibility for weak definitions that
are present in LinkGraphs, but not present in the corresponding
MaterializationResponsibility object. Where such a claim is successful, the
symbol should be marked as live to prevent it from being dead stripped.
(For the curious: Such "late-breaking" definitions are introduced somewhere in
the materialization pipeline after the initial responsibility set is calculated.
The usual source is the complier or assembler. Examples of common late-breaking
definitions include personality pointers, e.g. "DW.ref.__gxx_personality_v0",
and named constant pool entries, e.g. __realXX..XX.)
The failure to mark these symbols live caused few problems in practice because
late-breaking definitions are usually anchored by existing live definitions
within the graph (e.g. DW.ref.__gxx_personality_v0 is transitively referenced by
functions via eh-frame records), and so they usually survived dead-stripping
anyway. This accidental persistence isn't a principled solution though, and it
fails altogether if a late-breaking definition is not otherwise referenced by
the graph, with the result that the now-claimed symbol is stripped triggering a
"Failed to materialize symbols" error in ORC. Marking such symbols live is the
correct solution.
No testcase, as it's difficult to construct a situation where a late-breaking
definition is inserted without being referenced outside the context of new
backend bringup or plugin-specific shenanigans.
See discussion in https://reviews.llvm.org/D133452 and
https://reviews.llvm.org/D136877.
Previously we stripped Weak flags from JITDylib symbol table entries once they
were resolved (there was no particularly good reason for this). Now we want to
retain them and query them when setting the Linkage on external symbols in
LinkGraphs during symbol resolution (this was the motivation for 75404e9ef88).
Making weak linkage of external definitions discoverable in the LinkGraph will
in turn allow future plugins to implement correct handling for them (by
recording locations that depend on exported weak definitions and pointing all
of these at one chosen definition at runtime).
Implements SECTION/SECREL relocation. These are used by debug info (pdb) data.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D130275
ELF-based platforms currently support defining multiple static
initializer table sections with differing priorities, for example
.init_array.0 or .init_array.100; the default .init_array corresponds
to a priority of 65535. When building a shared library or executable,
the system linker normally sorts these sections and combines them into
a single .init_array section. This change adds the capability to
recognize ELF static initializers with priorities other than the
default, and to properly sort them by priority, to Orc and the Orc
runtime.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D127056
This patch removes the unintended resolution of locally scoped absolute symbols
(which was causing unexpected definition errors).
It stops using the JITSymbolFlags::Absolute flag (it isn't set or used elsewhere,
and causes mismatch-flags asserts), and adds JITSymbolFlags::Exported to default
scoped absolute symbols.
Finally, we now set the scope of absolute symbols correctly in
MachOLinkGraphBuilder.
This re-applies 133f86e95492b2a00b944e070878424cfa73f87c, which was reverted in
c5965a411c635106a47738b8d2e24db822b7416f while I investigated bot failures.
The original failure contained an arithmetic conversion think-o (on line 419 of
EHFrameSupport.cpp) that could cause failures on 32-bit platforms. The issue
should be fixed in this patch.
This adds a GetObjectFileInterface callback member to
StaticLibraryDefinitionGenerator, and adds an optional argument for initializing
that member to StaticLibraryDefinitionGenerator's named constructors. If not
supplied, it will default to getObjectFileInterface from ObjectFileInterface.h.
To enable testing a `-hidden-l<x>` option is added to the llvm-jitlink tool.
This allows archives to be loaded with all contained symbol visibilities demoted
to hidden.
The ObjectLinkingLayer::setOverrideObjectFlagsWithResponsibilityFlags method is
(belatedly) hooked up, and enabled in llvm-jitlink when `-hidden-l<x>` is used
so that the demotion is also applied at symbol resolution time (avoiding any
"mismatched symbol flags" crashes).
MaterializationUnit::Interface holds the values that make up the interface
(for ORC's purposes) of a materialization unit: the symbol flags map and
initializer symbol.
Having a type for this will make functions that build materializer interfaces
more readable and maintainable.
We were adding all defined weak symbols to the materialization
responsibility, but local symbols will not be in the symbol table, so it
failed to materialize due to the "missing" symbol.
Local weak symbols come up in practice when using `ld -r` with a hidden
weak symbol.
rdar://85574696
Similar to how the other swift sections are registered by the ORC
runtime's macho platform, add the __swift5_types section, which contains
type metadata. Add a simple test that demonstrates that the swift
runtime recognized the registered types.
rdar://85358530
Differential Revision: https://reviews.llvm.org/D113811
Adds explicit narrowing casts to JITLinkMemoryManager.cpp.
Honors -slab-address option in llvm-jitlink.cpp, which was accidentally
dropped in the refactor.
This effectively reverts commit 6641d29b70993bce6dbd7e0e0f1040753d38842f.
This commit substantially refactors the JITLinkMemoryManager API to: (1) add
asynchronous versions of key operations, (2) give memory manager implementations
full control over link graph address layout, (3) enable more efficient tracking
of allocated memory, and (4) support "allocation actions" and finalize-lifetime
memory.
Together these changes provide a more usable API, and enable more powerful and
efficient memory manager implementations.
To support these changes the JITLinkMemoryManager::Allocation inner class has
been split into two new classes: InFlightAllocation, and FinalizedAllocation.
The allocate method returns an InFlightAllocation that tracks memory (both
working and executor memory) prior to finalization. The finalize method returns
a FinalizedAllocation object, and the InFlightAllocation is discarded. Breaking
Allocation into InFlightAllocation and FinalizedAllocation allows
InFlightAllocation subclassses to be written more naturally, and FinalizedAlloc
to be implemented and used efficiently (see (3) below).
In addition to the memory manager changes this commit also introduces a new
MemProt type to represent memory protections (MemProt replaces use of
sys::Memory::ProtectionFlags in JITLink), and a new MemDeallocPolicy type that
can be used to indicate when a section should be deallocated (see (4) below).
Plugin/pass writers who were using sys::Memory::ProtectionFlags will have to
switch to MemProt -- this should be straightworward. Clients with out-of-tree
memory managers will need to update their implementations. Clients using
in-tree memory managers should mostly be able to ignore it.
Major features:
(1) More asynchrony:
The allocate and deallocate methods are now asynchronous by default, with
synchronous convenience wrappers supplied. The asynchronous versions allow
clients (including JITLink) to request and deallocate memory without blocking.
(2) Improved control over graph address layout:
Instead of a SegmentRequestMap, JITLinkMemoryManager::allocate now takes a
reference to the LinkGraph to be allocated. The memory manager is responsible
for calculating the memory requirements for the graph, and laying out the graph
(setting working and executor memory addresses) within the allocated memory.
This gives memory managers full control over JIT'd memory layout. For clients
that don't need or want this degree of control the new "BasicLayout" utility can
be used to get a segment-based view of the graph, similar to the one provided by
SegmentRequestMap. Once segment addresses are assigned the BasicLayout::apply
method can be used to automatically lay out the graph.
(3) Efficient tracking of allocated memory.
The FinalizedAlloc type is a wrapper for an ExecutorAddr and requires only
64-bits to store in the controller. The meaning of the address held by the
FinalizedAlloc is left up to the memory manager implementation, but the
FinalizedAlloc type enforces a requirement that deallocate be called on any
non-default values prior to destruction. The deallocate method takes a
vector<FinalizedAlloc>, allowing for bulk deallocation of many allocations in a
single call.
Memory manager implementations will typically store the address of some
allocation metadata in the executor in the FinalizedAlloc, as holding this
metadata in the executor is often cheaper and may allow for clean deallocation
even in failure cases where the connection with the controller is lost.
(4) Support for "allocation actions" and finalize-lifetime memory.
Allocation actions are pairs (finalize_act, deallocate_act) of JITTargetAddress
triples (fn, arg_buffer_addr, arg_buffer_size), that can be attached to a
finalize request. At finalization time, after memory protections have been
applied, each of the "finalize_act" elements will be called in order (skipping
any elements whose fn value is zero) as
((char*(*)(const char *, size_t))fn)((const char *)arg_buffer_addr,
(size_t)arg_buffer_size);
At deallocation time the deallocate elements will be run in reverse order (again
skipping any elements where fn is zero).
The returned char * should be null to indicate success, or a non-null
heap-allocated string error message to indicate failure.
These actions allow finalization and deallocation to be extended to include
operations like registering and deregistering eh-frames, TLS sections,
initializer and deinitializers, and language metadata sections. Previously these
operations required separate callWrapper invocations. Compared to callWrapper
invocations, actions require no extra IPC/RPC, reducing costs and eliminating
a potential source of errors.
Finalize lifetime memory can be used to support finalize actions: Sections with
finalize lifetime should be destroyed by memory managers immediately after
finalization actions have been run. Finalize memory can be used to support
finalize actions (e.g. with extra-metadata, or synthesized finalize actions)
without incurring permanent memory overhead.