With the help of @lhames, This pull request introduces the `dlupdate`
function in the ORC runtime. `dlupdate` enables incremental execution of
new initializers introduced in the REPL environment. Unlike traditional
`dlopen`, which manages initializers, code mapping, and library
reference counts, `dlupdate` focuses exclusively on running new
initializers.
Dependant lists hold raw pointers back to EDUs that depend on them. We need to
remove these entries before destroying the EDU or we'll be left with a dangling
reference that can result in use-after-free bugs.
No testcase: This has only been observed in multi-threaded setups that
reproduce the issue inconsistently.
rdar://135403614
In certain pathological object files we were getting extremely slow
linking because we were repeatedly propogating dependencies to the same
blocks instead of accumulating as many changes as possible. Change the
order of iteration so that we go through every node in the worklist
before returning to any previous node, reducing the number of expensive
dependency iterations.
In practice, this took one case from 60 seconds to 2 seconds. Note: the
performance is still non-deterministic, because the block order itself
is non-deterministic.
rdar://133734391
We weren't taking account of the space we require in the stubs for
things that are dllimported, and as a result we could hit the assertion
failure for running out of stub space. Fix that.
Also add a couple of `override` specifiers that were missing last time
(#102586).
rdar://133473673
We weren't taking account of the space we require in the stubs for
things that are dllimported, and as a result we could hit the assertion
failure for running out of stub space. Fix that.
rdar://133473673
---------
Co-authored-by: Saleem Abdulrasool <compnerd@compnerd.org>
Co-authored-by: Lang Hames <lhames@gmail.com>
Co-authored-by: Ben Barham <b.n.barham@gmail.com>
This allows us to rewrite part of StaticLibraryDefinitionGenerator in terms of
loadLinkableFile.
It's also useful for clients who may not know (either from file extensions or
context) whether a given path will be an object file, an archive, or a
universal binary.
rdar://134638070
`JITDylibSearchOrderResolver` local variable can be destroyed before
completion of all callbacks. Capture it together with `Deps` in
`OnEmitted` callback.
Original error:
```
==2035==ERROR: AddressSanitizer: stack-use-after-return on address 0x7bebfa155b70 at pc 0x7ff2a9a88b4a bp 0x7bec08d51980 sp 0x7bec08d51978
READ of size 8 at 0x7bebfa155b70 thread T87 (tf_xla-cpu-llvm)
#0 0x7ff2a9a88b49 in operator() llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:58
#1 0x7ff2a9a88b49 in __invoke<(lambda at llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:9) &, const llvm::DenseMap<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >, llvm::DenseMapInfo<llvm::orc::JITDylib *, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > &> libcxx/include/__type_traits/invoke.h:149:25
#2 0x7ff2a9a88b49 in __call<(lambda at llvm/lib/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.cpp:55:9) &, const llvm::DenseMap<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> >, llvm::DenseMapInfo<llvm::orc::JITDylib *, void>, llvm::detail::DenseMapPair<llvm::orc::JITDylib *, llvm::DenseSet<llvm::orc::SymbolStringPtr, llvm::DenseMapInfo<llvm::orc::SymbolStringPtr, void> > > > &> libcxx/include/__type_traits/invoke.h:224:5
#3 0x7ff2a9a88b49 in operator() libcxx/include/__functional/function.h:210:12
#4 0x7ff2a9a88b49 in void std::__u::__function::__policy_invoker<void (llvm::DenseMap<llvm::orc::JITDylib*, llvm::DenseSet<llvm::orc::SymbolStringPtr,
```
API clients may want to use things other than paths as the buffer identifiers.
No testcase -- I haven't thought of a good way to expose this via the regression
testing tools.
rdar://133536831
ORC supports loading relocatable object files into a JIT'd process. The
raw "add object file" API (ObjectLayer::add) accepts plain relocatable
object files as llvm::MemoryBuffers only and does not check that the
object file's format or architecture are compatible with the process
that it will be linked in to. This API is flexible, but places the
burden of error checking and universal binary support on clients.
This commit introduces a new utility, loadRelocatableObject, that takes
a path to load and a target triple and then:
1. If the path does not exist, returns a FileError containing the
invalid path.
2. If the path points to a MachO universal binary, identifies and
returns MemoryBuffer covering the slice that matches the given triple
(checking that the slice really does contains a valid MachO relocatable
object with a compatible arch).
3. If the path points to a regular relocatable object file, verifies
that the format and architecture are compatible with the triple.
Clients can use loadRelocatableObject in the common case of loading
object files from disk to simplify their code.
Note: Error checking for ELF and COFF is left as a FIXME.
rdar://133653290
In 93509b4462a74 MachOPlatform was updated to store object symbols in a shared
vector during bootstrap (this table is later attached to the
bootstrap-completion graph once the ORC runtime's symbol table registration
code is ready). The shared vector was not guarded with a mutex, so use of a
concurrent dispatcher could lead to races during bootstrap. This commit fixes
the issue by guarding access to the table with the BootstrapInfo mutex.
No testcase since this only manifests rarely when both a concurrent dispatcher
and the ORC runtime are used. Once we add a concurrent dispatcher option to
llvm-jitlink we may be able to test this with a regression test in the ORC
runtime and TSan enabled.
rdar://133520308
This relocation is used in order to address GOT entries using 15 bit
offset in ldr instruction. The offset is calculated relative to GOT
section page address.
This reapplies 785d376d123, which was reverted in c49837f5f68 due to bot
failures. The fix was to relax some asserts to allow common symbols to be
resolved with either common or weak flags, rather than requiring one or the
other.
Duplicate common definitions should be coaleseced, rather than being treated as
duplicate definitions. Strong definitions should override common definitions.
rdar://132314264
Changes "MyObj.o" to "/path/to/libMyLib.a(MyObj.o)".
This allows us to differentiate between objects that have the same
basename but came from different archives. It also fixes a bug where if
two such objects were both linked and both have initializer sections
their initializer symbol would cause a duplicate symbol error.
rdar://131782514
Previously this took a reference to a map and returned a bool to say
whether it succeeded. We can return a StringMap instead, as all callers
but 1 simply iterated the map if the bool was true, and passed in empty
maps as the starting point.
lldb's lit-cpuid did specifically check whether the call failed, but due
to the way the x86 routines work this works out the same as checking if
the returned map is empty.
If OnLoaded failed, return after passing the error to OnEmitted instead
of also calling finalizeAsync (which would use values that have already
been moved and perform another call to OnEmitted).
This patch is based on clang-tidy's modernize-make-unique but limited
to those cases where type names are mentioned twice like
std::unique_ptr<Type>(new Type()), which is a bit mouthful.
The ARM architecture uses the LSB bit for ARM/Thumb mode switch
flagging. This is true for alignments of 2 and 4 but in data
relocations the alignment is 1 allowing the LSB bit to be set.
Now only `ELF::STT_FUNC` typed symbols are used in the
TargetFlag mechanism.
The test is a minimal example of the issue mentioned below.
Fixes#95911 "Orc global constructor order test fails on 32
bit ARM".
This fixes a bug in ObjectLinkingLayer::computeBlockNonLocalDeps: The worklist
needs to be built *after* all immediate dependencies / dependants are recorded,
rather than trying to populate it as part of the same loop. (Trying to do the
latter causes us to miss some blocks that should have been included in the
worklist).
This fixes a bug discovered by @Sahil123 on discord during work on
out-of-process execution support in the clang-repl.
No testcase yet. This *might* be testable with a unit test and a custom
JITLinkContext but I believe some aspects of the algorithm depend on memory
layout. I'll need to investigate that. Alternatively we could add llvm-jitlink
testcases that exercise concurrent linking (and should probably do that anyway).
Either option will require some investment and I don't want to hold this fix up
in the mean time.
JITDylib::removeTracker already runs with the session mutex locked (and must do
so), so remove the redundant locking and add an 'IL_' ("inside lock") prefix to
the method name.
This reverts commit edd6f0c544785d6f6276a24b94222e0064413cd1.
The newly added test uncovered a pre-existing issue on Arm 32 bit,
so as we did https://github.com/llvm/llvm-project/issues/94994, disable
it while we find the problem.
Constructors with the same priority should keep their relative order
that was specified. This is important for `clang-repl` with many `const`
variables after commit 05137ecfca ("[clang-repl] Emit const variables
only once").
Add missing __DATA sections that the objc runtime expects to register.
This fixes running objc code that makes use of `@protocol` references
and `__attribute__((objc_nonlazy_class))` classes.
rdar://129368761
Casting the result of `Section.getAddressWithOffset()` goes wrong if we
are on a 32-bit platform whose addresses are regarded as signed; in that
case, just doing
```
(uint64_t)Section.getAddressWithOffset(...)
```
or
```
reinterpret_cast<uint64_t>(Section.getAddressWithOffset(...))
```
will result in sign-extension.
We use these expressions when constructing branch stubs, which is before
we know the final load address, so we can just switch to the
`Section.getLoadAddressWithOffset(...)` method instead.
Doing that is also more consistent, since when calculating relative
offsets for relocations, we use the load address anyway, so the code
currently only works because `Section.Address` is equal to
`Section.LoadAddress` at this point.
Fixes#94478.
Remove support for the icmp and fcmp constant expressions.
This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
We don't know the load addresses when this function is called, so it
shouldn't be trying to use them to determine whether or not the branch
is short. Notably, this will fail in the case where the code is being
loaded into a target in such a way that the section offsets differ
between the process generating the code and the target process.
rdar://127673408
In ARM mode, the Program Counter (PC) points to the current instruction's
address + 8 instead of + 4. An offset is added to RuntimeDyldChecker to
use `next_pc` expression in JITLink tests with both Thumb and Arm.