This refactor was motivated by two bugs identified in out-of-tree
builds:
1. Some implementations of the VisitMembersFunction type (often used to
implement special loading semantics, e.g. -all_load or -ObjC) were assuming
that buffers for archive members were null-terminated, which they are not in
general. This was triggering occasional assertions.
2. Archives may include multiple members with the same file name, e.g.
when constructed by appending files with the same name:
    % llvm-ar crs libfoo.a foo.o
    % llvm-ar q libfoo.a foo.o
    % llvm-ar t libfoo.a foo.o
    foo.o
While confusing, these members may be safe to link (provided that they're
individually valid and don't define duplicate symbols). In ORC, however, the
archive member name may be used to construct an ORC initializer symbol,
which must also be unique. In that case the duplicate member names lead to a
duplicate definition error even if the members define unrelated symbols.
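
To illustrate the first bug: a member buffer is a slice of the archive, so the
byte after it belongs to the next member rather than being a terminator. The
following is a minimal sketch in standard C++, not the ORC API; memberSlice and
startsWith are hypothetical helpers that rely on the explicit size only.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: models a member as a slice of the enclosing archive
// buffer. The byte following the slice is part of the archive, not a '\0'.
std::string_view memberSlice(std::string_view Archive, std::size_t Offset,
                             std::size_t Size) {
  assert(Offset <= Archive.size() && Size <= Archive.size() - Offset);
  return Archive.substr(Offset, Size);
}

// Hypothetical helper: checks whether a member begins with the given magic
// bytes. It compares against the view's explicit size and never scans for a
// terminating null byte.
bool startsWith(std::string_view Member, std::string_view Magic) {
  return Member.size() >= Magic.size() &&
         Member.substr(0, Magic.size()) == Magic;
}
```

Any code that passed such a slice to a C-string API (strlen, strcmp) would
read past the end of the member.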
In addition to these bugs, StaticLibraryDefinitionGenerator had grown a
collection of all member buffers (ObjectFilesMap), a BumpPtrAllocator
that was redundantly storing synthesized archive member names (these are
copied into the MemoryBuffers created for each Object, but were never
freed in the allocator), and a set of COFF-specific import files.
To fix the bugs above and simplify StaticLibraryDefinitionGenerator, this
patch makes the following changes:
1. StaticLibraryDefinitionGenerator::VisitMembersFunction is generalized
to take a reference to the containing archive, and the index of the
member within the archive. It now returns an Expected<bool> indicating
whether the member visited should be treated as loadable, not loadable,
or as invalidating the entire archive.
2. A static StaticLibraryDefinitionGenerator::createMemberBuffer method
is added which creates MemoryBuffers with unique names of the form
`<archive-name>[<index>](<member-name>)`. This defers construction of
member names until they're loaded, allowing the BumpPtrAllocator (with
its redundant name storage) to be removed.
3. The ObjectFilesMap (symbol name -> memory-buffer-ref) is replaced
with a SymbolToMemberIndexMap (symbol name -> index) which should be
smaller and faster to construct.
4. The 'loadability' result from the VisitMembersFunction is now taken into
consideration when building the SymbolToMemberIndexMap so that members
that have already been loaded / filtered out can be skipped, and do not
take up any ongoing space.
5. The COFF ImportedDynamicLibraries member is moved out into the
COFFImportFileScanner utility, which can be used as a
VisitMembersFunction.
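
The naming scheme from change 2 can be sketched as follows. This is an
illustration in standard C++ only: makeMemberBufferName is a hypothetical
stand-in, not the actual createMemberBuffer implementation, and the index
values used with it are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the naming done by createMemberBuffer: builds
// "<archive-name>[<index>](<member-name>)". Including the member index keeps
// the name unique even when two members share a file name.
std::string makeMemberBufferName(const std::string &ArchiveName,
                                 std::size_t Index,
                                 const std::string &MemberName) {
  return ArchiveName + "[" + std::to_string(Index) + "](" + MemberName + ")";
}
```

With this scheme the two foo.o members of libfoo.a from the example above get
distinct names, so anything derived from the buffer name (such as an
initializer symbol) no longer collides.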
This fixes the bugs described above and should slightly lower memory
consumption, especially for archives with many files and/or symbols where
most files are eventually loaded.
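
A rough model of changes 1, 3 and 4, in standard C++ rather than the ORC types:
std::variant<bool, std::string> stands in for Expected<bool>, and Member,
Visitor and buildSymbolIndex are hypothetical names. Keeping the first
definition of a symbol is a choice made for this sketch, not something the
patch states.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <variant>
#include <vector>

// Simplified archive member: a file name plus the symbols it defines.
struct Member {
  std::string Name;
  std::vector<std::string> Symbols;
};

// Stand-in for Expected<bool>: true = loadable, false = not loadable,
// std::string = an error that invalidates the entire archive.
using VisitResult = std::variant<bool, std::string>;
using Visitor =
    std::function<VisitResult(const std::vector<Member> &, std::size_t)>;
using SymbolIndex = std::map<std::string, std::size_t>;

// Builds a symbol-name -> member-index map. Members that the visitor reports
// as not loadable are skipped and take up no space in the map. If the visitor
// reports an error, the error message is returned instead of a map.
std::variant<SymbolIndex, std::string>
buildSymbolIndex(const std::vector<Member> &Archive, const Visitor &Visit) {
  SymbolIndex Index;
  for (std::size_t I = 0; I != Archive.size(); ++I) {
    VisitResult R = Visit(Archive, I);
    if (const std::string *Err = std::get_if<std::string>(&R))
      return *Err;
    if (!std::get<bool>(R))
      continue;
    for (const std::string &Sym : Archive[I].Symbols)
      Index.emplace(Sym, I); // emplace keeps an existing entry, if any
  }
  return Index;
}
```

Storing an index rather than a buffer reference means the member's buffer and
name only need to be created when a lookup actually selects that member.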
This patch adds support for forced loading of archive members, similar to the
behavior of the -all_load and -ObjC options in ld64. To enable this, the
StaticLibraryDefinitionGenerator class constructors are extended with a
VisitMember callback that is called on each member file in the archive at
generator construction time. This callback can be used to unconditionally add
the member file to a JITDylib at that point.
To test this, the llvm-jitlink utility is extended with -all_load (all platforms)
and -ObjC (Darwin only) options. Since we can't refer to symbols in the test
objects directly (these would always cause the member to be linked in, even
without the new flags) we instead test side-effects of force loading: execution
of constructors and registration of Objective-C metadata.
rdar://134446111
This allows us to rewrite part of StaticLibraryDefinitionGenerator in terms of
loadLinkableFile.
It's also useful for clients who may not know (either from file extensions or
context) whether a given path will be an object file, an archive, or a
universal binary.
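
As an illustration of why content-based detection helps such clients, the file
kind can be read from the leading magic bytes instead of the extension. This
is a standard C++ sketch only; classify and LinkableKind are hypothetical
names and this is not how loadLinkableFile is necessarily implemented. Thin
archives and the other Mach-O magics are deliberately left out.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical classification of a file's contents.
enum class LinkableKind { Archive, MachOUniversalBinary, Other };

// Classifies a buffer by its leading magic bytes:
//   "!<arch>\n"          regular (non-thin) ar archive
//   0xCA 0xFE 0xBA 0xBE  Mach-O universal (fat) binary, big-endian header
// Everything else is reported as Other (e.g. a candidate object file).
LinkableKind classify(std::string_view Buf) {
  constexpr std::string_view ArchiveMagic("!<arch>\n", 8);
  constexpr std::string_view FatMagic("\xCA\xFE\xBA\xBE", 4);
  // substr clamps the length, so short buffers are handled safely.
  if (Buf.substr(0, ArchiveMagic.size()) == ArchiveMagic)
    return LinkableKind::Archive;
  if (Buf.substr(0, FatMagic.size()) == FatMagic)
    return LinkableKind::MachOUniversalBinary;
  return LinkableKind::Other;
}
```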
rdar://134638070
API clients may want to use things other than paths as the buffer identifiers.
No testcase -- I haven't thought of a good way to expose this via the regression
testing tools.
rdar://133536831
ORC supports loading relocatable object files into a JIT'd process. The
raw "add object file" API (ObjectLayer::add) accepts plain relocatable
object files as llvm::MemoryBuffers only and does not check that the
object file's format and architecture are compatible with the process
that it will be linked into. This API is flexible, but places the
burden of error checking and universal binary support on clients.
This commit introduces a new utility, loadRelocatableObject, that takes
a path to load and a target triple and then:
1. If the path does not exist, returns a FileError containing the
invalid path.
2. If the path points to a MachO universal binary, identifies and
returns a MemoryBuffer covering the slice that matches the given triple
(checking that the slice really does contain a valid MachO relocatable
object with a compatible arch).
3. If the path points to a regular relocatable object file, verifies
that the format and architecture are compatible with the triple.
Clients can use loadRelocatableObject in the common case of loading
object files from disk to simplify their code.
Note: Error checking for ELF and COFF is left as a FIXME.
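
The slice lookup in step 2 can be sketched as below. This is standard C++
working on raw bytes, not the implementation in this commit; findSlice and
Slice are hypothetical names. It assumes the 32-bit fat header layout (a
big-endian magic and arch count, followed by 20-byte entries holding cputype,
cpusubtype, offset, size and align) and only bounds-checks the slice. The
validation of the slice contents described above is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Location of one architecture's slice within a universal binary.
struct Slice {
  std::uint32_t Offset;
  std::uint32_t Size;
};

// Reads a big-endian 32-bit value. The caller guarantees Pos + 4 <= B.size().
std::uint32_t readBE32(const std::vector<std::uint8_t> &B, std::size_t Pos) {
  return (std::uint32_t(B[Pos]) << 24) | (std::uint32_t(B[Pos + 1]) << 16) |
         (std::uint32_t(B[Pos + 2]) << 8) | std::uint32_t(B[Pos + 3]);
}

// Returns the slice whose cputype matches CPUType, or std::nullopt if the
// buffer is not a well-formed universal binary or has no matching slice.
std::optional<Slice> findSlice(const std::vector<std::uint8_t> &Buf,
                               std::uint32_t CPUType) {
  constexpr std::uint32_t FatMagic = 0xCAFEBABE;
  constexpr std::size_t HeaderSize = 8, ArchSize = 20;
  if (Buf.size() < HeaderSize || readBE32(Buf, 0) != FatMagic)
    return std::nullopt;
  std::uint32_t NumArchs = readBE32(Buf, 4);
  // Reject headers that claim more entries than the buffer can hold.
  if (NumArchs > (Buf.size() - HeaderSize) / ArchSize)
    return std::nullopt;
  for (std::uint32_t I = 0; I != NumArchs; ++I) {
    std::size_t Entry = HeaderSize + std::size_t(I) * ArchSize;
    if (readBE32(Buf, Entry) != CPUType)
      continue;
    std::uint32_t Offset = readBE32(Buf, Entry + 8);
    std::uint32_t Size = readBE32(Buf, Entry + 12);
    // The slice must lie entirely within the buffer.
    if (Offset > Buf.size() || Size > Buf.size() - Offset)
      return std::nullopt;
    return Slice{Offset, Size};
  }
  return std::nullopt;
}
```

For example, passing 0x0100000C (CPU_TYPE_ARM64) selects the arm64 slice if
one is present.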
rdar://133653290