In most circumstances BSS segments are not required in the output binary
but combineOutputSegments was erroneously including them. This meant
that PIC binaries were including the BSS data as zero in the binary.
Fixes: https://github.com/emscripten-core/emscripten/issues/23683
Instead of always generating __wasm_apply_data_relocs when relevant
options like -pie and -shared are specified, generate it only when the
relevant relocations are actually necessary.
Note: omitting empty __wasm_apply_data_relocs is not a problem because
the export is optional in the spec (DynamicLinking.md) and all runtime
linker implementations I'm aware of implement it that way. (emscripten,
toywasm, wasm-tools)
Motivations:
* This possibly reduces the module size
* This is also a preparation to fix
https://github.com/llvm/llvm-project/issues/107387, for which it isn't
obvious if we need these relocations at the time of
createSyntheticSymbols. (unless we introduce a new explicit option like
--non-pie-dynamic-link.)
This only effects folks building with wasm64 + shared memory which
is not currently a supported configuration in emscripten or any other
wasm toolchain.
Differential Revision: https://reviews.llvm.org/D141005
Instead, export `__wasm_apply_data_relocs` and `__wasm_call_ctors`
separately.
This is required since user code in a shared library (such as static
constructors) should not be run until relocations have been applied to
all loaded libraries.
See: https://github.com/emscripten-core/emscripten/issues/17295
Differential Revision: https://reviews.llvm.org/D128515
It turns out we were already allocating static address space for TLS
data along with the non-TLS static data, but this space was going
unused/ignored.
With this change, we include the TLS segment in `__wasm_init_memory`
(which does the work of loading the passive segments into memory when a
module is first loaded). We also set the `__tls_base` global to point
to the start of this segment.
This means that the runtime can use this static copy of the TLS data for
the first/primary thread if it chooses, rather than doing a runtime
allocation prior to calling `__wasm_init_tls`.
Practically speaking, this will allow emscripten to avoid dynamic
allocation of TLS region on the main thread.
Differential Revision: https://reviews.llvm.org/D126107
We already perform memory initialization and apply global relocations
during start. It makes sense to performs data relocations too. I think
the reason we were not doing this already is solely historical.
Differential Revision: https://reviews.llvm.org/D117412
Previously we were relying on the dynamic loader to take care of this
but it simple and correct for us to do it here instead.
Now we initialize bss segments as part of `__wasm_init_memory` at the
same time we initialize passive segments.
In addition we extent the us of `__wasm_init_memory` outside of shared
memory situations. Specifically it is now used to initialize bss
segments when the memory is imported.
Differential Revision: https://reviews.llvm.org/D112667
Rather than depending on the hex dump from obj2yaml. Now the test shows the
expected function body in a human readable format.
Differential Revision: https://reviews.llvm.org/D109730
For multithreaded modules (i.e. modules with a shared memory), lld injects a
synthetic Wasm start function that is automatically called during instantiation
to initialize memory from passive data segments. Even though the module will be
instantiated separately on each thread, memory initialization should happen only
once. Furthermore, memory initialization should be finished by the time each
thread finishes instantiation. Since multiple threads may be instantiating their
modules at the same time, the synthetic function must synchronize them.
The current synchronization tries to atomically increment a flag from 0 to 1 in
memory then enters one of two cases. First, if the increment was successful, the
current thread is responsible for initializing memory. It does so, increments
the flag to 2 to signify that memory has been initialized, then notifies all
threads waiting on the flag. Otherwise, the thread atomically waits on the flag
with an expected value of 1 until memory has been initialized. Either the
initializer thread finishes initializing memory (i.e. sets the flag to 2) first
and the waiter threads do not end up blocking, or the waiter threads succesfully
start waiting before memory is initialized so they will be woken by the
initializer thread once it has finished.
One complication with this scheme is that there are various contexts on the Web,
most notably on the main browser thread, that cannot successfully execute a
wait. Executing a wait in these contexts causes a trap, and in this case would
cause instantiation to fail. The embedder must therefore ensure that these
contexts win the race and become responsible for initializing memory, since that
is the only code path that does not execute a wait.
Unfortunately, since only one thread can win the race and initialize memory,
this scheme makes it impossible to have multiple threads in contexts that cannot
wait. For example, it is not currently possible to instantiate the module on
both the main browser thread as well as in an AudioWorklet. To loosen this
restriction, this commit inserts an extra check so that the wait will not be
executed at all when memory has already been initialized, i.e. when the flag
value is 2. After this change, the module can be instantiated on threads in
non-waiting contexts as long as the embedder can guarantee either that the
thread will win the race and initialize memory (as before) or that memory has
already been initialized when instantiation begins. Threads in contexts that can
wait can continue racing to initialize memory.
Fixes (or at least improves) PR51702.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D109722
Prior to this change build with `-shared/-pie` and using TLS (but
without -shared-memory) would hit this assert:
"Currenly only a single data segment is supported in PIC mode"
This is because we were not including TLS data when merging data
segments. However, when we build without shared-memory (i.e. without
threads) we effectively lower away TLS into a normal active data
segment.. so we were ending up with two active data segments: the merged
data, and the lowered TLS data.
To fix this problem we can instead avoid combining data segments at
all when running in shared memory mode (because in this case all
segment initialization is passive). And then in non-shared memory
mode we know that TLS has been lowered and therefore we can can
and should combine all segments.
So with this new behavior we have two different modes:
1. With shared memory / mutli-threaded: Never combine data segments
since it is not necessary. (All data segments as passive already).
2. Wihout shared memory / single-threaded: Combine *all* data segments
since we treat TLS as normal data. (We end up with a single
active data segment).
Differential Revision: https://reviews.llvm.org/D102937
With dynamic linking we have the current limitation that there can be
only a single active data segment (since we use __memory_base as the
load address and we can't do arithmetic in constant expresions).
This change delays the merging of active segments until a little later
in the linking process which means that the grouping of data by section,
and the magic __start/__end symbols work as expected under dynamic
linking.
Differential Revision: https://reviews.llvm.org/D96453
We have two types of relocations that we apply on startup:
1. Relocations that apply to wasm globals
2. Relocations that apply to wasm memory
The first set of relocations use only the `__memory_base` import to
update a set of internal globals. Because wasm globals are thread local
these need to run on each thread. Memory relocations, like static
constructors, must only be run once.
To ensure global relocations run on all threads and because the only
depend on the immutable `__memory_base` import we can run them during
the WebAssembly start functions, instead of waiting until the
post-instantiation __wasm_call_ctors.
Differential Revision: https://reviews.llvm.org/D93066
This change improves our support for shared memory to include
PIC executables (and shared libraries).
To handle this case the linker-generated `__wasm_init_memory`
function (that only exists in shared memory builds) must be
capable of loading memory segements at non-const offsets based
on the runtime value of `__memory_base`.
Differential Revision: https://reviews.llvm.org/D92620
The conditional guarding createInitMemoryFunction was incorrect and
didn't match that guarding the creation of the associated symbol.
Rather that reproduce the same conditions in multiple places we can
simply use the presence of the associated symbol.
Also, add an assertion that would have caught this bug.
Also, add a new test for this flag combination.
This is part of an ongoing effort to enable dynamic linking with
threads in emscripten.
See https://github.com/emscripten-core/emscripten/issues/3494
Differential Revision: https://reviews.llvm.org/D92520
Summary:
This patch fixes a bug where initialization code for .bss segments was
emitted in the memory initialization function even though the .bss
segments were discounted in the datacount section and omitted in the
data section. This was producing invalid binaries due to out-of-bounds
segment indices on the memory.init and data.drop instructions that
were trying to operate on the nonexistent .bss segments.
Reviewers: sbc100
Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80354
Summary:
WebAssembly memories are zero-initialized, so when module does not
import its memory initializing .bss sections is guaranteed to be a
no-op. To reduce binary size and initialization time, .bss sections
are simply not emitted into the final binary unless the memory is
imported.
Reviewers: sbc100
Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68965
llvm-svn: 374940
Summary:
This was always the intended behavior, but had not been
implemented. This ordering is important for Emscripten when generating
.mem files while compiling to JS, since only zeros at the end of
initialized memory can be dropped.
Fixes https://github.com/emscripten-core/emscripten/issues/8999
Reviewers: sbc100
Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67736
llvm-svn: 372284
Summary:
- `__wasm_init_memory` is now the WebAssembly start function instead
of being called from `__wasm_call_ctors` or called directly by the
runtime.
- Adds a new synthetic data symbol `__wasm_init_memory_flag` that is
atomically incremented from zero to one by the thread responsible
for initializing memory.
- All threads now unconditionally perform data.drop on all passive
segments.
- Removes --passive-segments and --active-segments flags and controls
segment type based on --shared-memory instead. The deleted flags
were only present to ameliorate the upgrade path in Emscripten.
Reviewers: sbc100, aheejin
Subscribers: dschuff, jgravelle-google, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65783
llvm-svn: 370965
Summary:
This change makes it so that passing --shared-memory is all a user
needs to do to get proper multithreaded code. This default can still
be explicitly overridden for any reason using --passive-segments and
--active-segments.
Reviewers: sbc100, quantum
Subscribers: dschuff, jgravelle-google, aheejin, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64950
llvm-svn: 366504
Summary:
Thread local variables are placed inside a `.tdata` segment. Their symbols are
offsets from the start of the segment. The address of a thread local variable
is computed as `__tls_base` + the offset from the start of the segment.
`.tdata` segment is a passive segment and `memory.init` is used once per thread
to initialize the thread local storage.
`__tls_base` is a wasm global. Since each thread has its own wasm instance,
it is effectively thread local. Currently, `__tls_base` must be initialized
at thread startup, and so cannot be used with dynamic libraries.
`__tls_base` is to be initialized with a new linker-synthesized function,
`__wasm_init_tls`, which takes as an argument a block of memory to use as the
storage for thread locals. It then initializes the block of memory and sets
`__tls_base`. As `__wasm_init_tls` will handle the memory initialization,
the memory does not have to be zeroed.
To help allocating memory for thread-local storage, a new compiler intrinsic
is introduced: `__builtin_wasm_tls_size()`. This instrinsic function returns
the size of the thread-local storage for the current function.
The expected usage is to run something like the following upon thread startup:
__wasm_init_tls(malloc(__builtin_wasm_tls_size()));
Reviewers: tlively, aheejin, kripken, sbc100
Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, jfb, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D64537
llvm-svn: 366272
Summary:
This was causing large addresses to be emitted as negative numbers,
which rightfully caused crashes in binaryen.
Reviewers: aheejin, dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64612
llvm-svn: 365930
Summary:
Adds `--passive-segments` and `--active-segments` flags to control
what kind of segments are emitted. For now the default is always
to emit active segments so this is not a breaking change, but in
the future the default will be changed to passive segments when
shared memory is requested and active segments otherwise. When
passive segments are emitted, corresponding memory.init and
data.drop instructions are emitted in a `__wasm_init_memory`
function that is automatically called at the beginning of
`__wasm_call_ctors`.
Reviewers: sbc100, aheejin, dschuff
Subscribers: azakai, dschuff, jgravelle-google, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59343
llvm-svn: 365088