Fixes#85578, a use-after-free caused by some `MCSymbolWasm` data being
freed too early.
Previously, `WebAssemblyAsmParser` owned the data that is moved to
`MCContext` by this PR, which caused problems when handling module ASM,
because the ASM parser was destroyed after parsing the module ASM, but
the symbols persisted.
The added test passes locally with an LLVM build with AddressSanitizer
enabled.
Implementation notes:
* I've called the added method
<code>allocate<b><i>Generic</i></b>String</code> and added the second
paragraph of its documentation to maybe guide people a bit on when to
use this method (based on my (limited) understanding of the `MCContext`
class). We could also just call it `allocateString` and remove that
second paragraph.
* The added `createWasmSignature` method does not support taking the
return and parameter types as arguments: Specifying them afterwards is
barely any longer and prevents them from being accidentally specified in
the wrong order.
* This removes a _"TODO: Do the uniquing of Signatures here instead of
ObjectFileWriter?"_ since the field it's attached to is also removed.
Let me know if you think that TODO should be preserved somewhere.
In WebAssembly, we have `WASM_SYMBOL_NO_STRIP` symbol flag to mark the
referenced content as retained. However, the flag is not enough to
express retained data that is not referenced by any symbol. This patch
adds a new segment flag`WASM_SEG_FLAG_RETAIN` to support "private"
linkage data that is retained by llvm.used.
This kind of data that is not referenced but must be retained is usually
used with encapsulation symbols (__start/__stop). Swift runtime uses
this technique and depends on the fact "all metadata sections in live
objects are retained", which was not guaranteed with `--gc-sections`
before this patch.
This is a revised version of https://reviews.llvm.org/D126950 (has been
reverted) based on @MaskRay's comments
nm already prints sizes for data symbols. Do that for function symbols
too, and update objdump to also print size information.
Implements item 3 from https://github.com/llvm/llvm-project/issues/76107
LLVM ObjectFile currently records the start offsets of sections as the
start of the section header, whereas most other tools (WABT, emscripten,
wasm-tools) record it as the start of the section content, after the
header. This affects binutils tools such as objdump and nm, but not
compilation/assembly (since that is driven by symbols and assembler
labels which already have their values inside the section payload rather
in the header. This patch updates LLVM to match the other tools.
Enable Type Units with DWARF5 accelerator tables for monolithic DWARF.
Implementation relies on linker to tombstone offset in LocalTU list to
-1 when
it deduplciates type units using COMDAT.
Annotation attributes may be attached to a function to mark it with
custom data that will be contained in the final Wasm file. The
annotation causes a custom section named
"func_attr.annotate.<name>.<arg0>.<arg1>..." to be created that will
contain each function's index value that was marked with the annotation.
A new patchable relocation type for function indexes had to be created so
the custom section could be updated during linking.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D150803
Conventionally, parsing methods return false on success and true on
error. However, directive parsing methods need a third state: the
directive is not target specific. AsmParser::parseStatement detected
this case by using a fragile heuristic: if the target parser did not
consume any tokens, the directive is assumed to be not target-specific.
Some targets fail to follow the convention: they return success after
emitting an error or do not consume the entire line and return failure
on successful parsing. This was partially worked around by checking for
pending errors in parseStatement.
This patch tries to improve the situation by introducing parseDirective
method that returns ParseStatus -- three-state class. The new method
should eventually replace the old one returning bool.
ParseStatus is intentionally implicitly constructible from bool to allow
uses like `return Error(Loc, "message")`. It also has a potential to
replace OperandMatchResulTy as it is more convenient to use due to the
implicit construction from bool and more type safe.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D154101
This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0
since I forgot the lit.local.cfg files in that one.
Reformatting is done with `black`.
If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.
If you run into any problems, post to discourse about it and
we will try to help.
RFC Thread below:
https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Reviewed By: barannikov88, kwk
Differential Revision: https://reviews.llvm.org/D150762
Reland D147506 after fixing the failure in bot
https://lab.llvm.org/buildbot/#/builders/247/builds/4125
Some debuggers like DBX on AIX assume the address in debug line
entries is always incremental. But clang generates two entries (entry
for file scope line and entry for prologue end) with same address if
prologue is empty
And if the prologue is empty, seems the first debug line entry for the
function is unnecessary(i.e. removing the first entry won't impact the
behavior in GDB on Linux), so I implement this for all debuggers.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D147506
Some debuggers like DBX on AIX assume the address in debug line
entries is always incremental. But clang generates two entries (entry
for file scope line and entry for prologue end) with same address if
prologue is empty
And if the prologue is empty, seems the first debug line entry for the
function is unnecessary(i.e. removing the first entry won't impact the
behavior in GDB on Linux), so I implement this for all debuggers.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D147506
This PR introduces the `BrStack` member to store the info about
`loop`, `block`, `if` and `try`. It can check whether `br` immediate number
out of range.
Reviewed By: aheejin
Differential Revision: https://reviews.llvm.org/D148054
When we encounter an `else`, `catch`, or `catch_all`, we currently just
push the structure `NestingType` and don't preserve the original `if`
and `try`'s signature. So after we pass `else`/`catch`/`catch_all`, we
can't check if the values on stack have the correct types when we
encounter `end_if` or `end_try`. This CL fixes the issue, and modifies
the existing test to be correct (some of them had `try` without
`catch`).
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D147881
We disable type check in unreachable code, but when the unreachable code
is enclosed within a block-like structure, the block as a whole has a
valid type and we should continue type checking after the block. But it
looks we currently only do that for blocks and not other block-like
structures (`loop`s, `try`s, and `if`s). Also unreachable code within
`if`'s true body shouldn't disable type checking in `else` body, and
that in `try` body shouldn't disable type checking in `catch/catch_all`
body.
This also causes the values/types on the stack to be correctly checked
when encounterint `catch`, `catch_all`, and `delegate`.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D147852
The current code is
```
ExpectBlockType = false;
TC.setLastSig(*Signature.get());
if (ExpectBlockType)
NestingStack.back().Sig = *Signature.get();
```
Because of the first line, the third line's `if (ExpectBlockType)` is
always false and we don't get to update `NestingStack.back().Sig`. This
results in not correctly erroring out when the types of remaining values
on the stack do not match the block type if the block type is written in
the form of a function type. We should set `ExpectBlockType` to false
after the `if`.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D147837
This code currently assumes that all bulk memory operations occur on
memory 0 which who's type is determined by the wasm32 vs wasm64 target
triple. Further improvements would be need to support multi-memory.
Differential Revision: https://reviews.llvm.org/D147540
Local info is supposed to be emitted in the start of every function.
When there are locals, `.local` section should be present, and we emit
local info according to the section.
If there is no locals, empty local info should be emitted. This empty
local info is emitted whenever a first instruction is emitted within a
function without encountering a `.local` section. If there is no
instruction, `end_function` pseudo instruction should be present and the
empty local info will be emitted when parsing the pseudo instruction.
The following assembly is malformed because the function `test` doesn't
have an `end_function` at the end, and the parser doesn't end up
emitting the empty local info needed. But currently we don't error out
and silently produce an invalid binary.
```
.functype test () -> ()
test:
```
This patch adds one extra state to the Wasm assembly parser,
`FunctionLabel` to detect whether a function label is parsed but not
ended properly when the next function starts or the file ends.
It is somewhat tricky to distinguish `FunctionLabel` and
`FunctionStart`, because it is not always possible to ensure the state
goes from `FunctionLabel` -> `FunctionStart`. `.functype` directive does
not seem to be mandated before a function label, in which case we don't
know if the label is a function at the time of parsing. But when we do
know the label is function, we would like to ensure it ends with an
`end_function` properly. Also we would like to error out when it does
not.
For example,
```
.functype test() -> ()
test:
```
We should error out for this because we know `test` is a function and it
doesn't end with an `end_function`. This PR fixes this.
```
test:
```
We don't error out for this because there is no info that `test` is a
function, so we don't know whether there should be an `end_function` or
not.
```
test:
.functype test() -> ()
```
We error out for this currently already, because we currently switch to
`FunctionStart` state when we first see `.functype` directive after its
label definition.
Fixes https://github.com/llvm/llvm-project/issues/57427.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D141103
Once we are in the `Unreachable` we want to disable type checking, but
we were unconditionally returning `true` here which means we encountered
and error. Instead we unconditionally return false to signal no error.
Fixes: https://github.com/llvm/llvm-project/issues/56935
Differential Revision: https://reviews.llvm.org/D135195
Although we only currently have one error produced in this function I am
working on changes right now that add some more. This change makes the
error location more accurate.
Differential Revision: https://reviews.llvm.org/D133016
Warn if `.size` is specified for a function symbol. The size of a
function symbol is determined solely by its content.
I noticed this simplification was possible while debugging #57427, but
this change doesn't fix that specific issue.
Differential Revision: https://reviews.llvm.org/D132929
Custom type-checking (in WebAssemblyAsmTypeCheck.cpp) is used to
workaround the fact that separate variants of the instruction are
defined for externref and funcref.
Based on an initial patch by Paulo Matos <pmatos@igalia.com>.
Differential Revision: https://reviews.llvm.org/D123484
The previous code didn't take account for the fact that parseExpression
my lex additional tokens - because of this, it's necessary to record the
location of the current token ahead of the call. This patch additionally
makes use of the fact parseExpression will set its End parameter to the
end of the expression.
Although this fix could be added independently of D122127, I've opted to
make it a child patch in order to ensure the change has some test
coverage.
Differential Revision: https://reviews.llvm.org/D122128
This addresses a series of FIXMEs introduced in D122020.
A follow-up patch (D122128) addresses the bug that is exposed by this
change (an issue with source location information when lexing
identifiers).
Differential Revision: https://reviews.llvm.org/D122127
While looking at bugs like PR54022, I noted that there is no real test
coverage for the asm type checker. This patch starts to address that,
adding a series of tests for the errors messages produced, as well as
some FIXMEs as an XFAIL test for some current issues.
It's not intended to be an exhaustive test, but does have test cases for
each of the instructions that the type checker has specific handling
for.
Differential Revision: https://reviews.llvm.org/D122020
This fixes bug <https://github.com/llvm/llvm-project/issues/54022>. For
now this means that defined functions will have two .functype directives
emitted. Given discussion in that bug has suggested interest in moving
towards using something other than .functype to mark the beginning of a
function (which would, as a side-effect, solve this issue), this patch
doesn't attempt to avoid that duplication.
Some test cases that used CHECK-LABEL: foo rather than CHECK-LABEL: foo:
are broken by this change. This patch updates those test cases to always
have a colon at the end of the CHECK-LABEL string.
Differential Revision: https://reviews.llvm.org/D122134
Fix the instruction names to match the WebAssembly spec:
- `i32x4.trunc_sat_zero_f64x2_{s,u}` => `i32x4.trunc_sat_f64x2_{s,u}_zero`
- `f32x4.demote_zero_f64x2` => `f32x4.demote_f64x2_zero`
Also rename related things like intrinsics, builtins, and test functions to
match.
Reviewed By: aheejin
Differential Revision: https://reviews.llvm.org/D121661
Also increase coverage of call_indirect via explict function table
(enabled when reference types is enabled) in
llvm/test/CodeGen/WebAssembly/call-indirect.ll (I believe this
was an oversight that it was not added in https://reviews.llvm.org/D90948)
Differential Revision: https://reviews.llvm.org/D120521
Intrinsics like `memset` were not emitted as `.functype` because
WebAssemblyAsmPrinter::emitExternalDecls explicitly skips symbols
that are isIntrinsic. Removing that check doesn't work, since the symbol
from the module refers to a 4-argument `llvm.memset.p0i8.i32` rather
than the 3-argument `memset` symbol referenced in the call.
Our `WebAssemblyMCLowerPrePass` however does collect the
`memset` symbol, so the current solution is as simple as emitting
`.functype` for those.
Fixes: https://github.com/llvm/llvm-project/issues/53712
Differential Revision: https://reviews.llvm.org/D120365
Reland of 00bf4755.
This patches fixes the visibility and linkage information of symbols
referring to IR globals.
Emission of external declarations is now done in the first execution
of emitConstantPool rather than in emitLinkage (and a few other
places). This is the point where we have already gathered information
about used symbols (by running the MC Lower PrePass) and not yet
started emitting any functions so that any declarations that need to
be emitted are done so at the top of the file before any functions.
This changes the order of a few directives in the final asm file which
required an update to a few tests.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D118995
This reverts commit 00bf4755e90c89963a135739218ef49c2417109f.
This change broke the emscripten builder (among other things):
https://ci.chromium.org/ui/p/emscripten-releases/builders/try/linux/b8823500584349280721/overview
Sample failure:
```
test_unistd_unlink (test_core.core0) ...
wasm-ld: error: symbol type mismatch: __stdio_write
>>> defined as WASM_SYMBOL_TYPE_FUNCTION in /usr/local/google/home/sbc/dev/wasm/emscripten/cache/sysroot/lib/wasm32-emscripten/libc-debug.a(__stdio_write.o)
>>> defined as WASM_SYMBOL_TYPE_DATA in /usr/local/google/home/sbc/dev/wasm/emscripten/cache/sysroot/lib/wasm32-emscripten/libc-debug.a(stderr.o)
```
This patches fixes the visibility and linkage information of symbols
referring to IR globals.
Emission of external declarations is now done in the first execution
of emitConstantPool rather than in emitLinkage (and a few other
places). This is the point where we have already gathered information
about used symbols (by running the MC Lower PrePass) and not yet
started emitting any functions so that any declarations that need to
be emitted are done so at the top of the file before any functions.
This changes the order of a few directives in the final asm file which
required an update to a few tests.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D118122
Try to revert D113741 once again.
This also reverts 0ac75e82fff93a80ca401d3db3541e8d1d9098f9 (D114705)
as it causes LLDB's lldb-api.lang/cpp/nsimport.TestCppNsImport.py test
failure w/o D113741.
This reverts commit f9607d45f399e2afc39ec16222ea68b4e0831564.
Differential Revision: https://reviews.llvm.org/D116225