Fixes#85578, a use-after-free caused by some `MCSymbolWasm` data being
freed too early.
Previously, `WebAssemblyAsmParser` owned the data that is moved to
`MCContext` by this PR, which caused problems when handling module ASM,
because the ASM parser was destroyed after parsing the module ASM, but
the symbols persisted.
The added test passes locally with an LLVM build with AddressSanitizer
enabled.
Implementation notes:
* I've called the added method
<code>allocate<b><i>Generic</i></b>String</code> and added the second
paragraph of its documentation to maybe guide people a bit on when to
use this method (based on my (limited) understanding of the `MCContext`
class). We could also just call it `allocateString` and remove that
second paragraph.
* The added `createWasmSignature` method does not support taking the
return and parameter types as arguments: Specifying them afterwards is
barely any longer and prevents them from being accidentally specified in
the wrong order.
* This removes a _"TODO: Do the uniquing of Signatures here instead of
ObjectFileWriter?"_ since the field it's attached to is also removed.
Let me know if you think that TODO should be preserved somewhere.
LLVM models some features found in the binary format with raw integers
and others with nested or enumerated types. This PR switches modeling of
tables and segments to use wasm::ValType rather than uint32_t. This NFC
change is in preparation for modeling more reference types, but IMO is
also cleaner and closer to the spec.
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
These files satisfy all of the following:
- misc-include-cleaner indicates that these files do not need
Endian.h.
- They do not mention "endian" anywhere.
- They do not include any *.inc or *.def, which could need
llvm::support::endian.
This finishes the work of replacing OperandMatchResultTy with
ParseStatus, started in D154101.
As a drive-by change, rename some RegNo variables to just Reg
(a leftover from the days when RegNo had 'unsigned' type).
Conventionally, parsing methods return false on success and true on
error. However, directive parsing methods need a third state: the
directive is not target specific. AsmParser::parseStatement detected
this case by using a fragile heuristic: if the target parser did not
consume any tokens, the directive is assumed to be not target-specific.
Some targets fail to follow the convention: they return success after
emitting an error or do not consume the entire line and return failure
on successful parsing. This was partially worked around by checking for
pending errors in parseStatement.
This patch tries to improve the situation by introducing parseDirective
method that returns ParseStatus -- three-state class. The new method
should eventually replace the old one returning bool.
ParseStatus is intentionally implicitly constructible from bool to allow
uses like `return Error(Loc, "message")`. It also has a potential to
replace OperandMatchResulTy as it is more convenient to use due to the
implicit construction from bool and more type safe.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D154101
This reduces dependencies on `llvm-tblgen` so much.
`CodeGenTypes` depends on `Support` at the moment.
Be careful to append deps on this, since Targets' tablegens
depend on this.
Depends on D149024
Differential Revision: https://reviews.llvm.org/D148769
This is rework of;
- D30046 (LLT)
Since I have introduced `llvm-min-tblgen` as D146352, `llvm-tblgen`
may depend on `CodeGen`.
`LowLevlType.h` originally belonged to `CodeGen`. Almost all userse are
still under `CodeGen` or `Target`. I think `CodeGen` is the right place
to put `LowLevelType.h`.
`MachineValueType.h` may be moved as well. (later, D149024)
I have made many modules depend on `CodeGen`. It is consistent but
inefficient. It will be split out later, D148769
Besides, I had to isolate MVT and LLT in modmap, since
`llvm::PredicateInfo` clashes between `TableGen/CodeGenSchedule.h`
and `Transforms/Utils/PredicateInfo.h`.
(I think better to introduce namespace llvm::TableGen)
Depends on D145937, D146352, and D148768.
Differential Revision: https://reviews.llvm.org/D148767
This PR introduces the `BrStack` member to store the info about
`loop`, `block`, `if` and `try`. It can check whether `br` immediate number
out of range.
Reviewed By: aheejin
Differential Revision: https://reviews.llvm.org/D148054
When we encounter an `else`, `catch`, or `catch_all`, we currently just
push the structure `NestingType` and don't preserve the original `if`
and `try`'s signature. So after we pass `else`/`catch`/`catch_all`, we
can't check if the values on stack have the correct types when we
encounter `end_if` or `end_try`. This CL fixes the issue, and modifies
the existing test to be correct (some of them had `try` without
`catch`).
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D147881
We disable type check in unreachable code, but when the unreachable code
is enclosed within a block-like structure, the block as a whole has a
valid type and we should continue type checking after the block. But it
looks we currently only do that for blocks and not other block-like
structures (`loop`s, `try`s, and `if`s). Also unreachable code within
`if`'s true body shouldn't disable type checking in `else` body, and
that in `try` body shouldn't disable type checking in `catch/catch_all`
body.
This also causes the values/types on the stack to be correctly checked
when encounterint `catch`, `catch_all`, and `delegate`.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D147852
The current code is
```
ExpectBlockType = false;
TC.setLastSig(*Signature.get());
if (ExpectBlockType)
NestingStack.back().Sig = *Signature.get();
```
Because of the first line, the third line's `if (ExpectBlockType)` is
always false and we don't get to update `NestingStack.back().Sig`. This
results in not correctly erroring out when the types of remaining values
on the stack do not match the block type if the block type is written in
the form of a function type. We should set `ExpectBlockType` to false
after the `if`.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D147837
This code currently assumes that all bulk memory operations occur on
memory 0 which who's type is determined by the wasm32 vs wasm64 target
triple. Further improvements would be need to support multi-memory.
Differential Revision: https://reviews.llvm.org/D147540
WebAssemblyUtils depends on CodeGen which depends on all middle end
optimization libraries.
This component is used by WebAssembly's AsmParser, Disassembler, and
MCTargetDesc libraries. Because of this, any MC layer tool built with
WebAssembly support includes a larger portion of LLVM than it should.
To fix this I've created an MC only version of WebAssemblyTypeUtilities.cpp
in MCTargetDesc to be used by the MC components.
This shrinks llvm-objdump and llvm-mc on my local release+asserts
build by 5-6 MB.
Reviewed By: MaskRay, aheejin
Differential Revision: https://reviews.llvm.org/D144354
Change MCInstrDesc::operands to return an ArrayRef so we can easily use
it everywhere instead of the (IMHO ugly) opInfo_begin and opInfo_end.
A future patch will remove opInfo_begin and opInfo_end.
Also use it instead of raw access to the OpInfo pointer. A future patch
will remove this pointer.
Differential Revision: https://reviews.llvm.org/D142213
Local info is supposed to be emitted in the start of every function.
When there are locals, `.local` section should be present, and we emit
local info according to the section.
If there is no locals, empty local info should be emitted. This empty
local info is emitted whenever a first instruction is emitted within a
function without encountering a `.local` section. If there is no
instruction, `end_function` pseudo instruction should be present and the
empty local info will be emitted when parsing the pseudo instruction.
The following assembly is malformed because the function `test` doesn't
have an `end_function` at the end, and the parser doesn't end up
emitting the empty local info needed. But currently we don't error out
and silently produce an invalid binary.
```
.functype test () -> ()
test:
```
This patch adds one extra state to the Wasm assembly parser,
`FunctionLabel` to detect whether a function label is parsed but not
ended properly when the next function starts or the file ends.
It is somewhat tricky to distinguish `FunctionLabel` and
`FunctionStart`, because it is not always possible to ensure the state
goes from `FunctionLabel` -> `FunctionStart`. `.functype` directive does
not seem to be mandated before a function label, in which case we don't
know if the label is a function at the time of parsing. But when we do
know the label is function, we would like to ensure it ends with an
`end_function` properly. Also we would like to error out when it does
not.
For example,
```
.functype test() -> ()
test:
```
We should error out for this because we know `test` is a function and it
doesn't end with an `end_function`. This PR fixes this.
```
test:
```
We don't error out for this because there is no info that `test` is a
function, so we don't know whether there should be an `end_function` or
not.
```
test:
.functype test() -> ()
```
We error out for this currently already, because we currently switch to
`FunctionStart` state when we first see `.functype` directive after its
label definition.
Fixes https://github.com/llvm/llvm-project/issues/57427.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D141103
This is a fairly large changeset, but it can be broken into a few
pieces:
- `llvm/Support/*TargetParser*` are all moved from the LLVM Support
component into a new LLVM Component called "TargetParser". This
potentially enables using tablegen to maintain this information, as
is shown in https://reviews.llvm.org/D137517. This cannot currently
be done, as llvm-tblgen relies on LLVM's Support component.
- This also moves two files from Support which use and depend on
information in the TargetParser:
- `llvm/Support/Host.{h,cpp}` which contains functions for inspecting
the current Host machine for info about it, primarily to support
getting the host triple, but also for `-mcpu=native` support in e.g.
Clang. This is fairly tightly intertwined with the information in
`X86TargetParser.h`, so keeping them in the same component makes
sense.
- `llvm/ADT/Triple.h` and `llvm/Support/Triple.cpp`, which contains
the target triple parser and representation. This is very intertwined
with the Arm target parser, because the arm architecture version
appears in canonical triples on arm platforms.
- I moved the relevant unittests to their own directory.
And so, we end up with a single component that has all the information
about the following, which to me seems like a unified component:
- Triples that LLVM Knows about
- Architecture names and CPUs that LLVM knows about
- CPU detection logic for LLVM
Given this, I have also moved `RISCVISAInfo.h` into this component, as
it seems to me to be part of that same set of functionality.
If you get link errors in your components after this patch, you likely
need to add TargetParser into LLVM_LINK_COMPONENTS in CMake.
Differential Revision: https://reviews.llvm.org/D137838
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
This fixes clang.
Once we are in the `Unreachable` we want to disable type checking, but
we were unconditionally returning `true` here which means we encountered
and error. Instead we unconditionally return false to signal no error.
Fixes: https://github.com/llvm/llvm-project/issues/56935
Differential Revision: https://reviews.llvm.org/D135195
Although we only currently have one error produced in this function I am
working on changes right now that add some more. This change makes the
error location more accurate.
Differential Revision: https://reviews.llvm.org/D133016
Warn if `.size` is specified for a function symbol. The size of a
function symbol is determined solely by its content.
I noticed this simplification was possible while debugging #57427, but
this change doesn't fix that specific issue.
Differential Revision: https://reviews.llvm.org/D132929
Custom type-checking (in WebAssemblyAsmTypeCheck.cpp) is used to
workaround the fact that separate variants of the instruction are
defined for externref and funcref.
Based on an initial patch by Paulo Matos <pmatos@igalia.com>.
Differential Revision: https://reviews.llvm.org/D123484
The previous code didn't take account for the fact that parseExpression
my lex additional tokens - because of this, it's necessary to record the
location of the current token ahead of the call. This patch additionally
makes use of the fact parseExpression will set its End parameter to the
end of the expression.
Although this fix could be added independently of D122127, I've opted to
make it a child patch in order to ensure the change has some test
coverage.
Differential Revision: https://reviews.llvm.org/D122128
This addresses a series of FIXMEs introduced in D122020.
A follow-up patch (D122128) addresses the bug that is exposed by this
change (an issue with source location information when lexing
identifiers).
Differential Revision: https://reviews.llvm.org/D122127
D114979 changed the textual formal of ref.null - dropping ref.null in
favour of ref.null_extern and ref.null_func. Therefore, the type checker
no longer needs logic to handle "ref.null".
Differential Revision: https://reviews.llvm.org/D122123
As usual with that header cleanup series, some implicit dependencies now need to
be explicit:
llvm/MC/MCParser/MCAsmParser.h no longer includes llvm/MC/MCParser/MCAsmLexer.h
Preprocessed lines to build llvm on my setup:
after: 1068185081
before: 1068324320
So no compile time benefit to expect, but we still get the looser coupling
between files which is great.
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119359
When hitting an else clause the type Stack should be reset to as it was at the start of the if, without taking into account the Type inserted into the Stack during the then branch of the if.
Reviewed By: aardappel
Differential Revision: https://reviews.llvm.org/D115748
This patch implements the intrinsic for ref.null.
In the process of implementing int_wasm_ref_null_func() and
int_wasm_ref_null_extern() intrinsics, it removes the redundant
HeapType.
This also causes the textual assembler syntax for ref.null to
change. Instead of receiving an argument: `func` or `extern`, the
instruction mnemonic is either ref.null_func or ref.null_extern,
without the need for a further operand.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D114979
To support return (it not being supported well was the ground cause for
https://github.com/WebAssembly/wasi-sdk/issues/200) we also have to have
at least a basic notion of unreachable, which in this case just means to stop
type checking until there is an end_block (an incoming control flow edge).
This is conservative (may miss on some type checking opportunities) but is
simple and an improvement over what we had before.
Differential Revision: https://reviews.llvm.org/D112953