388 Commits

Author SHA1 Message Date
Dan Gohman
118445841d
[WebAssembly] Protect memory.fill and memory.copy from zero-length ranges. (#112617)
WebAssembly's `memory.fill` and `memory.copy` instructions trap if the
pointers are out of bounds, even if the length is zero. This is
different from LLVM, which expects that it can call `memcpy` on
arbitrary invalid pointers if the length is zero. To avoid spurious
traps, branch around `memory.fill` and `memory.copy` when the length is
zero.
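
As a rough illustration of the guard described above, the following standalone C++ sketch models a linear memory whose fill operation bounds-checks even for zero lengths, plus a wrapper that branches around it. `LinearMemory`, `rawFill`, and `guardedFill` are hypothetical names for this sketch only; none of this is the backend's actual lowering code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Toy model of a WebAssembly linear memory (hypothetical, for illustration).
struct LinearMemory {
  std::vector<std::uint8_t> bytes;

  // Models `memory.fill`: the bounds check runs even when len == 0, so an
  // out-of-bounds destination "traps" (here: throws) regardless of length.
  void rawFill(std::size_t dst, std::uint8_t value, std::size_t len) {
    if (dst > bytes.size() || len > bytes.size() - dst)
      throw std::out_of_range("trap: out-of-bounds memory access");
    if (len != 0)
      std::memset(bytes.data() + dst, value, len);
  }

  // Models the guarded form: skip the instruction entirely when the length
  // is zero, so a zero-length fill never traps, whatever the pointer is.
  void guardedFill(std::size_t dst, std::uint8_t value, std::size_t len) {
    if (len == 0)
      return;
    rawFill(dst, value, len);
  }
};
```

With a 16-byte memory, `rawFill(1000, 0xFF, 0)` throws in this model, while `guardedFill(1000, 0xFF, 0)` returns without touching memory.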

---------

Co-authored-by: Heejin Ahn <aheejin@gmail.com>
2024-10-24 14:13:58 -07:00
Jordan Rupprecht
33363521ca
[NFC][WebAssembly] Inline var only used in assertion (#113507) 2024-10-23 18:51:25 -05:00
Alex Crichton
c2293b33dd
[WebAssembly] Implement the wide-arithmetic proposal (#111598)
This commit implements the [wide-arithmetic] proposal which has recently
reached phase 2 in the WebAssembly proposals process. The goal here is
to implement support in LLVM for emitting these instructions which are
gated behind a new feature flag by default. A new `wide-arithmetic`
feature flag is introduced which gates these four new instructions from
being emitted.

Emission of each instruction itself is relatively simple given LLVM's
preexisting lowering rules and infrastructure. The main gotcha is that
due to the multi-result nature of all of these instructions it needed
the lowerings to be implemented in C++ rather than in TableGen.
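
To make the multi-result shape concrete, here is a small C++ sketch of what a 128-bit add and an unsigned 64x64 to 128-bit multiply compute when values are passed as 64-bit halves. The function names `add128` and `mulWideU` and the `Wide` alias are assumptions for this sketch, and the code models the arithmetic only, not the LLVM lowering.

```cpp
#include <cstdint>
#include <utility>

// A 128-bit result as two 64-bit halves: {low, high}. Hypothetical alias.
using Wide = std::pair<std::uint64_t, std::uint64_t>;

// 128-bit addition over (lo, hi) halves; the carry out of the low half
// is propagated into the high half.
Wide add128(std::uint64_t aLo, std::uint64_t aHi,
            std::uint64_t bLo, std::uint64_t bHi) {
  std::uint64_t lo = aLo + bLo;
  std::uint64_t carry = lo < aLo ? 1 : 0;
  std::uint64_t hi = aHi + bHi + carry;
  return {lo, hi};
}

// Unsigned 64x64 -> 128 multiply built from 32-bit limbs, so it does not
// rely on a compiler-specific 128-bit integer type.
Wide mulWideU(std::uint64_t a, std::uint64_t b) {
  const std::uint64_t lowMask = 0xFFFFFFFFu;
  std::uint64_t aLo = a & lowMask, aHi = a >> 32;
  std::uint64_t bLo = b & lowMask, bHi = b >> 32;
  std::uint64_t p0 = aLo * bLo;
  std::uint64_t p1 = aLo * bHi;
  std::uint64_t p2 = aHi * bLo;
  std::uint64_t p3 = aHi * bHi;
  // Sum of the three contributions to bits 32..63, including carries.
  std::uint64_t mid = (p0 >> 32) + (p1 & lowMask) + (p2 & lowMask);
  std::uint64_t lo = (p0 & lowMask) | (mid << 32);
  std::uint64_t hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
  return {lo, hi};
}
```

Each function returns two values at once, which is the property that, per the message above, made a C++ lowering necessary rather than a TableGen pattern.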

[wide-arithmetic]: https://github.com/WebAssembly/wide-arithmetic
2024-10-23 11:39:58 -07:00
Jeffrey Byrnes
853c43d04a
[TTI] NFC: Port TLI.shouldSinkOperands to TTI (#110564)
Porting to TTI provides direct access to the instruction cost model,
which can enable instruction cost based sinking without introducing code
duplication.
2024-10-09 14:30:09 -07:00
Simon Pilgrim
f8f0a266e0
[clang][wasm] Replace the target integer sub saturate intrinsics with the equivalent generic __builtin_elementwise_sub_sat intrinsics (#109405)
Remove the Intrinsic::wasm_sub_sat_signed/wasm_sub_sat_unsigned entries
and just use sub_sat_s/sub_sat_u directly
2024-09-22 10:12:41 +01:00
Brendan Dahl
c076638c70
[WebAssembly] Support BUILD_VECTOR with F16x8. (#108117)
Convert BUILD_VECTORS with FP16x8 to I16x8 since there's no FP16 scalar
value to initialize v128.const.
2024-09-11 10:00:10 -07:00
Brendan Dahl
415288a2a7
[WebAssembly] Add load and store patterns for V8F16. (#108119) 2024-09-11 09:53:53 -07:00
Brendan Dahl
5703d8572f
[WebAssembly] Add intrinsics to wasm_simd128.h for all FP16 instructions (#106465)
Getting this to work required a few additional changes:
- Add builtins for any instructions that can't be done with plain C
currently.
- Add support for the saturating version of fp_to_<s,i>_I16x8. Other
vector sizes supported this already.
- Support bitcast of f16x8 to v128. Needed to return a __f16x8 as
v128_t.
2024-08-30 08:42:37 -07:00
Sergei Barannikov
4d7a0abae8
[DataLayout] Change return type of getStackAlignment to MaybeAlign (#105478)
Currently, `getStackAlignment` asserts if the stack alignment wasn't
specified. This makes it inconvenient to use and complicates testing.

This change also makes `exceedsNaturalStackAlignment` method redundant.
2024-08-27 22:59:33 +03:00
Brendan Dahl
7d373cef49
[WebAssembly] Change half-precision feature name to fp16. (#105434)
This better aligns with how the feature is being referred to and what
runtimes (V8) are calling it.
2024-08-22 09:44:33 -07:00
Sam Parker
76c4529515
[WebAssembly] Fix assertion in LowerBUILD_VECTOR (#101961)
The assertion was failing in the case where we were trying to lower to
loadxx_zero, but lane zero was undef.
2024-08-05 14:38:12 -07:00
Sam Parker
08decd20a9
[WebAssembly] load_zero to initialise build_vector (#100610)
Instead of splatting a single lane, to initialise a build_vector, lower
to scalar_to_vector which can be selected to load_zero.

Also add load_zero and load_lane patterns for f32x4 and f64x2.
2024-08-02 10:11:21 +01:00
Amara Emerson
f270a4dd66
[AArch64] Don't tail call memset if it would convert to a bzero. (#98969)
Well, not quite that simple. We can tc memset since it returns the first
argument but bzero doesn't do that and therefore we can end up
miscompiling.
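
A minimal C++ sketch of why the return value matters, assuming a hypothetical void-returning helper `zeroNoReturn` as a stand-in for `bzero`: a caller that forwards `memset`'s result can end in a tail call, but the same caller built on a void-returning routine must still produce the pointer itself.

```cpp
#include <cstddef>
#include <cstring>

// memset returns its first argument, so forwarding its result unchanged
// leaves the call in tail position.
void *clearAndReturn(void *dst, std::size_t n) {
  return std::memset(dst, 0, n);
}

// Hypothetical stand-in for bzero: same effect, but no return value.
void zeroNoReturn(void *dst, std::size_t n) { std::memset(dst, 0, n); }

// With a void-returning routine the caller has to return `dst` itself
// after the call, so the call is no longer a plain tail call.
void *clearViaZero(void *dst, std::size_t n) {
  zeroNoReturn(dst, n);
  return dst;
}
```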

This patch also refactors the logic out of isInTailCallPosition() into the callers.
As a result memcpy and memmove are also modified to do the same thing
for consistency.

rdar://131419786
2024-07-17 01:31:52 -07:00
Roger Ferrer Ibáñez
05e6bb40eb
[SelectionDAG] Add an ISD::CLEAR_CACHE node to lower llvm.clear_cache (#93795)
The current way of lowering `llvm.clear_cache` is a bit unusual. As
suggested by Matt Arsenault we are better off using an ISD node.

This change introduces a new `ISD::CLEAR_CACHE`, registers a new libcall
by default named `__clear_cache` and the default legalisation is a
libcall.

This is preparatory work for a custom lowering of `ISD::CLEAR_CACHE`
needed by RISC-V on some platforms.
2024-05-30 14:55:32 +02:00
Brendan Dahl
60bce6eab4
[WebAssembly] Implement all f16x8 binary instructions. (#93360)
This reuses most of the code that was created for f32x4 and f64x2 binary
instructions and tries to follow how they were implemented.

add/sub/mul/div - use regular LL instructions
min/max - use the minimum/maximum intrinsic, and also have builtins
pmin/pmax - use the wasm.pmax/pmin intrinsics and also have builtins

Specified at:

29a9b9462c/proposals/half-precision/Overview.md
2024-05-28 16:33:20 -07:00
Heejin Ahn
c179d50fd3
[WebAssembly] Add exnref type (#93586)
This adds (back) the exnref type restored in the new EH proposal adopted
in Oct 2023 CG meeting:

https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md
2024-05-28 16:10:11 -07:00
Brendan Dahl
09c5525610
[WebAssembly] Implement prototype f16x8.splat instruction. (#93228)
Adds a builtin and intrinsic for the f16x8.splat instruction.

Specified at:

29a9b9462c/proposals/half-precision/Overview.md

Note: the current spec has f16x8.splat as opcode 0x123, but this is
incorrect and will be changed to 0x120 soon.
2024-05-23 20:05:22 -07:00
Sam Clegg
39d32b238d
[WebAssembly] Use 64-bit table when targeting wasm64 (#92042)
See https://github.com/WebAssembly/memory64/issues/51
2024-05-23 18:25:58 -07:00
Brendan Dahl
8a3277acbc
[WebAssembly] Implement prototype f32.store_f16 instruction. (#91545)
Adds a builtin and intrinsic for the f32.store_f16 instruction.

The instruction stores an f32 value as an f16 in memory. Specified at:

29a9b9462c/proposals/half-precision/Overview.md

Note: the current spec has f32.store_f16 as opcode 0xFD0121, but this is
incorrect and will be changed to 0xFC31 soon.
2024-05-09 15:38:13 -07:00
Brendan Dahl
1a2a1fbd7c
[WebAssembly] Implement prototype f32.load_f16 instruction. (#90906)
Adds a builtin and intrinsic for the f32.load_f16 instruction.

The instruction loads an f16 value from memory and puts it in an f32.
Specified at:

29a9b9462c/proposals/half-precision/Overview.md

Note: the current spec has f32.load_f16 as opcode 0xFD0120, but this is
incorrect and will be changed to 0xFC30 soon.
2024-05-07 11:33:10 -07:00
Heejin Ahn
c921ac724f
[WebAssembly] Enable multivalue return when multivalue ABI is used (#88492)
Multivalue feature of WebAssembly has been standardized for several
years now. I think it makes sense to be able to enable it in the feature
section by default for our clang/llvm-produced binaries so that the
multivalue feature can be used as necessary when necessary within our
toolchain and also when running other optimizers (e.g. wasm-opt) after
the LLVM code generation.

But some WebAssembly toolchains, such as Emscripten, do not provide both
multivalue-returning and not-multivalue-returning versions of libraries.
Also allowing the uses of multivalue in the features section does not
necessarily mean we generate them whenever we can to the fullest, which
is a different code generation / optimization option.

So this makes the lowering of multivalue returns conditional on the use
of 'experimental-mv' target ABI. This ABI is turned off by default and
turned on by passing `-Xclang -target-abi -Xclang experimental-mv` to
`clang`, or `-target-abi experimental-mv` to `clang -cc1` or `llc`.

But the purpose of this PR is not tying the multivalue lowering to this
specific 'experimental-mv'. 'experimental-mv' is just one multivalue ABI
we currently have, and it is still experimental, meaning it is not very
well optimized or tuned for performance. (e.g. it does not have the
limitation of the max number of multivalue-lowered values, which can be
detrimental to performance.) We may change the name of this ABI, or
improve it, or add a new multivalue ABI in the future. Also I heard that
WASI is planning to add their multivalue ABI soon. So the plan is,
whenever any one of multivalue ABIs is enabled, we enable the lowering
of multivalue returns in the backend. We currently have only
'experimental-mv' in the repo so we only check for that in this PR.

Related past discussions:
#82714
https://github.com/WebAssembly/tool-conventions/pull/223#issuecomment-2008298652
2024-04-23 17:48:59 +09:00
Arthur Eubanks
94c988bcfd [NFC] Remove unused parameter from shouldAssumeDSOLocal() 2024-03-11 19:48:17 +00:00
Heejin Ahn
8506a63bf7 Revert "[WebAssembly] Disable multivalue emission temporarily (#82714)"
This reverts commit 6e6bf9f81756ba6655b4eea8dc45469a47f89b39.

It turned out the multivalue feature had active outside users and it
could cause some disruptions to them, so I'd like to investigate more
about the workarounds before doing this.
2024-02-28 01:02:39 +00:00
Heejin Ahn
6e6bf9f817
[WebAssembly] Disable multivalue emission temporarily (#82714)
We plan to enable multivalue in the features section soon (#80923) for
other reasons, such as the feature having been standardized for many
years and other features being developed (e.g. EH) depending on it. This
is separate from enabling Clang experimental multivalue ABI (`-Xclang
-target-abi -Xclang experimental-mv`), but it turned out we generate
some multivalue code in the backend as well if it is enabled in the
features section.

Given that our backend multivalue generation still has not been much
used nor tested, and enabling the feature in the features section can be
a separate decision from how much multivalue (including none) we decide
to generate for now, I'd like to temporarily disable the actual
generation of multivalue in our backend. To do that, this adds an
internal flag `-wasm-emit-multivalue` that defaults to false. All our
existing multivalue tests can use this to test multivalue code. This
flag can be removed later when we are confident the multivalue
generation is well tested.
2024-02-22 19:17:15 -08:00
Alex Bradbury
197214e39b
[RFC][SelectionDAG] Add and use SDNode::getAsZExtVal() helper (#76710)
This follows on from #76708, allowing
`cast<ConstantSDNode>(N)->getZExtValue()` to be replaced with just
`N->getAsZExtVal();`
    
Introduced via `git grep -l "cast<ConstantSDNode>\(.*\).*getZExtValue" |
xargs sed -E -i
's/cast<ConstantSDNode>\((.*)\)->getZExtValue/\1->getAsZExtVal/'` and
then using `git clang-format` on the result.
2024-01-09 12:25:17 +00:00
Benjamin Kramer
858d6a15a0 [wasm] Don't crash on non-simple value types during shuffle combine
These still exist during the DAGCombine phase.
2023-10-24 12:35:43 +02:00
Björn Pettersson
4acb96c99f
[SelectionDAG] Tidy up around endianness and isConstantSplat (#68212)
The BuildVectorSDNode::isConstantSplat function could depend on
endianness, and it takes a bool argument that can be used to indicate
if big or little endian should be considered when internally casting
from a vector to a scalar. However, that argument is default set to
false (= little endian). And in many situations, even in target
generic code such as DAGCombiner, the endianness isn't specified when
using the function.

The intent with this patch is to highlight that endianness doesn't
matter, depending on the context in which the function is used.

In DAGCombiner the code is slightly refactored. Back in the days when
the code was written it wasn't possible to request a MinSplatBits
size when calling isConstantSplat. Instead the code re-expanded the
found SplatValue to match with the EltBitWidth. Now we can just
provide EltBitWidth as MinSplatBits and remove the logic for doing
the re-expand.

While being at it, tidying up around isConstantSplat, this patch also
adds an explicit check in BuildVectorSDNode::isConstantSplat to break
out from the loop if trying to split an odd VecWidth into two halves.
Haven't been able to prove that there could be miscompiles involved
if not doing so. There are lit tests that trigger that scenario,
although I think they happen to later discard the returned SplatValue
for other reasons.
2023-10-16 14:53:53 +02:00
Paulo Matos
a29e8ef1c3
[WebAssembly] Add path to PIC mode for wasm tables (#67545)
Currently tables cannot be shared between compilation units, therefore
no special treatment is needed for tables.

Fixes #65191
2023-10-03 08:00:21 +02:00
Yolanda Chen
291101aa8e [WebAssembly] Optimize vector shift using a splat value from outside block
The vector shift operation in WebAssembly uses an i32 shift amount type, while
LLVM IR requires the operands of a binary operator to have the same type. When the
shift amount operand is splatted from a different block, the splat source is
not exported and the vector shift is unrolled into scalar shifts. This
patch enables the vector shift to identify the splat source value from the other
block and generate the expected WebAssembly bytecode when lowering.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D158399
2023-08-25 08:13:27 -07:00
Reid Kleckner
984dc4b9cd [WebAssembly] Create separation between MC and CodeGen layers
Move WebAssemblyUtilities from Utils to the CodeGen library. It
primarily deals in MIR layer types, so it really lives in the CodeGen
library.

Move a variety of other things around to try to create better separation.

See issue #64166 for more info on layering.

Move llvm/include/CodeGen/WasmAddressSpaces.h back to
llvm/lib/Target/WebAssembly/Utils.

Differential Revision: https://reviews.llvm.org/D156472
2023-08-18 14:08:37 -07:00
Thomas Lively
4f065fcb57 [WebAssembly] Fix incorrect assertion in SIMD reduction codegen
The codegen routine introduced in 18077e9fd688 did not account for vectors with
more than 16 lanes. Remove the incorrect assertion and bail out of the
optimization when encountering this case. Add test cases that previously
triggered the assertion. Unfortunately, these test cases now have terrible
codegen, but that is at least better than crashing.

Fixes #63500.

Differential Revision: https://reviews.llvm.org/D154124
2023-06-30 11:30:18 -07:00
xortoast
bb648c9177 [WebAssembly] Add lowering for llvm.rint and llvm.roundeven
WebAssembly doesn't expose inexact exceptions, so frint can be mapped to
fnearbyint. Likewise, WebAssembly always rounds ties-to-even, so
froundeven can be mapped to fnearbyint.
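
The equivalence can be checked with the C++ standard library, as a sketch: with the rounding mode set to round-to-nearest (the only mode the message above attributes to WebAssembly), `std::nearbyint` rounds ties to even and raises no inexact exception. The wrapper name `roundEvenViaNearbyint` is hypothetical.

```cpp
#include <cfenv>
#include <cmath>

// Rounds to the nearest integer with ties going to the even neighbour,
// by running nearbyint under the round-to-nearest mode.
double roundEvenViaNearbyint(double x) {
  std::fesetround(FE_TONEAREST);
  return std::nearbyint(x);
}
```

For halfway cases this gives 0.5 -> 0, 1.5 -> 2, 2.5 -> 2, and -2.5 -> -2, which is the ties-to-even behaviour expected of `llvm.roundeven`.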

Differential Revision: https://reviews.llvm.org/D153451
2023-06-23 14:07:11 -07:00
Paulo Matos
55aeb23fe0 [clang][WebAssembly] Implement support for table types and builtins
This commit implements support for WebAssembly table types and
respective builtins. Tables are WebAssembly objects that store
reference types. They have a large number of semantic restrictions
including, but not limited to, only being allowed to be declared
at the top level as static arrays of zero length, not being arguments
or results of functions, not being stored to memory, etc.

This commit introduces the __attribute__((wasm_table)) to attach to
arrays of WebAssembly reference types. And the following builtins to
manage tables:

* ref   __builtin_wasm_table_get(table, idx)
* void  __builtin_wasm_table_set(table, idx, ref)
* uint  __builtin_wasm_table_size(table)
* uint  __builtin_wasm_table_grow(table, ref, uint)
* void  __builtin_wasm_table_fill(table, idx, ref, uint)
* void  __builtin_wasm_table_copy(table, table, uint, uint, uint)

This commit also enables reference-types feature at bleeding-edge.

This is joint work with Alex Bradbury (@asb).

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D139010
2023-06-10 15:53:13 +02:00
Caleb Zulawski
18077e9fd6 [WebAssembly] Re-land 8392bf6000ad
Correctly handle single-element vectors to fix an assertion failure. Add tests
that were missing from the original commit.

Differential Revision: D151782
2023-06-09 08:42:27 -07:00
Thomas Lively
100c756d96 Revert "Improve WebAssembly vector bitmask, mask reduction, and extending"
This reverts commit 8392bf6000ad039bd0e55383d40a05ddf7b4af13.

The commit missed some edge cases that led to crashes. Reverting to resolve
downstream breakage while a fix is pending.
2023-06-08 14:36:29 -07:00
Caleb Zulawski
8392bf6000 Improve WebAssembly vector bitmask, mask reduction, and extending
This is inspired by a recently filed Rust issue noting poor codegen for vector masks (https://github.com/rust-lang/portable-simd/issues/351).

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D151782
2023-06-07 10:20:22 -07:00
Thomas Lively
72a72315b0 [WebAssembly] Mark @llvm.wasm.shuffle lane indices as immediates
This intrinsic is meant to lower directly to the i8x16.shuffle instruction,
which takes its lane index arguments as immediates. The ISel for the intrinsic
assumed that the lane index arguments were constants, so bitcode that
"incorrectly" used this intrinsic with non-immediate arguments caused an
assertion failure in the backend.

Avoid the crash by defining the lane index arguments to be immediates, matching
the underlying instruction. Update ISel accordingly. This change means that the
bitcode that previously caused a crash will now fail to validate.

Fixes #55559.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D149898
2023-05-05 08:12:41 -07:00
Peter Rong
3b2476910b [WASM] Prevent casting undef to ConstantSDNode
WebAssembly tries to cast an `undef` to `ConstantSDNode` during `LowerAccessVectorElement`.
These operations will trigger an assertion error in cast.
To avoid this issue, we prevent casting, and abort the lowering operation.
A unit test is also included.

This patch fixes [pr61828](https://github.com/llvm/llvm-project/issues/61828)

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D147198
2023-03-30 20:14:11 -07:00
Peter Rong
51a93828d7 [WASM] Fix legalizer for LowerBUILD_VECTOR.
Constants in BUILD_VECTOR may be down cast into a smaller value that fits LaneBits, i.e., the bit width of elements in the vector.
This cast didn't consider 2^N where it would be cast into -2^N, which still doesn't fit into LaneBits after casting.
This will cause an assertion in later legalization.

2^N should be cast into 0, and this patch reflects such behavior.
This patch also includes a test to reflect the fix.
This patch fixes [issue 61780](https://github.com/llvm/llvm-project/issues/61780)
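
One way to picture the intended wrapping, as a sketch rather than the patch's actual code: keep only the low LaneBits of the constant and reinterpret them as a signed lane value, so that 2^N with N equal to LaneBits becomes 0. The helper name `wrapToLane` is hypothetical, and the sketch assumes a lane width between 1 and 63 bits and two's-complement conversion.

```cpp
#include <cstdint>

// Wraps `value` into a signed lane of `laneBits` bits (1 <= laneBits <= 63).
std::int64_t wrapToLane(std::int64_t value, unsigned laneBits) {
  const std::uint64_t mask = (std::uint64_t{1} << laneBits) - 1;
  const std::uint64_t signBit = std::uint64_t{1} << (laneBits - 1);
  // Keep only the bits that fit in the lane.
  std::uint64_t low = static_cast<std::uint64_t>(value) & mask;
  // Sign-extend so the result is the lane's signed interpretation.
  if (low & signBit)
    low |= ~mask;
  return static_cast<std::int64_t>(low);
}
```

For 8-bit lanes, 256 maps to 0, 255 maps to -1, and 128 maps to -128, all of which fit in the lane.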

Related patch: https://reviews.llvm.org/D108669

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D147208
2023-03-30 19:20:04 -07:00
Peter Rong
163d7bb941 [WASM] Precommit WebAssemblyISelLowering.cpp format changes for D147198
Signed-off-by: Peter Rong <PeterRong96@gmail.com>
2023-03-29 22:18:53 -07:00
Heejin Ahn
d91c9aef9b [WebAssembly] Select call_indirect for alloca calls
Currently, a call to a stack location is selected using `CALL` in ISel,
resulting in invalid code and a crash in AsmPrinter. FastISel
correctly selects it with `CALL_INDIRECT`.

Fixes the problem reported in D146781.

Reviewed By: tlively, HerrCai0907

Differential Revision: https://reviews.llvm.org/D147033
2023-03-29 12:46:58 -07:00
Jun Ma
403926aefe [WebAssembly] Skip implied bitmask operation in LowerShift
This patch skips redundant explicit masks of the shift count since
it is implied inside wasm shift instruction.
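
The redundancy can be shown with a scalar model in C++. WebAssembly shift instructions take the count modulo the lane width, so masking the count beforehand changes nothing. The sketch uses a 32-bit lane, and the function names are hypothetical.

```cpp
#include <cstdint>

// Models a 32-bit wasm left shift: the instruction itself reduces the
// count modulo 32.
std::uint32_t wasmShl32(std::uint32_t x, std::uint32_t count) {
  return x << (count & 31u);
}

// The same shift with an explicit mask applied to the count first. The
// extra `& 31u` is redundant because wasmShl32 already applies it.
std::uint32_t shlWithExplicitMask(std::uint32_t x, std::uint32_t count) {
  return wasmShl32(x, count & 31u);
}
```

Both functions agree for every count, which is why the explicit mask can be dropped during lowering.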

Differential Revision: https://reviews.llvm.org/D144619
2023-03-02 09:37:25 +08:00
Luke Lau
fb6602616c [WebAssembly] Explicitly add {z,s}ext so extends are selected
During DAG legalization, {u,s}itofp instructions on v2i8, v2i16, v4i8
and v4i16 types ended up being legalized into scalar instructions, when
they could just be extended to v2i32/v4i32 instead.

Fixes https://github.com/llvm/llvm-project/issues/57182

Differential Revision: https://reviews.llvm.org/D140916
2023-01-06 12:28:29 +00:00
Luke Lau
f841ad30d7 [WebAssembly] Replace LOAD_SPLAT with SPLAT_VECTOR
Splats were selected by matching on uses of `build_vector` with
identical elements, but a while back a target independent node for
vector splatting was added.
This removes the WebAssembly specific LOAD_SPLAT intrinsic, and instead
makes SPLAT_VECTOR legal and adds patterns for splat loads.

Differential Revision: https://reviews.llvm.org/D139871
2023-01-04 15:07:47 +00:00
Luke Lau
8ef5da7010 [WebAssembly] Fix crash when selecting 64 bit lane extract operand
The tablegen patterns on vector_extract only match i32 constants, but
on wasm64 these come in as i64 constants. In certain situations this
would cause crashes whenever it couldn't select an extract_vector_elt
instruction.
Rather than add duplicate patterns for every instruction, this just
canonicalizes the constant to be i32 when lowering.
Fixes https://github.com/llvm/llvm-project/issues/57577

Differential Revision: https://reviews.llvm.org/D140205
2022-12-19 10:37:19 +00:00
Fangrui Song
b0df70403d [Target] llvm::Optional => std::optional
The updated functions are mostly internal with a few exceptions (virtual functions in
TargetInstrInfo.h, TargetRegisterInfo.h).
To minimize changes to LLVMCodeGen, GlobalISel files are skipped.

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 22:43:14 +00:00
Kazu Hirata
20cde15415 [Target] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 20:36:06 -08:00
Stanislav Mekhanoshin
bcaf31ec3f [AMDGPU] Allow finer grain control of an unaligned access speed
A target can report whether a misaligned access is 'fast' as defined
by the target or not. In reality there can be different levels
of 'fast' and 'slow'. This patch changes the boolean 'Fast'
argument of the allowsMisalignedMemoryAccesses family of functions
to an unsigned representing its speed.

A target can still define it as it wants and the direct translation
of the current code uses 0 and 1 for current false and true. This
makes the change an NFC.

A subsequent patch will start using an actual value of speed in
the load/store vectorizer to compare if a vectorized access is going
to be not just fast, but not slower than before.

Differential Revision: https://reviews.llvm.org/D124217
2022-11-17 09:23:53 -08:00
Paulo Matos
1bd1a44070 [WebAssembly] Use intrinsics for table.get/set instructions
Initial table.get/set implementation would match and lower combinations
of GEP+load/store to table.get/set instructions. However, this is error
prone due to potential combinations of GEP+load/store we don't implement,
and load/store optimizations. By changing the code to using intrinsics, we
avoid both issues and simplify the code.

New intrinsics implemented:
* @llvm.wasm.table.get.externref
* @llvm.wasm.table.get.funcref
* @llvm.wasm.table.set.externref
* @llvm.wasm.table.set.funcref

Reviewed By: asb, tlively

Differential Revision: https://reviews.llvm.org/D134436
2022-09-27 09:16:30 +02:00
Fanchen Kong
28557e8c98 [WebAssembly] Improve codegen for shuffles with undefined lane indices
For undefined lane indices, fill the mask with {0..N} instead of zeros to allow
further reduction to word/dword shuffle on the VM.
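
A sketch of the mask fill, under the assumption that "{0..N}" means each undefined lane receives its own position as the index. Undefined lanes are marked with -1, as in LLVM shuffle masks; the helper name `fillUndefLanes` is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Replaces each undefined lane index (-1) with the lane's own position,
// so runs of undefined lanes become runs of consecutive indices.
std::vector<int> fillUndefLanes(std::vector<int> mask) {
  for (std::size_t i = 0; i < mask.size(); ++i)
    if (mask[i] < 0)
      mask[i] = static_cast<int>(i);
  return mask;
}
```

For example, `{-1, -1, 2, 3}` becomes `{0, 1, 2, 3}`. Filling with zeros would instead give `{0, 0, 2, 3}`, which breaks the consecutive run that a wider-lane shuffle needs.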

Reviewed By: tlively, penzn

Differential Revision: https://reviews.llvm.org/D133473
2022-09-13 16:03:18 -07:00