1315 Commits

Author SHA1 Message Date
Benjamin Maxwell
b91eb9b4e5
[SDAG] Implement missing legalization for ISD::VECTOR_FIND_LAST_ACTIVE (#180290)
This lowers the splitting as:
```
any_active(hi_mask)
  ? (find_last_active(hi_mask) + lo_mask.getVectorElementCount())
  : find_last_active(lo_mask)
```

It also trivially lowers `<1 x i1>` scalarization to returning zero, which
is a natural result of the splitting (and of the lack of a sentinel
"none-active" result value).

The lowerings likely can be improved. This patch is for completeness.
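
The split lowering above can be modelled outside LLVM as a minimal sketch. It assumes that a mask with no active lanes yields index 0 (there is no sentinel value) and that masks are plain Python lists of booleans. The function names are illustrative and are not LLVM APIs.

```python
def find_last_active(mask):
    """Reference: index of the last True lane, 0 if no lane is active (assumed)."""
    last = 0
    for i, active in enumerate(mask):
        if active:
            last = i
    return last


def find_last_active_split(mask):
    """Split the mask into lo/hi halves and combine as in the commit message."""
    if len(mask) == 1:
        # <1 x i1> scalarization: the only possible index is 0.
        return 0
    half = len(mask) // 2
    lo_mask, hi_mask = mask[:half], mask[half:]
    if any(hi_mask):
        # The index inside the hi half is offset by the lo half's element count.
        return find_last_active_split(hi_mask) + len(lo_mask)
    return find_last_active_split(lo_mask)


mask = [True, False, True, False, False, True, False, False]
print(find_last_active(mask), find_last_active_split(mask))
```

Both calls agree on every 8-lane mask in this model, including the all-false mask, where the recursion bottoms out in the one-element case and returns zero.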

Should fix:
https://github.com/llvm/llvm-project/pull/178862#issuecomment-3862310334
Fixes #180212
2026-02-10 09:01:13 +00:00
Demetrius Kanios
4919e0da50
[WebAssembly][FastISel] Make use of sign-ext proposals instructions when available (#179855)
Enables FastISel to use the dedicated sign-extension instructions
(rather than shl, shr) when available.
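
As a rough illustration of why the two forms are interchangeable, the sketch below models 32-bit sign extension both ways in plain Python. The helper names are hypothetical; `sext_direct` stands in for instructions such as `i32.extend8_s` and `i32.extend16_s`.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Reinterpret the low 32 bits of x as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def sext_via_shifts(x, bits):
    """Shift left, then arithmetic shift right, on a 32-bit value."""
    shift = 32 - bits
    shifted = (x << shift) & MASK32        # models i32.shl
    return to_signed32(shifted) >> shift   # models i32.shr_s (Python's >> is arithmetic)


def sext_direct(x, bits):
    """Model of a dedicated sign-extension instruction."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x & (1 << (bits - 1)) else x


print(sext_via_shifts(0x80, 8), sext_direct(0x80, 8))
```

The dedicated instruction replaces two shifts with a single operation while producing the same value.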
2026-02-06 12:41:39 -08:00
Demetrius Kanios
9976e5702f
[WebAssembly][GlobalISel] Part 1 - Setup skeleton (#178796)
This PR is the first step towards bringing GlobalISel to the Wasm
backend.

Split from #157161
2026-02-06 18:38:56 +00:00
Derek Schuff
c3db52701e
[MC][Wasm] Emit useful error message when encountering common symbols (#179586)
We don't currently support common symbols for Wasm, and we currently
emit a generic error with a backtrace. Instead, don't crash, and report
the names of the offending symbols.
2026-02-06 00:40:25 +00:00
hanbeom
22f53531d7
[WebAssembly] Combine shuffle and signed extend to extend_high (#179166)
Fold shuffles and bitcasts feeding extend_low_s into extend_high_s.
This enables i32x4.dot_i16x8_s selection and removes redundant shuffles.

Fixed: https://github.com/llvm/llvm-project/issues/179145
2026-02-03 17:02:53 +09:00
Demetrius Kanios
95ac9314df
[WebAssembly] Prevent FastISel from trying to select funcref calls (#178742)
Before, Wasm FastISel treated all indirect calls the same, causing
miscompilations at O0 when trying to call a funcref (`call ptr
addrspace(20)`), as it would treat the funcref as a normal `ptr`.

This adds a check so it falls back to ISelDAG when encountering calls
outside addrspace 0 (which covers direct calls and indirect calls
through normal function pointers).

Related: #140933
2026-01-30 12:05:15 -08:00
hanbeom
16d8d4b84e
[WebAssembly] Fix crash in ReplaceNodeResults for ANY_EXTEND_VECTOR_INREG (#178374)
Fixes a crash during type legalization by allowing
ISD::ANY_EXTEND_VECTOR_INREG to fall back to default expansion instead
of hitting llvm_unreachable.

Fixed: #177209
2026-01-28 20:45:04 +09:00
Sam Parker
1e0114c21d
[WebAssembly] Zero and NaN checks for min/max (#177968)
Custom lower FMINNUM, FMINIMUMNUM, FMAXNUM and FMAXIMUMNUM to generate
relaxed_min and relaxed_max when the inputs cannot be NaN or signed
zero.

Tablegen patterns have also been modified to check the above conditions
when trying to match relaxed min/max using the pmin/pmax pattern.
2026-01-28 09:25:41 +00:00
valadaptive
cdc6a84c14
TargetLowering: Allow FMINNUM/FMAXNUM to lower to FMINIMUM/FMAXIMUM even without nsz (#177828)
This restriction was originally added in
https://reviews.llvm.org/D143256, with the given justification:

> Currently, in TargetLowering, if the target does not support fminnum,
we lower to fminimum if neither operand could be a NaN. But this isn't
quite correct because fminnum and fminimum treat +/-0 differently; so,
we need to prove that one of the operands isn't a zero.

As far as I can tell, this was never correct. Before
https://github.com/llvm/llvm-project/pull/172012, `minnum` and `maxnum`
were nondeterministic with regards to signed zero, so it's always been
perfectly legal to lower them to operations that order signed zeroes.
2026-01-25 18:24:12 -05:00
Matt Arsenault
0d4a35d560
IR: Remove llvm.convert.to.fp16 and llvm.convert.from.fp16 intrinsics (#174484)
These are long overdue for removal. They were originally a hack
to support loading half values before there was any decent support
for the half type in the backend. There's no reason to continue
supporting them; they're equivalent to fpext/fptrunc with a bitcast.

SelectionDAG stopped translating these directly, and used the
bitcast + fp cast since f7a02c17628e825, so there's been no reason
to use these since 2014.
2026-01-21 09:50:28 +00:00
Sam Parker
b84ffe040b
[WebAssembly] LoadLane matching with offsets (#176005)
2026-01-15 08:39:42 +00:00
Sam Parker
94913cf150
[NFC][WebAssembly] More memory interleave tests (#175918)
2026-01-14 12:26:10 +00:00
Sam Parker
e5b6833e49
[WebAssembly] vi8 mul cost modelling. (#175177)
We've already optimised these, so update the cost model to reflect it.
Also skip the isBeforeLegalize check when lowering i8 muls, because the
check misses cases where, say, v32i8 has been type-legalised into 2x
v16i8.

Also explicitly disable memory interleaving for any factor other than
two or four.
2026-01-12 09:25:54 +00:00
Derek Schuff
4c61843e44
[WebAssembly] Add wasm64 testing to varargs.ll [NFC] (#175102)
Looking at https://github.com/llvm/llvm-project/pull/173580 revealed
that our testing of varargs is inadequate. This is a start on improving it.
2026-01-09 08:51:06 -08:00
Derek Schuff
7a22bea512
[WebAssembly] Expand vector frem instructions (#174854)
Commit 6ad41bcc49 changed how frem is expanded during legalization, which
broke WebAssembly, but we were missing test coverage. We want to maintain
our previous behavior of unrolling vectors and using a libcall to
implement scalar frem. I'm not sure why this now has to be different
(in ISelLowering) from other libcalls like fsin which work the same way
in the end, but this code does accurately describe what we want.

Fixes: https://github.com/emscripten-core/emscripten/issues/25991
2026-01-08 16:19:44 -08:00
Trevor Gross
4903c6260c
[WebAssembly] Change half to use soft promotion rather than PromoteFloat (#152833)
The default `half` legalization, which Wasm currently uses, does not
respect IEEE conventions: for example, casting to bits may invoke a lossy
libcall, meaning soft float operations cannot be correctly implemented.
Change to the soft promotion legalization which passes `f16` as an `i16`
and treats each `half` operation as an individual
f16->f32->libcall->f32->f16 sequence.

Of note in the test updates are that `from_bits` and `to_bits` are now
libcall-free, and that chained operations now round back to `f16` after
each step.
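
The effect of rounding back to `f16` after each step can be sketched with Python's `struct` module, whose `e` format converts to IEEE binary16. This is only a numerical illustration under that assumption, not the legalizer's code, and all function names are hypothetical.

```python
import struct


def round_f16(x):
    """Round a Python float to the nearest IEEE binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_f32(x):
    """Round a Python float to the nearest IEEE binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def add_per_step(values):
    """Each add is f16 -> f32 -> add -> f16, as the commit describes."""
    acc = round_f16(values[0])
    for v in values[1:]:
        acc = round_f16(round_f32(acc + round_f16(v)))
    return acc


def add_round_once(values):
    """Keep the intermediate in f32 and round to f16 only at the end."""
    acc = round_f32(values[0])
    for v in values[1:]:
        acc = round_f32(acc + v)
    return round_f16(acc)


def to_bits(x):
    """Bit pattern of the f16 value, with no arithmetic involved."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def from_bits(bits):
    """Reinterpret a 16-bit pattern as an f16 value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


vals = [2048.0, 1.0, 1.0]
print(add_per_step(vals), add_round_once(vals))  # 2048.0 2050.0
```

With per-step rounding, `2048 + 1` rounds back to `2048` each time because `2049` is not representable in `f16`. Rounding only once at the end gives `2050`.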

Fixes the wasm portion of
https://github.com/llvm/llvm-project/issues/97981
Fixes the wasm portion of
https://github.com/llvm/llvm-project/issues/97975
Fixes: https://github.com/llvm/llvm-project/issues/96437
Fixes: https://github.com/llvm/llvm-project/issues/96438
2026-01-08 15:07:59 +01:00
Derek Schuff
c99db7136e
[WebAssembly] Disable explicit-locals in the libcalls.ll test. NFC (#174811)
The keep-registers mode isn't super useful without disabling
explicit-locals, as the local gets/sets are irrelevant noise in most cases.
Switching this test makes the output much more concise and will make
upcoming changes easier to review.
2026-01-07 16:52:52 -08:00
hanbeom
1171e30cb0
[WebAssembly] Support v128.load{32,64}_zero for f32 and f64 types (#172291)
This patch extends the `load_zero` pattern matching to
support floating-point vector types (`v4f32` and `v2f64`).

Previously, the optimization to generate `v128.load32_zero` and
`v128.load64_zero` was only enabled for integer types
(`v4i32` and `v2i64`). This change adds the necessary TableGen
patterns to correctly match scalar floating-point loads inserted
into zero-initialized vectors.
2026-01-08 09:28:14 +09:00
Stefan Weigl-Bosker
da8497ed08
[IR][Verifier] Verification for target-features attribute (#173119)
Fixes https://github.com/llvm/llvm-project/issues/172647

Currently, MC assumes that all `target-feature` flag attributes are well
formed and will crash otherwise. This change handles those cases more
gracefully.
2025-12-22 11:13:56 +01:00
Derek Schuff
6d60d3d7e4
Revert "[WebAssembly] Implement addrspacecast to funcref" (#170785)
Reverts llvm/llvm-project#166820
There was a failure in the ENABLE_EXPENSIVE_CHECKS configuration.
2025-12-04 17:24:14 -08:00
Demetrius Kanios
d3b9fd0f86
[WebAssembly] Implement addrspacecast to funcref (#166820)
Adds lowering of `addrspacecast [0 -> 20]` to allow easy conversion of
function pointers to Wasm `funcref`.

When given a constant function pointer, it lowers to a direct
`ref.func`. Otherwise it lowers to a `table.get` from
`__indirect_function_table` using the provided pointer as the index.
2025-12-04 16:34:42 -08:00
Jasmine Tang
e0db7f347c
[WebAssembly] Optimize away mask of 63 for sra and srl( zext (and i32 63))) (#170128)
Follow-up to #71844, after the shl implementation.
2025-12-02 18:23:17 +00:00
Jasmine Tang
edd1856686
[WebAssembly] Optimize away mask of 63 for shl ( zext (and i32 63))) (#152397)
Fixes https://github.com/llvm/llvm-project/issues/71844
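
The mask is redundant because Wasm's 64-bit shifts already take the shift count modulo 64. The sketch below models that behaviour with hypothetical helper names. It is a model of the instruction semantics, not the DAG combine itself.

```python
MASK64 = (1 << 64) - 1


def wasm_i64_shl(x, count):
    """Model of i64.shl: the shift count is taken modulo 64."""
    return (x << (count % 64)) & MASK64


def wasm_i64_shr_u(x, count):
    """Model of i64.shr_u: the shift count is taken modulo 64."""
    return (x & MASK64) >> (count % 64)


def shl_with_explicit_mask(x, count_i32):
    """shl(x, zext(and(count, 63))), the pattern named in the commit title."""
    return wasm_i64_shl(x, count_i32 & 63)


def shr_u_with_explicit_mask(x, count_i32):
    """The same pattern for a logical shift right."""
    return wasm_i64_shr_u(x, count_i32 & 63)


x = 0x0123456789ABCDEF
print(all(shl_with_explicit_mask(x, c) == wasm_i64_shl(x, c) for c in range(256)))
```

Since `count & 63` and `count % 64` agree for every non-negative count, dropping the `and` does not change the result in this model.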
2025-12-01 11:32:46 +00:00
hstk30-hw
a6cec3f3e5
Reland "[RegAlloc] Fix the terminal rule check for interfere with DstReg (#168661)" (#169219)
Reland d5f3ab8ec97786476a077b0c8e35c7c337dfddf2, fix testcases.
2025-11-24 09:27:25 +08:00
Aiden Grossman
d5f3ab8ec9
Revert "[RegAlloc] Fix the terminal rule check for interfere with DstReg (#168661)"
This reverts commit 0859ac5866a0228f5607dd329f83f4a9622dedcc.

This caused a couple test failures, likely due to a mid-air collision.
Reverting for now to get the tree back to green and allow the original
author to run UTC/friends and verify the output.
2025-11-23 05:17:45 +00:00
hstk30-hw
0859ac5866
[RegAlloc] Fix the terminal rule check for interfere with DstReg (#168661)
This may be a bug introduced by commit
6749ae36b4a33769e7a77cf812d7cd0a908ae3b9, which has been present ever
since.
In this case, `OtherReg` always overlaps with `DstReg` because they both
come from the `Copy`.
2025-11-23 10:11:24 +08:00
Sam Parker
e44646b795
[WebAssembly] Lower ANY_EXTEND_VECTOR_INREG (#167529)
Treat it in the same manner as zero_extend_vector_inreg and generate an
extend_low_u if possible. This is to try to prevent expensive shuffles
from being generated instead. computeKnownBitsForTargetNode has also
been updated to specify known zeros on extend_low_u.
2025-11-20 08:57:08 +00:00
Jasmine Tang
672757bf55
[WebAssembly] Add patterns for extadd pairwise (#167960)
Add a few patterns for extadd pairwise.
2025-11-18 02:41:16 -08:00
Hongyu Chen
63e6373efd
[WebAssembly] Truncate extra bits of large elements in BUILD_VECTOR (#167223)
Fixes https://github.com/llvm/llvm-project/issues/165713
This patch handles out-of-bound vector elements and truncates extra
bits.
2025-11-17 10:39:18 +00:00
Matt Arsenault
dfdada1b78
CodeGen: Remove target hook for terminal rule (#165962)
Enables the terminal rule for remaining targets
2025-11-12 21:12:19 +00:00
Hongyu Chen
9697f4b9e4
[WebAssembly][FastISel] Bail out on meeting non-integer type in selectTrunc (#167165)
Fixes https://github.com/llvm/llvm-project/issues/165438
With `simd128` enabled, we may meet vector type truncation in FastISel.
To respect #138479, this patch merely bails out on non-integer IR types,
though I prefer bailing out for all non-simple types as most targets
(X86, AArch64) do.
2025-11-12 04:33:41 +08:00
Sam Parker
d47fdfec2b
[NFC][WebAssembly] Precommit test. (#167520)
2025-11-11 16:20:12 +00:00
Sam Parker
d10a85167a
[WebAssembly] Implement more of getCastInstrCost (#164612)
Fill out more information for sign and zero extend and add some truncate
information; however, the primary change is to int/fp conversions. In
particular, fp to (narrow) int appears to be relatively expensive.
2025-11-10 08:07:16 +00:00
Sam Parker
9e6a31f832
[WebAssembly] vf32 to vi8, vi16 lowering (#164644)
Avoid scalarizing the conversion and use trunc_sat and narrow instead.
2025-11-06 08:32:44 +00:00
Kleis Auke Wolthuizen
4b367e0b85
[WebAssembly] Use IRBuilder in FixFunctionBitcasts (NFC) (#164268)
Simplifies the code a bit.
2025-11-05 01:35:15 +00:00
Jasmine Tang
e6cd7a52bc
[WebAssembly] [Codegen] Add pattern for relaxed min max from pmin/pmax-based patterns over v4f32 and v2f64 (#164486)
Related to https://github.com/llvm/llvm-project/issues/55932
2025-10-23 01:39:02 -07:00
Florian Hahn
a7672fee0f
[WebAssembly] Fixup test after bfc322dd724735.
Test update was missed in bfc322dd724735 due to a codegen test running
loop-vectorize directly. The loop does not get vectorized any longer.
2025-10-22 22:34:47 +01:00
Sam Parker
20340accf2
[NFC][WebAssembly] FP conversion interleave tests (#164576)
2025-10-22 11:43:44 +01:00
Jasmine Tang
1fbfac30f1
[WebAssembly] [Codegen] Add pattern for relaxed min max from fminimum/fmaximum over v4f32 and v2f64 (#162948)
Related to #55932
2025-10-22 03:08:24 -07:00
Sam Parker
aa63949428
[WebAssembly] Avoid dot for v16i8 partial_smla (#163796)
The sequence is shorter, by two extend operations, if we just use extmul
and extadd_pairwise.
2025-10-20 09:12:00 +01:00
Jasmine Tang
893b1d4187
[WebAssembly] [Codegen] Add patterns for relaxed dot (#163266)
The pattern I added for `relaxed dot` is similar to the one for normal dot in
https://github.com/llvm/llvm-project/pull/151775.

For `relaxed dot add`, I noticed that in the proposal the dot portion of the
implementation is similar to `relaxed dot`, so I think we can add a
pattern where, after we do relaxed dot and extadd pairwise, we can do
`relaxed dot add`.

One current obstacle is that I don't think there is any pattern that creates
an extadd pairwise on its own from other instructions, so the `relaxed dot
add` pattern would not cover a wide range of instructions.

Related to https://github.com/llvm/llvm-project/issues/55932
2025-10-16 15:01:57 +00:00
Sam Parker
65363e64f8
[WebAssembly] Partial SMLA with relaxed dot (#163529)
Lower v16i8 to v4i32 partial_smla to relaxed_dot_add. I'm still unsure
whether we could/should take advantage of the unknown signedness of the
rhs, and also lower the partial_sumla operation too.
2025-10-16 07:09:16 +01:00
Derek Schuff
19a58a5208
[WebAssembly] Optimize lowering of constant-sized memcpy and memset (#163294)
We currently emit a check that the size operand isn't zero, to avoid
executing the wasm memory.copy instruction when it would trap.
But this isn't necessary if the operand is a constant.

Fixes #163245
2025-10-14 22:00:25 +00:00
Derek Schuff
3e22438320
[CodeGen] Use getObjectPtrOffset to generate loads/stores for mem intrinsics (#80184)
This causes address arithmetic to be generated with the 'nuw' flag, 
allowing WebAssembly constant offset folding.
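
The reason the 'nuw' flag matters can be sketched with a small model of wasm32 addressing. A load adds its base address and offset immediate without wrapping and then bounds-checks, whereas a separate `i32.add` wraps modulo 2**32. The memory size and function names below are hypothetical.

```python
MEM_SIZE = 1 << 16  # hypothetical: a single 64 KiB page


def wasm_load_in_bounds(base, offset, width=4):
    """Wasm adds base and offset without wrapping, then bounds-checks."""
    effective = (base & 0xFFFFFFFF) + offset   # up to 33 bits, no wraparound
    return effective + width <= MEM_SIZE


def unfolded(base, c):
    """Address computed by a wrapping i32.add, then a load with offset 0."""
    return wasm_load_in_bounds((base + c) & 0xFFFFFFFF, 0)


def folded(base, c):
    """The constant moved into the load's offset immediate."""
    return wasm_load_in_bounds(base, c)


# No unsigned wrap: both forms access the same address.
print(unfolded(16, 8), folded(16, 8))
# The add wraps: the unfolded form lands at address 4, the folded form traps.
print(unfolded(0xFFFFFFFC, 8), folded(0xFFFFFFFC, 8))
```

Folding is therefore only safe when the add is known not to wrap as an unsigned value, which is what 'nuw' asserts.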

Fixes #79692
2025-10-13 17:22:48 -07:00
Jasmine Tang
55d4e92c88
[WebAssembly] Add extra pattern for dot (#151775)
Fixes https://github.com/llvm/llvm-project/issues/50154
2025-10-13 10:27:12 -07:00
Sam Parker
1820102167
Wasm fmuladd relaxed (#163177)
Reland #161355, after fixing up the cross-projects-tests for the wasm
simd intrinsics.

Original commit message:
Lower v4f32 and v2f64 fmuladd calls to relaxed_madd instructions.
If we have FP16, then lower v8f16 fmuladds to FMA.

I've introduced an ISD node for fmuladd to maintain the rounding
ambiguity through legalization / combine / isel.
2025-10-13 16:50:53 +01:00
Sam Parker
30d3441cf0
Revert "[WebAssembly] Lower fmuladd to madd and nmadd" (#163171)
Reverts llvm/llvm-project#161355

Looks like I've broken some intrinsic code generation.
2025-10-13 11:53:40 +01:00
Sam Parker
a4eb7ea225
[WebAssembly] Lower fmuladd to madd and nmadd (#161355)
Lower v4f32 and v2f64 fmuladd calls to relaxed_madd instructions.
If we have FP16, then lower v8f16 fmuladds to FMA.

I've introduced an ISD node for fmuladd to maintain the rounding
ambiguity through legalization / combine / isel.
2025-10-13 10:36:08 +01:00
Folkert de Vries
761be78dd7
[WebAssembly] recognize saturating truncation (#155470)
Fixes https://github.com/llvm/llvm-project/issues/153838
using the same approach as
https://github.com/llvm/llvm-project/pull/155377.

Recognize a manual saturating truncation and select the corresponding
instruction. This is useful in general, but came up specifically in
https://github.com/rust-lang/stdarch because it will allow us to drop
more target-specific intrinsics in favor of cross-platform ones.
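
A "manual" saturating truncation is typically a clamp followed by a plain truncate. The sketch below compares that pattern with a model of one lane of a saturating narrow, assumed here to be something like `i16x8.narrow_i32x4_s`. The commit message does not name the instruction, so treat that choice and the helper names as illustrative.

```python
I16_MIN, I16_MAX = -32768, 32767


def manual_sat_trunc_i32_to_i16(x):
    """Clamp with min/max, then truncate: the hand-written pattern."""
    clamped = max(I16_MIN, min(I16_MAX, x))
    low = clamped & 0xFFFF                          # truncate to 16 bits
    return low - 0x10000 if low & 0x8000 else low   # reinterpret as signed


def narrow_s_lane(x):
    """Model of one lane of a signed saturating narrow from i32 to i16."""
    if x < I16_MIN:
        return I16_MIN
    if x > I16_MAX:
        return I16_MAX
    return x


samples = [-2**31, -40000, -32768, -1, 0, 1, 32767, 40000, 2**31 - 1]
print([manual_sat_trunc_i32_to_i16(v) for v in samples])
```

Because the clamp guarantees that the value already fits in 16 bits, the truncate loses nothing and the whole sequence matches a single saturating instruction.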
2025-10-08 11:52:18 -07:00
Derek Schuff
abc8aac6d2
[WebAssembly] Check intrinsic argument count before Any/All combine (#162163)
This code is activated on all INTRINSIC_WO_CHAIN nodes but only handles
a selection of them. However, it was trying to read the arguments before
checking which intrinsic it was handling. This fails for intrinsics
that have no arguments.
2025-10-07 23:52:25 +00:00