llvm-project

Author	SHA1	Message	Date
Sam Parker	1e0114c21d	[WebAssembly] Zero and NaN checks for min/max (#177968 ) Custom lower FMINNUM, FMINIMUMNUM, FMAXNUM and FMAXIMUMNUM to generate relaxed_min and relaxed_max when the inputs cannot be NaN or signed zero. Tablegen patterns have also been modified to check the above conditions when trying to match relaxed min/max using the pmin/pmax pattern.	2026-01-28 09:25:41 +00:00
Sam Parker	b84ffe040b	[WebAssembly] LoadLane matching with offsets (#176005 )	2026-01-15 08:39:42 +00:00
hanbeom	1171e30cb0	[WebAssembly] Support v128.load{32,64}_zero for f32 and f64 types (#172291 ) This patch extends the `load_zero` pattern matching to support floating-point vector types (`v4f32` and `v2f64`). Previously, the optimization to generate `v128.load32_zero` and `v128.load64_zero` was only enabled for integer types (`v4i32` and `v2i64`). This change adds the necessary TableGen patterns to correctly match scalar floating-point loads inserted into zero-initialized vectors.	2026-01-08 09:28:14 +09:00
Jasmine Tang	672757bf55	[WebAssembly] Add patterns for extadd pairwise (#167960 ) Add a few patterns for extadd pairwise.	2025-11-18 02:41:16 -08:00
Jasmine Tang	e6cd7a52bc	[WebAssembly] [Codegen] Add pattern for relaxed min max from pmin/pmax-based patterns over v4f32 and v2f64 (#164486 ) Related to https://github.com/llvm/llvm-project/issues/55932	2025-10-23 01:39:02 -07:00
Jasmine Tang	1fbfac30f1	[WebAssembly] [Codegen] Add pattern for relaxed min max from fminimum/fmaximum over v4f32 and v2f64 (#162948 ) Related to #55932	2025-10-22 03:08:24 -07:00
Sam Parker	aa63949428	[WebAssembly] Avoid dot for v16i8 partial_smla (#163796 ) The sequence is shorter, by two extend operations, if we just use extmul and extadd_pairwise.	2025-10-20 09:12:00 +01:00
Jasmine Tang	893b1d4187	[WebAssembly] [Codegen] Add patterns for relaxed dot (#163266 ) The pattern I added for `relaxed dot` is similar to the one for normal dot at https://github.com/llvm/llvm-project/pull/151775. For `relaxed dot add`, I noticed that in the proposal the dot portion of the implementation is similar to `relaxed dot`, so I think we can add a pattern where, after we do relaxed dot and extadd pairwise, we can do `relaxed dot add`. One current obstacle is that I don't think there is any pattern that creates an extadd pairwise on its own from other instructions, so the `relaxed dot add` pattern would not cover a wide range of instructions. Related to https://github.com/llvm/llvm-project/issues/55932	2025-10-16 15:01:57 +00:00
Sam Parker	65363e64f8	[WebAssembly] Partial SMLA with relaxed dot (#163529 ) Lower v16i8 to v4i32 partial_smla to relaxed_dot_add. I'm still unsure whether we could/should take advantage of the unknown signedness of the rhs, and also lower the partial_sumla operation too.	2025-10-16 07:09:16 +01:00
Jasmine Tang	55d4e92c88	[WebAssembly] Add extra pattern for dot (#151775 ) Fixes https://github.com/llvm/llvm-project/issues/50154	2025-10-13 10:27:12 -07:00
Sam Parker	1820102167	Wasm fmuladd relaxed (#163177 ) Reland #161355, after fixing up the cross-projects-tests for the wasm simd intrinsics. Original commit message: Lower v4f32 and v2f64 fmuladd calls to relaxed_madd instructions. If we have FP16, then lower v8f16 fmuladds to FMA. I've introduced an ISD node for fmuladd to maintain the rounding ambiguity through legalization / combine / isel.	2025-10-13 16:50:53 +01:00
Sam Parker	30d3441cf0	Revert "[WebAssembly] Lower fmuladd to madd and nmadd" (#163171 ) Reverts llvm/llvm-project#161355 Looks like I've broken some intrinsic code generation.	2025-10-13 11:53:40 +01:00
Sam Parker	a4eb7ea225	[WebAssembly] Lower fmuladd to madd and nmadd (#161355 ) Lower v4f32 and v2f64 fmuladd calls to relaxed_madd instructions. If we have FP16, then lower v8f16 fmuladds to FMA. I've introduced an ISD node for fmuladd to maintain the rounding ambiguity through legalization / combine / isel.	2025-10-13 10:36:08 +01:00
Folkert de Vries	761be78dd7	[WebAssembly] recognize saturating truncation (#155470 ) fixes https://github.com/llvm/llvm-project/issues/153838 using the same approach as https://github.com/llvm/llvm-project/pull/155377 Recognize a manual saturating truncation and select the corresponding instruction. This is useful in general, but came up specifically in https://github.com/rust-lang/stdarch because it will allow us to drop more target-specific intrinsics in favor of cross-platform ones.	2025-10-08 11:52:18 -07:00
Sam Parker	156e9b4b69	[WebAssembly] Use partial_reduce_mla ISD nodes (#161184 ) Addressing issue #160847. Move away from combining the intrinsic call and instead lower the ISD nodes, using tablegen for pattern matching.	2025-09-30 08:28:56 +01:00
Sam Parker	586c0ad918	[WebAssembly] Support partial-reduce accumulator (#158060 ) We currently only support partial.reduce.add in the case where we are performing a multiply-accumulate. Now add support for any partial reduction where the input is being extended, where we can take advantage of extadd_pairwise.	2025-09-12 07:03:49 +01:00
Sam Parker	6dacdc31ec	[WebAssembly] extadd_pairwise for PartialReduce (#157669 ) Avoid using extends and adding the high and low halves; use extadd_pairwise instead.	2025-09-10 08:13:46 +01:00
Jasmine Tang	7fcee5fe08	[WebAssembly] Add support for avgr_u in loops (#153252 ) Fixes https://github.com/llvm/llvm-project/issues/150550. With the test case ``` void f(unsigned char *x, unsigned char *y, int n) { /* should have been vectorized into avgr_u instead of a separate vectorized add and logical right shift */ for (int i = 0; i < n; i++) x[i] = (x[i] + y[i] + 1) / 2; } ``` the backend failed to recognize that this can be reduced to avgr_u, since the loop vectorizer doesn't transform it into the existing pattern in tablegen. This PR sets AVGCEIL_U as legal for v8i16 and v16i8 and selects it to avgr_u in the tablegen file.	2025-08-22 09:52:49 -07:00
Jasmine Tang	522ac23609	[WebAssembly] Add pattern for relaxed nmadd (#150684 ) Following footstep of https://github.com/llvm/llvm-project/pull/147487 (support for madd), this PR adds support for nmadd. https://github.com/llvm/llvm-project/issues/55932 tracks this	2025-07-28 10:20:04 -07:00
jjasmine	6640b0a293	[WebAssembly] Add patterns for relaxed madd (#147487 ) [WebAssembly] Fold fadd contract (fmul contract) to relaxed madd w/ -mattr=+simd128,+relaxed-simd. Fixes #121311. Precommit test for #121311; fold fadd contract (fmul contract) to relaxed madd w/ -mattr=+simd128,+relaxed-simd; move PatFrag of fadd_contract in ARM.td and WebAssembly.td to TargetSelectionDAG.td for reuse of the pattern.	2025-07-15 00:56:28 +08:00
Brendan Dahl	67056c280a	[WebAssembly] Support shuffle for F16x8 vectors. (#127857 )	2025-02-25 10:39:54 -08:00
Sam Parker	df2de13695	[WebAssembly] Autovec support for dot (#123207 ) Enable the use of partial.reduce.add that we can lower to dot or a tree of (add (extmul_low_u, extmul_high_u)) for the unsigned case. We support both v8i16 and v16i8 inputs.	2025-02-03 08:58:43 +00:00
Jay Foad	922992a22f	Fix typo "instrinsic" (#112899 )	2024-10-18 15:58:33 +01:00
Simon Pilgrim	f8f0a266e0	[clang][wasm] Replace the target integer sub saturate intrinsics with the equivalent generic `__builtin_elementwise_sub_sat` intrinsics (#109405 ) Remove the Intrinsic::wasm_sub_sat_signed/wasm_sub_sat_unsigned entries and just use sub_sat_s/sub_sat_u directly	2024-09-22 10:12:41 +01:00
Brendan Dahl	07a7bdc806	[WebAssembly] Fix lane index size for f16x8 extract_lane. (#108118 )	2024-09-11 15:27:38 -07:00
Brendan Dahl	415288a2a7	[WebAssembly] Add load and store patterns for V8F16. (#108119 )	2024-09-11 09:53:53 -07:00
Brendan Dahl	923a1c1fc3	[WebAssembly] Update FP16 opcodes to match current spec. (#106759 ) `f267a3d544/proposals/half-precision/Overview.md (binary-format)`	2024-08-30 13:01:16 -07:00
Brendan Dahl	5703d8572f	[WebAssembly] Add intrinsics to wasm_simd128.h for all FP16 instructions (#106465 ) Getting this to work required a few additional changes: - Add builtins for any instructions that can't be done with plain C currently. - Add support for the saturating version of fp_to_<s,i>_I16x8. Other vector sizes supported this already. - Support bitcast of f16x8 to v128. Needed to return a __f16x8 as v128_t.	2024-08-30 08:42:37 -07:00
Brendan Dahl	7d373cef49	[WebAssembly] Change half-precision feature name to fp16. (#105434 ) This better aligns with how the feature is being referred to and what runtimes (V8) are calling it.	2024-08-22 09:44:33 -07:00
Sam Parker	08decd20a9	[WebAssembly] load_zero to initialise build_vector (#100610 ) Instead of splatting a single lane, to initialise a build_vector, lower to scalar_to_vector which can be selected to load_zero. Also add load_zero and load_lane patterns for f32x4 and f64x2.	2024-08-02 10:11:21 +01:00
Brendan Dahl	0dbd72d6ab	[WebAssembly] Implement f16x8.replace_lane instruction. (#99388 ) Use a builtin and intrinsic until half types are better supported for instruction selection.	2024-07-24 11:55:36 -07:00
Sam Parker	a3de21cac1	[WebAssembly] Ofast pmin/pmax pattern matchers (#100107 ) With fast-math, the ordered setcc nodes are converted to setcc nodes which do not care about NaNs, so add patterns that use setlt, setle, setgt and setge.	2024-07-24 09:23:49 +01:00
Brendan Dahl	928b780840	[WebAssembly] Implement trunc_sat and convert instructions for f16x8. (#95180 ) These instructions can be generated using regular LL intrinsics. Specified at: `29a9b9462c/proposals/half-precision/Overview.md`	2024-06-25 10:39:05 -07:00
Brendan Dahl	3ab6d12625	[WebAssembly] Implement f16x8 madd and nmadd instructions. (#95151 ) Implemented with intrinsics and builtins. Specified at: https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md	2024-06-11 16:10:00 -07:00
Brendan Dahl	dfd1a2f081	[WebAssembly] Implement all f16x8 unary instructions. (#94063 ) All of these instructions can be generated using regular LL intrinsics. Specified at: `29a9b9462c/proposals/half-precision/Overview.md`	2024-06-04 13:06:16 -04:00
Brendan Dahl	8aa8019975	[WebAssembly] Implement all f16x8 relation instructions. (#93751 ) All of these instructions can be generated using regular LL instructions. Specified at: `29a9b9462c/proposals/half-precision/Overview.md`	2024-05-30 09:02:17 -07:00
Brendan Dahl	60bce6eab4	[WebAssembly] Implement all f16x8 binary instructions. (#93360 ) This reuses most of the code that was created for f32x4 and f64x2 binary instructions and tries to follow how they were implemented. add/sub/mul/div: use regular LL instructions; min/max: use the minimum/maximum intrinsic, and also have builtins; pmin/pmax: use the wasm.pmax/pmin intrinsics and also have builtins. Specified at: `29a9b9462c/proposals/half-precision/Overview.md`	2024-05-28 16:33:20 -07:00
Brendan Dahl	4ebe9bba59	[WebAssembly] Implement prototype f16x8.extract_lane instruction. (#93272 ) Specified at: `29a9b9462c/proposals/half-precision/Overview.md` Note: the current spec has f16x8.extract_lane as opcode 0x124, but this is incorrect and will be changed to 0x121 soon.	2024-05-24 08:31:07 -07:00
Brendan Dahl	09c5525610	[WebAssembly] Implement prototype f16x8.splat instruction. (#93228 ) Adds a builtin and intrinsic for the f16x8.splat instruction. Specified at: `29a9b9462c/proposals/half-precision/Overview.md` Note: the current spec has f16x8.splat as opcode 0x123, but this is incorrect and will be changed to 0x120 soon.	2024-05-23 20:05:22 -07:00
Thomas Lively	767e0c8bce	[WebAssembly] Select BUILD_VECTOR with large unsigned lane values (#85880 ) Previously we expected lane constants to be in the range of signed values for each lane size, but the included test case produced large unsigned values that fall outside that range. Allow instruction selection to proceed in this case rather than failing. Fixes #63817.	2024-03-20 08:42:42 -07:00
xortoast	bb648c9177	[WebAssembly] Add lowering for llvm.rint and llvm.roundeven WebAssembly doesn't expose inexact exceptions, so frint can be mapped to fnearbyint. Likewise, WebAssembly always rounds ties-to-even, so froundeven can be mapped to fnearbyint. Differential Revision: https://reviews.llvm.org/D153451	2023-06-23 14:07:11 -07:00
Thomas Lively	72a72315b0	[WebAssembly] Mark @llvm.wasm.shuffle lane indices as immediates This intrinsic is meant to lower directly to the i8x16.shuffle instruction, which takes its lane index arguments as immediates. The ISel for the intrinsic assumed that the lane index arguments were constants, so bitcode that "incorrectly" used this intrinsic with non-immediate arguments caused an assertion failure in the backend. Avoid the crash by defining the lane index arguments to be immediates, matching the underlying instruction. Update ISel accordingly. This change means that the bitcode that previously caused a crash will now fail to validate. Fixes #55559. Reviewed By: dschuff Differential Revision: https://reviews.llvm.org/D149898	2023-05-05 08:12:41 -07:00
Thomas Lively	abdb5e041c	[WebAssembly] Remove incorrect result from wasm64 store_lane instructions The wasm64 versions of the v128.storeX_lane instructions were incorrectly defined as returning a v128 value, which resulted in spurious drop instructions being emitted and caused validation to fail. This was not caught earlier because wasm64 has been experimental and not well tested. Update the relevant test file to test both wasm32 and wasm64. Fixes #62443. Differential Revision: https://reviews.llvm.org/D149780	2023-05-03 16:00:20 -07:00
Samuel Parker	28ee604071	[WebAssembly] pmin/pmax fixes Reverse the operand ordering to ? rhs : lhs. Differential Revision: https://reviews.llvm.org/D144466	2023-02-22 10:02:16 +00:00
Jun Ma	e9d7f96a11	[WebAssembly] Add more combine pattern for vector shift After the change in D144169, the codegen generates redundant instructions such as `and` and `wrap`. This fixes it. Differential Revision: https://reviews.llvm.org/D144360	2023-02-22 09:53:00 +08:00
Samuel Parker	a674a12dd5	[WebAssembly] Additional patterns for pmin/pmax Each operation was missing its inverted condition using olt or ogt. Also, as we don't need to discern +/-0, I think we should also be able to use ole and oge. Differential Revision: https://reviews.llvm.org/D143581	2023-02-10 09:54:45 +00:00
Luke Lau	f841ad30d7	[WebAssembly] Replace LOAD_SPLAT with SPLAT_VECTOR Splats were selected by matching on uses of `build_vector` with identical elements, but a while back a target independent node for vector splatting was added. This removes the WebAssembly specific LOAD_SPLAT intrinsic, and instead makes SPLAT_VECTOR legal and adds patterns for splat loads. Differential Revision: https://reviews.llvm.org/D139871	2023-01-04 15:07:47 +00:00
Luke Lau	0cd9c51766	[WebAssembly] Use ComplexPattern on remaining memory instructions This continues the refactoring work of selecting offset + address operands with the AddrOpsN pattern, previously called LoadOpsN. This is not an NFC, since constant addresses are now folded into the offset in more places for v128.storeN_lane. Differential Revision: https://reviews.llvm.org/D139950	2022-12-15 10:20:06 +00:00
Luke Lau	982b8e0bbb	[WebAssembly][NFC] Add ComplexPattern for loads This refactors out the offset and address operand pattern matching into a ComplexPattern, so that one pattern fragment can match the dynamic and static (offset) addresses in all possible positions. Split out from D139530, which also contained an improvement to global address folding. Differential Revision: https://reviews.llvm.org/D139631	2022-12-14 12:11:30 +00:00
Thomas Lively	ae96b5bd2d	[WebAssembly] Update relaxed-simd instruction names Including builtin and intrinsic names. These should be the final names for the proposal. https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md Reviewed By: aheejin, maratyszcza Differential Revision: https://reviews.llvm.org/D138249	2022-11-21 12:40:15 -08:00
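
Several of the commits above turn on the per-lane semantics of `avgr_u` (the `(x[i] + y[i] + 1) / 2` loop in #153252) and of `pmin`/`pmax` (the `? rhs : lhs` operand ordering in D144466). The following is a minimal scalar sketch of those semantics, not code taken from LLVM. The function names `avgr_u8`, `pmin_f32`, `pmax_f32` and `check_models` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical scalar model of one lane of i8x16.avgr_u: a rounding
   average, computed in a wider type so that the +1 cannot overflow. */
static uint8_t avgr_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(((unsigned)a + (unsigned)b + 1u) >> 1);
}

/* Hypothetical scalar models of one lane of f32x4.pmin / f32x4.pmax.
   Both are plain selects: the second operand is chosen only when the
   comparison is true, so a NaN in either operand yields the first one. */
static float pmin_f32(float a, float b) { return b < a ? b : a; }
static float pmax_f32(float a, float b) { return a < b ? b : a; }

/* Small self-check of the models above. */
static void check_models(void) {
    assert(avgr_u8(255, 255) == 255);      /* no wraparound at the top */
    assert(avgr_u8(1, 2) == 2);            /* rounds up: (1 + 2 + 1) / 2 */
    assert(pmin_f32(1.0f, 2.0f) == 1.0f);
    assert(pmax_f32(1.0f, 2.0f) == 2.0f);
    assert(pmin_f32(1.0f, NAN) == 1.0f);   /* comparison false, keeps a */
    assert(isnan(pmin_f32(NAN, 1.0f)));    /* comparison false, keeps a */
}
```

Because `pmin` and `pmax` are selects rather than IEEE minimum/maximum operations, the operand order matters whenever a NaN or a signed zero is involved, which is why the pattern-matching commits above are careful about which comparison and which operand order they accept.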


215 Commits